Artificial intelligence still struggles to adapt to modern video games, researchers say
Study finds AI excels in chess but struggles to adapt to unfamiliar video games, exposing limits in real-world intelligence.
Artificial intelligence has achieved remarkable success in structured games such as chess and Go, yet new research suggests these milestones may overstate the technology’s broader capabilities. A recent study from New York University argues that while such achievements appear impressive, they do not reflect genuine progress towards general intelligence.
The researchers highlight a key limitation: AI systems often fail when faced with unfamiliar environments, particularly modern video games they have not previously encountered. This shortcoming raises questions about how adaptable current AI models truly are, despite their highly publicised victories in controlled settings.
Fixed environments mask AI limitations
The study points out that games like chess and Go offer fixed rules and clearly defined structures, making them well-suited to AI optimisation. Within these constraints, machines can be trained to achieve superhuman performance by analysing vast numbers of possible moves and outcomes.
However, this success does not translate easily to more complex and dynamic environments. Modern video games typically involve changing scenarios, multiple objectives, and a need for flexible thinking. According to the researchers, these qualities expose weaknesses in AI systems that are otherwise hidden in simpler, rule-based games.
They argue that the distinction is critical. While beating a grandmaster at chess may suggest intelligence, it does not equate to the ability to adapt or learn in unpredictable situations. Video games, by contrast, demand a mix of skills, including spatial reasoning, long-term planning, experimentation, and sometimes even social understanding.
The report suggests that this diversity makes gaming a more realistic test of intelligence than isolated benchmarks. It demonstrates how well a system can cope with novelty rather than how effectively it can master a single, well-defined task.
Reinforcement learning and language models fall short
The paper also examines the performance of two major AI approaches: reinforcement learning and large language models. Reinforcement learning has produced notable successes in gaming, but these typically require millions or even billions of simulated attempts.
Such extensive training allows the system to become highly specialised in a particular environment. Yet this expertise is fragile. Even minor changes, such as altered colours or repositioned objects, can cause performance to deteriorate rapidly. The researchers note that this lack of robustness limits the real-world usefulness of such systems.
Large language models, which have gained attention for their ability to generate human-like text and assist with coding, do not resolve this issue. According to the study, these models perform poorly when introduced to unfamiliar games without additional support.
In cases where language models do show competence, they often rely on customised frameworks designed specifically for each game. These frameworks help interpret game states, manage memory, and execute actions. Without this scaffolding, the models’ performance declines significantly, suggesting that their apparent success depends heavily on external support.
The findings indicate that neither approach currently delivers the level of adaptability required for general intelligence: both struggle when removed from carefully controlled or pre-configured scenarios.
Video games as a benchmark for true intelligence
The researchers propose that a more meaningful measure of AI capability would be its ability to learn a new game from scratch within a timeframe similar to that of a human player. This would involve mastering rules, strategies, and objectives in tens of hours rather than through extensive simulation or prior training.
At present, no AI system meets this standard. The gap highlights a fundamental limitation in how machines learn and adapt compared to humans. While people can quickly grasp new concepts and adjust to unfamiliar situations, AI systems remain heavily reliant on predefined data and environments.
This limitation has implications beyond gaming. The study argues that if AI cannot reliably handle the variability of a new video game, it is unlikely to perform well in the unpredictable conditions of the real world. Tasks such as navigating complex environments or responding to unexpected events require a level of flexibility that current systems have yet to achieve.
The researchers conclude that headline achievements in games like chess should be viewed with caution. Although they demonstrate technical progress, they do not represent a comprehensive measure of intelligence. Modern video games, with their complexity and variability, reveal just how far AI still has to go before it can match human adaptability.