
Apple report says powerful AI models aren’t reasoning at all

WHY THIS MATTERS IN BRIEF

Reaching AGI largely depends on AI being able to reason the same way humans do – but according to Apple, today’s AI models are still brute-forcing pattern recognition instead.

Love the Exponential Future? Join our XPotential Community, future proof yourself with courses from XPotential University, read about exponential tech and trends, connect, watch a keynote, or browse my blog.

Apple – the company behind the notoriously awful Siri Artificial Intelligence (AI) bot – has claimed that new-age AI reasoning models might not be as smart as they have been made out to be. In a study titled The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity, the tech giant controversially argues that reasoning models like Claude, DeepSeek-R1, and o3-mini do not actually reason at all.

 

Apple claimed that these models simply memorise patterns really well, but when the questions are altered or the complexity is increased, they collapse altogether. In simple terms, the models work great when they can match patterns, but once the patterns become too complex, they fall apart.

“Through extensive experimentation across diverse puzzles, we show that frontier Large Reasoning Models (LRMs) face a complete accuracy collapse beyond certain complexities,” the study highlighted.

“Moreover, they exhibit a counter-intuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget,” it added.

For the study, the researchers flipped the script on the type of questions that reasoning models usually answer. Instead of the same old math tests, the models were presented with cleverly constructed puzzle games such as Tower of Hanoi, Checker Jumping, River Crossing, and Blocks World.

 

Each puzzle had simple, well-defined rules, and as the complexity was increased (more disks, more blocks, more actors), the models needed to plan deeper and reason longer. The findings revealed three regimes: at low complexity, regular models actually win; at medium complexity, thinking models show some advantage; and at high complexity, everything breaks down completely.

Apple reasoned that if the reasoning models were truly ‘reasoning’, they would get better with more computing power and clearer instructions. However, they started hitting walls and gave up, even when provided with the solutions.

“When we provided the solution algorithm for the Tower of Hanoi to the models, their performance on this puzzle did not improve,” the study stated, adding: “Moreover, investigating the first failure move of the models revealed surprising behaviours. For instance, they could perform up to 100 correct moves in the Tower of Hanoi but fail to provide more than 5 correct moves in the River Crossing puzzle.”
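
For context, the Tower of Hanoi has a famous, fully deterministic solution. Here is a minimal Python sketch of the classic recursive algorithm, which also shows why the puzzle scales so brutally: n disks always require 2^n - 1 moves. (The study doesn’t reproduce the exact algorithm Apple supplied to the models, so treat this as an illustration, not their prompt.)

# Classic recursive Tower of Hanoi solver. Moving n disks from `source`
# to `target` takes exactly 2**n - 1 moves.
def solve_hanoi(n, source="A", target="C", spare="B", moves=None):
    if moves is None:
        moves = []
    if n == 0:
        return moves
    solve_hanoi(n - 1, source, spare, target, moves)  # park n-1 disks on the spare peg
    moves.append((source, target))                    # move the largest disk directly
    solve_hanoi(n - 1, spare, target, source, moves)  # restack the n-1 disks on top
    return moves

for n in (3, 7, 10):
    print(f"{n} disks -> {len(solve_hanoi(n))} moves")  # prints 7, 127, and 1023

The point is that each move follows mechanically from the last, so a system genuinely executing the procedure shouldn’t stall at a fixed depth; that’s what makes the models’ failure, even with the algorithm in hand, so striking.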

 

With talk of human-level AI, popularly referred to as Artificial General Intelligence (AGI), arriving as soon as next year, Apple’s study suggests that we might be further away from realising AGI than the big players in the industry claim … so we’ll see who’s right.
