Fabrizio Degni’s Post

I believe you have already read "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity," a paper released in June 2025 by Apple researchers that drew the spotlight because it reinforced the idea that advanced AI systems known as Large Reasoning Models suffer a "complete accuracy collapse" when faced with high-complexity tasks and even reduce their reasoning effort past a certain difficulty threshold, challenging the comforting tale that these models truly "think."

🛜 Source: https://shorturl.at/bTzwt

So... where is the news? It seems we have already forgotten papers such as:

- Stochastic Parrots and Pattern Matching
Back in 2021, Emily Bender and her colleagues introduced the term "stochastic parrots" to describe LLMs. Why? These models excel at identifying statistical patterns within their extensive training datasets, but they lack comprehension and understanding of the content they generate. The Apple study corroborates this, demonstrating that LLMs perform optimally within their established pattern-matching capabilities but fail when tasks require real understanding beyond rote memorization.

- AI Skepticism by Gary Marcus
Gary Marcus has been skeptical of deep learning's potential for years, arguing that it lacks the symbolic, systematic scaffolding needed for true general intelligence: human reasoning relies on structured frameworks (think logic, rules, and abstract concepts), while deep learning leans heavily on statistical correlations. The Apple study reinforces Marcus's point: when LLMs face complex problems requiring systematic planning, their lack of a deeper framework leaves them floundering.

- Synthetic Reasoning Datasets
In 2022, Abulhair Saparov and He He introduced PrOntoQA, a synthetic dataset designed to test logical reasoning in LLMs. They found that while these models can handle individual deduction steps (e.g., "If A, then B"), they fail at proof planning, i.e., figuring out which path to take among multiple valid options. Apple's research builds on this and shows that LLMs not only struggle with planning but also give up when the going gets tough, reducing their effort instead of rising to the challenge.

This might feel like déjà vu because, in many ways, it is. But Apple's paper builds on this groundwork and adds a new perspective: the problem of complexity.

- Accuracy collapse: in high-complexity scenarios, LRMs' performance drops to near-zero accuracy.
- Effort reduction: unlike humans, who might double down when faced with a tough problem, LRMs slack off.
- The illusion exposed: the paper's title says it all. What we perceive as "thinking" is really a façade of fluency, built on pattern recognition rather than genuine cognitive processes.

Ethically, it is misleading to say "reasoning" or "thinking," as it creates a cascade of problems discussed in the first comments.

#ArtificialIntelligence #Illusion #AI #Reasoning #LLMs #Innovation #Apple


As anticipated, it is misleading and unacceptable to adopt human-like terms such as #reasoning or #thinking when, in reality, neither is happening! First, we have a problem of #overtrust and misplaced reliance: the perception of a system that actually "thinks" could lead us to assume these tools can handle complex decision-making reliably... NO! The Apple research reveals this assumption to be dangerously #false. We cannot make informed #decisions about AI assistance if the capabilities are fundamentally misrepresented, because from someone / something described as "thinking," we reasonably expect:

- Consistent performance across difficulty levels
- The ability to recognize when problems exceed capabilities
- Transparent communication about confidence levels

But once again it is a NO! Let's call things by their real name: that's pure #marketing rather than accurate description. Current #AIGovernance frameworks often assume capabilities will improve continuously with scale and compute. The Apple research suggests this assumption is flawed, so regulations built on it may need complete revision.

The paper: https://machinelearning.apple.com/research/illusion-of-thinking

#AIEthics


In the main post I mentioned 3 research works / points of view; here is where to find them:

📚 On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜
From Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, Shmargaret Shmitchell
🌐 Source: https://dl.acm.org/doi/10.1145/3442188.3445922

"Using these pretrained models and the methodology of fine-tuning them for specific tasks, researchers have extended the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks for English. In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models."

#ArtificialIntelligence #Academy #AIEthics #Research #StochasticParrots


Thanks for sharing, Fabrizio. Worth reading.

This really highlights how much we still misunderstand AI’s true capabilities. It’s a good reminder that fluency doesn’t equal thinking, especially when complexity pushes these models to their limits. Fabrizio Degni

I'm not sure why we still have to discuss this. We know AIs don't understand, reason or think. And we know we humans have this thing where we try to make everything human-like so we can be delusional and pretend we understand more than we actually do. But can't we just get over this thing we have with words? We know it's not "real" intelligence nor thinking, or do we? Are we actually able to define those words in a unique way? We aren't, because we really don't want to. So we should just skip this whole "what AI really is" debate. It's pretty useless. Topic for a future episode, Fabrizio Degni!

Great take! Apple’s paper reframes the hype. When tasks get complex, these systems don’t rise, they retreat. This isn’t intelligence, it’s mimicry hitting a wall. The danger isn’t what AI knows but what we assume it understands. Thanks for the share, Fabrizio Degni

