Apple Study Reveals Critical Flaws in AI's Logical Reasoning Abilities

Monday October 14, 2024 7:31 am PDT by Hartley Charlton

Apple's AI research team has uncovered significant weaknesses in the reasoning abilities of large language models, according to a newly published study.

The study, published on arXiv, outlines Apple's evaluation of a range of leading language models, including those from OpenAI, Meta, and other prominent developers, to determine how well they handle mathematical reasoning tasks. The findings reveal that even slight changes in the phrasing of a question can cause major discrepancies in model performance, undermining the models' reliability in scenarios that require logical consistency.

Apple draws attention to a persistent problem in language models: their reliance on pattern matching rather than genuine logical reasoning. In several tests, the researchers demonstrated that adding irrelevant information to a question—details that should not affect the mathematical outcome—can lead to vastly different answers from the models.

One example given in the paper involves a simple math problem asking how many kiwis a person collected over several days. When irrelevant details about the size of some kiwis were introduced, models such as OpenAI's o1 and Meta's Llama incorrectly adjusted the final total, despite the extra information having no bearing on the solution.
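
To make the setup concrete, here is a minimal Python sketch of this kind of test. The name, quantities, and wording are illustrative assumptions, not the figures from Apple's paper. The point is only that the added clause about size changes the text of the question but not the arithmetic behind it.

```python
# Illustrative sketch only: the name, numbers, and wording below are
# hypothetical and are not taken from Apple's paper.

def build_question(name, counts, distractor=None):
    """Return a word problem asking for the total kiwis picked."""
    days = ["Friday", "Saturday", "Sunday"]
    parts = [f"{name} picks {n} kiwis on {day}." for day, n in zip(days, counts)]
    if distractor:
        parts.append(distractor)  # irrelevant detail, should not change the answer
    parts.append(f"How many kiwis does {name} have?")
    return " ".join(parts)


def correct_total(counts):
    """The ground truth depends only on the counts, never on the distractor."""
    return sum(counts)


counts = [40, 50, 80]
plain = build_question("Sam", counts)
noisy = build_question(
    "Sam", counts, "Five of Sunday's kiwis were a bit smaller than average."
)

print(plain)
print(noisy)
print(correct_total(counts))  # 170 for both versions of the question
```

A model that subtracts the five smaller kiwis from the total is reacting to the surface form of the question rather than to what is actually being asked.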

We found no evidence of formal reasoning in language models. Their behavior is better explained by sophisticated pattern matching—so fragile, in fact, that changing names can alter results by ~10%.

This fragility in reasoning prompted the researchers to conclude that the models do not use real logic to solve problems but instead rely on sophisticated pattern recognition learned during training. They found that "simply changing names can alter results," a potentially troubling sign for the future of AI applications that require consistent, accurate reasoning in real-world contexts.
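
The name-swapping idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `brittle_solver` is a deliberately naive toy, not a real language model and not the researchers' method, and the names and numbers are made up. It simply shows how accuracy can be measured across variants of one problem that differ only in the name.

```python
# Illustrative sketch only: `brittle_solver` is a toy stand-in, not a real
# language model, and the names and numbers are hypothetical.
import re

TEMPLATE = (
    "{name} picks {a} kiwis on Friday and {b} kiwis on Saturday. "
    "How many kiwis does {name} have?"
)


def make_variants(names, a, b):
    """Same problem, same numbers; only the name changes."""
    return [(TEMPLATE.format(name=n, a=a, b=b), a + b) for n in names]


def brittle_solver(question):
    """Toy pattern matcher: only adds correctly for a name it has 'seen'."""
    numbers = [int(x) for x in re.findall(r"\d+", question)]
    if question.startswith("Sam "):
        return sum(numbers)
    return numbers[0]  # wrong shortcut for unfamiliar surface forms


def accuracy(solver, variants):
    """Fraction of variants the solver answers correctly."""
    correct = sum(1 for question, truth in variants if solver(question) == truth)
    return correct / len(variants)


variants = make_variants(["Sam", "Priya", "Luis", "Mei"], 12, 30)
print(accuracy(brittle_solver, variants))  # 0.25: only the "Sam" variant is right
```

A solver that actually performed the addition would score 1.0 on every variant, so any gap between variants reflects sensitivity to wording rather than to the math.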

According to the study, all models tested, from smaller open-source models such as Llama to proprietary models such as OpenAI's GPT-4o, showed significant performance degradation when faced with seemingly inconsequential variations in the input data. Apple suggests that AI might need to combine neural networks with traditional, symbol-based reasoning, an approach called neurosymbolic AI, to achieve more accurate decision-making and problem-solving abilities.

Tags: Apple Research, Artificial Intelligence

Top Rated Comments

Timpetus

1 day ago at 07:42 am

If this surprises you, you've been lied to. Next, figure out why they wanted you to think "AI" was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?

Score: 57 Votes

johnediii

1 day ago at 07:42 am

All you have to do to avoid the coming rise of the machines is change your name. :)

Score: 30 Votes

Mitthrawnuruodo

1 day ago at 07:44 am

This shows quite clearly that LLMs aren't "intelligent" in any reasonable sense of the word; they're just highly advanced at (speech/writing) pattern recognition.

Basically electronic parrots.

They can be highly useful, though. I've used ChatGPT (4o with canvas and o1-preview) quite a lot for tweaking code examples to show in class, for instance.

Score: 24 Votes

applezulu

1 day ago at 07:59 am

Quoting Timpetus: If this surprises you, you've been lied to. Next, figure out why they wanted you to think "AI" was actually thinking in a way qualitatively similar to humans. Was it just for money? Was it to scare you and make you easier to control?

Much of it is just popular hype from people who don't know enough to know the difference. Think of the NY Times article that sort of kicked it all off in the popular media a couple of years ago. The writer seemed convinced that the AI was obsessing over him and actually asking him to leave his wife. The actual transcript, for anyone who's seen this stuff back through the decades, showed the AI program bouncing off programmed parameters and being pushed by the writer into shallow territory where it lacked sufficient data to create logical interactions. The writer, and most people reading it, however, thought the AI was being borderline sentient.

The simpler Occam's razor explanation for why AI businesses have rolled with that perception, or at least haven't tried much to refute it, is that it provides cover for the LLM "learning" process that steals copyrighted intellectual property and then regurgitates it in whole or in collage form. The sheen of possible sentience clouds the theft ("people also learn by consuming the work of others") as well as the plagiarism ("people are influenced by the work of others, so what then constitutes originality?"). When it's made clear that LLM AI is merely hoovering, blending and regurgitating with no involvement of any sort of reasoning process, it becomes clear that the theft of intellectual property is just that: theft of intellectual property.

Score: 24 Votes