AWS’s Automated Reasoning Group has a way to fight AI hallucinations

IT Brew sits down with Byron Cook, a VP and distinguished scientist at Amazon, to discuss how automated reasoning could make AI even better.

Caroline Nihill is a reporter for IT Brew who primarily covers cybersecurity and the way that IT teams operate within market trends and challenges.

AI tools can hallucinate at exactly the wrong moment. They might even go rogue and delete a database.

It’s exactly those scenarios that keep IT pros up at night when deciding how to integrate AI into their current tech stacks. Fortunately, some AI experts are working on ways to boost AI’s reasoning abilities while lessening the chances of hallucinations.

Byron Cook, a VP and distinguished scientist at Amazon and a part-time program manager at the Defense Advanced Research Projects Agency (DARPA), is one of those experts. He leads Amazon Web Services’ Automated Reasoning Group, which builds tools with provable security and automated validation of responses and outputs.

Automated reasoning could make AI tools more usable and powerful, according to Cook, who says it gives humans a way to determine whether an AI output is accurate.

“Steve Jobs used to say that the computer was the bicycle for the mind, and the idea was that the fastest mammal is a human on a bicycle,” Cook said.

Logic tools and machine learning tools are a new kind of bicycle for the mind, Cook continued: “It’s still humans making decisions about what should be true and not true, and making sure those things are true.”

Explain it to me like I’m in 5th grade. As any IT pro who’s seen an LLM confidently spit out a ream of made-up payroll data will tell you, AI can hallucinate, particularly when it doesn’t know the actual answer to a prompted question.

What’s a good way to squish these hallucinations? With automated reasoning, an LLM could logically deduce the validity of an answer before it actually shows anything to the end user. For example, a user might ask their LLM, “Do sharks live on land?” The automated reasoning within the LLM might deduce that because (a) sharks are fish and (b) no fish live on land, (c) sharks therefore do not live on land.
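
To make that deduction concrete, here is what the check could look like inside a solver. This is a minimal sketch using the open-source Z3 theorem prover’s Python bindings; the article doesn’t name a specific tool, so Z3 and the propositional encoding below are illustrative assumptions rather than a description of how any particular LLM does it.

```python
# Minimal sketch, assuming the Z3 SMT solver's Python bindings (pip install z3-solver).
from z3 import Bools, Solver, Implies, And, Not, unsat

sharks_are_fish, fish_on_land, sharks_on_land = Bools(
    "sharks_are_fish fish_on_land sharks_on_land")

s = Solver()
s.add(sharks_are_fish)        # a. sharks are fish
s.add(Not(fish_on_land))      # b. no fish live on land
# Bridge fact: if sharks are fish and sharks live on land, then fish live on land.
s.add(Implies(And(sharks_are_fish, sharks_on_land), fish_on_land))

# Ask whether "sharks live on land" can hold alongside the premises.
s.add(sharks_on_land)
result = s.check()
print(result)                 # unsat: the claim contradicts the premises
assert result == unsat
```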

In a similar fashion, automated reasoning could apply logic to questions relevant to IT professionals, such as whether a code snippet will run on an endless loop, or if a series of racks will actually fit inside a data center of certain dimensions.
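
As a rough illustration of the second kind of question, the sketch below asks Z3 whether a target number of racks can be laid out in a room. Every dimension and count is an invented placeholder, not a real data center specification.

```python
# Hedged sketch with Z3: can NEEDED racks fit in a room of these dimensions?
from z3 import Ints, Solver, sat

rows, racks_per_row = Ints("rows racks_per_row")

RACK_W, RACK_D = 60, 120      # rack footprint in cm (assumed)
AISLE = 120                   # aisle clearance per row in cm (assumed)
ROOM_W, ROOM_D = 1200, 1500   # room dimensions in cm (assumed)
NEEDED = 100                  # racks we want to place (assumed)

s = Solver()
s.add(rows >= 1, racks_per_row >= 1)
s.add(racks_per_row * RACK_W <= ROOM_W)     # each row fits across the room
s.add(rows * (RACK_D + AISLE) <= ROOM_D)    # rows plus aisles fit front to back
s.add(rows * racks_per_row >= NEEDED)       # enough racks in total

if s.check() == sat:
    print(s.model())          # one concrete layout that satisfies every constraint
else:
    print("no layout fits")
```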

To better understand the underpinnings of automated reasoning, Cook pointed to basic algebra, where one has to solve for the value of x. Manipulating symbols according to a defined set of rules is symbolic reasoning, and with AI, that symbolic reasoning happens at a huge scale.
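
In solver terms, that whiteboard exercise looks like the snippet below, again using Z3 as an illustrative stand-in; the equation itself is arbitrary.

```python
# The whiteboard problem handed to a solver (Z3, used here for illustration).
from z3 import Int, Solver, sat

x = Int("x")
s = Solver()
s.add(2 * x + 6 == 20)        # solve for x

if s.check() == sat:
    print(s.model()[x])       # prints 7
```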

“That’s the same kind of thing you were doing on the whiteboard in fifth grade,” Cook said. “Prior to generative AI and agentic AI, this area was a thing, but it was the thing for safety-critical systems or when there was enough critical mass to do it and the consequences of getting the details wrong were bad enough.”

Now automated reasoning can apply to a much broader array of AI-related tasks. And with increased usage of AI agents, where AI performs complex tasks with relatively few humans in the loop, there’s an increased need to get things right.

When it’s wrong. If Cook’s approach could produce an internal proof system to validate a chatbot’s answers, why not apply it to every AI model out there?

Cook told IT Brew that there are some things that automated reasoning isn’t quite as helpful for. For example, it might not function as well when confronted with ambiguous scenarios, such as whether an experimental physics concept is “true.”

Cook offered the “Barber of Seville” paradox as something that an automated reasoning system wouldn’t be able to answer. The puzzle goes like this: The barber of Seville shaves all the men in town who don’t shave themselves. So who shaves the barber?

Even the most sophisticated automated reasoning system can't figure out an impossible paradox.
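
Encoded for a solver, the paradox comes back “unsatisfiable”: the tool can’t say who shaves the barber, but it can confirm that no consistent answer exists. Below is a minimal sketch in Z3; the encoding is an illustrative assumption, not Cook’s.

```python
# The barber paradox, encoded for Z3 (illustrative encoding).
from z3 import DeclareSort, Function, BoolSort, Const, ForAll, Not, Solver

Person = DeclareSort("Person")
shaves = Function("shaves", Person, Person, BoolSort())
barber = Const("barber", Person)
x = Const("x", Person)

s = Solver()
# The barber shaves exactly those people who do not shave themselves.
s.add(ForAll(x, shaves(barber, x) == Not(shaves(x, x))))

print(s.check())   # unsat: no one can satisfy the rule, so the question has no answer
```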

How much are hallucinations really happening? Even as AI grows in capability, hallucinations continue to present a pervasive problem. For example, hallucination rates on newer AI systems ran as high as 79% in one test, according to the New York Times.

McKinsey & Company’s 2025 “State of AI” report found that AI inaccuracy is one of the two big risks that organizations are working to mitigate, next to cybersecurity. More than half (51%) of respondents said their enterprises had seen at least one negative instance related to use of AI, while one-third reported that these consequences stemmed from AI inaccuracy.

Cook said that automated reasoning could be a way for AI to show its work and prove the answer to a question via a simple audit, assuring IT pros and consumers alike that an LLM is working as it should.

“One can imagine, for financial services applications or travel, that you could allow the agent or allow the chatbot to answer your questions or book your tickets, and then get an artifact that can be checked independently, showing why those decisions were the right decisions to make,” Cook said.
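
For a rough sense of what such an independently checkable artifact could look like, the toy sketch below tracks each constraint in a made-up booking check and, when they conflict, asks Z3 for the unsatisfiable core: the specific premises that ruled the booking out. The scenario, labels, and numbers are all invented for illustration.

```python
# Toy "show your work" check for a booking agent; the unsat core is the audit artifact.
from z3 import Ints, Solver, unsat

flight, hotel = Ints("flight hotel")

s = Solver()
s.assert_and_track(flight >= 400, "min_flight_price")
s.assert_and_track(hotel >= 300, "min_hotel_price")
s.assert_and_track(flight + hotel <= 600, "total_budget")

result = s.check()
print(result)                 # unsat: no booking satisfies every constraint
if result == unsat:
    # The core names the constraints that jointly rule the booking out,
    # a small artifact that can be rechecked independently of the agent.
    print(s.unsat_core())
```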

