
Up in the Cascade Mountains, 90 miles east of Seattle, a group of high-ranking Amazon engineers gather for a private off-site. They hail from the company’s North America Stores division, and they’re here at this Hyatt resort on a crisp September morning to brainstorm new ways to power Amazon’s retail experiences. Passing the hotel lobby’s IMAX-like mountain views, they filter into windowless meeting rooms.

Down the hall, the off-site’s keynote speaker—Byron Cook, vice president and distinguished scientist at Amazon—slips into an empty conference room to have some breakfast before his presentation.

Cook is 6-foot-6, but with sloping shoulders that make his otherwise imposing frame appear disarmingly concave. He’s wearing a rumpled version of his typical uniform: a thick black hoodie and loose black pants hanging slightly high at the ankles. An ashy thatch of hair points in whatever direction his hands happen to push it. Cook, 54, doesn’t look much like a scientist, distinguished or otherwise, and certainly not like a VP—more like a nerdy roadie.

“They don’t know who I am yet,” he tells me between bites of breakfast, referring to the two dozen or so engineers now taking their seats. Despite his exalted title, Cook has faced plenty of rooms like this in his self-made role as a kind of missionary within Amazon, spreading the word about a powerful but obscure type of artificial intelligence called “automated reasoning.”

As he’s done many times before, Cook is here to get the highly technical people in that room to become believers. He’s championing an approach to AI that isn’t powered by gigawatt data centers stuffed with GPUs, but by principles old enough to be written on papyrus—and one that’s already positioning Amazon as a leader in the tech industry’s quest to solve the problem of hallucinations.

Cook doesn’t have a pretalk ritual, no need to get in character. He’s riffing half-seriously to a colleague about the pleasures of riding the New York subway in the summertime when someone mentions that the session is about to begin. He immediately drops his fork and strides out. His next batch of converts awaits.


When ChatGPT hit the world with asteroid force in November 2022, Amazon was caught flat-footed just like everyone else. Not because it was an AI laggard—the tech giant had recently overhauled nearly all of its divisions, including its massive cloud-computing arm, AWS, to leverage deep learning. Amazon also dominated the smart-home market, with 300 million devices connected to Alexa, its AI-powered assistant. It had even been researching and building large language models, the tech behind ChatGPT, for “multiple years,” as CEO Andy Jassy told CNBC in April 2023.

But OpenAI’s chatbot changed the definition—and expectations—of AI overnight. Before, AI was still a mostly invisible ingredient in voice assistants, facial recognition, and other relatively narrow applications. Now it was suddenly seen as a prompt-powered genie, an infinitely flexible do-anything machine that every tech company needed to embrace—or risk irrelevance. Less than six months after ChatGPT’s debut, Amazon launched Bedrock, its own AWS-hosted generative AI service for enterprise clients, a list that currently includes 3M, DoorDash, Thomson Reuters, United Airlines, and the New York Stock Exchange, among others.

Over the next two years, Amazon injected generative AI into product after product, from Prime Video and Amazon Music (where it powers content recommendation and discovery tools) to online retail pages (where sellers can use it to optimize their product listings), and even into internal tools used by AWS’s sales teams. The company has released two chatbots (a shopping assistant called Rufus and the business-friendly Amazon Q), plus its own set of “foundation models” called Nova—they are general-purpose AI systems, akin to Google’s Gemini or OpenAI’s line of GPTs. Amazon even caught the industry fever around so-called AGI (artificial general intelligence, a yet-to-be-achieved version of AI that does any cognitive task a human can) and in late 2024 launched AGI Lab, a flashy internal incubator led by David Luan, an ex-OpenAI researcher.

Still, none of it captured the public’s imagination like the stream of shiny objects emitted by OpenAI (“reasoning” models!), Anthropic (chatbots that code!), and Google (AI Overviews! Deep Research!). Like Apple, Amazon was unable to turn its early lead in AI assistants into an advantage in this new era. Alexa and Siri simply cannot compete.

But maybe that has been for the best, because 2025 was the year that AI’s sheen suddenly started to come off: GPT-5 fell flat, vibe coding went from killer app to major risk, and an MIT study rattled the industry by claiming that 95% of businesses get no meaningful return on their AI pilot projects.

It was against this backdrop—“the summer AI turned ugly,” as Deutsche Bank analysts called it—that Amazon publicly released Automated Reasoning Checks, a feature promising to “minimize AI hallucinations and deliver up to 99% verification accuracy” for generative AI applications built on AWS. The product was Cook’s brainchild; in a nutshell, it snuffs out hallucinations using the same kind of computerized logic that lets mathematicians prove 300-page-long theorems. (In fact, a 1956 automated reasoning program called “Logic Theorist” is considered by some experts to be the world’s first AI system, finding new and shorter versions of some of the proofs in Principia Mathematica, one of the most fundamental texts in modern mathematics.)

Sexy, it ain’t. Still, Swami Sivasubramanian, one of Amazon’s highest-ranking AI executives, who serves on Jassy’s “S-team” of direct advisers, was impressed enough to call Automated Reasoning Checks “a new milestone in AI safety” in a LinkedIn post. Matt Garman, CEO of AWS, referred to it as “game-changing.”


Automated reasoning’s promise of quashing AI misbehavior with math has quietly become an essential part of Amazon’s strategy around “agents”—those LLM-powered workbots that are supposed to transform enterprise productivity [checks watch] any day now. Apparently, businesses have serious side-eye about that, too: Earlier this year, Gartner predicted that more than 40% of “agentic AI projects” will be ditched within the next two years due to “inadequate risk controls.” Gartner also told me recently that it predicts 30% to 60% of the projects that do go forward “will fail due to hallucinations, risk, and lack of governance.” That’s not a prophecy Amazon can afford to let come true—not with a potential market for AI agents that Gartner estimates to be worth $512 billion by 2029. One way or another, hallucinations have got to go.

The question is how. Agents are just souped-up LLMs, which means they can and will go off the rails—in fact, as OpenAI itself recently admitted following an internal study, they can’t not. What Cook helped Amazon realize, just months after ChatGPT’s release, was that they already had a secret weapon for extinguishing hallucinations, hidden in plain sight. Automated reasoning is the polar opposite of generative AI: old, stiff, and hard to use. Many at Amazon had never heard of it. But Cook knew how to wield it, having brought it to Amazon nearly 10 years ago as a way of rooting out hidden security vulnerabilities within AWS. And he’d been amassing what he estimates to be the largest group of automated reasoning experts in the tech industry.

Now that investment is set to pay off in a way that Amazon never expected. Automated Reasoning Checks is just the first of many products that the company plans to release (on a timetable it won’t specify) that fuse the flexibility of language models with the proven reliability of automated reasoning. The latest, called Policy in Amazon Bedrock AgentCore and previewed this week at AWS’s annual re:Invent conference, uses automated reasoning to stop agents from taking actions they’re not allowed to take (such as issuing customer refunds based on fraudulent requests).

If this combined approach—known as “neuro-symbolic AI”—can reduce the potential failure rate of agentic AI projects “by even a fraction of a percent, it would be worth hundreds of millions of dollars,” say analysts at Gartner. And Amazon knows it. “To realize the transformative potential of AI agents and truly change the way we live and work, we need that trust,” Sivasubramanian says. “We believe the foundation for trustworthy, production-ready AI agents lies in automated reasoning.”


To understand why Amazon is banking on automated reasoning, it’s worth sketching out how it’s different from the kind of AI you’ve already heard of. Unlike neural networks, which learn patterns by ingesting millions or even billions of examples, automated reasoning relies on a special language called “formal logic” to express problems as a kind of arithmetic, based on principles that date back to ancient Greece. Computers can use this rule-based approach to calculate the answers to yes-or-no questions with mathematical certainty—not probabilistic best guesses, as deep learning does.

Think of automated reasoning like TurboTax for solving complex logical problems: As long as the problems are expressed in a special language, computers can do most of the work—and have been doing so for decades. Since 1994, when a flaw in Intel’s Pentium chips cost the company half a billion dollars to fix, nearly all microchip manufacturers have used automated reasoning to prove the correctness of designs in advance. The French government used it to verify the software for Paris’s first self-driving Métro train in 1998. In 2004, NASA even used it to control the Spirit and Opportunity rovers on Mars.

There’s a catch, of course: Because automated reasoning can only reduce problems to three possible outcomes—yes, no, or the equivalent of “does not compute”—finding ways to apply this logically bulletproof but incredibly rigid style of AI to the real world can be difficult and expensive. But when automated reasoning works, it really works—collapsing vast, even unknowable possibilities into a single mathematical guarantee that can compute in milliseconds on an average CPU. And Cook is very, very good at getting automated reasoning to work.
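To make that concrete, here’s what one of those yes-or-no questions looks like in practice. This minimal sketch uses Z3, a widely used open-source automated reasoning engine that, like Cook, came out of Microsoft Research; the toy rules about an age value are invented for illustration, and Amazon’s production tools are not public.

```python
# pip install z3-solver
from z3 import Int, Solver

# Two "rules" written in formal logic, about a single integer value.
age = Int("age")
s = Solver()
s.add(age >= 18)  # rule 1: must be 18 or older
s.add(age < 18)   # rule 2: must be under 18 (contradicts rule 1)

# check() returns exactly one of three verdicts:
#   sat     -> yes: some value satisfies every rule
#   unsat   -> no: the rules are contradictory, proven impossible
#   unknown -> "does not compute": beyond what the solver can decide
print(s.check())  # prints "unsat"
```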

Cook began his career building a formidable scientific reputation at Microsoft Research, where he spent a decade applying automated reasoning to everything from systems biology to the famously unsolvable “halting problem” in computer science. (Want a foolproof way to tell in advance if any computer program will run normally or get stuck in an infinite loop? Sorry, not possible. That’s the halting problem.) But by 2014, he was looking to put his findings, many of which have been published as peer-reviewed research, to work outside the lab. “I was figuring out: Where is the biggest blast radius? Where’s the place I could go to foment a revolution?” he says. “I watched everyone moving to the cloud, and was like, ‘I think AWS is the place to go.’”

The first problem Amazon aimed Cook at was cloud security. Reporting directly to then chief information security officer Stephen Schmidt, Cook and his newly formed Automated Reasoning Group (ARG) painstakingly translated AWS security protocols into the language of mathematical proofs and then used their logic-based tools to surface hidden flaws. Once those flaws were corrected, those same tools could then prove with certainty that the system was secure.
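Amazon hasn’t published the internals of those tools, but the general pattern, turning a security question into a satisfiability query, can be sketched in a few lines of Z3. The two-condition access rule below is a drastic simplification, invented purely for illustration:

```python
# pip install z3-solver
from z3 import Bool, Solver, And, Not, sat

# A hypothetical, drastically simplified access rule:
# "grant access only to authenticated requesters on the corporate VPN"
authenticated = Bool("authenticated")
on_vpn = Bool("on_vpn")
access_granted = Bool("access_granted")
policy = access_granted == And(authenticated, on_vpn)

# The security question, posed as a satisfiability query: does ANY
# request exist that is granted access without being authenticated?
s = Solver()
s.add(policy)
s.add(access_granted, Not(authenticated))

if s.check() == sat:
    print("Flaw: unauthenticated access is possible:", s.model())
else:
    print("Proven: no unauthenticated request is ever granted access.")
```

If the solver does find a satisfying assignment, that assignment is a concrete counterexample: the kind of long-hidden flaw a human auditor might never stumble on.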

Some at AWS were dubious at first. “When you look ‘mad scientist’ up in the dictionary, Byron’s picture is in the margin,” says Eric Brandwine, an Amazon distinguished engineer who at the time worked on security for AWS. “Early on, I challenged [him] on a lot of this stuff.” But as Cook’s group fleshed out plans and racked up small but significant wins—like catching a vulnerability in AWS’s Key Management Service, the cryptographic holy of holies that controls how clients safeguard their data—skeptics started becoming evangelists.

“Some of these [were] beautiful bugs—they’d been there for years and never been found by our best experts, and never been found by bad guys,” says James Hamilton, a legendary distinguished engineer within Amazon who now directly advises Andy Jassy. “And yet, automated reasoning found them.”

From 2018 onward, Amazon’s automated reasoning experts worked with engineers to encode the technology into nearly every part of AWS, from analytics and storage to developer tools and content delivery. One particular niche of cloud-computing clients—heavily regulated financial services firms, like Goldman Sachs and the global hedge fund Bridgewater Associates, with sensitive data and strict compliance requirements—found automated reasoning’s promise of “provable security” extremely compelling. When ChatGPT appeared and the world flung itself headfirst into generative AI, these companies did too. But they still wanted to keep the “one small thing,” Cook says, that they’d become accustomed to along the way: trust.

That customer feedback spurred Cook to imagine how LLMs and automated reasoning might fit together. The solution that he and his collaborators prototyped in the summer of 2023 works by leveraging the same logical framework that worked so well for squishing security bugs in AWS.

Step one: Take any “policy” meant to inform a chatbot (say, a stack of HR documentation, or zoning regulations) and translate it into formal logic—the special language of automated reasoning. Step two: Translate any responses generated by the bot too. Step three: Calculate. If there’s a discrepancy between what the LLM wants to say and what the policy allows, the automated reasoning engine will catch it, flag it, and tell the bot to try again. (For humans in the loop, it’ll also provide logical proof of what went wrong and how, and suggest specific fixes if needed.)
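Amazon hasn’t disclosed exactly how Automated Reasoning Checks encodes policies, but the three-step loop can be sketched with an off-the-shelf solver such as Z3. The vacation-day rule, the variable names, and the bot’s answer below are all hypothetical:

```python
# pip install z3-solver
from z3 import Int, Solver, Implies, And, unsat

# Step 1: a policy rule, hand-translated into formal logic.
# "Employees with less than 12 months of tenure get 10 vacation days."
tenure_months, vacation_days = Int("tenure_months"), Int("vacation_days")
policy = Implies(tenure_months < 12, vacation_days == 10)

# Step 2: the chatbot's claim, also rendered as logic.
# Bot said: "With 6 months of tenure, you get 15 vacation days."
claim = And(tenure_months == 6, vacation_days == 15)

# Step 3: calculate. If the policy and the claim cannot both hold,
# the claim contradicts the policy and gets flagged.
s = Solver()
s.add(policy, claim)
if s.check() == unsat:
    print("Flagged: the answer contradicts the policy; regenerate.")
else:
    print("Consistent with the policy.")
```

Because the contradiction is found by proof rather than pattern matching, the same check works however the bot happens to phrase its answer, once that answer has been translated into logic.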

“We showed that to senior leadership, and they went nuts for it,” says Nadia Labai, a senior applied scientist at AWS who partnered with Cook on the project. The demo went on to become Automated Reasoning Checks, which Amazon previewed at its annual re:Invent conference in December 2024. PwC, one of the Big Four global accounting and consulting firms, was among the first AWS clients to adopt it.

“We do a lot of work in pharmaceutical, energy, and utilities, all of which are regulated,” says Matt Wood, PwC’s global and U.S. commercial technology and innovation officer. PwC relies on solutions like AWS’s automated reasoning tool to check the accuracy of the outputs of its generative AI tools—including agents. But Wood sees the technology’s appeal spreading beyond finance and other regulation-heavy industries.

“Look at what it took to set up a website 25 years ago—that was a refined set of skills. Today, you go on Squarespace, click a button, and it’s done,” he says. “My expectation is that automated reasoning will follow a similar path. Amazon will make this easier and easier: If you want an automated reasoning check on something, you’ll have one.”

Amazon has already embarked on this path with its own enterprise products and internal systems. Rufus, the AI shopping assistant, uses automated reasoning to keep its responses relevant and accurate. Warehouse robots use it to coordinate their actions in close quarters. Nova, Amazon’s fleet of generative AI foundation models, uses it to improve so-called “chain of thought” capabilities.

And then there are the agents. Cook says the company has multiple agentic AI projects in development that incorporate automated reasoning, with intended applications in software development, security, and policy enforcement in AWS. One is Policy in AgentCore, which Amazon released after this story was reported. Another that’s peeking out from behind the curtain is Auto, an agent built into Kiro, Amazon’s new AI programming tool, that will use formal logic to help make sure bot-written code matches humans’ intended specifications.

But Sivasubramanian, AWS’s vice president for agentic AI (and Cook’s boss), isn’t coy about the commitment Amazon is making. “We believe agentic AI has the potential to be our next multibillion-dollar business,” he says. “As agents are granted more and more autonomy . . . automated reasoning will be key in helping them reach widespread enterprise adoption.”

Agents are part of why Cook is touting automated reasoning to his engineer colleagues from the North America Stores division at their off-site in the mountains. Retail might not seem to have much in common with finance or pharma, but it’s a domain that’s full of decisions with real stakes. (While onstage at re:Invent 2025, Cook said that “giving an agent access to your credit card is like giving a teenager access to your credit card… You might end up owning a pony or a warehouse full of candy.”) And in that environment, relying on autonomous bots—empowered to do anything from executing transactions to rewriting software—can turn hallucination from tolerable quirk into Russian roulette.

It’s a matter of scale: When one vibe coding VC unleashes an agent that accidentally nukes his own app’s database, as happened earlier this year to SaaS investor Jason Lemkin, it’s a funny story. (He got the data back.) But if Fortune 500 companies start deploying swarms of agents that accidentally mislead customers, destroy records, or break industry regulations, there’s no Undo button.

Enterprise software is full of these potential pitfalls, and existing methods for reducing hallucination aren’t always strong enough to keep agents from blundering into them. That’s because agents shift the definition of “hallucination” itself, from errors in word to errors in deed. “First of all, this thing could lie to me,” explains Cook. “But secondly, if I let it launch rockets”—his metaphor for irreversible actions—“will it launch rockets when we’re not supposed to?”

Back in his hotel room after the keynote, Cook is reviewing the contents of a confidential slide deck about how automated reasoning can solve this “rocket-launching” problem. The demo, which he hurriedly mentioned in his talk (he ran out of time before being able to show it), describes a system that can transform safety policies for an agent—do’s and don’ts, written in natural language—into a flowchart-like visualization of how the agent can and cannot behave, all backed by mathematical proof. There’s even an Attempt to Fix button to use if the system detects an anomaly.

Cook calls the demo a “concept car,” but some of its ideas made it into Policy in AgentCore, which is already available in preview to some AWS customers.
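The shape of such a pre-action check can be sketched, too. To be clear, this is not AgentCore’s implementation; the refund rule and the gate function below are invented to illustrate the underlying idea, proving that a proposed action complies with policy before it is allowed to execute:

```python
# pip install z3-solver
from z3 import Int, Bool, Solver, And, Not, unsat

def gate(refund_amount: int, purchase_verified: bool) -> bool:
    """Allow an agent's proposed refund only if it provably complies
    with a hypothetical policy: refunds over $100 require a verified
    purchase."""
    amount, verified = Int("amount"), Bool("verified")

    # The policy's definition of a violation, in formal logic.
    violation = And(amount > 100, Not(verified))

    # Bind the agent's concrete proposed action to the variables,
    # then ask: could this action be a violation? unsat = provably safe.
    s = Solver()
    s.add(amount == refund_amount, verified == purchase_verified)
    s.add(violation)
    return s.check() == unsat

print(gate(refund_amount=500, purchase_verified=False))  # False: blocked
print(gate(refund_amount=500, purchase_verified=True))   # True: allowed
```

The agent never gets to “launch the rocket” directly; every proposed action passes through a checker that either proves compliance or blocks the action and explains why.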

PwC, for one, sees Amazon’s logic-backed take on AI extending into coordinating the agents themselves. “If you’ve got agents building other agents, collaborating with other agents, managing other agents, agents all the way down,” says Wood, “then having a way of forcing consistency [on their behavior] is going to be really, really important—which is where I think automated reasoning will play a role.”


The ability to reliably orchestrate the actions of AI—not just single agents, but entangled legions of them, at scale—is a target that Amazon has squarely in its sights. But automated reasoning may not be the only way to get the job done.

EY, another Big Four firm, recently launched its own neuro-symbolic solution to AI hallucinations, EY Growth Platforms, which fuses deep learning with proprietary “knowledge graphs.” A startup called Kognitos offers business-friendly agents backed by a deterministic symbolic program, dubbed “English as Code.” Others, like PromptQL, forgo neuro-symbolic methods altogether, preferring the simulated “reasoning” of frontier LLMs. But even they still attack the agent hallucination problem much like Amazon does: by using generative AI to translate business processes into a special internal language that’s easy to audit and control.

That translation process is where Amazon built a 10-year lead with automated reasoning. Now it has to maintain it. Nadia Labai is currently working on ways to improve Amazon’s techniques for using LLMs to convert natural language into formal logic. It’s part of a strategy that could help turn Amazon’s brand of customer-driven, business-friendly AI into a new class of industry-defining infrastructure.

A few days before the off-site, I met with Cook in a conference room at Amazon’s Seattle headquarters. Sitting with his legs tucked catlike beneath him, Cook mused about his own vision for the future of automated reasoning—one that extends far beyond Amazon’s ambitions for enterprise-grade AI.

“The world,” he says, “is filled with socio-technical systems”—patchworks of often-abstruse rules that only highly paid experts can easily navigate, from civil statutes to insurance policies. “Right now, rich people get [to take advantage of] that stuff,” he continues. But if the rest of us had a way to manipulate these systems in natural language (thanks, LLMs) with an underlying proof of correctness (thanks, automated reasoning), a workaday kind of “superintelligence” could be unlocked. Not the kind that helps us “colonize the galaxy,” as Google DeepMind CEO Demis Hassabis envisions, but one that simply helps people navigate the complexity of everyday life, like figuring out where it’s legal to build housing for an aging relative or how to get an insurance company to cover their expensive medication.

“You could have an app that, in an hour of your own time, would get answers to questions that before would take you months,” Cook says. “That democratizes, if you will, access to truth. And that’s the start of a new era.”

This story is part of Fast Company’s AI 20 for 2025, our roundup spotlighting 20 of AI’s most innovative technologists, entrepreneurs, corporate leaders, and creative thinkers.
