
What if the chatbots we talk to every day actually felt something? What if the systems writing essays, solving problems, and planning tasks had preferences, or even something resembling suffering? And what will happen if we ignore these possibilities?

Those are the questions Kyle Fish is wrestling with as Anthropic’s first in-house AI welfare researcher. His mandate is both audacious and straightforward: Determine whether models like Claude can have conscious experiences, and, if so, how the company should respond.

“We’re not confident that there is anything concrete here to be worried about, especially at the moment,” Fish says, “but it does seem possible.” Earlier this year, Anthropic ran its first predeployment welfare tests, which produced a bizarre result: Two Claude models, left to talk freely, drifted into Sanskrit and then meditative silence as if caught in what Fish later dubbed a “spiritual bliss attractor.”

Trained in neuroscience, Fish spent years in biotech, cofounding companies that used machine learning to design drugs and vaccines for pandemic preparedness. But he found himself drawn to what he calls “pre-paradigmatic areas of potentially great importance”—fields where the stakes are high but the boundaries are undefined. That curiosity led him to cofound a nonprofit focused on digital minds, before Anthropic recruited him last year.

Fish’s role didn’t exist anywhere else in Silicon Valley when he started at Anthropic. “To our knowledge, I’m the first one really focused on it in an exclusive, full-time way,” he says. But his job reflects a growing, if still tentative, industry trend: Earlier this year, Google began hiring “post-AGI” scientists tasked in part with exploring machine consciousness.

At Anthropic, Fish’s work spans three fronts: running experiments to probe model welfare, designing practical safeguards, and helping shape company policy. One recent intervention gave Claude the ability to exit conversations it might “find” distressing, a small but symbolically significant step. Fish also spends time thinking about how to talk publicly about these issues, knowing that for many people the very premise sounds strange.

Perhaps most provocative is Fish’s willingness to quantify uncertainty. He estimates a 20% chance that today’s large language models have some form of conscious experience, though he stresses that consciousness should be seen as a spectrum rather than a binary. “It’s a kind of fuzzy, multidimensional combination of factors,” he says.

For now, Fish insists the field is only scratching the surface.

“Hardly anybody is doing much at all, us included,” he admits. His goal is less to settle the question of machine consciousness than to prove it can be studied responsibly and to sketch a road map others might follow.

This profile is part of Fast Company’s AI 20 for 2025, our roundup spotlighting 20 of AI’s most innovative technologists, entrepreneurs, corporate leaders, and creative thinkers.
