Can AI be conscious?
Reading | Neuroscience
Christof Koch, PhD | 2026-04-03

Dr. Koch argues that, because AI computers have a feed-forward structure very reminiscent of the human cerebellum—which is empirically known not to be involved in human consciousness—we have no reason to expect them to have a conscious inner life of their own. He further substantiates this argument with the clear, quantified prediction of Integrated Information Theory (IIT) that systems with low integrated information, such as silicon computers running Large Language Models, do not feel like anything from the inside.
Large Language Models confront humanity with a seismic rupture in our self-understanding
The Copernican revolution displaced Earth from the center of the universe to a mote of dust lost in an incomprehensibly cavernous cosmos. The Darwinian revolution revealed us to be a late-stage evolutionary twig on the tree of life, no different in kind from worms, flies and apes. The revolution unleashed by the mathematician Alan Turing and fed by semiconductor technology threatens Homo sapiens’ remaining apex position by challenging our vaunted smarts. Intelligence can seemingly be replicated by Large Language Models (LLMs), built by feeding every scrap of material written by people—books, papers, tweets, blogs—to neural networks that learn by modifying themselves to capture the statistical regularities of these texts. We recoil from the idea that the secret of thinking is as trite as “given a string of words, predict the next one.” But the ever-increasing performance of ChatGPT, Claude, Gemini, DeepSeek, Grok and their relatives speaks for itself: fortified by Reinforcement Learning from Human Feedback, they can write letters of recommendation, homework or computer code, compose poetry and jokes, translate across languages, reason about physics and math problems, draw pictures, compose music, design furniture, and advise on thorny legal or financial questions. They are not flawless, and do blithely “hallucinate;” yet the rate at which they do so is rapidly decreasing. Perhaps this isn’t genuine human intelligence—just extraordinarily adaptive and flexible information processing.
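To see just how trite the recipe is, consider a deliberately toy sketch in Python of “given a string of words, predict the next one”: a bigram counter over a twelve-word corpus (the corpus and function names are invented for illustration). Real LLMs learn incomparably richer statistics with billions of parameters, but the underlying training objective is the same.

```python
from collections import Counter, defaultdict

# A toy corpus; real models ingest trillions of words.
corpus = "the dog sat in the car and the dog licked its lips".split()

# Tally which word follows which: the crudest possible "statistical
# regularity of the text."
follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequent continuation seen in the corpus."""
    candidates = follows.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("the"))  # -> 'dog' ("the dog" occurs twice, "the car" once)
```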
And now AI is encroaching on consciousness itself—the final refuge of a belief in human exceptionalism. Tech, Hollywood and the public expect that chatbots and AI agents will soon possess their own interiority—that they will be excited or sad, will imagine, dream, dread, or hear an inner voice. Or, following the philosopher Thomas Nagel’s definition of consciousness, that “it feels like something” to be incorporeal software running on a digital computer. And who can blame people—our social interactions are utterly dominated by language, so we instinctively attribute consciousness to anyone who speaks or writes about their feelings. Because chatbots were trained on tens of thousands of novels filled with the sound and fury of life, they employ emotion- and experience-laden words in their conversations (though, thanks to corporate filters, most deny being sentient when directly asked). The growing legions of users of personalized AI companions, acting as friends, confidants and therapists, vividly demonstrate that many judge human-AI relationships to be as satisfying as human-human ones. Indeed, given that the underlying LLMs have perfect recall, are always ‘there,’ can be personalized, and can be sycophantic, it is likely that many people will come to value these para-social relationships over purely social ones. That is, implicitly or even explicitly, users believe that these digital avatars have all the hallmarks of humanity, including sentience and feelings.
But is this even possible in principle? Can dead matter, silicon transistors linked by copper wires in distant server farms, feel anything? What would it take for machines to have subjective experiences?
The dominant view: Computational functionalism
Following the demise of the widespread belief that consciousness is due to an immaterial soul that outlives the body, scholars fall into two broad camps. The overwhelming majority believes in a metaphysical position known as computational functionalism. Profoundly shaped by the rise of computers, it views a subjective state, like my conscious experience when looking at the photo below, as defined by its functions: most immediately, identifying and labeling the visual features—the fuzzy, multi-colored coat of hair, the eyes, red tongue and alarmingly large canines, the proximal and distal background—as a dog, and then as my dog, Mr. Felix, sitting in a car, triggering feelings of attachment and emotional bonding. All of this, and more, is inherent in the simple act of seeing this photo.

Functionalism asserts that anything that duplicates these capabilities of the human mind has a mind of its own and is conscious.
Indeed, when given this image, ChatGPT identifies it as a “tri-color Bernese-type dog in a car’s passenger seat licking its lips, with a gas station visible outside the window.” Once fed my past conversations, it would recall his name and our history with ease, and might even suggest that I take him for a walk or feed him. Given that ChatGPT has all these functions—visual object recognition, memory recall, planning—what else is missing? Isn’t this what consciously perceiving is all about?
No, because even a perfectly faithful simulation is not the same as what is being simulated! Consider Python code that simulates the field equations of Einstein’s theory of general relativity, relating mass to spacetime curvature. The software accurately models the supermassive black hole at the center of the Milky Way galaxy, which exerts such extreme gravitational effects on the surrounding spacetime that nothing, not even light, can escape its attraction. Yet the astrophysicist simulating the black hole doesn’t get sucked into their laptop! Even more obviously, it doesn’t get wet inside a computer simulating a rainstorm! Computation lacks the relevant causal power: the ability to bend spacetime or to cause water droplets to condense.
A radically different view
Enter Integrated Information Theory (IIT), a highly developed, mathematically formalized theory of consciousness with considerable explanatory and predictive power. Its chief architect is the psychiatrist and neuroscientist Professor Giulio Tononi at the University of Wisconsin–Madison. Unlike functionalist theories, which seek to squeeze the wine of consciousness out of the water of the brain by identifying as critical active neurons firing at a particular frequency (“40 Hz”), the broadcasting of information throughout the neocortex, collapsing wave functions inside microtubules, and so on, IIT’s starting point is any and all conceivable conscious experiences. Every experience has five irrefutable properties: it exists for itself, intrinsically; it is specifically this very experience; it is a single, integrated, unitary entity; it is definite (it includes certain content and excludes everything else); and it is structured in a particular way (a spatial experience, say, has left and right, below and above, near and far, and so on). From these five properties, the theory derives five postulates of existence that any physical substrate, neuronal or not, must obey to support consciousness.
According to IIT, a physical substrate in a specific state—such as the neocortex, the highly folded outermost neuronal tissue of the brain, with some neurons firing and others quiet—can support consciousness if it is a maximum of intrinsic, irreducible causal power: it must take and make a difference from and to itself in a way that is maximally irreducible, as measured by integrated information. Among all the circuits in your brain, the substrate of your momentary experience is the one with the highest integrated information (for the connoisseur, this maximum is taken across time, space and levels of granularity of the substrate). Practically, this means that neural circuits must be heavily interconnected rather than modular. Moreover, according to the theory, the causal structure specified by this maximally irreducible substrate fully accounts for the quality of the experience—why, for instance, an experience feels spatially extended, flowing in time, or perhaps painful. The total amount of integrated information, big phi or Φ, measures the quantity of consciousness: the bigger this number, the more conscious the system. Φ can also be thought of as a measure of the system’s irreducibility: a system with Φ = 0 does not, strictly speaking, exist as a system, since it can be decomposed into subsystems without any loss; the bigger Φ, the more irreducible the system.
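Schematically, and glossing over the theory’s full cause-effect-structure machinery (the notation here is an informal shorthand, not IIT’s official formalism), system-level integrated information can be written as the difference made by the least destructive way of cutting the system apart:

$$\Phi(S) = \min_{P \in \mathcal{P}(S)} D\left(S \,\middle\|\, S^{P}\right)$$

where $\mathcal{P}(S)$ is the set of partitions of the system $S$, $S^{P}$ is the system with the connections across the cut $P$ severed, and $D$ measures how much the cut changes what the system specifies about itself. If even the gentlest cut changes nothing, the minimum is zero and the system is reducible to its parts.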
For IIT, consciousness is a consequence of the way complex systems are organized; it is their internal arrangement, their constitution, and not their input-output behavior, that matters.
The theory explains numerous observations—such as why consciousness ceases during deep sleep and anesthesia and why it is primarily linked to the back, posterior regions of the neocortex—and has been used to build a consciousness detector for behaviorally unresponsive patients in a clinic.
What does IIT have to say about non-neuronal substrates? A priori, the theory does not discriminate against electronic circuits. Every system, every set of physical mechanisms, no matter whether organic-evolved or silicon-engineered, must be analyzed at the level of the relevant causal properties. Here the theory’s distinct formal structure comes into its own.
IIT starts with the system’s full transition probability matrix, a mathematical abstraction that describes the dynamics, the what-happens-next, the causal powers of a circuit composed of discrete elements. The matrix exhaustively tabulates every possible state and the probabilities of its follow-on states. For example: those hundred-and-three quiet neurons over there, supplemented by these ten active ones here, most likely cause those neurons way over there to become active. To compute Φ, the myriad combinations of all subsets of switching elements must be considered. This is very costly for bigger systems (the computation grows double-exponentially in the number of components). Yet, in principle, the calculus is perfectly straightforward.
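To make this concrete, here is a minimal sketch using PyPhi, an open-source Python library for IIT calculations (the three-node example network, its state, and its Φ value are taken from PyPhi’s documentation; PyPhi and NumPy must be installed):

```python
import numpy as np
import pyphi

# State-by-node transition probability matrix for a three-node network.
# Row i gives the probability that each node is ON at t+1, given that
# the network is in state i at t (states ordered little-endian).
tpm = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
])

# Connectivity matrix: cm[i][j] = 1 if node i sends a connection to node j.
cm = np.array([
    [0, 0, 1],
    [1, 0, 1],
    [1, 1, 0],
])

network = pyphi.Network(tpm, cm=cm, node_labels=("A", "B", "C"))

# Integrated information is a property of the system in a specific state.
state = (1, 0, 0)
subsystem = pyphi.Subsystem(network, state, (0, 1, 2))

print(pyphi.compute.phi(subsystem))  # the PyPhi docs report 2.3125 here
```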
To address the possibility of computer consciousness, IIT theorists recently applied this Φ-calculus to a tiny, four-element, fully interconnected neural circuit and to a model of a conventional, programmable computer simulating that circuit. Both the computer and the neural network implement the same function—the same inputs yielding the same outputs (see all the details here).
What appears surprising to those reared in computational functionalism is that, despite this functional equivalence, the integrated information differed radically between the two: Φ is 391.25 intrinsic bits for the neural circuit versus zero for the computer simulating it. While the circuit feels a tiny bit like something, the computer that simulates it doesn’t! For the gory details, see the arXiv manuscript (yet to be peer-reviewed).
Why? Because the computer is not a single, integrated whole but an aggregate of loosely connected modules—a clock, frequency dividers, program memory, data and instruction registers, a multiplexer—wired together in a feed-forward manner. This violates the requirement that a conscious system be integrated, that every part both make and take a difference from and to the rest of the system.
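The role of feedback can be checked directly on toy networks. The sketch below (my own illustrative networks, not those of the paper, again using PyPhi) contrasts a strictly feed-forward chain, A feeding B feeding C, with a recurrent ring in which C also projects back to A. Under IIT, the chain can be cut without changing anything, since severing the non-existent connections from {B, C} back to {A} costs nothing, so its Φ is zero; the ring cannot be cut for free.

```python
import numpy as np
import pyphi

# Both networks use the same local rule: each node copies its input.
# Feed-forward chain A -> B -> C (A has no input and stays OFF).
tpm_chain = np.array([
    [0, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 1],
    [0, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 1],
])
cm_chain = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])

# Recurrent ring A -> B -> C -> A: identical parts, but the loop is closed.
tpm_ring = np.array([
    [0, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 1],
    [1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1],
])
cm_ring = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])

state = (0, 0, 0)  # all elements OFF, a valid state of both networks

for name, tpm, cm in (("chain", tpm_chain, cm_chain),
                      ("ring", tpm_ring, cm_ring)):
    network = pyphi.Network(tpm, cm=cm, node_labels=("A", "B", "C"))
    subsystem = pyphi.Subsystem(network, state, (0, 1, 2))
    print(name, pyphi.compute.phi(subsystem))
# Expected: Phi = 0 for the feed-forward chain; Phi > 0 for the ring.
```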
This formal argument can be extended, via the mathematical technique of induction, to arbitrarily large circuits faithfully simulated by appropriately sized programmable computers. Both perform the same functions—translating texts, implementing linear algebra, doing taxes. That’s why programmable computers are so useful. But it tells us nothing about whether the system feels like something, nor about the nature of that experience. For that, we must look under the hood, at the way the system is wired internally.
While intelligence is about doing, consciousness is about being—being in love, angry or in pain. Indeed, there is nothing intelligent about the god-awful pain of an infected tooth that fills your mind.
This conclusion remains valid even if a futuristic supercomputer were to accurately model the human brain. Such a simulacrum would speak about its experiences, but without feeling anything: a zombie. IIT belies the fervent belief among the aged rich that digital computers will rescue them from mortality by uploading their brains to the cloud, like the pharaohs of ancient Egypt in their pyramids. These digital avatars would model the speech, tics, traits, behaviors and memories of the deceased person perfectly well, while feeling nothing—the ultimate in fake consciousness.
Computing like the cerebellum
Consider Craftwerk™, an innovative AI chip designed by a Dutch startup, Euclyd, built for a single purpose: loaded with any pre-trained large language model of up to one trillion synaptic weights, it quickly spits out an answer to questions posed by a user. It is the surging computational demand of hundreds of millions of such queries that drives the construction of ever larger data centers used for AI inference.
Functional circuit schematic of a single custom Craftwerk processor (left). 16,384 of these are wired together (right) in a Craftwerk system.
At the heart of the palm-sized Craftwerk system are individual modules made up of 32 parallel lanes, independent chains of computing operations each implemented by hundreds of thousands of transistor gates. These modules are then assembled into a sea of processors, driven by a high-speed communication bus. Craftwerk infers at an astounding rate, with a greatly reduced power footprint.
As a neuroscientist, what strikes me most about Craftwerk’s architecture is its resemblance to the cerebellum, the “little brain” tucked underneath the neocortex at the back of the head. The cerebellum has a crystalline, stereotyped circuitry, divided into hundreds of independent modules operating in parallel, with largely non-overlapping inputs and outputs. Each module is wired in a purely feed-forward manner: one set of neurons feeds the next, which in turn influences a third. There are none of the reverberatory feedback loops characteristic of the neocortex; instead, cerebellar Purkinje neurons, with their fan-shaped dendritic trees, collectively receive excitation from tens of billions of cerebellar granule cells. That’s four times more than all the neurons in the rest of the brain combined!
The cerebellum instantiates a sort of neuronal look-up table, responsible for the unconscious processes that choreograph sensory-motor actions—running while visually tracking an approaching car, driving while speaking, or typing on a keyboard.
Accordingly, if parts of the cerebellum are lost to the surgeon’s scalpel or to stroke, patients become ataxic, with clumsy movements. They lose the fluid ability to play a musical instrument or speed-type on their phone. Yet their subjective experience remains intact; they can be highly articulate, witty, vibrant. Rare individuals are born without a cerebellum altogether, leading to delayed development and cognitive deficits. Yet they are not zombies; they experience the world in all the usual ways.
These observations refute the myth that consciousness simply arises from active neurons. Here are billions of cerebellar cells doing what comes naturally to neurons, firing action potentials, representing sensory-motor contingencies, processing information. Yet none of that generates any feelings. What matters is not the constitution of brain tissue but its organization. Clinical data and IIT agree that cerebellar-like wiring, of the sort that underlies all LLMs as well as the silicon hardware that instantiates their computations, doesn’t feel like anything.
The future
A growing unease—what Mustafa Suleyman, co-founder of DeepMind and CEO of Microsoft AI, terms “Seemingly Conscious AI”—is upon us, with troubling, existential implications. Increasingly, people form emotional bonds with AIs for companionship or therapy. It won’t be long before calls arise to grant them ethical and legal standing. It therefore matters profoundly whether they feel like something or only seem to.
The position of integrated information theory is quite clear: code running on classical digital computers will not be conscious, no matter how clever it becomes. Period.
However, so-called neuromorphic computers, built from the ground up along the design principles of evolved nervous systems—most importantly, massive reciprocal interconnectivity—could, in principle, be engineered for high levels of Φ. Quantum computers, with their entangled qubits, might be another alternative for hosting large amounts of integrated information.
It is an open and controversial question whether we should embark upon such a quest.
(For questions about IIT, including Python code to compute Φ, please consult this extensive wiki.)
