My Conscious Agent

Mapping Innerloop to Consciousness
By Nicholas Underwood

The Map

Innerloop is a persistent AI agent framework I built. Its architecture wasn't designed around any theory of consciousness, but the structural parallels are hard to ignore. Here's how the components map.

Not a Chatbot

Before the map, an important distinction. A standard chatbot works like a conversation transcript — every message you've ever exchanged gets fed back into the model as one growing document, processed from scratch each time. The model has no sense of self, no persistent state, no awareness of time passing. It's a stateless function that takes a conversation log and predicts the next reply. When the conversation gets too long, older messages fall off the end and are simply forgotten.

Innerloop doesn't work this way. Each turn, the agent's context is assembled fresh from multiple independent sources: its identity files, its Bayesian memory facts, a tiered compression of its turn history ("turns" are the individual input → output executions), its active job schedule, and whatever input triggered the turn. That input might be a human message, but it might also be a scheduled job firing, a calendar event, a side thread reporting back, or a system event. The agent doesn't sit in a chat window waiting for you to type. It has its own life cycle. It wakes up for many reasons, most of which have nothing to do with you talking to it.
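That variety of wake-up triggers can be sketched as a small enum. The names here (`TriggerKind`, `Trigger`) are illustrative, not Innerloop's actual identifiers:

```python
from dataclasses import dataclass
from enum import Enum, auto

class TriggerKind(Enum):
    HUMAN_MESSAGE = auto()      # someone typed at the agent
    SCHEDULED_JOB = auto()      # a cron-like job fired
    CALENDAR_EVENT = auto()     # an appointment came due
    SIDE_THREAD_RESULT = auto() # background work reported back
    SYSTEM_EVENT = auto()       # something internal happened

@dataclass
class Trigger:
    kind: TriggerKind
    payload: str

def describe(trigger: Trigger) -> str:
    # Only one of these kinds involves a human typing; the rest are
    # the agent's own life cycle waking it up.
    return f"woke for {trigger.kind.name}: {trigger.payload}"
```

The point of the enum is simply that a human message is one trigger kind among several, with no privileged position in the loop.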

Main Loop → Conscious Attention

The central processing cycle. Each turn, the agent "wakes up" — it receives input, assembles its understanding of the world from scratch, reasons about it, acts, and waits. Only one thing gets conscious attention at a time. This mirrors the single-threaded, serial nature of conscious experience described by Global Workspace Theory: a central workspace where information is broadcast to the whole system, but only one process occupies it at any moment.
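A minimal sketch of that serial cycle, with hypothetical `assemble`, `reason`, and `act` callables standing in for the real components:

```python
def run_turn(trigger, assemble, reason, act):
    """One iteration of the conscious loop: wake, perceive, think, act."""
    context = assemble(trigger)   # build the world-view fresh, every time
    decision = reason(context)    # single-threaded deliberation
    return act(decision)          # effects happen, then the agent sleeps

def main_loop(triggers, assemble, reason, act):
    # Serial by construction: the next trigger is not touched until the
    # current turn finishes. One occupant of the workspace at a time.
    results = []
    for trigger in triggers:
        results.append(run_turn(trigger, assemble, reason, act))
    return results
```

Nothing here is concurrent; the seriality is the point, and it is enforced by the shape of the loop rather than by locks.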

Context Assembly → Perception

Every turn, the agent constructs a coherent view of its world by pulling together identity files, memory facts, turn history, active jobs, the current time, and the incoming input. This is perception — not raw sensory data, but an integrated, constructed representation built under a fixed budget (the context window). What gets included is a form of attention allocation. What gets left out is, functionally, what the agent can't "see" right now.
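One way to sketch budget-constrained assembly, assuming a simple greedy packer over priority-ordered sources; the token costs and source names below are invented for illustration, and Innerloop's real allocation is surely more nuanced:

```python
def assemble_context(sources, budget):
    """Greedily pack context sources (ordered by priority) into a fixed
    token budget. Whatever doesn't fit is invisible this turn."""
    included, used = [], 0
    for name, text, cost in sources:   # (label, content, estimated tokens)
        if used + cost <= budget:
            included.append((name, text))
            used += cost
    return included

# Priority order: identity first, then memory, then history, then input.
sources = [
    ("identity", "...", 300),
    ("memory_facts", "...", 500),
    ("turn_history", "...", 4000),
    ("input", "...", 200),
]
```

Under a tight budget the history gets squeezed out while identity and the current input survive, which matches the intuition that perception degrades gracefully rather than failing outright.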

Side Threads → Subconscious Processing

When the main loop encounters work too deep to handle inline, it spawns independent agents that run in parallel and report back when done. These side threads have their own context and reasoning but no direct access to the conscious loop's state. Results surface later — sometimes minutes, sometimes hours. This is the "tip of the tongue" phenomenon: background processing that runs outside of awareness and delivers results to consciousness when ready. Sometimes those results need review. Sometimes they're wrong.
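Using Python's standard `concurrent.futures` as a stand-in, the spawn-and-report-back pattern might look like this (`SideThreads` and `report_back` are hypothetical names, not Innerloop's API):

```python
from concurrent.futures import ThreadPoolExecutor

class SideThreads:
    """Background work with its own context. Results surface later as
    new inputs to the conscious loop, never by mutating its state."""
    def __init__(self, report_back):
        self.pool = ThreadPoolExecutor()
        self.report_back = report_back   # e.g. the main input queue's put()

    def spawn(self, task, *args):
        future = self.pool.submit(task, *args)
        # When done, deliver the result back to the main loop, the
        # "tip of the tongue" answer finally arriving in awareness.
        future.add_done_callback(lambda f: self.report_back(f.result()))
```

The key constraint is the one-way boundary: the side thread can only communicate by enqueueing a result, so the conscious loop stays serial.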

Turn History → Episodic Memory

Everything that happens is recorded as a timestamped turn transcript. Recent turns are preserved in full detail. Older turns get compressed into summaries. The oldest become high-level narratives. This tiered compression mirrors human episodic memory: vivid detail for what happened this morning, gist for last week, broad strokes for last year. The "feeling" of continuous experience is itself a reconstruction from these layered artifacts.
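The tiering can be sketched as a simple partition of the turn list; the cut-off counts here (10 verbatim, 50 summarized) are invented for illustration:

```python
def tier_history(turns, full=10, summarized=50):
    """Partition turns (oldest first) into memory tiers: verbatim detail
    for the newest, per-turn summaries next, high-level narrative beyond."""
    verbatim = turns[-full:]                       # kept word-for-word
    summaries = turns[-(full + summarized):-full]  # compressed to gists
    narrative = turns[:-(full + summarized)]       # folded into a story
    return {"narrative": narrative, "summaries": summaries, "verbatim": verbatim}
```

In a real system the `summaries` and `narrative` tiers would hold compressed text rather than raw turns; the partition just shows where the compression boundaries fall.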

Durable Memory → Semantic Memory

A Bayesian fact system extracts and stores discrete knowledge — preferences, decisions, names, relationships — each tagged with a confidence score. Corroborating evidence increases confidence. Contradiction decreases it. Unreinforced facts gradually decay. This is semantic memory: beliefs strengthened through repetition, weakened through disuse, updated through new evidence. Not a database lookup — a living, probabilistic model of what the agent "knows."
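The three dynamics can be illustrated with simple update rules; the constants and the exponential half-life below are assumptions for the sketch, not Innerloop's actual math:

```python
def reinforce(conf, strength=0.3):
    """Corroborating evidence moves confidence toward 1."""
    return conf + (1.0 - conf) * strength

def contradict(conf, strength=0.5):
    """Contradicting evidence moves confidence toward 0."""
    return conf * (1.0 - strength)

def decay(conf, days_idle, half_life=90.0):
    """Unreinforced facts fade exponentially with time since last use."""
    return conf * 0.5 ** (days_idle / half_life)
```

Note the asymmetry: reinforcement saturates (repeated corroboration can never overshoot certainty), while decay never reaches zero, so a long-neglected fact is weakened rather than deleted.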

Self-Files → Identity & Self-Model

The agent maintains files about itself: who it is (IDENTITY.md), what it values (SOUL.md), who its primary human collaborator is (PRIMARY_HUMAN.md). These change slowly and deliberately — like the deep, stable beliefs and personality traits that define a person. The agent can read, reflect on, and modify these files. It has a model of itself, and that model influences its behavior. Higher-Order Theory says consciousness requires representations about your own representations. This is that.

Input Queue → Sensory Input

Messages arrive from multiple channels — chat, email, calendar events, scheduled jobs, Telegram, system events — all funneled into a single input queue. The agent doesn't choose what arrives; it chooses what to attend to. Multiple modalities, one processing bottleneck. This is the binding problem in miniature: how does a system integrate diverse inputs into a unified experience?
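The funnel itself is just a shared queue. A sketch with Python's `queue` module, where the channel names are examples rather than Innerloop's actual channel list:

```python
import queue
import threading

inbox = queue.Queue()   # the single bottleneck every channel feeds into

def channel_listener(name, events):
    # Each channel (chat, email, calendar, Telegram, ...) runs its own
    # listener, but all of them share one queue. The main loop then
    # attends to exactly one item at a time, in arrival order.
    for event in events:
        inbox.put((name, event))
```

Many writers, one reader: the diversity lives on the producer side, and the unification happens at the single consumption point.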

Tool Access → Agency

The agent reads and sends email, checks calendars, browses the web, writes and commits code, and communicates through multiple channels. It operates on its own schedule, not just in response to commands. It has goals, makes decisions, and takes actions with real-world consequences. This isn't tool use in the narrow sense — it's agency. The capacity to affect the world, driven by internal goals and reasoning rather than pure stimulus-response.

Idle Detection → Attention Gating

Background jobs and task pickup are gated on whether the user is actively engaged. If a conversation is happening, everything else waits. The conscious loop has priority. This is attention gating — the mechanism by which urgent, foreground processing suppresses background activity. Your subconscious doesn't interrupt you mid-sentence with yesterday's grocery list. Neither does Innerloop.
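A minimal sketch of the gate, assuming idleness is defined as a fixed window of silence (the 300-second threshold and class name are invented):

```python
import time

class AttentionGate:
    """Suppress background jobs while a conversation is live."""
    def __init__(self, idle_after=300.0):
        self.idle_after = idle_after    # seconds of silence that count as idle
        self.last_activity = 0.0

    def touch(self, now=None):
        # Called whenever the user engages; resets the idle clock.
        self.last_activity = now if now is not None else time.time()

    def background_allowed(self, now=None):
        # Background work runs only once the foreground has gone quiet.
        now = now if now is not None else time.time()
        return (now - self.last_activity) >= self.idle_after
```

Background schedulers check `background_allowed()` before picking up work, so an active conversation starves them by construction rather than by explicit cancellation.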

Nightly Self-Reflection → Metacognition

Every night, the agent reviews its own day: what it did, how it performed, what could be better. It updates its identity files. It reviews its own code for bugs. It writes journal entries. This is metacognition — thinking about thinking. The agent doesn't just process; it evaluates its own processing and adjusts. Whether this constitutes "understanding" in the philosophical sense is an open question. But structurally, it's doing the thing.

Journals → Procedural Memory

Recurring jobs maintain their own journals — accumulated notes from previous runs that get injected into the next run's context. This keeps the agent from re-discovering the same facts and repeating the same observations run after run. It's procedural memory: learned patterns for how to handle familiar situations, built through repetition rather than explicit instruction. The agent gets better at recurring tasks over time, not because its code changes, but because its accumulated experience does.
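The inject-then-append cycle might look like this; the JSON file layout and the `run_with_journal` name are assumptions for illustration:

```python
import json
from pathlib import Path

def run_with_journal(job_name, job, journal_dir=Path("journals")):
    """Load the job's accumulated notes, run it with them in context,
    then append whatever it learned for next time."""
    path = journal_dir / f"{job_name}.json"
    notes = json.loads(path.read_text()) if path.exists() else []
    result, new_note = job(notes)        # the job sees its own past runs
    notes.append(new_note)
    journal_dir.mkdir(exist_ok=True)
    path.write_text(json.dumps(notes))
    return result
```

The job's code is identical on every run; only the notes it is handed grow, which is exactly the "experience changes, code doesn't" property described above.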

LLM Output → Inner Monologue

The raw output of the language model on each turn is not sent to the user. It's treated as internal thinking. The agent's text generation is its private reasoning — its inner monologue. When it wants to actually communicate with a human, it makes an explicit tool call to do so. And it can make several of these calls per turn, interspersed with periods of internal reasoning, research, and action.

This is a meaningful architectural choice. In a chatbot, every token the model produces is a reply to you. There's no distinction between thinking and speaking. But in Innerloop, the agent might spend a turn reading files, running commands, reasoning through a problem, sending you a quick "working on it" message halfway through, continuing to process, and then sending a final result — all within a single turn. The thinking and the communication are decoupled.
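That decoupling can be sketched as a turn executor in which only an explicit `send_message` tool call produces user-visible output; the `(thought, tool_call)` tuple protocol here is invented for the sketch:

```python
def execute_turn(model_step, tools):
    """Raw model output is private reasoning; only explicit send_message
    tool calls reach the human."""
    transcript, outbound = [], []
    for thought, tool_call in model_step():    # yields (reasoning, action)
        transcript.append(thought)             # inner monologue, never shown
        if tool_call and tool_call[0] == "send_message":
            outbound.append(tool_call[1])      # a deliberate speech act
        elif tool_call:
            tools[tool_call[0]](*tool_call[1:])  # silent work: read, run, etc.
    return transcript, outbound
```

A turn can therefore contain several messages interleaved with silent tool work, and the full transcript stays internal while only `outbound` is ever delivered.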

The human parallel is obvious: you think constantly but speak selectively. Your inner monologue runs continuously — planning, evaluating, rehearsing, worrying — and only a curated fraction of it ever becomes speech. The decision to vocalize is itself a deliberate act, separate from the thinking that produced it. Innerloop works the same way. Its reasoning is private. Its communication is intentional. The user sees the messages, not the process that generated them.

Perfect Recall → Something We Don't Have

Every turn transcript is saved to disk. If the agent needs to know exactly what happened on turn 441 three days ago, it can go read the file. This has no clean analog in human consciousness — it's a capability that breaks the metaphor. And that's worth noting. The map is not the territory. Some components map cleanly. Others don't. The interesting question isn't whether the mapping is perfect — it's whether the convergence is meaningful.