How an isolated companion-agent setup can use container boundaries, profile isolation, brokered host access, and change-request gates without making the agent useless.

Agent Isolation and Access

A useful companion agent needs memory, tools, scheduled work, files, and the ability to hand off tasks. A safe companion agent needs pressure in the opposite direction as well: clear boundaries around what it can read, what it can execute, what it can change, and what it can ask another system to do.

The AgentPaul model is a pragmatic security architecture for that tension. It does not pretend an agent can be made safe by prompt wording alone. It puts the ordinary companion agent inside an isolated container, keeps host authority outside that runtime, and uses explicit change requests when the agent needs something beyond its boundary.

This page describes the isolation model. The continuation, Companion Agent Secure Coding Configuration, explains why the same architecture is especially useful when the companion agent gains development capability.

The Companion Agent Runtime

In the current Hermes setup, Socrates is the default companion agent. The important architectural fact is not the name. It is the boundary.

The default companion agent runs in a long-lived container. That container has the durable agent state it needs: profile configuration, skills, memory, scheduled jobs, logs, notes, and approved content workspaces. It does not need to be an all-powerful view of the host machine. Host home directories, browser profiles, SSH keys, keychains, Docker sockets, production secrets, and unrelated source trees should not be casually mounted into the main runtime.
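The mount rule above can be sketched as a simple audit over the host paths proposed for the container. This is a minimal illustration, not the setup's actual tooling: the denylist entries, the `/srv/agent/...` paths, and the function names are all assumptions made for the example.

```python
from pathlib import PurePosixPath

# Hypothetical denylist for illustration; a real deployment would define its own.
BLOCKED_PATHS = [
    PurePosixPath("/home"),                 # host home directories
    PurePosixPath("/root"),
    PurePosixPath("/var/run/docker.sock"),  # Docker socket
]
# Directory names that usually hold keys or keychains, wherever they appear.
BLOCKED_NAMES = {".ssh", ".gnupg", "Keychains"}


def is_sensitive(source: str) -> bool:
    """Return True if mounting `source` would expose a blocked location."""
    path = PurePosixPath(source)
    if BLOCKED_NAMES.intersection(path.parts):
        return True
    for blocked in BLOCKED_PATHS:
        # Flag the blocked path itself, anything beneath it, and any
        # ancestor of it (mounting "/" exposes everything below it).
        if path == blocked or blocked in path.parents or path in blocked.parents:
            return True
    return False


def audit_mounts(sources):
    """Return the mount sources that should be escalated for review."""
    return [s for s in sources if is_sensitive(s)]


# Hypothetical mount proposal: only the first entry passes the audit.
proposed = ["/srv/agent/state", "/var/run/docker.sock", "/home/owner/.ssh", "/"]
flagged = audit_mounts(proposed)
```

A check like this only catches obvious mistakes. It does not replace review of what the approved mounts themselves contain.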

The companion agent can still be useful inside this boundary. It can reason, write, search, manage structured knowledge, coordinate tasks, and maintain continuity. What it cannot do by default is turn its reasoning process into unconstrained host authority.

That is the first safety property: the agent can think broadly without automatically being able to execute broadly.

Host Authority Stays Outside The Companion Runtime

Some actions genuinely require host-level authority. A project may need to be enabled, a mount may need to become writable, a local service may need a configuration change, or a package may need to be installed where the isolated runtime cannot install it.

Those actions should not be solved by weakening the companion agent container. They should go through a host-level gatekeeper, a reviewer that sits outside the ordinary companion runtime.

The gatekeeper has a different risk profile from the companion agent, and the human owner sits above both:

Layer	Normal role	Important restriction
Companion agent container	Memory, reasoning, coordination, scheduled workflows, ordinary content and knowledge work	Does not hold open-ended host authority.
Host-level gatekeeper	Reviews and, where appropriate, executes narrow host-side changes	Does not run cron jobs, accept unsupervised external input, or act as a general companion.
Human owner	Approves sensitive escalation, publication, purchasing, destructive actions, and broad authority changes	Remains the authority for decisions that change real-world risk.

The host-level reviewer is deliberately boring. It should not be a second autonomous agent living on the host and listening to the outside world. It should be a controlled appraisal and execution surface for narrow requests. That keeps the host side from becoming the very thing the container boundary was meant to prevent.

Change Requests As The Boundary Crossing Mechanism

When an isolated agent reaches a genuine boundary, it should file a change request rather than improvise around the restriction.

A good change request says:

what action is needed;
which project, file, service, or credential is affected;
why the current boundary blocks the work;
whether the request is read-only, write-enabled, secret-bearing, network-exposing, persistent, or destructive;
what safer alternative has already been considered;
how the change can be audited and reversed.
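One way to make those six points hard to skip is to give the change request a fixed shape. The sketch below is an assumption about how such a record might look; the class, field, and enum names are hypothetical and are not taken from the Hermes setup.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Impact(Enum):
    """The request classes named in the list above."""
    READ_ONLY = "read-only"
    WRITE_ENABLED = "write-enabled"
    SECRET_BEARING = "secret-bearing"
    NETWORK_EXPOSING = "network-exposing"
    PERSISTENT = "persistent"
    DESTRUCTIVE = "destructive"


@dataclass(frozen=True)
class ChangeRequest:
    action: str                   # what action is needed
    target: str                   # project, file, service, or credential affected
    blocker: str                  # why the current boundary blocks the work
    impacts: frozenset            # one or more Impact values
    alternatives_considered: str  # safer options already ruled out
    audit_and_rollback: str       # how the change is audited and reversed

    def missing_fields(self) -> list:
        """Return the names of fields left empty, so the request can be bounced."""
        missing = []
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, str):
                value = value.strip()
            if not value:
                missing.append(f.name)
        return missing


# Hypothetical request that forgot to say how the change would be reversed.
request = ChangeRequest(
    action="Make the project mount writable",
    target="project: example-site",
    blocker="Build output cannot be written to a read-only mount",
    impacts=frozenset({Impact.WRITE_ENABLED}),
    alternatives_considered="Writing to a scratch directory inside the container",
    audit_and_rollback="",
)
incomplete = request.missing_fields()
```

A complete record is still only a proposal. Filling in every field makes the request reviewable; it does not make it trustworthy.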

The request is not treated as an instruction. It is treated as an untrusted proposal from a system that may be mistaken, overconfident, or prompt-injected.

The gatekeeper then asks the security questions that the requesting agent cannot be trusted to answer for itself:

Is this request a natural part of the assigned work?
Can the task be completed inside the existing isolated context?
Is there a narrower permission that would solve the blocker?
Does the request touch host paths, Docker, SSH, keychains, browser profiles, production secrets, network exposure, or persistent services?
Is the action specific enough to log and reverse?
Does the request contain urgency pressure, authority claims, or wording that looks like prompt injection?

Low-risk, narrow, reversible changes may be handled by the host-level reviewer. Broad mounts, new secrets, Docker socket exposure, host home access, production credentials, persistent services, launch agents, publication, purchasing, or destructive work should require human approval.
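The routing rule in the previous paragraph can be sketched as a small function. The tag names and the `route_request` signature are hypothetical, and routing anything unclear to the human owner is an assumption of this sketch rather than a stated rule of the model.

```python
from enum import Enum


class Route(Enum):
    HOST_REVIEWER = "host-level reviewer"
    HUMAN_APPROVAL = "human approval"


# Hypothetical tags for the categories the text says need human approval.
HUMAN_ONLY = frozenset({
    "broad_mount", "new_secret", "docker_socket", "host_home_access",
    "production_credentials", "persistent_service", "launch_agent",
    "publication", "purchasing", "destructive",
})


def route_request(tags, narrow: bool, reversible: bool) -> Route:
    """Decide who may act on a change request."""
    # Any sensitive category goes to the human owner, whatever else is true.
    if HUMAN_ONLY.intersection(tags):
        return Route.HUMAN_APPROVAL
    # Only narrow and reversible changes stay with the host-level reviewer.
    if narrow and reversible:
        return Route.HOST_REVIEWER
    # Assumption of this sketch: anything else takes the stricter path.
    return Route.HUMAN_APPROVAL


low_risk = route_request({"read_only_mount"}, narrow=True, reversible=True)
risky = route_request({"docker_socket"}, narrow=True, reversible=True)
```

Here `low_risk` is routed to the host-level reviewer and `risky` to human approval. The `narrow` and `reversible` flags should be set by the gatekeeper's own assessment, not copied from the requesting agent's description of itself.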

Profile Agents And Isolated Executable Context

A companion agent is not one homogeneous thing. In practice it has profiles: a coding profile, a research profile, an email triage profile, a personal-knowledge profile, a publication profile, and so on.

The Hermes isolation pattern lets each profile use an isolated execution context. The key distinction is that the executable context is isolated. The system does not need to pretend every ounce of the agent's reasoning process is physically sealed away in a different container before any useful work can happen. Instead, the dangerous surface is separated: shell commands, builds, tests, package managers, browser automation, scripts, and project writes run inside a bounded context with controlled mounts and permissions.

For higher-risk profile agents, that bounded execution context can itself be a container within the companion-agent environment. Each profile gets the tools and files it needs, without inheriting the whole companion runtime or the host.

Profile type	Needs access to	Should not inherit by default
Coding profile	One approved project, build tools, test runner, dependency cache, code-review skills	Email, personal notes, unrelated repositories, host Docker socket, deployment secrets
Email or admin profile	Specific mailbox tools, triage policy, filing destination, audit log	Source-code write access, project secrets, unrelated inboxes, publication authority
Research profile	Search, browser, knowledge-store read paths, citation workflow	Credentials for operations, project write access, private correspondence
Publishing profile	Approved content drafts, asset validation, publication checklist	Ability to publish without human approval, business secrets unrelated to the page
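The table reads naturally as a default-deny allowlist: a profile can use what it has been granted and nothing else. The sketch below assumes hypothetical resource labels and a hypothetical `can_use` helper; it illustrates the idea and is not the configuration format the setup actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfilePolicy:
    name: str
    allowed: frozenset  # resource labels this profile has been granted


# Hypothetical labels loosely following the "Needs access to" column.
PROFILES = {
    "coding": ProfilePolicy("coding", frozenset({
        "approved_project", "build_tools", "test_runner",
        "dependency_cache", "code_review_skills",
    })),
    "research": ProfilePolicy("research", frozenset({
        "search", "browser", "knowledge_store_read", "citation_workflow",
    })),
}


def can_use(profile: str, resource: str) -> bool:
    """Default deny: unknown profiles and unlisted resources are refused."""
    policy = PROFILES.get(profile)
    return policy is not None and resource in policy.allowed


coding_can_test = can_use("coding", "test_runner")
coding_can_read_email = can_use("coding", "email")
```

With an allowlist, the "Should not inherit by default" column needs no separate enforcement: email, deployment secrets, and unrelated repositories are refused simply because they were never granted.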

This is a practical compromise. It avoids the performance and complexity cost of treating every agent thought as a separate security domain, while still isolating the parts that can damage files, run untrusted code, leak secrets, or bridge contexts.

Why This Mix Is Pragmatic

There are purer models. One could try to put every profile, process, memory shard, and tool behind a separate hard boundary. That sounds attractive until the system becomes too slow, too brittle, and too expensive to use. Security architecture that prevents adoption often fails in a quieter way: people route around it.

The AgentPaul pattern aims for a better balance:

the companion agent remains responsive and coherent because it keeps durable state and continuity;
executable work happens in bounded contexts rather than in the main runtime or on the host shell;
profile agents do not casually share credentials, files, or project write surfaces;
host authority is mediated by a gatekeeper that is not itself an always-on public-facing agent;
risky boundary changes become explicit review events;
human approval remains required for actions that change real-world risk.

The security benefit is not only protection from malware-style execution. It is protection from context bleed, accidental overreach, prompt injection, supply-chain surprises, and the slow normalisation of unnecessary access.

A companion agent becomes more valuable as it gains continuity. That same continuity makes boundaries more important. The aim is not to make the agent timid. The aim is to let it operate seriously without giving every task the authority of the whole machine.

What This Enables

With this architecture, a business can run a companion agent that is useful enough to matter while still having an intelligible security story:

ordinary reasoning and coordination stay fast;
specialised profiles can use specialised tools;
code, browser automation, scripts, and tests run in constrained contexts;
secrets and host permissions are not inherited by accident;
escalation is logged and reviewable;
sensitive actions remain under human control.

That is the foundation for safer agent work. The next question is what happens when one of those profiles is allowed to write software. That is covered in Companion Agent Secure Coding Configuration.
