Enterprise code bastion: Claude works on the code with no file or shell access
A protective boundary between an AI agent and an enterprise codebase. Claude reasons about the code and solves the task — but is stripped of native tools. Every action goes through a filtered mediator and runs inside an isolated container; the agent only gets the result back.
Problem
A large organization cannot hand an AI tool uncontrolled access to source code, secrets, and infrastructure. But it still wants Claude to understand the codebase and do real work in it. The naive setup gives the agent native file and shell tools — and that is exactly the access security cannot sign off on: nothing stops the agent from reading a secret, writing outside the task, or running a destructive command.
Methodology
Don't constrain the agent with prompts — constrain it with architecture. Remove the native tools entirely and route every interaction through a filtered mediator that executes inside a sandbox.
1. Project + task is the unit of work. There is a projects/ directory and a tasks/ directory. Each «project + task» pair spins up a dedicated Docker container with only that project mounted at /workspace. Sessions don't see each other.
2. Claude with no tools. Claude Code launches with --tools "", so native read, write, and command execution are off. On its own the agent cannot read, change, or run anything in the system.
3. Access only through MCP. Instead of native tools, Claude talks to an MCP server over HTTP (externalized beyond the project in production). The server is bound to the session and operates on its container. Exposed tools: ping, echo, secureBash.
4. Strict filtering. Every command is validated before execution (in production, by a small dedicated LLM acting as the gate). Destructive and out-of-scope actions are cut off before they run.
5. Isolated execution. A command that passes the filter runs via docker exec inside the session container, not on the host. The result is returned to Claude.
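Steps 1, 4, and 5 can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the repository's code: the function names, the gate and runner callables, and the python:3.12-slim image are all hypothetical. Only the Docker CLI flags used (run -d --name -v -w, and exec) are standard.

```python
import subprocess
from typing import Callable, Sequence

# Assumption: any image that ships bash would do; the source does not name one.
IMAGE = "python:3.12-slim"


def run_argv(session_id: str, project_dir: str) -> list[str]:
    """Build the docker run argv: one container per session, one project at /workspace."""
    return [
        "docker", "run", "-d",
        "--name", session_id,
        "-v", f"{project_dir}:/workspace",
        "-w", "/workspace",
        IMAGE, "sleep", "infinity",  # keep the container alive for later exec calls
    ]


def exec_argv(session_id: str, command: str) -> list[str]:
    """Build the docker exec argv: the command runs in the container, not on the host."""
    return ["docker", "exec", session_id, "bash", "-c", command]


def docker_runner(argv: Sequence[str]) -> str:
    """Run argv without a host shell and return stdout followed by stderr."""
    done = subprocess.run(argv, capture_output=True, text=True, timeout=60)
    return done.stdout + done.stderr


def secure_bash(
    session_id: str,
    command: str,
    gate: Callable[[str], bool],
    runner: Callable[[Sequence[str]], str] = docker_runner,
) -> dict:
    """Filter first, execute second; a rejected command never reaches Docker."""
    if not gate(command):
        return {"allowed": False, "output": ""}
    return {"allowed": True, "output": runner(exec_argv(session_id, command))}
```

Passing the gate and the runner in as parameters keeps the order of operations explicit (validate, then execute) and lets the flow be exercised with a fake runner when no Docker daemon is available.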
Enterprise-grade guarantees:
| Control | How it's enforced |
|---|---|
| Workload isolation | One container per «project + task»; only the needed project mounted; sessions isolated |
| Least privilege | No native tools (--tools ""); the only path is the MCP server with a limited operation set |
| Perimeter control | The MCP guard is externalized; code, secrets, and infra are never handed to the agent directly |
| Operation filtering | Each command validated (in prod by a separate LLM); destructive actions blocked before execution |
| Auditability | Every MCP request logged (logs/mcp.log, JSON Lines) — a full trail for review and compliance |
| Revocation | Session registry (sessions.csv) and containers that can be stopped/removed at any time |
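The auditability row can be illustrated with a small JSON Lines writer. This is a sketch only: the source states the path (logs/mcp.log) and the format (JSON Lines), while the record fields and function names below are assumptions chosen for illustration.

```python
import json
import time
from pathlib import Path


def log_mcp_request(log_path, session_id, tool, arguments, allowed):
    """Append one JSON object per line; field names here are hypothetical."""
    record = {
        "ts": time.time(),
        "session_id": session_id,
        "tool": tool,
        "arguments": arguments,
        "allowed": allowed,
    }
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_log(log_path):
    """Parse the trail back into a list of dicts for review."""
    with Path(log_path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

One object per line means the trail can be appended to without rewriting the file and filtered with ordinary line-oriented tools during review.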
The Claude invocation — note what's missing:
# start an isolated session for a project+task pair
./scripts/start_session.py demo demo-001
# → generates idp-<uuid>, launches a container with projects/demo → /workspace
# → records the pairing in sessions.csv
# build and run the Claude command
eval "$(./scripts/claude_command.py demo demo-001)"
# the generated command looks like:
IDP_SESSION_ID=idp-aae519cd... \
claude --append-system-prompt "..." \
-p "Find out whether this project has a packages.json" \
--tools ""
# IDP_SESSION_ID          binds to the session; substituted into the MCP URL in .mcp.json
# --append-system-prompt  adds the guard rules from system-prompt.md
# -p                      the task text from tasks/demo-001.md
# --tools ""              kills native tools; the only path left is MCP (secureBash, ...)
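A command builder along these lines could assemble that invocation. This is a hedged sketch, not the contents of claude_command.py: the function name and signature are hypothetical, and the flags are taken as they appear in the generated command above.

```python
import shlex


def build_claude_command(session_id: str, system_prompt: str, task: str) -> str:
    """Return a shell-safe command line with native tools disabled."""
    argv = [
        "claude",
        "--append-system-prompt", system_prompt,
        "-p", task,
        "--tools", "",  # the empty string is preserved as '' by shlex.join
    ]
    env_prefix = f"IDP_SESSION_ID={shlex.quote(session_id)}"
    return f"{env_prefix} {shlex.join(argv)}"
```

Quoting with shlex matters here because the output is passed to eval: task text that contains shell metacharacters must arrive at claude as a single argument rather than being interpreted by the host shell.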
Artifact
github.com/dobryakov/enterprise-code-bastion (Python, Docker, MCP-over-HTTP). A working bastion: session launcher, command builder, the MCP server with secureBash, and a CSV session registry. Runnable, not a slide.
Where it breaks
- The filter is itself a model. Using an LLM as the command gate means the gate has a prompt-injection and jailbreak surface of its own. For a hard guarantee you want a deterministic allowlist alongside the model, not the model as the sole arbiter; where exactly that line sits is the architectural conversation.
- secureBash still runs arbitrary commands inside the container. Isolation is only as strong as the container hardening: non-privileged execution, controlled network egress, resource limits. A compromised container is still a blast radius. The boundary protects the host, not everything inside the box.
- A flat-file registry doesn't scale to governance. sessions.csv is a fine append-only journal, but it carries no concurrency control, no RBAC, no who-approved-what. At organization scale you need a real session store and an approval workflow behind it.
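The first point, a deterministic allowlist alongside the model, might look like the sketch below. The allowlisted binaries, the forbidden character set, and the function names are hypothetical and far from a complete policy. The point is only the shape: the model verdict can narrow what the deterministic check allows, never widen it.

```python
import shlex

# Hypothetical policy: a handful of read-only binaries, no shell metacharacters.
ALLOWED_BINARIES = {"ls", "cat", "grep", "head", "wc"}
FORBIDDEN_CHARS = set(";|&`$<>\n")


def deterministic_allow(command: str) -> bool:
    """Pure string check with no model involved, so it cannot be talked around."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar parse errors
        return False
    return bool(tokens) and tokens[0] in ALLOWED_BINARIES


def gate(command: str, model_verdict: bool) -> bool:
    """Both checks must pass; the model alone is never the sole arbiter."""
    return deterministic_allow(command) and model_verdict
```

For example, gate("cat package.json", True) passes, while gate("cat $(which rm)", True) is rejected by the deterministic layer even though the model verdict is positive.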
For whom and why
This is the security model for AI-on-code where compliance has to sign off: isolation, least privilege, per-operation filtering, full audit. It's not a tutorial on running an agent — it's the architecture conversation about where the line falls between giving the agent enough to be useful and keeping it physically unable to act outside its bounds. The result: Claude reasons and solves, but cannot do anything beyond what's permitted — no tools, no direct access, every command verified and sandboxed.