// guide

Sandboxing Claude Code (and every other coding agent).

An agent that can run shell commands is only as safe as the machine you give it. This guide covers what an unsandboxed agent can actually touch, how machine isolates it in a per-project VM, and an honest comparison with the other options.

01 The problem

Coding agents are most useful when they can act: run the tests, install the dependency, fix the config, retry. Claude Code's permission prompts exist because every one of those actions runs on your machine — and the moment you get tired of approving each command and reach for --dangerously-skip-permissions, the agent inherits everything your user account can do:

  • Read your SSH private keys, browser sessions, cloud credentials, and every other project's source and .env files.
  • Execute whatever the postinstall script of a compromised or hallucinated npm package wants; supply-chain attacks don't ask permission either.
  • Mutate global state: dotfiles, keychains, crontabs, other repos.
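To make that exposure concrete, here is a minimal sketch of a check you could run on the host. The function name audit_exposure and the list of paths are illustrative assumptions, not part of machine or Claude Code, and the list is far from exhaustive. It reports which common credential locations the current user can read, which is exactly what an agent running under that account can read.

```shell
# Sketch: list credential-like paths readable by the current user.
# Any process running as this user, including an agent with prompts
# skipped, can read the same paths. The candidate list is illustrative.
audit_exposure() {
  home_dir="${1:-$HOME}"
  for rel in .ssh .aws .config/gcloud .kube/config .npmrc .netrc; do
    path="$home_dir/$rel"
    if [ -r "$path" ]; then
      echo "readable: $path"
    fi
  done
  # .env files up to three directory levels below the home directory.
  find "$home_dir" -maxdepth 3 -type f -name '.env' 2>/dev/null |
    while IFS= read -r env_file; do
      if [ -r "$env_file" ]; then
        echo "readable: $env_file"
      fi
    done
}
```

Calling audit_exposure with no argument inspects $HOME; passing a directory inspects that directory instead.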

The fix isn't more prompts — it's giving the agent a machine where yes to everything is an acceptable answer.

02 The isolation model

machine gives each project its own Lima VM with Docker, Node, the agent CLIs (Claude Code, Codex), GitHub CLI, and signed git already provisioned. The boundary is a VM, not a container — a different kernel, not a namespace.

  • No host filesystem is mounted. There is no path from the VM to your home directory — the base template declares mounts: [], and the smoke suite boots a real VM and checks that no host mounts leak.
  • One VM per project. A compromised dependency in one project cannot read another project's code or secrets — they're separate machines.
  • Keys stay on the host. Git auth and commit signing use a forwarded SSH agent: the VM can never read the private key, but while you're connected it can use every key the agent holds — signing and auth for any repo the key authorizes, not just this project's. Gate each use (1Password approval, ssh-add -c) or swap forwarding for a per-project deploy key (forward_agent = false) — see restricting the forwarded agent.
  • Secrets are tmpfs-only. machine secrets renders 1Password Environments into VM tmpfs — never to disk, gone on reboot. A fully compromised VM sees only the secrets a repo explicitly rendered, not your vault.
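If you want to confirm the first and last of these properties yourself from inside a VM, a rough sketch follows. It is an independent check, not the project's smoke suite. The function names are hypothetical, and the filesystem types treated as "host shares" (9p, virtiofs, fuse.sshfs) are an assumption about how a Linux guest typically sees shared host directories.

```shell
# Sketch: print mounts whose filesystem type is commonly used to share
# host directories into a guest. No output means none were found.
# Reads /proc/mounts by default; pass a file to inspect a saved copy.
host_share_mounts() {
  mounts_file="${1:-/proc/mounts}"
  # /proc/mounts fields: device mountpoint fstype options dump pass
  awk '$3 == "9p" || $3 == "virtiofs" || $3 == "fuse.sshfs" { print $2 " (" $3 ")" }' "$mounts_file"
}

# Sketch: print tmpfs mount points, the only kind of location where
# rendered secrets would be expected to live.
tmpfs_mounts() {
  mounts_file="${1:-/proc/mounts}"
  awk '$3 == "tmpfs" { print $2 }' "$mounts_file"
}
```

Run inside the guest, empty output from host_share_mounts is consistent with the no-host-mounts claim, though it only covers the filesystem types listed in the function.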

Inside that boundary, the agent runs with defaultMode: auto — full autonomy, because the blast radius is one disposable VM. machine claude even keeps it running in tmux after you close your laptop. The full model is in the threat model.

03 Three commands

shell
$ brew install katspaugh/machine/machine
$ machine up        # zero-config sandbox VM, ~/code inside
$ machine claude    # Claude Code in a tmux session, in the VM

Add repos and tool profiles per project in projects.toml when you want more than a scratch VM — see the manual. Nix users: nix profile install github:katspaugh/machine.

04 Honest comparison

Doesn't Claude Code already have a sandbox?

Yes — and it's good at what it does. Built-in sandboxing wraps each shell command in OS-level rules (Seatbelt on macOS, bubblewrap on Linux, plus a network proxy): writes are limited to the workspace, network to approved hosts. Three things it doesn't change:

  • It's still your host. Same kernel, same user account, your real working copy. Filesystem reads are broadly allowed with carve-outs, so anything you didn't think to deny is readable — and the escape surface is the OS sandbox itself, not a hypervisor.
  • The prompt loop survives. Commands that legitimately need network access or writes outside the workspace still stop and ask — and the well-worn failure mode is approving "run unsandboxed" until the sandbox is decoration. The flag that skips it all is one keystroke away, on your host.
  • It does nothing for the environment. Your Mac still accumulates the runtimes, browsers, and Docker installs, and there's no boundary between projects.

A VM answers a different question: instead of fencing commands on your machine, give the agent a machine with nothing of yours on it. Inside it, "yes to everything" is safe by construction — and the two compose, since Claude Code's sandbox keeps working inside the VM as defense-in-depth.

Where the alternatives genuinely win, the table says so.

| | Bare host + permission prompts | Docker / devcontainers | machine (Lima VM) |
| --- | --- | --- | --- |
| Isolation boundary | None: agent shares your account | Kernel namespaces; host kernel shared, escapes are rare but real; project dirs are usually bind-mounted in | Hardware-virtualized VM, separate kernel, no host mounts |
| Agent autonomy you can grant | Low: every skipped prompt is host exposure | High inside the container, but the mounted workspace is still your working copy | Full (auto mode): worst case is a throwaway VM |
| Cross-project isolation | None | Only if you never share volumes or images between projects | Default: one VM per project |
| Docker workloads inside | Yes (host Docker) | Docker-in-Docker (needs --privileged) or socket mounting; both weaken the sandbox, and socket mounting pierces it outright | Yes: real dockerd inside the VM |
| Cross-platform & team-shareable config | | Wins: devcontainer.json is portable, IDE-native, works on Linux/Windows/Codespaces | macOS hosts only; config is a TOML you share, but VMs are local |
| Weight | None | Wins: containers are lighter and faster to start | ~GBs of disk per VM, seconds-to-minutes to boot (cached base disk) |
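The "Docker workloads inside" row for containers is the easiest to overlook in practice, so here is a small lint-style sketch. The function name docker_risk is hypothetical and the checks are plain substring matches on a command line, so treat it as a reading aid rather than a security control. It flags the two patterns the table calls out: --privileged and a mounted Docker socket.

```shell
# Sketch: flag docker invocations that hand the container host-level power.
# Input is one command line as a single string. Substring matching only,
# so this can miss variants and is not a security control.
docker_risk() {
  case "$1" in
    *--privileged*)
      echo "privileged: container runs with near-host capabilities" ;;
  esac
  case "$1" in
    */var/run/docker.sock*)
      echo "socket-mount: container can drive the host Docker daemon" ;;
  esac
}
```

For example, docker_risk "docker run -v /var/run/docker.sock:/var/run/docker.sock my-image" reports the socket mount, while a plain docker run of an image with no extra flags produces no output.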

If you need Linux/Windows hosts or team-distributed environments, use devcontainers. If you want the strongest practical boundary for autonomous agents on a Mac, use a VM.

What about Tart, or Apple's container CLI?

Same isolation class — both are Virtualization.framework VMs, like the Lima VMs machine boots. The difference is what's wired up around the VM. Tart is excellent at golden images and OCI-registry distribution, which is why it owns macOS CI; it hands you a booted VM, and the dev-environment part — provisioning, agent CLIs, git signing through a forwarded agent, secrets, per-project config — is yours to build. Apple's container runs each Linux container in its own lightweight VM: strong per-container isolation with container UX, but it's an image-based runtime for workloads, not a persistent provisioned machine an agent works in across sessions.

Building macOS CI? Tart wins. Running containerized services with VM-grade isolation? Apple's container is interesting. Wanting one command from brew install to an agent working in a signed-git, secret-injected Linux box? That's the gap machine fills.

05 FAQ

Does machine run on Linux or Windows hosts?

No. It runs on macOS 13 or newer only (Apple Silicon or Intel). The VMs boot through Lima's vz driver, which is built on Apple's Virtualization framework. On Linux, Claude Code's built-in sandbox or a devcontainer is the right tool.

Does the agent get internet access?

Yes — the VM has normal outbound networking (it needs npm, apt, GitHub). The boundary protects your host and other projects, not the network.

How do I see the app the agent is building?

Lima auto-forwards listening guest ports to 127.0.0.1 on the host. Your browser just works.

What if the agent wrecks the VM?

machine destroy && machine up — a fresh, fully provisioned VM in about a minute from the cached base disk.

Is my code safe from the agent?

The code in the VM is the agent's working copy, so the agent can read and modify it freely. Push protection is on you — the forwarded SSH agent means the VM can push commits on your behalf, so use branch protection and review the diff before merging, as you would with any autonomous agent.
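One way to make "review the diff before merging" routine is a small helper on the host. This is a sketch: the function name review_branch and the branch names in the usage note are hypothetical, and it uses only standard git commands.

```shell
# Sketch: summarize what a branch adds relative to a base before merging.
# Both arguments are refs in the current repository.
review_branch() {
  base="$1"
  branch="$2"
  echo "== commits on $branch that are not on $base =="
  git log --oneline "$base..$branch"
  echo "== files changed since the branches diverged =="
  git diff --stat "$base...$branch"
}
```

After a git fetch, something like review_branch main origin/agent-work lists the agent's commits and the files they touch. Adding --show-signature to the git log call also displays signature verification for each commit.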

// machine · sandboxed AI coding agents · katspaugh/machine