So, I've been using Docker sandboxes for a while now. To be fair, as a Docker Captain, I've been playing with them since late October, when we got introduced to the tool during the Captains Summit. And let me tell you — sandboxes have come a long way since those early days.
If you've been living under a rock (or, more likely, drowning in Kubernetes YAML), here's the TL;DR: Docker Sandboxes are isolated environments designed to run AI agents safely. We're talking network isolation, a separate kernel courtesy of microVMs, and a network proxy that intercepts outbound traffic. Basically, it's the safe-deposit box your AI agent didn't know it needed before it rm -rf'd your home directory.
But there's been one persistent annoyance. And the latest release just fixed it.
The Problem: Sandboxes Love Big Tech (and Big Tech Loves Your Wallet)
Sandboxes are great at isolating AI agents in a secure space. But out of the box, they primarily target the big AI providers — Claude, Codex, Gemini, Kiro, and friends — using their API-based billing.
And look, that's awesome if your company hands you a corporate card with the casual "just expense it." For the rest of us mortals, dropping €2000 on AI API credits to vibe-code a side project isn't exactly... sustainable. We need alternatives.
The good news: alternatives exist. The bad news: they come with tradeoffs and added setup complexity.
I've already gone down this rabbit hole. I tried setting up OpenCode inside a sandbox, hooked up to local model runners like Ollama, LM Studio, and Docker Model Runner. And it worked! It was great! Until I created a second sandbox. And had to do all the setup again. And again. By the third time I was reaching for the coffee and questioning my life choices.
If automation isn't fun yet, you're probably not automating enough. — Some DevOps engineer, 3 AM, on call
Enter: Sandbox Kits (Released Yesterday, by the Way)
With the sbx 0.28.0 release, Docker introduced something called kits. Think of kits as a Dockerfile's clever cousin — but specifically designed for sandbox environments.
A kit is a packaged set of capabilities your sandbox can use. Kits can include tools to install, environment variables to set, credentials to inject, domains to allow, files to drop in, and startup commands to run. You declare everything in a single spec.yaml file, point the CLI at it, and the sandbox handles the rest.
The really cool part? Credentials stay on the host and go through a proxy instead of entering the VM, and outbound traffic is restricted to the domains the kit allows. So your local API keys never actually touch the agent's environment. The sandbox just yells "hey proxy, please add the auth header" and the proxy plays a game of telephone on the agent's behalf. Beautiful.
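To make the mechanics concrete, here's a toy sketch of the credential-injection idea. This is emphatically *not* sbx's actual implementation — the names and shapes are made up — but it shows the division of labor: the agent side only ever holds a placeholder, and the proxy swaps in the real secret just before the request leaves.

```python
# Toy sketch of credential injection via a proxy (NOT sbx's real code).
# The secret lives only on the "host" side; the sandbox side holds a
# placeholder that the proxy replaces on the way out.

REAL_SECRETS = {"ollama-local": "sk-real-token"}  # host-side secret store
PLACEHOLDER = "proxy-managed"                     # all the sandbox ever sees

def agent_builds_request():
    # Inside the VM, the env var resolves to the placeholder only.
    return {
        "url": "http://localhost:11434/v1/chat",
        "headers": {"Authorization": f"Bearer {PLACEHOLDER}"},
    }

def proxy_forwards(request, service="ollama-local"):
    # The proxy matches the destination to a registered service and
    # injects the real credential before forwarding upstream.
    headers = dict(request["headers"])
    headers["Authorization"] = f"Bearer {REAL_SECRETS[service]}"
    return {**request, "headers": headers}

outbound = proxy_forwards(agent_builds_request())
print(outbound["headers"]["Authorization"])
```

The key property: nothing running inside `agent_builds_request` ever has access to `REAL_SECRETS`. That's the whole game of telephone in about twenty lines.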
The Two Flavors of Kits
There are two kinds of kits, and you'll probably end up using both:
- **Mixin kits** (`kind: mixin`) extend an existing agent with extra capabilities. You can stack several on the same sandbox, like a Kubernetes Helm chart but, you know, less painful.
- **Agent kits** (`kind: agent`) define a complete agent from scratch — the image, the entrypoint, network policies, the whole nine yards.
Most folks will start with mixin kits because they let you bolt features onto the built-in agents (Claude, Codex, Gemini, OpenCode, etc.) without rebuilding the wheel.
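For flavor, here's roughly what an agent kit might look like. Fair warning: this is a hypothetical skeleton. The `image` and `entrypoint` field names are my guesses based on the description above, so check the official kit reference before copying it.

```yaml
# Hypothetical agent-kit skeleton; 'image' and 'entrypoint' are assumed
# field names, not confirmed against the spec reference.
schemaVersion: "1"
kind: agent
name: my-agent
displayName: My Custom Agent
description: A from-scratch agent definition
image: ghcr.io/myorg/my-agent-image:latest   # assumed field
entrypoint: ["my-agent", "--serve"]          # assumed field
network:
  allowedDomains:
    - api.example.com
```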
What Can You Actually Do With Kits?
Quite a lot, actually. Let's break it down:
- **Run commands** — either `install` commands (run once at creation) or `startup` commands (run on each start). Think `apt-get install`, `pip install`, or that one cursed shell script your senior dev wrote in 2019.
- **Inject files** — static files bundled with the kit, or `initFiles` written at startup with runtime values substituted in.
- **Set environment variables** — including the proxy-managed kind, where the secret never enters the VM.
- **Control network access** — whitelist the domains the sandbox can reach.
- **Declare credential sources** — tell the proxy where secrets live on your host so the sandbox can use them without ever seeing them.
A quick note on what I mean by "interceptors": this is the mechanism for accessing private APIs using credentials that are never reachable from inside the sandbox. The network proxy intercepts the request and adds the proper authorization header. It's basically the bouncer at a club: the sandbox can ask for things, but it never gets to see the VIP guest list.
Example 1: A Python Linting Kit (Because We All Love Consistency, Right?)
Let's say your team wants every sandbox to start with the same Python linter setup. Maybe you're tired of code reviews where Bob uses 4-space indentation and Alice uses 2 and Steve... well, Steve uses tabs (we don't talk about Steve).
Here's a kit that installs Ruff and ships a shared config:
```
ruff-lint/
├── spec.yaml
└── files/
    └── workspace/
        └── ruff.toml
```
**ruff-lint/spec.yaml**

```yaml
schemaVersion: "1"
kind: mixin
name: ruff-lint
displayName: Ruff Linter
description: Python linting with shared team config
network:
  allowedDomains:
    - pypi.org
    - files.pythonhosted.org
commands:
  install:
    - command: "uv tool install ruff@latest"
      user: "1000"
      description: Install Ruff
```
**ruff-lint/files/workspace/ruff.toml**

```toml
line-length = 100

[lint]
select = ["E", "F", "I"]
```
To use it:
```shell
sbx run claude --kit /path/to/ruff-lint/
```
That's it. Every time you spin up a new sandbox with this kit, Ruff is installed, the config is in place, and your linting is consistent across the team. Steve will hate it. The rest of us will sleep better at night.
Example 2: Bringing Your Own LLM (The Whole Reason We're Here)
This is where it gets juicy. Let's build a kit that lets the sandbox talk to a local Ollama instance running on your host machine. No more "burning €50 in API credits while debugging a typo" energy.
Assume Ollama is running on host.docker.internal:11434 and you want OpenCode (or any other agent) to use it.
**local-llm/spec.yaml**

```yaml
schemaVersion: "1"
kind: mixin
name: local-llm
displayName: Local Ollama Backend
description: Routes agent requests to local Ollama on the host
network:
  allowedDomains:
    - "localhost:11434"
  serviceDomains:
    "localhost:11434": ollama-local
  serviceAuth:
    ollama-local:
      headerName: Authorization
      valueFormat: "Bearer %s"
credentials:
  sources:
    ollama-local:
      env:
        - OLLAMA_API_KEY
environment:
  variables:
    OPENCODE_BASE_URL: "http://host.docker.internal:11434/v1"
    OPENCODE_MODEL: "gemma4:latest"
  proxyManaged:
    - OLLAMA_API_KEY
commands:
  install:
    - command: "npm i -g opencode-ai"
      user: "1000"
      description: Install OpenCode
initFiles:
  - path: /home/agent/.config/opencode/config.json
    content: |
      {
        "$schema": "https://opencode.ai/config.json",
        "model": "sbx-local/{env:OPENCODE_MODEL}",
        "provider": {
          "sbx-local": {
            "npm": "@ai-sdk/openai-compatible",
            "name": "sbx ({env:OPENCODE_BASE_URL})",
            "options": {
              "baseURL": "{env:OPENCODE_BASE_URL}"
            },
            "models": {
              "{env:OPENCODE_MODEL}": {
                "name": "{env:OPENCODE_MODEL}"
              }
            }
          }
        }
      }
```
One important note about the config above: we point OpenCode at `host.docker.internal:11434`, but the domain we need to whitelist is `localhost:11434`.
In our case, Ollama doesn't actually need an API key. But if you ever do need to set a proxy-managed secret, be warned: it's tricky. The feature isn't properly documented, and the `sbx secret set-custom` command is hidden; it's mentioned in exactly one place: https://docs.docker.com/ai/sandboxes/customize/build-an-agent/#register-your-api-key
```shell
sbx secret set-custom -g \
  --host localhost:11434 \
  --env OLLAMA_API_KEY \
  --placeholder "proxy-managed" \
  --value "myrealsecretwhichwillbereplacedbyproxybutneverseeninside"
```

The `--placeholder` value is all the sandbox ever sees; the real `--value` is substituted by the proxy on the way out.

Now you can run:
```shell
sbx run shell --kit ./local-llm/
```
And boom — your sandbox has OpenCode pre-installed, configured to talk to your local Ollama. Your API key (if you've set one up) lives safely on the host. The sandbox just gets the responses. Your wallet stays happy. Your CFO stays calm. Everybody wins.
Distribution: Because Hoarding Kits Is Selfish
Once you've built a kit you love, you probably want to share it. Maybe with your team, maybe with the whole world. The sbx kit subcommands let you validate, inspect, and publish kits. You can:
- Pack a kit into a ZIP with `sbx kit pack`
- Push to an OCI registry with `sbx kit push . ghcr.io/myorg/my-kit:1.0`
- Pull from a registry, or load directly from Git
Loading from Git looks like this:
```shell
sbx run claude --kit "git+https://github.com/docker/sbx-kits-contrib.git#ref=v0.1.0&dir=code-server"
```
Yes, that's a real Git URL with query-string-style parameters. Yes, you should quote it in your shell. No, the maintainers do not apologize for & triggering background jobs in bash. (Honestly, neither would I.)
Iterating on Kits Without Losing Your Mind
Here's a tip from someone who has cycled through approximately 47 broken kits this week: while you're developing a kit, don't recreate the sandbox every single time. Use:
```shell
sbx kit add my-sandbox ./my-kit/
```
This re-runs install commands and re-copies files into your existing sandbox. Much faster than the full destroy-and-recreate dance.
When (not if) something breaks, two debugging commands are your best friends:
- `sbx policy log` shows every outbound request the proxy saw, including which rule matched and how the request was forwarded. This is your go-to for "why is my install download failing?" mysteries.
- `sbx exec <sandbox> -- <cmd>` lets you poke around inside a running sandbox. Run `which mytool`, `ls`, `cat`, whatever. It's like SSH but without the existential dread.
The Painful Bit: Go Modules from Private Repos
Alright, time to put on the disclaimer hat. Kits are great, but they're not magic. There's at least one workflow where I'm still wrestling with the setup, and it's worth talking about: Go projects that pull modules from private Git repos.
If you've ever worked on a Go service in a serious company codebase, you know the drill. You have a `go.mod` that imports `github.com/yourcompany/internal-lib`, and the toolchain politely refuses to fetch it because — surprise! — it's private. The standard workaround is a combo of environment variables (`GOPRIVATE`, `GONOSUMDB`, `GOPROXY`) and either an SSH key or a `.netrc` file for HTTPS auth.
The env vars part is easy. Kits handle that beautifully:
```yaml
environment:
  variables:
    GOPRIVATE: "github.com/yourcompany/*"
    GONOSUMDB: "github.com/yourcompany/*"
```
Done. But then authentication shows up at the door, and that's where things get spicy.
Option A: SSH. The cleanest path on a normal dev machine is to tell Git to rewrite HTTPS URLs to SSH and let your SSH agent handle the rest:
```shell
git config --global url."git@github.com:".insteadOf "https://github.com/"
```
Beautiful — except the sandbox doesn't have your SSH keys. They aren't mounted, only your project workspace is. And unlike docker build, there's no --ssh forwarding flag (yet) that would expose your SSH agent into the sandbox. So... no SSH path. Moving on.
Option B: .netrc. The classic HTTPS escape hatch. You drop a ~/.netrc with your token:
```
machine github.com login yourusername password ghp_yourtokenhere
```
And Git happily uses it. But now we're back to the same problem — that file lives on your host, not in the sandbox, and putting it in the sandbox kind of defeats the entire point of the credential proxy. The whole "secrets never enter the VM" promise goes out the window the moment you write a token to disk inside the agent's filesystem.
Option C: Maybe... the proxy can save us? Here's the thought I keep coming back to. A .netrc is, fundamentally, just HTTP Basic auth. And the kit system already has a primitive for "intercept this domain and inject an auth header for it." So in theory, a kit like this should work:
```yaml
network:
  allowedDomains:
    - github.com
  serviceDomains:
    github.com: github-private
  serviceAuth:
    github-private:
      headerName: Authorization
      valueFormat: "Basic %s"
credentials:
  sources:
    github-private:
      env:
        - GITHUB_TOKEN_BASIC
environment:
  variables:
    GOPRIVATE: "github.com/yourcompany/*"
    GONOSUMDB: "github.com/yourcompany/*"
```
Where `GITHUB_TOKEN_BASIC` on the host is the base64-encoded `username:token` string. The proxy intercepts every request to github.com, slaps an `Authorization: Basic <secret>` header on it, and `go mod download` is none the wiser. The token never enters the sandbox. The agent doesn't know it exists. Beautiful in theory.
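If you want to prepare that host-side value, HTTP Basic auth is nothing more than base64 of `username:token`. A quick way to generate it (with an obviously made-up username and token):

```python
import base64

# Hypothetical credentials; substitute your GitHub username and PAT.
user, token = "yourusername", "ghp_yourtokenhere"
basic = base64.b64encode(f"{user}:{token}".encode()).decode()
print(basic)  # export this value as GITHUB_TOKEN_BASIC on the host
```

Or the shell one-liner equivalent with `printf '%s' 'user:token' | base64` — just make sure there's no trailing newline sneaking into the encoded value.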
In practice... I haven't quite gotten this to work end-to-end yet. The proxy does see the requests, but Git seems to send its own credentials handling on top, and there's some interplay with the serviceDomains matching that I'm still untangling (and that's exactly the kind of thing sbx policy log is built for, by the way). I suspect this is solvable with the right valueFormat and maybe scoping the domain match more carefully — and if I figure it out, that's another blog post coming your way.
The TL;DR for now: kits give you the building blocks to solve private-module auth without compromising the security model, but the recipe isn't quite written down anywhere yet. If you've cracked this, hit me up.
Wrapping Up
Kits are exactly the missing piece I've been waiting for since October. Reproducible sandbox setups, secret management that actually makes sense, and a clean distribution story via OCI registries. It transforms sandboxes from "cool isolated AI playground" to "production-ready dev environment that I can share with my team."
Will I still occasionally drop €50 on Claude API credits when I need the big guns? Probably. But for the other 80% of my agent experiments, kits + local models = happy wallet, happy developer, happy ops team.
Now if you'll excuse me, I have approximately 12 sandbox setup scripts to delete from my dotfiles. Finally.
Read the official docs here: docs.docker.com/ai/sandboxes/customize/kits