🛠 This page is for engineering teams self-hosting their own Lightdash instance. On Lightdash Cloud, sandboxes are fully managed for you — there’s nothing to configure.

What sandboxes are for

Some Lightdash features use an AI agent (Claude Code) that writes and runs code on your behalf. Today that’s:
  • AI writeback — the agent edits your dbt project (e.g. adds a metric or dimension), runs lightdash compile to validate it, and opens a pull request.
  • Data app generation — the agent generates and builds a small web app from a prompt.
Running model-generated code directly on the Lightdash server would be unsafe: the code is untrusted, can run arbitrary commands, and needs its own toolchain (git, dbt, the Lightdash CLI, Node). Lightdash instead runs each agent inside a sandbox: an isolated, disposable environment with a constrained network. The agent does its work there, Lightdash collects the result (a PR, a built app), and the sandbox is torn down.
Sandboxes are also what make these features fast and multi-turn: a sandbox can be suspended between turns and resumed later, so a conversation with the agent keeps its state without holding a container open the whole time.

Sandbox providers

The sandbox backend is pluggable. Lightdash talks to a provider-neutral interface, so the same feature code runs on whichever backend your deployment is configured for. You select the provider with the SANDBOX_PROVIDER environment variable.
| Provider | SANDBOX_PROVIDER | Use for | Isolation |
| --- | --- | --- | --- |
| E2B (default) | e2b | Production / managed | microVM |
| AWS Lambda MicroVMs | lambda-microvm | Production on AWS (self-hosted) | microVM |
| Azure Container Apps Sandboxes | azure-sandboxes | Production on Azure (self-hosted) | microVM |
| Local Docker | docker | Local development only | container |
E2B, AWS Lambda MicroVMs, and Azure Container Apps Sandboxes are all supported production backends. E2B is the managed default; the AWS and Azure providers are for teams who want sandboxes to run inside their own cloud account. More providers (Kubernetes, ECS) are planned.
The local Docker provider is for development only. It launches plain Docker containers via the Docker socket, which is root-equivalent on the host and provides no real isolation between the sandbox and your machine. It refuses to start when NODE_ENV=production. Do not use it for a production deployment.
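The selection rules above (default to E2B, four accepted values, Docker refused in production) can be sketched as a small preflight check to run before starting the backend. This is a minimal illustration, not Lightdash's own startup code; the function name resolve_sandbox_provider is hypothetical.

```shell
#!/bin/sh
# Hypothetical preflight helper (not part of Lightdash).
# Prints the effective provider on stdout, or returns non-zero with a
# message on stderr when the configuration would be rejected.
resolve_sandbox_provider() {
    provider="${SANDBOX_PROVIDER:-e2b}"   # E2B is the default when unset
    case "$provider" in
        e2b|lambda-microvm|azure-sandboxes)
            ;;
        docker)
            # The Docker provider refuses to start when NODE_ENV=production.
            if [ "${NODE_ENV:-}" = "production" ]; then
                echo "SANDBOX_PROVIDER=docker is for local development only" >&2
                return 1
            fi
            ;;
        *)
            echo "unknown SANDBOX_PROVIDER: $provider" >&2
            return 1
            ;;
    esac
    echo "$provider"
}
```

Calling resolve_sandbox_provider in a deploy script and aborting on a non-zero status surfaces a mistyped provider name before the backend starts.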

E2B (production default)

E2B runs each sandbox as a Firecracker microVM in E2B’s cloud. It’s the default — if you don’t set SANDBOX_PROVIDER, Lightdash uses E2B. To use it you need an E2B account and API key, and the agent needs an Anthropic API key:
SANDBOX_PROVIDER=e2b            # default, can be omitted
E2B_API_KEY=e2b_...            # from your E2B dashboard
ANTHROPIC_API_KEY=sk-ant-...   # the agent (Claude Code) runs inside the sandbox
The sandbox images are E2B templates. Lightdash uses separate templates for data apps and for AI writeback so they can be pinned or rolled back independently. These default to the published Lightdash templates and rarely need to be set:
E2B_TEMPLATE_NAME=lightdash/lightdash-data-app
E2B_TEMPLATE_TAG=<lightdash-version>             # defaults to your Lightdash version
E2B_AI_WRITEBACK_TEMPLATE_NAME=lightdash/lightdash-ai-writeback
E2B_AI_WRITEBACK_TEMPLATE_TAG=<lightdash-version>

AWS Lambda MicroVMs (self-hosted production)

AWS Lambda MicroVMs run each sandbox as a Firecracker microVM inside your own AWS account, so untrusted agent code and your repository contents never leave your infrastructure or pass through a third-party service. The microVMs have no public IP: your backend reaches each one through an AWS-managed endpoint that requires a short-lived per-microVM token, and you control their outbound network access (see Networking and IAM). This is the recommended sandbox provider for customers deploying Lightdash on AWS.

Prerequisites

Provision these with your own IaC, in the same AWS account and region your Lightdash backend already runs in:
  • Two MicroVM images — one for data app generation and one for AI writeback (they bundle different toolchains). Build them from the Dockerfiles in the Lightdash repo (sandboxes/data-apps/, sandboxes/ai-writeback/, and the exec agent in sandboxes/microvm-agent/), push them to ECR, and register each as a Lambda MicroVM image on the AWS-managed al2023 base — ARM_64, 4 GB memory, with the agent’s /ready hook on port 8080. Each registration returns an image ARN for the config below.
  • Control-plane permissions on your backend’s existing IAM role — add RunMicrovm, GetMicrovm, SuspendMicrovm, ResumeMicrovm, TerminateMicrovm, and CreateMicrovmAuthToken.
Registering an image uses AWS’s create-microvm-image, which needs a build role (trusting lambda.amazonaws.com) and an S3 location to stage the build context. These are build-time only — not the S3 bucket Lightdash already uses for results and snapshots, and the backend never touches them at runtime.

Configure the provider

SANDBOX_PROVIDER=lambda-microvm
ANTHROPIC_API_KEY=sk-ant-...   # the agent (Claude Code) runs inside the microVM

# Region the microVMs run in (defaults to eu-west-1, the EU launch region)
LAMBDA_MICROVM_REGION=eu-west-1

# The image ARNs from registering the two MicroVM images above. Required.
LAMBDA_MICROVM_DATA_APP_IMAGE_ARN=arn:aws:lambda:<region>:<account>:microvm-image/...
LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN=arn:aws:lambda:<region>:<account>:microvm-image/...
The backend uses its ambient AWS credentials (instance role / IRSA / standard SDK credential chain) to call the Lambda MicroVMs control plane, so no access keys are configured here.

Networking and IAM

Configure the network connectors before going to production. The AWS-managed defaults give the microVM open inbound and outbound access, which means untrusted agent code can reach the public internet from inside your AWS account. We currently recommend pointing the egress connector at a VPC connector that has no outbound access by default, and only opening up the destinations the agent actually needs (your dbt repository host, the Anthropic / Bedrock API, your ECR registry).
Override these to tighten the network boundary or to give the microVM an IAM role:
# IAM role the microVM assumes (e.g. to pull from a private ECR or reach AWS APIs)
LAMBDA_MICROVM_EXECUTION_ROLE_ARN=arn:aws:iam::...:role/lightdash-sandbox

# Network connectors. Default to AWS-managed open ingress/egress; point these at
# your own VPC connectors to constrain traffic. We strongly recommend an egress
# connector that denies outbound traffic by default.
LAMBDA_MICROVM_INGRESS_CONNECTOR_ARN=arn:aws:lambda:<region>:aws:network-connector:aws-network-connector:ALL_INGRESS
LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN=arn:aws:lambda:<region>:aws:network-connector:aws-network-connector:INTERNET_EGRESS
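
Because the AWS-managed default leaves egress open, it can help to fail a production deploy when the egress connector has not been overridden. The sketch below only inspects the environment variable. It assumes the managed default is recognisable by the aws-network-connector:INTERNET_EGRESS suffix shown above, and the function name lambda_egress_is_restricted is hypothetical.

```shell
#!/bin/sh
# Hypothetical check (not part of Lightdash): returns 0 when the egress
# connector points at something other than the AWS-managed open default,
# and 1 when it is unset or still the INTERNET_EGRESS default.
lambda_egress_is_restricted() {
    arn="${LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN:-}"
    case "$arn" in
        ""|*:aws-network-connector:INTERNET_EGRESS)
            return 1 ;;   # unset falls back to the open default
        *)
            return 0 ;;   # a custom connector; its rules are not verified here
    esac
}

# Example use: warn when running the Lambda provider with open egress.
if [ "${SANDBOX_PROVIDER:-e2b}" = "lambda-microvm" ] && ! lambda_egress_is_restricted; then
    echo "warning: microVM egress is the AWS-managed open default" >&2
fi
```

The check confirms only that a custom connector is configured. Whether that connector actually denies outbound traffic by default still has to be verified in your VPC configuration.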

Azure Container Apps Sandboxes (self-hosted production)

Azure Container Apps Sandboxes run each sandbox as an isolated, microVM-class environment inside your own Azure subscription, so untrusted agent code and your repository contents never leave your infrastructure or pass through a third-party service. Sandboxes have native suspend/resume (a full memory + disk snapshot with sub-second restore), which is what keeps multi-turn agent conversations fast. This is the recommended sandbox provider for customers deploying Lightdash on Azure.
Azure Container Apps Sandboxes is currently an Azure preview feature. It requires a Microsoft Entra ID account (personal Microsoft accounts aren’t supported), and its API surface may change while in preview.

Prerequisites

Provision these in the same Azure subscription and region your Lightdash backend runs in, using the aca CLI or the Sandboxes portal:
  • A sandbox group per feature — one for data app generation and one for AI writeback (they bundle different toolchains). A sandbox group (Microsoft.App/SandboxGroups) is the management boundary that holds a feature’s sandboxes and disk image. Give each group a Memory-mode auto-suspend lifecycle policy so idle sandboxes snapshot and scale to zero.
  • A disk image per group — build the two images from the Dockerfiles in the Lightdash repo (sandboxes/data-apps/, sandboxes/ai-writeback/), push them to a container registry (e.g. Azure Container Registry), and register each as a disk image in its sandbox group. Registration returns a disk image ID for the config below. (Unlike the AWS provider, there is no in-VM agent to build — Sandboxes expose a native command/file API.)
  • A workload identity with the data-plane role — grant your backend’s managed identity the Container Apps SandboxGroup Data Owner role on each sandbox group. Lightdash uses DefaultAzureCredential (workload identity on AKS, or the standard Azure credential chain) to authenticate — no client secret is configured here.

Configure the provider

SANDBOX_PROVIDER=azure-sandboxes
ANTHROPIC_API_KEY=sk-ant-...   # the agent (Claude Code) runs inside the sandbox

# Where your sandbox groups live
AZURE_SANDBOXES_SUBSCRIPTION_ID=<subscription-id>
AZURE_SANDBOXES_RESOURCE_GROUP=<resource-group>
AZURE_SANDBOXES_REGION=eastus2

# Per-feature sandbox group + disk image ID (from the prerequisites above). Required.
AZURE_SANDBOXES_DATA_APP_GROUP=lightdash-data-app
AZURE_SANDBOXES_DATA_APP_DISK_IMAGE=<data-app-disk-image-id>
AZURE_SANDBOXES_AI_WRITEBACK_GROUP=lightdash-writeback
AZURE_SANDBOXES_AI_WRITEBACK_DISK_IMAGE=<writeback-disk-image-id>

# Optional — sandbox size (XS/S/M/L, default M)
AZURE_SANDBOXES_RESOURCE_TIER=M
Egress is locked down automatically. Each sandbox launches with a default-deny egress policy that allows only the hosts the agent needs (the Anthropic API and your git host). Full traffic inspection is enabled, so the platform enforces the policy on all traffic and blocks non-HTTP egress. Untrusted agent code can’t reach any other destination: outbound requests to non-allowlisted hosts are rejected, and all other ports are blocked.

Local Docker provider (development)

For local development you can run sandboxes as plain Docker containers on your own machine — no E2B account required. This is the recommended way to work on or try the AI features locally. It uses the same images E2B builds, but as plain local Docker images. Two separate images are used (different toolchains), mirroring the two E2B templates:
| Image (default tag) | Built from | Used by |
| --- | --- | --- |
| lightdash-sandbox:local | sandboxes/data-apps/ | Data app generation |
| lightdash-ai-writeback:local | sandboxes/ai-writeback/ | AI writeback |

Prerequisites

  • Docker running locally, with the daemon reachable from the Lightdash backend.
  • S3-compatible object storage configured (locally this is MinIO). Suspended-sandbox snapshots are tarred to object storage so a conversation survives the container being destroyed — see external object storage.
  • An Anthropic API key (ANTHROPIC_API_KEY) for the agent.

Setup

  1. Build the local sandbox images (each builds from sandboxes/<feature>/):
    ./sandboxes/data-apps/build-local-image.sh        # -> lightdash-sandbox:local
    ./sandboxes/ai-writeback/build-local-image.sh     # -> lightdash-ai-writeback:local
    
    These are large (the writeback image bundles dbt, the Lightdash CLI and Claude Code) and only need rebuilding when the sandbox toolchain changes.
  2. Point Lightdash at the Docker provider:
    SANDBOX_PROVIDER=docker
    # optional — these are the defaults:
    SANDBOX_DOCKER_IMAGE=lightdash-sandbox:local
    SANDBOX_AI_WRITEBACK_DOCKER_IMAGE=lightdash-ai-writeback:local
    
  3. Restart the backend and the scheduler so both pick up the new environment. Data app generation runs in the scheduler worker, so a stale SANDBOX_PROVIDER there will keep it on E2B. (With PM2, a plain restart reuses the cached env — delete and re-start the processes, or restart with --update-env, to actually reload the env file.)

Snapshot lifecycle

Every turn suspends its own sandbox, so in steady state nothing sits idle. Two timers configure the cloud-side idle policy as a backstop — when to auto-suspend a sandbox left running and when to auto-terminate a suspended one:
SANDBOX_IDLE_TIMEOUT_MS=1800000        # auto-suspend a running-but-idle sandbox (default 30 min)
SANDBOX_SNAPSHOT_RETENTION_MS=604800000 # auto-terminate a suspended microVM (default 7 days); also how long a thread stays resumable
SANDBOX_IDLE_TIMEOUT_MS feeds the auto-suspend policy on both the Lambda MicroVMs and Azure Sandboxes providers (for Azure it sets each sandbox’s Memory-mode auto-suspend interval). SANDBOX_SNAPSHOT_RETENTION_MS is read only by Lambda MicroVMs — on Azure, suspended-sandbox retention is governed by the sandbox group’s own auto-delete policy. E2B manages idle sandboxes itself, and the Docker dev provider has no idle handling.
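Both timers are expressed in milliseconds, which is easy to get wrong by a factor of 1000. A small sketch of the conversion, using hypothetical helper names minutes_to_ms and days_to_ms, reproduces the two defaults:

```shell
#!/bin/sh
# Hypothetical helpers: convert human-friendly durations to milliseconds.
minutes_to_ms() { echo $(( $1 * 60 * 1000 )); }
days_to_ms()    { echo $(( $1 * 24 * 60 * 60 * 1000 )); }

# 30 min * 60 s * 1000 ms = 1800000
SANDBOX_IDLE_TIMEOUT_MS=$(minutes_to_ms 30)
# 7 days * 24 h * 60 min * 60 s * 1000 ms = 604800000
SANDBOX_SNAPSHOT_RETENTION_MS=$(days_to_ms 7)

export SANDBOX_IDLE_TIMEOUT_MS SANDBOX_SNAPSHOT_RETENTION_MS
```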

Environment variable reference

| Variable | Default | Description |
| --- | --- | --- |
| SANDBOX_PROVIDER | e2b | Sandbox backend: e2b, lambda-microvm, azure-sandboxes, or docker. |
| ANTHROPIC_API_KEY | | API key for the Claude Code agent running inside the sandbox. |
| E2B_API_KEY | | E2B API key (required when SANDBOX_PROVIDER=e2b). |
| E2B_TEMPLATE_NAME | lightdash/lightdash-data-app | E2B template for data app sandboxes. |
| E2B_TEMPLATE_TAG | Lightdash version | Tag of the data app template to launch. |
| E2B_AI_WRITEBACK_TEMPLATE_NAME | lightdash/lightdash-ai-writeback | E2B template for writeback sandboxes. |
| E2B_AI_WRITEBACK_TEMPLATE_TAG | Lightdash version | Tag of the writeback template to launch. |
| LAMBDA_MICROVM_REGION | eu-west-1 | AWS region the microVMs run in (lambda-microvm). |
| LAMBDA_MICROVM_DATA_APP_IMAGE_ARN | | Image ARN for data app microVMs (required when SANDBOX_PROVIDER=lambda-microvm). |
| LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN | | Image ARN for writeback microVMs (required when SANDBOX_PROVIDER=lambda-microvm). |
| LAMBDA_MICROVM_EXECUTION_ROLE_ARN | | Optional IAM role the microVM assumes. |
| LAMBDA_MICROVM_INGRESS_CONNECTOR_ARN | AWS-managed ALL_INGRESS | Optional ingress network connector. |
| LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN | AWS-managed INTERNET_EGRESS | Optional egress network connector. |
| AZURE_SANDBOXES_SUBSCRIPTION_ID | | Azure subscription holding the sandbox groups (required when SANDBOX_PROVIDER=azure-sandboxes). |
| AZURE_SANDBOXES_RESOURCE_GROUP | | Resource group holding the sandbox groups (required when azure-sandboxes). |
| AZURE_SANDBOXES_REGION | eastus2 | Region the sandboxes run in (selects the data-plane endpoint). |
| AZURE_SANDBOXES_DATA_APP_GROUP | | Sandbox group for data app sandboxes (required when azure-sandboxes). |
| AZURE_SANDBOXES_DATA_APP_DISK_IMAGE | | Disk image ID for data app sandboxes (required when azure-sandboxes). |
| AZURE_SANDBOXES_AI_WRITEBACK_GROUP | | Sandbox group for writeback sandboxes (required when azure-sandboxes). |
| AZURE_SANDBOXES_AI_WRITEBACK_DISK_IMAGE | | Disk image ID for writeback sandboxes (required when azure-sandboxes). |
| AZURE_SANDBOXES_RESOURCE_TIER | M | Sandbox size: XS, S, M, or L. |
| AZURE_SANDBOXES_API_VERSION | 2026-02-01-preview | Azure Sandboxes data-plane API version. |
| AZURE_SANDBOXES_TOKEN_SCOPE | https://management.azuredevcompute.io/.default | Entra token scope for the data plane. |
| SANDBOX_DOCKER_IMAGE | lightdash-sandbox:local | Local image for data app sandboxes (docker provider). |
| SANDBOX_AI_WRITEBACK_DOCKER_IMAGE | lightdash-ai-writeback:local | Local image for writeback sandboxes (docker provider). |
| SANDBOX_IDLE_TIMEOUT_MS | 1800000 (30 min) | Auto-suspend a running-but-idle sandbox (lambda-microvm and azure-sandboxes). Ignored by e2b/docker. |
| SANDBOX_SNAPSHOT_RETENTION_MS | 604800000 (7 days) | Lambda MicroVMs idle policy: auto-terminate a suspended microVM (also how long a thread stays resumable). Ignored by e2b/azure-sandboxes/docker. |
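The "required when" notes in the reference above can be turned into a quick check for missing configuration. The sketch below is illustrative only. It assumes ANTHROPIC_API_KEY is needed for every provider (the provider sections each call for it, though the table does not mark it as required), and the function names required_vars_for and missing_sandbox_vars are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the variables treated as required for a provider.
required_vars_for() {
    case "$1" in
        e2b)
            echo "ANTHROPIC_API_KEY E2B_API_KEY" ;;
        lambda-microvm)
            echo "ANTHROPIC_API_KEY LAMBDA_MICROVM_DATA_APP_IMAGE_ARN LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN" ;;
        azure-sandboxes)
            echo "ANTHROPIC_API_KEY AZURE_SANDBOXES_SUBSCRIPTION_ID AZURE_SANDBOXES_RESOURCE_GROUP AZURE_SANDBOXES_DATA_APP_GROUP AZURE_SANDBOXES_DATA_APP_DISK_IMAGE AZURE_SANDBOXES_AI_WRITEBACK_GROUP AZURE_SANDBOXES_AI_WRITEBACK_DISK_IMAGE" ;;
        docker)
            echo "ANTHROPIC_API_KEY" ;;
        *)
            return 1 ;;
    esac
}

# Prints the required variables that are unset or empty, space-separated.
# Prints an empty line when nothing is missing.
missing_sandbox_vars() {
    missing=""
    for name in $(required_vars_for "$1"); do
        eval "value=\${$name:-}"          # indirect lookup of the variable
        [ -n "$value" ] || missing="$missing $name"
    done
    echo "${missing# }"                   # drop the leading separator
}
```

For example, running missing_sandbox_vars "$SANDBOX_PROVIDER" in a container entrypoint and refusing to start when the output is non-empty catches an incomplete configuration early.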