Dive into every layer of a Docker image — find wasted space, leaked secrets, and bloat in seconds.
A single, fast Rust binary that pulls images straight from any registry and shows you exactly what's inside. No Docker daemon, no root, no dependencies. Works on Linux, macOS and Windows.
- ⚡ Fast & standalone: one static binary. Pulls layers directly from Docker Hub / any V2 registry, a local docker client, or a .tar file. Layers are cached and downloads resume automatically.
- 🔍 Layer-by-layer explorer: an interactive TUI to walk the filesystem of every layer, with added / modified / removed files colorized.
- 📉 Waste & bloat detection — efficiency score, wasted bytes, cross-layer duplicate files, oversized layers, package-manager caches, dev/build artifacts, and a reconstructed Dockerfile with anti-pattern linting.
- 🛡️ Security hygiene checks: flags leaked secret files (.env, SSH / cloud keys, certs), hardcoded credentials in ENV/labels/Dockerfile (key names only, never the value), setuid & world-writable files, and containers that run as root.
- 🤖 AI optimization report: hand the full analysis to any OpenAI-compatible model and get a prioritized fix list, plus version-over-version regression detection.
- 🚦 CI gate — fail the pipeline when an image drops below your efficiency / wasted-bytes thresholds.
- 🌐 Terminal · Web · JSON / Markdown · WeCom — explore interactively, expose an HTTP API, export a report, or push results to a chat.
Scope note: diving focuses on size, structure, and basic security checks (leaked secret files, file permissions, runs-as-root, etc.). It scans file paths and image metadata — it does not do CVE/vulnerability scanning or file-content scanning. Pair it with Trivy/grype/docker scout for vulnerability coverage.
# 1. install — pick one:
curl -fsSL https://raw.githubusercontent.com/vicanso/diving-rs/main/install.sh | sh # prebuilt binary
cargo install diving # from crates.io
# 2. dive in
diving redis:alpine

That's it: no Docker daemon required. Prebuilt binaries for Linux / macOS / Windows are also on the release page, or build the latest from source with cargo install --git https://github.com/vicanso/diving-rs.
Inside the TUI:
| Key | Action |
|---|---|
| 1 | Show only Modified / Removed files of the current layer |
| 2 | Show only files ≥ 1 MB |
| Esc / 0 | Reset the view |
diving accepts three source types:
# from a registry (default) — Docker Hub, quay.io, private registries…
diving redis:alpine
diving quay.io/prometheus/node-exporter
# pick an architecture for multi-arch images
diving redis:alpine?arch=arm64
# from the local docker client
diving docker://redis:alpine
# from a saved tar file
diving file:///tmp/redis.tar

# JSON
diving redis:alpine --output-file result.json
# Markdown (detected by the .md extension)
diving redis:alpine --output-file result.md
# Markdown to stdout — base image layers are auto-detected and hidden by default
diving myimage:latest --output-file -
# include the base image layers
diving myimage:latest --output-file - --no-skip-base

Run diving in CI to keep images lean. With CI=true it prints the efficiency score and exits 1 when any threshold is exceeded.
CI=true diving redis:alpine

Thresholds are configurable in ~/.diving/config.yml:
| Option | Default | Meaning |
|---|---|---|
| lowest_efficiency | 0.95 | Minimum acceptable efficiency score (0–1) |
| highest_wasted_bytes | 20971520 (20 MB) | Maximum wasted bytes |
| highest_user_wasted_percent | 0.1 | Maximum wasted percentage (0–1) |
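
To make the gate semantics concrete, here is a minimal Python sketch of how the three thresholds could be evaluated. It is an illustration, not diving's actual code: the function names are hypothetical, and the strictness of the comparisons (whether a value exactly equal to a limit passes) is an assumption.

```python
# Defaults taken from the table above.
DEFAULT_LIMITS = {
    "lowest_efficiency": 0.95,
    "highest_wasted_bytes": 20971520,  # 20 MB
    "highest_user_wasted_percent": 0.1,
}


def ci_violations(efficiency, wasted_bytes, wasted_percent, limits=DEFAULT_LIMITS):
    """Return a list of human-readable threshold violations (empty means pass)."""
    problems = []
    # Assumption: a value exactly equal to a limit is still acceptable.
    if efficiency < limits["lowest_efficiency"]:
        problems.append(f"efficiency {efficiency} is below {limits['lowest_efficiency']}")
    if wasted_bytes > limits["highest_wasted_bytes"]:
        problems.append(f"wasted bytes {wasted_bytes} exceed {limits['highest_wasted_bytes']}")
    if wasted_percent > limits["highest_user_wasted_percent"]:
        problems.append(
            f"wasted percent {wasted_percent} exceeds {limits['highest_user_wasted_percent']}"
        )
    return problems


def exit_code(problems):
    """Mirror the documented behavior: exit 1 when any threshold is exceeded."""
    return 1 if problems else 0
```

A pipeline step would then fail whenever exit_code(ci_violations(...)) is non-zero, which matches the "exits 1 when any threshold is exceeded" behavior described above.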
Provide an OpenAI-compatible API key and diving sends the full Markdown analysis (layers, reconstructed Dockerfile, wasted space, large files, security findings) to the model and prints a prioritized optimization report instead of opening the TUI. When the ENTRYPOINT/CMD points to a script inside the image, that script is read from the layers and included, so the model can review what the container actually runs.
# enable AI analysis (prints the report, skips the TUI)
diving redis:alpine --ai-api-key sk-xxxx
# custom endpoint / model
diving redis:alpine \
--ai-api-key sk-xxxx \
--ai-base-url https://your-gateway/v1 \
--ai-model gpt-4o
# configure via the environment
export OPENAI_API_KEY=sk-xxxx
diving redis:alpine
# control the report language (also affects terminal / Markdown output)
diving redis:alpine --ai-api-key sk-xxxx --lang zh

| Flag | Environment | Default | Description |
|---|---|---|---|
| --ai-api-key | OPENAI_API_KEY | - | OpenAI-compatible API key. Providing it enables AI analysis. |
| --ai-base-url | OPENAI_BASE_URL | https://api.openai.com/v1 | API base URL. A full .../chat/completions URL is also accepted. |
| --ai-model | OPENAI_MODEL | gpt-4o | Model name. |
| --ai-system-prompt | OPENAI_SYSTEM_PROMPT | built-in DevSecOps template | Override the system prompt to fully replace the built-in one. |
| --lang | DIVING_LANG | system locale | Output language: en or zh. |
| --no-ai-history | - | off | Skip the regression comparison for this run (the snapshot is still refreshed). |
Each run stores a snapshot under ~/.diving/ai_history/. On the next run of the same image, the previous snapshot is sent alongside the current one so the model can flag size regressions / bloat between versions. --no-ai-history skips that comparison for one run (e.g. when the baseline is stale); the snapshot is still refreshed so subsequent runs compare against this one.
Security tip: the API key, base URL and webhook are CLI/env only — they are never accepted as web query parameters, so they don't end up in access logs.
Pass a WeCom (企业微信) group-bot webhook to push the result straight into a chat instead of opening the TUI. Content is chosen so it always fits the bot's ~4096-byte markdown limit:
- with --ai-api-key set → the concise AI report is pushed
- without AI → a short summary (efficiency score, wasted space, recommendations)
# bot key (expanded to the standard webhook URL automatically)
diving redis:alpine --wecom-webhook 693a91f6-7aoc-4bc4-97a0-0ec2sifa5aaa
# or the full webhook URL
diving redis:alpine --wecom-webhook "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=KEY"
# push the AI report instead of the summary
diving redis:alpine --ai-api-key sk-xxxx --wecom-webhook KEY
# or from the environment
export WECOM_WEBHOOK=KEY
diving redis:alpine

| Flag | Environment | Default | Description |
|---|---|---|---|
| --wecom-webhook | WECOM_WEBHOOK | - | WeCom group-bot webhook URL, or a bare bot key. Providing it pushes the result and skips the TUI. |
Oversized content is truncated to the WeCom limit with a … (truncated) marker.
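
The two behaviors described above (expanding a bare bot key into the standard webhook URL, and truncating oversized content) can be sketched in Python as follows. This is a hedged illustration rather than diving's implementation: the function names are hypothetical, the 4096-byte figure is the approximate limit quoted above, and how diving itself handles a multi-byte character at the cut point is an assumption.

```python
WECOM_SEND_URL = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key="
MARKER = "… (truncated)"


def expand_webhook(value):
    """Accept either a full webhook URL or a bare bot key."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        return value
    return WECOM_SEND_URL + value


def truncate_markdown(text, limit=4096):
    """Cut text so that its UTF-8 encoding, marker included, fits in limit bytes."""
    data = text.encode("utf-8")
    if len(data) <= limit:
        return text
    budget = limit - len(MARKER.encode("utf-8"))
    # errors="ignore" drops a multi-byte character that was cut in half.
    head = data[:budget].decode("utf-8", errors="ignore")
    return head + MARKER
```

Measuring the limit in bytes rather than characters matters for Chinese output (--lang zh), where each character typically takes three bytes in UTF-8.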
Run diving as an HTTP server with a React frontend for remote analysis.
# Create the data directory and grant it to the container user (UID/GID 1000)
mkdir -p $PWD/diving
chown -R 1000:1000 $PWD/diving
docker run -d --restart=always \
-p 7001:7001 \
-v $PWD/diving:/home/rust/.diving \
--name diving \
vicanso/diving

Open http://127.0.0.1:7001/ in the browser.
The container runs as a non-root UID (1000:1000); the chown above lets it write the layer cache (without it the container fails to start). The image is based on debian:bookworm-slim with ca-certificates and tzdata. It ships no wget/curl, so there is no in-image HEALTHCHECK — probe GET /ping from your orchestrator (Kubernetes livenessProbe, a sidecar, etc.) instead.
Change the listen address with --listen:
diving --mode web --listen 0.0.0.0:8080

Analyze a Docker image and return the result.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | yes | Image reference (same formats as terminal mode) |
| format | string | no | Set to markdown to return a Markdown report instead of JSON |
| skipBase | bool | no | When format=markdown, auto-detect and hide base image layers (default true); set false to include them |
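
Because an image reference can itself contain ? and = (the ?arch= suffix), a client has to percent-encode it before placing it in the query string. The following Python sketch shows one way to build the request URL with the standard library. The helper name build_analyze_url is hypothetical, and the host and port are the ones from the docker run example.

```python
from urllib.parse import urlencode


def build_analyze_url(image, fmt=None, skip_base=None, base="http://127.0.0.1:7001"):
    """Build an /api/analyze URL, percent-encoding the image reference."""
    params = {"image": image}
    if fmt is not None:
        params["format"] = fmt
    if skip_base is not None:
        params["skipBase"] = "true" if skip_base else "false"
    # safe=":" keeps the tag separator readable; "?" and "=" are still encoded.
    return f"{base}/api/analyze?{urlencode(params, safe=':')}"
```

For example, build_analyze_url("redis:alpine?arch=arm64") yields a URL whose query is image=redis:alpine%3Farch%3Darm64, so the server receives the architecture suffix as part of the image value rather than as a separate parameter.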
# JSON response (default)
curl "http://127.0.0.1:7001/api/analyze?image=redis:alpine"
# specify architecture
curl "http://127.0.0.1:7001/api/analyze?image=redis:alpine%3Farch%3Darm64"
# Markdown report
curl "http://127.0.0.1:7001/api/analyze?image=redis:alpine&format=markdown"
# Markdown report including base layers (hidden by default)
curl "http://127.0.0.1:7001/api/analyze?image=myimage:latest&format=markdown&skipBase=false"

During analysis diving scans every file path against built-in rules (.env files, SSH private keys, AWS/GCP credentials, TLS private keys, kubeconfig, .htpasswd, an accidentally-copied .git directory, …) and reports matches under Security Warnings. (It scans paths and metadata, not file contents.)
Extend or suppress the rules with ~/.diving/sensitive-files — one rule per line:
| Line format | Effect |
|---|---|
| <glob-pattern> | Flag matching files (reason: "Custom sensitive file") |
| <glob-pattern> \| <reason> | Flag with a custom reason label |
| !<glob-pattern> | Ignore / suppress matches (overrides built-in and custom patterns above) |
Lines starting with # and blank lines are ignored. Globs are case-insensitive; * matches across directory separators, and patterns are also tested against the filename alone (so *.pem matches a/b/cert.pem).
# ── Extra patterns ───────────────────────────────────────────
**/*.vault-token | Vault token
**/app-secrets.json | Application secrets
# ── Suppress built-in rules for intentional inclusions ───────
!**/.env.example
!**/.env.template
!**/certs/nginx.crt
!**/testdata/**
!**/fixtures/**
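
The rule format above can be approximated in a few lines of Python. This is only a sketch of the documented behavior, not diving's matcher: it uses the standard fnmatch module as a stand-in glob engine (its * also crosses directory separators), the function names are hypothetical, and edge cases such as ** matching zero directories may differ from the real tool.

```python
from fnmatch import fnmatchcase
from posixpath import basename

DEFAULT_REASON = "Custom sensitive file"


def parse_rules(text):
    """Split a sensitive-files document into (flag_rules, ignore_patterns)."""
    flags, ignores = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comments and blank lines are ignored
        if line.startswith("!"):
            ignores.append(line[1:].strip().lower())
            continue
        pattern, sep, reason = line.partition("|")
        flags.append((pattern.strip().lower(), reason.strip() if sep else DEFAULT_REASON))
    return flags, ignores


def _hit(pattern, path):
    # Case-insensitive: patterns are lowercased at parse time, paths here.
    path = path.lower()
    # Test the full path first, then the filename alone.
    return fnmatchcase(path, pattern) or fnmatchcase(basename(path), pattern)


def check(path, flags, ignores):
    """Return the reason a path is flagged, or None if clean or suppressed."""
    if any(_hit(p, path) for p in ignores):
        return None  # suppression wins over every flag rule
    for pattern, reason in flags:
        if _hit(pattern, path):
            return reason
    return None
```

Checking the ignore patterns first is what lets a ! line override both built-in and custom rules.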
Config file: ~/.diving/config.yml.
| Option | Default | Description |
|---|---|---|
| layer_path | ~/.diving/layers | Layer blob cache directory |
| layer_ttl | 90d | TTL for cached layer blobs and analysis results; an entry is purged if not accessed within this duration |
| analysis_path | ~/.diving/analysis | Analysis-result cache directory |
| cleanup_interval_hours | 1 | How often (hours) caches are swept for expired entries |
| threads | min(layers, 2 × CPUs) | Concurrent layer fetch + decompression tasks. Raise on fast networks with many layers; lower when sharing the host |
| lowest_efficiency | 0.95 | CI check: minimum efficiency score (0–1) |
| highest_wasted_bytes | 20971520 | CI check: maximum wasted bytes (20 MB) |
| highest_user_wasted_percent | 0.1 | CI check: maximum wasted percentage (0–1) |
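
Two of these options encode small rules that are easy to get wrong: the access-based TTL and the default thread count. The Python sketch below illustrates both under stated assumptions: the duration syntax is assumed to be a number followed by s, m, h or d (only the d form appears in this document), and all function names are hypothetical.

```python
import os
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_ttl(value):
    """Convert a duration such as '90d' to seconds (assumed syntax)."""
    match = re.fullmatch(r"(\d+)([smhd])", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def is_expired(last_access, now, ttl):
    """An entry expires when it has not been accessed within the TTL."""
    return now - last_access > parse_ttl(ttl)


def default_threads(layer_count, cpus=None):
    """Default from the table: min(layers, 2 x CPUs)."""
    cpus = cpus or os.cpu_count() or 1
    return min(layer_count, 2 * cpus)
```

Note that expiry is measured from the last access, not from creation, so a frequently analyzed image stays cached indefinitely.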
layer_ttl: 30d
cleanup_interval_hours: 6
threads: 4
lowest_efficiency: 0.95
highest_wasted_bytes: 20971520
highest_user_wasted_percent: 0.1

diving keeps two on-disk caches under ~/.diving/, both governed by layer_ttl and swept every cleanup_interval_hours (hourly by default):
- Layer blobs (~/.diving/layers/): compressed layer downloads, keyed by layer digest. A hit skips the network download; decompression and file-tree construction still run.
- Analysis results (~/.diving/analysis/): the fully analyzed result, keyed by the Docker-Content-Digest (from a HEAD against the manifest endpoint) plus architecture. A hit short-circuits the entire pipeline.
The analysis cache is content-addressable, so re-pushing a mutable tag like :latest automatically invalidates the entry. If the HEAD probe fails for any reason, diving silently falls back to a full analysis — caching never blocks a request.
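
The lookup flow for the analysis cache can be summarized in a short Python sketch. It is an illustration under assumptions, not diving's code: head_digest, cache and full_analysis are hypothetical stand-ins supplied by the caller, the key is modeled as a (digest, architecture) tuple, and whether a result is stored after a failed probe is left out because this document does not say.

```python
def analyze_with_cache(image, arch, head_digest, cache, full_analysis):
    """Return an analysis, reusing a cached one when the manifest digest is known."""
    try:
        # Docker-Content-Digest obtained from a HEAD against the manifest endpoint.
        digest = head_digest(image)
    except Exception:
        # Probe failed: fall back to a full analysis so caching never blocks.
        return full_analysis(image, arch)

    key = (digest, arch)  # content digest plus architecture
    if key in cache:
        return cache[key]  # hit: the whole pipeline is skipped
    result = full_analysis(image, arch)
    cache[key] = result
    return result
```

Because the key is derived from the digest rather than the tag, re-pushing :latest produces a new key and the stale entry is simply never looked up again until the TTL sweep removes it.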
Because layer data is downloaded from the source (e.g. Docker Hub), the first run on a large image can take a while. Interrupted downloads resume automatically. For privately-deployed registries, run diving (or its web image) on a host that can reach the registry.
Licensed under the Apache License 2.0.

