diving-rs

Dive into every layer of a Docker image — find wasted space, leaked secrets, and bloat in seconds.

A single, fast Rust binary that pulls images straight from any registry and shows you exactly what's inside. No Docker daemon, no root, no dependencies. Works on Linux, macOS and Windows.

Why diving?

⚡ Fast & standalone — one static binary. Pulls layers directly from Docker Hub / any V2 registry, a local docker client, or a .tar file. Layers are cached and downloads resume automatically.
🔍 Layer-by-layer explorer — an interactive TUI to walk the filesystem of every layer, with added / modified / removed files colorized.
📉 Waste & bloat detection — efficiency score, wasted bytes, cross-layer duplicate files, oversized layers, package-manager caches, dev/build artifacts, and a reconstructed Dockerfile with anti-pattern linting.
🛡️ Security hygiene checks — flags leaked secret files (.env, SSH / cloud keys, certs), hardcoded credentials in ENV/labels/Dockerfile (key names only, never the value), setuid & world-writable files, and containers that run as root.
🤖 AI optimization report — hand the full analysis to any OpenAI-compatible model and get a prioritized fix list, plus version-over-version regression detection.
🚦 CI gate — fail the pipeline when an image drops below your efficiency / wasted-bytes thresholds.
🌐 Terminal · Web · JSON / Markdown · WeCom — explore interactively, expose an HTTP API, export a report, or push results to a chat.

Scope note: diving focuses on size, structure, and basic security checks (leaked secret files, file permissions, runs-as-root, etc.). It scans file paths and image metadata — it does not do CVE/vulnerability scanning or file-content scanning. Pair it with Trivy/grype/docker scout for vulnerability coverage.

Quick start

# 1. install — pick one:
curl -fsSL https://raw.githubusercontent.com/vicanso/diving-rs/main/install.sh | sh   # prebuilt binary
cargo install diving                                                                  # from crates.io

# 2. dive in
diving redis:alpine

That's it — no Docker daemon required. Prebuilt binaries for Linux / macOS / Windows are also on the release page, or build the latest from source with cargo install --git https://github.com/vicanso/diving-rs.

Inside the TUI:

| Key | Action |
| --- | --- |
| `1` | Show only `Modified` / `Removed` files of the current layer |
| `2` | Show only files ≥ 1 MB |
| `Esc` / `0` | Reset the view |

Analyze any image

diving accepts three source types:

# from a registry (default) — Docker Hub, quay.io, private registries…
diving redis:alpine
diving quay.io/prometheus/node-exporter

# pick an architecture for multi-arch images
diving redis:alpine?arch=arm64

# from the local docker client
diving docker://redis:alpine

# from a saved tar file
diving file:///tmp/redis.tar
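
The three source forms above can be told apart from the argument string alone. The following is a minimal sketch of that classification, not diving's actual parser: the `ImageSource` type and `parse_source` function are hypothetical names, and the handling of the `?arch=` suffix is an assumption based only on the examples shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageSource:
    kind: str             # "registry", "docker", or "file"
    reference: str        # image name or tar path
    arch: Optional[str]   # value of ?arch=, if present


def parse_source(spec: str) -> ImageSource:
    """Classify an image argument (hypothetical helper, for illustration)."""
    # Split off an optional "?arch=<value>" suffix first.
    ref, _, query = spec.partition("?")
    arch = None
    for pair in query.split("&"):
        key, _, value = pair.partition("=")
        if key == "arch" and value:
            arch = value
    # Scheme prefixes select the local docker client or a tar file.
    if ref.startswith("docker://"):
        return ImageSource("docker", ref[len("docker://"):], arch)
    if ref.startswith("file://"):
        return ImageSource("file", ref[len("file://"):], arch)
    # Anything else is treated as a registry reference (the default).
    return ImageSource("registry", ref, arch)
```

For instance, `parse_source("file:///tmp/redis.tar")` yields kind `file` with the path `/tmp/redis.tar`.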

Export a report

# JSON
diving redis:alpine --output-file result.json

# Markdown (detected by the .md extension)
diving redis:alpine --output-file result.md

# Markdown to stdout — base image layers are auto-detected and hidden by default
diving myimage:latest --output-file -

# include the base image layers
diving myimage:latest --output-file - --no-skip-base
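
The examples above imply a simple rule for picking the report format from the `--output-file` value. A rough sketch of that rule follows; `report_format` is a hypothetical name, and the exact matching diving performs (for example, case sensitivity of the extension) is an assumption.

```python
def report_format(output_file: str) -> str:
    """Guess the report format from the --output-file value (illustrative only)."""
    # "-" means stdout, which the examples above show as Markdown.
    if output_file == "-":
        return "markdown"
    # A .md extension selects Markdown; assumed to be an exact, lowercase match.
    if output_file.endswith(".md"):
        return "markdown"
    # Everything else is written as JSON.
    return "json"
```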

CI gate

Run diving in CI to keep images lean. With CI=true it prints the efficiency score and exits with status 1 when any threshold is violated (efficiency below the minimum, or wasted bytes / wasted percentage above the maximum).

CI=true diving redis:alpine

Thresholds are configurable in ~/.diving/config.yml:

| Option | Default | Meaning |
| --- | --- | --- |
| `lowest_efficiency` | `0.95` | Minimum acceptable efficiency score (0–1) |
| `highest_wasted_bytes` | `20971520` (20 MB) | Maximum wasted bytes |
| `highest_user_wasted_percent` | `0.1` | Maximum wasted percentage (0–1) |
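
To make the gate concrete, here is a small sketch of how the three thresholds could be evaluated. It uses the option names and defaults from the table; the `Thresholds` class, the `violations` function, and the choice of strict comparisons at the boundary are assumptions, not diving's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Thresholds:
    lowest_efficiency: float = 0.95
    highest_wasted_bytes: int = 20 * 1024 * 1024   # 20971520
    highest_user_wasted_percent: float = 0.1


def violations(efficiency: float, wasted_bytes: int, user_wasted_percent: float,
               t: Optional[Thresholds] = None) -> List[str]:
    """Return the names of the thresholds an image violates (illustrative)."""
    t = t or Thresholds()
    found = []
    # Efficiency is a floor: lower is worse.
    if efficiency < t.lowest_efficiency:
        found.append("lowest_efficiency")
    # The two waste metrics are ceilings: higher is worse.
    if wasted_bytes > t.highest_wasted_bytes:
        found.append("highest_wasted_bytes")
    if user_wasted_percent > t.highest_user_wasted_percent:
        found.append("highest_user_wasted_percent")
    return found


def exit_code(found: List[str]) -> int:
    # Any violation fails the pipeline with status 1.
    return 1 if found else 0
```

An image with efficiency 0.99, 1 KB wasted, and 1% wasted passes under the defaults; one at 0.90 efficiency fails on `lowest_efficiency` alone.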

AI analysis

Provide an OpenAI-compatible API key and diving sends the full Markdown analysis (layers, reconstructed Dockerfile, wasted space, large files, security findings) to the model and prints a prioritized optimization report instead of opening the TUI. When the ENTRYPOINT/CMD points to a script inside the image, that script is read from the layers and included, so the model can review what the container actually runs.

# enable AI analysis (prints the report, skips the TUI)
diving redis:alpine --ai-api-key sk-xxxx

# custom endpoint / model
diving redis:alpine \
  --ai-api-key sk-xxxx \
  --ai-base-url https://your-gateway/v1 \
  --ai-model gpt-4o

# configure via the environment
export OPENAI_API_KEY=sk-xxxx
diving redis:alpine

# control the report language (also affects terminal / Markdown output)
diving redis:alpine --ai-api-key sk-xxxx --lang zh

| Flag | Environment | Default | Description |
| --- | --- | --- | --- |
| `--ai-api-key` | `OPENAI_API_KEY` | none | OpenAI-compatible API key. Providing it enables AI analysis. |
| `--ai-base-url` | `OPENAI_BASE_URL` | `https://api.openai.com/v1` | API base URL. A full `.../chat/completions` URL is also accepted. |
| `--ai-model` | `OPENAI_MODEL` | `gpt-4o` | Model name. |
| `--ai-system-prompt` | `OPENAI_SYSTEM_PROMPT` | built-in DevSecOps template | Custom system prompt; fully replaces the built-in one. |
| `--lang` | `DIVING_LANG` | system locale | Output language: `en` or `zh`. |
| `--no-ai-history` | none | off | Skip the regression comparison for this run (the snapshot is still refreshed). |

Each run stores a snapshot under ~/.diving/ai_history/. On the next run of the same image, the previous snapshot is sent alongside the current one so the model can flag size regressions / bloat between versions. --no-ai-history skips that comparison for one run (e.g. when the baseline is stale); the snapshot is still refreshed so subsequent runs compare against this one.

Security tip: the API key, base URL and webhook are CLI/env only — they are never accepted as web query parameters, so they don't end up in access logs.
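
The `--ai-base-url` option accepts either a base URL or a full `.../chat/completions` URL. One plausible way to normalize the two forms is sketched below; `chat_completions_url` is a hypothetical helper and the trailing-slash handling is an assumption.

```python
def chat_completions_url(base_url: str = "https://api.openai.com/v1") -> str:
    """Resolve the endpoint to call from a base URL or a full URL (illustrative)."""
    trimmed = base_url.rstrip("/")
    # A full chat-completions URL is used as given.
    if trimmed.endswith("/chat/completions"):
        return trimmed
    # Otherwise append the standard OpenAI-compatible path to the base URL.
    return trimmed + "/chat/completions"
```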

WeCom push

Pass a WeCom (企业微信) group-bot webhook to push the result straight into a chat instead of opening the TUI. Content is chosen so it always fits the bot's ~4096-byte markdown limit:

with --ai-api-key set → the concise AI report is pushed
without AI → a short summary (efficiency score, wasted space, recommendations)

# bot key (expanded to the standard webhook URL automatically)
diving redis:alpine --wecom-webhook 693a91f6-7aoc-4bc4-97a0-0ec2sifa5aaa

# or the full webhook URL
diving redis:alpine --wecom-webhook "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=KEY"

# push the AI report instead of the summary
diving redis:alpine --ai-api-key sk-xxxx --wecom-webhook KEY

# or from the environment
export WECOM_WEBHOOK=KEY
diving redis:alpine

| Flag | Environment | Default | Description |
| --- | --- | --- | --- |
| `--wecom-webhook` | `WECOM_WEBHOOK` | none | WeCom group-bot webhook URL, or a bare bot key. Providing it pushes the result and skips the TUI. |

Oversized content is truncated to the WeCom limit with a … (truncated) marker.
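
The two behaviors described in this section, expanding a bare bot key and truncating to the byte limit, can be sketched as follows. The function names are hypothetical, the limit is taken as exactly 4096 bytes of UTF-8 (the text says "~4096"), and counting the marker against the limit is an assumption.

```python
WECOM_SEND_URL = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key="
LIMIT_BYTES = 4096            # assumed exact value of the "~4096-byte" limit
MARKER = "… (truncated)"


def webhook_url(value: str) -> str:
    """Accept a full webhook URL or a bare bot key (illustrative)."""
    if value.startswith(("http://", "https://")):
        return value
    return WECOM_SEND_URL + value


def fit_to_limit(markdown: str, limit: int = LIMIT_BYTES) -> str:
    """Truncate to at most `limit` UTF-8 bytes, appending the marker."""
    data = markdown.encode("utf-8")
    if len(data) <= limit:
        return markdown
    budget = limit - len(MARKER.encode("utf-8"))
    # errors="ignore" drops a trailing partial multi-byte character,
    # so the cut never lands in the middle of a code point.
    head = data[:budget].decode("utf-8", errors="ignore")
    return head + MARKER
```

Truncating on bytes rather than characters matters for Chinese reports, where each character typically takes three bytes in UTF-8.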

Web mode

Run diving as an HTTP server with a React frontend for remote analysis.

# Create the data directory and grant it to the container user (UID/GID 1000)
mkdir -p $PWD/diving
chown -R 1000:1000 $PWD/diving

docker run -d --restart=always \
  -p 7001:7001 \
  -v $PWD/diving:/home/rust/.diving \
  --name diving \
  vicanso/diving

Open http://127.0.0.1:7001/ in the browser.

The container runs as a non-root UID (1000:1000); the chown above lets it write the layer cache (without it the container fails to start). The image is based on debian:bookworm-slim with ca-certificates and tzdata. It ships no wget/curl, so there is no in-image HEALTHCHECK — probe GET /ping from your orchestrator (Kubernetes livenessProbe, a sidecar, etc.) instead.
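
Since the image has no in-image HEALTHCHECK, an external probe has to call `GET /ping`. Below is a minimal sketch of such a probe using only the Python standard library. The helper names are hypothetical, and treating HTTP 200 as "healthy" is an assumption; the source only says to probe the endpoint.

```python
import urllib.error
import urllib.request


def ping_url(base_url: str) -> str:
    """Build the health-check URL for a diving web instance."""
    return base_url.rstrip("/") + "/ping"


def is_alive(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET /ping answers with HTTP 200 (assumed success code)."""
    try:
        with urllib.request.urlopen(ping_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False
```

With the container from the example above, `is_alive("http://127.0.0.1:7001")` would be the call to make from a sidecar or cron job.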

Change the listen address with --listen:

diving --mode web --listen 0.0.0.0:8080

API

`GET /api/analyze`

Analyze a Docker image and return the result.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `image` | string | yes | Image reference (same formats as terminal mode) |
| `format` | string | no | Set to `markdown` to return a Markdown report instead of JSON |
| `skipBase` | bool | no | When `format=markdown`, auto-detect and hide base image layers (default `true`); set `false` to include them |

# JSON response (default)
curl "http://127.0.0.1:7001/api/analyze?image=redis:alpine"

# specify architecture
curl "http://127.0.0.1:7001/api/analyze?image=redis:alpine%3Farch%3Darm64"

# Markdown report
curl "http://127.0.0.1:7001/api/analyze?image=redis:alpine&format=markdown"

# Markdown report including base layers (hidden by default)
curl "http://127.0.0.1:7001/api/analyze?image=myimage:latest&format=markdown&skipBase=false"
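
When building these URLs programmatically, the image reference must be percent-encoded so that a `?arch=` suffix is not mistaken for a second query string. A small sketch with the standard library follows; `analyze_url` is a hypothetical helper, and only the parameters from the table above are used.

```python
from urllib.parse import urlencode


def analyze_url(base: str, image: str, markdown: bool = False,
                skip_base: bool = True) -> str:
    """Build a GET /api/analyze URL with a safely encoded image reference."""
    params = {"image": image}
    if markdown:
        params["format"] = "markdown"
        # skipBase only matters for Markdown and defaults to true server-side.
        if not skip_base:
            params["skipBase"] = "false"
    # urlencode percent-encodes ":", "?" and "=" inside the image value.
    return base.rstrip("/") + "/api/analyze?" + urlencode(params)
```

Note that `urlencode` also encodes the colon (`redis%3Aalpine`), which decodes to the same value as the unencoded form in the curl examples.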

Sensitive-file scanning

During analysis diving scans every file path against built-in rules (.env files, SSH private keys, AWS/GCP credentials, TLS private keys, kubeconfig, .htpasswd, an accidentally-copied .git directory, …) and reports matches under Security Warnings. (It scans paths and metadata, not file contents.)

Extend or suppress the rules with ~/.diving/sensitive-files — one rule per line:

| Line format | Effect |
| --- | --- |
| `<glob-pattern>` | Flag matching files (reason: "Custom sensitive file") |
| `<glob-pattern> \| <reason>` | Flag with a custom reason label |
| `!<glob-pattern>` | Ignore / suppress matches (overrides built-in and custom patterns above) |

Lines starting with # and blank lines are ignored. Globs are case-insensitive; * matches across directory separators, and patterns are also tested against the filename alone (so *.pem matches a/b/cert.pem).

# ── Extra patterns ───────────────────────────────────────────
**/*.vault-token | Vault token
**/app-secrets.json | Application secrets

# ── Suppress built-in rules for intentional inclusions ───────
!**/.env.example
!**/.env.template
!**/certs/nginx.crt
!**/testdata/**
!**/fixtures/**
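
The rule file format lends itself to a short parser. The sketch below approximates the documented semantics with Python's `fnmatch`, whose `*` also matches across `/`. It is not diving's matcher: the function names are hypothetical, built-in rules are omitted, and letting any `!` rule suppress a match regardless of its position in the file is an assumption.

```python
from fnmatch import fnmatchcase
from posixpath import basename

DEFAULT_REASON = "Custom sensitive file"


def parse_rules(text):
    """Split a sensitive-files document into flag rules and ignore patterns."""
    flags, ignores = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # blank lines and comments
        if line.startswith("!"):
            ignores.append(line[1:].strip().lower())
            continue
        pattern, _, reason = line.partition("|")
        flags.append((pattern.strip().lower(), reason.strip() or DEFAULT_REASON))
    return flags, ignores


def _matches(pattern, path):
    # Case-insensitive: patterns were lowercased at parse time.
    p = path.lower()
    # Test the full path, then the filename alone.
    return fnmatchcase(p, pattern) or fnmatchcase(basename(p), pattern)


def classify(path, flags, ignores):
    """Return the reason a path is flagged, or None if clean or suppressed."""
    if any(_matches(pat, path) for pat in ignores):
        return None
    for pat, reason in flags:
        if _matches(pat, path):
            return reason
    return None
```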

Configuration

Config file: ~/.diving/config.yml.

| Option | Default | Description |
| --- | --- | --- |
| `layer_path` | `~/.diving/layers` | Layer blob cache directory |
| `layer_ttl` | `90d` | TTL for cached layer blobs and analysis results; an entry is purged if not accessed within this duration |
| `analysis_path` | `~/.diving/analysis` | Analysis-result cache directory |
| `cleanup_interval_hours` | `1` | How often (hours) caches are swept for expired entries |
| `threads` | `min(layers, 2 × CPUs)` | Concurrent layer fetch + decompression tasks. Raise on fast networks with many layers; lower when sharing the host |
| `lowest_efficiency` | `0.95` | CI check: minimum efficiency score (0–1) |
| `highest_wasted_bytes` | `20971520` | CI check: maximum wasted bytes (20 MB) |
| `highest_user_wasted_percent` | `0.1` | CI check: maximum wasted percentage (0–1) |

layer_ttl: 30d
cleanup_interval_hours: 6
threads: 4
lowest_efficiency: 0.95
highest_wasted_bytes: 20971520
highest_user_wasted_percent: 0.1
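
As a worked example of the `threads` default, `min(layers, 2 × CPUs)`, the sketch below computes it for a given layer count. `default_threads` is a hypothetical helper, and the floor of 1 for an image with no layers is an added assumption.

```python
import os


def default_threads(layer_count, cpus=None):
    """Default concurrency: min(layers, 2 x CPUs), never below 1 (assumed floor)."""
    cpus = cpus or os.cpu_count() or 1
    return max(1, min(layer_count, 2 * cpus))
```

A 3-layer image on an 8-core host uses 3 tasks; a 40-layer image on a 4-core host is capped at 8.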

How caching works

diving keeps two on-disk caches under ~/.diving/, both governed by layer_ttl and swept every cleanup_interval_hours (1 hour by default):

Layer blobs (~/.diving/layers/) — compressed layer downloads, keyed by layer digest. A hit skips the network download; decompression and file-tree construction still run.
Analysis results (~/.diving/analysis/) — the fully analyzed result, keyed by the Docker-Content-Digest (from a HEAD against the manifest endpoint) plus architecture. A hit short-circuits the entire pipeline.

The analysis cache is content-addressable, so re-pushing a mutable tag like :latest automatically invalidates the entry. If the HEAD probe fails for any reason, diving silently falls back to a full analysis — caching never blocks a request.
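
The lookup flow described above can be summarized in a few lines. This is a sketch under stated assumptions, not diving's code: the key layout, the function names, and the use of a plain dict as the cache are all hypothetical, and `head_digest` / `full_analysis` stand in for the manifest HEAD probe and the real pipeline.

```python
def analysis_cache_key(content_digest, arch):
    # Hypothetical key layout: manifest digest plus architecture.
    return f"{content_digest}-{arch}"


def analyze_with_cache(image, arch, cache, head_digest, full_analysis):
    """Return a cached result when the manifest digest is known, else analyze."""
    try:
        digest = head_digest(image)
    except Exception:
        # HEAD probe failed: fall back to a full analysis, bypassing the cache.
        return full_analysis(image, arch)
    key = analysis_cache_key(digest, arch)
    if key in cache:
        return cache[key]          # hit: the whole pipeline is skipped
    result = full_analysis(image, arch)
    cache[key] = result
    return result
```

Because the key is derived from the digest rather than the tag, re-pushing `:latest` produces a new digest and therefore a cache miss, which is the invalidation behavior described above.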

Because layer data is downloaded from the source (e.g. Docker Hub), the first run on a large image can take a while. Interrupted downloads resume automatically. For privately-deployed registries, run diving (or its web image) on a host that can reach the registry.

License

Licensed under the Apache License 2.0.
