Convert PDF, DOCX, PPTX, XLSX, and HTML to Markdown — in one command, zero Python required.
Built on Microsoft's MarkItDown. Packaged as a Docker/Podman container and distributed as an AI agent skill that works with Claude Code, OpenCode, Codex, Cursor, Windsurf, and 40+ more.

    docker run --rm -i ghcr.io/opentechil/markitdown-for-ai < report.pdf

Most AI agents can't read binary documents. They need plain text. MarkItDown4AI solves this by giving every agent — and every pipeline — a single, consistent way to extract structured Markdown from any document format.
- No Python, no installs — runs entirely in Docker or Podman
- Preserves structure — tables, headings, lists, and formatting survive conversion
- AI-native — install once as a skill; every supported agent automatically knows how to use it
- CI/CD ready — pipe it into any shell script, GitHub Action, or automation workflow
- Multi-arch — native `amd64` and `arm64` images (Apple Silicon, AWS Graviton, x86 servers)
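
Since the image runs under either Docker or Podman, a script can pick whichever runtime is installed. The following is a minimal sketch, not part of the project: the function names and the `CONTAINER_RUNTIME` override variable are hypothetical.

```shell
# Sketch: choose a container runtime.
# CONTAINER_RUNTIME is a hypothetical override variable, not defined by the project.
pick_runtime() {
  if [ -n "${CONTAINER_RUNTIME:-}" ]; then
    printf '%s\n' "$CONTAINER_RUNTIME"
  elif command -v docker >/dev/null 2>&1; then
    printf 'docker\n'
  elif command -v podman >/dev/null 2>&1; then
    printf 'podman\n'
  else
    echo "error: neither docker nor podman found on PATH" >&2
    return 1
  fi
}

# Convert the document on stdin to Markdown on stdout.
to_markdown() {
  runtime=$(pick_runtime) || return 1
  "$runtime" run --rm -i ghcr.io/opentechil/markitdown-for-ai
}
```

With these functions sourced, `to_markdown < report.pdf > report.md` should behave like the plain `docker run` form on machines that only have Podman.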

    docker run --rm -i ghcr.io/opentechil/markitdown-for-ai < document.pdf

Podman works as a drop-in replacement:

    podman run --rm -i ghcr.io/opentechil/markitdown-for-ai < document.pdf

Install the document-to-markdown skill so your AI agent automatically knows how to convert documents whenever you ask.

    npx skills add OpenTechIL/markitdown-for-ai

Detects your agent automatically and installs to the correct location.

    # Install globally for Claude Code
    bash <(curl -fsSL https://raw.githubusercontent.com/OpenTechIL/markitdown-for-ai/main/install-skill.sh) --ai claude

Installs to `~/.claude/skills/document-to-markdown/`.
Then in any Claude Code session, just say:
"Summarize this PDF" — and Claude will automatically convert and read it.

    # Install globally for OpenCode
    bash <(curl -fsSL https://raw.githubusercontent.com/OpenTechIL/markitdown-for-ai/main/install-skill.sh) --ai opencode
    # Or install to a specific project
    bash <(curl -fsSL https://raw.githubusercontent.com/OpenTechIL/markitdown-for-ai/main/install-skill.sh) --local

Global installs to `~/.config/opencode/skills/document-to-markdown/`.
Local installs to `.opencode/skills/document-to-markdown/` in the current project.

    bash <(curl -fsSL https://raw.githubusercontent.com/OpenTechIL/markitdown-for-ai/main/install-skill.sh)

Installs to every supported location:
- `~/.config/opencode/skills/` (OpenCode global)
- `~/.claude/skills/` (Claude Code)
- `~/.agents/skills/` (Codex / shared agents)
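
To confirm where the skill ended up after running an installer, a small check like the one below may help. It is a sketch only: the function name is hypothetical, and the `document-to-markdown/` subdirectory under `~/.agents/skills/` is an assumption extrapolated from the other two locations.

```shell
# Sketch: report which skill locations contain the document-to-markdown skill.
# The optional first argument is the home directory to inspect (defaults to $HOME).
# Assumption: ~/.agents/skills/ uses the same document-to-markdown/ subdirectory.
list_installed_skills() {
  home_dir=${1:-$HOME}
  for base in .config/opencode/skills .claude/skills .agents/skills; do
    skill_dir="$home_dir/$base/document-to-markdown"
    if [ -d "$skill_dir" ]; then
      printf 'installed: %s\n' "$skill_dir"
    else
      printf 'missing:   %s\n' "$skill_dir"
    fi
  done
}
```

Calling `list_installed_skills` with no arguments inspects the current user's home directory and prints one line per location.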
| Format | Extension |
|---|---|
| PDF | .pdf |
| Word | .docx |
| PowerPoint | .pptx |
| Excel | .xlsx |
| HTML | .html |
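
When converting a directory of mixed files, it can be useful to skip anything outside the table above. The helpers below are a hedged sketch: the function names are hypothetical, matching is by file name only, and treating extensions case-insensitively is a choice made here, not documented behavior of the image.

```shell
# Sketch: succeed only for file names ending in a supported extension.
is_supported() {
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.pdf|*.docx|*.pptx|*.xlsx|*.html) return 0 ;;
    *) return 1 ;;
  esac
}

# Derive the Markdown output name, e.g. report.pdf -> report.md
md_name() {
  printf '%s.md\n' "${1%.*}"
}

# Convert every supported file in a directory (defaults to the current one),
# writing each result next to its source file.
convert_dir() {
  for f in "${1:-.}"/*; do
    [ -f "$f" ] || continue
    is_supported "$f" || continue
    docker run --rm -i ghcr.io/opentechil/markitdown-for-ai < "$f" > "$(md_name "$f")"
  done
}
```

Running `convert_dir ./docs` would then leave files such as `notes.txt` untouched and write one `.md` file per supported document.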

    # Pipe a document in over stdin
    cat report.docx | docker run --rm -i ghcr.io/opentechil/markitdown-for-ai

    # Mount the current directory and pass a file name as an argument
    docker run --rm -v "$(pwd):/data" -w /data ghcr.io/opentechil/markitdown-for-ai slides.pptx

    # Redirect the Markdown output to a file
    docker run --rm -i ghcr.io/opentechil/markitdown-for-ai < input.xlsx > output.md

    # Convert every PDF in the current directory
    for f in *.pdf; do
      docker run --rm -i ghcr.io/opentechil/markitdown-for-ai < "$f" > "${f%.pdf}.md"
    done

    # Convert a web page fetched with curl
    curl -s "https://example.com" | docker run --rm -i ghcr.io/opentechil/markitdown-for-ai

    # Feed the Markdown into a downstream tool
    docker run --rm -i ghcr.io/opentechil/markitdown-for-ai < document.pdf | my-embed-cli ingest

| Architecture | Targets |
|---|---|
| amd64 | x86_64 servers, most desktops |
| arm64 | Apple Silicon, AWS Graviton, ARM servers |
Runs on Linux, macOS (Docker Desktop or Podman), and Windows (Docker Desktop).
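
The container runtime normally selects the matching image for the host on its own. If a script needs to name the architecture explicitly, the mapping in the table above can be sketched as follows. The function name is hypothetical; `--platform` is a standard `docker run` option.

```shell
# Sketch: map a machine name (default: `uname -m`) to the image architecture.
image_arch() {
  machine=${1:-$(uname -m)}
  case "$machine" in
    x86_64|amd64) printf 'amd64\n' ;;
    aarch64|arm64) printf 'arm64\n' ;;
    *)
      echo "error: no published image for architecture: $machine" >&2
      return 1
      ;;
  esac
}
```

A forced selection would then look like `docker run --rm -i --platform "linux/$(image_arch)" ghcr.io/opentechil/markitdown-for-ai < document.pdf`.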
- Runs as a non-root user (`appuser`) inside the container
- Multi-stage build — no build tools in the runtime image
- No network access during document conversion
- Self-contained: only the MarkItDown library and its declared extras (`pdf`, `docx`, `pptx`, `xlsx`)
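
The properties above describe the image itself. A caller who wants the runtime to enforce similar limits can add restrictions at launch, as in this sketch. The flags are standard Docker and Podman options, but whether the image runs cleanly with all of them (in particular `--read-only`) is an assumption to verify locally. The function name and the `CONTAINER_RUNTIME` variable are hypothetical.

```shell
# Sketch: run the converter with extra runtime restrictions.
# CONTAINER_RUNTIME is a hypothetical override; it defaults to docker.
# Assumption: the image works with a read-only root filesystem.
convert_locked_down() {
  "${CONTAINER_RUNTIME:-docker}" run --rm -i \
    --network none \
    --read-only \
    --cap-drop ALL \
    --security-opt no-new-privileges \
    ghcr.io/opentechil/markitdown-for-ai
}
```

Usage mirrors the plain command: `convert_locked_down < document.pdf > document.md`. If a conversion fails under these flags, dropping `--read-only` first is a reasonable step.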

    docker build -t markitdown-for-ai .
    docker run --rm -i markitdown-for-ai < test.pdf

GitHub Actions builds and publishes multi-arch images to GHCR on every push to main. Releases are tagged automatically.
Contributions are welcome. Please follow Conventional Commits and update CHANGELOG.md under [Unreleased] with every change. See AGENTS.md for full contributor guidance.
Apache License 2.0 — see LICENSE for details.