to-markdown.sh: convert files/URLs to Markdown with Docker
Juliusz Ćwiąkalski
This article describes to-markdown.sh: a single, self-contained CLI script that converts many document types (PDF/DOCX/HTML/… and URLs) into Markdown using the markitdown tool running inside Docker.
Why this script exists (motivation)
MarkItDown is a Python tool. If you are a Python developer, installing it is usually fine. But for everyone else, it often means dealing with:
- installing Python (and the right Python version)
- setting up a virtualenv / pipx
- troubleshooting native dependencies or platform-specific issues
- avoiding “works on my machine” problems and dependency conflicts
to-markdown.sh intentionally hides all that complexity by running markitdown in a small Docker image.
That means the only requirement on the host is Docker: you get a self-contained, reproducible solution that behaves the same on any machine (and in CI). The script even auto-builds the image on first run, so the setup stays boring.
Source, download, license
- Author: Juliusz Ćwiąkalski
- LinkedIn: https://www.linkedin.com/in/juliusz-cwiakalski/
- Direct download (latest): https://www.cwiakalski.com/cli/to-markdown.sh
- Article URL: https://www.cwiakalski.com/cli/to-markdown
License: MIT (free to use for any purpose, including commercial; must keep attribution; provided “AS IS”, without warranty).
Requirements
- Docker CLI installed
- Docker daemon running and accessible to your user
The script builds a small Docker image automatically on first run (or when using --rebuild).
Install
Download and make it executable:
mkdir -p "$HOME/.local/bin"
curl -fsSL https://www.cwiakalski.com/cli/to-markdown.sh -o "$HOME/.local/bin/to-markdown.sh"
chmod +x "$HOME/.local/bin/to-markdown.sh"
Create a convenient alias named to-markdown and load it from your shell rc:
# pick the rc files you actually use
for rc in "$HOME/.bashrc" "$HOME/.zshrc"; do
  [ -f "$rc" ] || continue
  # add the alias only once
  grep -q "alias to-markdown=" "$rc" || {
    printf '\n# to-markdown.sh\nalias to-markdown=\"$HOME/.local/bin/to-markdown.sh\"\n' >> "$rc"
  }
done
# reload your shell config (run only the line that matches your shell)
source "$HOME/.bashrc" 2>/dev/null || true   # bash
source "$HOME/.zshrc" 2>/dev/null || true    # zsh
Verify:
to-markdown --help
Usage
The output is always written to stdout, so redirect to a file if needed.
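One caveat with plain redirection: the shell truncates the target file before the command runs, so a failed conversion with `> out.md` leaves an empty or partial file behind. A minimal sketch of a safer wrapper follows, assuming the script lives at the path used in the Install section. The names `convert_to` and `TO_MARKDOWN` are hypothetical and not part of to-markdown.sh; the wrapper calls the script by path because shell aliases are typically unavailable inside scripts.

```shell
# Hypothetical helper; not part of to-markdown.sh.
# Path to the script as installed above; override it if yours differs.
TO_MARKDOWN="${TO_MARKDOWN:-$HOME/.local/bin/to-markdown.sh}"

# convert_to OUTPUT [to-markdown arguments...]
# Writes to a temporary file next to OUTPUT and moves it into place
# only if the conversion exits successfully.
convert_to() {
  out_file=$1
  shift
  tmp_file=$(mktemp "${out_file}.XXXXXX") || return 1
  if "$TO_MARKDOWN" "$@" > "$tmp_file"; then
    mv "$tmp_file" "$out_file"
  else
    status=$?            # exit status of the failed conversion
    rm -f "$tmp_file"    # leave any existing OUTPUT untouched
    return "$status"
  fi
}
```

For example, `convert_to kernel.md https://kernel.org/` replaces kernel.md only when the conversion succeeds. Because the temporary file is created by mktemp, the result is readable only by your user unless you chmod it afterwards.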
1) Convert from stdin
Use this when you already have the content in a pipe.
curl -fsSL https://example.com | to-markdown > out.md
You can also feed it local files by piping them:
cat ./document.pdf | to-markdown > document.md
2) Convert a URL
Pass the URL as the input:
to-markdown https://kernel.org/ > kernel.md
Or explicitly:
to-markdown --url https://kernel.org/ > kernel.md
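When converting several URLs, each one needs its own output file, since the script only writes to stdout. The sketch below derives a file name from each URL and uses the `--url` form shown above. `url_to_filename`, `convert_urls` and `TO_MARKDOWN` are hypothetical names introduced for illustration, and the naming scheme is just one reasonable choice.

```shell
# Hypothetical helpers; not part of to-markdown.sh.
TO_MARKDOWN="${TO_MARKDOWN:-$HOME/.local/bin/to-markdown.sh}"

# Derive a filesystem-safe .md file name from a URL.
url_to_filename() {
  url_name=${1#*://}       # drop the scheme, e.g. https://
  url_name=${url_name%/}   # drop one trailing slash
  # replace every character outside [A-Za-z0-9._-] with an underscore
  url_name=$(printf '%s' "$url_name" | tr -c 'A-Za-z0-9._-' '_')
  printf '%s.md\n' "$url_name"
}

# Convert each URL argument into its own file in the current directory.
convert_urls() {
  for url in "$@"; do
    url_out=$(url_to_filename "$url")
    "$TO_MARKDOWN" --url "$url" > "$url_out" || echo "failed: $url" >&2
  done
}
```

With this, `convert_urls https://kernel.org/ https://example.com` would write kernel.org.md and example.com.md.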
3) Convert a local file
Pass a file path (the script mounts it read-only into the container):
to-markdown ./document.pdf > document.md
to-markdown ~/Downloads/report.docx > report.md
Or explicitly:
to-markdown --file ./document.pdf > document.md
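The same pattern extends to a whole directory. This is a minimal sketch, assuming the `--file` form shown above and the install path from the Install section; `convert_dir` and `TO_MARKDOWN` are hypothetical names, not features of the script.

```shell
# Hypothetical helper; not part of to-markdown.sh.
TO_MARKDOWN="${TO_MARKDOWN:-$HOME/.local/bin/to-markdown.sh}"

# convert_dir SOURCE_DIR OUTPUT_DIR
# Converts every regular file directly inside SOURCE_DIR and writes
# OUTPUT_DIR/<name without extension>.md for each of them.
convert_dir() {
  src_dir=$1
  dst_dir=$2
  mkdir -p "$dst_dir" || return 1
  for src_file in "$src_dir"/*; do
    [ -f "$src_file" ] || continue   # skip subdirectories and an unmatched glob
    base_name=$(basename "$src_file")
    "$TO_MARKDOWN" --file "$src_file" > "$dst_dir/${base_name%.*}.md" \
      || echo "failed: $src_file" >&2
  done
}
```

Note that two inputs sharing a base name (report.pdf and report.docx) would map to the same report.md, so the later one wins in this sketch.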
4) OCR images (screenshots) to Markdown
If the input is an image file (png/jpg/webp/tiff/bmp), the tool extracts text using offline OCR.
Example (screenshot image):
to-markdown ~/Desktop/screenshot.png > screenshot.md
If you have the image on stdin, provide an extension hint to MarkItDown:
cat ~/Desktop/screenshot.png | to-markdown -- --extension png > screenshot.md
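If the piped file name is known, the extension hint can be derived from it instead of being typed by hand. A small sketch, using only the `-- --extension` form shown above; `ext_hint`, `convert_via_stdin` and `TO_MARKDOWN` are hypothetical names, and lowercasing the extension is an assumption about what markitdown prefers.

```shell
# Hypothetical helpers; not part of to-markdown.sh.
TO_MARKDOWN="${TO_MARKDOWN:-$HOME/.local/bin/to-markdown.sh}"

# Print the lowercase extension of a file name.
# Returns non-zero when there is none (README, .bashrc, "name.").
ext_hint() {
  hint_base=${1##*/}   # strip leading directories
  case $hint_base in
    ?*.?*) printf '%s\n' "${hint_base##*.}" | tr 'A-Z' 'a-z' ;;
    *)     return 1 ;;
  esac
}

# Feed a file through stdin, forwarding its extension when it has one.
convert_via_stdin() {
  if hint=$(ext_hint "$1"); then
    "$TO_MARKDOWN" -- --extension "$hint" < "$1"
  else
    "$TO_MARKDOWN" < "$1"
  fi
}
```

Usage would look like `convert_via_stdin ~/Desktop/screenshot.png > screenshot.md`.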
5) Force rebuild of the Docker image
Useful if you want a clean rebuild (or updated dependencies):
to-markdown --rebuild https://kernel.org/ > kernel.md
6) Passing arguments directly to markitdown
Anything after -- is forwarded to markitdown:
cat ./document.pdf | to-markdown -- --help
Troubleshooting
Docker not found / daemon not running
If you see an error like:
Docker is required. Install Docker and re-run.
Docker daemon is not available...
Install Docker Desktop / Docker Engine and make sure the Docker daemon is running.
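To tell the two cases apart before running a conversion, a quick preflight check can help. This is a sketch, not the check the script itself performs: `check_docker`, its messages and its return codes are hypothetical, and it relies on `docker info` exiting with a non-zero status when the daemon cannot be reached.

```shell
# Hypothetical preflight check; messages and return codes are illustrative.
check_docker() {
  # 1) Is the docker CLI on PATH at all?
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker CLI not found on PATH" >&2
    return 1
  fi
  # 2) Can the CLI talk to a running daemon?
  if ! docker info >/dev/null 2>&1; then
    echo "docker CLI found, but the daemon is not reachable" >&2
    return 2
  fi
  echo "docker is ready"
}
```

On Linux, the second case also covers a daemon that is running but whose socket your user is not allowed to access.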