For the correspondent who prefers a teletype to a touchscreen, Paperboy ships a proper command-line edition. One install, two main commands, and an editor at your terminal who will convert files to markdown or crawl an entire site without so much as a popup.
The `convert` command turns PDFs, Word documents, spreadsheets, e-books, web pages, images (with local OCR), and plain text into markdown next to the source file. The `crawl` command discovers a site’s pages via its sitemap, fetches them in order, and writes one combined markdown file — ready for an indexer, an agent, or a careful reader.
Configuration is optional. Everything works offline, locally, deterministically. AI features exist only if you bring your own key and call them by name.
$ paperboy-cli convert ./report.pdf -o report.md
→ 12,438 chars · 2 images OCR'd · 4 tables
Wrote report.md (3.2 KB)
$ paperboy-cli crawl https://example.com \
--max-pages 50 --output-mode single \
-o site.md
Discovered 47 URLs from sitemap.xml
Crawled 47/47 pages · 184,920 words
Wrote site.md (412 KB)
$ paperboy-cli doctor --offline
11 OK · 1 warn · 0 fail. Ready to dispatch.
Crawl tries sitemap.xml first, then falls back to internal-link discovery. Same-origin only. Strips tracking params. Outputs a single file, a mirrored tree, or JSONL for RAG pipelines.
Pass --json for machine output. Exit codes follow Unix convention. Ideal for build pipelines and shell scripts.
Install (npm)
npm install -g @proticom/paperboy-cli
Requires Node 20 or later. Install globally from npm and you're ready to convert. See also the GitHub repository.
First-run sequence
# Confirm your environment is ready
paperboy-cli doctor --offline
# Convert a file
paperboy-cli convert ./notes.docx
# Crawl a documentation site
paperboy-cli crawl https://docs.example.com --max-pages 100
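Because exit codes follow Unix convention, the first-run sequence can be wrapped in a small script that stops at the first step that fails. What follows is a minimal sketch, not part of Paperboy itself: the `run_step` and `first_run` helpers and the `PAPERBOY` variable are hypothetical names invented for illustration, and the only assumption about the CLI is that a non-zero exit status means failure. The commands and flags are the ones shown above.

```shell
# Hypothetical override point: lets a script (or a test) swap in another binary.
PAPERBOY="${PAPERBOY:-paperboy-cli}"

# run_step LABEL CMD [ARGS...]
# Runs one command and reports whether it exited with status 0.
run_step() {
  local label="$1"
  shift
  "$@"
  local status=$?
  if [ "$status" -ne 0 ]; then
    echo "FAILED: $label (exit $status)" >&2
    return "$status"
  fi
  echo "ok: $label"
}

# first_run DOC SITE
# Runs doctor, convert, and crawl in order; stops at the first failure
# and returns that step's exit status.
first_run() {
  local doc="$1"
  local site="$2"
  run_step "doctor"  "$PAPERBOY" doctor --offline || return $?
  run_step "convert" "$PAPERBOY" convert "$doc" || return $?
  run_step "crawl"   "$PAPERBOY" crawl "$site" --max-pages 100 || return $?
}
```

Call it as `first_run ./notes.docx https://docs.example.com`. The return status is that of the failing step, or 0 when all three pass, so the function can gate a build step directly.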
Full options are documented in the README and via paperboy-cli --help. The CLI is the same converter that powers the desktop app, the Chrome extension, and the embeddable widget — one engine, four mastheads.