Documentation
Overview
cmd/pdftable is the command-line interface to the pdftable library. It mirrors pdfplumber's CLI surface for the operations pdftable implements: extract text, extract tables, dump page geometry.
Usage:
    pdftable extract <file.pdf> [flags]
Flags (extract subcommand):
    --pages                      Comma-separated page list / dash ranges, e.g. "1,3-5". Default: all.
    --tables                     Emit detected tables (JSON or text per --format).
    --text                       Emit extracted text. Mutually exclusive with --tables.
    --format                     "json" (default) | "text". Output format.
    --vertical-strategy          "lines" (default) | "lines_strict" | "text" | "explicit".
    --horizontal-strategy        Same set; default "lines".
    --snap-tolerance             Float; default 3.
    --join-tolerance             Float; default 3.
    --edge-min-length            Float; default 3.
    --intersection-tolerance     Float; default 3.
    --text-tolerance             Float; default 3.
    --min-words-vertical         Int; default 3.
    --min-words-horizontal       Int; default 1.
    --explicit-vertical-lines    Comma-separated floats; required when vertical-strategy=explicit.
    --explicit-horizontal-lines  Comma-separated floats; required when horizontal-strategy=explicit.
    --indent                     Int; JSON pretty-printing indent. 0 = compact.
The CLI uses only the standard library `flag` package and the `pdftable` public API; it has no third-party dependencies.
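A subcommand like this is typically wired up with flag.NewFlagSet. The following is a rough sketch under stated assumptions, not the actual cmd/pdftable source: extractOptions and parseExtractArgs are hypothetical names, only a few of the flags are shown, and the file is assumed to be the first argument after "extract". Because the flag package stops parsing at the first non-flag argument, the sketch peels off the file path before calling Parse.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"os"
)

// extractOptions holds a subset of the extract flags (hypothetical type).
type extractOptions struct {
	File          string
	Pages         string
	Tables        bool
	Text          bool
	Format        string
	SnapTolerance float64
}

// parseExtractArgs parses the arguments that follow "extract".
// Hypothetical helper; assumes the PDF path comes first, then the flags.
func parseExtractArgs(args []string) (*extractOptions, error) {
	if len(args) == 0 {
		return nil, errors.New("usage: pdftable extract <file.pdf> [flags]")
	}
	opts := &extractOptions{File: args[0]}

	fs := flag.NewFlagSet("extract", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // return errors instead of printing usage
	fs.StringVar(&opts.Pages, "pages", "", "page list, e.g. \"1,3-5\" (empty = all)")
	fs.BoolVar(&opts.Tables, "tables", false, "emit detected tables")
	fs.BoolVar(&opts.Text, "text", false, "emit extracted text")
	fs.StringVar(&opts.Format, "format", "json", "output format: json or text")
	fs.Float64Var(&opts.SnapTolerance, "snap-tolerance", 3, "snap tolerance")

	// The flag package accepts both -name and --name spellings.
	if err := fs.Parse(args[1:]); err != nil {
		return nil, err
	}
	if opts.Tables && opts.Text {
		return nil, errors.New("--tables and --text are mutually exclusive")
	}
	if opts.Format != "json" && opts.Format != "text" {
		return nil, fmt.Errorf("unknown --format %q", opts.Format)
	}
	return opts, nil
}

func main() {
	// Fixed arguments for demonstration; a real command would use os.Args.
	opts, err := parseExtractArgs([]string{"report.pdf", "--tables", "--pages", "1,3-5"})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", *opts)
}
```

Using flag.ContinueOnError keeps error handling in the caller, which makes the parsing step easy to test without exiting the process.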