cmd/

directory
Version: v0.1.3
Published: Apr 14, 2020 License: GPL-3.0

Directories

Path Synopsis
avg-lines prints a report of the average confidence for each line, sorted from worst to best
boxtotxt converts a Tesseract .box file to plain text
bucket-lines copies image-text line pairs into different directories according to the average character probability for the line
dehyphenate does basic dehyphenation on an hOCR file
eeboxmltohocr converts the XML from an EEBO download to hOCR, which can be easily incorporated into a searchable PDF
fonttobytes outputs a font file as a series of bytes in Go format, allowing a font to be easily embedded into a Go binary
hocrtotxt prints the text from an hOCR file
pare-gt moves some ground truth, ensuring that the same proportions of each ground truth source are represented in the moved section
pgconf prints the total confidence for a page of hOCR
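
As an illustration of the kind of transformation the fonttobytes synopsis describes, the following is a minimal sketch of rendering raw bytes as a Go byte-slice declaration. It is not the tool's actual source: the function name formatBytes, the variable name fontData, and the 12-bytes-per-row layout are all assumptions made for this example, and the input is a few stand-in bytes rather than a real font file.

```go
package main

import (
	"fmt"
	"strings"
)

// formatBytes renders data as Go source declaring a byte slice.
// The name and the 12-bytes-per-row layout are illustrative choices,
// not taken from the fonttobytes command itself.
func formatBytes(varName string, data []byte) string {
	const perRow = 12
	var b strings.Builder
	fmt.Fprintf(&b, "var %s = []byte{\n", varName)
	for i := 0; i < len(data); i += perRow {
		end := i + perRow
		if end > len(data) {
			end = len(data)
		}
		b.WriteString("\t")
		for j, v := range data[i:end] {
			if j > 0 {
				b.WriteString(" ")
			}
			fmt.Fprintf(&b, "0x%02x,", v)
		}
		b.WriteString("\n")
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	// Stand-in bytes; a real tool would read the font file instead.
	fmt.Print(formatBytes("fontData", []byte{0x00, 0x01, 0x00, 0x00}))
}
```

The generated declaration can be pasted into a Go source file so the data is compiled into the binary, which is the embedding use the synopsis mentions.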