Documentation ¶
Constants ¶
This section is empty.
Variables ¶
var (
	// ErrPDFOpen is returned when there is an error opening the PDF file
	ErrPDFOpen = errors.New("error opening PDF file")

	// ErrOutputFile is returned when there is an error opening the output file
	ErrOutputFile = errors.New("error opening output file")

	// ErrPermissions is returned when there is an error related to PDF permissions
	ErrPermissions = errors.New("error related to PDF permissions")

	// ErrInvalidPage is returned when the page number is invalid
	ErrInvalidPage = errors.New("invalid page number")

	// ErrInvalidRange is returned when the page range is invalid
	ErrInvalidRange = errors.New("invalid page range")

	// ErrCommandFailed is returned when the pdftotext command fails
	ErrCommandFailed = errors.New("pdftotext command failed")

	// ErrBinaryNotFound is returned when the pdftotext binary is not found
	ErrBinaryNotFound = errors.New("pdftotext binary not found")
)
Functions ¶
This section is empty.
Types ¶
type Converter ¶
type Converter struct {
// contains filtered or unexported fields
}
Converter represents a PDF-to-text converter.
type Options ¶
type Options struct {
	// FirstPage is the first page to convert
	FirstPage int
	// LastPage is the last page to convert
	LastPage int
	// Resolution is the resolution in DPI (default 72)
	Resolution int
	// CropX is the X-coordinate of crop area
	CropX int
	// CropY is the Y-coordinate of crop area
	CropY int
	// CropWidth is the width of crop area
	CropWidth int
	// CropHeight is the height of crop area
	CropHeight int
	// Layout maintains the original layout
	Layout bool
	// FixedPitch keeps the text in a fixed-pitch font
	FixedPitch float64
	// Raw keeps text in content stream order
	Raw bool
	// NoDiagonal discards diagonal text
	NoDiagonal bool
	// HTMLMeta generates HTML with meta information
	HTMLMeta bool
	// BBox generates XHTML with word bounding boxes
	BBox bool
	// BBoxLayout generates XHTML with block/line/word bounding boxes
	BBoxLayout bool
	// TSV generates TSV with bounding box information
	TSV bool
	// CropBox uses crop box instead of media box
	CropBox bool
	// ColSpacing is the column spacing (default 0.7)
	ColSpacing float64
	// Encoding is the text output encoding (default UTF-8)
	Encoding string
	// EOL is the end-of-line convention (default Unix)
	EOL EOLType
	// NoPageBreaks don't insert page breaks
	NoPageBreaks bool
	// OwnerPassword is the PDF owner password
	OwnerPassword string
	// UserPassword is the PDF user password
	UserPassword string
	// Quiet suppresses messages and errors
	Quiet bool
}
Options holds the configuration options for a PDF conversion.