Documentation ¶
Overview ¶
Package format provides file format detection for the tabula library.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Format ¶
type Format int
Format represents a supported document format.
const (
	// Unknown indicates an unrecognized format.
	Unknown Format = iota
	// PDF indicates a PDF document.
	PDF
	// DOCX indicates a Microsoft Word (.docx) document.
	DOCX
	// ODT indicates an OpenDocument Text (.odt) document.
	ODT
	// XLSX indicates a Microsoft Excel (.xlsx) document.
	XLSX
	// PPTX indicates a Microsoft PowerPoint (.pptx) document.
	PPTX
	// HTML indicates an HTML document.
	HTML
	// EPUB indicates an EPUB e-book document.
	EPUB
)
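Because Unknown is the first constant in the iota block, it is also the zero value of Format, so an uninitialized Format reads as "unrecognized". The sketch below illustrates this with a local copy of the enumeration; the formatName helper is hypothetical and is not part of the documented package.

```go
package main

import "fmt"

// Format is a local copy of the documented enumeration, declared here
// only so the sketch compiles on its own.
type Format int

const (
	Unknown Format = iota
	PDF
	DOCX
	ODT
	XLSX
	PPTX
	HTML
	EPUB
)

// formatName is a hypothetical helper (not part of the package) that
// maps a Format to a short label.
func formatName(f Format) string {
	names := []string{"unknown", "pdf", "docx", "odt", "xlsx", "pptx", "html", "epub"}
	if f < 0 || int(f) >= len(names) {
		return "unknown"
	}
	return names[f]
}

func main() {
	var f Format // zero value
	fmt.Println(f == Unknown, formatName(f))
	fmt.Println(int(EPUB), formatName(EPUB))
}
```

Callers can therefore treat a zero Format as "not detected" without a separate boolean flag.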
func DetectFromMagic ¶
DetectFromMagic checks a file's magic bytes to determine its format. This is more reliable than extension-based detection. It returns Unknown if the format cannot be determined from magic bytes alone.
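As an illustration of magic-byte detection in general, the sketch below checks two well-known signatures: "%PDF-" for PDF and "PK\x03\x04" for ZIP containers. The sniffMagic function and its string results are assumptions made for this example; they are not the signature or behavior of DetectFromMagic. Note that a ZIP signature alone cannot tell DOCX, XLSX, PPTX, ODT, and EPUB apart, which is one case where magic bytes alone are inconclusive.

```go
package main

import (
	"bytes"
	"fmt"
)

// sniffMagic is a hypothetical helper, not the tabula API. It classifies
// a file header by its leading bytes only.
func sniffMagic(header []byte) string {
	switch {
	case bytes.HasPrefix(header, []byte("%PDF-")):
		return "pdf"
	case bytes.HasPrefix(header, []byte("PK\x03\x04")):
		// ZIP local file header: the container could hold DOCX, XLSX,
		// PPTX, ODT, or EPUB, so the result stays generic.
		return "zip"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(sniffMagic([]byte("%PDF-1.7\n")))
	fmt.Println(sniffMagic([]byte("PK\x03\x04rest of archive")))
	fmt.Println(sniffMagic([]byte("plain text")))
}
```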
func DetectFromReader ¶
DetectFromReader inspects the content to determine the format. This is more reliable than extension-based detection and can distinguish between the different ZIP-based formats (DOCX, XLSX, PPTX).
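One common way to tell ZIP-based Office formats apart is to open the archive and look at its entry names: Office Open XML packages keep their main parts under word/, xl/, or ppt/. The sketch below shows that idea using the standard archive/zip package. The classifyZip and buildZip helpers are hypothetical, and nothing here is confirmed to match how DetectFromReader works internally.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"strings"
)

// classifyZip is a hypothetical helper, not the tabula API. It inspects
// entry names to guess which Office Open XML format a ZIP archive holds.
func classifyZip(data []byte) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return "", err
	}
	for _, f := range zr.File {
		switch {
		case strings.HasPrefix(f.Name, "word/"):
			return "docx", nil
		case strings.HasPrefix(f.Name, "xl/"):
			return "xlsx", nil
		case strings.HasPrefix(f.Name, "ppt/"):
			return "pptx", nil
		}
	}
	return "unknown", nil
}

// buildZip creates an in-memory archive with a single empty entry so the
// example needs no files on disk.
func buildZip(name string) []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	if _, err := zw.Create(name); err != nil {
		panic(err)
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	names := []string{"word/document.xml", "xl/workbook.xml", "ppt/presentation.xml", "notes.txt"}
	for _, name := range names {
		kind, err := classifyZip(buildZip(name))
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s\n", name, kind)
	}
}
```

Reading the archive directory requires random access to the whole file, which is why this sketch takes a byte slice rather than a streaming reader.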