format

package
v1.6.6 Latest
Warning

This package is not in the latest version of its module.

Published: Feb 4, 2026 License: MIT Imports: 4 Imported by: 0

Documentation

Overview

Package format provides file format detection for the tabula library.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Format

type Format int

Format represents a supported document format.

const (
	// Unknown indicates an unrecognized format.
	Unknown Format = iota
	// PDF indicates a PDF document.
	PDF
	// DOCX indicates a Microsoft Word (.docx) document.
	DOCX
	// ODT indicates an OpenDocument Text (.odt) document.
	ODT
	// XLSX indicates a Microsoft Excel (.xlsx) document.
	XLSX
	// PPTX indicates a Microsoft PowerPoint (.pptx) document.
	PPTX
	// HTML indicates an HTML document.
	HTML
	// EPUB indicates an EPUB e-book document.
	EPUB
)

func Detect

func Detect(filename string) Format

Detect determines the file format from the filename's extension.
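
As an illustration of extension-based detection, the following is a minimal, self-contained sketch and not the package's actual implementation. The local Format type copies the documented constants; detectByExtension is a hypothetical stand-in for Detect. Case-insensitive matching and the ".htm" alias are assumptions that the documentation does not confirm.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Format is a local copy of the documented enum, for illustration only.
type Format int

const (
	Unknown Format = iota
	PDF
	DOCX
	ODT
	XLSX
	PPTX
	HTML
	EPUB
)

// detectByExtension is a hypothetical stand-in for Detect. It maps the
// lowercased filename extension to a Format and falls back to Unknown.
func detectByExtension(filename string) Format {
	switch strings.ToLower(filepath.Ext(filename)) {
	case ".pdf":
		return PDF
	case ".docx":
		return DOCX
	case ".odt":
		return ODT
	case ".xlsx":
		return XLSX
	case ".pptx":
		return PPTX
	case ".html", ".htm": // ".htm" is an assumed alias
		return HTML
	case ".epub":
		return EPUB
	default:
		return Unknown
	}
}

func main() {
	for _, name := range []string{"report.PDF", "notes.docx", "archive.tar.gz"} {
		fmt.Printf("%s -> %d\n", name, detectByExtension(name))
	}
}
```

A lookup of this kind only inspects the name, so a renamed file is misclassified; that is the weakness the content-based functions address.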

func DetectFromMagic

func DetectFromMagic(data []byte) Format

DetectFromMagic checks a file's magic bytes to determine its format. This is more reliable than extension-based detection. It returns Unknown if the format cannot be determined from magic bytes alone.
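
To make the idea of magic-byte detection concrete, here is a small sketch under stated assumptions. It is not the package's code: sniffMagic and looksLikeZip are hypothetical helpers, and the local Format type carries only the constants the sketch needs. The signatures themselves are standard: PDF files begin with "%PDF-", and ZIP archives begin with the local file header signature "PK\x03\x04".

```go
package main

import (
	"bytes"
	"fmt"
)

// Format is a trimmed local copy of the documented enum, for illustration only.
type Format int

const (
	Unknown Format = iota
	PDF
)

var (
	pdfMagic = []byte("%PDF-")      // PDF header marker
	zipMagic = []byte("PK\x03\x04") // ZIP local file header signature
)

// looksLikeZip reports whether data starts with the ZIP signature. The
// signature alone cannot tell DOCX, XLSX, and PPTX apart, because all
// three are ZIP containers.
func looksLikeZip(data []byte) bool {
	return bytes.HasPrefix(data, zipMagic)
}

// sniffMagic is a hypothetical stand-in for DetectFromMagic. It only
// recognizes PDF and returns Unknown for everything else.
func sniffMagic(data []byte) Format {
	if bytes.HasPrefix(data, pdfMagic) {
		return PDF
	}
	return Unknown
}

func main() {
	fmt.Println(sniffMagic([]byte("%PDF-1.7\n")) == PDF)
	fmt.Println(looksLikeZip([]byte("PK\x03\x04\x14\x00")))
}
```

In this sketch a ZIP signature yields Unknown, which is one plausible reading of "cannot be determined from magic bytes alone"; how the real function treats ZIP containers is not stated in the documentation.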

func DetectFromReader

func DetectFromReader(r io.ReaderAt, size int64) (Format, error)

DetectFromReader inspects the content to determine format. This is more reliable than extension-based detection and can distinguish between different ZIP-based formats (DOCX, XLSX, PPTX).
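
One way to distinguish ZIP-based formats is to open the archive and look for a format-specific entry. The sketch below assumes that approach; the documentation does not say how DetectFromReader actually does it. classifyZip and buildZip are hypothetical helpers, and the entry names are the conventional main parts written by Office applications, not names confirmed by this package.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// Format is a trimmed local copy of the documented enum, for illustration only.
type Format int

const (
	Unknown Format = iota
	DOCX
	XLSX
	PPTX
)

// classifyZip is a hypothetical stand-in for DetectFromReader. It opens
// the archive and returns the format of the first well-known entry found.
func classifyZip(r io.ReaderAt, size int64) (Format, error) {
	zr, err := zip.NewReader(r, size)
	if err != nil {
		return Unknown, err
	}
	for _, f := range zr.File {
		switch f.Name {
		case "word/document.xml":
			return DOCX, nil
		case "xl/workbook.xml":
			return XLSX, nil
		case "ppt/presentation.xml":
			return PPTX, nil
		}
	}
	return Unknown, nil
}

// buildZip creates an in-memory archive with empty entries of the given
// names, so the sketch can run without files on disk.
func buildZip(names ...string) (*bytes.Reader, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, n := range names {
		if _, err := zw.Create(n); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return bytes.NewReader(buf.Bytes()), nil
}

func main() {
	r, err := buildZip("[Content_Types].xml", "xl/workbook.xml")
	if err != nil {
		panic(err)
	}
	f, err := classifyZip(r, r.Size())
	fmt.Println(f == XLSX, err)
}
```

The io.ReaderAt plus size signature matches what zip.NewReader requires, which may be why the documented function takes the same pair; an *os.File with the size from Stat would satisfy it as well.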

func (Format) Extension

func (f Format) Extension() string

Extension returns the typical file extension for the format.

func (Format) String

func (f Format) String() string

String returns the string representation of the format.
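
The Extension and String methods can be pictured with the sketch below. It is a local re-creation, not the package's code, and the returned strings ("PDF", ".pdf", and so on) are assumptions; the documentation does not list the exact values. Because String has the signature required by fmt.Stringer, the fmt verbs %v and %s call it automatically.

```go
package main

import "fmt"

// Format is a trimmed local copy of the documented enum, for illustration only.
type Format int

const (
	Unknown Format = iota
	PDF
	DOCX
)

// String returns an assumed display name for the format.
func (f Format) String() string {
	switch f {
	case PDF:
		return "PDF"
	case DOCX:
		return "DOCX"
	default:
		return "Unknown"
	}
}

// Extension returns an assumed file extension, including the leading
// dot, or an empty string when the format is not recognized.
func (f Format) Extension() string {
	switch f {
	case PDF:
		return ".pdf"
	case DOCX:
		return ".docx"
	default:
		return ""
	}
}

func main() {
	f := DOCX
	fmt.Printf("%v uses %q\n", f, f.Extension()) // %v invokes String
	fmt.Println("report" + PDF.Extension())
}
```

Whether the real Extension includes the leading dot is not documented, so callers building filenames should check the returned value before concatenating.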
