reader

package
v1.2.0
Published: Nov 28, 2025 License: MIT Imports: 8 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type PDFVersion

type PDFVersion struct {
	Major int
	Minor int
}

PDFVersion represents a PDF version

func (PDFVersion) String

func (v PDFVersion) String() string

String returns the version as a string (e.g., "1.7")
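
The documented output format ("1.7") suggests a plain `Major.Minor` rendering. The sketch below re-declares the struct locally and assumes that format; the method body is an illustration, not the package's actual source.

```go
package main

import "fmt"

// PDFVersion mirrors the documented struct.
type PDFVersion struct {
	Major int
	Minor int
}

// String renders the version as "Major.Minor". The body is an assumption
// based on the documented example, not a copy of the package's code.
func (v PDFVersion) String() string {
	return fmt.Sprintf("%d.%d", v.Major, v.Minor)
}

func main() {
	v := PDFVersion{Major: 1, Minor: 7}
	fmt.Println(v) // fmt calls String() because PDFVersion is a fmt.Stringer
}
```

Because the receiver is a value, both `PDFVersion` and `*PDFVersion` satisfy `fmt.Stringer`.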

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader represents a PDF file reader

func NewReader

func NewReader(file *os.File) (*Reader, error)

NewReader creates a new PDF reader for the given file

func Open

func Open(filename string) (*Reader, error)

Open opens a PDF file and returns a Reader

func (*Reader) CacheSize

func (r *Reader) CacheSize() int

CacheSize returns the number of cached objects

func (*Reader) ClearCache

func (r *Reader) ClearCache()

ClearCache clears the object cache and the object stream cache. Useful for freeing memory when processing large PDFs.
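
One way to use CacheSize and ClearCache together is to clear the cache once it grows past a chosen limit. The sketch below is written against a small local interface holding just those two documented methods, so a `*Reader` should satisfy it; `objectCache`, `trimCache`, `fakeCache`, and the limit are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// objectCache lists the two documented Reader methods this sketch uses.
type objectCache interface {
	CacheSize() int
	ClearCache()
}

// trimCache clears the cache when it holds more than limit objects and
// reports whether it did so. The threshold policy is an assumption.
func trimCache(c objectCache, limit int) bool {
	if c.CacheSize() <= limit {
		return false
	}
	c.ClearCache()
	return true
}

// fakeCache is a hypothetical in-memory stand-in for *Reader.
type fakeCache struct{ objects map[int]string }

func (f *fakeCache) CacheSize() int { return len(f.objects) }
func (f *fakeCache) ClearCache()    { f.objects = map[int]string{} }

func main() {
	c := &fakeCache{objects: map[int]string{1: "a", 2: "b", 3: "c"}}
	fmt.Println(trimCache(c, 2), c.CacheSize())
}
```

Calling such a helper between pages keeps memory bounded at the cost of re-reading objects that later pages share.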

func (*Reader) Close

func (r *Reader) Close() error

Close closes the PDF file
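
Since Close returns an error, a deferred call that discards it can hide a failure. The sketch below shows one common Go pattern for surfacing that error. It relies only on the documented `Close() error` signature, which matches `io.Closer`; `withCloser` and `fakeReader` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// withCloser runs fn, then closes c. If fn succeeded but Close failed,
// the Close error is returned instead of being dropped.
func withCloser(c io.Closer, fn func() error) (err error) {
	defer func() {
		if cerr := c.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()
	return fn()
}

// fakeReader is a hypothetical stand-in for *Reader.
type fakeReader struct{ closed bool }

func (f *fakeReader) Close() error {
	if f.closed {
		return errors.New("already closed")
	}
	f.closed = true
	return nil
}

func main() {
	r := &fakeReader{}
	err := withCloser(r, func() error { return nil })
	fmt.Println(err, r.closed)
}
```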

func (*Reader) ExtractText

func (r *Reader) ExtractText(page *pages.Page) (string, error)

ExtractText extracts the text from a page and returns it as a string. This is a convenience method for simple text extraction.

func (*Reader) ExtractTextFragments

func (r *Reader) ExtractTextFragments(page *pages.Page) ([]text.TextFragment, error)

ExtractTextFragments extracts text fragments from a page. This is a convenience method that handles content stream decoding and font registration.

func (*Reader) FileSize

func (r *Reader) FileSize() int64

FileSize returns the size of the PDF file in bytes

func (*Reader) GetCatalog

func (r *Reader) GetCatalog() (core.Dict, error)

GetCatalog returns the document catalog (root object)

func (*Reader) GetInfo

func (r *Reader) GetInfo() (core.Dict, error)

GetInfo returns the document info dictionary (metadata)

func (*Reader) GetObject

func (r *Reader) GetObject(objNum int) (core.Object, error)

GetObject loads an object by its object number. It caches loaded objects to avoid re-reading them, and supports both uncompressed objects and objects stored in object streams (PDF 1.5+).
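
The caching behaviour described here can be pictured as memoization keyed by object number. The sketch below is a generic illustration of that idea, not the package's implementation: `memoLoader`, `newMemoLoader`, and the string-valued objects are all hypothetical.

```go
package main

import "fmt"

// memoLoader caches the results of load, keyed by object number.
type memoLoader struct {
	load  func(objNum int) (string, error) // hypothetical slow read from the file
	cache map[int]string
}

func newMemoLoader(load func(int) (string, error)) *memoLoader {
	return &memoLoader{load: load, cache: make(map[int]string)}
}

// Get returns the cached value when present and calls load otherwise.
func (m *memoLoader) Get(objNum int) (string, error) {
	if v, ok := m.cache[objNum]; ok {
		return v, nil
	}
	v, err := m.load(objNum)
	if err != nil {
		return "", err // failures are not cached
	}
	m.cache[objNum] = v
	return v, nil
}

func main() {
	reads := 0
	m := newMemoLoader(func(n int) (string, error) {
		reads++
		return fmt.Sprintf("object %d", n), nil
	})
	a, _ := m.Get(7)
	b, _ := m.Get(7)
	fmt.Println(a == b, reads)
}
```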

func (*Reader) GetPage

func (r *Reader) GetPage(index int) (*pages.Page, error)

GetPage returns the page at the given index (0-based)

func (*Reader) NumObjects

func (r *Reader) NumObjects() int

NumObjects returns the total number of objects in the PDF

func (*Reader) ObjectStreamCacheSize

func (r *Reader) ObjectStreamCacheSize() int

ObjectStreamCacheSize returns the number of cached object streams

func (*Reader) PageCount

func (r *Reader) PageCount() (int, error)

PageCount returns the number of pages in the PDF
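
PageCount, GetPage, and ExtractText combine naturally into a loop over every page. The sketch below is written against a local generic interface that copies those three documented signatures, with the type parameter `P` standing in for `*pages.Page`; `pageReader`, `extractAll`, and `fakeDoc` are hypothetical names, and the fake exists only so the example runs without a PDF.

```go
package main

import (
	"fmt"
	"strings"
)

// pageReader lists the documented Reader methods this sketch relies on.
// P stands in for *pages.Page.
type pageReader[P any] interface {
	PageCount() (int, error)
	GetPage(index int) (P, error)
	ExtractText(page P) (string, error)
}

// extractAll returns the text of every page, in page order.
func extractAll[P any](r pageReader[P]) ([]string, error) {
	n, err := r.PageCount()
	if err != nil {
		return nil, fmt.Errorf("page count: %w", err)
	}
	texts := make([]string, 0, n)
	for i := 0; i < n; i++ { // GetPage indexes are 0-based
		p, err := r.GetPage(i)
		if err != nil {
			return nil, fmt.Errorf("page %d: %w", i, err)
		}
		t, err := r.ExtractText(p)
		if err != nil {
			return nil, fmt.Errorf("page %d text: %w", i, err)
		}
		texts = append(texts, t)
	}
	return texts, nil
}

// fakeDoc is a hypothetical in-memory stand-in; its "page" is a string.
type fakeDoc []string

func (d fakeDoc) PageCount() (int, error) { return len(d), nil }

func (d fakeDoc) GetPage(i int) (string, error) {
	if i < 0 || i >= len(d) {
		return "", fmt.Errorf("index %d out of range", i)
	}
	return d[i], nil
}

func (d fakeDoc) ExtractText(p string) (string, error) {
	return strings.ToUpper(p), nil
}

func main() {
	texts, err := extractAll[string](fakeDoc{"first", "second"})
	fmt.Println(texts, err)
}
```

With the real package, the same function would presumably be called as `extractAll[*pages.Page](r)` on a `*Reader`, since the method signatures match.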

func (*Reader) Resolve

func (r *Reader) Resolve(obj core.Object) (core.Object, error)

Resolve returns the referenced object if obj is an indirect reference; otherwise it returns obj as-is. Implements the pages.ObjectResolver interface.

func (*Reader) ResolveDeep

func (r *Reader) ResolveDeep(obj core.Object) (core.Object, error)

ResolveDeep recursively resolves all indirect references in an object. Implements the pages.ObjectResolver interface.
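
The difference between a shallow and a deep resolve is easiest to see on a toy object model. In the sketch below every type is a hypothetical stand-in (`Object`, `IndirectRef`, `store`), not the real `core` types. The depth limit is likewise an assumption, added so that cyclic references cannot recurse forever.

```go
package main

import (
	"errors"
	"fmt"
)

type Object = any                      // stand-in for core.Object
type IndirectRef struct{ Number int }  // stand-in for core.IndirectRef
type store map[int]Object              // object number -> object

const maxDepth = 32 // assumed guard against reference cycles

// resolve follows obj once if it is a reference; otherwise returns it as-is.
func (s store) resolve(obj Object) (Object, error) {
	ref, ok := obj.(IndirectRef)
	if !ok {
		return obj, nil
	}
	target, found := s[ref.Number]
	if !found {
		return nil, fmt.Errorf("object %d not found", ref.Number)
	}
	return target, nil
}

// resolveDeep replaces references at every nesting level of obj.
func (s store) resolveDeep(obj Object, depth int) (Object, error) {
	if depth > maxDepth {
		return nil, errors.New("reference nesting too deep")
	}
	switch v := obj.(type) {
	case IndirectRef:
		target, err := s.resolve(v)
		if err != nil {
			return nil, err
		}
		return s.resolveDeep(target, depth+1)
	case []Object:
		out := make([]Object, len(v))
		for i, e := range v {
			r, err := s.resolveDeep(e, depth+1)
			if err != nil {
				return nil, err
			}
			out[i] = r
		}
		return out, nil
	case map[string]Object:
		out := make(map[string]Object, len(v))
		for k, e := range v {
			r, err := s.resolveDeep(e, depth+1)
			if err != nil {
				return nil, err
			}
			out[k] = r
		}
		return out, nil
	}
	return obj, nil
}

func main() {
	s := store{1: "hello", 2: []Object{IndirectRef{Number: 1}, 5}}
	shallow, _ := s.resolve(IndirectRef{Number: 2})
	deep, _ := s.resolveDeep(IndirectRef{Number: 2}, 0)
	fmt.Println(shallow) // the inner reference is still present
	fmt.Println(deep)    // the inner reference has been replaced
}
```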

func (*Reader) ResolveReference

func (r *Reader) ResolveReference(ref core.IndirectRef) (core.Object, error)

ResolveReference resolves an indirect reference

func (*Reader) Trailer

func (r *Reader) Trailer() core.Dict

Trailer returns the trailer dictionary

func (*Reader) Version

func (r *Reader) Version() PDFVersion

Version returns the PDF version

func (*Reader) XRefTable

func (r *Reader) XRefTable() *core.XRefTable

XRefTable returns the cross-reference table. Exposed for debugging and inspection.
