reader

package

v1.1.0 Published: May 8, 2026 License: MIT Imports: 14 Imported by: 0

Repository

github.com/lvillar/gofpdf

Documentation ¶

Overview ¶

Package reader provides functionality for reading and parsing existing PDF files.

It implements a PDF parser that can extract the object structure, page tree, and text content from PDF documents conforming to the PDF specification (ISO 32000).

Index ¶

type Array
- func (a Array) String() string
type Boolean
- func (b Boolean) String() string
type Dict
- func (d Dict) GetArray(key Name) Array
- func (d Dict) GetDict(key Name) Dict
- func (d Dict) GetInt(key Name) (int64, bool)
- func (d Dict) GetName(key Name) Name
- func (d Dict) GetString(key Name) string
- func (d Dict) String() string
type Document
- func Open(filename string) (*Document, error)
- func OpenWithPassword(filename, password string) (*Document, error)
- func ReadFrom(r io.Reader) (*Document, error)
- func ReadFromWithPassword(r io.Reader, password string) (*Document, error)
- func (d *Document) Catalog() (Dict, error)
- func (d *Document) FormField(name string) (*FormField, error)
- func (d *Document) FormFields() ([]*FormField, error)
- func (d *Document) Metadata() map[string]string
- func (d *Document) NumPages() int
- func (d *Document) Page(n int) (*Page, error)
- func (d *Document) Pages() iter.Seq2[int, *Page]
- func (d *Document) ResolveReference(ref Reference) (Object, error)
type FormField
- func (f *FormField) IsReadOnly() bool
- func (f *FormField) IsRequired() bool
type IndirectObject
- func (o IndirectObject) String() string
type Integer
- func (i Integer) String() string
type Name
- func (n Name) String() string
type Null
- func (Null) String() string
type Object
type Page
- func (p *Page) ContentStream() ([]byte, error)
- func (p *Page) ExtractText() (string, error)
type Real
- func (r Real) String() string
type Rectangle
- func (r Rectangle) Height() float64
- func (r Rectangle) Width() float64
type Reference
- func (r Reference) String() string
type Stream
- func (s Stream) String() string
type String
- func (s String) String() string

Examples ¶

ReadFrom

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Array ¶

type Array []Object

Array represents a PDF array of objects.

func (Array) String ¶

func (a Array) String() string

type Boolean ¶

type Boolean bool

Boolean represents a PDF boolean value.

func (Boolean) String ¶

func (b Boolean) String() string

type Dict ¶

type Dict map[Name]Object

Dict represents a PDF dictionary mapping names to objects.

func (Dict) GetArray ¶

func (d Dict) GetArray(key Name) Array

GetArray returns an array entry, or nil if not found.

func (Dict) GetDict ¶

func (d Dict) GetDict(key Name) Dict

GetDict returns a sub-dictionary, or nil if not found.

func (Dict) GetInt ¶

func (d Dict) GetInt(key Name) (int64, bool)

GetInt returns the value of an integer entry and true, or 0 and false if the entry is not found.
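The two-value result matters because 0 is also a legitimate stored value. The sketch below illustrates the pattern with local stand-in types that mirror the documented shapes. `getInt` is a hypothetical helper, not the package's implementation, and it assumes the bool reports whether an integer entry was present.

```go
package main

import "fmt"

// Name, Object, Integer, and Dict are local stand-ins that mirror the
// documented type shapes; they are not the reader package's own types.
type Name string

type Object interface{ String() string }

type Integer int64

func (i Integer) String() string { return fmt.Sprintf("%d", int64(i)) }

type Dict map[Name]Object

// getInt is a hypothetical helper with the same signature as Dict.GetInt.
// It assumes the bool result reports whether an integer entry was present.
func getInt(d Dict, key Name) (int64, bool) {
	v, ok := d[key].(Integer)
	if !ok {
		return 0, false
	}
	return int64(v), true
}

func main() {
	page := Dict{"Rotate": Integer(0)}

	// A stored zero and a missing key both yield 0; only the bool differs.
	rot, found := getInt(page, "Rotate")
	fmt.Println(rot, found)

	count, found := getInt(page, "Count")
	fmt.Println(count, found)
}
```

Checking the bool before using the value avoids treating a missing entry as an explicit zero.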

func (Dict) GetName ¶

func (d Dict) GetName(key Name) Name

GetName returns the value of a name entry, or an empty string if not found.

func (Dict) GetString ¶

func (d Dict) GetString(key Name) string

GetString returns the string value for a dictionary key, resolving references.

func (Dict) String ¶

func (d Dict) String() string

type Document ¶

type Document struct {
	Version string // PDF version from file header (e.g., "1.7")
	// contains filtered or unexported fields
}

Document represents a parsed PDF document.

func Open ¶

func Open(filename string) (*Document, error)

Open opens and parses a PDF file from disk.

func OpenWithPassword ¶

func OpenWithPassword(filename, password string) (*Document, error)

OpenWithPassword opens and parses an encrypted PDF file using the given password.

func ReadFrom ¶

func ReadFrom(r io.Reader) (*Document, error)

ReadFrom parses a PDF document from a reader. The reader content is read entirely into memory for random access.

Example ¶

This example builds a small PDF in memory, reads it back with ReadFrom, and inspects its page count and metadata.

package main

import (
	"bytes"
	"fmt"

	gofpdf "github.com/lvillar/gofpdf"
	"github.com/lvillar/gofpdf/reader"
)

func main() {
	// Build a small in-memory PDF so the example is self-contained.
	pdf := gofpdf.New("P", "mm", "A4", "")
	pdf.SetTitle("Quarterly Report", true)
	pdf.SetAuthor("Acme Analytics", true)
	pdf.SetFont("Helvetica", "B", 16)

	pdf.AddPage()
	pdf.Cell(0, 10, "Page 1: Summary")
	pdf.AddPage()
	pdf.Cell(0, 10, "Page 2: Details")

	var buf bytes.Buffer
	if err := pdf.Output(&buf); err != nil {
		fmt.Println(err)
		return
	}

	// Now read it back.
	doc, err := reader.ReadFrom(&buf)
	if err != nil {
		fmt.Println(err)
		return
	}

	meta := doc.Metadata()
	fmt.Printf("Pages: %d\n", doc.NumPages())
	fmt.Printf("Title: %s\n", meta["Title"])
	fmt.Printf("Author: %s\n", meta["Author"])

}

Output:
Pages: 2
Title: Quarterly Report
Author: Acme Analytics

func ReadFromWithPassword ¶

func ReadFromWithPassword(r io.Reader, password string) (*Document, error)

ReadFromWithPassword parses an encrypted PDF from a reader using the given password.

func (*Document) Catalog ¶

func (d *Document) Catalog() (Dict, error)

Catalog returns the document's catalog dictionary (the /Root object).

func (*Document) FormField ¶

func (d *Document) FormField(name string) (*FormField, error)

FormField returns the form field with the given fully qualified name. Returns nil if the field is not found.

func (*Document) FormFields ¶

func (d *Document) FormFields() ([]*FormField, error)

FormFields returns all form fields found in the document's AcroForm. Returns an empty slice (not nil) if no AcroForm is present.

func (*Document) Metadata ¶

func (d *Document) Metadata() map[string]string

Metadata returns document metadata from the /Info dictionary.

func (*Document) NumPages ¶

func (d *Document) NumPages() int

NumPages returns the total number of pages in the document.

func (*Document) Page ¶

func (d *Document) Page(n int) (*Page, error)

Page returns the page at the given 1-based index.

func (*Document) Pages ¶

func (d *Document) Pages() iter.Seq2[int, *Page]

Pages returns an iterator over all pages. Index is 1-based.

func (*Document) ResolveReference ¶

func (d *Document) ResolveReference(ref Reference) (Object, error)

ResolveReference resolves an indirect reference to the actual object. This is the public API for resolving references.

type FormField ¶

type FormField struct {
	Name     string       // partial field name (/T)
	FullName string       // fully qualified dotted name
	Type     string       // field type: "Tx", "Btn", "Ch", "Sig"
	Value    string       // current value (/V)
	Default  string       // default value (/DV)
	Flags    int          // field flags (/Ff)
	Rect     Rectangle    // widget annotation rectangle
	Options  []string     // choice options (/Opt) for "Ch" fields
	Kids     []*FormField // child fields in hierarchy
	ObjNum   int          // object number if from an indirect object
	// contains filtered or unexported fields
}

FormField represents a form field parsed from a PDF's AcroForm dictionary.

func (*FormField) IsReadOnly ¶

func (f *FormField) IsReadOnly() bool

IsReadOnly reports whether the field has the ReadOnly flag set (bit 1 of /Ff, the lowest-order bit).

func (*FormField) IsRequired ¶

func (f *FormField) IsRequired() bool

IsRequired reports whether the field has the Required flag set (bit 2 of /Ff).
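PDF numbers flag bits starting from 1 at the low-order end, so bit 1 corresponds to the mask 1 and bit 2 to the mask 2. A minimal sketch of that test follows; the constant names and the `hasFlag` helper are hypothetical, chosen for illustration, and are not part of the package.

```go
package main

import "fmt"

// PDF numbers flag bits from 1 at the low-order end, so bit n is 1<<(n-1).
const (
	flagReadOnly = 1 << 0 // bit 1
	flagRequired = 1 << 1 // bit 2
)

// hasFlag reports whether every bit in mask is set in flags.
func hasFlag(flags, mask int) bool { return flags&mask == mask }

func main() {
	flags := 3 // hypothetical /Ff value with bits 1 and 2 set
	fmt.Println(hasFlag(flags, flagReadOnly), hasFlag(flags, flagRequired))
	fmt.Println(hasFlag(2, flagReadOnly), hasFlag(2, flagRequired))
}
```

The same masks can be applied directly to the documented Flags field when a flag without a dedicated method needs checking.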

type IndirectObject ¶

type IndirectObject struct {
	Reference
	Value Object
}

IndirectObject represents a PDF indirect object definition (e.g., "10 0 obj ... endobj").

func (IndirectObject) String ¶

func (o IndirectObject) String() string

type Integer ¶

type Integer int64

Integer represents a PDF integer value.

func (Integer) String ¶

func (i Integer) String() string

type Name ¶

type Name string

Name represents a PDF name object (e.g., /Type, /Pages).

func (Name) String ¶

func (n Name) String() string

type Null ¶

type Null struct{}

Null represents the PDF null object.

func (Null) String ¶

func (Null) String() string

type Object ¶

type Object interface {
	String() string
	// contains filtered or unexported methods
}

Object is the interface satisfied by all PDF object types. The unexported method prevents external types from implementing it.

type Page ¶

type Page struct {
	Number    int
	MediaBox  Rectangle
	CropBox   *Rectangle
	Resources Dict
	Contents  []Stream
	Rotate    int
	// contains filtered or unexported fields
}

Page represents a single page in a PDF document.

func (*Page) ContentStream ¶

func (p *Page) ContentStream() ([]byte, error)

ContentStream returns the decompressed content stream data for this page. If the page has multiple content streams, they are concatenated.
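To make "decompressed and concatenated" concrete, here is a standalone sketch using only the standard library. It assumes each stream uses the FlateDecode filter (zlib format), which is common but not the only filter PDF allows. `deflate` and `inflateAll` are hypothetical helpers, and the newline separator between streams is a choice made for this sketch, not a statement about what the package does.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// deflate compresses data in the zlib format used by the FlateDecode filter.
func deflate(data []byte) []byte {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	w.Write(data) // writes to a bytes.Buffer do not fail
	w.Close()
	return buf.Bytes()
}

// inflateAll is a hypothetical helper: it decompresses each stream and joins
// the results, separated by a newline so operators cannot run together.
func inflateAll(streams [][]byte) ([]byte, error) {
	var out bytes.Buffer
	for i, s := range streams {
		r, err := zlib.NewReader(bytes.NewReader(s))
		if err != nil {
			return nil, err
		}
		if i > 0 {
			out.WriteByte('\n')
		}
		_, err = io.Copy(&out, r)
		r.Close()
		if err != nil {
			return nil, err
		}
	}
	return out.Bytes(), nil
}

func main() {
	streams := [][]byte{
		deflate([]byte("BT /F1 12 Tf")),
		deflate([]byte("(Hello) Tj ET")),
	}
	data, err := inflateAll(streams)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%q\n", data)
}
```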

func (*Page) ExtractText ¶

func (p *Page) ExtractText() (string, error)

ExtractText extracts the text content from this page. It parses the content stream and extracts text from BT/ET blocks using the Tj, TJ, ', and " operators.

Note: This is a basic extraction that handles common cases. Complex text with custom encodings, CIDFonts, or ToUnicode CMaps may not be fully supported.

type Real ¶

type Real float64

Real represents a PDF real (floating-point) value.

func (Real) String ¶

func (r Real) String() string

type Rectangle ¶

type Rectangle struct {
	LLX, LLY, URX, URY float64
}

Rectangle represents a PDF rectangle (typically [llx lly urx ury]).

func (Rectangle) Height ¶

func (r Rectangle) Height() float64

Height returns the height of the rectangle.

func (Rectangle) Width ¶

func (r Rectangle) Width() float64

Width returns the width of the rectangle.
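As an illustration of the arithmetic, the sketch below uses a local copy of the documented struct and assumes that width and height are the plain differences URX - LLX and URY - LLY. The actual methods may differ, for example by normalizing reversed corners. Values are in PDF default user space units, where one unit is 1/72 inch.

```go
package main

import "fmt"

// Rectangle is a local copy of the documented struct.
type Rectangle struct {
	LLX, LLY, URX, URY float64
}

// width and height are assumptions about what Width and Height compute.
func width(r Rectangle) float64  { return r.URX - r.LLX }
func height(r Rectangle) float64 { return r.URY - r.LLY }

func main() {
	letter := Rectangle{LLX: 0, LLY: 0, URX: 612, URY: 792} // US Letter in points
	fmt.Printf("%.0f x %.0f pt\n", width(letter), height(letter))
	// One point is 1/72 inch, so divide by 72 to get inches.
	fmt.Printf("%.1f x %.1f in\n", width(letter)/72, height(letter)/72)
}
```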

type Reference ¶

type Reference struct {
	Number     int
	Generation int
}

Reference represents an indirect object reference (e.g., "10 0 R").

func (Reference) String ¶

func (r Reference) String() string

type Stream ¶

type Stream struct {
	Dict Dict
	Data []byte // raw data (may be compressed)
}

Stream represents a PDF stream object (dictionary + encoded data).

func (Stream) String ¶

func (s Stream) String() string

type String ¶

type String struct {
	Value []byte
	IsHex bool
}

String represents a PDF string (literal or hexadecimal).

func (String) String ¶

func (s String) String() string
