pdf

module

v0.0.10 (latest) | Published: Apr 27, 2024 | License: MIT

Repository

github.com/benoitkugler/pdf

README

Golang PDF toolbox

Why yet another PDF processing library?

There are already numerous good PDF libraries for Go, and this one deliberately takes inspiration from them. However, it is based on a slightly different approach: instead of working with a PDF as a tree of dynamic objects, it starts by modeling the whole spec (or at least a good portion of it) with static types; see the package model.
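To make the contrast concrete, here is a minimal sketch of the two approaches. All the types and functions below (Object, Dict, Rectangle, PageStatic and the two helpers) are hypothetical and written only for illustration; they are not the types defined in the package model.

```go
package main

import "fmt"

// Dynamic approach: every PDF object is an untyped value, and a
// dictionary is a map from names to such values.
// (Hypothetical types, for illustration only.)
type Object interface{}
type Dict map[string]Object

// Static approach: the structure required by the spec is encoded in Go
// types. PageStatic is a hypothetical type, not the library's own.
type Rectangle struct{ Llx, Lly, Urx, Ury float64 }

type PageStatic struct {
	MediaBox Rectangle
	Rotate   int
}

// mediaBoxWidthDynamic has to validate every lookup and every type
// assertion at run time.
func mediaBoxWidthDynamic(page Dict) (float64, error) {
	arr, ok := page["MediaBox"].([]Object)
	if !ok || len(arr) != 4 {
		return 0, fmt.Errorf("invalid or missing MediaBox")
	}
	llx, ok1 := arr[0].(float64)
	urx, ok2 := arr[2].(float64)
	if !ok1 || !ok2 {
		return 0, fmt.Errorf("invalid MediaBox coordinates")
	}
	return urx - llx, nil
}

// mediaBoxWidthStatic needs no checks: the compiler guarantees the shape.
func mediaBoxWidthStatic(page PageStatic) float64 {
	return page.MediaBox.Urx - page.MediaBox.Llx
}

func main() {
	dyn := Dict{"MediaBox": []Object{0.0, 0.0, 595.0, 842.0}}
	w, err := mediaBoxWidthDynamic(dyn)
	fmt.Println(w, err)

	st := PageStatic{MediaBox: Rectangle{0, 0, 595, 842}}
	fmt.Println(mediaBoxWidthStatic(st))
}
```

With static types, the validation happens once, when the file is imported, and the rest of the code can rely on the compiler instead of repeating run-time checks.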

Overview

The package model is the cornerstone of this library. The remaining packages build on it:

reader imports a PDF file into memory
fonts provides support for using embedded PDF fonts
contentstream and formfill provide tools to create PDF models

Scope

The goal is eventually to support the complete PDF spec, but more importantly to expose the different layers (such as the parser or the content stream operators) so that they can be reused by other libraries. As such, the first target of this library is higher-level libraries (such as pdfcpu, gofpdf, oksvg, etc.).
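As an illustration of what exposing a layer such as content stream operators could look like, the following sketch models a few text operators (BT, Tf, Tj, ET) as Go types and serializes them. The Operation interface, the type names and the Serialize helper are assumptions made for this example; the actual API of the contentstream package may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Operation is a hypothetical interface: each content stream operator
// knows how to serialize itself together with its operands.
type Operation interface {
	Emit(sb *strings.Builder)
}

// BeginText and EndText model the BT and ET operators.
type BeginText struct{}
type EndText struct{}

// SetFont models the Tf operator: a font resource name and a size.
type SetFont struct {
	Name string
	Size float64
}

// ShowText models the Tj operator with a literal string operand.
type ShowText struct{ Text string }

func (BeginText) Emit(sb *strings.Builder) { sb.WriteString("BT") }
func (EndText) Emit(sb *strings.Builder)   { sb.WriteString("ET") }

func (o SetFont) Emit(sb *strings.Builder) {
	fmt.Fprintf(sb, "/%s %g Tf", o.Name, o.Size)
}

func (o ShowText) Emit(sb *strings.Builder) {
	// Backslashes and parentheses must be escaped in PDF literal strings.
	r := strings.NewReplacer(`\`, `\\`, `(`, `\(`, `)`, `\)`)
	fmt.Fprintf(sb, "(%s) Tj", r.Replace(o.Text))
}

// Serialize writes the operations one per line.
func Serialize(ops []Operation) string {
	var sb strings.Builder
	for i, op := range ops {
		if i > 0 {
			sb.WriteByte('\n')
		}
		op.Emit(&sb)
	}
	return sb.String()
}

func main() {
	ops := []Operation{
		BeginText{},
		SetFont{Name: "F1", Size: 12},
		ShowText{Text: "Hello (PDF)"},
		EndText{},
	}
	fmt.Println(Serialize(ops))
}
```

A higher-level library could build a slice of such values and leave the serialization details (operand order, string escaping) to the lower layer.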

Code example

A standard workflow to modify an existing PDF looks like this:

// load the existing file in memory
fi, _, err := reader.ParsePDFFile(filePath, reader.Options{})
if err != nil {
	return err
}

// process the document model as you wish

// write the modified document to the output path
if err = fi.WriteFile(output, nil); err != nil {
	return err
}

See decompress and api for more examples.

Directories

Path	Synopsis
apidemo
cmd
decompress	This script decodes the streams of a PDF file.
contentstream	This package defines the commands used in PDF content stream objects.
fmt_tool
fonts	This package provides tooling for exploiting the fonts defined (and embedded) in a PDF file and (TODO:) to add new ones.
cmaps	Implements a CMap parser (both for ToUnicode and CID CMaps).
glyphsnames	Copied from https://git.maze.io/go/unipdf/src/branch/master/internal/textencoding
psinterpreter	Package psinterpreter implements a PostScript interpreter required to parse .CFF files, and Type1 and Type2 Charstrings.
simpleencodings	Simple encodings map a subset of the Unicode characters (at most 256) to a set of single bytes.
standardcmaps	Adobe predefined ToUnicode cmaps.
standardcmaps/generate
standardfonts
standardfonts/generate	Tool to generate the metrics for the standard Adobe Type1 fonts.
type1	Package type1 implements a parser for Adobe Type1 fonts, defined by .afm files (https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5004.AFM_Spec.pdf) and .pdf files (https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/T1_SPEC.pdf)
type1C	Package type1c provides a parser for the CFF font format defined at https://www.adobe.com/content/dam/acom/en/devnet/font/pdfs/5176.CFF.pdf.
formfill	Package formfill provides support for filling forms found in PDF files (aka AcroForm), reading form input either from an FDF file or directly from memory.
model	Implements the in-memory structure of a PDF document, using static types.
reader	Package reader leverages a PDF file reader to read a file, analyze its structure and build a high-level, in-memory representation as a `model.Document`.
file	Package file builds upon a parser to read an existing PDF file, producing a tree of PDF objects.
parser	Implements a PDF object parser, mapping a list of tokens (see the tokenizer package) into a tree-like structure.
parser/filters	Package filters provides logic to handle binary data encoded with PDF filters, such as inline data images.
parser/filters/ccitt
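
The simpleencodings synopsis above describes a mapping between single bytes and at most 256 Unicode characters. A minimal sketch of that idea follows; the Encoding type, its methods and the toy table built in newToyEncoding are hypothetical and do not reproduce the library's own representation.

```go
package main

import "fmt"

// Encoding is a hypothetical simple encoding: the index is the byte
// value, and a zero rune means the code is undefined.
type Encoding [256]rune

// Decode converts encoded bytes to a string, replacing undefined codes
// with the Unicode replacement character U+FFFD.
func (e *Encoding) Decode(data []byte) string {
	out := make([]rune, len(data))
	for i, b := range data {
		r := e[b]
		if r == 0 {
			r = '\uFFFD'
		}
		out[i] = r
	}
	return string(out)
}

// Reverse builds the rune-to-byte table needed to encode text.
func (e *Encoding) Reverse() map[rune]byte {
	m := make(map[rune]byte)
	for b, r := range e {
		if r != 0 {
			m[r] = byte(b)
		}
	}
	return m
}

// newToyEncoding returns a toy table: printable ASCII maps to itself,
// and the byte 0xE9 maps to 'é'. It is not a complete standard encoding.
func newToyEncoding() *Encoding {
	var e Encoding
	for b := 0x20; b <= 0x7E; b++ {
		e[b] = rune(b)
	}
	e[0xE9] = 'é'
	return &e
}

func main() {
	enc := newToyEncoding()
	fmt.Println(enc.Decode([]byte{'c', 'a', 'f', 0xE9}))
	fmt.Println(enc.Reverse()['é'] == 0xE9)
}
```

Because the table has a fixed size of 256 entries, decoding is a plain array lookup, and the reverse map only needs to be built once per encoding.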