Documentation ¶
Overview ¶
Package pdfcpu is a simple PDF processing library written in Go that supports encryption. It provides an API and a command line interface. All PDF versions up to PDF 1.7 (ISO-32000) are supported.
The available commands are:
    validate      validate PDF against PDF 32000-1:2008 (PDF 1.7)
    optimize      optimize PDF by getting rid of redundant page resources
    split         split multi-page PDF into several single-page PDFs
    merge         concatenate 2 or more PDFs
    extract       extract images, fonts, content or pages
    trim          create trimmed version
    attach        list, add, remove, extract embedded file attachments
    perm          list, add user access permissions
    encrypt       set password protection
    decrypt       remove password protection
    changeupw     change user password
    changeopw     change owner password
    version       print version
Index ¶
- Constants
- func AddAttachments(fileIn string, files []string, config *Configuration) error
- func AddPermissions(fileIn string, config *Configuration) error
- func ChangeOwnerPassword(fileIn, fileOut string, config *Configuration, pwOld, pwNew *string) error
- func ChangeUserPassword(fileIn, fileOut string, config *Configuration, pwOld, pwNew *string) error
- func Date(s string) bool
- func DecodeUTF16String(s string) (string, error)
- func Decrypt(fileIn, fileOut string, config *Configuration) error
- func Encrypt(fileIn, fileOut string, config *Configuration) error
- func Escape(s string) (*string, error)
- func ExtractAttachments(fileIn, dirOut string, files []string, config *Configuration) error
- func ExtractContent(fileIn, dirOut string, pageSelection []string, config *Configuration) error
- func ExtractFonts(fileIn, dirOut string, pageSelection []string, config *Configuration) error
- func ExtractImages(fileIn, dirOut string, pageSelection []string, config *Configuration) error
- func ExtractPages(fileIn, dirOut string, pageSelection []string, config *Configuration) error
- func HexLiteralToString(hexString string) (string, error)
- func IsStringUTF16BE(s string) bool
- func IsUTF16BE(b []byte) (ok bool, err error)
- func ListAttachments(fileIn string, config *Configuration) ([]string, error)
- func ListPermissions(fileIn string, config *Configuration) ([]string, error)
- func Merge(filesIn []string, fileOut string, config *Configuration) error
- func Optimize(fileIn, fileOut string, config *Configuration) error
- func ParsePageSelection(s string) ([]string, error)
- func Process(cmd *Command) (out []string, err error)
- func RemoveAttachments(fileIn string, files []string, config *Configuration) error
- func Split(fileIn, dirOut string, config *Configuration) error
- func StringLiteralToString(s string) (string, error)
- func Trim(fileIn, fileOut string, pageSelection []string, config *Configuration) error
- func Unescape(s string) ([]byte, error)
- func Validate(fileIn string, config *Configuration) error
- func VersionString(version PDFVersion) string
- func Write(ctx *PDFContext) error
- type ByteSize
- type Command
- func AddAttachmentsCommand(pdfFileNameIn string, fileNamesIn []string, config *Configuration) *Command
- func AddPermissionsCommand(pdfFileNameIn string, config *Configuration) *Command
- func ChangeOwnerPWCommand(pdfFileNameIn, pdfFileNameOut string, config *Configuration, ...) *Command
- func ChangeUserPWCommand(pdfFileNameIn, pdfFileNameOut string, config *Configuration, ...) *Command
- func DecryptCommand(pdfFileNameIn, pdfFileNameOut string, config *Configuration) *Command
- func EncryptCommand(pdfFileNameIn, pdfFileNameOut string, config *Configuration) *Command
- func ExtractAttachmentsCommand(pdfFileNameIn, dirNameOut string, fileNamesIn []string, config *Configuration) *Command
- func ExtractContentCommand(pdfFileNameIn, dirNameOut string, pageSelection []string, ...) *Command
- func ExtractFontsCommand(pdfFileNameIn, dirNameOut string, pageSelection []string, ...) *Command
- func ExtractImagesCommand(pdfFileNameIn, dirNameOut string, pageSelection []string, ...) *Command
- func ExtractPagesCommand(pdfFileNameIn, dirNameOut string, pageSelection []string, ...) *Command
- func ListAttachmentsCommand(pdfFileNameIn string, config *Configuration) *Command
- func ListPermissionsCommand(pdfFileNameIn string, config *Configuration) *Command
- func MergeCommand(pdfFileNamesIn []string, pdfFileNameOut string, config *Configuration) *Command
- func OptimizeCommand(pdfFileNameIn, pdfFileNameOut string, config *Configuration) *Command
- func RemoveAttachmentsCommand(pdfFileNameIn string, fileNamesIn []string, config *Configuration) *Command
- func SplitCommand(pdfFileNameIn, dirNameOut string, config *Configuration) *Command
- func TrimCommand(pdfFileNameIn, pdfFileNameOut string, pageSelection []string, ...) *Command
- func ValidateCommand(pdfFileName string, config *Configuration) *Command
- type CommandMode
- type Configuration
- type Enc
- type FontObject
- type ImageObject
- type IntSet
- type Node
- func (n *Node) Add(xRefTable *XRefTable, k string, v PDFObject) error
- func (n *Node) AddToLeaf(k string, v PDFObject)
- func (n Node) KeyList() ([]string, error)
- func (n Node) Process(xRefTable *XRefTable, handler func(*XRefTable, string, PDFObject) error) error
- func (n *Node) Remove(xRefTable *XRefTable, k string) (empty, ok bool, err error)
- func (n Node) String() string
- func (n Node) Value(k string) (PDFObject, bool)
- type OptimizationContext
- func (oc *OptimizationContext) DuplicateFontObjectsString() (int, string)
- func (oc *OptimizationContext) DuplicateImageObjectsString() (int, string)
- func (oc *OptimizationContext) DuplicateInfoObjectsString() (int, string)
- func (oc *OptimizationContext) IsDuplicateFontObject(i int) bool
- func (oc *OptimizationContext) IsDuplicateImageObject(i int) bool
- func (oc *OptimizationContext) IsDuplicateInfoObject(i int) bool
- func (oc *OptimizationContext) NonReferencedObjsString() (int, string)
- type PDFArray
- type PDFBoolean
- type PDFContext
- type PDFDict
- func (d PDFDict) BooleanEntry(key string) *bool
- func (d *PDFDict) Delete(key string) (value PDFObject)
- func (d *PDFDict) Entry(dictName, key string, required bool) (PDFObject, error)
- func (d PDFDict) Find(key string) (value PDFObject, found bool)
- func (d PDFDict) First() *int
- func (d PDFDict) Index() *PDFArray
- func (d PDFDict) IndirectRefEntry(key string) *PDFIndirectRef
- func (d *PDFDict) Insert(key string, value PDFObject) (ok bool)
- func (d *PDFDict) InsertFloat(key string, value float32)
- func (d *PDFDict) InsertInt(key string, value int)
- func (d *PDFDict) InsertName(key, value string)
- func (d *PDFDict) InsertString(key, value string)
- func (d PDFDict) Int64Entry(key string) *int64
- func (d PDFDict) IntEntry(key string) *int
- func (d PDFDict) IsLinearizationParmDict() bool
- func (d PDFDict) IsObjStm() bool
- func (d *PDFDict) Len() int
- func (d PDFDict) Length() (*int64, *int)
- func (d PDFDict) N() *int
- func (d PDFDict) NameEntry(key string) *string
- func (d PDFDict) PDFArrayEntry(key string) *PDFArray
- func (d PDFDict) PDFDictEntry(key string) *PDFDict
- func (d PDFDict) PDFHexLiteralEntry(key string) *PDFHexLiteral
- func (d PDFDict) PDFNameEntry(key string) *PDFName
- func (d PDFDict) PDFStreamDictEntry(key string) *PDFStreamDict
- func (d PDFDict) PDFString() string
- func (d PDFDict) PDFStringLiteralEntry(key string) *PDFStringLiteral
- func (d PDFDict) Prev() *int64
- func (d PDFDict) Size() *int
- func (d PDFDict) String() string
- func (d PDFDict) StringEntry(key string) *string
- func (d PDFDict) StringEntryBytes(key string) ([]byte, error)
- func (d PDFDict) Subtype() *string
- func (d PDFDict) Type() *string
- func (d *PDFDict) Update(key string, value PDFObject)
- func (d PDFDict) W() *PDFArray
- type PDFFilter
- type PDFFloat
- type PDFHexLiteral
- type PDFIndirectRef
- type PDFInteger
- type PDFName
- type PDFObject
- type PDFObjectStreamDict
- type PDFStats
- type PDFStreamDict
- type PDFStringLiteral
- type PDFVersion
- type PDFXRefStreamDict
- type ReadContext
- type StringSet
- type WriteContext
- type XRefTable
- func (xRefTable *XRefTable) BindNameTrees() error
- func (xRefTable *XRefTable) Catalog() (*PDFDict, error)
- func (xRefTable *XRefTable) CatalogHasPieceInfo() (bool, error)
- func (xRefTable *XRefTable) DeleteObject(objectNumber int) error
- func (xRefTable *XRefTable) DeleteObjectGraph(obj PDFObject) error
- func (xRefTable *XRefTable) Dereference(obj PDFObject) (PDFObject, error)
- func (xRefTable *XRefTable) DereferenceArray(obj PDFObject) (*PDFArray, error)
- func (xRefTable *XRefTable) DereferenceDict(obj PDFObject) (*PDFDict, error)
- func (xRefTable *XRefTable) DereferenceInteger(obj PDFObject) (*PDFInteger, error)
- func (xRefTable *XRefTable) DereferenceName(obj PDFObject, sinceVersion PDFVersion, validate func(string) bool) (n PDFName, err error)
- func (xRefTable *XRefTable) DereferenceStreamDict(obj PDFObject) (*PDFStreamDict, error)
- func (xRefTable *XRefTable) DereferenceStringLiteral(obj PDFObject, sinceVersion PDFVersion, validate func(string) bool) (s PDFStringLiteral, err error)
- func (xRefTable *XRefTable) DereferenceStringOrHexLiteral(obj PDFObject, sinceVersion PDFVersion, validate func(string) bool) (o PDFObject, err error)
- func (xRefTable *XRefTable) EncryptDict() (*PDFDict, error)
- func (xRefTable *XRefTable) EnsureCollection() error
- func (xRefTable *XRefTable) EnsureValidFreeList() error
- func (xRefTable *XRefTable) Exists(objNumber int) bool
- func (xRefTable *XRefTable) Find(objNumber int) (*XRefTableEntry, bool)
- func (xRefTable *XRefTable) FindObject(objNumber int) (PDFObject, error)
- func (xRefTable *XRefTable) FindTableEntry(objNumber int, generationNumber int) (*XRefTableEntry, bool)
- func (xRefTable *XRefTable) FindTableEntryForIndRef(indRef *PDFIndirectRef) (*XRefTableEntry, bool)
- func (xRefTable *XRefTable) FindTableEntryLight(objNumber int) (*XRefTableEntry, bool)
- func (xRefTable *XRefTable) Free(objNumber int) (*XRefTableEntry, error)
- func (xRefTable *XRefTable) IDFirstElement() (id []byte, err error)
- func (xRefTable *XRefTable) IndRefForNewObject(obj PDFObject) (*PDFIndirectRef, error)
- func (xRefTable *XRefTable) InsertAndUseRecycled(xRefTableEntry XRefTableEntry) (objNumber int, err error)
- func (xRefTable *XRefTable) InsertNew(xRefTableEntry XRefTableEntry) (objNumber int)
- func (xRefTable *XRefTable) InsertObject(obj PDFObject) (objNumber int, err error)
- func (xRefTable *XRefTable) IsLinearizationObject(i int) bool
- func (xRefTable *XRefTable) LinearizationObjsString() (int, string)
- func (xRefTable *XRefTable) LocateNameTree(nameTreeName string, ensure bool) error
- func (xRefTable *XRefTable) MissingObjects() (int, *string)
- func (xRefTable *XRefTable) NamesDict() (*PDFDict, error)
- func (xRefTable *XRefTable) NewEmbeddedFileStreamDict(filename string) (*PDFStreamDict, error)
- func (xRefTable *XRefTable) NewFileSpecDict(filename string, indRefStreamDict PDFIndirectRef) (*PDFDict, error)
- func (xRefTable *XRefTable) NewPDFStreamDict(filename string) (*PDFStreamDict, error)
- func (xRefTable *XRefTable) NewSoundStreamDict(filename string, samplingRate int, fileSpecDict *PDFDict) (*PDFStreamDict, error)
- func (xRefTable *XRefTable) NextForFree(objNumber int) (int, error)
- func (xRefTable *XRefTable) PageDict(page int) (*PDFDict, error)
- func (xRefTable *XRefTable) Pages() (*PDFIndirectRef, error)
- func (xRefTable *XRefTable) ParseRootVersion() (v *string, err error)
- func (xRefTable *XRefTable) RemoveCollection() error
- func (xRefTable *XRefTable) RemoveEmbeddedFilesNameTree() error
- func (xRefTable *XRefTable) RemoveNameTree(nameTreeName string) error
- func (xRefTable *XRefTable) UndeleteObject(objectNumber int) error
- func (xRefTable *XRefTable) ValidateVersion(element string, sinceVersion PDFVersion) error
- func (xRefTable *XRefTable) Version() PDFVersion
- func (xRefTable *XRefTable) VersionString() string
- type XRefTableEntry
Examples ¶
- Process (AddAttachments)
- Process (AddPermissions)
- Process (ChangeOwnerPW)
- Process (ChangeUserPW)
- Process (Decrypt)
- Process (Encrypt)
- Process (ExtractAttachments)
- Process (ExtractImages)
- Process (ExtractPages)
- Process (ListAttachments)
- Process (ListPermissions)
- Process (Merge)
- Process (Optimize)
- Process (RemoveAttachments)
- Process (Split)
- Process (Trim)
- Process (Validate)
Constants ¶
const (
	// ValidationStrict ensures 100% compliance with the spec (PDF 32000-1:2008).
	ValidationStrict = 0

	// ValidationRelaxed ensures PDF compliance based on frequently encountered validation errors.
	ValidationRelaxed = 1

	// StatsFileNameDefault is the standard stats filename.
	StatsFileNameDefault = "stats.csv"

	// PermissionsAll enables all user access permission bits.
	PermissionsAll int16 = -1 // 0xFFFF

	// PermissionsNone disables all user access permissions bits.
	PermissionsNone int16 = -3901 // 0xF0C3
)
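The hex comments next to PermissionsAll and PermissionsNone give the 16-bit patterns behind the signed values. The following is a minimal sketch, using only the standard library, of how the int16 values map to those patterns. The constants are redeclared locally so the snippet is self-contained, and the helper name `bits` is hypothetical, not part of pdfcpu.

```go
package main

import "fmt"

// Local copies of the documented constants, redeclared so the sketch
// compiles without importing pdfcpu.
const (
	permissionsAll  int16 = -1    // 0xFFFF
	permissionsNone int16 = -3901 // 0xF0C3
)

// bits reinterprets a signed 16-bit permission value as its unsigned
// bit pattern. (Hypothetical helper for illustration only.)
func bits(p int16) uint16 {
	return uint16(p)
}

func main() {
	fmt.Printf("all:  0x%04X\n", bits(permissionsAll))
	fmt.Printf("none: 0x%04X\n", bits(permissionsNone))
}
```

Storing the flags in a signed 16-bit integer means that a value with the high bit set appears as a negative number, which is why both constants are negative.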
const (
	RootVersion = iota
	RootExtensions
	RootPageLabels
	RootNames
	RootDests
	RootViewerPrefs
	RootPageLayout
	RootPageMode
	RootOutlines
	RootThreads
	RootOpenAction
	RootAA
	RootURI
	RootAcroForm
	RootMetadata
	RootStructTreeRoot
	RootMarkInfo
	RootLang
	RootSpiderInfo
	RootOutputIntents
	RootPieceInfo
	RootOCProperties
	RootPerms
	RootLegal
	RootRequirements
	RootCollection
	RootNeedsRendering
)
The PDF root object fields.
const (
	PageLastModified = iota
	PageResources
	PageMediaBox
	PageCropBox
	PageBleedBox
	PageTrimBox
	PageArtBox
	PageBoxColorInfo
	PageContents
	PageRotate
	PageGroup
	PageThumb
	PageB
	PageDur
	PageTrans
	PageAnnots
	PageAA
	PageMetadata
	PagePieceInfo
	PageStructParents
	PageID
	PagePZ
	PageSeparationInfo
	PageTabs
	PageTemplateInstantiated
	PagePresSteps
	PageUserUnit
	PageVP
)
The PDF page object fields.
const (
	EolLF   = "\x0A"
	EolCR   = "\x0D"
	EolCRLF = "\x0D\x0A"
)
Supported line delimiters
const (
	// REQUIRED is used for required dict entries.
	REQUIRED = true

	// OPTIONAL is used for optional dict entries.
	OPTIONAL = false
)
const (
	// ExcludePatternCS ...
	ExcludePatternCS = true

	// IncludePatternCS ...
	IncludePatternCS = false
)
const (
	// PDFCPUVersion is the current pdfcpu version.
	PDFCPUVersion = "0.1.11"

	// PDFCPULongVersion is pdfcpu's signature.
	PDFCPULongVersion = "golang pdfcpu v" + PDFCPUVersion
)
const FreeHeadGeneration = 65535
FreeHeadGeneration is the predefined generation number for the head of the free list.
const (
// ObjectStreamMaxObjects limits the number of objects within an object stream written.
ObjectStreamMaxObjects = 100
)
Variables ¶
This section is empty.
Functions ¶
func AddAttachments ¶ added in v0.1.3
func AddAttachments(fileIn string, files []string, config *Configuration) error
AddAttachments embeds files into a PDF.
func AddPermissions ¶ added in v0.1.6
func AddPermissions(fileIn string, config *Configuration) error
AddPermissions sets the user access permissions.
func ChangeOwnerPassword ¶ added in v0.1.1
func ChangeOwnerPassword(fileIn, fileOut string, config *Configuration, pwOld, pwNew *string) error
ChangeOwnerPassword changes the owner password of fileIn and writes the result to fileOut.
func ChangeUserPassword ¶ added in v0.1.1
func ChangeUserPassword(fileIn, fileOut string, config *Configuration, pwOld, pwNew *string) error
ChangeUserPassword changes the user password of fileIn and writes the result to fileOut.
func DecodeUTF16String ¶ added in v0.1.11
DecodeUTF16String decodes a UTF16BE string from a hex string.
func Decrypt ¶ added in v0.1.1
func Decrypt(fileIn, fileOut string, config *Configuration) error
Decrypt decrypts fileIn and writes the result to fileOut.
func Encrypt ¶ added in v0.1.1
func Encrypt(fileIn, fileOut string, config *Configuration) error
Encrypt encrypts fileIn and writes the result to fileOut.
func ExtractAttachments ¶ added in v0.1.3
func ExtractAttachments(fileIn, dirOut string, files []string, config *Configuration) error
ExtractAttachments extracts embedded files from a PDF.
func ExtractContent ¶
func ExtractContent(fileIn, dirOut string, pageSelection []string, config *Configuration) error
ExtractContent dumps "PDF source" files from fileIn into dirOut for selected pages.
func ExtractFonts ¶
func ExtractFonts(fileIn, dirOut string, pageSelection []string, config *Configuration) error
ExtractFonts dumps embedded fontfiles from fileIn into dirOut for selected pages.
func ExtractImages ¶
func ExtractImages(fileIn, dirOut string, pageSelection []string, config *Configuration) error
ExtractImages dumps embedded image resources from fileIn into dirOut for selected pages.
func ExtractPages ¶
func ExtractPages(fileIn, dirOut string, pageSelection []string, config *Configuration) error
ExtractPages generates single page PDF files from fileIn in dirOut for selected pages.
func HexLiteralToString ¶ added in v0.1.11
HexLiteralToString returns a possibly UTF16 encoded string for a hex string.
func IsStringUTF16BE ¶ added in v0.1.11
IsStringUTF16BE checks a string for Big Endian byte order BOM.
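To make the byte order mark check and the UTF-16BE decoding described for IsStringUTF16BE and DecodeUTF16String more concrete, here is a minimal standard-library sketch. It is not pdfcpu's implementation: the function names `hasUTF16BEBOM` and `decodeUTF16BE` are hypothetical, and the sketch works on raw bytes held in a Go string rather than on a hex string.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf16"
)

// hasUTF16BEBOM reports whether s starts with the big-endian
// byte order mark 0xFE 0xFF.
func hasUTF16BEBOM(s string) bool {
	return len(s) >= 2 && s[0] == 0xFE && s[1] == 0xFF
}

// decodeUTF16BE converts BOM-prefixed UTF-16BE bytes into a UTF-8 Go string.
func decodeUTF16BE(s string) (string, error) {
	if !hasUTF16BEBOM(s) {
		return "", errors.New("missing UTF-16BE byte order mark")
	}
	b := []byte(s[2:])
	if len(b)%2 != 0 {
		return "", errors.New("odd number of bytes after the byte order mark")
	}
	// Combine each pair of bytes into one 16-bit code unit, high byte first.
	units := make([]uint16, 0, len(b)/2)
	for i := 0; i < len(b); i += 2 {
		units = append(units, uint16(b[i])<<8|uint16(b[i+1]))
	}
	return string(utf16.Decode(units)), nil
}

func main() {
	s, err := decodeUTF16BE("\xFE\xFF\x00P\x00D\x00F")
	fmt.Println(s, err)
}
```

Checking for the mark first lets a caller fall back to treating the string as a plain byte string when it is absent.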
func ListAttachments ¶ added in v0.1.3
func ListAttachments(fileIn string, config *Configuration) ([]string, error)
ListAttachments returns a list of embedded file attachments.
func ListPermissions ¶ added in v0.1.6
func ListPermissions(fileIn string, config *Configuration) ([]string, error)
ListPermissions returns a list of user access permissions.
func Merge ¶
func Merge(filesIn []string, fileOut string, config *Configuration) error
Merge merges several PDF files and writes the result to fileOut. This corresponds to concatenating the files in the order specified by filesIn. The first entry of filesIn serves as the destination xRefTable into which all remaining files are merged.
func Optimize ¶
func Optimize(fileIn, fileOut string, config *Configuration) error
Optimize reads fileIn, validates and optimizes it, and writes the result to fileOut.
func ParsePageSelection ¶
ParsePageSelection ensures a correct page selection expression.
func Process ¶
Process executes a pdfcpu command.
Example (AddAttachments) ¶
config := NewDefaultConfiguration()

// Set optional password(s).
//config.UserPW = "upw"
//config.OwnerPW = "opw"

_, err := Process(AddAttachmentsCommand("in.pdf", []string{"a.csv", "b.jpg", "c.pdf"}, config))
if err != nil {
	return
}
Output:
Example (AddPermissions) ¶
config := NewDefaultConfiguration()
config.UserPW = "upw"
config.OwnerPW = "opw"

config.UserAccessPermissions = PermissionsAll

_, err := Process(AddPermissionsCommand("in.pdf", config))
if err != nil {
	return
}
Output:
Example (ChangeOwnerPW) ¶
config := NewDefaultConfiguration()

// supply existing user pw like so
config.UserPW = "upw"

// old and new owner pw
pwOld := "pwOld"
pwNew := "pwNew"

_, err := Process(ChangeOwnerPWCommand("in.pdf", "out.pdf", config, &pwOld, &pwNew))
if err != nil {
	return
}
Output:
Example (ChangeUserPW) ¶
config := NewDefaultConfiguration()

// supply existing owner pw like so
config.OwnerPW = "opw"

pwOld := "pwOld"
pwNew := "pwNew"

_, err := Process(ChangeUserPWCommand("in.pdf", "out.pdf", config, &pwOld, &pwNew))
if err != nil {
	return
}
Output:
Example (Decrypt) ¶
config := NewDefaultConfiguration()
config.UserPW = "upw"
config.OwnerPW = "opw"

_, err := Process(DecryptCommand("in.pdf", "out.pdf", config))
if err != nil {
	return
}
Output:
Example (Encrypt) ¶
config := NewDefaultConfiguration()
config.UserPW = "upw"
config.OwnerPW = "opw"

_, err := Process(EncryptCommand("in.pdf", "out.pdf", config))
if err != nil {
	return
}
Output:
Example (ExtractAttachments) ¶
config := NewDefaultConfiguration()

// Set optional password(s).
//config.UserPW = "upw"
//config.OwnerPW = "opw"

// Extract all attachments.
_, err := Process(ExtractAttachmentsCommand("in.pdf", "dirOut", nil, config))
if err != nil {
	return
}

// Extract specific attachments.
_, err = Process(ExtractAttachmentsCommand("in.pdf", "dirOut", []string{"a.csv", "b.pdf"}, config))
if err != nil {
	return
}
Output:
Example (ExtractImages) ¶
// Extract all embedded images for first 5 and last 5 pages but not for page 4.
selectedPages := []string{"-5", "5-", "!4"}

config := NewDefaultConfiguration()

// Set optional password(s).
//config.UserPW = "upw"
//config.OwnerPW = "opw"

_, err := Process(ExtractImagesCommand("in.pdf", "dirOut", selectedPages, config))
if err != nil {
	return
}
Output:
Example (ExtractPages) ¶
// Extract single-page PDFs for pages 3, 4 and 5.
selectedPages := []string{"3..5"}

config := NewDefaultConfiguration()

// Set optional password(s).
//config.UserPW = "upw"
//config.OwnerPW = "opw"

_, err := Process(ExtractPagesCommand("in.pdf", "dirOut", selectedPages, config))
if err != nil {
	return
}
Output:
Example (ListAttachments) ¶
config := NewDefaultConfiguration()

// Set optional password(s).
//config.UserPW = "upw"
//config.OwnerPW = "opw"

list, err := Process(ListAttachmentsCommand("in.pdf", config))
if err != nil {
	return
}

// Print attachment list.
for _, l := range list {
	fmt.Println(l)
}
Output:
Example (ListPermissions) ¶
config := NewDefaultConfiguration()
config.UserPW = "upw"
config.OwnerPW = "opw"

list, err := Process(ListPermissionsCommand("in.pdf", config))
if err != nil {
	return
}

// Print permissions list.
for _, l := range list {
	fmt.Println(l)
}
Output:
Example (Merge) ¶
// Concatenate this sequence of PDF files:
filenamesIn := []string{"in1.pdf", "in2.pdf", "in3.pdf"}

_, err := Process(MergeCommand(filenamesIn, "out.pdf", NewDefaultConfiguration()))
if err != nil {
	return
}
Output:
Example (Optimize) ¶
config := NewDefaultConfiguration()

// Set optional password(s).
//config.UserPW = "upw"
//config.OwnerPW = "opw"

// Generate optional stats.
config.StatsFileName = "stats.csv"

// Configure end of line sequence for writing.
config.Eol = EolLF

_, err := Process(OptimizeCommand("in.pdf", "out.pdf", config))
if err != nil {
	return
}
Output:
Example (RemoveAttachments) ¶
config := NewDefaultConfiguration()

// Set optional password(s).
//config.UserPW = "upw"
//config.OwnerPW = "opw"

// Not to be confused with the ExtractAttachmentsCommand!

// Remove all attachments.
_, err := Process(RemoveAttachmentsCommand("in.pdf", nil, config))
if err != nil {
	return
}

// Remove specific attachments.
_, err = Process(RemoveAttachmentsCommand("in.pdf", []string{"a.csv", "b.jpg"}, config))
if err != nil {
	return
}
Output:
Example (Split) ¶
config := NewDefaultConfiguration()

// Set optional password(s).
//config.UserPW = "upw"
//config.OwnerPW = "opw"

// Split into single-page PDFs.
_, err := Process(SplitCommand("in.pdf", "outDir", config))
if err != nil {
	return
}
Output:
Example (Trim) ¶
// Trim to first three pages.
selectedPages := []string{"-3"}

config := NewDefaultConfiguration()

// Set optional password(s).
//config.UserPW = "upw"
//config.OwnerPW = "opw"

_, err := Process(TrimCommand("in.pdf", "out.pdf", selectedPages, config))
if err != nil {
	return
}
Output:
Example (Validate) ¶
config := NewDefaultConfiguration()

// Set optional password(s).
//config.UserPW = "upw"
//config.OwnerPW = "opw"

// Set relaxed validation mode.
config.ValidationMode = ValidationRelaxed

_, err := Process(ValidateCommand("in.pdf", config))
if err != nil {
	return
}
Output:
func RemoveAttachments ¶ added in v0.1.3
func RemoveAttachments(fileIn string, files []string, config *Configuration) error
RemoveAttachments deletes embedded files from a PDF.
func Split ¶
func Split(fileIn, dirOut string, config *Configuration) error
Split generates a sequence of single-page PDF files in dirOut, creating one file for every page of fileIn.
func StringLiteralToString ¶ added in v0.1.11
StringLiteralToString returns the best possible string rep for a string literal.
func Trim ¶
func Trim(fileIn, fileOut string, pageSelection []string, config *Configuration) error
Trim generates a trimmed version of fileIn containing all pages selected.
func Validate ¶
func Validate(fileIn string, config *Configuration) error
Validate validates a PDF file against ISO-32000-1:2008.
func VersionString ¶ added in v0.1.11
func VersionString(version PDFVersion) string
VersionString returns a string representation for a given PDFVersion.
Types ¶
type ByteSize ¶ added in v0.1.11
type ByteSize float64
ByteSize represents the various terms for storage space.
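As an illustration of what a float64-based size type can look like, the sketch below follows the well-known pattern from Effective Go. It is an assumption about the general idea, not pdfcpu's actual definition: the lowercase names `byteSize`, `kb`, `mb` and `gb` are hypothetical.

```go
package main

import "fmt"

// byteSize is a hypothetical stand-in for a float64-based size type.
type byteSize float64

// Units are successive powers of 1024, generated with iota.
const (
	_           = iota // skip the zero value
	kb byteSize = 1 << (10 * iota)
	mb
	gb
)

// String renders the size using the largest unit that fits.
func (b byteSize) String() string {
	switch {
	case b >= gb:
		return fmt.Sprintf("%.2f GB", b/gb)
	case b >= mb:
		return fmt.Sprintf("%.2f MB", b/mb)
	case b >= kb:
		return fmt.Sprintf("%.2f KB", b/kb)
	}
	return fmt.Sprintf("%.0f B", b)
}

func main() {
	fmt.Println(byteSize(1536))
	fmt.Println(byteSize(5 * 1024 * 1024))
}
```

Using a floating-point base type keeps the division by a unit exact enough for display purposes without extra conversions.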
type Command ¶
type Command struct {
	Mode          CommandMode    // VALIDATE OPTIMIZE SPLIT MERGE EXTRACT TRIM LISTATT ADDATT REMATT EXTATT ENCRYPT DECRYPT CHANGEUPW CHANGEOPW LISTP ADDP
	InFile        *string        // *        *        *     -     *       *    *       *      *      *      *       *       *         *         *     *
	InFiles       []string       // -        -        -     *     -       -    -       *      *      *      -       -       -         -         -     -
	InDir         *string        // -        -        -     -     -       -    -       -      -      -      -       -       -         -         -     -
	OutFile       *string        // -        *        -     *     -       *    -       -      -      -      *       *       *         *         -     -
	OutDir        *string        // -        -        *     -     *       -    -       -      -      *      -       -       -         -         -     -
	PageSelection []string       // -        -        -     -     *       *    -       -      -      -      -       -       -         -         -     -
	Config        *Configuration // *        *        *     *     *       *    *       *      *      *      *       *       *         *         *     *
	PWOld         *string        // -        -        -     -     -       -    -       -      -      -      -       -       *         *         -     -
	PWNew         *string        // -        -        -     -     -       -    -       -      -      -      -       -       *         *         -     -
}
Command represents an execution context.
func AddAttachmentsCommand ¶ added in v0.1.3
func AddAttachmentsCommand(pdfFileNameIn string, fileNamesIn []string, config *Configuration) *Command
AddAttachmentsCommand creates a new AddAttachmentsCommand.
func AddPermissionsCommand ¶ added in v0.1.6
func AddPermissionsCommand(pdfFileNameIn string, config *Configuration) *Command
AddPermissionsCommand creates a new AddPermissionsCommand.
func ChangeOwnerPWCommand ¶ added in v0.1.1
func ChangeOwnerPWCommand(pdfFileNameIn, pdfFileNameOut string, config *Configuration, pwOld, pwNew *string) *Command
ChangeOwnerPWCommand creates a new ChangeOwnerPWCommand.
func ChangeUserPWCommand ¶ added in v0.1.1
func ChangeUserPWCommand(pdfFileNameIn, pdfFileNameOut string, config *Configuration, pwOld, pwNew *string) *Command
ChangeUserPWCommand creates a new ChangeUserPWCommand.
func DecryptCommand ¶ added in v0.1.1
func DecryptCommand(pdfFileNameIn, pdfFileNameOut string, config *Configuration) *Command
DecryptCommand creates a new DecryptCommand.
func EncryptCommand ¶ added in v0.1.1
func EncryptCommand(pdfFileNameIn, pdfFileNameOut string, config *Configuration) *Command
EncryptCommand creates a new EncryptCommand.
func ExtractAttachmentsCommand ¶ added in v0.1.3
func ExtractAttachmentsCommand(pdfFileNameIn, dirNameOut string, fileNamesIn []string, config *Configuration) *Command
ExtractAttachmentsCommand creates a new ExtractAttachmentsCommand.
func ExtractContentCommand ¶
func ExtractContentCommand(pdfFileNameIn, dirNameOut string, pageSelection []string, config *Configuration) *Command
ExtractContentCommand creates a new ExtractContentCommand.
func ExtractFontsCommand ¶
func ExtractFontsCommand(pdfFileNameIn, dirNameOut string, pageSelection []string, config *Configuration) *Command
ExtractFontsCommand creates a new ExtractFontsCommand. (experimental)
func ExtractImagesCommand ¶
func ExtractImagesCommand(pdfFileNameIn, dirNameOut string, pageSelection []string, config *Configuration) *Command
ExtractImagesCommand creates a new ExtractImagesCommand. (experimental)
func ExtractPagesCommand ¶
func ExtractPagesCommand(pdfFileNameIn, dirNameOut string, pageSelection []string, config *Configuration) *Command
ExtractPagesCommand creates a new ExtractPagesCommand.
func ListAttachmentsCommand ¶ added in v0.1.3
func ListAttachmentsCommand(pdfFileNameIn string, config *Configuration) *Command
ListAttachmentsCommand creates a new ListAttachmentsCommand.
func ListPermissionsCommand ¶ added in v0.1.6
func ListPermissionsCommand(pdfFileNameIn string, config *Configuration) *Command
ListPermissionsCommand creates a new ListPermissionsCommand.
func MergeCommand ¶
func MergeCommand(pdfFileNamesIn []string, pdfFileNameOut string, config *Configuration) *Command
MergeCommand creates a new MergeCommand.
func OptimizeCommand ¶
func OptimizeCommand(pdfFileNameIn, pdfFileNameOut string, config *Configuration) *Command
OptimizeCommand creates a new OptimizeCommand.
func RemoveAttachmentsCommand ¶ added in v0.1.3
func RemoveAttachmentsCommand(pdfFileNameIn string, fileNamesIn []string, config *Configuration) *Command
RemoveAttachmentsCommand creates a new RemoveAttachmentsCommand.
func SplitCommand ¶
func SplitCommand(pdfFileNameIn, dirNameOut string, config *Configuration) *Command
SplitCommand creates a new SplitCommand.
func TrimCommand ¶
func TrimCommand(pdfFileNameIn, pdfFileNameOut string, pageSelection []string, config *Configuration) *Command
TrimCommand creates a new TrimCommand.
func ValidateCommand ¶
func ValidateCommand(pdfFileName string, config *Configuration) *Command
ValidateCommand creates a new ValidateCommand.
type CommandMode ¶ added in v0.1.11
type CommandMode int
CommandMode specifies the operation being executed.
const (
	VALIDATE CommandMode = iota
	OPTIMIZE
	SPLIT
	MERGE
	EXTRACTIMAGES
	EXTRACTFONTS
	EXTRACTPAGES
	EXTRACTCONTENT
	TRIM
	ADDATTACHMENTS
	REMOVEATTACHMENTS
	EXTRACTATTACHMENTS
	LISTATTACHMENTS
	ADDPERMISSIONS
	LISTPERMISSIONS
	ENCRYPT
	DECRYPT
	CHANGEUPW
	CHANGEOPW
)
The available commands.
type Configuration ¶ added in v0.1.11
type Configuration struct {
	// Enables PDF V1.5 compatible processing of object streams, xref streams, hybrid PDF files.
	Reader15 bool

	// Enables decoding of all streams (fontfiles, images..) for logging purposes.
	DecodeAllStreams bool

	// Validate against ISO-32000: strict or relaxed
	ValidationMode int

	// End of line char sequence for writing.
	Eol string

	// Turns on object stream generation.
	// A signal for compressing any new non-stream-object into an object stream.
	// true enforces WriteXRefStream to true.
	// false does not prevent xRefStream generation.
	WriteObjectStream bool

	// Switches between xRefSection (<=V1.4) and objectStream/xRefStream (>=V1.5) writing.
	WriteXRefStream bool

	// Turns on stats collection.
	CollectStats bool

	// A CSV-filename holding the statistics.
	StatsFileName string

	// Supplied user password
	UserPW    string
	UserPWNew *string

	// Supplied owner password
	OwnerPW    string
	OwnerPWNew *string

	// EncryptUsingAES ensures AES encryption.
	// true: AES encryption
	// false: RC4 encryption.
	EncryptUsingAES bool

	// EncryptUsing128BitKey ensures 128 bit key length.
	// true: use 128 bit key
	// false: use 40 bit key
	EncryptUsing128BitKey bool

	// Supplied user access permissions, see Table 22
	UserAccessPermissions int16

	// Command being executed.
	Mode CommandMode
}
Configuration of a PDFContext.
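The field comments state that setting WriteObjectStream to true forces WriteXRefStream to true, while false leaves WriteXRefStream untouched. A minimal sketch of that rule follows. The type `writeOptions` and the function `normalize` are hypothetical and only mirror the two documented fields; how and where pdfcpu applies the rule internally is not shown in this documentation.

```go
package main

import "fmt"

// writeOptions mirrors the two documented write-related switches.
// (Hypothetical local type; the real fields live on Configuration.)
type writeOptions struct {
	WriteObjectStream bool
	WriteXRefStream   bool
}

// normalize applies the documented dependency: object stream generation
// implies xref stream generation, but not the other way around.
func normalize(o writeOptions) writeOptions {
	if o.WriteObjectStream {
		o.WriteXRefStream = true
	}
	return o
}

func main() {
	fmt.Printf("%+v\n", normalize(writeOptions{WriteObjectStream: true}))
	fmt.Printf("%+v\n", normalize(writeOptions{WriteXRefStream: true}))
}
```

The dependency is one-directional because objects packed into an object stream can only be located through a cross-reference stream, whereas a cross-reference stream can index ordinary objects as well.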
func NewDefaultConfiguration ¶ added in v0.1.11
func NewDefaultConfiguration() *Configuration
NewDefaultConfiguration returns the default pdfcpu configuration.
func (*Configuration) ValidationModeString ¶ added in v0.1.11
func (c *Configuration) ValidationModeString() string
ValidationModeString returns a string rep for the validation mode in effect.
type FontObject ¶ added in v0.1.11
type FontObject struct {
	ResourceNames []string
	Prefix        string
	FontName      string
	FontDict      *PDFDict
	Data          []byte
	Extension     string
}
FontObject represents a font used in a PDF file.
func (*FontObject) AddResourceName ¶ added in v0.1.11
func (fo *FontObject) AddResourceName(resourceName string)
AddResourceName adds a resourceName referring to this font.
func (FontObject) Embedded ¶ added in v0.1.11
func (fo FontObject) Embedded() (embedded bool)
Embedded returns true if the font is embedded into this PDF file.
func (FontObject) Encoding ¶ added in v0.1.11
func (fo FontObject) Encoding() string
Encoding returns the Encoding of this font.
func (FontObject) ResourceNamesString ¶ added in v0.1.11
func (fo FontObject) ResourceNamesString() string
ResourceNamesString returns a string representation of all the resource names of this font.
func (FontObject) String ¶ added in v0.1.11
func (fo FontObject) String() string
func (FontObject) SubType ¶ added in v0.1.11
func (fo FontObject) SubType() string
SubType returns the SubType of this font.
type ImageObject ¶ added in v0.1.11
type ImageObject struct {
	ResourceNames []string
	ImageDict     *PDFStreamDict
	Extension     string
}
ImageObject represents an image used in a PDF file.
func (*ImageObject) AddResourceName ¶ added in v0.1.11
func (io *ImageObject) AddResourceName(resourceName string)
AddResourceName adds a resourceName to this imageObject's ResourceNames dict.
func (ImageObject) Data ¶ added in v0.1.11
func (io ImageObject) Data() []byte
Data returns the raw data belonging to this image object.
func (ImageObject) ResourceNamesString ¶ added in v0.1.11
func (io ImageObject) ResourceNamesString() string
ResourceNamesString returns a string representation of the ResourceNames for this image.
type Node ¶ added in v0.1.11
type Node struct {
	Kids       []*Node         // Mirror of the name tree's Kids array.
	Names      []entry         // Mirror of the name tree's Names array.
	Kmin, Kmax string          // Mirror of the name tree's Limit array[Kmin,Kmax].
	IndRef     *PDFIndirectRef // Pointer to the PDF object representing this name tree node.
}
Node is an opinionated implementation of the PDF name tree. pdfcpu caches all name trees found in the PDF catalog with this data structure. The PDF spec does not impose any rules regarding a strategy for the creation of nodes. A binary tree was chosen where each leaf node has a limited number of entries (maxEntries). Once maxEntries has been reached, a leaf node turns into an intermediary node with two kids, each of them a leaf node holding half of the sorted entries of the original leaf node.
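The splitting strategy described above can be illustrated with a minimal, self-contained sketch. It is not pdfcpu's implementation: the node type, the insert method and the maxEntries value of 4 are hypothetical, and splitting as soon as a leaf exceeds the limit is an assumption about the exact threshold.

```go
package main

import (
	"fmt"
	"sort"
)

// maxEntries is a hypothetical leaf capacity chosen for this sketch.
const maxEntries = 4

// node is a simplified stand-in for a name tree node: a leaf holds sorted
// keys, an intermediary node holds exactly two kids.
type node struct {
	kids       []*node
	keys       []string
	kmin, kmax string
}

func (n *node) leaf() bool { return len(n.kids) == 0 }

// insert adds key to the subtree rooted at n. A leaf that grows beyond
// maxEntries turns into an intermediary node with two leaf kids, each
// holding half of the sorted keys.
func (n *node) insert(key string) {
	if !n.leaf() {
		// Use the left kid unless key sorts after its upper limit.
		k := n.kids[0]
		if key > k.kmax {
			k = n.kids[1]
		}
		k.insert(key)
		n.kmin, n.kmax = n.kids[0].kmin, n.kids[1].kmax
		return
	}
	i := sort.SearchStrings(n.keys, key)
	if i < len(n.keys) && n.keys[i] == key {
		return // key already present
	}
	n.keys = append(n.keys, "")
	copy(n.keys[i+1:], n.keys[i:])
	n.keys[i] = key
	n.kmin, n.kmax = n.keys[0], n.keys[len(n.keys)-1]
	if len(n.keys) <= maxEntries {
		return
	}
	mid := len(n.keys) / 2
	left := &node{keys: append([]string(nil), n.keys[:mid]...)}
	right := &node{keys: append([]string(nil), n.keys[mid:]...)}
	left.kmin, left.kmax = left.keys[0], left.keys[len(left.keys)-1]
	right.kmin, right.kmax = right.keys[0], right.keys[len(right.keys)-1]
	n.kids, n.keys = []*node{left, right}, nil
}

func main() {
	root := &node{}
	for _, k := range []string{"e", "c", "a", "d", "b"} {
		root.insert(k)
	}
	// The fifth key exceeds maxEntries, so the root is no longer a leaf.
	fmt.Println(root.leaf(), root.kids[0].keys, root.kids[1].keys)
}
```

Keeping Kmin and Kmax on every node lets a lookup decide which kid to descend into without inspecting the kid's entries.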
func (Node) Process ¶ added in v0.1.11
func (n Node) Process(xRefTable *XRefTable, handler func(*XRefTable, string, PDFObject) error) error
Process traverses the nametree applying a handler to each entry (key-value pair).
type OptimizationContext ¶ added in v0.1.11
type OptimizationContext struct {
	// Font section
	PageFonts         []IntSet
	FontObjects       map[int]*FontObject
	Fonts             map[string][]int
	DuplicateFontObjs IntSet
	DuplicateFonts    map[int]*PDFDict

	// Image section
	PageImages         []IntSet
	ImageObjects       map[int]*ImageObject
	DuplicateImageObjs IntSet
	DuplicateImages    map[int]*PDFStreamDict

	DuplicateInfoObjects IntSet // Possible result of manual info dict modification.
	NonReferencedObjs    []int  // Objects that are not referenced.
}
OptimizationContext represents the context for the optimization of a PDF file.
func (*OptimizationContext) DuplicateFontObjectsString ¶ added in v0.1.11
func (oc *OptimizationContext) DuplicateFontObjectsString() (int, string)
DuplicateFontObjectsString returns a formatted string and the number of objs.
func (*OptimizationContext) DuplicateImageObjectsString ¶ added in v0.1.11
func (oc *OptimizationContext) DuplicateImageObjectsString() (int, string)
DuplicateImageObjectsString returns a formatted string and the number of objs.
func (*OptimizationContext) DuplicateInfoObjectsString ¶ added in v0.1.11
func (oc *OptimizationContext) DuplicateInfoObjectsString() (int, string)
DuplicateInfoObjectsString returns a formatted string and the number of objs.
func (*OptimizationContext) IsDuplicateFontObject ¶ added in v0.1.11
func (oc *OptimizationContext) IsDuplicateFontObject(i int) bool
IsDuplicateFontObject returns true if object #i is a duplicate font object.
func (*OptimizationContext) IsDuplicateImageObject ¶ added in v0.1.11
func (oc *OptimizationContext) IsDuplicateImageObject(i int) bool
IsDuplicateImageObject returns true if object #i is a duplicate image object.
func (*OptimizationContext) IsDuplicateInfoObject ¶ added in v0.1.11
func (oc *OptimizationContext) IsDuplicateInfoObject(i int) bool
IsDuplicateInfoObject returns true if object #i is a duplicate info object.
func (*OptimizationContext) NonReferencedObjsString ¶ added in v0.1.11
func (oc *OptimizationContext) NonReferencedObjsString() (int, string)
NonReferencedObjsString returns a formatted string and the number of objs.
type PDFArray ¶ added in v0.1.11
type PDFArray []PDFObject
PDFArray represents a PDF array object.
func NewIntegerArray ¶ added in v0.1.11
NewIntegerArray returns a PDFArray with PDFInteger entries.
func NewNameArray ¶ added in v0.1.11
NewNameArray returns a PDFArray with PDFName entries.
func NewNumberArray ¶ added in v0.1.11
NewNumberArray returns a PDFArray with PDFFloat entries.
func NewRectangle ¶ added in v0.1.11
NewRectangle creates a rectangle array.
func NewStringArray ¶ added in v0.1.11
NewStringArray returns a PDFArray with PDFStringLiteral entries.
type PDFBoolean ¶ added in v0.1.11
type PDFBoolean bool
PDFBoolean represents a PDF boolean object.
func (PDFBoolean) PDFString ¶ added in v0.1.11
func (boolean PDFBoolean) PDFString() string
PDFString returns a string representation as found in and written to a PDF file.
func (PDFBoolean) String ¶ added in v0.1.11
func (boolean PDFBoolean) String() string
func (PDFBoolean) Value ¶ added in v0.1.11
func (boolean PDFBoolean) Value() bool
Value returns a bool value for this PDF object.
type PDFContext ¶ added in v0.1.11
type PDFContext struct {
	*Configuration
	*XRefTable
	Read     *ReadContext
	Optimize *OptimizationContext
	Write    *WriteContext
}
PDFContext represents the context for processing PDF files.
func NewPDFContext ¶ added in v0.1.11
func NewPDFContext(fileName string, file *os.File, config *Configuration) (*PDFContext, error)
NewPDFContext initializes a new PDFContext.
func Read ¶
func Read(fileIn string, config *Configuration) (*PDFContext, error)
Read reads in a PDF file and builds an internal structure holding its cross reference table aka the PDFContext.
func (*PDFContext) ResetWriteContext ¶ added in v0.1.11
func (ctx *PDFContext) ResetWriteContext()
ResetWriteContext prepares an existing WriteContext for a new file to be written.
func (*PDFContext) String ¶ added in v0.1.11
func (ctx *PDFContext) String() string
type PDFDict ¶ added in v0.1.11
PDFDict represents a PDF dict object.
func NewPDFDict ¶ added in v0.1.11
func NewPDFDict() PDFDict
NewPDFDict returns a new PDFDict object.
func (PDFDict) BooleanEntry ¶ added in v0.1.11
BooleanEntry expects and returns a BooleanEntry for given key.
func (PDFDict) IndirectRefEntry ¶ added in v0.1.11
func (d PDFDict) IndirectRefEntry(key string) *PDFIndirectRef
IndirectRefEntry returns an indirectRefEntry for given key for this dictionary.
func (*PDFDict) InsertFloat ¶ added in v0.1.11
InsertFloat adds a new float entry to this PDFDict.
func (*PDFDict) InsertName ¶ added in v0.1.11
InsertName adds a new name entry to this PDFDict.
func (*PDFDict) InsertString ¶ added in v0.1.11
InsertString adds a new string entry to this PDFDict.
func (PDFDict) Int64Entry ¶ added in v0.1.11
Int64Entry expects and returns a PDFInteger entry representing an int64 value for given key.
func (PDFDict) IntEntry ¶ added in v0.1.11
IntEntry expects and returns a PDFInteger entry for given key.
func (PDFDict) IsLinearizationParmDict ¶ added in v0.1.11
IsLinearizationParmDict returns true if this dict has an int entry for key "Linearized".
func (PDFDict) IsObjStm ¶ added in v0.1.11
IsObjStm returns true if given PDFDict is an object stream.
func (PDFDict) Length ¶ added in v0.1.11
Length returns a *int64 for entry with key "Length". Stream length may be referring to an indirect object.
func (PDFDict) NameEntry ¶ added in v0.1.11
NameEntry expects and returns a PDFName entry for given key.
func (PDFDict) PDFArrayEntry ¶ added in v0.1.11
PDFArrayEntry expects and returns a PDFArray entry for given key.
func (PDFDict) PDFDictEntry ¶ added in v0.1.11
PDFDictEntry expects and returns a PDFDict entry for given key.
func (PDFDict) PDFHexLiteralEntry ¶ added in v0.1.11
func (d PDFDict) PDFHexLiteralEntry(key string) *PDFHexLiteral
PDFHexLiteralEntry returns a PDFHexLiteral object for given key.
func (PDFDict) PDFNameEntry ¶ added in v0.1.11
PDFNameEntry returns a PDFName object for given key.
func (PDFDict) PDFStreamDictEntry ¶ added in v0.1.11
func (d PDFDict) PDFStreamDictEntry(key string) *PDFStreamDict
PDFStreamDictEntry expects and returns a PDFStreamDict entry for given key. Unused.
func (PDFDict) PDFString ¶ added in v0.1.11
PDFString returns a string representation as found in and written to a PDF file.
func (PDFDict) PDFStringLiteralEntry ¶ added in v0.1.11
func (d PDFDict) PDFStringLiteralEntry(key string) *PDFStringLiteral
PDFStringLiteralEntry returns a PDFStringLiteral object for given key.
func (PDFDict) StringEntry ¶ added in v0.1.11
StringEntry expects and returns a PDFStringLiteral entry for given key. Unused.
func (PDFDict) StringEntryBytes ¶ added in v0.1.11
StringEntryBytes returns the byte slice representing the string value for key.
func (PDFDict) Subtype ¶ added in v0.1.11
Subtype returns the value of the name entry for key "Subtype".
type PDFFloat ¶ added in v0.1.11
type PDFFloat float64
PDFFloat represents a PDF float object.
type PDFHexLiteral ¶ added in v0.1.11
type PDFHexLiteral string
PDFHexLiteral represents a PDF hex literal object.
func (PDFHexLiteral) Bytes ¶ added in v0.1.11
func (hexliteral PDFHexLiteral) Bytes() ([]byte, error)
Bytes returns the byte representation.
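As an illustration of what a byte representation of a hex literal involves, here is a minimal sketch built on the standard library. The helper name hexLiteralBytes is hypothetical and this is not necessarily how pdfcpu implements Bytes. The sketch assumes the enclosing angle brackets and any whitespace have already been removed, and it follows the PDF specification's rule that a missing final digit is taken to be 0.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// hexLiteralBytes decodes the digits of a PDF hex string.
// An odd number of digits is padded with a trailing 0 before decoding.
func hexLiteralBytes(digits string) ([]byte, error) {
	if len(digits)%2 == 1 {
		digits += "0"
	}
	return hex.DecodeString(digits)
}

func main() {
	b, err := hexLiteralBytes("48656C6C6F")
	fmt.Println(string(b), err)

	b, err = hexLiteralBytes("901FA") // odd length: decoded as 90 1F A0
	fmt.Printf("% X %v\n", b, err)
}
```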
func (PDFHexLiteral) PDFString ¶ added in v0.1.11
func (hexliteral PDFHexLiteral) PDFString() string
PDFString returns the string representation as found in and written to a PDF file.
func (PDFHexLiteral) String ¶ added in v0.1.11
func (hexliteral PDFHexLiteral) String() string
func (PDFHexLiteral) Value ¶ added in v0.1.11
func (hexliteral PDFHexLiteral) Value() string
Value returns a string value for this PDF object.
type PDFIndirectRef ¶ added in v0.1.11
type PDFIndirectRef struct {
	ObjectNumber     PDFInteger
	GenerationNumber PDFInteger
}
PDFIndirectRef represents a PDF indirect object.
func NewPDFIndirectRef ¶ added in v0.1.11
func NewPDFIndirectRef(objectNumber, generationNumber int) *PDFIndirectRef
NewPDFIndirectRef returns a new PDFIndirectRef object.
func (PDFIndirectRef) Equals ¶ added in v0.1.11
func (ir PDFIndirectRef) Equals(indRef PDFIndirectRef) bool
Equals returns true if two indirect References refer to the same object.
func (PDFIndirectRef) PDFString ¶ added in v0.1.11
func (ir PDFIndirectRef) PDFString() string
PDFString returns a string representation as found in and written to a PDF file.
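For orientation, a PDF file writes an indirect reference as the object number, the generation number and the keyword R. The sketch below shows that notation and an equality check on both numbers. The indirectRef type and its methods are hypothetical stand-ins, not the package's own code.

```go
package main

import "fmt"

// indirectRef is a simplified stand-in for an indirect reference.
type indirectRef struct {
	objNr, genNr int
}

// pdfString renders the reference the way it appears in a PDF file.
func (r indirectRef) pdfString() string {
	return fmt.Sprintf("%d %d R", r.objNr, r.genNr)
}

// equals reports whether both references name the same object and generation.
func (r indirectRef) equals(o indirectRef) bool {
	return r.objNr == o.objNr && r.genNr == o.genNr
}

func main() {
	a := indirectRef{12, 0}
	b := indirectRef{12, 1}
	fmt.Println(a.pdfString(), a.equals(a), a.equals(b))
}
```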
func (PDFIndirectRef) String ¶ added in v0.1.11
func (ir PDFIndirectRef) String() string
type PDFInteger ¶ added in v0.1.11
type PDFInteger int
PDFInteger represents a PDF integer object.
func (PDFInteger) PDFString ¶ added in v0.1.11
func (i PDFInteger) PDFString() string
PDFString returns a string representation as found in and written to a PDF file.
func (PDFInteger) String ¶ added in v0.1.11
func (i PDFInteger) String() string
func (PDFInteger) Value ¶ added in v0.1.11
func (i PDFInteger) Value() int
Value returns an int value for this PDF object.
type PDFName ¶ added in v0.1.11
type PDFName string
PDFName represents a PDF name object.
type PDFObjectStreamDict ¶ added in v0.1.11
type PDFObjectStreamDict struct {
	PDFStreamDict
	Prolog         []byte
	ObjCount       int
	FirstObjOffset int
	ObjArray       PDFArray
}
PDFObjectStreamDict represents an object stream dictionary.
func NewPDFObjectStreamDict ¶ added in v0.1.11
func NewPDFObjectStreamDict() *PDFObjectStreamDict
NewPDFObjectStreamDict creates a new PDFObjectStreamDict object.
func (*PDFObjectStreamDict) AddObject ¶ added in v0.1.11
func (oStreamDict *PDFObjectStreamDict) AddObject(objNumber int, entry *XRefTableEntry) error
AddObject adds another object to this object stream. Relies on decoded content!
func (*PDFObjectStreamDict) Finalize ¶ added in v0.1.11
func (oStreamDict *PDFObjectStreamDict) Finalize()
Finalize prepares the final content of the objectstream.
func (*PDFObjectStreamDict) IndexedObject ¶ added in v0.1.11
func (oStreamDict *PDFObjectStreamDict) IndexedObject(index int) (PDFObject, error)
IndexedObject returns the object at given index from a PDFObjectStreamDict.
type PDFStats ¶ added in v0.1.11
type PDFStats struct {
// contains filtered or unexported fields
}
PDFStats is a container for stats.
func NewPDFStats ¶ added in v0.1.11
func NewPDFStats() PDFStats
NewPDFStats returns a new PDFStats object.
func (PDFStats) AddPageAttr ¶ added in v0.1.11
AddPageAttr adds the occurrence of a field with given name to the pageAttrs set.
func (PDFStats) AddRootAttr ¶ added in v0.1.11
AddRootAttr adds the occurrence of a field with given name to the rootAttrs set.
func (PDFStats) UsesPageAttr ¶ added in v0.1.11
UsesPageAttr returns true if a field with given name is contained in the pageAttrs set.
func (PDFStats) UsesRootAttr ¶ added in v0.1.11
UsesRootAttr returns true if a field with given name is contained in the rootAttrs set.
type PDFStreamDict ¶ added in v0.1.11
type PDFStreamDict struct {
	PDFDict
	StreamOffset      int64
	StreamLength      *int64
	StreamLengthObjNr *int
	FilterPipeline    []PDFFilter
	Raw               []byte // Encoded
	Content           []byte // Decoded
	IsPageContent     bool
}
PDFStreamDict represents a PDF stream dict object.
func NewPDFStreamDict ¶ added in v0.1.11
func NewPDFStreamDict(pdfDict PDFDict, streamOffset int64, streamLength *int64, streamLengthObjNr *int, filterPipeline []PDFFilter) PDFStreamDict
NewPDFStreamDict creates a new PDFStreamDict for given PDFDict, stream offset and length.
func (PDFStreamDict) HasSoleFilterNamed ¶ added in v0.1.11
func (streamDict PDFStreamDict) HasSoleFilterNamed(filterName string) bool
HasSoleFilterNamed returns true if there is exactly one filter defined for a stream dict and that filter is named filterName.
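A check of this kind can be sketched in a few lines. The filter type and its name field are assumptions made for the example, since the fields of PDFFilter are not shown in this documentation.

```go
package main

import "fmt"

// filter is a hypothetical stand-in for one entry of a filter pipeline.
type filter struct {
	name string
}

// hasSoleFilterNamed reports whether the pipeline consists of exactly one
// filter and that filter carries the given name.
func hasSoleFilterNamed(pipeline []filter, name string) bool {
	return len(pipeline) == 1 && pipeline[0].name == name
}

func main() {
	single := []filter{{"FlateDecode"}}
	chained := []filter{{"ASCII85Decode"}, {"FlateDecode"}}
	fmt.Println(hasSoleFilterNamed(single, "FlateDecode"))
	fmt.Println(hasSoleFilterNamed(chained, "FlateDecode"))
}
```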
type PDFStringLiteral ¶ added in v0.1.11
type PDFStringLiteral string
PDFStringLiteral represents a PDF string literal object.
func DateStringLiteral ¶ added in v0.1.11
func DateStringLiteral(t time.Time) PDFStringLiteral
DateStringLiteral returns a PDFStringLiteral for time.
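PDF date strings follow the pattern D:YYYYMMDDHHmmSS followed by a UTC offset written as +HH'mm' or -HH'mm'. The following sketch formats a time.Time that way. The function name pdfDate and the sample helper are hypothetical, and the exact output of DateStringLiteral may differ, for example in how it writes UTC.

```go
package main

import (
	"fmt"
	"time"
)

// pdfDate formats t as a PDF date string with an explicit UTC offset.
func pdfDate(t time.Time) string {
	_, off := t.Zone() // offset east of UTC in seconds
	sign := '+'
	if off < 0 {
		sign = '-'
		off = -off
	}
	return fmt.Sprintf("D:%s%c%02d'%02d'",
		t.Format("20060102150405"), sign, off/3600, off%3600/60)
}

// sample returns a fixed point in time one hour east of UTC.
func sample() time.Time {
	return time.Date(2017, time.March, 5, 14, 7, 9, 0, time.FixedZone("CET", 3600))
}

func main() {
	fmt.Println(pdfDate(sample()))
}
```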
func (PDFStringLiteral) PDFString ¶ added in v0.1.11
func (stringliteral PDFStringLiteral) PDFString() string
PDFString returns a string representation as found in and written to a PDF file.
func (PDFStringLiteral) String ¶ added in v0.1.11
func (stringliteral PDFStringLiteral) String() string
func (PDFStringLiteral) Value ¶ added in v0.1.11
func (stringliteral PDFStringLiteral) Value() string
Value returns a string value for this PDF object.
type PDFVersion ¶ added in v0.1.11
type PDFVersion int
PDFVersion is a type for the internal representation of PDF versions.
const (
	V10 PDFVersion = iota
	V11
	V12
	V13
	V14
	V15
	V16
	V17
)
Constants for all PDF versions up to v1.7
func Version ¶ added in v0.1.11
func Version(versionStr string) (PDFVersion, error)
Version returns the PDFVersion for a version string.
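A lookup from version string to an iota-based constant could look like the sketch below. The lower-case names are hypothetical, and the accepted input format ("1.0" through "1.7") and the error text are assumptions, not the documented behavior of Version.

```go
package main

import "fmt"

// pdfVersion mirrors the idea of an iota-based version enumeration.
type pdfVersion int

const (
	v10 pdfVersion = iota
	v11
	v12
	v13
	v14
	v15
	v16
	v17
)

// versionStrings is indexed by pdfVersion.
var versionStrings = []string{"1.0", "1.1", "1.2", "1.3", "1.4", "1.5", "1.6", "1.7"}

// parseVersion maps a version string to its constant.
func parseVersion(s string) (pdfVersion, error) {
	for i, v := range versionStrings {
		if v == s {
			return pdfVersion(i), nil
		}
	}
	return -1, fmt.Errorf("unsupported PDF version %q", s)
}

func main() {
	v, err := parseVersion("1.4")
	fmt.Println(v == v14, err)

	_, err = parseVersion("2.0")
	fmt.Println(err)
}
```

Because the constants are consecutive integers, a "since version" check reduces to a plain comparison such as v >= v14.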
type PDFXRefStreamDict ¶ added in v0.1.11
type PDFXRefStreamDict struct {
	PDFStreamDict
	Size           int
	Objects        []int
	W              [3]int
	PreviousOffset *int64
}
PDFXRefStreamDict represents a cross reference stream dictionary.
func NewPDFXRefStreamDict ¶ added in v0.1.11
func NewPDFXRefStreamDict(ctx *PDFContext) *PDFXRefStreamDict
NewPDFXRefStreamDict creates a new PDFXRefStreamDict object.
type ReadContext ¶ added in v0.1.11
type ReadContext struct {
	// The PDF-File which gets processed.
	FileName string
	File     *os.File
	FileSize int64

	BinaryTotalSize     int64 // total stream data
	BinaryImageSize     int64 // total image stream data
	BinaryFontSize      int64 // total font stream data (fontfiles)
	BinaryImageDuplSize int64 // total obsolete image stream data after optimization
	BinaryFontDuplSize  int64 // total obsolete font stream data after optimization

	Linearized bool // File is linearized.
	Hybrid     bool // File is a hybrid PDF file.

	UsingObjectStreams bool   // File is using object streams.
	ObjectStreams      IntSet // All object numbers of any object streams found which need to be decoded.

	UsingXRefStreams bool   // File is using xref streams.
	XRefStreams      IntSet // All object numbers of any xref streams found.
}
ReadContext represents the context for reading a PDF file.
func (*ReadContext) IsObjectStreamObject ¶ added in v0.1.11
func (rc *ReadContext) IsObjectStreamObject(i int) bool
IsObjectStreamObject returns true if object #i is an object stream. All compressed objects are object streams.
func (*ReadContext) IsXRefStreamObject ¶ added in v0.1.11
func (rc *ReadContext) IsXRefStreamObject(i int) bool
IsXRefStreamObject returns true if object #i is an xref stream.
func (*ReadContext) LogStats ¶ added in v0.1.11
func (rc *ReadContext) LogStats(optimized bool)
LogStats logs stats for read file.
func (*ReadContext) ObjectStreamsString ¶ added in v0.1.11
func (rc *ReadContext) ObjectStreamsString() (int, string)
ObjectStreamsString returns a formatted string and the number of object stream objects.
func (*ReadContext) XRefStreamsString ¶ added in v0.1.11
func (rc *ReadContext) XRefStreamsString() (int, string)
XRefStreamsString returns a formatted string and the number of xref stream objects.
type WriteContext ¶ added in v0.1.11
type WriteContext struct {
	// The PDF-File which gets generated.
	DirName  string
	FileName string
	FileSize int64
	*bufio.Writer

	Command       string // command in effect.
	ExtractPageNr int    // page to be generated for rendering a single-page/PDF.
	ExtractPages  IntSet // pages to be generated for a trimmed PDF.

	BinaryTotalSize int64 // total stream data, counts 100% all stream data written.
	BinaryImageSize int64 // total image stream data written = Read.BinaryImageSize.
	BinaryFontSize  int64 // total font stream data (fontfiles) = copy of Read.BinaryFontSize.

	Table  map[int]int64 // object write offsets
	Offset int64         // current write offset

	WriteToObjectStream bool // if true start to embed objects into object streams and obey ObjectStreamMaxObjects.
	CurrentObjStream    *int // if not nil, any new non-stream-object gets added to the object stream with this object number.

	Eol string // end of line char sequence
}
WriteContext represents the context for writing a PDF file.
func NewWriteContext ¶ added in v0.1.11
func NewWriteContext(eol string) *WriteContext
NewWriteContext returns a new WriteContext.
func (*WriteContext) ExtractPage ¶ added in v0.1.11
func (wc *WriteContext) ExtractPage(i int) bool
ExtractPage returns true if page i needs to be generated.
func (*WriteContext) HasWriteOffset ¶ added in v0.1.11
func (wc *WriteContext) HasWriteOffset(objNumber int) bool
HasWriteOffset returns true if an object has already been written to PDFDestination.
func (*WriteContext) LogStats ¶ added in v0.1.11
func (wc *WriteContext) LogStats()
LogStats logs stats for written file.
func (*WriteContext) ReducedFeatureSet ¶ added in v0.1.11
func (wc *WriteContext) ReducedFeatureSet() bool
ReducedFeatureSet returns true for Split, Trim, Merge and ExtractPages. Don't confuse these with pdfcpu commands; they are internal triggers.
func (*WriteContext) SetWriteOffset ¶ added in v0.1.11
func (wc *WriteContext) SetWriteOffset(objNumber int)
SetWriteOffset saves the current write offset to the PDFDestination.
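The bookkeeping behind HasWriteOffset and SetWriteOffset can be pictured as a map from object number to byte offset plus a running offset, matching the Table and Offset fields of WriteContext. The sketch below is a simplified model with hypothetical names. In particular the write helper is invented for the example and only counts bytes.

```go
package main

import "fmt"

// writeCtx models the two fields relevant for offset tracking.
type writeCtx struct {
	table  map[int]int64 // object write offsets
	offset int64         // current write offset
}

// write pretends to emit s and advances the current offset.
func (w *writeCtx) write(s string) {
	w.offset += int64(len(s))
}

// setWriteOffset records the current offset as the start of object objNr.
func (w *writeCtx) setWriteOffset(objNr int) {
	w.table[objNr] = w.offset
}

// hasWriteOffset reports whether object objNr has already been written.
func (w *writeCtx) hasWriteOffset(objNr int) bool {
	_, ok := w.table[objNr]
	return ok
}

func main() {
	w := &writeCtx{table: map[int]int64{}}
	w.write("%PDF-1.7\n")
	w.setWriteOffset(1)
	w.write("1 0 obj\nnull\nendobj\n")
	w.setWriteOffset(2)
	fmt.Println(w.table[1], w.table[2], w.hasWriteOffset(3))
}
```

Offsets recorded this way are what a writer needs later when it emits the cross reference section.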
func (*WriteContext) WriteEol ¶ added in v0.1.11
func (wc *WriteContext) WriteEol() error
WriteEol writes an end of line sequence.
type XRefTable ¶ added in v0.1.11
type XRefTable struct {
	Table     map[int]*XRefTableEntry
	Size      *int             // Object count from PDF trailer dict.
	PageCount int              // Number of pages, set during validation.
	Root      *PDFIndirectRef  // Pointer to catalog (reference to root object).
	RootDict  *PDFDict         // Catalog
	Names     map[string]*Node // Cache for name trees as found in catalog.
	Encrypt   *PDFIndirectRef  // Encrypt dict.
	E         *Enc
	EncKey    []byte // Encrypt key.

	AES4Strings         bool
	AES4Streams         bool
	AES4EmbeddedStreams bool

	// PDF Version
	HeaderVersion *PDFVersion // The PDF version the source is claiming to us as per its header.
	RootVersion   *PDFVersion // Optional PDF version taking precedence over the header version.

	// Document information section
	Info     *PDFIndirectRef // Infodict (reference to info dict object)
	ID       *PDFArray       // from info dict (or trailer?)
	Author   string
	Creator  string
	Producer string

	// Linearization section (not yet supported)
	OffsetPrimaryHintTable  *int64
	OffsetOverflowHintTable *int64
	LinearizationObjs       IntSet

	// Offspec section
	AdditionalStreams *PDFArray // array of PDFIndirectRef - trailer :e.g., Oasis "Open Doc"

	// Statistics
	Stats PDFStats

	Tagged bool // File is using tags. This is important for ???

	// Validation
	Valid          bool // true means successfully validated against ISO 32000.
	ValidationMode int  // see Configuration

	Optimized bool
}
XRefTable represents a PDF cross reference table plus stats for a PDF file.
func (*XRefTable) BindNameTrees ¶ added in v0.1.11
BindNameTrees syncs up the internal name tree cache with the xreftable.
func (*XRefTable) Catalog ¶ added in v0.1.11
Catalog returns a pointer to the root object / catalog.
func (*XRefTable) CatalogHasPieceInfo ¶ added in v0.1.11
CatalogHasPieceInfo returns true if the root has an entry for "PieceInfo".
func (*XRefTable) DeleteObject ¶ added in v0.1.11
DeleteObject marks an object as free and inserts it into the free list right after the head.
func (*XRefTable) DeleteObjectGraph ¶ added in v0.1.11
DeleteObjectGraph deletes all objects reachable by indRef.
func (*XRefTable) Dereference ¶ added in v0.1.11
Dereference resolves an indirect object and returns the resulting PDF object.
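Resolving a reference amounts to looking up the object number in the table and returning the stored object, while direct objects pass through unchanged. The sketch below uses hypothetical types. Treating a reference to a missing object as null (nil) follows the PDF specification, and the depth limit is an assumption added to guard against reference cycles.

```go
package main

import (
	"errors"
	"fmt"
)

type object interface{}

type indirectRef struct {
	objNr, genNr int
}

// xref maps object numbers to objects.
type xref map[int]object

// dereference follows indirect references until a direct object is found.
func (x xref) dereference(o object) (object, error) {
	for depth := 0; depth < 32; depth++ {
		ref, ok := o.(indirectRef)
		if !ok {
			return o, nil // already a direct object
		}
		target, found := x[ref.objNr]
		if !found {
			return nil, nil // undefined object: treated as null
		}
		o = target
	}
	return nil, errors.New("dereference: too many levels of indirection")
}

func main() {
	x := xref{1: 42, 2: indirectRef{1, 0}}
	fmt.Println(x.dereference(indirectRef{2, 0}))
	fmt.Println(x.dereference("direct"))
	fmt.Println(x.dereference(indirectRef{9, 0}))
}
```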
func (*XRefTable) DereferenceArray ¶ added in v0.1.11
DereferenceArray resolves and validates an array object, which may be an indirect reference.
func (*XRefTable) DereferenceDict ¶ added in v0.1.11
DereferenceDict resolves and validates a dictionary object, which may be an indirect reference.
func (*XRefTable) DereferenceInteger ¶ added in v0.1.11
func (xRefTable *XRefTable) DereferenceInteger(obj PDFObject) (*PDFInteger, error)
DereferenceInteger resolves and validates an integer object, which may be an indirect reference.
func (*XRefTable) DereferenceName ¶ added in v0.1.11
func (xRefTable *XRefTable) DereferenceName(obj PDFObject, sinceVersion PDFVersion, validate func(string) bool) (n PDFName, err error)
DereferenceName resolves and validates a name object, which may be an indirect reference.
func (*XRefTable) DereferenceStreamDict ¶ added in v0.1.11
func (xRefTable *XRefTable) DereferenceStreamDict(obj PDFObject) (*PDFStreamDict, error)
DereferenceStreamDict resolves and validates a stream dictionary object, which may be an indirect reference.
func (*XRefTable) DereferenceStringLiteral ¶ added in v0.1.11
func (xRefTable *XRefTable) DereferenceStringLiteral(obj PDFObject, sinceVersion PDFVersion, validate func(string) bool) (s PDFStringLiteral, err error)
DereferenceStringLiteral resolves and validates a string literal object, which may be an indirect reference.
func (*XRefTable) DereferenceStringOrHexLiteral ¶ added in v0.1.11
func (xRefTable *XRefTable) DereferenceStringOrHexLiteral(obj PDFObject, sinceVersion PDFVersion, validate func(string) bool) (o PDFObject, err error)
DereferenceStringOrHexLiteral resolves and validates a string or hex literal object, which may be an indirect reference.
func (*XRefTable) EncryptDict ¶ added in v0.1.11
EncryptDict returns a pointer to the encryption dict.
func (*XRefTable) EnsureCollection ¶ added in v0.1.11
EnsureCollection makes sure there is a Collection entry in the catalog. Needed for portfolios / portable collections, e.g. for file attachments.
func (*XRefTable) EnsureValidFreeList ¶ added in v0.1.11
EnsureValidFreeList ensures the integrity of the free list associated with the recorded free objects. See 7.5.4 Cross-Reference Table.
func (*XRefTable) Exists ¶ added in v0.1.11
Exists returns true if xRefTable contains an entry for objNumber.
func (*XRefTable) Find ¶ added in v0.1.11
func (xRefTable *XRefTable) Find(objNumber int) (*XRefTableEntry, bool)
Find returns the XRefTable entry for given object number.
func (*XRefTable) FindObject ¶ added in v0.1.11
FindObject returns the object of the XRefTableEntry for a specific object number.
func (*XRefTable) FindTableEntry ¶ added in v0.1.11
func (xRefTable *XRefTable) FindTableEntry(objNumber int, generationNumber int) (*XRefTableEntry, bool)
FindTableEntry returns the XRefTable entry for given object and generation numbers.
func (*XRefTable) FindTableEntryForIndRef ¶ added in v0.1.11
func (xRefTable *XRefTable) FindTableEntryForIndRef(indRef *PDFIndirectRef) (*XRefTableEntry, bool)
FindTableEntryForIndRef returns the XRefTable entry for given indirect reference.
func (*XRefTable) FindTableEntryLight ¶ added in v0.1.11
func (xRefTable *XRefTable) FindTableEntryLight(objNumber int) (*XRefTableEntry, bool)
FindTableEntryLight returns the XRefTable entry for given object number.
func (*XRefTable) Free ¶ added in v0.1.11
func (xRefTable *XRefTable) Free(objNumber int) (*XRefTableEntry, error)
Free returns the cross ref table entry for given number of a free object.
func (*XRefTable) IDFirstElement ¶ added in v0.1.11
IDFirstElement returns the first element of ID.
func (*XRefTable) IndRefForNewObject ¶ added in v0.1.11
func (xRefTable *XRefTable) IndRefForNewObject(obj PDFObject) (*PDFIndirectRef, error)
IndRefForNewObject inserts an object into the xRefTable and returns an indirect reference to it.
func (*XRefTable) InsertAndUseRecycled ¶ added in v0.1.11
func (xRefTable *XRefTable) InsertAndUseRecycled(xRefTableEntry XRefTableEntry) (objNumber int, err error)
InsertAndUseRecycled adds given xRefTableEntry into the cross reference table utilizing the freelist.
func (*XRefTable) InsertNew ¶ added in v0.1.11
func (xRefTable *XRefTable) InsertNew(xRefTableEntry XRefTableEntry) (objNumber int)
InsertNew adds given xRefTableEntry at next new objNumber into the cross reference table. Only to be called once an xRefTable has been generated completely and all trailer dicts have been processed. xRefTable.Size is the size entry of the first trailer dict processed. Called on creation of new object streams. Called by InsertAndUseRecycled.
func (*XRefTable) InsertObject ¶ added in v0.1.11
InsertObject inserts an object into the xRefTable.
func (*XRefTable) IsLinearizationObject ¶ added in v0.1.11
IsLinearizationObject returns true if object #i is a linearization object.
func (*XRefTable) LinearizationObjsString ¶ added in v0.1.11
LinearizationObjsString returns a formatted string and the number of objs.
func (*XRefTable) LocateNameTree ¶ added in v0.1.11
LocateNameTree locates/ensures a specific name tree.
func (*XRefTable) MissingObjects ¶ added in v0.1.11
MissingObjects returns the number of objects that were not written plus the corresponding comma separated string representation.
func (*XRefTable) NamesDict ¶ added in v0.1.11
NamesDict returns the dict that contains all name trees.
func (*XRefTable) NewEmbeddedFileStreamDict ¶ added in v0.1.11
func (xRefTable *XRefTable) NewEmbeddedFileStreamDict(filename string) (*PDFStreamDict, error)
NewEmbeddedFileStreamDict creates and returns an embeddedFileStreamDict containing the file "filename".
func (*XRefTable) NewFileSpecDict ¶ added in v0.1.11
func (xRefTable *XRefTable) NewFileSpecDict(filename string, indRefStreamDict PDFIndirectRef) (*PDFDict, error)
NewFileSpecDict creates and returns a new fileSpec dictionary.
func (*XRefTable) NewPDFStreamDict ¶ added in v0.1.11
func (xRefTable *XRefTable) NewPDFStreamDict(filename string) (*PDFStreamDict, error)
NewPDFStreamDict creates a streamDict for the content of the file "filename".
func (*XRefTable) NewSoundStreamDict ¶ added in v0.1.11
func (xRefTable *XRefTable) NewSoundStreamDict(filename string, samplingRate int, fileSpecDict *PDFDict) (*PDFStreamDict, error)
NewSoundStreamDict returns a new sound stream dict.
func (*XRefTable) NextForFree ¶ added in v0.1.11
NextForFree returns the number of the object the free object with objNumber links to. This is the successor of this free object in the free list.
func (*XRefTable) Pages ¶ added in v0.1.11
func (xRefTable *XRefTable) Pages() (*PDFIndirectRef, error)
Pages returns the Pages reference contained in the catalog.
func (*XRefTable) ParseRootVersion ¶ added in v0.1.11
ParseRootVersion returns a string representation for an optional Version entry in the root object.
func (*XRefTable) RemoveCollection ¶ added in v0.1.11
RemoveCollection removes an existing Collection entry from the catalog.
func (*XRefTable) RemoveEmbeddedFilesNameTree ¶ added in v0.1.11
RemoveEmbeddedFilesNameTree removes both the embedded files name tree and the Collection dict.
func (*XRefTable) RemoveNameTree ¶ added in v0.1.11
RemoveNameTree removes a specific name tree. Also removes a resulting empty names dict.
func (*XRefTable) UndeleteObject ¶ added in v0.1.11
UndeleteObject ensures an object is not recorded in the free list. This is sometimes necessary because of indirect references to free objects in the original PDF file.
func (*XRefTable) ValidateVersion ¶ added in v0.1.11
func (xRefTable *XRefTable) ValidateVersion(element string, sinceVersion PDFVersion) error
ValidateVersion validates against the xRefTable's version.
func (*XRefTable) Version ¶ added in v0.1.11
func (xRefTable *XRefTable) Version() PDFVersion
Version returns the PDF version of the PDF writer that created this file. Before V1.4 this is the header version. Since V1.4 the catalog may contain a Version entry which takes precedence over the header version.
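The precedence rule reads naturally as "use the catalog's Version entry if present, otherwise the header version". A minimal sketch with hypothetical types, assuming the header version is always set:

```go
package main

import "fmt"

type pdfVersion int

const (
	v14 pdfVersion = 4
	v16 pdfVersion = 6
)

// versions holds the header version and the optional catalog version.
type versions struct {
	header *pdfVersion
	root   *pdfVersion // nil if the catalog has no Version entry
}

// effective returns the root version if present, else the header version.
func (v versions) effective() pdfVersion {
	if v.root != nil {
		return *v.root
	}
	return *v.header
}

func main() {
	h, r := v14, v16
	fmt.Println(versions{header: &h}.effective())
	fmt.Println(versions{header: &h, root: &r}.effective())
}
```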
func (*XRefTable) VersionString ¶ added in v0.1.11
VersionString returns a string representation of this PDF file's PDF version.
type XRefTableEntry ¶ added in v0.1.11
type XRefTableEntry struct {
	Free            bool
	Offset          *int64
	Generation      *int
	Object          PDFObject
	Compressed      bool
	ObjectStream    *int
	ObjectStreamInd *int
}
XRefTableEntry represents an entry in the PDF cross reference table.
This may wrap a free object, a compressed object or any in use PDF object:
PDFDict, PDFStreamDict, PDFObjectStreamDict, PDFXRefStreamDict, PDFArray, PDFInteger, PDFFloat, PDFName, PDFStringLiteral, PDFHexLiteral, PDFBoolean
func NewFreeHeadXRefTableEntry ¶ added in v0.1.11
func NewFreeHeadXRefTableEntry() *XRefTableEntry
NewFreeHeadXRefTableEntry returns the xref table entry for object 0, which is by definition the head of the free list (list of free objects).
func NewXRefTableEntryGen0 ¶ added in v0.1.11
func NewXRefTableEntryGen0(obj PDFObject) *XRefTableEntry
NewXRefTableEntryGen0 returns a cross reference table entry for an object with generation 0.
Source Files ¶
- api.go
- array.go
- attach.go
- configuration.go
- context.go
- createAnnotations.go
- createRenditions.go
- createTestPDF.go
- crypto.go
- dict.go
- doc.go
- equal.go
- extract.go
- filter.go
- merge.go
- nameTree.go
- optimize.go
- parse.go
- process.go
- read.go
- resources.go
- stats.go
- streamdict.go
- string.go
- types.go
- utf16.go
- validateAcroForm.go
- validateAction.go
- validateAnnotations.go
- validateColorspace.go
- validateDate.go
- validateDestination.go
- validateExtGState.go
- validateFileSpec.go
- validateFont.go
- validateFunction.go
- validateInfo.go
- validateMedia.go
- validateNameTree.go
- validateNumberTree.go
- validateObjects.go
- validateOptionalContent.go
- validateOutlineTree.go
- validatePages.go
- validatePattern.go
- validateProperties.go
- validateShading.go
- validateStructTree.go
- validateThread.go
- validateXObject.go
- validateXReftable.go
- version.go
- write.go
- writeInfo.go
- writeObjects.go
- writePages.go
- writeStats.go
- xreftable.go