package document
v0.14.0

This package is not in the latest version of its module.
Published: Jan 2, 2026 | License: MIT | Imports: 4 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func ExtractDocxText

func ExtractDocxText(path string) (string, error)

ExtractDocxText extracts all text content from a DOCX file. Text is extracted from <w:t> tags within word/document.xml.
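
As a rough illustration of what collecting <w:t> text involves, the sketch below walks the XML of word/document.xml with the standard encoding/xml decoder and joins the text of each paragraph. This is not the package's implementation: the name docxBodyText, the choice of encoding/xml over tag scanning, and the one-paragraph-per-line output are all assumptions made for the example.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// wNS is the WordprocessingML namespace that the w: prefix is normally bound to.
const wNS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

// sample is a minimal, hand-written stand-in for word/document.xml.
const sample = `<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"><w:body>` +
	`<w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>` +
	`<w:p><w:r><w:t>R&amp;D</w:t></w:r></w:p>` +
	`</w:body></w:document>`

// docxBodyText (hypothetical helper) concatenates the <w:t> text inside each
// <w:p> paragraph and returns one paragraph per line.
func docxBodyText(documentXML []byte) (string, error) {
	dec := xml.NewDecoder(bytes.NewReader(documentXML))
	var paras []string
	var cur strings.Builder
	inText := false
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if t.Name.Space == wNS && t.Name.Local == "t" {
				inText = true
			}
		case xml.CharData:
			if inText {
				cur.Write(t) // the decoder has already unescaped entities
			}
		case xml.EndElement:
			if t.Name.Space != wNS {
				continue
			}
			switch t.Name.Local {
			case "t":
				inText = false
			case "p":
				paras = append(paras, cur.String())
				cur.Reset()
			}
		}
	}
	return strings.Join(paras, "\n"), nil
}

func main() {
	text, err := docxBodyText([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(text)
}
```

Run as written, this prints "Hello, world" and "R&D" on separate lines. Whether ExtractDocxText itself preserves paragraph boundaries is not stated in its documentation.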

func ExtractPptxText

func ExtractPptxText(path string) (string, error)

ExtractPptxText extracts all text content from a PPTX file. Text is extracted from <a:t> tags within slide XML files.
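
The description does not say in what order slide files are read. If order matters to a caller, one thing to watch is that a plain lexical sort places slide10.xml before slide2.xml. The sketch below orders slide part names by their numeric suffix; slideNumber and sortSlides are hypothetical helpers, and the ppt/slides/slideN.xml naming is the conventional layout rather than something this package documents. File numbering may also differ from the presentation's actual slide order.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// slideNumber parses N from "ppt/slides/slideN.xml".
// The boolean is false for any name that does not have that shape.
func slideNumber(name string) (int, bool) {
	const prefix, suffix = "ppt/slides/slide", ".xml"
	if !strings.HasPrefix(name, prefix) || !strings.HasSuffix(name, suffix) {
		return 0, false
	}
	n, err := strconv.Atoi(name[len(prefix) : len(name)-len(suffix)])
	if err != nil {
		return 0, false
	}
	return n, true
}

// sortSlides keeps only slide parts and orders them by slide number.
func sortSlides(names []string) []string {
	var slides []string
	for _, name := range names {
		if _, ok := slideNumber(name); ok {
			slides = append(slides, name)
		}
	}
	sort.Slice(slides, func(i, j int) bool {
		a, _ := slideNumber(slides[i])
		b, _ := slideNumber(slides[j])
		return a < b
	})
	return slides
}

func main() {
	names := []string{
		"ppt/slides/slide10.xml",
		"ppt/slides/slide2.xml",
		"ppt/slides/_rels/slide1.xml.rels",
		"ppt/slides/slide1.xml",
	}
	for _, name := range sortSlides(names) {
		fmt.Println(name)
	}
}
```

This prints slide1.xml, slide2.xml, then slide10.xml, and drops the relationship file.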

func ExtractTextFromTags

func ExtractTextFromTags(data []byte, tagPrefix string) string

ExtractTextFromTags extracts text content between XML tags with the given prefix. For example, ExtractTextFromTags(data, "a:t") extracts text from <a:t>content</a:t>.
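
To make the tag-prefix idea concrete, here is a minimal sketch of one way such extraction could be done with a regular expression. It is a stand-in written for this example, not the package's code: the lowercase name extractTextFromTags, the newline used to join matches, and the handling of attributes are all assumptions, and the sketch does not unescape XML entities.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// extractTextFromTags (hypothetical stand-in) returns the text found between
// <tagPrefix ...> and </tagPrefix>, joined with newlines.
func extractTextFromTags(data []byte, tagPrefix string) string {
	name := regexp.QuoteMeta(tagPrefix)
	// The opening tag may carry attributes, e.g. <a:t xml:space="preserve">.
	// Requiring whitespace or '>' after the name keeps <a:tab/> from matching.
	re := regexp.MustCompile(`<` + name + `(?:\s[^>]*)?>([^<]*)</` + name + `>`)
	var parts []string
	for _, m := range re.FindAllSubmatch(data, -1) {
		parts = append(parts, string(m[1]))
	}
	return strings.Join(parts, "\n")
}

func main() {
	data := []byte(`<a:p><a:r><a:t>Hello</a:t></a:r><a:r><a:t xml:space="preserve"> world</a:t></a:r></a:p>`)
	fmt.Printf("%q\n", extractTextFromTags(data, "a:t"))
}
```

With the sample input above, the program prints "Hello\n world".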

func FindFileInZip

func FindFileInZip(reader *zip.ReadCloser, name string) *zip.File

FindFileInZip finds a file in a ZIP archive by exact name.

func FindFilesWithPrefix

func FindFilesWithPrefix(reader *zip.ReadCloser, prefix, suffix string) []*zip.File

FindFilesWithPrefix finds all files in a ZIP archive whose names match the given prefix and suffix.
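
The two lookups can be pictured as simple scans over the archive's entry list. The sketch below shows that idea against an in-memory archive using only the standard library. The helpers findFile, findFilesWithPrefix, and buildZip are hypothetical names for this example, and they take a *zip.Reader (which *zip.ReadCloser embeds) so that no file on disk is needed.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"strings"
)

// findFile returns the entry whose name equals name exactly, or nil.
func findFile(r *zip.Reader, name string) *zip.File {
	for _, f := range r.File {
		if f.Name == name {
			return f
		}
	}
	return nil
}

// findFilesWithPrefix returns entries whose names start with prefix and end with suffix.
func findFilesWithPrefix(r *zip.Reader, prefix, suffix string) []*zip.File {
	var out []*zip.File
	for _, f := range r.File {
		if strings.HasPrefix(f.Name, prefix) && strings.HasSuffix(f.Name, suffix) {
			out = append(out, f)
		}
	}
	return out
}

// buildZip creates an in-memory archive with empty entries, for demonstration only.
func buildZip(names ...string) *zip.Reader {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, n := range names {
		if _, err := w.Create(n); err != nil {
			panic(err)
		}
	}
	if err := w.Close(); err != nil {
		panic(err)
	}
	r, err := zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
	if err != nil {
		panic(err)
	}
	return r
}

func main() {
	r := buildZip(
		"ppt/slides/slide1.xml",
		"ppt/slides/slide2.xml",
		"ppt/slides/_rels/slide1.xml.rels",
		"ppt/presentation.xml",
	)
	fmt.Println(findFile(r, "ppt/presentation.xml") != nil)
	for _, f := range findFilesWithPrefix(r, "ppt/slides/slide", ".xml") {
		fmt.Println(f.Name)
	}
}
```

This prints true followed by the two slide entries. The relationship file is skipped because its name does not start with the given prefix.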

func OpenOfficeFile

func OpenOfficeFile(path string) (*zip.ReadCloser, error)

OpenOfficeFile opens an Office file (PPTX, DOCX, XLSX) as a ZIP archive. The caller is responsible for closing the returned reader.

func ReadZipFile

func ReadZipFile(file *zip.File) ([]byte, error)

ReadZipFile reads a file from a ZIP archive and returns its contents.
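
The open, read, and close sequence can be sketched with the standard archive/zip package alone. The code below is an illustration under assumptions rather than this package's implementation: readZipFile, writeSample, and roundTrip are hypothetical helpers, and the sample archive contains a single hand-written entry rather than a real Office document.

```go
package main

import (
	"archive/zip"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// readZipFile opens one archive entry and returns its decompressed bytes.
func readZipFile(f *zip.File) ([]byte, error) {
	rc, err := f.Open()
	if err != nil {
		return nil, err
	}
	defer rc.Close()
	return io.ReadAll(rc)
}

// writeSample creates a tiny ZIP at path with a single entry, for demonstration only.
func writeSample(path, name, content string) error {
	out, err := os.Create(path)
	if err != nil {
		return err
	}
	defer out.Close()
	w := zip.NewWriter(out)
	entry, err := w.Create(name)
	if err != nil {
		return err
	}
	if _, err := entry.Write([]byte(content)); err != nil {
		return err
	}
	return w.Close()
}

// roundTrip writes a one-entry archive to a temporary file, reopens it from
// disk, and returns the entry's contents.
func roundTrip(name, content string) (string, error) {
	dir, err := os.MkdirTemp("", "office-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "sample.docx")
	if err := writeSample(path, name, content); err != nil {
		return "", err
	}

	rc, err := zip.OpenReader(path)
	if err != nil {
		return "", err
	}
	defer rc.Close() // whoever opens the archive closes it
	for _, f := range rc.File {
		if f.Name == name {
			data, err := readZipFile(f)
			return string(data), err
		}
	}
	return "", fmt.Errorf("entry %q not found", name)
}

func main() {
	text, err := roundTrip("word/document.xml", "<w:t>hi</w:t>")
	if err != nil {
		panic(err)
	}
	fmt.Println(text)
}
```

Deferring the Close call immediately after a successful open is the usual way to meet a caller-closes contract like the one described for OpenOfficeFile.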

Types

type DocxMetadata

type DocxMetadata struct {
	WordCount int
	Author    string
}

DocxMetadata contains extracted metadata from a DOCX file.

func ExtractDocxMetadata

func ExtractDocxMetadata(path string) (*DocxMetadata, error)

ExtractDocxMetadata extracts metadata (word count and author) from a DOCX file.
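
The documentation does not say how the two fields are derived. As one plausible reading, the sketch below takes the author from the dc:creator element of docProps/core.xml, where Office files conventionally store it, and counts whitespace-separated words in already-extracted body text. The names docxMetadata, coreProperties, and buildMetadata are hypothetical, and both derivations are assumptions rather than this package's documented behavior.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// docxMetadata mirrors the shape of DocxMetadata for this example only.
type docxMetadata struct {
	WordCount int
	Author    string
}

// coreProperties picks the Dublin Core creator element out of docProps/core.xml.
type coreProperties struct {
	Creator string `xml:"http://purl.org/dc/elements/1.1/ creator"`
}

// sampleCore is a minimal, hand-written stand-in for docProps/core.xml.
const sampleCore = `<cp:coreProperties ` +
	`xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" ` +
	`xmlns:dc="http://purl.org/dc/elements/1.1/">` +
	`<dc:creator>Ada Lovelace</dc:creator>` +
	`</cp:coreProperties>`

// buildMetadata (hypothetical helper) combines the author from core.xml with
// a whitespace-based word count of the document's body text.
func buildMetadata(coreXML []byte, bodyText string) (*docxMetadata, error) {
	var props coreProperties
	if err := xml.Unmarshal(coreXML, &props); err != nil {
		return nil, err
	}
	return &docxMetadata{
		WordCount: len(strings.Fields(bodyText)),
		Author:    strings.TrimSpace(props.Creator),
	}, nil
}

func main() {
	m, err := buildMetadata([]byte(sampleCore), "The quick brown fox")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *m)
}
```

For the sample input this prints {WordCount:4 Author:Ada Lovelace}. A whitespace split is only an approximation of what a word processor reports as its word count.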

type PptxMetadata

type PptxMetadata struct {
	SlideCount int
	Author     string
}

PptxMetadata contains extracted metadata from a PPTX file.

func ExtractPptxMetadata

func ExtractPptxMetadata(path string) (*PptxMetadata, error)

ExtractPptxMetadata extracts metadata (slide count and author) from a PPTX file.
