docextractor

package
v5.39.3
Published: Dec 15, 2021 | License: AGPL-3.0, Apache-2.0 | Imports: 18 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func Extract

func Extract(filename string, r io.ReadSeeker, settings ExtractSettings) (string, error)

Extract extracts the text from a document using the system default extractors.

func ExtractWithExtraExtractors

func ExtractWithExtraExtractors(filename string, r io.ReadSeeker, settings ExtractSettings, extraExtractors []Extractor) (string, error)

ExtractWithExtraExtractors extracts the text from a document using the provided extractors in addition to the system default extractors.

Types

type ExtractSettings

type ExtractSettings struct {
	ArchiveRecursion bool
	MMPreviewURL     string
	MMPreviewSecret  string
}

ExtractSettings defines the features that are enabled or disabled during document text extraction.

type Extractor

type Extractor interface {
	Match(filename string) bool
	Extract(filename string, file io.ReadSeeker) (string, error)
}

Extractor defines the interface needed to extract file content.
