Documentation ¶
Index ¶
- func ConvertDoc(r io.Reader) (string, map[string]string, error)
- func ConvertDocx(r io.Reader) (string, map[string]string, error)
- func ConvertHTML(r io.Reader, readability bool) (string, map[string]string, error)
- func ConvertImage(r io.Reader) (string, map[string]string, error)
- func ConvertODT(r io.Reader) (string, map[string]string, error)
- func ConvertPDF(r io.Reader) (string, map[string]string, error)
- func ConvertPages(r io.Reader) (string, map[string]string, error)
- func ConvertPathReadability(path string, readability bool) ([]byte, error)
- func ConvertRTF(r io.Reader) (string, map[string]string, error)
- func ConvertURL(input io.Reader, readability bool) (string, map[string]string, error)
- func ConvertXML(r io.Reader) (string, map[string]string, error)
- func DocxXMLToText(r io.Reader) (string, error)
- func HTMLReadability(r io.Reader) []byte
- func HTMLToText(input io.Reader) string
- func MimeTypeByExtension(filename string) string
- func SetImageLanguages(string)
- func Tidy(r io.Reader, xmlIn bool) ([]byte, error)
- func XMLToMap(r io.Reader) (map[string]string, error)
- func XMLToText(r io.Reader, breaks []string, skip []string, strict bool) (string, error)
- type HTMLReadabilityOptions
- type LocalFile
- type Response
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ConvertDoc ¶
Convert MS Word DOC
func ConvertDocx ¶
Convert DOCX to text
func ConvertHTML ¶
Convert HTML
func ConvertODT ¶
Convert ODT to text
func ConvertPages ¶
Convert PAGES to text
func ConvertPathReadability ¶
TODO(dhowden): Refactor this. Convert a file given a path
func ConvertURL ¶
Convert URL
func HTMLReadability ¶
Extract the readable text in an HTML document
func HTMLToText ¶
func MimeTypeByExtension ¶
Determine the mime type by the file's extension
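As a rough illustration of extension-based lookup, the sketch below maps a lower-cased file extension to a MIME type and falls back to a generic type when the extension is unknown. The name `mimeTypeByExtension`, the table contents, and the fallback value are assumptions made for this example; they are not taken from the package source.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// mimeByExt is a small, illustrative table. The real package may recognise
// a different set of extensions (assumption).
var mimeByExt = map[string]string{
	".pdf":  "application/pdf",
	".html": "text/html",
	".rtf":  "application/rtf",
	".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
}

// mimeTypeByExtension is a hypothetical stand-in for MimeTypeByExtension.
// It looks only at the extension of filename, ignoring case.
func mimeTypeByExtension(filename string) string {
	ext := strings.ToLower(filepath.Ext(filename))
	if t, ok := mimeByExt[ext]; ok {
		return t
	}
	// Assumed fallback for unknown or missing extensions.
	return "application/octet-stream"
}

func main() {
	fmt.Println(mimeTypeByExtension("Report.PDF"))
	fmt.Println(mimeTypeByExtension("notes"))
}
```

Because only the file name is inspected, a lookup like this cannot detect a mislabelled file; it is a cheap first guess, not content sniffing.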
func SetImageLanguages ¶
func SetImageLanguages(string)
Types ¶
type HTMLReadabilityOptions ¶
type HTMLReadabilityOptions struct {
    LengthLow             int
    LengthHigh            int
    StopwordsLow          float64
    StopwordsHigh         float64
    MaxLinkDensity        float64
    MaxHeadingDistance    int
    ReadabilityUseClasses string
}
HTMLReadabilityOptions defines the parameters that are passed to the justext package. TODO: Improve this!
var HTMLReadabilityOptionsValues HTMLReadabilityOptions
TODO: Remove this from global state.
type LocalFile ¶
LocalFile is a type which wraps an *os.File. See NewLocalFile for more details.
func NewLocalFile ¶
NewLocalFile ensures that there is a file which contains the data provided by r. If r is already an *os.File, that file is used; otherwise a temporary file is created (using dir and prefix) and the data from r is copied into it. Callers must call Done() when the LocalFile is no longer needed, to ensure all resources are cleaned up.
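The behaviour described above can be sketched with the standard library alone. Everything in this block (`localFile`, `newLocalFile`, the `temp` field, and the `roundTrip` helper) is a hypothetical reimplementation for illustration, not the package's code. The rewind to the start of the temporary file is also an assumption.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// localFile is a hypothetical stand-in for LocalFile: it wraps an *os.File
// and remembers whether the file is a temporary copy.
type localFile struct {
	*os.File
	temp bool
}

// newLocalFile reuses r when it is already an *os.File; otherwise it copies
// r into a temporary file created from dir and prefix.
func newLocalFile(r io.Reader, dir, prefix string) (*localFile, error) {
	if f, ok := r.(*os.File); ok {
		return &localFile{File: f}, nil
	}
	f, err := os.CreateTemp(dir, prefix)
	if err != nil {
		return nil, fmt.Errorf("creating temporary file: %w", err)
	}
	lf := &localFile{File: f, temp: true}
	if _, err := io.Copy(f, r); err != nil {
		lf.Done()
		return nil, fmt.Errorf("copying data: %w", err)
	}
	// Rewind so callers read from the beginning (assumption).
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		lf.Done()
		return nil, fmt.Errorf("rewinding temporary file: %w", err)
	}
	return lf, nil
}

// Done closes the file and removes it if it was a temporary copy.
func (l *localFile) Done() {
	l.Close()
	if l.temp {
		os.Remove(l.Name())
	}
}

// roundTrip writes s through newLocalFile and reads it back.
func roundTrip(s string) (string, error) {
	lf, err := newLocalFile(strings.NewReader(s), "", "sketch-")
	if err != nil {
		return "", err
	}
	defer lf.Done()
	data, err := io.ReadAll(lf)
	return string(data), err
}

func main() {
	out, err := roundTrip("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints "hello"
}
```

Deferring Done() straight after a successful call, as `roundTrip` does, keeps the temporary file from outliving the caller.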
type Response ¶
type Response struct {
    Body  string            `json:"body"`
    Meta  map[string]string `json:"meta"`
    MSecs uint32            `json:"msecs"`
}
Response payload sent back to the requestor
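To show what the struct tags above produce on the wire, the following sketch copies the Response definition locally and encodes one value with encoding/json. The field values and the `encodeResponse` helper are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Response is a local copy of the struct shown above, so that this example
// compiles without importing the package.
type Response struct {
	Body  string            `json:"body"`
	Meta  map[string]string `json:"meta"`
	MSecs uint32            `json:"msecs"`
}

// encodeResponse is a hypothetical helper that returns r as a JSON string.
func encodeResponse(r Response) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeResponse(Response{
		Body:  "Hello",
		Meta:  map[string]string{"Author": "example"},
		MSecs: 12,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints {"body":"Hello","meta":{"Author":"example"},"msecs":12}
}
```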
func ConvertPath ¶
TODO(dhowden): Refactor this. Convert a file given a path