Documentation ¶
Overview ¶
Package ltxmlharvest provides a MathWebSearch harvester for documents produced by LaTeXML.
Functions ¶
func HarvestFS ¶
func HarvestFS(fsys fs.FS, accept func(path string) bool, uri func(path string) string, writer func(path string, harvest Harvest) error, logger *log.Logger)
HarvestFS recursively harvests all files in fs.FS. Each directory will be grouped into a single harvest.
func HarvestReader ¶
HarvestReader harvests a single reader and writes the output to writer.
Types ¶
type Harvest ¶
type Harvest []HarvestFragment
Harvest represents a single harvest. It implements sort.Interface.
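Because Harvest is a slice type that implements sort.Interface, it can be passed directly to `sort.Sort`. The documentation does not state the ordering, so the following sketch uses local stand-in types (`fragment`, `harvest`) and assumes ordering by URI purely for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// fragment is a local stand-in carrying two of the documented
// HarvestFragment fields.
type fragment struct {
	ID  string
	URI string
}

// harvest mirrors the documented shape of Harvest: a slice of fragments.
type harvest []fragment

// Len, Less and Swap satisfy sort.Interface.
// Ordering by URI is an assumption made for this sketch.
func (h harvest) Len() int           { return len(h) }
func (h harvest) Less(i, j int) bool { return h[i].URI < h[j].URI }
func (h harvest) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }

// sortedURIs sorts h in place and returns the URIs in their new order.
func sortedURIs(h harvest) []string {
	sort.Sort(h)
	uris := make([]string, len(h))
	for i, f := range h {
		uris[i] = f.URI
	}
	return uris
}

func main() {
	h := harvest{
		{ID: "2", URI: "b.html"},
		{ID: "1", URI: "a.html"},
	}
	fmt.Println(sortedURIs(h))
}
```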
func HarvestFragments ¶
HarvestFragments executes jobs and writes them to logger.
func (Harvest) MarshalXML ¶
MarshalXML marshals this harvest into XML form.
type HarvestFormula ¶
type HarvestFormula struct {
	// ID of this formula
	ID string

	// Dual (Content + Presentation) MathML contained in this document
	// Content and Presentation should be linked using "xref" attributes.
	// May use "m" and "mws" namespaces.
	DualMathML string

	// Content MathML corresponding to the DualMathML above.
	// Must use the "m" namespace.
	ContentMathML string
}
HarvestFormula represents a single formula found within the harvest.
func ReadFormula ¶
func ReadFormula(math *etree.Element) (HarvestFormula, error)
ReadFormula parses a formula from the given element.
type HarvestFragment ¶
type HarvestFragment struct {
	// ID is an internal, but unique, id of this harvest fragment
	// typically just the running id of this fragment
	ID string

	// URI is the URI of the corresponding document
	URI string

	// XHTMLContent of this document, substituting "math" + id for formulae
	XHTMLContent string

	// List of formulae within the harvest
	Formulae []HarvestFormula
}
HarvestFragment represents a single document fragment within a harvest.
func (HarvestFragment) MarshalXML ¶
func (frag HarvestFragment) MarshalXML(e *xml.Encoder, start xml.StartElement) error
MarshalXML marshals this document into XML.
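A method with this signature makes a type satisfy `xml.Marshaler`, so `encoding/xml` calls it instead of using reflection. The sketch below shows the general pattern on a local stand-in type. The element and attribute names (`fragment`, `id`, `uri`) are hypothetical and are not the names the package emits.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// fragment is a local stand-in for a harvest fragment.
type fragment struct {
	ID      string
	URI     string
	Content string
}

// MarshalXML writes the fragment as a single element with two attributes
// and escaped character data. All names here are hypothetical.
func (f fragment) MarshalXML(e *xml.Encoder, start xml.StartElement) error {
	start.Name = xml.Name{Local: "fragment"}
	start.Attr = []xml.Attr{
		{Name: xml.Name{Local: "id"}, Value: f.ID},
		{Name: xml.Name{Local: "uri"}, Value: f.URI},
	}
	if err := e.EncodeToken(start); err != nil {
		return err
	}
	if err := e.EncodeToken(xml.CharData(f.Content)); err != nil {
		return err
	}
	return e.EncodeToken(start.End())
}

// render marshals f using the custom MarshalXML method above.
func render(f fragment) (string, error) {
	out, err := xml.Marshal(f)
	return string(out), err
}

func main() {
	out, err := render(fragment{ID: "1", URI: "doc.html", Content: "x < y"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```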
type Job ¶
type Job struct {
	Reader func() (io.ReadCloser, error)
	URI    string
}
Job describes a job for the harvester.
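Since `Reader` is a function rather than an open stream, a Job can be created cheaply and its source opened only when the job runs. The following sketch copies the documented struct into a local type so it runs without the package. `jobFromString` and `readJob` are hypothetical helpers introduced only for this illustration.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Job is a local copy of the documented declaration.
type Job struct {
	Reader func() (io.ReadCloser, error)
	URI    string
}

// jobFromString builds a Job whose Reader serves an in-memory string.
// Hypothetical helper, not part of ltxmlharvest.
func jobFromString(content, uri string) Job {
	return Job{
		Reader: func() (io.ReadCloser, error) {
			return io.NopCloser(strings.NewReader(content)), nil
		},
		URI: uri,
	}
}

// readJob opens the job's reader, reads everything, and closes it.
func readJob(job Job) (string, error) {
	rc, err := job.Reader()
	if err != nil {
		return "", err
	}
	defer rc.Close()
	data, err := io.ReadAll(rc)
	return string(data), err
}

func main() {
	job := jobFromString("<html></html>", "https://example.org/doc.html")
	body, err := readJob(job)
	if err != nil {
		panic(err)
	}
	fmt.Println(job.URI, len(body))
}
```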
func JobFromFile ¶
JobFromFile creates a new Job from a file and a URI base.