Documentation
Constants
This section is empty.
Variables
This section is empty.
Functions
This section is empty.
Types
type HTindex
type HTindex struct {
	// RootPrefix is concatenated with paths given in input file to get
	// complete path to HathiTrust files.
	RootPrefix string
	// InputPath gives path to file with input data.
	InputPath string
	// OutputPath gives path to a directory to keep output data.
	OutputPath string
	// JobsNum sets number of jobs/workers to run.
	JobsNum int
	// Dict contains shared dictionary for name finding.
	Dict *dict.Dictionary
	// WordsAround sets number of words retained before and after a
	// name-candidate.
	WordsAround int
	// ProgressNum determines how many titles should be processed for
	// a progress report.
	ProgressNum int
}
HTindex detects occurrences of scientific names in HathiTrust data.
func NewHTindex
NewHTindex creates an HTindex instance with several defaults. If any options are provided, they override the default settings.
type Option
type Option func(h *HTindex)
Option sets the type for all options received during creation of a new HTindex instance.
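The Option type follows the functional options pattern. The sketch below is a minimal, self-contained illustration of that pattern, not the package's actual code: the HTindex struct is a trimmed local stand-in (the Dict field is left out), the default values are invented, and the argument types of OptRoot and OptWordsAround as well as the NewHTindex signature are assumptions, since this page does not show them.

```go
package main

import "fmt"

// HTindex is a local stand-in for the documented struct.
// The Dict field is omitted so the sketch has no dependencies.
type HTindex struct {
	RootPrefix  string
	InputPath   string
	OutputPath  string
	JobsNum     int
	WordsAround int
	ProgressNum int
}

// Option mirrors the documented type: a function that mutates an HTindex.
type Option func(h *HTindex)

// OptRoot is assumed to take the prefix as a string.
func OptRoot(s string) Option {
	return func(h *HTindex) { h.RootPrefix = s }
}

// OptWordsAround is assumed to take the word count as an int.
func OptWordsAround(n int) Option {
	return func(h *HTindex) { h.WordsAround = n }
}

// NewHTindex sets hypothetical defaults, then applies the options in order,
// so each option overrides the default it touches.
func NewHTindex(opts ...Option) *HTindex {
	h := &HTindex{JobsNum: 4, WordsAround: 5}
	for _, opt := range opts {
		opt(h)
	}
	return h
}

func main() {
	h := NewHTindex(OptRoot("/data/hathi"), OptWordsAround(3))
	fmt.Println(h.RootPrefix, h.JobsNum, h.WordsAround)
}
```

Fields that no option touches keep their defaults, which is why JobsNum stays at the (hypothetical) default in this run.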
func OptInput
OptInput is an absolute path to the input data file. Each line of that file gives the path to the zipped file of one title.
func OptOutput
OptOutput is an absolute path to the directory where results will be written. If the directory does not already exist, it is created during initialization of the HTindex instance.
func OptProgressNum (added in v0.0.2)
OptProgressNum sets how often a progress line is printed. When it is set to 1, a report line appears after every processed title; when it is set to 10, progress is shown after every 10th title.
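The reporting rule described here amounts to a modulo check on the running title count. The sketch below shows one way to express it; shouldReport is a hypothetical helper, and treating a value of 0 or less as "reporting disabled" is an assumption that this page does not confirm.

```go
package main

import "fmt"

// shouldReport is a hypothetical helper: it reports whether a progress line
// is due after count titles have been processed.
// A progressNum of 0 or less is assumed to disable reporting.
func shouldReport(count, progressNum int) bool {
	return progressNum > 0 && count%progressNum == 0
}

func main() {
	const progressNum = 10
	for count := 1; count <= 25; count++ {
		if shouldReport(count, progressNum) {
			fmt.Printf("processed %d titles\n", count)
		}
	}
}
```

With progressNum set to 10 and 25 titles, the loop prints a line after the 10th and the 20th title.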
func OptRoot
OptRoot sets the prefix of the path to zipped titles. It will be concatenated with a path provided in the input file to form the complete absolute path.
func OptWordsAround (added in v0.0.7)
OptWordsAround sets number of words retained before and after a name-candidate.