Documentation
Index ¶
- Variables
- func DetectAbbreviated(parsed *pb.Parsed) *protob.Result
- type MatchResult
- type MatchTask
- type Matcher
- func (m Matcher) Match(ns NameString) *protob.Result
- func (m Matcher) MatchFuzzy(name, stem string, ns NameString, kv *badger.DB) *protob.Result
- func (m Matcher) MatchPartial(ns NameString, kv *badger.DB) *protob.Result
- func (m Matcher) MatchVirus(ns NameString) *protob.Result
- func (m Matcher) MatchWorker(chIn <-chan MatchTask, chOut chan<- MatchResult, wg *sync.WaitGroup, ...)
- type Multinomial
- type NameString
- type Partial
Constants ¶
This section is empty.
Variables ¶
var (
	// GNUUID is a UUID seed made from 'globalnames.org' domain to generate
	// UUIDv5 identifiers.
	GNUUID = uuid.NewV5(uuid.NamespaceDNS, "globalnames.org")
)
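To make the role of the seed concrete, the following is a minimal sketch of UUID v5 derivation as described in RFC 4122, written against the standard library only. The package itself calls a third-party `uuid.NewV5`; the names `uuid`, `namespaceDNS`, and `newV5` below are hypothetical stand-ins, and using the seed as the namespace for name-string IDs is an assumption based on the comment above.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// uuid is a hypothetical 16-byte RFC 4122 UUID type used only in this sketch.
type uuid [16]byte

// namespaceDNS is the RFC 4122 DNS namespace, 6ba7b810-9dad-11d1-80b4-00c04fd430c8.
var namespaceDNS = uuid{
	0x6b, 0xa7, 0xb8, 0x10, 0x9d, 0xad, 0x11, 0xd1,
	0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8,
}

// newV5 hashes the namespace bytes followed by the name with SHA-1, keeps the
// first 16 bytes, and then sets the version (5) and RFC 4122 variant bits.
func newV5(ns uuid, name string) uuid {
	h := sha1.New()
	h.Write(ns[:])
	h.Write([]byte(name))
	var u uuid
	copy(u[:], h.Sum(nil))
	u[6] = (u[6] & 0x0f) | 0x50 // version 5
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant
	return u
}

// String renders the UUID in the usual 8-4-4-4-12 hexadecimal layout.
func (u uuid) String() string {
	return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16])
}

func main() {
	// The seed is derived from the domain name within the DNS namespace.
	seed := newV5(namespaceDNS, "globalnames.org")
	// Assumption: identifiers for name-strings use the seed as their namespace.
	id := newV5(seed, "Homo sapiens")
	fmt.Println(seed, id)
}
```

Because the derivation is a pure function of the namespace and the name, the same name-string always yields the same identifier, with no registry or counter required.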
Functions ¶
Types ¶
type MatchResult ¶ added in v0.3.0
type MatchTask ¶ added in v0.3.0
MatchTask contains a name to be matched and an index where it should be located in an array.
type Matcher ¶
Matcher contains data and functions necessary for exact, fuzzy and partial matching of scientific names.
func NewMatcher ¶
NewMatcher creates a new instance of Matcher struct.
func (Matcher) Match ¶
func (m Matcher) Match(ns NameString) *protob.Result
Match tries to match a canonical form of a name-string exactly to a canonical form from the gnames database.
func (Matcher) MatchFuzzy ¶
func (m Matcher) MatchFuzzy(name, stem string, ns NameString, kv *badger.DB) *protob.Result
MatchFuzzy tries to do fuzzy matching of a stemmed name-string to canonical forms from the gnames database.
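The documentation does not say which fuzzy-matching algorithm MatchFuzzy uses. As a generic illustration only, the sketch below uses plain Levenshtein edit distance to pick the closest candidate within a threshold. The functions `editDistance` and `fuzzyMatch`, the threshold, and the sample stems are all hypothetical and are not part of this package.

```go
package main

import "fmt"

// min3 returns the smallest of three integers.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// editDistance computes the Levenshtein distance between a and b using two
// rows of the dynamic-programming table.
func editDistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(rb)]
}

// fuzzyMatch returns the candidate closest to stem, provided its distance
// does not exceed maxDist. Ties are resolved in favor of the earlier candidate.
func fuzzyMatch(stem string, candidates []string, maxDist int) (string, bool) {
	best, bestDist := "", maxDist+1
	for _, c := range candidates {
		if d := editDistance(stem, c); d < bestDist {
			best, bestDist = c, d
		}
	}
	return best, bestDist <= maxDist
}

func main() {
	// Hypothetical stems; real stems come from the CanonicalStem field.
	m, ok := fuzzyMatch("bubo bubb", []string{"pica pic", "bubo bub"}, 1)
	fmt.Println(m, ok)
}
```

A linear scan like this is only practical for small candidate sets; a production matcher would need an index to narrow the candidates first.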
func (Matcher) MatchPartial ¶
func (m Matcher) MatchPartial(ns NameString, kv *badger.DB) *protob.Result
MatchPartial tries to match all partial variants of a name-string. The process stops as soon as a match is found.
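The "stop at the first match" behavior can be sketched as a simple loop with an early return. The function `firstMatch`, the `found` callback, and the order of the variants are hypothetical; the documentation does not state the order in which MatchPartial tries the variants.

```go
package main

import "fmt"

// firstMatch tries each variant in order and returns the first one for which
// found reports true. Later variants are never looked up.
func firstMatch(variants []string, found func(string) bool) (string, bool) {
	for _, v := range variants {
		if found(v) {
			return v, true
		}
	}
	return "", false
}

func main() {
	// A hypothetical in-memory set standing in for the real lookup.
	known := map[string]bool{"Aus bus": true}
	lookups := 0
	v, ok := firstMatch(
		[]string{"Aus cus", "Aus bus", "Aus"},
		func(s string) bool { lookups++; return known[s] },
	)
	fmt.Println(v, ok, lookups)
}
```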
func (Matcher) MatchVirus ¶
func (m Matcher) MatchVirus(ns NameString) *protob.Result
MatchVirus tries to match a name-string exactly to a virus name from the gnames database.
func (Matcher) MatchWorker ¶ added in v0.3.0
func (m Matcher) MatchWorker(chIn <-chan MatchTask, chOut chan<- MatchResult, wg *sync.WaitGroup, kv *badger.DB)
MatchWorker takes name-strings from the chIn channel, matches them, and sends the results to the chOut channel.
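The channel-and-WaitGroup shape of MatchWorker suggests a conventional worker pool. The sketch below shows one way such a pool could be wired, assuming each task carries the index of its name so that results can be placed back in input order. The types `task` and `result`, the functions `worker` and `matchAll`, and the map-based lookup are hypothetical stand-ins; the fields of the real MatchTask and MatchResult are not listed here.

```go
package main

import (
	"fmt"
	"sync"
)

// task and result are hypothetical stand-ins for MatchTask and MatchResult.
type task struct {
	Index int
	Name  string
}

type result struct {
	Index   int
	Matched bool
}

// worker drains chIn, "matches" each name against a read-only set, and sends
// the outcome to chOut. It signals wg when chIn is closed and drained.
func worker(chIn <-chan task, chOut chan<- result, wg *sync.WaitGroup, known map[string]bool) {
	defer wg.Done()
	for t := range chIn {
		chOut <- result{Index: t.Index, Matched: known[t.Name]}
	}
}

// matchAll fans names out to the given number of workers (assumed >= 1) and
// collects the results in input order using the index carried by each task.
func matchAll(names []string, known map[string]bool, workers int) []bool {
	chIn := make(chan task)
	chOut := make(chan result)
	var wg sync.WaitGroup

	wg.Add(workers)
	for i := 0; i < workers; i++ {
		go worker(chIn, chOut, &wg, known)
	}

	// Feed tasks, then close chIn so that the workers can finish.
	go func() {
		for i, n := range names {
			chIn <- task{Index: i, Name: n}
		}
		close(chIn)
	}()

	// Close chOut once every worker has returned.
	go func() {
		wg.Wait()
		close(chOut)
	}()

	out := make([]bool, len(names))
	for r := range chOut {
		out[r.Index] = r.Matched
	}
	return out
}

func main() {
	known := map[string]bool{"Homo sapiens": true, "Pica pica": true}
	fmt.Println(matchAll([]string{"Homo sapiens", "Xus yus", "Pica pica"}, known, 2))
}
```

Carrying the index in each task means the collector does not depend on the order in which workers finish.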
type Multinomial ¶
type Multinomial struct {
	// Tail is genus + the last epithet.
	Tail string
	// Head is the name without the last epithet.
	Head string
}
Multinomial contains multinomial names that were constructed from an 'infraspecific' name-string.
type NameString ¶
type NameString struct {
	// ID is UUID v5 generated from the verbatim name-string.
	ID string
	// Name is a verbatim name-string.
	Name string
	// Cardinality is the apparent number of elements in a name. Uninomial
	// corresponds to cardinality 1, binomial to 2, trinomial to 3 etc.
	Cardinality int
	// Canonical is the simplest most common version of a canonical form of
	// a name-string.
	Canonical string
	// CanonicalID is UUID v5 generated from the Canonical field.
	CanonicalID string
	// CanonicalFull is a canonical form that also contains infraspecific ranks
	// and hybrid signs for named hybrids.
	CanonicalFull string
	// CanonicalFullID is UUID v5 generated from the CanonicalFull field.
	CanonicalFullID string
	// CanonicalStem is a version of the Canonical field with suffixes removed
	// and characters substituted according to rules of Latin grammar.
	CanonicalStem string
	// Partial contains truncated versions of the Canonical form. It is important
	// for matching names that could not be matched for all specific epithets.
	Partial *Partial
}
NameString stores input data for doing exact, fuzzy, exact partial, and fuzzy partial matching. It is created by parsing a name-string and storing its semantic elements.
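As a rough illustration of the Cardinality field, the sketch below counts the words of a simple canonical form. This is an approximation under the assumption that the canonical form is a space-separated list of name elements; the function `cardinality` is hypothetical, and the package presumably obtains the real value from its name parser.

```go
package main

import (
	"fmt"
	"strings"
)

// cardinality approximates the number of name elements by counting the
// whitespace-separated words of a simple canonical form.
func cardinality(canonical string) int {
	return len(strings.Fields(canonical))
}

func main() {
	for _, c := range []string{"Aus", "Aus bus", "Aus bus cus"} {
		fmt.Println(c, cardinality(c))
	}
}
```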
func NewNameString ¶
NewNameString creates a new instance of NameString.
type Partial ¶
type Partial struct {
	// Genus is a truncated canonical form with all specific epithets removed.
	Genus string
	// Multinomials are truncated canonical forms where one or more specific
	// epithets are removed.
	Multinomials []Multinomial
}
Partial stores truncated versions of a 'canonical' name-string.
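The relationship between Genus, Head, and Tail can be sketched on a placeholder name such as "Aus bus cus". The code below is an illustration under stated assumptions, not the package's implementation: the types `partial` and `multinomial` and the function `newPartial` are hypothetical, the input is assumed to be a space-separated simple canonical form, and the scheme of peeling off one trailing epithet at a time is a guess at how several multinomials could arise from one name.

```go
package main

import (
	"fmt"
	"strings"
)

// multinomial and partial mirror the documented field names only.
type multinomial struct {
	Tail string // genus + the last epithet
	Head string // the name without the last epithet
}

type partial struct {
	Genus        string
	Multinomials []multinomial
}

// newPartial builds truncated variants of a canonical form. It returns nil
// for uninomials, which have nothing to truncate.
func newPartial(canonical string) *partial {
	words := strings.Fields(canonical)
	if len(words) < 2 {
		return nil
	}
	p := &partial{Genus: words[0]}
	// Assumption: drop one trailing epithet per step while the remaining
	// name still has more than two elements.
	for n := len(words); n > 2; n-- {
		p.Multinomials = append(p.Multinomials, multinomial{
			Tail: words[0] + " " + words[n-1],
			Head: strings.Join(words[:n-1], " "),
		})
	}
	return p
}

func main() {
	p := newPartial("Aus bus cus")
	fmt.Printf("%+v\n", *p)
}
```

For "Aus bus cus" this yields the genus "Aus" and a single multinomial with Tail "Aus cus" and Head "Aus bus", which matches the field comments on Multinomial.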