Version: v0.5.0

This package is not in the latest version of its module.

Published: Jun 23, 2018 License: MIT Imports: 12 Imported by: 3


Global Names Finder

Finds scientific names using dictionary and nlp approaches.

Usage as a command line tool

Download the binary executable for your operating system from the latest release. To see flags and usage:

gnfinder --help

Usage as a library

go get
go get
go get
# To update dictionaries if they are changed
cd $GOPATH/src/
go generate
import (

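// Assumes LoadDictionary returns a Dictionary value. Go cannot take the
// address of a call result directly, so store it in a variable first.
d := dict.LoadDictionary()
bytesText := []byte(utfText)

// opts is a slice of options, so it is expanded with "..." for the variadic parameter.
jsonNames := FindNamesJSON(bytesText, &d, opts...)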

To install latest gnfinder

go get
cd $GOPATH/src/
gnfinder -h

Install ginkgo, a BDD testing framework for Go.

go get
go get

To run the tests, go to the root directory of the project and run:

go test


func FindNamesJSON

func FindNamesJSON(data []byte, dict *dict.Dictionary,
	opts ...util.Opt) []byte

FindNamesJSON takes a text and returns the scientific names found in it, as well as tokens, encoded as JSON.


type Meta

type Meta struct {
	// Date represents time when output was generated.
	Date time.Time `json:"date"`
	// Language of the document
	Language string `json:"language"`
	// TotalTokens is a number of 'normalized' words in the text
	TotalTokens int `json:"total_words"`
	// TotalNameCandidates is a number of words that might be a start of
	// a scientific name
	TotalNameCandidates int `json:"total_candidates"`
	// TotalNames is a number of scientific names found
	TotalNames int `json:"total_names"`
	// CurrentName (optional) is the index of the names array that designates a
	// "position of a cursor". It is used by programs like gntagger that allow
	// to work on the list of found names interactively.
	CurrentName int `json:"current_index,omitempty"`
}

Meta contains meta-information of name-finding result.

type Name

type Name struct {
	Type        string               `json:"type"`
	Verbatim    string               `json:"verbatim"`
	Name        string               `json:"name"`
	Odds        float64              `json:"odds,omitempty"`
	OddsDetails token.OddsDetails    `json:"odds_details,omitempty"`
	OffsetStart int                  `json:"start"`
	OffsetEnd   int                  `json:"end"`
	Annotation  string               `json:"annotation"`
	Validation  *resolver.NameOutput `json:"validation"`
}

Name represents one found name.
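
The documentation does not state how OffsetStart and OffsetEnd are measured. Since FindNames and TokensToName both take the text as []rune, a plausible reading is that the offsets are rune indices with an exclusive end. The sketch below works under that assumption only; span and verbatimAt are hypothetical names introduced for the illustration.

```go
package main

import "fmt"

// span is a hypothetical stand-in for the OffsetStart/OffsetEnd pair of Name.
type span struct {
	Start int
	End   int
}

// verbatimAt returns the runes in [Start, End) as a string, or "" when the
// span does not fit the text. It ASSUMES rune offsets with an exclusive end.
func verbatimAt(text []rune, s span) string {
	if s.Start < 0 || s.End > len(text) || s.Start > s.End {
		return ""
	}
	return string(text[s.Start:s.End])
}

func main() {
	// The accented letters take two bytes each in UTF-8, so byte offsets and
	// rune offsets diverge: the name starts at rune 5 but at byte 7.
	src := "Été: Pardosa moesta"
	text := []rune(src)

	fmt.Println(verbatimAt(text, span{Start: 5, End: 19}))
	fmt.Println(len(src), len(text)) // byte length vs rune length
}
```

If the offsets turn out to be byte positions instead, the same slicing would have to be done on the original []byte rather than on the []rune slice.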

func TokensToName

func TokensToName(ts []token.Token, text []rune) Name

type OddsDatum

type OddsDatum struct {
	Name bool
	Odds float64
}

OddsDatum is a simplified version of a name, that stores boolean decision (Name/NotName), and corresponding odds of the name.

type Output

type Output struct {
	Meta  `json:"metadata"`
	Names []Name `json:"names"`
}

Output type is the result of name-finding.

func CollectOutput

func CollectOutput(ts []token.Token, text []rune, m *util.Model) Output

CollectOutput takes tagged tokens and assembles gnfinder output out of them.

func FindNames

func FindNames(text []rune, d *dict.Dictionary, opts ...util.Opt) Output

FindNames traverses a text and finds scientific names in it.

func NewOutput

func NewOutput(names []Name, ts []token.Token, m *util.Model) Output

NewOutput is a constructor for Output type.

func (*Output) FromJSON

func (o *Output) FromJSON(data []byte)

FromJSON converts the JSON representation of Output to an Output object.

func (*Output) ToJSON

func (o *Output) ToJSON() []byte

ToJSON converts Output to JSON representation.


Path    Synopsis
dict    Package dict contains dictionaries for finding scientific names.
token   Package token deals with breaking a text into tokens.
util    Package util contains useful shared functions.
