Global Names Finder
Finds scientific names using dictionary and NLP approaches.
Features
- Multiplatform packages (Linux, Windows, Mac OS X).
- Self-contained, with no external dependencies: only the gnfinder (or
gnfinder.exe) binary (~15 MB) is needed. An internet connection is,
however, required for name verification.
- Takes UTF8-encoded text and returns JSON-formatted output that contains
the detected scientific names.
- Optionally detects the language of the text automatically and adjusts the
Bayes algorithm for that language. English and German are currently
supported.
- Uses complementary heuristic and natural language processing algorithms.
- Optionally verifies found names against multiple biodiversity databases using
the gnindex service.
- Detection of nomenclatural annotations like sp. nov., comb. nov.,
ssp. nov. and their variants.
- Ability to see words that surround detected name-strings.
- The library can be used concurrently to significantly improve speed.
On a server with 40 threads it is able to detect names in 50 million pages
in approximately 3 hours using both heuristic and Bayes algorithms. See the
bhlindex project for an example.
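The concurrency claim above can be illustrated with a small worker pool. This is only a sketch under stated assumptions: findNames is a hypothetical placeholder that counts capitalized words, standing in for a real finder call, so that the example runs without the gnfinder dependency. Whether a single finder instance can be shared between goroutines or each worker needs its own is not stated in this document, so check that before adapting the pattern.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// findNames is a hypothetical stand-in for a real name-finding call.
// It only counts words that start with an ASCII capital letter.
func findNames(text string) int {
	count := 0
	for _, w := range strings.Fields(text) {
		if w[0] >= 'A' && w[0] <= 'Z' {
			count++
		}
	}
	return count
}

// processPages fans pages out to a fixed number of workers and returns
// one result per page, in the same order as the input.
func processPages(pages []string, workers int) []int {
	if workers < 1 {
		workers = 1
	}
	results := make([]int, len(pages))
	jobs := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each job writes to its own slice index, so no lock is needed.
				results[i] = findNames(pages[i])
			}
		}()
	}
	for i := range pages {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	pages := []string{
		"Pomatomus saltator and Parus major",
		"no names on this page",
		"Mytilus edulis",
	}
	fmt.Println(processPages(pages, 2)) // prints [2 0 1]
}
```

Sending page indexes rather than page contents through the channel keeps the results in input order without any extra sorting step.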
Install as a command line app
Download the binary executable for your operating system from the
latest release.
Linux or OS X
Move the gnfinder executable somewhere in your PATH
(for example /usr/local/bin):
sudo mv path_to/gnfinder /usr/local/bin
Windows
One possible way is to create a default folder for executables and place
gnfinder.exe there.
Press Windows+R and type "cmd". In the terminal window that appears, type:
mkdir C:\bin
copy path_to\gnfinder.exe C:\bin
Add the C:\bin directory to your PATH environment variable.
Go
Install Go >= v1.16
git clone git@github.com:/gnames/gnfinder
cd gnfinder
make tools
make install
Usage
Usage as a command line app
To see flags and usage:
gnfinder --help
# or just
gnfinder
To see the version of its binary:
gnfinder -V
Examples:
Getting data from a pipe forcing English language and verification
echo "Pomatomus saltator and Parus major" | gnfinder -v -l eng
echo "Pomatomus saltator and Parus major" | gnfinder --verify --lang eng
Displaying matches from NCBI and Encyclopedia of Life, if they exist. For
the list of data source IDs, see gnverifier's data sources page.
echo "Pomatomus saltator and Parus major" | gnfinder -v -l eng -s "4,12"
echo "Pomatomus saltator and Parus major" | gnfinder --verify --lang eng --sources "4,12"
Adjusting prior odds using information about found names. The odds are
calculated as "found names number / (capitalized words number - found names
number)". Such an adjustment decreases the odds for texts with very few names
and increases them for texts with many found names.
gnfinder -a -d -f pretty file_with_names.txt
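As a worked illustration of the formula above, the following sketch computes the ratio for a text with few names and for one with many. The function name priorOdds and its handling of a non-positive denominator are assumptions made for this example; they are not taken from the gnfinder source.

```go
package main

import "fmt"

// priorOdds applies the formula quoted in the text:
// found names / (capitalized words - found names).
// Returning ok=false for a non-positive denominator is an assumption of
// this sketch, not documented gnfinder behavior.
func priorOdds(foundNames, capitalizedWords int) (odds float64, ok bool) {
	rest := capitalizedWords - foundNames
	if rest <= 0 {
		return 0, false
	}
	return float64(foundNames) / float64(rest), true
}

func main() {
	// Few names: 2 found among 102 capitalized words.
	sparse, _ := priorOdds(2, 102)
	// Many names: 20 found among 100 capitalized words.
	dense, _ := priorOdds(20, 100)
	fmt.Printf("sparse: %.2f, dense: %.2f\n", sparse, dense)
	// prints "sparse: 0.02, dense: 0.25"
}
```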
Returning 5 words before and after each found name-candidate.
gnfinder -w 5 file_with_names.txt
gnfinder --words-around 5 file_with_names.txt
Getting data from a file and redirecting result to another file
gnfinder file1.txt > file2.json
Detection of nomenclatural annotations
echo "Parus major sp. n." | gnfinder
Usage as a library
import (
	"fmt"

	"github.com/gnames/gnfinder"
	"github.com/gnames/gnfinder/ent/nlp"
	"github.com/gnames/gnfinder/io/dict"
)

func Example() {
	txt := `Blue Adussel (Mytilus edulis) grows to about two
inches the first year,Pardosa moesta Banks, 1892`
	// Build a finder from the default config, dictionary and Bayes weights.
	cfg := gnfinder.NewConfig()
	dictionary := dict.LoadDictionary()
	weights := nlp.BayesWeights()
	gnf := gnfinder.New(cfg, dictionary, weights)
	res := gnf.Find(txt)
	// Print the first detected name with its offsets in the text.
	name := res.Names[0]
	fmt.Printf(
		"Name: %s, start: %d, end: %d",
		name.Name,
		name.OffsetStart,
		name.OffsetEnd,
	)
	// Output:
	// Name: Mytilus edulis, start: 13, end: 29
}
Usage as a docker container
docker pull gnames/gnfinder
# run gnfinder server, and map it to port 8888 on the host machine
docker run -d -p 8888:8778 --name gnfinder gnames/gnfinder
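Once the container is running, the server can be queried over HTTP on the mapped port. The sketch below shows one way to do that from Go. The endpoint path /api/v1/find and the JSON field "text" are assumptions that this document does not confirm, and newFindRequest is a helper invented for this example, so check the project's OpenAPI documentation for the actual request format.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// findPath is an ASSUMED endpoint path; verify it against the API docs.
const findPath = "/api/v1/find"

// newFindRequest builds a POST request carrying the text as JSON.
// The "text" field name is also an assumption of this sketch.
func newFindRequest(baseURL, text string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"text": text})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+findPath, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Port 8888 is the host port mapped in the docker run command.
	req, err := newFindRequest("http://localhost:8888", "Pomatomus saltator and Parus major")
	if err != nil {
		fmt.Println("cannot build request:", err)
		return
	}
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("request failed (is the container running?):", err)
		return
	}
	defer resp.Body.Close()

	out, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("cannot read response:", err)
		return
	}
	fmt.Println(resp.Status)
	fmt.Println(string(out))
}
```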
Development
To install the latest gnfinder:
git clone git@github.com:/gnames/gnfinder
cd gnfinder
make tools
make install
Modify OpenAPI documentation
docker run -d -p 80:8080 swaggerapi/swagger-editor
Testing
From the root of the project:
make tools
# run make install for CLI testing
make install
To run the tests, go to the root directory of the project and run:
go test ./...
# or
make test