biodv

package module
v0.0.0-...-e2aaca8 Latest
Warning

This package is not in the latest version of its module.

Published: Sep 28, 2018 License: BSD-2-Clause Imports: 10 Imported by: 0

README

Biodv

Biodv is a collection of packages and tools for management and analysis of biodiversity data.

Database organization

Biodv code and tools require the data to be organized in a specific directory and file hierarchy within the current working path (which is defined here as a “project”). Each sub-directory stores the data in one or more files using the stanza format.

Here is an example of the sub-directories of a project:

my-project/
	taxonomy/
		taxonomy.stz
	specimens/
		taxon-a.stz
		taxon-b.stz
		synonym-a.stz
		...
	sources/
		biblio.stz
		datasets.stz
		sources.stz
	geography/
		geography.stz

Not all sub-directories are required; each one is created only when a project needs it.

Both the tools and the biodv packages look for the database files automatically, so most of the time a user or a developer does not need to know how each file is named or where it is stored.

Stanza format

In biodv, data is stored using the stanza format. A stanza file is a structured UTF-8 text file that follows these rules:

  1. Each line containing a field must start with the field name, separated from its content by a colon ( : ) character. If the field name ends with a new line rather than a colon, the field is considered empty.
  2. Field names are case insensitive (always read as lowercase), contain no spaces, and should be unique.
  3. A field ends with a new line. If the content of a field extends over more than one line, each continuation line should start with at least one space or tab character.
  4. A record ends with a line that starts with a percent sign ( % ) character. Any character after the percent sign is ignored (usually %% is used to make the end of a record more visible).
  5. Lines starting with the sharp ( # ) character are taken as comments.
  6. Empty lines are ignored.

Here is a simple example of a stanza file containing taxonomic data:

# example dataset
name:	Homo
rank:	genus
correct: true
%%
name:	Pan
rank:	genus
correct: true
%%
name:	Homo sapiens
parent:	Homo
rank:	species
correct: true
%%
name:	Homo erectus
parent:	Homo
rank:	species
correct: true
%%
name:	Pithecanthropus
parent:	Homo
rank:	genus
correct: false
%%
name:	Sinanthropus pekinensis
parent:	Homo erectus
rank:	species
correct: false
%%
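
The rules above can be sketched as a small reader. This is only an illustration written against the six rules, not the reader from the stanza package; the function name parseStanza, the choice of a map per record, and the lenient handling of an unterminated last record are all assumptions.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseStanza is a hypothetical helper that reads stanza text
// and returns one map (field name -> content) per record.
func parseStanza(text string) []map[string]string {
	var recs []map[string]string
	rec := make(map[string]string)
	last := "" // name of the field that a continuation line extends

	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.TrimSpace(line) == "":
			// Rule 6: empty lines are ignored.
		case strings.HasPrefix(line, "#"):
			// Rule 5: comment line.
		case strings.HasPrefix(line, "%"):
			// Rule 4: end of record.
			if len(rec) > 0 {
				recs = append(recs, rec)
			}
			rec = make(map[string]string)
			last = ""
		case line[0] == ' ' || line[0] == '\t':
			// Rule 3: continuation of the previous field.
			if last != "" {
				rec[last] = strings.TrimSpace(rec[last] + " " + strings.TrimSpace(line))
			}
		default:
			// Rules 1 and 2: "name: content", name read as lowercase.
			name, content, _ := strings.Cut(line, ":")
			last = strings.ToLower(strings.TrimSpace(name))
			rec[last] = strings.TrimSpace(content)
		}
	}
	// Assumption: a last record without a closing % line is still kept.
	if len(rec) > 0 {
		recs = append(recs, rec)
	}
	return recs
}

func main() {
	data := "# example dataset\nname:\tHomo\nrank:\tgenus\ncorrect: true\n%%\n" +
		"name:\tHomo sapiens\nparent:\tHomo\nrank:\tspecies\ncorrect: true\n%%\n"
	for _, r := range parseStanza(data) {
		fmt.Printf("%s (%s)\n", r["name"], r["rank"])
	}
}
```

A map per record keeps the sketch short; it silently keeps the last value when a field name is repeated, which a real reader may prefer to report.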

The stanza file format was inspired by the record-jar/stanza format described by E. Raymond in "The Art of UNIX Programming" (2003, Addison-Wesley) and by C. Strozzi in "NoSQL list format" (2007).

Authorship and license

Biodv is distributed under a BSD 2-Clause license that can be found in the LICENSE file. For a list of authors, see the AUTHORS file.

Documentation

Overview

Package biodv contains main interfaces and types for a basic biodiversity database.

Constants

const (
	TaxAuthor = "author"    // Author of the taxon name
	TaxExtern = "extern"    // Extern IDs
	TaxRef    = "reference" // A bibliographic reference
	TaxSource = "source"    // Source of taxonomic data
)

Common keys used for a Taxon.

const (
	RecRef      = "reference"  // A bibliographic reference
	RecDataset  = "dataset"    // Source of the specimen data
	RecCatalog  = "catalog"    // Museum catalog number
	RecDeterm   = "determiner" // The person who identified the specimen
	RecExtern   = "extern"     // Extern IDs
	RecComment  = "comment"    // A free text comment
	RecOrganism = "organism"   // An ID of the organism
	RecSex      = "sex"        // Sex of the organism
	RecStage    = "stage"      // Life stage of the organism
)

Common keys used for Record.

const (
	SetAboutKey  = "about"     // A text description of the dataset
	SetExtern    = "extern"    // Extern IDs
	SetRef       = "reference" // A bibliographic reference
	SetLicense   = "license"   // License used for the data
	SetURLKey    = "url"       // Homepage of the dataset
	SetPublisher = "publisher" // The organization that publishes the dataset
)

Common keys used for Dataset.

Variables

This section is empty.

Functions

func GzAbout

func GzAbout(driver string) string

GzAbout returns a short message describing the driver.

func GzDrivers

func GzDrivers() []string

GzDrivers returns a sorted list of names of the registered gazetteer drivers.

func ParseDriverString

func ParseDriverString(str string) (driver, param string)

ParseDriverString separates a driver and its parameter if it is set in the form <driver>:<param>.
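
As a rough illustration of the <driver>:<param> form, the split can be sketched with strings.Cut. The helper splitDriver is hypothetical and is not the package's implementation; in particular, returning the whole string as the driver when no colon is present, and cutting at the first colon only, are assumptions. The "example" driver name and its parameter are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDriver is a hypothetical stand-in for ParseDriverString.
// It cuts at the first colon, so the parameter may itself contain colons.
func splitDriver(str string) (driver, param string) {
	driver, param, _ = strings.Cut(str, ":")
	return driver, param
}

func main() {
	for _, s := range []string{"gbif", "example:some-param"} {
		d, p := splitDriver(s)
		fmt.Printf("driver=%q param=%q\n", d, p)
	}
	// Prints:
	// driver="gbif" param=""
	// driver="example" param="some-param"
}
```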

func RecAbout

func RecAbout(driver string) string

RecAbout returns the short message describing the driver.

func RecDrivers

func RecDrivers() []string

RecDrivers returns a sorted list of names of the registered drivers.

func RecURL

func RecURL(driver, id string) string

RecURL returns the URL of a given Record ID in a given database.

func RegisterGz

func RegisterGz(name string, driver GzDriver)

RegisterGz makes a Gazetteer driver available by the provided name. If RegisterGz is called twice with the same name or if the driver is nil, it panics.

func RegisterRec

func RegisterRec(name string, driver RecDriver)

RegisterRec makes a RecDB driver available by the provided name. If RegisterRec is called twice with the same name or if the driver is nil, it panics.

func RegisterSet

func RegisterSet(name string, driver SetDriver)

RegisterSet makes a SetDB driver available by the provided name. If RegisterSet is called twice with the same name or if the driver is nil, it panics.

func RegisterTax

func RegisterTax(name string, driver TaxDriver)

RegisterTax makes a taxonomy driver available by the provided name. If RegisterTax is called twice with the same name or if the driver is nil, it panics.
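
The four Register functions describe the same pattern: a name-indexed registry that rejects duplicates and nil drivers. A minimal sketch of that pattern follows. The driver type, the register and names functions, and the reading of "nil driver" as "nil Open function" are assumptions made for the example, not the package's code.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// driver is a hypothetical, simplified driver.
// Open returns a string here as a stand-in for a database handle.
type driver struct {
	Open  func(param string) (string, error)
	About func() string
}

var (
	mu      sync.RWMutex
	drivers = make(map[string]driver)
)

// register makes a driver available by the provided name.
// It panics on a duplicated name or on a driver without Open.
func register(name string, d driver) {
	mu.Lock()
	defer mu.Unlock()
	if d.Open == nil {
		panic("register: driver is nil")
	}
	if _, dup := drivers[name]; dup {
		panic("register: register called twice for driver " + name)
	}
	drivers[name] = d
}

// names returns a sorted list of the registered driver names.
func names() []string {
	mu.RLock()
	defer mu.RUnlock()
	ls := make([]string, 0, len(drivers))
	for n := range drivers {
		ls = append(ls, n)
	}
	sort.Strings(ls)
	return ls
}

func main() {
	open := func(param string) (string, error) { return "db:" + param, nil }
	register("zeta", driver{Open: open})
	register("alpha", driver{Open: open})
	fmt.Println(names()) // [alpha zeta]
}
```

Panicking at registration time is a common choice when drivers register themselves from an init function, because a duplicated name is a programming error rather than a runtime condition.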

func SetAbout

func SetAbout(driver string) string

SetAbout returns the short message describing the driver.

func SetDrivers

func SetDrivers() []string

SetDrivers returns a sorted list of names of the registered drivers.

func SetURL

func SetURL(driver, id string) string

SetURL returns the URL of a given dataset ID in a given database.

func TaxAbout

func TaxAbout(driver string) string

TaxAbout returns the short message describing the driver.

func TaxCanon

func TaxCanon(name string) string

TaxCanon converts a taxon name into its canonical form.

func TaxDrivers

func TaxDrivers() []string

TaxDrivers returns a sorted list of names of the registered drivers.

func TaxURL

func TaxURL(driver, id string) string

TaxURL returns the URL of a given taxon ID in a given database.

func TaxYear

func TaxYear(tx Taxon) int

TaxYear returns the year of the taxon description as given from the author field. If no year is defined, it will return 0.
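
One way to picture this behavior is to look for a four-digit number in the author string. The sketch below is an assumption about how such a lookup could work, not the package's implementation; the helper authorYear and the example author strings are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// yearRE matches a four-digit number standing on its own.
var yearRE = regexp.MustCompile(`\b\d{4}\b`)

// authorYear is a hypothetical helper that returns the first
// four-digit number found in an author string, or 0 if there is none.
func authorYear(author string) int {
	m := yearRE.FindString(author)
	if m == "" {
		return 0
	}
	y, err := strconv.Atoi(m)
	if err != nil {
		return 0
	}
	return y
}

func main() {
	fmt.Println(authorYear("Linnaeus, 1758"))    // 1758
	fmt.Println(authorYear("(Schneider, 1799)")) // 1799
	fmt.Println(authorYear("Linnaeus"))          // 0
}
```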

Types

type BasisOfRecord

type BasisOfRecord uint

BasisOfRecord indicates the physical basis of a specimen record.

const (
	UnknownBasis BasisOfRecord = iota
	Preserved                  // a preserved (museum) specimen
	Fossil                     // a fossilized specimen
	Observation                // a human observation
	Machine                    // a machine "observation"
)

Valid basis of record.

func GetBasis

func GetBasis(s string) BasisOfRecord

GetBasis returns a BasisOfRecord value from a string.

func (BasisOfRecord) String

func (b BasisOfRecord) String() string

String returns the basis string of a basis of record.

type CollectionEvent

type CollectionEvent struct {
	Date      time.Time
	Admin     geography.Admin
	Locality  string
	Collector string

	// Z is the z-coordinate,
	// in meters,
	// for the altitude
	// or the depth (as a negative value)
	// at which a flying
	// or an oceanic specimen is sampled.
	Z int
}

A CollectionEvent stores the information of a collection event for a record.

func (CollectionEvent) Country

func (c CollectionEvent) Country() string

Country returns the country name of the collection event.

func (CollectionEvent) CountryCode

func (c CollectionEvent) CountryCode() string

CountryCode returns the country code of the collection event.

func (CollectionEvent) County

func (c CollectionEvent) County() string

County returns the second administrative division of a country, for a collection event.

func (CollectionEvent) State

func (c CollectionEvent) State() string

State returns the first administrative division of a country, e.g. a US state, for a collection event.

type Dataset

type Dataset interface {
	// ID returns the ID of the current dataset.
	ID() string

	// Title returns the title of the dataset.
	Title() string

	// Keys returns a list of additional fields
	// stored in the dataset.
	Keys() []string

	// Value returns the value
	// of an additional field stored in the dataset.
	Value(key string) string
}

A Dataset is a museum collection, a published dataset, or any other source of data.

type Gazetteer

type Gazetteer interface {
	// Locate returns a set of points
	// for a given locality.
	Locate(adm geography.Admin, locality string) *GeoScan

	// Reverse returns the administrative data
	// for a given point.
	// It is not implemented by all gazetteers.
	Reverse(p geography.Position) (geography.Admin, error)
}

A Gazetteer is a georeferencing service.

func OpenGz

func OpenGz(driver, param string) (Gazetteer, error)

OpenGz opens a Gazetteer by its driver, and a driver specific parameter string.

type GeoScan

type GeoScan struct {
	// contains filtered or unexported fields
}

A GeoScan is a georeferenced position scanner to stream the results of a query that is expected to produce a list of georeferenced positions.

Use Scan to advance the stream:

sc := gz.Locate(adm, "las pavas")
for sc.Scan() {
	p := sc.Position()
	...
}
if err := sc.Err(); err != nil {
	...	// process the error
}

func NewGeoScan

func NewGeoScan(sz int) *GeoScan

NewGeoScan creates a georeferenced position scanner, with a buffer of the indicated size.

func (*GeoScan) Add

func (gsc *GeoScan) Add(p geography.Position, err error) bool

Add adds a position or an error to a GeoScan. It should be used by clients that return the scanner.

It returns true, if the element is added successfully.

func (*GeoScan) Close

func (gsc *GeoScan) Close()

Close closes the scanner. If Scan is called and returns false the scanner is closed automatically.

func (*GeoScan) Err

func (gsc *GeoScan) Err() error

Err returns the error, if any, that was encountered during iteration.

func (*GeoScan) Position

func (gsc *GeoScan) Position() geography.Position

Position returns the last read position. Every call to Position must be preceded by a call to Scan.

func (*GeoScan) Scan

func (gsc *GeoScan) Scan() bool

Scan advances the scanner to the next result. It returns false when there are no more positions, or when an error happens while preparing the next one. Err should be consulted to distinguish between the two cases.

Every call to Position, even the first one, must be preceded by a call to Scan.
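
The Add, Scan, Err and Close methods describe a buffered producer/consumer stream. A minimal sketch of how such a scanner could be built on channels is shown here. It streams strings instead of geography.Position values, and the scanner type, the Value method, and the Finish method (used by the producer to signal the end of the data) are assumptions of the example, not part of the documented API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDemo is a made-up error used only by this example.
var errDemo = errors.New("demo: connection lost")

// item is one element of the stream: a value or an error.
type item struct {
	val string
	err error
}

// scanner is a hypothetical, simplified analogue of GeoScan.
type scanner struct {
	items chan item     // buffered stream filled by the producer
	done  chan struct{} // closed when the consumer stops reading
	once  sync.Once
	cur   string
	err   error
}

func newScanner(sz int) *scanner {
	return &scanner{items: make(chan item, sz), done: make(chan struct{})}
}

// Add is used by the producer. It returns false if the scanner was closed.
func (s *scanner) Add(v string, err error) bool {
	select {
	case <-s.done:
		return false
	default:
	}
	select {
	case s.items <- item{v, err}:
		return true
	case <-s.done:
		return false
	}
}

// Finish is called by the producer when there is nothing more to add.
func (s *scanner) Finish() { close(s.items) }

// Close stops the stream; it is safe to call more than once.
func (s *scanner) Close() { s.once.Do(func() { close(s.done) }) }

// Err returns the error that stopped the iteration, if any.
func (s *scanner) Err() error { return s.err }

// Value returns the element read by the last successful call to Scan.
func (s *scanner) Value() string { return s.cur }

// Scan advances to the next element and closes the scanner
// when the stream ends or an error is received.
func (s *scanner) Scan() bool {
	select {
	case <-s.done:
		return false
	default:
	}
	it, ok := <-s.items
	if !ok || it.err != nil {
		s.err = it.err
		s.Close()
		return false
	}
	s.cur = it.val
	return true
}

func main() {
	sc := newScanner(2)
	go func() {
		defer sc.Finish()
		for _, v := range []string{"point 1", "point 2"} {
			if !sc.Add(v, nil) {
				return
			}
		}
		sc.Add("", errDemo)
	}()
	for sc.Scan() {
		fmt.Println(sc.Value())
	}
	if err := sc.Err(); err != nil {
		fmt.Println("error:", err)
	}
}
```

The done channel lets a producer blocked in Add return as soon as the consumer closes the scanner, which is why Add reports whether the element was accepted.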

type GzDriver

type GzDriver struct {
	// Open is a function to open
	// a Gazetteer.
	Open func(string) (Gazetteer, error)

	// About is a function that returns
	// a short description of the driver.
	About func() string
}

GzDriver contains components of a Gazetteer driver.

type Rank

type Rank uint

Rank is a Linnaean rank. Ranks are ordered so that a more inclusive rank in the taxonomy is always smaller than a more exclusive one.

This makes it possible to use the form:

if rank < biodv.Genus {
	// do something
}

const (
	Unranked Rank = iota
	Kingdom
	Phylum
	Class
	Order
	Family
	Genus
	Species
)

Valid taxonomic ranks.

func GetRank

func GetRank(s string) Rank

GetRank returns a rank value from a string.

func (Rank) String

func (r Rank) String() string

String returns the rank string of a rank.
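
The round trip between a rank and its string can be sketched with an iota enumeration and a name table. The lowercase type and constant names and the rank strings (for example "unranked") are assumptions for the example; only "genus" and "species" appear in the stanza example above.

```go
package main

import (
	"fmt"
	"strings"
)

// rank is a hypothetical mirror of Rank.
type rank uint

const (
	unranked rank = iota
	kingdom
	phylum
	class
	order
	family
	genus
	species
)

// rankNames holds the string of each rank, indexed by its value.
var rankNames = [...]string{
	"unranked", "kingdom", "phylum", "class",
	"order", "family", "genus", "species",
}

// String returns the rank string of a rank.
func (r rank) String() string {
	if int(r) >= len(rankNames) {
		return rankNames[unranked]
	}
	return rankNames[r]
}

// getRank returns a rank from a string, or unranked if it is unknown.
func getRank(s string) rank {
	s = strings.ToLower(strings.TrimSpace(s))
	for i, n := range rankNames {
		if n == s {
			return rank(i)
		}
	}
	return unranked
}

func main() {
	r := getRank("Family")
	fmt.Println(r, r < genus) // family true
}
```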

type RecDB

type RecDB interface {
	// TaxRecs returns a list of records from a given taxon ID.
	TaxRecs(id string) *RecScan

	// RecID returns the record with a given ID.
	RecID(id string) (Record, error)
}

RecDB is a record database.

func OpenRec

func OpenRec(driver, param string) (RecDB, error)

OpenRec opens a RecDB database by its driver, and a driver specific parameter string.

type RecDriver

type RecDriver struct {
	// Open is a function to open
	// a RecDB.
	Open func(string) (RecDB, error)

	// URL is a function to return
	// a URL of a given record ID.
	// This value can be nil.
	URL func(id string) string

	// About is a function that returns
	// a short description of the driver.
	About func() string
}

RecDriver contains components of a RecDB driver.

type RecScan

type RecScan struct {
	// contains filtered or unexported fields
}

A RecScan is a record scanner to stream the results of a query that is expected to produce a list of records.

Use Scan to advance the stream:

sc := recs.TaxRecs("Rhinella")
for sc.Scan() {
	rec := sc.Record()
	...
}
if err := sc.Err(); err != nil {
	...	// process the error
}

func NewRecScan

func NewRecScan(sz int) *RecScan

NewRecScan creates a record scanner, with a buffer of the indicated size.

func (*RecScan) Add

func (rsc *RecScan) Add(rec Record, err error) bool

Add adds a record or an error to a record scanner. It should be used by clients that return the scanner.

It returns true, if the element is added successfully.

func (*RecScan) Close

func (rsc *RecScan) Close()

Close closes the scanner. If Scan is called and returns false the scanner is closed automatically.

func (*RecScan) Err

func (rsc *RecScan) Err() error

Err returns the error, if any, that was encountered during iteration.

func (*RecScan) Record

func (rsc *RecScan) Record() Record

Record returns the last read record. Every call to Record must be preceded by a call to Scan.

func (*RecScan) Scan

func (rsc *RecScan) Scan() bool

Scan advances the scanner to the next result. It returns false when there are no more records, or when an error happens while preparing the next one. Err should be consulted to distinguish between the two cases.

Every call to Record, even the first one, must be preceded by a call to Scan.

type Record

type Record interface {
	// Taxon returns the ID of the taxon
	// assigned to the specimen.
	Taxon() string

	// ID returns the ID of the current specimen.
	ID() string

	// Basis returns the basis of the specimen record.
	Basis() BasisOfRecord

	// CollEvent is the collection event of the record.
	CollEvent() CollectionEvent

	// GeoRef returns a geographic point.
	//
	// If the record is not georeferenced
	// it should return an invalid Point.
	GeoRef() geography.Position

	// Keys returns a list of additional fields
	// stored in the record.
	Keys() []string

	// Value returns the value
	// of an additional field stored in the record.
	Value(key string) string
}

A Record is a specimen record.

type SetDB

type SetDB interface {
	// SetID returns the dataset with a given ID.
	SetID(id string) (Dataset, error)
}

A SetDB is a dataset database.

func OpenSet

func OpenSet(driver, param string) (SetDB, error)

OpenSet opens a SetDB database by its driver, and a driver specific parameter string.

type SetDriver

type SetDriver struct {
	// Open is a function to open
	// a SetDB.
	Open func(string) (SetDB, error)

	// URL is a function to return
	// a URL of a given dataset ID.
	// This value can be nil.
	URL func(id string) string

	// About is a function that returns
	// a short description of the driver.
	About func() string
}

SetDriver contains components of a SetDB driver.

type TaxDriver

type TaxDriver struct {
	// Open is a function to open
	// a taxonomy.
	Open func(string) (Taxonomy, error)

	// URL is a function to return
	// a URL of a given taxon ID.
	// This value can be nil.
	URL func(id string) string

	// About is a function that returns
	// a short description of the driver.
	About func() string
}

TaxDriver contains components of a Taxonomy driver.

type TaxScan

type TaxScan struct {
	// contains filtered or unexported fields
}

A TaxScan is a taxon scanner to stream the results of a query that is expected to produce a taxon list.

Use Scan to advance the stream:

sc := txm.Taxon("Rhinella")
for sc.Scan() {
	tax := sc.Taxon()
	...
}
if err := sc.Err(); err != nil {
	...	// process the error
}

func NewTaxScan

func NewTaxScan(sz int) *TaxScan

NewTaxScan creates a taxon scanner, with a buffer of the indicated size.

func (*TaxScan) Add

func (tsc *TaxScan) Add(tax Taxon, err error) bool

Add adds a taxon or an error to a taxon scanner. It should be used by clients that return the scanner.

It returns true, if the element is added successfully.

func (*TaxScan) Close

func (tsc *TaxScan) Close()

Close closes the scanner. If Scan is called and returns false the scanner is closed automatically.

func (*TaxScan) Err

func (tsc *TaxScan) Err() error

Err returns the error, if any, that was encountered during iteration.

func (*TaxScan) Scan

func (tsc *TaxScan) Scan() bool

Scan advances the scanner to the next result. It returns false when there are no more taxons, or when an error happens while preparing the next one. Err should be consulted to distinguish between the two cases.

Every call to Taxon, even the first one, must be preceded by a call to Scan.

func (*TaxScan) Taxon

func (tsc *TaxScan) Taxon() Taxon

Taxon returns the last read taxon. Every call to Taxon must be preceded by a call to Scan.

type Taxon

type Taxon interface {
	// Name returns the canonical name of the current taxon.
	Name() string

	// ID returns the ID of the current taxon.
	ID() string

	// Parent returns the ID of the taxon's parent.
	Parent() string

	// Rank returns the linnean rank of the current taxon.
	Rank() Rank

	// IsCorrect returns true if the taxon
	// is a correct name
	// (i.e. not a synonym).
	IsCorrect() bool

	// Keys returns a list of additional fields
	// stored in the taxon.
	Keys() []string

	// Value returns the value
	// of an additional field stored in the taxon.
	Value(key string) string
}

A Taxon is a taxon name in a taxonomy.

func TaxList

func TaxList(tsc *TaxScan) ([]Taxon, error)

TaxList creates a list of Taxon, from a taxon scanner.

func TaxParents

func TaxParents(txm Taxonomy, id string) ([]Taxon, error)

TaxParents returns a list with the parents in a Taxonomy for a given taxon ID (including the given taxon), sorted from the most inclusive to the most exclusive taxon.
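
The traversal described here amounts to following parent IDs up to the root and then reversing the chain. The sketch below shows the idea on an in-memory map. The taxon struct, the parents function, the use of names as IDs, and the empty string as the parent of a root taxon are assumptions of the example rather than the package's implementation.

```go
package main

import "fmt"

// taxon is a hypothetical, simplified stand-in for the Taxon interface.
type taxon struct {
	name   string
	parent string // ID of the parent; empty for a taxon attached to the root
}

// parents returns the chain of taxons from the most inclusive one
// down to the taxon with the given ID (which is included).
func parents(taxa map[string]taxon, id string) ([]taxon, error) {
	var chain []taxon
	seen := make(map[string]bool)
	for id != "" {
		if seen[id] {
			return nil, fmt.Errorf("parents: cycle found at %q", id)
		}
		seen[id] = true
		tx, ok := taxa[id]
		if !ok {
			return nil, fmt.Errorf("parents: taxon %q not found", id)
		}
		chain = append(chain, tx)
		id = tx.parent
	}
	// The chain was collected from the taxon up to the root; reverse it.
	for i, j := 0, len(chain)-1; i < j; i, j = i+1, j-1 {
		chain[i], chain[j] = chain[j], chain[i]
	}
	return chain, nil
}

func main() {
	taxa := map[string]taxon{
		"Homo":                    {name: "Homo"},
		"Homo erectus":            {name: "Homo erectus", parent: "Homo"},
		"Sinanthropus pekinensis": {name: "Sinanthropus pekinensis", parent: "Homo erectus"},
	}
	ls, err := parents(taxa, "Sinanthropus pekinensis")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, tx := range ls {
		fmt.Println(tx.name)
	}
}
```

The seen map guards against a malformed database in which a taxon is, directly or indirectly, its own parent.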

type Taxonomy

type Taxonomy interface {
	// Taxon returns a list of taxons with a given name.
	Taxon(name string) *TaxScan

	// TaxID returns the taxon with a given ID.
	TaxID(id string) (Taxon, error)

	// Synonyms returns a list of taxons that are synonyms of a given ID.
	Synonyms(id string) *TaxScan

	// Children returns a list of taxon children of a given ID,
	// if the ID is empty,
	// it will return the taxons attached to the root
	// of the taxonomy.
	Children(id string) *TaxScan
}

A Taxonomy is a taxonomy database.

func OpenTax

func OpenTax(driver, param string) (Taxonomy, error)

OpenTax opens a taxonomy database by its driver, and a driver specific parameter string.

Directories

Path Synopsis
cmd
biodv command
Biodv is a tool for management and analysis of biodiversity data.
biodv/internal/database/drivers
Package drivers implements the db.drivers command, i.e.
biodv/internal/dataset/info
Package info implements the set.info command, i.e.
biodv/internal/records/add
Package add implements the rec.add command, i.e.
biodv/internal/records/assign
Package assign implements the rec.assign command, i.e.
biodv/internal/records/dbadd
Package dbadd implements the rec.db.add command, i.e.
biodv/internal/records/del
Package del implements the rec.del command, i.e.
biodv/internal/records/ed
Package ed implements the rec.ed command, i.e.
biodv/internal/records/georef
Package georef implements the rec.georef command, i.e.
biodv/internal/records/gzgeoref
Package gzgeoref implements the rec.gz.georef command, i.e.
biodv/internal/records/info
Package info implements the rec.info command, i.e.
biodv/internal/records/mapcmd
Package mapcmd implements the rec.map command, i.e.
biodv/internal/records/set
Package set implements the rec.set command, i.e.
biodv/internal/records/table
Package table implements the rec.table command, i.e.
biodv/internal/records/validate
Package validate implements the rec.validate command, i.e.
biodv/internal/records/value
Package value implements the rec.value command, i.e.
biodv/internal/taxonomy/add
Package add implements the tax.add command, i.e.
biodv/internal/taxonomy/catalog
Package catalog implements the tax.catalog command, i.e.
biodv/internal/taxonomy/dbadd
Package dbadd implements the tax.db.add command, i.e.
biodv/internal/taxonomy/dbfill
Package dbfill implements the tax.db.fill command, i.e.
biodv/internal/taxonomy/dbsync
Package dbsync implements the tax.db.sync command, i.e.
biodv/internal/taxonomy/dbupdate
Package dbupdate implements the tax.db.update command, i.e.
biodv/internal/taxonomy/del
Package del implements the tax.del command, i.e.
biodv/internal/taxonomy/format
Package format implements the tax.format command, i.e.
biodv/internal/taxonomy/info
Package info implements the tax.info command, i.e.
biodv/internal/taxonomy/list
Package list implements the tax.list command, i.e.
biodv/internal/taxonomy/move
Package move implements the tax.move command, i.e.
biodv/internal/taxonomy/rank
Package rank implements the tax.rank command, i.e.
biodv/internal/taxonomy/set
Package set implements the tax.set command, i.e.
biodv/internal/taxonomy/validate
Package validate implements the tax.validate command, i.e.
biodv/internal/taxonomy/value
Package value implements the tax.value command, i.e.
Package cmdapp implements a command line application.
Package dataset implements a database of dataset metadata.
driver
gbif
Package gbif implements an interface to GBIF webservice.
geolocate
Package geolocate implements an interface to GEOLocate gazetteer.
encoding
stanza
Package stanza reads and writes records in a list ('stanza') format.
Package geography implements simple geographic utilities.
Package records implements a records database.
Package taxonomy implements a hierarchical, linnean ranked taxonomy.
