metadata

package
v0.0.0-...-2f503fb
Warning

This package is not in the latest version of its module.
Published: Nov 17, 2019 License: Apache-2.0 Imports: 20 Imported by: 0

Documentation

Functions

func CreateMetadataIndex

func CreateMetadataIndex(client *elastic.Client, index string, overwrite bool) error

CreateMetadataIndex creates a new Elasticsearch index with our target mappings. An ngram analyzer is defined and applied to the variable names to allow for substring searching.
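
To illustrate why an ngram analyzer enables substring search, the following is a minimal sketch of ngram generation in plain Go. It does not reproduce the index settings used by CreateMetadataIndex, which are not shown here; the function name `ngrams` and the gram sizes are assumptions chosen for illustration.

```go
package main

import "fmt"

// ngrams returns every substring of s whose length in runes is between
// minGram and maxGram, ordered by start position and then by length.
// It is a hypothetical helper and assumes minGram >= 1.
func ngrams(s string, minGram, maxGram int) []string {
	runes := []rune(s)
	var out []string
	for start := 0; start < len(runes); start++ {
		for n := minGram; n <= maxGram && start+n <= len(runes); n++ {
			out = append(out, string(runes[start:start+n]))
		}
	}
	return out
}

func main() {
	// Each emitted gram becomes a searchable term, so a query for "ge"
	// can match a variable named "age".
	fmt.Println(ngrams("age", 2, 3)) // prints [ag age ge]
}
```

Because every fragment is indexed as its own term, a lookup for part of a variable name becomes an ordinary term match rather than a wildcard scan.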

func DatasetMatches

func DatasetMatches(m *model.Metadata, variables []string) bool

DatasetMatches determines whether the variables in the metadata match the supplied list of variables.
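
The exact matching rule is not documented here. As one plausible reading, the sketch below treats two variable lists as matching when they contain the same names, ignoring order. The helper `sameVariableNames` is hypothetical and works on plain string slices rather than `*model.Metadata`.

```go
package main

import "fmt"

// sameVariableNames reports whether have and want contain the same names
// with the same multiplicity, regardless of order. This is an assumed
// interpretation of "match", not the package's confirmed behaviour.
func sameVariableNames(have, want []string) bool {
	if len(have) != len(want) {
		return false
	}
	counts := make(map[string]int, len(have))
	for _, name := range have {
		counts[name]++
	}
	for _, name := range want {
		if counts[name] == 0 {
			return false
		}
		counts[name]--
	}
	return true
}

func main() {
	fmt.Println(sameVariableNames([]string{"age", "height"}, []string{"height", "age"}))
	fmt.Println(sameVariableNames([]string{"age", "height"}, []string{"age", "weight"}))
}
```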

func IngestMetadata

func IngestMetadata(client *elastic.Client, index string, datasetPrefix string, datasetSource DatasetSource, meta *model.Metadata) error

IngestMetadata adds a document consisting of the metadata to the provided index.

func IsMetadataVariable

func IsMetadataVariable(v *model.Variable) bool

IsMetadataVariable indicates whether or not a variable is additional metadata added to the source.

func LoadDatasetStats

func LoadDatasetStats(m *model.Metadata, datasetPath string) error

LoadDatasetStats loads the dataset and computes various stats.

func LoadImportance

func LoadImportance(m *model.Metadata, importanceFile string) error

LoadImportance loads the importance feature selection metric.

func LoadMetadataFromClassification

func LoadMetadataFromClassification(schemaPath string, classificationPath string, normalizeVariableNames bool) (*model.Metadata, error)

LoadMetadataFromClassification loads metadata from a merged schema and classification file.

func LoadMetadataFromMergedSchema

func LoadMetadataFromMergedSchema(schemaPath string) (*model.Metadata, error)

LoadMetadataFromMergedSchema loads metadata from a merged schema file.

func LoadMetadataFromOriginalSchema

func LoadMetadataFromOriginalSchema(schemaPath string) (*model.Metadata, error)

LoadMetadataFromOriginalSchema loads metadata from a schema file.

func LoadMetadataFromRawFile

func LoadMetadataFromRawFile(datasetPath string, classificationPath string) (*model.Metadata, error)

LoadMetadataFromRawFile loads metadata from a raw file and a classification file.

func LoadSummary

func LoadSummary(m *model.Metadata, summaryFile string, useCache bool) error

LoadSummary loads a description summary.

func LoadSummaryFromDescription

func LoadSummaryFromDescription(m *model.Metadata, summaryFile string) error

LoadSummaryFromDescription loads a summary from the description.

func LoadSummaryMachine

func LoadSummaryMachine(m *model.Metadata, summaryFile string) error

LoadSummaryMachine loads a machine-learned summary.

func SetTypeProbabilityThreshold

func SetTypeProbabilityThreshold(threshold float64)

SetTypeProbabilityThreshold sets the probability threshold below which a suggested type is not used as the variable type.
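
A threshold of this kind can be sketched as a simple gate on the suggestion's probability. Everything in the example is an assumption made for illustration: the `suggestedType` struct, the `chooseType` helper, the fallback value, and the initial threshold of 0.8 are not taken from the package.

```go
package main

import "fmt"

// typeProbabilityThreshold is a hypothetical package-level setting;
// the default of 0.8 is an assumption.
var typeProbabilityThreshold = 0.8

// setTypeProbabilityThreshold mirrors the shape of the documented setter.
func setTypeProbabilityThreshold(threshold float64) {
	typeProbabilityThreshold = threshold
}

// suggestedType is a stand-in for a classifier's suggestion.
type suggestedType struct {
	Type        string
	Probability float64
}

// chooseType returns the suggested type only when its probability is
// at or above the threshold; otherwise it returns the fallback.
func chooseType(s suggestedType, fallback string) string {
	if s.Probability < typeProbabilityThreshold {
		return fallback
	}
	return s.Type
}

func main() {
	setTypeProbabilityThreshold(0.9)
	fmt.Println(chooseType(suggestedType{Type: "integer", Probability: 0.95}, "unknown"))
	fmt.Println(chooseType(suggestedType{Type: "dateTime", Probability: 0.5}, "unknown"))
}
```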

func VerifyAndUpdate

func VerifyAndUpdate(m *model.Metadata, dataPath string) error

VerifyAndUpdate updates the metadata when inconsistencies or errors are found.

func WriteMergedSchema

func WriteMergedSchema(m *model.Metadata, path string, mergedDataResource *model.DataResource) error

WriteMergedSchema exports the current metadata as a merged schema file.

func WriteSchema

func WriteSchema(m *model.Metadata, path string) error

WriteSchema exports the current metadata as a schema file.

Types

type DataResourceParser

type DataResourceParser interface {
	Parse(res *gabs.Container) (*model.DataResource, error)
}

DataResourceParser is a parser for a data resource in the schema document.
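
The shape of an implementation can be sketched with simplified stand-in types. In the example, `document` replaces `*gabs.Container` and `dataResource` replaces `*model.DataResource` so that the code runs without external dependencies. The `resID` key and the fields of `dataResource` are assumptions, not the package's confirmed schema.

```go
package main

import (
	"errors"
	"fmt"
)

// document is a stand-in for *gabs.Container (a parsed JSON object).
type document map[string]interface{}

// dataResource is a stand-in for *model.DataResource; its fields are assumed.
type dataResource struct {
	ResID   string
	ResType string
}

// dataResourceParser mirrors the single-method shape of DataResourceParser.
type dataResourceParser interface {
	Parse(res document) (*dataResource, error)
}

// media loosely mirrors the Media type: it carries the type to assign.
type media struct {
	Type string
}

// Parse reads the assumed "resID" key and tags the resource with m.Type.
func (m *media) Parse(res document) (*dataResource, error) {
	id, ok := res["resID"].(string)
	if !ok {
		return nil, errors.New("resource is missing a string resID")
	}
	return &dataResource{ResID: id, ResType: m.Type}, nil
}

func main() {
	var p dataResourceParser = &media{Type: "image"}
	dr, err := p.Parse(document{"resID": "0"})
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(dr.ResID, dr.ResType)
}
```

Keeping the interface to a single method lets each resource kind supply its own parsing logic while callers depend only on `Parse`.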

type DatasetSource

type DatasetSource string

DatasetSource flags the type of ingest action that created a dataset.

const (
	// ProvenanceSimon identifies the type provenance as Simon
	ProvenanceSimon = "d3m.primitives.distil.simon"
	// ProvenanceSchema identifies the type provenance as schema
	ProvenanceSchema = "schema"

	// Seed flags a dataset as ingested from seed data
	Seed DatasetSource = "seed"

	// Contrib flags a dataset as being ingested from contributed data
	Contrib DatasetSource = "contrib"

	// Augmented flags a dataset as being ingested from augmented data
	Augmented DatasetSource = "augmented"
)
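
Because DatasetSource is a string type, a raw string can be converted to it directly, so callers may want to validate input against the known values. The sketch below redeclares the type and its three constants locally to stay self-contained; `parseDatasetSource` is a hypothetical helper, not part of the package. Note that ProvenanceSimon and ProvenanceSchema in the block above are untyped string constants rather than DatasetSource values.

```go
package main

import "fmt"

// DatasetSource and its constants are redeclared here so the sketch
// compiles on its own; the values match the documented ones.
type DatasetSource string

const (
	Seed      DatasetSource = "seed"
	Contrib   DatasetSource = "contrib"
	Augmented DatasetSource = "augmented"
)

// parseDatasetSource is a hypothetical helper that accepts only the
// three documented sources and rejects anything else.
func parseDatasetSource(s string) (DatasetSource, error) {
	switch src := DatasetSource(s); src {
	case Seed, Contrib, Augmented:
		return src, nil
	default:
		return "", fmt.Errorf("unknown dataset source %q", s)
	}
}

func main() {
	src, err := parseDatasetSource("contrib")
	fmt.Println(src, err)

	_, err = parseDatasetSource("scraped")
	fmt.Println(err)
}
```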

type Media

type Media struct {
	Type string
}

Media is a data resource that is backed by media files.

func NewMedia

func NewMedia(typ string) *Media

NewMedia creates a new Media instance.

func (*Media) Parse

func (r *Media) Parse(res *gabs.Container) (*model.DataResource, error)

Parse extracts the data resource from the data schema document.

type Raw

type Raw struct {
	// contains filtered or unexported fields
}

Raw is a data resource that is contained within one file which does not have fields specified in the schema.

func (*Raw) Parse

func (r *Raw) Parse(res *gabs.Container) (*model.DataResource, error)

Parse extracts the data resource from the data schema document.

type Table

type Table struct {
}

Table is a data resource that is contained within one or many tabular files.

func (*Table) Parse

func (r *Table) Parse(res *gabs.Container) (*model.DataResource, error)

Parse extracts the data resource from the data schema document.

type Timeseries

type Timeseries struct {
}

Timeseries is a data resource that is contained within one or many timeseries files.

func (*Timeseries) Parse

func (r *Timeseries) Parse(res *gabs.Container) (*model.DataResource, error)

Parse extracts the data resource from the data schema document.
