source

package
v0.15.3
Warning

This package is not in the latest version of its module.

Published: Mar 13, 2021 License: MIT Imports: 28 Imported by: 0

Documentation

Overview

Package source provides functionality for dealing with data sources.

Constants

const (
	// StdinHandle is the reserved handle for stdin pipe input.
	StdinHandle = "@stdin"

	// ActiveHandle is the reserved handle for the active source.
	// FIXME: it should be possible to use "@0" as the active handle, but
	//  the SLQ grammar doesn't currently allow it. Possibly change this
	//  value to "@0" after modifying the SLQ grammar.
	ActiveHandle = "@active"

	// ScratchHandle is the reserved handle for the scratch source.
	ScratchHandle = "@scratch"

	// JoinHandle is the reserved handle for the join db source.
	JoinHandle = "@join"

	// MonotableName is the table name used for "mono-table" drivers
	// such as CSV. Thus a source @address_csv will have its
	// data accessible via @address_csv.data.
	MonotableName = "data"
)
const TypeNone = Type("")

TypeNone is the zero value of Type.

Variables

This section is empty.

Functions

func AbsLocation

func AbsLocation(loc string) string

AbsLocation returns the absolute path of loc. That is, relative paths etc in loc are resolved. If loc is not a file path or it cannot be processed, loc is returned unmodified.
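The behavior described above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the function name absLocation is hypothetical, and treating any location that contains "://" as a URL or DSN is an assumption made for the sketch.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// absLocation is a hypothetical stand-in for AbsLocation: it resolves
// file paths and returns anything else unmodified.
func absLocation(loc string) string {
	// Assumption: a location containing "://" is a URL or DSN,
	// not a file path, so it is left alone.
	if strings.Contains(loc, "://") {
		return loc
	}

	abs, err := filepath.Abs(loc)
	if err != nil {
		// The path cannot be processed: return loc as-is.
		return loc
	}
	return abs
}

func main() {
	// A relative path is resolved against the working directory.
	fmt.Println(absLocation("testdata/../data.csv"))
	// A DSN is returned unmodified.
	fmt.Println(absLocation("postgres://sakila@localhost/sakila"))
}
```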

func IsSQLLocation

func IsSQLLocation(loc string) bool

IsSQLLocation returns true if source location loc seems to be a DSN for a SQL driver.

func LocationFileName

func LocationFileName(src *Source) (string, error)

LocationFileName returns the final component of the file/URL path.

func ParseTableHandle

func ParseTableHandle(input string) (handle, table string, err error)

ParseTableHandle attempts to parse a SLQ source handle and/or table name. Surrounding whitespace is trimmed. Examples of valid input values are:

@handle.tblName
@handle
.tblName
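The three input forms above could be handled roughly as follows. This is a sketch under stated assumptions, not the package's parser: parseTableHandle is a hypothetical name, the returned handle is assumed to keep its "@" prefix, and no further validation of the handle or table name is attempted.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTableHandle is a hypothetical stand-in for ParseTableHandle.
// It accepts "@handle.tblName", "@handle", or ".tblName".
func parseTableHandle(input string) (handle, table string, err error) {
	s := strings.TrimSpace(input)

	switch {
	case strings.HasPrefix(s, "@"):
		handle = s
		// Split on the first dot, if any: "@handle.tblName".
		if i := strings.IndexByte(s, '.'); i >= 0 {
			handle, table = s[:i], s[i+1:]
			if table == "" {
				return "", "", fmt.Errorf("invalid input %q: empty table name", input)
			}
		}
		if len(handle) < 2 {
			return "", "", fmt.Errorf("invalid input %q: empty handle", input)
		}
		return handle, table, nil

	case strings.HasPrefix(s, "."):
		if len(s) < 2 {
			return "", "", fmt.Errorf("invalid input %q: empty table name", input)
		}
		return "", s[1:], nil

	default:
		return "", "", fmt.Errorf("invalid input %q: must start with '@' or '.'", input)
	}
}

func main() {
	for _, in := range []string{"@sakila.actor", " @sakila ", ".actor", "actor"} {
		handle, table, err := parseTableHandle(in)
		fmt.Printf("%q -> handle=%q table=%q err=%v\n", in, handle, table, err)
	}
}
```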

func RedactLocation added in v0.15.0

func RedactLocation(loc string) string

RedactLocation returns a redacted version of the source location loc, with the password component (if any) of the location masked.
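One way to mask a password in a URL-style location is via net/url. The sketch below is illustrative only: redactLocation is a hypothetical name, the "xxxxx" mask is an assumption rather than the package's documented output, and DSN formats that are not URLs are simply returned unchanged.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactLocation is a hypothetical stand-in for RedactLocation.
// It masks the password of a URL-style location, if there is one.
func redactLocation(loc string) string {
	u, err := url.Parse(loc)
	if err != nil || u.User == nil {
		// Not a URL, or no user info: nothing to redact.
		return loc
	}
	if _, hasPassword := u.User.Password(); !hasPassword {
		return loc
	}

	// Assumption: "xxxxx" is used as the mask value.
	u.User = url.UserPassword(u.User.Username(), "xxxxx")
	return u.String()
}

func main() {
	fmt.Println(redactLocation("postgres://sakila:p_ssW0rd@localhost:5432/sakila"))
	fmt.Println(redactLocation("/path/to/data.xlsx"))
}
```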

func ReservedHandles

func ReservedHandles() []string

ReservedHandles returns a slice of the handle names that are reserved for sq use.

func ShortLocation

func ShortLocation(loc string) string

ShortLocation returns a short location string. For example, the base name (data.xlsx) for a file, or user@host[:port]/db for a DSN.
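The two cases in that description can be sketched as below. This is a simplified illustration, not the package's logic: shortLocation is a hypothetical name, only URL-style DSNs with user info are recognized, and file paths are assumed to use forward slashes.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// shortLocation is a hypothetical stand-in for ShortLocation.
func shortLocation(loc string) string {
	u, err := url.Parse(loc)
	if err == nil && u.User != nil && u.Host != "" {
		// DSN case: user@host[:port]/db. The password and query
		// parameters are dropped.
		return u.User.Username() + "@" + u.Host + u.Path
	}

	// File case: the base name only.
	return path.Base(loc)
}

func main() {
	fmt.Println(shortLocation("/path/to/data.xlsx"))
	fmt.Println(shortLocation("postgres://sakila:p_ssW0rd@localhost:5432/sakila?sslmode=disable"))
}
```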

func SuggestHandle

func SuggestHandle(typ Type, loc string, takenFn func(string) bool) (string, error)

SuggestHandle suggests a handle based on location and type. If typ is TypeNone, the type will be inferred from loc. The takenFn is used to determine if a suggested handle is free to be used (e.g. "@sakila_csv" -> "@sakila_csv_1", etc).

If the base name (derived from loc) contains illegal handle runes, those are replaced with underscore. If the handle would start with a number or underscore, it will be prefixed with "h" (for "handle"). Thus "123.xlsx" becomes "@h123_xlsx".
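The naming rules above can be sketched as follows. This is an approximation under stated assumptions: suggestHandle is a hypothetical name, loc is assumed to be a file path with forward slashes, the typ argument and type inference are omitted, and the "_1", "_2" suffix scheme is inferred from the "@sakila_csv_1" example.

```go
package main

import (
	"fmt"
	"path"
	"strconv"
	"strings"
)

// isLegalRune reports whether r may appear in a handle after the '@'.
func isLegalRune(r rune) bool {
	return r == '_' ||
		(r >= '0' && r <= '9') ||
		(r >= 'a' && r <= 'z') ||
		(r >= 'A' && r <= 'Z')
}

// suggestHandle is a hypothetical, simplified stand-in for SuggestHandle.
func suggestHandle(loc string, takenFn func(string) bool) string {
	// path.Base never returns an empty string.
	base := path.Base(loc)

	// Replace illegal handle runes with underscore.
	var sb strings.Builder
	for _, r := range base {
		if isLegalRune(r) {
			sb.WriteRune(r)
		} else {
			sb.WriteByte('_')
		}
	}
	name := sb.String()

	// A handle may not start with a number or underscore.
	if c := name[0]; c == '_' || (c >= '0' && c <= '9') {
		name = "h" + name
	}

	// Append a numeric suffix until the handle is free.
	candidate := "@" + name
	for i := 1; takenFn(candidate); i++ {
		candidate = "@" + name + "_" + strconv.Itoa(i)
	}
	return candidate
}

func main() {
	taken := map[string]bool{"@sakila_csv": true}
	takenFn := func(h string) bool { return taken[h] }

	fmt.Println(suggestHandle("/data/123.xlsx", takenFn))
	fmt.Println(suggestHandle("/data/sakila.csv", takenFn))
}
```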

func TempDirFile

func TempDirFile(filename string) (dir string, f *os.File, cleanFn func() error, err error)

TempDirFile creates a new temporary file in a new temp dir, opens the file for reading and writing, and returns the resulting *os.File, as well as the parent dir. It is the caller's responsibility to close the file and remove the temp dir, which the returned cleanFn encapsulates.
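The contract described above (new temp dir, file opened read-write, cleanup wrapped in a func) might look like this with the standard library. It is a sketch, not the package's code: tempDirFile and pathExists are hypothetical names, and the temp dir prefix and file permissions are arbitrary choices.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// tempDirFile is a hypothetical stand-in for TempDirFile.
func tempDirFile(filename string) (dir string, f *os.File, cleanFn func() error, err error) {
	dir, err = os.MkdirTemp("", "source_sketch_*")
	if err != nil {
		return "", nil, nil, err
	}

	// Open for reading and writing; fail if the file already exists.
	f, err = os.OpenFile(filepath.Join(dir, filename), os.O_RDWR|os.O_CREATE|os.O_EXCL, 0o600)
	if err != nil {
		_ = os.RemoveAll(dir)
		return "", nil, nil, err
	}

	// cleanFn closes the file and removes the temp dir.
	cleanFn = func() error {
		closeErr := f.Close()
		if rmErr := os.RemoveAll(dir); rmErr != nil {
			return rmErr
		}
		return closeErr
	}
	return dir, f, cleanFn, nil
}

// pathExists reports whether p exists on disk.
func pathExists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}

func main() {
	dir, f, cleanFn, err := tempDirFile("data.csv")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("created", f.Name(), "in", dir)

	if err := cleanFn(); err != nil {
		fmt.Println("cleanup error:", err)
	}
	fmt.Println("dir still exists:", pathExists(dir))
}
```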

func VerifyLegalHandle

func VerifyLegalHandle(handle string) error

VerifyLegalHandle returns an error if handle is not an acceptable source handle value. Valid input must match:

\A[@][a-zA-Z][a-zA-Z0-9_]*$
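The pattern can be exercised directly with the regexp package. In the sketch below, verifyLegalHandle is a hypothetical stand-in and the error text is invented for illustration; only the pattern itself comes from the documentation above.

```go
package main

import (
	"fmt"
	"regexp"
)

// handlePattern is the pattern quoted in the documentation.
var handlePattern = regexp.MustCompile(`\A[@][a-zA-Z][a-zA-Z0-9_]*$`)

// verifyLegalHandle is a hypothetical stand-in for VerifyLegalHandle.
func verifyLegalHandle(handle string) error {
	if !handlePattern.MatchString(handle) {
		return fmt.Errorf("invalid source handle value: %q", handle)
	}
	return nil
}

func main() {
	for _, h := range []string{"@sakila_csv", "@h123_xlsx", "@123", "@_x", "sakila"} {
		fmt.Printf("%-14q legal=%v\n", h, verifyLegalHandle(h) == nil)
	}
}
```

Note that the pattern requires a letter immediately after "@", which is why a suggested handle such as "@h123_xlsx" carries the "h" prefix.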

func VerifySetIntegrity

func VerifySetIntegrity(ss *Set) error

VerifySetIntegrity verifies the internal state of ss. Typically this func is invoked after ss has been loaded from config, verifying that the config is not corrupt.

Types

type ColMetadata

type ColMetadata struct {
	Name         string    `json:"name"`
	Position     int64     `json:"position"`
	PrimaryKey   bool      `json:"primary_key"`
	BaseType     string    `json:"base_type"`
	ColumnType   string    `json:"column_type"`
	Kind         kind.Kind `json:"kind"`
	Nullable     bool      `json:"nullable"`
	DefaultValue string    `json:"default_value,omitempty"`
	Comment      string    `json:"comment,omitempty"`
}

ColMetadata models metadata for a particular column of a data source.

func (*ColMetadata) String

func (c *ColMetadata) String() string

type DBVar

type DBVar struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

DBVar models a key-value pair for driver config. REVISIT: maybe better named as SourceSetting or such?

type FileOpenFunc

type FileOpenFunc func() (io.ReadCloser, error)

FileOpenFunc is a func that opens and returns a ReadCloser. The caller is responsible for closing the returned ReadCloser.

type Files

type Files struct {
	// contains filtered or unexported fields
}

Files is the centralized API for interacting with files.

Why does Files exist? There's a need for functionality to transparently get a Reader for remote or local files, and most importantly, an ability for multiple goroutines to read/sample a file while it's being read (mainly to "sample" the file type, e.g. to determine if it's an XLSX file etc). Currently we use fscache under the hood for this, but our implementation is not satisfactory: in particular, the implementation currently requires that we read the entire source file into fscache before it's available to be read (which is awful if we're reading a long-running pipe from stdin). This entire thing needs to be revisited.

func NewFiles

func NewFiles(log lg.Log) (*Files, error)

NewFiles returns a new Files instance.

func (*Files) AddStdin

func (fs *Files) AddStdin(f *os.File) error

AddStdin copies f to fs's cache: the stdin data in f is later accessible via fs.Open(src) where src.Handle is StdinHandle; f's type can be detected via TypeStdin. Note that f is closed by this method.

DESIGN: it's possible we'll ditch AddStdin and TypeStdin in some future version; this mechanism is a stopgap.

func (*Files) AddTypeDetectors

func (fs *Files) AddTypeDetectors(detectFns ...TypeDetectFunc)

AddTypeDetectors adds type detectors.

func (*Files) CleanupE

func (fs *Files) CleanupE(fn func() error)

CleanupE adds fn to the cleanup sequence invoked by fs.Close.

func (*Files) Close

func (fs *Files) Close() error

Close closes any open resources.

func (*Files) Open

func (fs *Files) Open(src *Source) (io.ReadCloser, error)

Open returns a new io.ReadCloser for src.Location. If src.Handle is StdinHandle, AddStdin must first have been invoked. The caller must close the reader.

func (*Files) OpenFunc

func (fs *Files) OpenFunc(src *Source) func() (io.ReadCloser, error)

OpenFunc returns a func that invokes fs.Open for src.Location.

func (*Files) ReadAll

func (fs *Files) ReadAll(src *Source) ([]byte, error)

ReadAll is a convenience method to read the bytes of a source.

func (*Files) Size

func (fs *Files) Size(src *Source) (size int64, err error)

Size returns the file size of src.Location. This exists as a convenience function and something of a replacement for using os.Stat to get the file size.

func (*Files) Type

func (fs *Files) Type(ctx context.Context, loc string) (Type, error)

Type returns the source type of location.

func (*Files) TypeStdin

func (fs *Files) TypeStdin(ctx context.Context) (Type, error)

TypeStdin detects the type of stdin as previously added by AddStdin. An error is returned if AddStdin was not first invoked. If the type cannot be detected, TypeNone and nil are returned.

type Metadata

type Metadata struct {
	// Handle is the source handle.
	Handle string `json:"handle"`

	// Name is the base name of the source, e.g. the base filename
	// or DB name etc. For example, "sakila".
	Name string `json:"name"`

	// FQName is the full name of the data source, typically
	// including catalog/schema etc. For example, "sakila.public"
	FQName string `json:"name_fq"`

	// SourceType is the source driver type.
	SourceType Type `json:"driver"`

	// DBDriverType is the type of the underlying DB driver.
	// This is the same value as SourceType for SQL database types.
	DBDriverType Type `json:"db_driver"`

	// DBProduct is the DB product string, such as "PostgreSQL 9.6.17 on x86_64-pc-linux-gnu".
	DBProduct string `json:"db_product"`

	// DBVersion is the DB version.
	DBVersion string `json:"db_version"`

	// DBVars are configuration name-value pairs from the DB.
	DBVars []DBVar `json:"db_variables,omitempty"`

	// Location is the source location such as a DB connection string,
	// a file path, or a URL.
	Location string `json:"location"`

	// User is the username, if applicable.
	User string `json:"user,omitempty"`

	// Size is the physical size of the source in bytes, e.g. DB file size.
	Size int64 `json:"size"`

	// Tables is the metadata for each table in the source.
	Tables []*TableMetadata `json:"tables"`
}

Metadata holds metadata for a source.

func (*Metadata) String

func (md *Metadata) String() string

func (*Metadata) TableNames

func (md *Metadata) TableNames() []string

TableNames is a convenience method that returns md's table names.

type Set

type Set struct {
	// contains filtered or unexported fields
}

Set is a set of sources. Typically it is loaded from config at the start of a run.

func (*Set) Active

func (s *Set) Active() *Source

Active returns the active source, or nil.

func (*Set) Add

func (s *Set) Add(src *Source) error

Add adds src to s.

func (*Set) Exists

func (s *Set) Exists(handle string) bool

Exists returns true if handle already exists in the set.

func (*Set) Get

func (s *Set) Get(handle string) (*Source, error)

Get gets the src with handle, or returns an error.

func (*Set) Handles added in v0.15.0

func (s *Set) Handles() []string

Handles returns the set of source handles.

func (*Set) Items

func (s *Set) Items() []*Source

Items returns the sources as a slice.

func (*Set) MarshalJSON

func (s *Set) MarshalJSON() ([]byte, error)

MarshalJSON implements json.Marshaler.

func (*Set) MarshalYAML

func (s *Set) MarshalYAML() (interface{}, error)

MarshalYAML implements yaml.Marshaler.

func (*Set) Remove

func (s *Set) Remove(handle string) error

Remove removes from the set the src having handle.

func (*Set) Scratch

func (s *Set) Scratch() *Source

Scratch returns the scratch source, or nil.

func (*Set) SetActive

func (s *Set) SetActive(handle string) (*Source, error)

SetActive sets the active src, or unsets any active src if handle is empty. If handle does not exist, an error is returned.
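The three outcomes described above (set, unset, error) can be illustrated with a toy type. This sketch is not the package's Set: the set and source types, their fields, and the error text are hypothetical, and leaving the active source unchanged on failure is an assumption.

```go
package main

import "fmt"

// source is a hypothetical, reduced stand-in for Source.
type source struct {
	Handle   string
	Location string
}

// set is a hypothetical, simplified stand-in for Set.
type set struct {
	items  []*source
	active string // handle of the active source; "" means none
}

// get returns the source with the given handle, or an error.
func (s *set) get(handle string) (*source, error) {
	for _, src := range s.items {
		if src.Handle == handle {
			return src, nil
		}
	}
	return nil, fmt.Errorf("source %s does not exist", handle)
}

// setActive sets the active source, or unsets it if handle is empty.
func (s *set) setActive(handle string) (*source, error) {
	if handle == "" {
		s.active = ""
		return nil, nil
	}

	src, err := s.get(handle)
	if err != nil {
		// Assumption: a failed call leaves the active source unchanged.
		return nil, err
	}
	s.active = handle
	return src, nil
}

func main() {
	s := &set{items: []*source{{Handle: "@sakila_csv", Location: "sakila.csv"}}}

	src, err := s.setActive("@sakila_csv")
	fmt.Println(src.Handle, err)

	_, err = s.setActive("@missing")
	fmt.Println(err)
}
```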

func (*Set) SetScratch

func (s *Set) SetScratch(handle string) (*Source, error)

SetScratch sets the scratch src to handle.

func (*Set) String

func (s *Set) String() string

func (*Set) UnmarshalJSON

func (s *Set) UnmarshalJSON(b []byte) error

UnmarshalJSON implements json.Unmarshaler.

func (*Set) UnmarshalYAML

func (s *Set) UnmarshalYAML(unmarshal func(interface{}) error) error

UnmarshalYAML implements yaml.Unmarshaler.

type Source

type Source struct {
	Handle   string          `yaml:"handle" json:"handle"`
	Type     Type            `yaml:"type" json:"type"`
	Location string          `yaml:"location" json:"location"`
	Options  options.Options `yaml:"options,omitempty" json:"options,omitempty"`
}

Source describes a data source.

func (*Source) RedactedLocation

func (s *Source) RedactedLocation() string

RedactedLocation returns s.Location, with the password component of the location masked.

func (*Source) ShortLocation

func (s *Source) ShortLocation() string

ShortLocation returns a short location string. For example, the base name (data.xlsx) for a file, or user@host[:port]/db for a DSN.

func (*Source) String

func (s *Source) String() string

type TableMetadata

type TableMetadata struct {
	// Name is the table name, such as "actor".
	Name string `json:"name"`

	// FQName is the fully-qualified name, such as "sakila.public.actor"
	FQName string `json:"name_fq,omitempty"`

	// TableType indicates if this is a "table" or "view". The value
	// is driver-independent. See DBTableType for the driver-dependent
	// value.
	TableType string `json:"table_type,omitempty"`

	// DBTableType indicates if this is a table or view, etc.
	// The value is driver-dependent, e.g. "BASE TABLE" or "VIEW" for postgres.
	DBTableType string `json:"table_type_db,omitempty"`

	// RowCount is the number of rows in the table.
	RowCount int64 `json:"row_count"`

	// Size is the physical size of the table in bytes. For a view, this
	// may be nil.
	Size *int64 `json:"size,omitempty"`

	// Comment is the comment for the table. Typically empty.
	Comment string `json:"comment,omitempty"`

	// Columns holds the metadata for the table's columns.
	Columns []*ColMetadata `json:"columns"`
}

TableMetadata models table (or view) metadata.

func TableFromSourceMetadata deprecated

func TableFromSourceMetadata(srcMeta *Metadata, tblName string) (*TableMetadata, error)

TableFromSourceMetadata returns TableMetadata whose name matches tblName.

Deprecated: Each driver should implement this correctly for a single table.

func (*TableMetadata) Column

func (t *TableMetadata) Column(colName string) *ColMetadata

Column returns the named col or nil.

func (*TableMetadata) PKCols

func (t *TableMetadata) PKCols() []*ColMetadata

PKCols returns a possibly empty slice of cols that are part of the table primary key.
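The lookup and filter behavior of Column and PKCols can be sketched over reduced versions of the metadata types. The types below keep only the fields the sketch needs and are hypothetical stand-ins; the column names are sample data.

```go
package main

import "fmt"

// colMetadata is a reduced, hypothetical stand-in for ColMetadata.
type colMetadata struct {
	Name       string
	PrimaryKey bool
}

// tableMetadata is a reduced, hypothetical stand-in for TableMetadata.
type tableMetadata struct {
	Name    string
	Columns []*colMetadata
}

// column returns the named col or nil.
func (t *tableMetadata) column(colName string) *colMetadata {
	for _, col := range t.Columns {
		if col.Name == colName {
			return col
		}
	}
	return nil
}

// pkCols returns a possibly empty (but non-nil) slice of the cols
// that are part of the primary key.
func (t *tableMetadata) pkCols() []*colMetadata {
	pk := make([]*colMetadata, 0)
	for _, col := range t.Columns {
		if col.PrimaryKey {
			pk = append(pk, col)
		}
	}
	return pk
}

func main() {
	tbl := &tableMetadata{
		Name: "actor",
		Columns: []*colMetadata{
			{Name: "actor_id", PrimaryKey: true},
			{Name: "first_name"},
		},
	}

	for _, col := range tbl.pkCols() {
		fmt.Println("pk col:", col.Name)
	}
	fmt.Println("has last_name:", tbl.column("last_name") != nil)
}
```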

func (*TableMetadata) String

func (t *TableMetadata) String() string

type Type

type Type string

Type is a source type, e.g. "mysql", "postgres", "csv", etc.

func DetectMagicNumber

func DetectMagicNumber(ctx context.Context, log lg.Log, openFn FileOpenFunc) (detected Type, score float32, err error)

DetectMagicNumber is a TypeDetectFunc that uses an external pkg (h2non/filetype) to detect the "magic number" from the start of files.

func (Type) String

func (t Type) String() string

type TypeDetectFunc

type TypeDetectFunc func(ctx context.Context, log lg.Log, openFn FileOpenFunc) (detected Type, score float32, err error)

TypeDetectFunc interrogates a byte stream to determine the source driver type. A score is returned indicating the confidence that the driver type has been detected. A score <= 0 is failure, a score >= 1 is success; intermediate values indicate some level of confidence. An error is returned only if an IO problem occurred. The implementation gets access to the byte stream by invoking openFn, and is responsible for closing any reader it opens.
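A detector following this contract might look like the sketch below. It is illustrative only: the types are local stand-ins, the lg.Log parameter is omitted to keep the example free of external imports, and detectZip, openBytes, and detectBytes are hypothetical names. The detector checks for the ZIP signature, since an XLSX file is a ZIP archive, and returns an intermediate score of 0.5 because other formats share that container.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
)

// Local stand-ins for the package's types (assumption: the log
// parameter of TypeDetectFunc is omitted in this sketch).
type Type string

const TypeNone = Type("")

type FileOpenFunc func() (io.ReadCloser, error)

type TypeDetectFunc func(ctx context.Context, openFn FileOpenFunc) (detected Type, score float32, err error)

var zipMagic = []byte("PK\x03\x04")

// detectZip reports Type("xlsx") with partial confidence if the stream
// starts with the ZIP signature. ctx is unused in this simple sketch.
func detectZip(ctx context.Context, openFn FileOpenFunc) (Type, float32, error) {
	r, err := openFn()
	if err != nil {
		return TypeNone, 0, err
	}
	// The detector is responsible for closing any reader it opens.
	defer r.Close()

	header := make([]byte, len(zipMagic))
	n, err := io.ReadFull(r, header)
	if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
		// Only a genuine IO problem is reported as an error.
		return TypeNone, 0, err
	}

	if bytes.Equal(header[:n], zipMagic) {
		return Type("xlsx"), 0.5, nil
	}
	// Short or non-matching input: no detection, no error.
	return TypeNone, 0, nil
}

// Compile-time check that detectZip satisfies the func type.
var _ TypeDetectFunc = detectZip

// openBytes returns a FileOpenFunc that reads from b.
func openBytes(b []byte) FileOpenFunc {
	return func() (io.ReadCloser, error) {
		return io.NopCloser(bytes.NewReader(b)), nil
	}
}

// detectBytes runs detectZip against an in-memory byte slice.
func detectBytes(b []byte) (Type, float32, error) {
	return detectZip(context.Background(), openBytes(b))
}

func main() {
	typ, score, err := detectBytes([]byte("PK\x03\x04..."))
	fmt.Printf("type=%q score=%v err=%v\n", typ, score, err)
}
```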

Directories

Path Synopsis
Package fetcher provides a mechanism for fetching files from URLs.