tagstash

package module
v0.0.0-...-e3dc122 Latest
Warning

This package is not in the latest version of its module.

Published: Dec 21, 2016 License: MIT Imports: 15 Imported by: 0

README

Tagstash

Tagstash is a library for tag lookup.

It is designed to decouple tagging from data. It typically stores URIs with one or more tags. The lookup operation accepts multiple tags and tries to find the stored value that is the closest match for the provided set. Internally, it relies on PostgreSQL-based storage, extended with an in-memory cache. Both the persistent storage and the cache accept custom implementations. For simple scenarios, or for prototyping, it supports SQLite instead of PostgreSQL.

Example:
stash, err := tagstash.New(tagstash.Options{
	CacheOptions: tagstash.CacheOptions{
		CacheSize: 1 << 12,
	},
})

if err != nil {
	log.Fatal(err)
}

stash.Set("https://www.example.org/page1.html", "foo", "bar", "baz")
stash.Set("https://www.example.org/page2.html", "foo", "qux", "quux")

if u, err := stash.Get("foo", "qux", "wah"); err != nil {
	fmt.Printf("error: %v", err)
} else {
	fmt.Printf("found: %s", u)
}
Installation
go get github.com/aryszka/tagstash

When using the default SQLite driver, the database is created automatically with the default settings. When using PostgreSQL, initialize the database either with the provided make task, create-postgres, or by running sql/create-db.sql.

make PSQL_DB=foo PSQL_USER=$(whoami) create-postgres
Documentation

Find the godoc documentation here:

https://godoc.org/github.com/aryszka/tagstash

Documentation

Overview

Package tagstash provides tagging for arbitrary string values.

Tagstash implements many-to-many associations between values and tags. It returns the best match for a query with multiple tags. It prioritizes the entries by how many of the query tags they match, and if multiple values are still tied for the best match, it also takes the order of the query tags into account.

It stores the value-tag associations in persistent storage and caches the most frequently queried tags in memory. Both the persistence layer and the cache can be replaced with a custom implementation of a simple interface (Storage). When evaluating a query, tagstash first tries to find the best match in the cache, and only if some tags in the query cannot be found there does it fetch their associations from the persistent storage.
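The cache-first lookup order described above might look roughly like the following. This is a minimal sketch, not tagstash's actual code: `getter`, `mapStore`, and `layeredGet` are hypothetical names, and the local `Entry` type mirrors only the exported fields documented for `tagstash.Entry`. It also assumes that a tag counts as "not found in the cache" when the cache returns no entries for it.

```go
package main

import "fmt"

// Entry mirrors the exported fields documented for tagstash.Entry.
type Entry struct {
	Value    string
	Tag      string
	TagIndex int
}

// getter is a hypothetical, reduced view of the Storage interface: only Get.
type getter interface {
	Get(tags []string) ([]*Entry, error)
}

// mapStore is a toy getter backed by a map from tag to entries.
type mapStore map[string][]*Entry

func (m mapStore) Get(tags []string) ([]*Entry, error) {
	var out []*Entry
	for _, t := range tags {
		out = append(out, m[t]...)
	}
	return out, nil
}

// layeredGet asks the cache first and queries the persistent storage only
// for the tags that the cache returned no entries for.
func layeredGet(cache, storage getter, tags []string) ([]*Entry, error) {
	cached, err := cache.Get(tags)
	if err != nil {
		return nil, err
	}

	found := make(map[string]bool)
	for _, e := range cached {
		found[e.Tag] = true
	}

	var missing []string
	for _, t := range tags {
		if !found[t] {
			missing = append(missing, t)
		}
	}

	if len(missing) == 0 {
		return cached, nil
	}

	stored, err := storage.Get(missing)
	if err != nil {
		return nil, err
	}

	return append(cached, stored...), nil
}

func main() {
	cache := mapStore{"foo": {{Value: "page1", Tag: "foo"}}}
	storage := mapStore{
		"foo": {{Value: "page1", Tag: "foo"}},
		"qux": {{Value: "page2", Tag: "qux"}},
	}

	entries, err := layeredGet(cache, storage, []string{"foo", "qux"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	for _, e := range entries {
		fmt.Println(e.Tag, e.Value)
	}
}
```

In this sketch only "qux" reaches the persistent storage, because "foo" is already answered by the cache.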

Example
package main

import (
	"fmt"
	"log"

	"github.com/aryszka/tagstash"
)

func main() {
	stash, err := tagstash.New(tagstash.Options{
		CacheOptions: tagstash.CacheOptions{
			CacheSize: 1 << 12,
		},
	})

	if err != nil {
		log.Fatal(err)
	}

	stash.Set("https://www.example.org/page1.html", "foo", "bar", "baz")
	stash.Set("https://www.example.org/page2.html", "foo", "qux", "quux")

	if u, err := stash.Get("qux", "foo", "wah"); err != nil {
		fmt.Printf("error: %v", err)
	} else {
		fmt.Printf("found: %s", u)
	}

}
Output:

found: https://www.example.org/page2.html

Constants

const (

	// DefaultDriverName is used as the default sql driver (sqlite3).
	DefaultDriverName = sqlite

	// DefaultDataSourceName is used as the default data source (data.sqlite).
	DefaultDataSourceName = "data.sqlite"
)

Variables

var (
	// ErrDamagedCacheData is returned when the cache detects damaged data.
	ErrDamagedCacheData = errors.New("damaged cache data")

	// ErrFailedToCacheEntry is returned when caching an entry failed, e.g. due to oversize.
	ErrFailedToCacheEntry = errors.New("failed to cache entry")
)
var ErrNotSupported = errors.New("not supported")

ErrNotSupported is returned when a feature is not supported by the current implementation, for example when the storage doesn't support lookup by value.

Types

type CacheOptions

type CacheOptions struct {

	// CacheSize defines the maximum memory usage of the cache. Defaults to 1G.
	CacheSize int

	// ExpectedItemSize provides a hint for the cache about the expected median size of the stored values.
	//
	// This option exists only for optimization, and there is no good rule of thumb. Values that are too high
	// result in worse memory utilization, while values that are too low may affect individual lookup
	// performance. Generally, it is better to err on the side of smaller values.
	ExpectedItemSize int
}

CacheOptions are used by the default cache implementation.

type Entry

type Entry struct {

	// Value that a tag belongs to.
	Value string

	// Tag associated with a value.
	Tag string

	// TagIndex marks how strongly a tag describes a value.
	TagIndex int
	// contains filtered or unexported fields
}

Entry represents a value-tag association.

type Options

type Options struct {

	// Custom storage implementation. By default, a builtin storage is used.
	Storage Storage

	// Custom cache implementation. By default, a builtin cache is used.
	Cache Storage

	// StorageOptions define options for the default persistent storage implementation when not replaced by a
	// custom storage.
	StorageOptions StorageOptions

	// CacheOptions define options for the default cache implementation when not replaced by a custom
	// cache.
	CacheOptions CacheOptions
}

Options are used to initialize tagstash.

type Storage

type Storage interface {

	// Get returns all entries whose tag is listed in the arguments.
	Get([]string) ([]*Entry, error)

	// Set stores a value-tag association. Implementations must make sure that the value-tag combinations
	// are unique.
	Set(*Entry) error

	// Remove deletes a single value-tag association.
	Remove(*Entry) error

	// Delete deletes all associations with the provided tag.
	Delete(string) error

	// Close releases any resources taken by the storage implementation.
	Close()
}

Storage implementations store value-tag associations.
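As an illustration of what a custom implementation could look like, here is a minimal in-memory sketch. To keep it self-contained, it declares local copies of `Entry` (exported fields only) and `Storage` that mirror the documented declarations; `memStorage` and `newMemStorage` are hypothetical names. It keeps value-tag combinations unique by using nested maps, which is one possible way to meet the requirement stated for Set.

```go
package main

import (
	"fmt"
	"sync"
)

// Entry mirrors the exported fields documented for tagstash.Entry.
type Entry struct {
	Value    string
	Tag      string
	TagIndex int
}

// Storage mirrors the documented tagstash.Storage interface.
type Storage interface {
	Get([]string) ([]*Entry, error)
	Set(*Entry) error
	Remove(*Entry) error
	Delete(string) error
	Close()
}

// memStorage keeps associations as tag -> value -> tag index, so every
// value-tag combination can exist only once.
type memStorage struct {
	mu   sync.Mutex
	tags map[string]map[string]int
}

func newMemStorage() *memStorage {
	return &memStorage{tags: make(map[string]map[string]int)}
}

// Get returns all entries whose tag is listed in tags.
func (s *memStorage) Get(tags []string) ([]*Entry, error) {
	s.mu.Lock()
	defer s.mu.Unlock()

	var entries []*Entry
	for _, t := range tags {
		for v, i := range s.tags[t] {
			entries = append(entries, &Entry{Value: v, Tag: t, TagIndex: i})
		}
	}

	return entries, nil
}

// Set stores or overwrites a single value-tag association.
func (s *memStorage) Set(e *Entry) error {
	s.mu.Lock()
	defer s.mu.Unlock()

	if s.tags[e.Tag] == nil {
		s.tags[e.Tag] = make(map[string]int)
	}

	s.tags[e.Tag][e.Value] = e.TagIndex
	return nil
}

// Remove deletes a single value-tag association.
func (s *memStorage) Remove(e *Entry) error {
	s.mu.Lock()
	defer s.mu.Unlock()

	delete(s.tags[e.Tag], e.Value)
	return nil
}

// Delete deletes all associations with the provided tag.
func (s *memStorage) Delete(tag string) error {
	s.mu.Lock()
	defer s.mu.Unlock()

	delete(s.tags, tag)
	return nil
}

// Close has nothing to release for an in-memory map.
func (s *memStorage) Close() {}

// Compile-time check that memStorage satisfies the interface.
var _ Storage = (*memStorage)(nil)

func main() {
	s := newMemStorage()
	defer s.Close()

	s.Set(&Entry{Value: "page1", Tag: "foo", TagIndex: 0})
	s.Set(&Entry{Value: "page1", Tag: "foo", TagIndex: 0}) // same pair, stored once
	s.Set(&Entry{Value: "page2", Tag: "foo", TagIndex: 0})

	entries, _ := s.Get([]string{"foo"})
	fmt.Println(len(entries)) // prints 2
}
```

With the real package, a type like this would be written against `tagstash.Entry` and passed in the `Storage` or `Cache` field of `Options`.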

type StorageOptions

type StorageOptions struct {

	// DriverName specifies which database driver to use. Currently supported: postgres, sqlite3. The
	// default value is sqlite3.
	DriverName string

	// DataSourceName specifies the data source for the storage. For PostgreSQL, it is the PostgreSQL
	// connection string, while for sqlite3, it is a path to a new or existing file. When not
	// specified and the driver is sqlite3, ./data.sqlite will be used.
	//
	// When PostgreSQL is used, please refer to the driver implementation's documentation for configuration
	// details: https://github.com/lib/pq.
	DataSourceName string
}

StorageOptions are used by the default storage implementation.

type TagLookup

type TagLookup interface {
	GetTags(string) ([]string, error)
}

TagLookup, when implemented by a storage, can return all tags associated with a value.
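Optional interfaces like this are commonly detected with a type assertion. The sketch below shows that pattern under stated assumptions: `plainStore`, `indexedStore`, and `tagsOf` are hypothetical, the local `ErrNotSupported` and `TagLookup` stand in for the package's own declarations, and nothing here claims to be how tagstash performs the check internally.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotSupported stands in for tagstash.ErrNotSupported in this sketch.
var ErrNotSupported = errors.New("not supported")

// TagLookup mirrors the documented interface.
type TagLookup interface {
	GetTags(string) ([]string, error)
}

// plainStore is a hypothetical storage without lookup by value.
type plainStore struct{}

// indexedStore is a hypothetical storage that also keeps a value -> tags index.
type indexedStore struct {
	byValue map[string][]string
}

// GetTags makes indexedStore satisfy TagLookup.
func (s *indexedStore) GetTags(value string) ([]string, error) {
	return s.byValue[value], nil
}

// tagsOf returns the tags of a value when the storage implements TagLookup,
// and ErrNotSupported otherwise.
func tagsOf(storage interface{}, value string) ([]string, error) {
	lookup, ok := storage.(TagLookup)
	if !ok {
		return nil, ErrNotSupported
	}

	return lookup.GetTags(value)
}

func main() {
	indexed := &indexedStore{
		byValue: map[string][]string{"page1": {"foo", "bar"}},
	}

	tags, err := tagsOf(indexed, "page1")
	fmt.Println(tags, err)

	tags, err = tagsOf(&plainStore{}, "page1")
	fmt.Println(tags, err)
}
```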

type TagStash

type TagStash struct {
	// contains filtered or unexported fields
}

TagStash is used to store tags associated with values and return the best matching value for a set of query tags.

func New

func New(o Options) (*TagStash, error)

New creates and initializes a tagstash instance.

func (*TagStash) Close

func (t *TagStash) Close()

Close releases all resources.

func (*TagStash) Delete

func (t *TagStash) Delete(tag string) error

Delete deletes all associations of a tag.

func (*TagStash) Get

func (t *TagStash) Get(tags ...string) (string, error)

Get returns the best matching value for a set of tags. When there are overlapping tags and values, it first prioritizes the values that match more tags from the arguments. Among matches with the same number of matching tags, it prioritizes those whose tag order more closely matches the order of the tags in the arguments. The tag order means the order of the tags at the time of definition (Set()).

func (*TagStash) GetAll

func (t *TagStash) GetAll(tags ...string) ([]string, error)

GetAll returns all matches for a set of tags, sorted by the same rules that are used for prioritization when calling Get().
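The prioritization shared by Get and GetAll could be sketched as follows. The first rule (more matching tags wins) comes from the documentation above. The tie-breaker is an assumption made for illustration: it sums the distance between each tag's position in the query and its `TagIndex`, and prefers the smaller sum. The package may measure "closeness" of tag order differently. `score` and `rank` are hypothetical names, and `Entry` mirrors only the documented exported fields.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry mirrors the exported fields documented for tagstash.Entry.
type Entry struct {
	Value    string
	Tag      string
	TagIndex int
}

// score collects the ranking inputs for one value.
type score struct {
	value    string
	matches  int // number of query tags the value is tagged with
	distance int // assumed tie-breaker: summed |query position - TagIndex|
}

func abs(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

// rank orders values by matching tag count (descending), then by the assumed
// order distance (ascending), then by value to keep the result deterministic.
func rank(entries []Entry, query []string) []string {
	pos := make(map[string]int, len(query))
	for i, t := range query {
		pos[t] = i
	}

	byValue := make(map[string]*score)
	for _, e := range entries {
		qi, ok := pos[e.Tag]
		if !ok {
			continue // tag is not part of the query
		}

		s := byValue[e.Value]
		if s == nil {
			s = &score{value: e.Value}
			byValue[e.Value] = s
		}

		s.matches++
		s.distance += abs(qi - e.TagIndex)
	}

	scores := make([]*score, 0, len(byValue))
	for _, s := range byValue {
		scores = append(scores, s)
	}

	sort.Slice(scores, func(i, j int) bool {
		a, b := scores[i], scores[j]
		if a.matches != b.matches {
			return a.matches > b.matches
		}
		if a.distance != b.distance {
			return a.distance < b.distance
		}
		return a.value < b.value
	})

	values := make([]string, len(scores))
	for i, s := range scores {
		values[i] = s.value
	}

	return values
}

func main() {
	entries := []Entry{
		{"page1", "foo", 0}, {"page1", "bar", 1}, {"page1", "baz", 2},
		{"page2", "foo", 0}, {"page2", "qux", 1}, {"page2", "quux", 2},
	}

	// page2 matches two query tags, page1 only one.
	fmt.Println(rank(entries, []string{"qux", "foo", "wah"})) // prints [page2 page1]
}
```

In this sketch the first element of the result plays the role of Get's return value, and the whole slice plays the role of GetAll's.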

func (*TagStash) GetTags

func (t *TagStash) GetTags(value string) ([]string, error)

GetTags returns the tags associated with the provided value or ErrNotSupported if the storage implementation doesn't support this query.

func (*TagStash) Remove

func (t *TagStash) Remove(value string, tag string) error

Remove deletes a value-tag association.

func (*TagStash) Set

func (t *TagStash) Set(value string, tags ...string) error

Set stores tags associated with a value. The order of the tags is taken into account when there are overlapping matches during retrieval.

Directories

Path Synopsis
Package sql contains generated files, not stored in the repository.
