moss

package module
v0.0.0-...-d75712e Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 9, 2017 License: Apache-2.0 Imports: 19 Imported by: 0

README

moss

moss provides a simple, fast, persistable, ordered key-val collection implementation as a 100% golang library.

moss stands for "memory-oriented sorted segments".

Build Status Coverage Status GoDoc Go Report Card

Features

  • ordered key-val collection API
  • 100% go implementation
  • key range iterators
  • snapshots provide for isolated reads
  • atomic mutations via a batch API
  • merge operations allow for read-compute-write optimizations for write-heavy use cases (e.g., updating counters)
  • concurrent readers and writers don't block each other
  • optional, advanced API's to avoid extra memory copying
  • optional lower-level storage implementation, called "mossStore", that uses an append-only design for writes and mmap() for reads, with configurable compaction policy; see: OpenStoreCollection()
  • mossStore supports navigating back through previous commit points in read-only fashion, and supports reverting to previous commit points.
  • optional persistence hooks to allow write-back caching to a lower-level storage implementation that advanced users may wish to provide (e.g., you can hook moss up to leveldb, sqlite, etc)
  • event callbacks allow the monitoring of asynchronous tasks
  • unit tests
  • fuzz tests via go-fuzz & smat (github.com/mschoch/smat); see README-smat.md

License

Apache 2.0

Example

import github.com/couchbase/moss

c, err := moss.NewCollection(moss.CollectionOptions{})
defer c.Close()

batch, err := c.NewBatch(0, 0)
defer batch.Close()

batch.Set([]byte("car-0"), []byte("tesla"))
batch.Set([]byte("car-1"), []byte("honda"))

err = c.ExecuteBatch(batch, moss.WriteOptions{})

ss, err := c.Snapshot()
defer ss.Close()

ropts := moss.ReadOptions{}

val0, err := ss.Get([]byte("car-0"), ropts) // val0 == []byte("tesla").
valX, err := ss.Get([]byte("car-not-there"), ropts) // valX == nil.

For persistence, you can use...

store, collection, err := moss.OpenStoreCollection(directoryPath,
    moss.StoreOptions{}, moss.StorePersistOptions{})

Design

The design is similar to a (much) simplified LSM tree, with a stack of sorted, immutable key-val arrays or "segments".

To incorporate the next Batch of key-val mutations, the incoming key-val entries are first sorted into an immutable "segment", which is then atomically pushed onto the top of the stack of segments.

For readers, a higher segment in the stack will shadow entries of the same key from lower segments.

Separately, an asynchronous goroutine (the "merger") will continuously merge N sorted segments to keep stack height low.

In the best case, a remaining, single, large sorted segment will be efficient in memory usage and efficient for binary search and range iteration.

Iterations when the stack height is > 1 are implementing using a N-way heap merge.

In this design, the stack of segments is treated as immutable via a copy-on-write approach whenever the stack needs to be "modified". So, multiple readers and writers won't block each other, and taking a Snapshot is also a similarly cheap operation by cloning the stack.

See also the DESIGN.md writeup.

Limitations and considerations

NOTE: Keys in a Batch must be unique. That is, myBatch.Set("x", "foo"); myBatch.Set("x", "bar") is not supported. Applications that do not naturally meet this requirement might maintain their own map[key]val data structures to ensure this uniqueness constraint.

Max key length is 2^24 (24 bits used to track key length).

Max val length is 2^28 (28 bits used to track val length).

Metadata overhead for each key-val operation is 16 bytes.

Read performance characterization is roughly O(log N) for key-val retrieval.

Write performance characterization is roughly O(M log M), where M is the number of mutations in a batch when invoking ExecuteBatch().

Those performance characterizations, however, don't account for background, asynchronous processing for the merging of segments and data structure maintenance.

A background merger task, for example, that is too slow can eventually stall ingest of new batches. (See the CollectionOptions settings that limit segment stack height.)

As another example, one slow reader that holds onto a Snapshot or onto an Iterator for a long time can hold onto a lot of resources. Worst case is the reader's Snapshot or Iterator may delay the reclaimation of large, old segments, where incoming mutations have obsoleted the immutable segments that the reader is still holding onto.

Performance

Please try go test -bench=. for some basic performance tests.

Contributing changes

Please see the CONTRIBUTING.md document.

Documentation

Overview

Package moss stands for "memory-oriented sorted segments", and provides a data structure that manages an ordered Collection of key-val entries, with optional persistence.

The design is similar to a simplified LSM tree (log structured merge tree), but is more like a "LSM array", in that a stack of immutable, sorted key-val arrays or "segments" is maintained. When there's an incoming Batch of key-val mutations (see: ExecuteBatch()), the Batch, which is an array of key-val mutations, is sorted in-place and becomes an immutable "segment". Then, the segment is atomically pushed onto a stack of segment pointers. A higher segment in the stack will shadow mutations of the same key from lower segments.

Separately, an asynchronous goroutine (the "merger") will continuously merge N sorted segments to keep stack height low.

In the best case, a remaining, single, large sorted segment will be efficient in memory usage and efficient for binary search and range iteration.

Iterations when the stack height is > 1 are implementing using a N-way heap merge.

A Batch and a segment is actually two arrays: a byte array of contiguous key-val entries; and an uint64 array of entry offsets and key-val lengths that refer to the previous key-val entries byte array.

In this design, stacks are treated as immutable via a copy-on-write approach whenever a stack is "modified". So, readers and writers essentially don't block each other, and taking a Snapshot is also a relatively simple operation of atomically cloning the stack of segment pointers.

Of note: mutations are only supported through Batch operations, which acknowledges the common practice of using batching to achieve higher write performance and embraces it. Additionally, higher performance can be attained by using the batch memory pre-allocation parameters and the Batch.Alloc() API, allowing applications to serialize keys and vals directly into memory maintained by a batch, which can avoid extra memory copying.

IMPORTANT: The keys in a Batch must be unique. That is, myBatch.Set("x", "foo"); myBatch.Set("x", "bar") is not supported. Applications that do not naturally meet this requirement might maintain their own map[key]val data structures to ensure this uniqueness constraint.

An optional, asynchronous persistence goroutine (the "persister") can drain mutations to a lower level, ordered key-value storage layer. An optional, built-in storage layer ("mossStore") is available, that will asynchronously write segments to the end of a file (append only design), with reads performed using mmap(), and with user controllable compaction configuration. See: OpenStoreCollection().

NOTE: the mossStore persistence design does not currently support moving files created on one machine endian'ness type to another machine with a different endian'ness type.

Index

Constants

View Source
const OperationDel = uint64(0x0200000000000000)

OperationDel removes the value associated with the key.

View Source
const OperationMerge = uint64(0x0300000000000000)

OperationMerge merges the new value with the existing value associated with the key, as described by the configured MergeOperator.

View Source
const OperationSet = uint64(0x0100000000000000)

OperationSet replaces the value associated with the key.

Variables

View Source
var AllocationGranularity = STORE_PAGE_SIZE
View Source
var BASIC_SEGMENT_KIND = "a"

BASIC_SEGMENT_KIND is the code for a basic, persistable segment implementation, which represents a segment as two arrays: an array of contiguous key-val bytes [key0, val0, key1, val1, ... keyN, valN], and an array of offsets plus lengths into the first array.

View Source
var CompactionAllow = CompactionConcern(1)

CompactionAllow means compaction decision is automated and based on the configed policy and parameters, such as CompactionPercentage.

View Source
var CompactionDisable = CompactionConcern(0)

CompactionDisable means no compaction.

View Source
var CompactionForce = CompactionConcern(2)

CompactionForce means compaction should be performed immediately.

View Source
var DefaultCollectionOptions = CollectionOptions{
	MergeOperator:          nil,
	MinMergePercentage:     0.8,
	MaxPreMergerBatches:    10,
	MergerCancelCheckEvery: 1024 * 1024,
	Debug:                  0,
	Log:                    nil,
}

DefaultCollectionOptions are the default configuration options.

View Source
var DefaultNaiveSeekToMaxTries = 100

DefaultNaiveSeekToMaxTries is the max number of attempts a forward iterator.SeekTo() will loop using simple Next()'s before giving up and starting a binary search for a given, forward seekToKey.

View Source
var DefaultStoreOptions = StoreOptions{
	CompactionBufferPages: 512,
}

DefaultStoreOptions are the default store options when the application hasn't provided a meaningful configuration value.

View Source
var ErrAllocTooLarge = errors.New("alloc-too-large")

ErrAllocTooLarge is returned when the requested allocation cannot be satisfied by the pre-allocated buffer.

View Source
var ErrAlreadyInitialized = errors.New("already-initialized")

ErrAlreadyInitialized is returned when initialization was attempted on an already initialized object.

View Source
var ErrCanceled = errors.New("canceled")

ErrCanceled is used when an operation has been canceled.

View Source
var ErrClosed = errors.New("closed")

ErrClosed is returned when the collection is already closed.

View Source
var ErrIteratorDone = errors.New("iterator-done")

ErrIteratorDone is returned when the iterator has reached the end range of the iterator or the end of the collection.

View Source
var ErrMaxTries = errors.New("max-tries")

ErrMaxTries is returned when a max number of tries or attempts for some operation has been reached.

View Source
var ErrMergeOperatorFullMergeFailed = errors.New("merge-operator-full-merge-failed")

ErrMergeOperatorFullMergeFailed is returned when the provided MergeOperator fails during the FullMerge operations.

View Source
var ErrMergeOperatorNil = errors.New("merge-operator-nil")

ErrMergeOperatorNil is returned if a merge operation is performed without specifying a MergeOperator in the CollectionOptions.

View Source
var ErrNoValidFooter = errors.New("no-valid-footer")

ErrNoValidFooter is returned when a valid footer could not be found in a file.

View Source
var ErrUnexpected = errors.New("unexpected")

ErrUnexpected is returned on an unexpected situation.

View Source
var ErrUnimplemented = errors.New("unimplemented")

ErrUnimplemented is returned when an unimplemented feature has been used.

View Source
var EventKindBatchExecute = EventKind(6)

EventKindBatchExecute is fired when a collection has finished executing a batch.

View Source
var EventKindBatchExecuteStart = EventKind(5)

EventKindBatchExecuteStart is fired when a collection is starting to execute a batch.

View Source
var EventKindClose = EventKind(2)

EventKindClose is fired when a collection has been fully closed.

View Source
var EventKindCloseStart = EventKind(1)

EventKindCloseStart is fired when a collection.Close() has begun. The closing might take awhile to complete and an EventKindClose will follow later.

View Source
var EventKindMergerProgress = EventKind(3)

EventKindMergerProgress is fired when the merger has completed a round of merge processing.

View Source
var EventKindPersisterProgress = EventKind(4)

EventKindPersisterProgress is fired when the persister has completed a round of persistence processing.

View Source
var STORE_ENDIAN = binary.LittleEndian
View Source
var STORE_MAGIC_BEG []byte = []byte("0m1o2s")
View Source
var STORE_MAGIC_END []byte = []byte("3s4p5s")
View Source
var STORE_PAGE_SIZE = 4096
View Source
var STORE_PREFIX = "data-" // File name prefix.
View Source
var STORE_SUFFIX = ".moss" // File name suffix.
View Source
var STORE_VERSION = uint32(3)

STORE_VERSION must be bumped whenever the file format changes.

View Source
var SegmentLoaders = map[string]SegmentLoaderFunc{}

SegmentLoaders is a registry of available segment loaders, which should be immutable after process init()'ialization. It is keyed by SegmentLoc.Kind.

Functions

func ByteSliceToUint64Slice

func ByteSliceToUint64Slice(in []byte) ([]uint64, error)

ByteSliceToUint64Slice gives access to []byte as []uint64. By default, an efficient O(1) implementation of this function is used, but which requires the unsafe package. See the "safe" build tag to use an O(N) implementation that does not need the unsafe package.

func FormatFName

func FormatFName(seq int64) string

FormatFName returns a file name like "data-000123.moss" given a seq of 123.

func NewBufferedSectionWriter

func NewBufferedSectionWriter(w io.WriterAt, begPos, maxBytes int64,
	bufSize int) *bufferedSectionWriter

NewBufferedSectionWriter converts incoming Write() requests into buffered, asynchronous WriteAt()'s in a section of a file.

func NewSnapshotWrapper

func NewSnapshotWrapper(ss Snapshot, closer io.Closer) *snapshotWrapper

NewSnapshotWrapper creates a wrapper which provides ref-counting around a snapshot. The snapshot (and an optional io.Closer) will be closed when the ref-count reaches zero.

func OpenStoreCollection

func OpenStoreCollection(dir string,
	options StoreOptions,
	persistOptions StorePersistOptions) (*Store, Collection, error)

OpenStoreCollection returns collection based on a persisted store in a directory. Updates to the collection will be persisted. An empty directory starts an empty collection. Both the store and collection should be closed by the caller when done.

func ParseFNameSeq

func ParseFNameSeq(fname string) (int64, error)

ParseFNameSeq parses a file name like "data-000123.moss" into 123.

func ToOsFile

func ToOsFile(f File) *os.File

ToOsFile provides the underlying os.File for a File, if available.

func Uint64SliceToByteSlice

func Uint64SliceToByteSlice(in []uint64) ([]byte, error)

Uint64SliceToByteSlice gives access to []uint64 as []byte. By default, an efficient O(1) implementation of this function is used, but which requires the unsafe package. See the "safe" build tag to use an O(N) implementation that does not need the unsafe package.

Types

type Batch

type Batch interface {
	// Close must be invoked to release resources.
	Close() error

	// Set creates or updates an key-val entry in the Collection.  The
	// key must be unique (not repeated) within the Batch.  Set()
	// copies the key and val bytes into the Batch, so the memory
	// bytes of the key and val may be reused by the caller.
	Set(key, val []byte) error

	// Del deletes a key-val entry from the Collection.  The key must
	// be unique (not repeated) within the Batch.  Del copies the key
	// bytes into the Batch, so the memory bytes of the key may be
	// reused by the caller.  Del() on a non-existent key results in a
	// nil error.
	Del(key []byte) error

	// Merge creates or updates a key-val entry in the Collection via
	// the MergeOperator defined in the CollectionOptions.  The key
	// must be unique (not repeated) within the Batch.  Merge() copies
	// the key and val bytes into the Batch, so the memory bytes of
	// the key and val may be reused by the caller.
	Merge(key, val []byte) error

	// Alloc provides a slice of bytes "owned" by the Batch, to reduce
	// extra copying of memory.  See the Collection.NewBatch() method.
	Alloc(numBytes int) ([]byte, error)

	// AllocSet is like Set(), but the caller must provide []byte
	// parameters that came from Alloc().
	AllocSet(keyFromAlloc, valFromAlloc []byte) error

	// AllocDel is like Del(), but the caller must provide []byte
	// parameters that came from Alloc().
	AllocDel(keyFromAlloc []byte) error

	// AllocMerge is like Merge(), but the caller must provide []byte
	// parameters that came from Alloc().
	AllocMerge(keyFromAlloc, valFromAlloc []byte) error
}

A Batch is a set of mutations that will be incorporated atomically into a Collection. NOTE: the keys in a Batch must be unique.

Concurrent Batch's are allowed, but to avoid races, concurrent Batches should only be used by concurrent goroutines that can ensure the mutation keys are partitioned or non-overlapping between Batch instances.

type Collection

type Collection interface {
	// Start kicks off required background tasks.
	Start() error

	// Close synchronously stops background tasks and releases
	// resources.
	Close() error

	// Options returns the options currently being used.
	Options() CollectionOptions

	// Snapshot returns a stable Snapshot of the key-value entries.
	Snapshot() (Snapshot, error)

	// NewBatch returns a new Batch instance with preallocated
	// resources.  See the Batch.Alloc() method.
	NewBatch(totalOps, totalKeyValBytes int) (Batch, error)

	// ExecuteBatch atomically incorporates the provided Batch into
	// the Collection.  The Batch instance should be Close()'ed and
	// not reused after ExecuteBatch() returns.
	ExecuteBatch(b Batch, writeOptions WriteOptions) error

	// Stats returns stats for this collection.
	Stats() (*CollectionStats, error)
}

A Collection represents an ordered mapping of key-val entries, where a Collection is snapshot'able and atomically updatable.

func NewCollection

func NewCollection(options CollectionOptions) (
	Collection, error)

NewCollection returns a new, unstarted Collection instance.

type CollectionOptions

type CollectionOptions struct {
	// MergeOperator is an optional func provided by an application
	// that wants to use Batch.Merge()'ing.
	MergeOperator MergeOperator `json:"-"`

	// DeferredSort allows ExecuteBatch() to operate more quickly by
	// deferring the sorting of an incoming batch until it is needed
	// by a reader.  The tradeoff is that later read operations can
	// take longer as the sorting is finally performed.
	DeferredSort bool

	// MinMergePercentage allows the merger to avoid premature merging
	// of segments that are too small, where a segment X has to reach
	// a certain size percentage compared to the next lower segment
	// before segment X (and all segments above X) will be merged.
	MinMergePercentage float64

	// MaxPreMergerBatches is the max number of batches that can be
	// accepted into the collection through ExecuteBatch() and held
	// for merging but that have not been actually processed by the
	// merger yet.  When the number of held but unprocessed batches
	// reaches MaxPreMergerBatches, then ExecuteBatch() will block to
	// allow the merger to catch up.
	MaxPreMergerBatches int

	// MergerCancelCheckEvery is the number of ops the merger will
	// perform before it checks to see if a merger operation was
	// canceled.
	MergerCancelCheckEvery int

	// MaxDirtyOps, when greater than zero, is the max number of dirty
	// (unpersisted) ops allowed before ExecuteBatch() blocks to allow
	// the persister to catch up.  It only has effect with a non-nil
	// LowerLevelUpdate.
	MaxDirtyOps uint64

	// MaxDirtyKeyValBytes, when greater than zero, is the max number
	// of dirty (unpersisted) key-val bytes allowed before
	// ExecuteBatch() blocks to allow the persister to catch up.  It
	// only has effect with a non-nil LowerLevelUpdate.
	MaxDirtyKeyValBytes uint64

	// CachePersisted allows the collection to cache clean, persisted
	// key-val's, and is considered when LowerLevelUpdate is used.
	CachePersisted bool

	// LowerLevelInit is an optional Snapshot implementation that
	// initializes the lower-level storage of a Collection.  This
	// might be used, for example, for having a Collection be a
	// write-back cache in front of a persistent implementation.
	LowerLevelInit Snapshot `json:"-"`

	// LowerLevelUpdate is an optional func that is invoked when the
	// lower-level storage should be updated.
	LowerLevelUpdate LowerLevelUpdate `json:"-"`

	Debug int // Higher means more logging, when Log != nil.

	// Log is a callback invoked when the Collection needs to log a
	// debug message.  Optional, may be nil.
	Log func(format string, a ...interface{}) `json:"-"`

	// OnError is an optional callback invoked when the Collection
	// encounters an error.
	OnError func(error) `json:"-"`

	// OnEvent is an optional callback invoked on Collection related
	// processing events.  If the application's callback
	// implementation blocks, it may pause processing and progress,
	// depending on the type of callback event kind.
	OnEvent func(event Event) `json:"-"`
}

CollectionOptions allows applications to specify config settings.

type CollectionStats

type CollectionStats struct {
	TotOnError uint64

	TotCloseBeg           uint64
	TotCloseMergerDone    uint64
	TotClosePersisterDone uint64
	TotCloseLowerLevelBeg uint64
	TotCloseLowerLevelEnd uint64
	TotCloseEnd           uint64

	TotSnapshotBeg         uint64
	TotSnapshotInternalBeg uint64
	TotSnapshotInternalEnd uint64
	TotSnapshotEnd         uint64

	TotNewBatch                 uint64
	TotNewBatchTotalOps         uint64
	TotNewBatchTotalKeyValBytes uint64

	TotExecuteBatchBeg            uint64
	TotExecuteBatchErr            uint64
	TotExecuteBatchEmpty          uint64
	TotExecuteBatchWaitBeg        uint64
	TotExecuteBatchWaitEnd        uint64
	TotExecuteBatchAwakeMergerBeg uint64
	TotExecuteBatchAwakeMergerEnd uint64
	TotExecuteBatchEnd            uint64

	TotNotifyMergerBeg uint64
	TotNotifyMergerEnd uint64

	TotMergerEnd                  uint64
	TotMergerLoop                 uint64
	TotMergerLoopRepeat           uint64
	TotMergerAll                  uint64
	TotMergerInternalBeg          uint64
	TotMergerInternalErr          uint64
	TotMergerInternalEnd          uint64
	TotMergerInternalSkip         uint64
	TotMergerLowerLevelNotify     uint64
	TotMergerLowerLevelNotifySkip uint64

	TotMergerWaitIncomingBeg  uint64
	TotMergerWaitIncomingStop uint64
	TotMergerWaitIncomingEnd  uint64
	TotMergerWaitIncomingSkip uint64

	TotMergerWaitOutgoingBeg  uint64
	TotMergerWaitOutgoingStop uint64
	TotMergerWaitOutgoingEnd  uint64
	TotMergerWaitOutgoingSkip uint64

	TotPersisterLoop       uint64
	TotPersisterLoopRepeat uint64
	TotPersisterWaitBeg    uint64
	TotPersisterWaitEnd    uint64
	TotPersisterEnd        uint64

	TotPersisterLowerLevelUpdateBeg uint64
	TotPersisterLowerLevelUpdateErr uint64
	TotPersisterLowerLevelUpdateEnd uint64

	CurDirtyOps      uint64
	CurDirtyBytes    uint64
	CurDirtySegments uint64

	CurDirtyTopOps      uint64
	CurDirtyTopBytes    uint64
	CurDirtyTopSegments uint64

	CurDirtyMidOps      uint64
	CurDirtyMidBytes    uint64
	CurDirtyMidSegments uint64

	CurDirtyBaseOps      uint64
	CurDirtyBaseBytes    uint64
	CurDirtyBaseSegments uint64

	CurCleanOps      uint64
	CurCleanBytes    uint64
	CurCleanSegments uint64
}

CollectionStats fields that are prefixed like CurXxxx are gauges (can go up and down), and fields that are prefixed like TotXxxx are monotonically increasing counters.

func (*CollectionStats) AtomicCopyTo

func (s *CollectionStats) AtomicCopyTo(r *CollectionStats)

AtomicCopyTo copies stats from s to r (from source to result).

type CompactionConcern

type CompactionConcern int // See StorePersistOptions.CompactionConcern.

type EntryEx

type EntryEx struct {
	// Operation is an OperationXxx const.
	Operation uint64
}

EntryEx provides extra, advanced information about an entry from the Iterator.CurrentEx() method.

type Event

type Event struct {
	Kind       EventKind
	Collection Collection
	Duration   time.Duration
}

Event represents the information provided in an OnEvent() callback.

type EventKind

type EventKind int

EventKind represents an event code for OnEvent() callbacks.

type File

type File interface {
	io.ReaderAt
	io.WriterAt
	io.Closer
	Stat() (os.FileInfo, error)
	Sync() error
	Truncate(size int64) error
}

The File interface is implemented by os.File. App specific implementations may add concurrency, caching, stats, fuzzing, etc.

type FileRef

type FileRef struct {
	// contains filtered or unexported fields
}

FileRef provides a ref-counting wrapper around a File.

func (*FileRef) AddRef

func (r *FileRef) AddRef() File

AddRef increases the ref-count on the file ref.

func (*FileRef) Close

func (r *FileRef) Close() error

func (*FileRef) DecRef

func (r *FileRef) DecRef() (err error)

DecRef decreases the ref-count on the file ref, and closing the underlying file when the ref-count reaches zero.

func (*FileRef) OnAfterClose

func (r *FileRef) OnAfterClose(cb func())

OnAfterClose registers event callback func's that are invoked after the file is closed.

func (*FileRef) OnBeforeClose

func (r *FileRef) OnBeforeClose(cb func())

OnBeforeClose registers event callback func's that are invoked before the file is closed.

type Footer struct {
	SegmentLocs SegmentLocs // Persisted; older SegmentLoc's come first.
	// contains filtered or unexported fields
}

Footer represents a footer record persisted in a file, and also implements the moss.Snapshot interface.

func ReadFooter

func ReadFooter(options *StoreOptions, file File) (*Footer, error)

ReadFooter reads the last valid Footer from a file.

func ScanFooter

func ScanFooter(options *StoreOptions, fref *FileRef, fileName string,
	pos int64) (*Footer, error)

ScanFooter scans a file backwards from the given pos for a valid Footer, adding ref-counts to fref on success.

func (*Footer) AddRef

func (f *Footer) AddRef()

func (*Footer) Close

func (f *Footer) Close() error

func (*Footer) DecRef

func (f *Footer) DecRef()

func (*Footer) Get

func (f *Footer) Get(key []byte, readOptions ReadOptions) ([]byte, error)

Get retrieves a val from the footer, and will return nil val if the entry does not exist in the footer.

func (*Footer) SegmentStack

func (f *Footer) SegmentStack() (SegmentLocs, *segmentStack)

SegmentStack() returns the current SegmentLocs and segmentStack for a footer, while also incrementing the ref-count on the footer. The caller must DecRef() the footer when done.

func (*Footer) StartIterator

func (f *Footer) StartIterator(startKeyIncl, endKeyExcl []byte,
	iteratorOptions IteratorOptions) (Iterator, error)

StartIterator returns a new Iterator instance on this footer.

On success, the returned Iterator will be positioned so that Iterator.Current() will either provide the first entry in the range or ErrIteratorDone.

A startKeyIncl of nil means the logical "bottom-most" possible key and an endKeyExcl of nil means the logical "top-most" possible key.

type Header struct {
	Version       uint32 // The file format / STORE_VERSION.
	CreatedAt     string
	CreatedEndian string // The endian() of the file creator.
}

Header represents the JSON stored at the head of a file, where the file header bytes should be less than STORE_PAGE_SIZE length.

type InitCloser

type InitCloser interface {
	InitCloser(io.Closer) error
}

An InitCloser holds onto an io.Closer, and is used for chaining io.Closer's. That is, we often want the closing of one resource to close related resources.

type Iterator

type Iterator interface {
	// Close must be invoked to release resources.
	Close() error

	// Next moves the Iterator to the next key-val entry and will
	// return ErrIteratorDone if the Iterator is done.
	Next() error

	// SeekTo moves the Iterator to the lowest key-val entry whose key
	// is >= the given seekToKey, and will return ErrIteratorDone if
	// the Iterator is done.  SeekTo() will respect the
	// startKeyInclusive/endKeyExclusive bounds, if any, that were
	// specified with StartIterator().  Seeking to before the
	// startKeyInclusive will end up on the first key.  Seeking to or
	// after the endKeyExclusive will result in ErrIteratorDone.
	SeekTo(seekToKey []byte) error

	// Current returns ErrIteratorDone if the iterator is done.
	// Otherwise, Current() returns the current key and val, which
	// should be treated as immutable or read-only.  The key and val
	// bytes will remain available until the next call to Next() or
	// Close().
	Current() (key, val []byte, err error)

	// CurrentEx is a more advanced form of Current() that returns
	// more metadata for each entry.  It is more useful when used with
	// IteratorOptions.IncludeDeletions of true.  It returns
	// ErrIteratorDone if the iterator is done.  Otherwise, the
	// current EntryEx, key, val are returned, which should be treated
	// as immutable or read-only.
	CurrentEx() (entryEx EntryEx, key, val []byte, err error)
}

An Iterator allows enumeration of key-val entries.

type IteratorOptions

type IteratorOptions struct {
	// IncludeDeletions is an advanced flag that specifies that an
	// Iterator should include deletion operations in its enuemration.
	// See also the Iterator.CurrentEx() method.
	IncludeDeletions bool

	// SkipLowerLevel is an advanced flag that specifies that an
	// Iterator should not enumerate key-val entries from the
	// optional, chained, lower-level iterator.  See
	// CollectionOptions.LowerLevelInit/LowerLevelUpdate.
	SkipLowerLevel bool

	// MinSegmentLevel is an advanced parameter that specifies that an
	// Iterator should skip segments at a level less than
	// MinSegmentLevel.  MinSegmentLevel is 0-based level, like an
	// array index.
	MinSegmentLevel int

	// MaxSegmentHeight is an advanced parameter that specifies that
	// an Iterator should skip segments at a level >= than
	// MaxSegmentHeight.  MaxSegmentHeight is 1-based height, like an
	// array length.
	MaxSegmentHeight int
	// contains filtered or unexported fields
}

IteratorOptions are provided to StartIterator().

type LowerLevelUpdate

type LowerLevelUpdate func(higher Snapshot) (lower Snapshot, err error)

LowerLevelUpdate is the func callback signature used when a Collection wants to update its optional, lower-level storage.

type MergeOperator

type MergeOperator interface {
	// Name returns an identifier for this merge operator, which might
	// be used for logging / debugging.
	Name() string

	// FullMerge the full sequence of operands on top of an
	// existingValue and returns the merged value.  The existingValue
	// may be nil if no value currently exists.  If full merge cannot
	// be done, return (nil, false).
	FullMerge(key, existingValue []byte, operands [][]byte) ([]byte, bool)

	// Partially merge two operands.  If partial merge cannot be done,
	// return (nil, false), which will defer processing until a later
	// FullMerge().
	PartialMerge(key, leftOperand, rightOperand []byte) ([]byte, bool)
}

A MergeOperator may be implemented by applications that wish to optimize their read-compute-write use cases. Write-heavy counters, for example, could be implemented efficiently by using the MergeOperator functionality.

type MergeOperatorStringAppend

type MergeOperatorStringAppend struct {
	Sep string // The separator string between operands.
	// contains filtered or unexported fields
}

MergeOperatorStringAppend implements a simple merger that appends strings. It was originally built for testing and sample purposes.

func (*MergeOperatorStringAppend) FullMerge

func (mo *MergeOperatorStringAppend) FullMerge(key, existingValue []byte,
	operands [][]byte) ([]byte, bool)

func (*MergeOperatorStringAppend) Name

func (mo *MergeOperatorStringAppend) Name() string

func (*MergeOperatorStringAppend) PartialMerge

func (mo *MergeOperatorStringAppend) PartialMerge(key,
	leftOperand, rightOperand []byte) ([]byte, bool)

type OpenFile

type OpenFile func(name string, flag int, perm os.FileMode) (File, error)

The OpenFile func signature is similar to os.OpenFile().

type OsFile

type OsFile interface {
	OsFile() *os.File
}

OsFile interface allows conversion from a File to an os.File.

type ReadOptions

type ReadOptions struct {
}

ReadOptions are provided to Snapshot.Get().

type Segment

type Segment interface {
	// Returns the kind of segment, used for persistence.
	Kind() string

	// Len returns the number of ops in the segment.
	Len() int

	// NumKeyValBytes returns the number of bytes used for key-val data.
	NumKeyValBytes() (uint64, uint64)

	// FindStartKeyInclusivePos() returns the logical entry position for
	// the given (inclusive) start key.  With segment keys of [b, d, f],
	// looking for 'c' will return 1.  Looking for 'd' will return 1.
	// Looking for 'g' will return 3.  Looking for 'a' will return 0.
	FindStartKeyInclusivePos(startKeyInclusive []byte) int

	// GetOperationKeyVal() returns the operation, key, val for a given
	// logical entry position in the segment.
	GetOperationKeyVal(pos int) (operation uint64, key []byte, val []byte)

	// Returns true if the segment is already sorted, and returns
	// false if the sorting is only asynchronously scheduled.
	RequestSort(synchronous bool) bool
}

A Segment represents the read-oriented interface for a segment.

type SegmentLoaderFunc

type SegmentLoaderFunc func(
	sloc *SegmentLoc, kvs []uint64, buf []byte) (Segment, error)

A SegmentLoaderFunc is able to load a segment from a SegmentLoc.

type SegmentLoc

type SegmentLoc struct {
	Kind string // Used as the key for SegmentLoaders.

	KvsOffset uint64 // Byte offset within the file.
	KvsBytes  uint64 // Number of bytes for the persisted segment.kvs.

	BufOffset uint64 // Byte offset within the file.
	BufBytes  uint64 // Number of bytes for the persisted segment.buf.

	TotOpsSet  uint64
	TotOpsDel  uint64
	TotKeyByte uint64
	TotValByte uint64
	// contains filtered or unexported fields
}

SegmentLoc represents a persisted segment.

func (*SegmentLoc) TotOps

func (sloc *SegmentLoc) TotOps() int

TotOps returns number of ops in a segment loc.

type SegmentLocs

type SegmentLocs []SegmentLoc

func (SegmentLocs) AddRef

func (slocs SegmentLocs) AddRef()

func (SegmentLocs) Close

func (slocs SegmentLocs) Close() error

func (SegmentLocs) DecRef

func (slocs SegmentLocs) DecRef()

type SegmentMutator

type SegmentMutator interface {
	Mutate(operation uint64, key, val []byte) error
}

A SegmentMutator represents the mutation methods of a segment.

type SegmentPersister

type SegmentPersister interface {
	Persist(file File) (SegmentLoc, error)
}

A SegmentPersister represents a segment that can be persisted.

type SegmentStackStats

type SegmentStackStats struct {
	CurOps      uint64
	CurBytes    uint64 // Counts key-val bytes only, not metadata.
	CurSegments uint64
}

SegmentStackStats represents the stats for a segmentStack.

func (*SegmentStackStats) AddTo

func (sss *SegmentStackStats) AddTo(dest *SegmentStackStats)

AddTo adds the values from this SegmentStackStats to the dest SegmentStackStats.

type Snapshot

type Snapshot interface {
	// Close must be invoked to release resources.
	Close() error

	// Get retrieves a val from the Snapshot, and will return nil val
	// if the entry does not exist in the Snapshot.
	Get(key []byte, readOptions ReadOptions) ([]byte, error)

	// StartIterator returns a new Iterator instance on this Snapshot.
	//
	// On success, the returned Iterator will be positioned so that
	// Iterator.Current() will either provide the first entry in the
	// range or ErrIteratorDone.
	//
	// A startKeyInclusive of nil means the logical "bottom-most"
	// possible key and an endKeyExclusive of nil means the logical
	// key that's above the "top-most" possible key.
	StartIterator(startKeyInclusive, endKeyExclusive []byte,
		iteratorOptions IteratorOptions) (Iterator, error)
}

A Snapshot is a stable view of a Collection for readers, isolated from concurrent mutation activity.

type Store

type Store struct {
	// contains filtered or unexported fields
}

Store represents data persisted in a directory.

func OpenStore

func OpenStore(dir string, options StoreOptions) (*Store, error)

OpenStore returns a store instance for a directory. An empty directory results in an empty store.

func (*Store) AddRef

func (s *Store) AddRef()

func (*Store) Close

func (s *Store) Close() error

func (*Store) Dir

func (s *Store) Dir() string

func (*Store) OpenCollection

func (store *Store) OpenCollection(
	options StoreOptions,
	persistOptions StorePersistOptions) (Collection, error)

OpenCollection opens a collection based on a store. Applications should open at most a single collection per store for performing read/write work.

func (*Store) Options

func (s *Store) Options() StoreOptions

func (*Store) Persist

func (s *Store) Persist(higher Snapshot, persistOptions StorePersistOptions) (
	Snapshot, error)

Persist helps the store implement the lower-level-update func. The higher snapshot may be nil.

func (*Store) Snapshot

func (s *Store) Snapshot() (Snapshot, error)

func (*Store) SnapshotPrevious

func (s *Store) SnapshotPrevious(ss Snapshot) (Snapshot, error)

SnapshotPrevious returns the next older, previous snapshot based on a given snapshot, allowing the application to walk backwards into the history of a store at previous points in time. The given snapshot must come from the same store. A nil returned snapshot means no previous snapshot is available. Of note, store compactions will trim previous history from a store.

func (*Store) SnapshotRevert

func (s *Store) SnapshotRevert(revertTo Snapshot) error

SnapshotRevert atomically and durably brings the store back to the point-in-time as represented by the revertTo snapshot. SnapshotRevert() should only be passed a snapshot that came from the same store, such as from using Store.Snapshot() or Store.SnapshotPrevious().

SnapshotRevert() must not be invoked concurrently with Store.Persist(), so it is recommended that SnapshotRevert() should be invoked only after the collection has been Close()'ed, which helps ensure that you are not racing with concurrent, background persistence goroutines.

SnapshotRevert() can fail if the given snapshot is too old, especially w.r.t. compactions. For example, navigate back to an older snapshot X via SnapshotPrevious(). Then, do a full compaction. Then, SnapshotRevert(X) will give an error.

func (*Store) Stats

func (s *Store) Stats() (map[string]interface{}, error)

Stats returns a map of stats.

type StoreOptions

type StoreOptions struct {
	// CollectionOptions should be the same as used with
	// NewCollection().
	CollectionOptions CollectionOptions

	// CompactionPercentage determines when a compaction will run when
	// CompactionConcern is CompactionAllowed.  When the percentage of
	// ops between the non-base level and the base level is greater
	// than CompactionPercentage, then compaction will be run.
	CompactionPercentage float64

	// CompactionBufferPages is the number of pages to use for
	// compaction, where writes are buffered before flushing to disk.
	CompactionBufferPages int

	// CompactionSync of true means perform a file sync at the end of
	// compaction for additional safety.
	CompactionSync bool

	// OpenFile allows apps to optionally provide their own file
	// opening implementation.  When nil, os.OpenFile() is used.
	OpenFile OpenFile `json:"-"`

	// Log is a callback invoked when store needs to log a debug
	// message.  Optional, may be nil.
	Log func(format string, a ...interface{}) `json:"-"`

	// KeepFiles means that unused, obsoleted files will not be
	// removed during OpenStore().  Keeping old files might be useful
	// when diagnosing file corruption cases.
	KeepFiles bool
}

StoreOptions are provided to OpenStore().

type StorePersistOptions

type StorePersistOptions struct {
	// NoSync means do not perform a file sync at the end of
	// persistence (before returning from the Store.Persist() method).
	// Using NoSync of true might provide better performance, but at
	// the cost of data safety.
	NoSync bool

	// CompactionConcern controls whether compaction is allowed or
	// forced as part of persistence.
	CompactionConcern CompactionConcern
}

StorePersistOptions are provided to Store.Persist().

type WriteOptions

type WriteOptions struct {
}

WriteOptions are provided to Collection.ExecuteBatch().

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL