Documentation

Overview

    Package mapio implements a sorted, on-disk map, similar to the SSTable data structure used in Bigtable [1], Cassandra [2], and others. Maps are read-only, and are produced by a Writer. Each Writer expects keys to be appended in lexicographic order. Buf provides a means of buffering writes to be sorted before appended to a Writer.

    Mapio's on-disk layout loosely follows that of LevelDB [3]. Each Map is a sequence of blocks; each block comprises a sequence of entries, followed by a trailer:

    block := blockEntry* blockTrailer
    blockEntry :=
    	nshared:   uvarint           // number of bytes shared with previous key
    	nunshared: uvarint           // number of new bytes in this entry's key
    	nvalue:    uvarint           // number of bytes in value
    	key:       uint8[nunshared]  // the (prefix compressed) key
    	value:     uint8[nvalue]     // the entry's value
    blockTrailer :=
    	restarts:  uint32[nrestart]  // array of key restarts
    	nrestart:  uint32            // size of restart array
    	type:      uint8             // block type (should be 0; reserved for future use)
    	crc32:     uint32            // IEEE crc32 of contents and trailer
    

    Maps prefix compress each key by storing the number of bytes shared with the previous key. Maps contain a number of restart points: points at which the full key is specified (and nshared = 0). The restart point are stored in an array in the block trailer. This array can be used to perform binary search for keys.

    A Map is a sequence of data blocks, followed by an index block, followed by a trailer.

    map := block(data)* block(meta)* block(index) mapTrailer
    mapTrailer :=
    	meta:	blockAddr[20]  // zero-padded address of the meta block index (tbd)
    	index:  blockAddr[20]  // zero-padded address of index
    	magic:	uint64         // magic (0xa8b2374e8558bc76)
    blockAddr :=
    	offset: uvarint        // offset of block in map
    	len:    uvarint        // length of block
    

    The index block contains one entry for each block in the map: each entry's key is the last key in that block; the entry's value is a blockAddr containing the position of that block. This arrangement allows the reader to binary search the index block then search the found block.

    [1] https://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf [2] https://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf [3] https://github.com/google/leveldb

    Index

    Constants

    This section is empty.

    Variables

    This section is empty.

    Functions

    This section is empty.

    Types

    type Buf

    type Buf struct {
    	// contains filtered or unexported fields
    }

      A Buf is an unordered write buffer for maps. It holds entries in memory; these are then sorted and written to a map.

      func (*Buf) Append

      func (b *Buf) Append(key, value []byte)

        Append append the given entry to the buffer.

        func (*Buf) Len

        func (b *Buf) Len() int

          Len implements sort.Interface

          func (*Buf) Less

          func (b *Buf) Less(i, j int) bool

            Less implements sort.Interface

            func (*Buf) Size

            func (b *Buf) Size() int

              Size returns the number size of this buffer in bytes.

              func (*Buf) Swap

              func (b *Buf) Swap(i, j int)

                Swap implements sort.Interface

                func (*Buf) WriteTo

                func (b *Buf) WriteTo(w *Writer) error

                  WriteTo sorts and then writes all of the entries in this buffer to the provided writer.

                  type Map

                  type Map struct {
                  	// contains filtered or unexported fields
                  }

                    Map is a read-only, sorted map backed by an io.ReadSeeker. The on-disk layout of maps are described by the package documentation. Maps support both lookup and (ordered) iteration. A Map instance maintains a current position, starting out at the first entry.

                    func New

                    func New(r io.ReadSeeker) (*Map, error)

                      New opens the map at the provided io.ReadSeeker (usually a file).

                      func (*Map) Seek

                      func (m *Map) Seek(key []byte) *MapScanner

                        Seek returns a map scanner beginning at the first key in the map >= the provided key.

                        type MapScanner

                        type MapScanner struct {
                        	// contains filtered or unexported fields
                        }

                          MapScanner implements ordered iteration over a map.

                          func (*MapScanner) Err

                          func (m *MapScanner) Err() error

                            Err returns the last error encountered while scanning.

                            func (*MapScanner) Key

                            func (m *MapScanner) Key() []byte

                              Key returns the key that was last scanned.

                              func (*MapScanner) Scan

                              func (m *MapScanner) Scan() bool

                                Scan scans the next entry, returning true on success. When Scan returns false, the caller should inspect Err to distinguish between scan completion and scan error.

                                func (*MapScanner) Value

                                func (m *MapScanner) Value() []byte

                                  Value returns the value that was last scanned.

                                  type Merged

                                  type Merged []*Map

                                    Merged represents the merged contents of multiple underlying maps. Like Map, Merged presents a sorted, scannable map, but it does not guarantee that the order of traversal is stable.

                                    func (Merged) Seek

                                    func (m Merged) Seek(key []byte) *MergedScanner

                                      Seek returns a scanner for the merged map that starts at the first entry where entryKey <= key.

                                      type MergedScanner

                                      type MergedScanner []*MapScanner

                                        MergedScanner is a scanner for merged maps.

                                        func (MergedScanner) Err

                                        func (m MergedScanner) Err() error

                                          Err returns the last error encountered while scanning, if any.

                                          func (MergedScanner) Key

                                          func (m MergedScanner) Key() []byte

                                            Key returns the last key scanned.

                                            func (MergedScanner) Len

                                            func (m MergedScanner) Len() int

                                              Len implements heap.Interface

                                              func (MergedScanner) Less

                                              func (m MergedScanner) Less(i, j int) bool

                                                Less implements heap.Interface

                                                func (*MergedScanner) Pop

                                                func (m *MergedScanner) Pop() interface{}

                                                  Pop implements heap.Interface

                                                  func (*MergedScanner) Push

                                                  func (m *MergedScanner) Push(x interface{})

                                                    Push implements heap.Interface

                                                    func (*MergedScanner) Scan

                                                    func (m *MergedScanner) Scan() bool

                                                      Scan scans the next entry in the merged map, returning true on success. If Scan returns false, the caller should check Err to distinguish between scan completion and scan error.

                                                      func (MergedScanner) Swap

                                                      func (m MergedScanner) Swap(i, j int)

                                                        Swap implements heap.Interface

                                                        func (MergedScanner) Value

                                                        func (m MergedScanner) Value() []byte

                                                          Value returns the last value scanned.

                                                          type Scanner

                                                          type Scanner interface {
                                                          	// Scan scans the next entry, returning true on success, after which
                                                          	// time the entry is available to inspect using the Key and Value
                                                          	// methods.
                                                          	Scan() bool
                                                          	// Err returns the last error encountered while scanning, if any.
                                                          	Err() error
                                                          	// Key returns the key of the last scanned entry.
                                                          	Key() []byte
                                                          	// Value returns the value of the last scanned entry.
                                                          	Value() []byte
                                                          }

                                                            Scanner is an ordered iterator over map entries.

                                                            type WriteOption

                                                            type WriteOption func(*Writer)

                                                              WriteOption represents a tunable writer parameter.

                                                              func BlockSize

                                                              func BlockSize(sz int) WriteOption

                                                                BlockSize sets the writer's target block size to sz (in bytes). Note that keys and values cannot straddle blocks, so that if large data are added to a map, block sizes can grow large. The default target block size is 4KB.

                                                                func RestartInterval

                                                                func RestartInterval(iv int) WriteOption

                                                                  RestartInterval sets the writer's restart interval to provided value. The default restart interval is 16.

                                                                  type Writer

                                                                  type Writer struct {
                                                                  	// contains filtered or unexported fields
                                                                  }

                                                                    A Writer appends key-value pairs to a map. Keys must be appended in lexicographic order.

                                                                    func NewWriter

                                                                    func NewWriter(w io.Writer, opts ...WriteOption) *Writer

                                                                      NewWriter returns a new Writer that writes a map to the provided io.Writer. BlockSize specifies the target block size, while restartInterval determines the frequency of key restart points, which trades off lookup performance with size. See package docs for more details.

                                                                      func (*Writer) Append

                                                                      func (w *Writer) Append(key, value []byte) error

                                                                        Append appends an entry to the maps. Keys must be provided in lexicographic order.

                                                                        func (*Writer) Close

                                                                        func (w *Writer) Close() error

                                                                          Close flushes the last block of the writer and writes the map's trailer. After successful close, the map is ready to be opened.

                                                                          func (*Writer) Flush

                                                                          func (w *Writer) Flush() error

                                                                            Flush creates a new block with the current contents. It forces the creation of a new block, and overrides the Writer's block size parameter.