lsmt

package module
v1.5.3 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 14, 2024 License: AGPL-3.0 Imports: 13 Imported by: 0

README

LSMT package

LSMT is a robust, single-level embedded Log Structured Merge Tree implementation, developed entirely in Go. It is designed to deliver efficient data storage and retrieval solutions.

This package utilizes an in-memory AVL tree, known as a memtable, for temporarily storing key-value pairs. These pairs are then flushed to Sorted String Tables (SSTables) on disk. When the number of SSTables reaches a specified threshold, the compaction process is triggered.

This process merges multiple SSTables into fewer ones, reducing file count and minimizing disk I/O for read operations. Additionally, the system maintains a minimum number of SSTables to further optimize read performance.

Benchmarking

v1.4.0 Benchmark

11th Gen Intel(R) Core(TM) i7-11700K @ 3.60GHz UBuntu with WDC WDS500G2B0A-00SM50(HDD)

We put 1 MILLION keys in 8.5s 8 seconds Write speed is roughly 117,647 keys per second with this setup.

Features

  • Memtable - The use of an in-memory AVL tree (memtable) allows for fast insertions and lookups. By accumulating writes in memory, the implementation reduces the number of disk I/O operations.
  • Batch Writes to SSTables - Instead of writing each key-value pair immediately to disk, the system flushes the memtable to an SSTable when it reaches a predefined size (memtableFlushSize). This batching improves write performance.
  • Compaction Strategy - The compaction process merges multiple SSTables into fewer ones, reducing the number of files and thus the amount of disk I/O needed for reads. The implementation also ensures that a minimum number of SSTables is retained to optimize read performance.
  • Efficient Key Lookups - The use of binary search on SSTables improves the efficiency of key lookups. Each SSTable is stored sorted, allowing fast search operations.
  • Range Queries - The implementation supports various range queries (e.g., Range, GreaterThan, LessThan), which can be optimized for both the memtable and SSTables.
  • Concurrent Access - The use of read-write mutexes allows concurrent reads while ensuring safe writes to the memtable, which can improve performance in multi-threaded environments.
  • Tombstones for Deletions - Instead of physically removing key-value pairs from SSTables, tombstones are written to represent deletions. This avoids the overhead of immediate compaction and allows the system to manage deletions in a more efficient way.
  • File Management - The implementation supports splitting large SSTables into smaller ones, which can help maintain read performance by keeping SSTables manageable in size.
  • Lazy Loading of SSTables - SSTables are only loaded into memory when necessary, minimizing memory usage and startup time.
  • WAL for Durability - The implementation uses a write-ahead log (WAL) to ensure durability. The WAL records all write operations before they are applied to the memtable, providing a way to recover the system in case of a crash.
  • Transaction Support - The implementation supports transactions, allowing multiple write operations to be grouped together and applied atomically to the memtable.

Usage

Importing

github.com/guycipher/lsmt
// Create a new LSM-tree in the specified directory
directory := "data"

// You can specify the directory, file permissions, max memtable size (amount of keyv's), and compaction interval (amount of ssTables before compaction), amount of minimum sstables after compaction
l, err := lsmt.New(directory, os.FileMode(0777), 10, 5, 2)
if err != nil {
    fmt.Println("Error creating LSM-tree:", err)
    return
}

defer os.RemoveAll(directory) // Clean up after use

// Successfully created the LSM-tree
fmt.Println("LSM-tree created successfully!")

Put

You can insert a value into a key using the Put method. If you try to insert a key that already exists, the value will be updated.

// Assume lsmt is already created
// Insert key-value pairs into the LSM-tree
if err := l.Put([]byte("key1"), []byte("value1")); err != nil {
    fmt.Println("Error inserting key1:", err)
}
if err := l.Put([]byte("key2"), []byte("value2")); err != nil {
    fmt.Println("Error inserting key2:", err)
}

fmt.Println("Key-value pairs inserted successfully!")

Get

To get a value you can you the Get method. The get method will return all the keys values.

// Assume lsmt is already created and populated
value, err := l.Get([]byte("key1"))
if err != nil {
    fmt.Println("Error retrieving key1:", err)
} else {
    fmt.Println("Retrieved value for key1:", string(value))
}

NGet

To get all keys not equal to the key you can use the NGet method.

// Assume lsmt is already created and populated
keys, values, err:= l.NGet([]byte("key1"))
if err != nil {
    fmt.Println("Error retrieving key1:", err)
} else {
    fmt.Println("Retrieved values not equal to key1:", string(value))
}

Delete

Delete key2

// Assume lsmt is already created
if err := l.Delete([]byte("key2")); err != nil {
    fmt.Println("Error deleting key2:", err)
} else {
    fmt.Println("key2 marked for deletion.")
}

Range

Get all keys between key56 and key100

// Assume lsmt is already created and populated
keys, values, err := l.Range([]byte("key56"), []byte("key100"))
if err != nil {
    log.Fatal(err)
}
for i, key := range keys {
    fmt.Printf("Key: %s, Value: %s\n", string(key), string(values[i]))
}

NRange

Get all keys not between key1 and key3

// Assume lsmt is already created and populated
keys, values, err := l.NRange([]byte("key1"), []byte("key3"))
if err != nil {
    log.Fatal(err)
}
for i, key := range keys {
    fmt.Printf("Key: %s, Value: %s\n", string(key), string(values[i]))
}

GreaterThan

Get all keys greater than key1

// Assume lsmt is already created and populated
keys, values, err := l.GreaterThan([]byte("key1"))
if err != nil {
    fmt.Println("Error retrieving key1:", err)
} else {
    fmt.Println("Retrieved value for key1:", string(value))
}

GreaterThanEqual

Get all keys greater than or equal to key1

// Assume lsmt is already created and populated
keys, values, err := l.GreaterThanEqual([]byte("key1"))
if err != nil {
    fmt.Println("Error retrieving key1:", err)
} else {
    fmt.Println("Retrieved value for key1:", string(value))
}

LessThan

Get all keys less than key1

// Assume lsmt is already created and populated
keys, values, err := l.LessThan([]byte("key1"))
if err != nil {
    fmt.Println("Error retrieving key1:", err)
} else {
    fmt.Println("Retrieved value for key1:", string(value))
}

LessThanEqual

Get all keys less than or equal to key1

// Assume lsmt is already created and populated
keys, values, err := l.LessThanEqual([]byte("key1"))
if err != nil {
    fmt.Println("Error retrieving key1:", err)
} else {
    fmt.Println("Retrieved value for key1:", string(value))
}

Compaction

// Assume lsmt is already created and populated
if err := l.Compact(); err != nil {
    fmt.Println("Error compacting LSM-tree:", err)
} else {
    fmt.Println("LSM-tree compacted successfully!")
}

Transactions

// Start a new transaction
tx := l.BeginTransaction()

// Add a put operation to the transaction
tx.AddPut([]byte("key1"), []byte("value1"))

// Add a delete operation to the transaction
tx.AddDelete([]byte("key2"))

// Commit the transaction
if err := l.CommitTransaction(tx); err != nil {
fmt.Println("Error committing transaction:", err)
}

Rollback
// Abort the transaction
l.RollbackTransaction(tx)

WAL Recovery

// Assume lsmt is already created
ops, err := l.GetWAL().Recover()
if err != nil {
    fmt.Println("Error recovering WAL:", err)
} else {
    err := l.RunRecoveredOperations(ops)
    if err != nil {
        fmt.Println("Error running recovered operations:", err)
    }

    fmt.Println("Recovered operations:", ops)
}

Close

Flushes the memtable to disk and closes all opened sstables

// Assume lsmt is already created and populated
if err := l.Close(); err != nil {
    fmt.Println("Error closing LSM-tree:", err)
} else {
    fmt.Println("LSM-tree closed successfully!")
}

Documentation

Overview

Package lsmt provides a single-level embedded log-structured merge-tree (LSM-tree) Copyright (C) Alex Gaetano Padula

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

Package lsmt File pager implementation Copyright (C) Alex Gaetano Padula

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

Index

Constants

View Source
const HEADER_SIZE = 256 // next (overflowed)
View Source
const PAGE_SIZE = 1024 // Page size
View Source
const SSTABLE_EXTENSION = ".sst"
View Source
const TOMBSTONE_VALUE = "$tombstone"
View Source
const WAL_EXTENSION = ".wal"

Variables

This section is empty.

Functions

This section is empty.

Types

type KeyValue

type KeyValue struct {
	Key   []byte
	Value []byte
}

KeyValue is a struct representing a key-value pair.

type LSMT

type LSMT struct {
	// contains filtered or unexported fields
}

LSMT is the main struct for the log-structured merge-tree.

func New

func New(directory string, directoryPerm os.FileMode, memtableFlushSize, compactionInterval int, minimumSSTables int) (*LSMT, error)

New creates a new LSM-tree or opens an existing one.

func (*LSMT) BeginTransaction

func (l *LSMT) BeginTransaction() *Transaction

BeginTransaction starts a new transaction.

func (*LSMT) Close

func (l *LSMT) Close() error

Close closes the LSM-tree gracefully closing all opened SSTable files.

func (*LSMT) CommitTransaction

func (l *LSMT) CommitTransaction(tx *Transaction) error

CommitTransaction commits a transaction.

func (*LSMT) Compact

func (l *LSMT) Compact() error

Compact compacts the LSM-tree by merging all SSTables into a single SSTable.

func (*LSMT) Delete

func (l *LSMT) Delete(key []byte) error

Delete removes a key from the LSM-tree.

func (*LSMT) Get

func (l *LSMT) Get(key []byte) ([]byte, error)

Get retrieves the value for a given key from the LSM-tree.

func (*LSMT) GetWal added in v1.5.0

func (l *LSMT) GetWal() *Wal

GetWal returns the write-ahead log.

func (*LSMT) GreaterThan

func (l *LSMT) GreaterThan(key []byte) ([][]byte, [][]byte, error)

GreaterThan retrieves all key-value pairs greater than the key from the LSM-tree.

func (*LSMT) GreaterThanEqual

func (l *LSMT) GreaterThanEqual(key []byte) ([][]byte, [][]byte, error)

GreaterThanEqual retrieves all key-value pairs greater than or equal to the key from the LSM-tree.

func (*LSMT) LessThan

func (l *LSMT) LessThan(key []byte) ([][]byte, [][]byte, error)

LessThan retrieves all key-value pairs less than the key from the LSM-tree.

func (*LSMT) LessThanEqual

func (l *LSMT) LessThanEqual(key []byte) ([][]byte, [][]byte, error)

LessThanEqual retrieves all key-value pairs less than or equal to the key from the LSM-tree.

func (*LSMT) NGet

func (l *LSMT) NGet(key []byte) ([][]byte, [][]byte, error)

NGet retrieves all key-value pairs not equal to the key from the LSM-tree.

func (*LSMT) NRange

func (l *LSMT) NRange(start, end []byte) ([][]byte, [][]byte, error)

NRange retrieves all key-value pairs not within a given range from the LSM-tree.

func (*LSMT) Put

func (l *LSMT) Put(key, value []byte) error

Put inserts a key-value pair into the LSM-tree.

func (*LSMT) Range

func (l *LSMT) Range(start, end []byte) ([][]byte, [][]byte, error)

Range retrieves all key-value pairs within a given range from the LSM-tree.

func (*LSMT) RollbackTransaction

func (l *LSMT) RollbackTransaction(tx *Transaction)

RollbackTransaction aborts a transaction.

func (*LSMT) RunRecoveredOperations

func (l *LSMT) RunRecoveredOperations(operations []Operation) error

RunRecoveredOperations runs the recovered operations from the write-ahead log.

func (*LSMT) SplitSSTable

func (l *LSMT) SplitSSTable(sstable *SSTable, n int) ([]*SSTable, error)

SplitSSTable splits a compacted SSTable into n smaller SSTables.

type Operation

type Operation struct {
	Type  OperationType
	Key   []byte // The key of the operation.
	Value []byte // Only used for OpPut
}

Operation is a struct representing an operation in a transaction.

type OperationType

type OperationType int

OperationType is an enum representing the type of operation.

const (
	OpPut OperationType = iota
	OpDelete
)

type Pager added in v1.4.0

type Pager struct {
	StatLock *sync.RWMutex // lock for stats
	// contains filtered or unexported fields
}

Pager manages pages in a file

func OpenPager added in v1.4.0

func OpenPager(filename string, flag int, perm os.FileMode) (*Pager, error)

OpenPager opens a file for page management

func (*Pager) Analyze added in v1.4.0

func (p *Pager) Analyze() error

Analyze forces the pager to analyze the file

func (*Pager) Close added in v1.4.0

func (p *Pager) Close() error

Close closes the file

func (*Pager) Count added in v1.4.0

func (p *Pager) Count() int64

Count returns the number of pages

func (*Pager) DeletePage added in v1.4.0

func (p *Pager) DeletePage(pageID int64) error

DeletePage deletes a page

func (*Pager) GetDeletedPages added in v1.4.0

func (p *Pager) GetDeletedPages() []int64

GetDeletedPages returns the list of deleted pages

func (*Pager) GetPage added in v1.4.0

func (p *Pager) GetPage(pageID int64) ([]byte, error)

GetPage gets a page and returns the data Will gather all the pages that are linked together

func (*Pager) PagesCount added in v1.5.0

func (p *Pager) PagesCount() int64

PagesCount returns the number of pages

func (*Pager) Size added in v1.5.0

func (p *Pager) Size() int64

Size returns the size of the file

func (*Pager) Write added in v1.4.0

func (p *Pager) Write(data []byte) (int64, error)

Write writes data to the next available page

func (*Pager) WriteTo added in v1.4.0

func (p *Pager) WriteTo(pageID int64, data []byte) error

WriteTo writes data to a specific page

type SSTable

type SSTable struct {
	// contains filtered or unexported fields
}

SSTable is a struct representing a sorted string table.

type SSTableIterator

type SSTableIterator struct {
	// contains filtered or unexported fields
}

SSTableIterator is an iterator for SSTable.

func (*SSTableIterator) Next

func (it *SSTableIterator) Next() (*KeyValue, error)

Next returns the next key-value pair from the SSTable.

func (*SSTableIterator) Ok added in v1.4.0

func (it *SSTableIterator) Ok() bool

Ok returns whether the iterator is valid.

type Transaction

type Transaction struct {
	Operations []Operation // List of operations in the transaction.
	Aborted    bool        // Whether the transaction has been aborted.
}

Transaction is a struct representing a transaction.

func (*Transaction) AddDelete

func (tx *Transaction) AddDelete(key []byte)

AddDelete adds a delete operation to a transaction.

func (*Transaction) AddPut

func (tx *Transaction) AddPut(key, value []byte)

AddPut adds a put operation to a transaction.

type Wal

type Wal struct {
	// contains filtered or unexported fields
}

Wal is a struct representing a write-ahead log.

func (*Wal) Recover

func (wal *Wal) Recover() ([]Operation, error)

Recover reads the write-ahead log and recovers the operations.

func (*Wal) WriteOperation

func (wal *Wal) WriteOperation(op Operation) error

WriteOperation writes an operation to the write-ahead log.

Directories

Path Synopsis
Package avl implements an AVL tree Copyright (C) Alex Gaetano Padula
Package avl implements an AVL tree Copyright (C) Alex Gaetano Padula

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL