kv

package v0.0.0-...-c80a3f0

Published: Dec 20, 2022 License: Apache-2.0 Imports: 12 Imported by: 0

README

The Ethdb package holds a bouquet of objects to access the DB

Words "KV" and "DB" have special meaning here:

  • KV - key-value-style API to access data: let developer manage transactions, stateful cursors.
  • DB - object-oriented-style API to access data: Get/Put/Delete/WalkOverTable/MultiPut, managing transactions internally.

So, the DB abstraction fits 95% of cases and leads to more maintainable code - because it looks stateless.

About "key-value-style": Modern key-value databases don't provide Get/Put/Delete methods, because it's very hard-drive-unfriendly - it pushes developers do random-disk-access which is order of magnitude slower than sequential read. To enforce sequential-reads - introduced stateful cursors/iterators - they intentionally look as file-api: open_cursor/seek/write_data_from_current_position/move_to_end/step_back/step_forward/delete_key_on_current_position/append.

Class diagram:

// This is not a call graph - it just shows classes from low-level to high-level,
// and which classes satisfy which interfaces.

                    +-----------------------------------+   +-----------------------------------+ 
                    |  github.com/torquem-ch/mdbx-go    |   | google.golang.org/grpc.ClientConn |                    
                    |  (app-agnostic MDBX go bindings)  |   | (app-agnostic RPC and streaming)  |
                    +-----------------------------------+   +-----------------------------------+
                                      |                                      |
                                      |                                      |
                                      v                                      v
                    +-----------------------------------+   +-----------------------------------+
                    |       ethdb/kv_mdbx.go            |   |       ethdb/kv_remote.go          |                
                    | (tg-specific MDBX implementation) |   |   (tg-specific remote DB access)  |
                    +-----------------------------------+   +-----------------------------------+
                                      |                                      |
                                      |                                      |
                                      v                                      v
            +----------------------------------------------------------------------------------------------+
            |                                       ethdb/kv_abstract.go                                   |  
            |         (Common KV interface. DB-friendly, disk-friendly, cpu-cache-friendly.                |
            |           Same app code can work with local or remote database.                              |
            |           Allows experimenting with other database implementations.                          |
            |          Supports context.Context for cancellation. Any operation can return error)          |
            +----------------------------------------------------------------------------------------------+

ethdb.AbstractKV design:

  • InMemory, ReadOnly: NewMDBX().Flags(mdbx.ReadOnly).InMem().Open()

  • MultipleDatabases, Customization: NewMDBX().Path(path).WithBucketsConfig(config).Open()

  • 1 Transaction object can be used only within 1 goroutine.

  • Only 1 write transaction can be active at a time (others will wait).

  • Unlimited read transactions can be active concurrently (not blocked by write transaction).

  • Methods db.Update, db.View - can be used to open and close short transactions (see the View/Update sketch after this list).

  • Methods Begin/Commit/Rollback - for long transactions.

  • It's safe to call .Rollback() after .Commit(); multiple rollbacks are also safe. Common transaction pattern:

tx, err := db.Begin(true, ethdb.RW)
if err != nil {
    return err
}
defer tx.Rollback() // important to avoid transactions leak at panic or early return

// ... code which uses database in transaction
 
err = tx.Commit()
if err != nil {
    return err
}
  • No internal copies/allocations. This means: 1. the app must copy keys/values before putting them into the database; 2. data read from the DB is valid only during the current transaction - copy it if you plan to use it after the transaction's Commit/Rollback.

  • Methods .Bucket() and .Cursor() can't return nil and can't return an error.

  • Bucket and Cursor are interfaces - meaning different classes can satisfy them: for example, the MdbxCursor and MdbxDupSortCursor classes do. If you are not familiar with the "DupSort" concept, please read dupsort.md first.

  • If a cursor method returns err != nil, then the returned key SHOULD be != nil (it can be []byte{}, for example). Then traversal code looks like:

for k, v, err := c.First(); k != nil; k, v, err = c.Next() {
    if err != nil {
        return err
    }
    // logic
}
  • Move cursor: cursor.Seek(key)
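
The db.Update/db.View methods mentioned in the list above wrap the Begin/Rollback/Commit pattern for short transactions. A minimal sketch, assuming a kv.RwDB handle and an example table name (import path is an assumption):

package example

import (
	"context"

	"github.com/ledgerwatch/erigon-lib/kv" // assumed import path
)

// roundTrip writes a pair in one short transaction and reads it back in
// another. Update/View open and close the transaction internally.
func roundTrip(ctx context.Context, db kv.RwDB) ([]byte, error) {
	if err := db.Update(ctx, func(tx kv.RwTx) error {
		return tx.Put("MyTable", []byte("k"), []byte("v"))
	}); err != nil {
		return nil, err
	}
	var v []byte
	if err := db.View(ctx, func(tx kv.Tx) error {
		val, err := tx.GetOne("MyTable", []byte("k"))
		if err != nil {
			return err
		}
		v = append([]byte{}, val...) // copy: data is valid only inside the tx
		return nil
	}); err != nil {
		return nil, err
	}
	return v, nil
}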

ethdb.Database design:

  • Allows passing multiple implementations
  • Allows traversing tables via db.Walk

ethdb.TxDb design:

  • holds 1 long-running transaction and 1 cursor per table inside
  • the Begin method DOESN'T create a new TxDb object, meaning this object can be passed into other objects by pointer, and high-level app code can start/commit transactions when it needs to without re-creating all objects that hold the TxDb pointer
  • this is the reason why the txDb.CommitAndBegin() method works: it creates a new transaction object inside, and the pointer to TxDb stays valid

How to dump/load table

Install all database tools: make db-tools

./build/bin/mdbx_dump -a <datadir>/erigon/chaindata | lz4 > dump.lz4
lz4 -d < dump.lz4 | ./build/bin/mdbx_load -an <datadir>/erigon/chaindata

How to get table checksum

./build/bin/mdbx_dump -s table_name <datadir>/erigon/chaindata | tail -n +4 | sha256sum # tail here is for excluding header 

Header example:
VERSION=3
geometry=l268435456,c268435456,u25769803776,s268435456,g268435456
mapsize=756375552
maxreaders=120
format=bytevalue
database=TBL0001
type=btree
db_pagesize=4096
duplicates=1
dupsort=1
HEADER=END

Documentation


Constants

const (

	//HashedAccounts
	// key - address hash
	// value - account encoded for storage
	// Contains Storage:
	//key - address hash + incarnation + storage key hash
	//value - storage value(common.hash)
	HashedAccounts = "HashedAccount"
	HashedStorage  = "HashedStorage"
)
const (

	//key - contract code hash
	//value - contract code
	Code = "Code"

	//key - addressHash+incarnation
	//value - code hash
	ContractCode = "HashedCodeHash"

	// IncarnationMap for deleted accounts
	//key - address
	//value - incarnation of account when it was last deleted
	IncarnationMap = "IncarnationMap"

	//TEVMCode -
	//key - contract code hash
	//value - contract TEVM code
	ContractTEVMCode = "TEVMCode"
)
const (
	// DatabaseInfo is used to store information about data layout.
	DatabaseInfo = "DbInfo"

	// Data item prefixes (use single byte to avoid mixing data types, avoid `i`, used for indexes).
	HeaderNumber = "HeaderNumber" // header_hash -> num_u64

	HeaderCanonical = "CanonicalHeader"        // block_num_u64 -> header hash
	Headers         = "Header"                 // block_num_u64 + hash -> header (RLP)
	HeaderTD        = "HeadersTotalDifficulty" // block_num_u64 + hash -> td (RLP)

	BlockBody = "BlockBody" // block_num_u64 + hash -> block body

	// EthTx - stores only txs of canonical blocks. As a result, ids used in this table are also
	// canonical - the same across all nodes in the network, regardless of reorgs. Transactions of
	// non-canonical blocks are not removed but moved to NonCanonicalTransaction - so during a re-org
	// there is no need to re-download blocks from the network.
	// This table also has system-txs before and after each block: if a
	// block has no system-tx - records are absent, but the sequence keeps increasing
	EthTx           = "BlockTransaction"        // tbl_sequence_u64 -> rlp(tx)
	NonCanonicalTxs = "NonCanonicalTransaction" // tbl_sequence_u64 -> rlp(tx)
	MaxTxNum        = "MaxTxNum"                // block_number_u64 -> max_tx_num_in_block_u64

	Receipts = "Receipt"        // block_num_u64 -> canonical block receipts (non-canonical are not stored)
	Log      = "TransactionLog" // block_num_u64 + txId -> logs of transaction

	// Stores bitmap indices - in which block numbers logs of a given 'address' or 'topic' were seen
	// [addr or topic] + [2 bytes inverted shard number] -> bitmap(blockN)
	// indices are sharded because some bitmaps are >1Mb, and processing a new incoming block
	//	 updates ~300 bitmaps by appending a small amount of new values. This causes large writes (MDBX does copy-on-write).
	//
	// if the last existing shard is small - merge it with the delta
	// if the serialized size of the delta > ShardLimit - break it down into multiple shards
	// shard number - the biggest value in the bitmap
	LogTopicIndex   = "LogTopicIndex"
	LogAddressIndex = "LogAddressIndex"

	// CallTraceSet is the name of the table that contains the mapping of block number to the set (sorted) of all accounts
	// touched by call traces. It is a DupSort-ed table
	// 8-byte BE block number -> account address -> two bits (one for "from", another for "to")
	CallTraceSet = "CallTraceSet"
	// Indices for call traces - have the same format as LogTopicIndex and LogAddressIndex
	// Store bitmap indices - in which block number we saw calls from (CallFromIndex) or to (CallToIndex) some addresses
	CallFromIndex = "CallFromIndex"
	CallToIndex   = "CallToIndex"

	// Cumulative indexes for estimation of stage execution
	CumulativeGasIndex         = "CumulativeGasIndex"
	CumulativeTransactionIndex = "CumulativeTransactionIndex"

	TxLookup = "BlockTransactionLookup" // hash -> transaction/receipt lookup metadata

	ConfigTable = "Config" // config prefix for the db

	// Progress of sync stages: stageName -> stageData
	SyncStageProgress = "SyncStage"

	Clique             = "Clique"
	CliqueSeparate     = "CliqueSeparate"
	CliqueSnapshot     = "CliqueSnapshot"
	CliqueLastSnapshot = "CliqueLastSnapshot"

	// Snapshot table used for Binance Smart Chain's consensus engine Parlia
	// Schema of key/value pairs containing:
	// Key (string): SnapshotFullKey = SnapshotBucket + num (uint64 big endian) + hash
	// Value (JSON blob):
	// {
	//     "number"             // Block number where the snapshot was created
	//     "hash"               // Block hash where the snapshot was created
	//     "validators"         // Set of authorized validators at this moment
	//     "recents"            // Set of recent validators for spam protections
	//     "recent_fork_hashes" // Set of recent forkHash
	// }
	ParliaSnapshot = "ParliaSnapshot"

	// Proof-of-stake
	// Beacon chain head that is being executed at the current time
	CurrentExecutionPayload = "CurrentExecutionPayload"

	// NodeRecords stores P2P node records (ENR)
	NodeRecords = "NodeRecord"
	// Inodes stores P2P discovery service info about the nodes
	Inodes = "Inode"

	// Transaction senders - stored separately from the block bodies
	Senders = "TxSender" // block_num_u64 + blockHash -> sendersList (no serialization format, every 20 bytes is new sender)

	// headBlockKey tracks the latest known full block's hash.
	HeadBlockKey = "LastBlock"

	HeadHeaderKey = "LastHeader"

	// headBlockHash, safeBlockHash, finalizedBlockHash of the latest Engine API forkchoice
	LastForkchoice = "LastForkchoice"

	// TransitionBlockKey tracks the last proof-of-work block
	TransitionBlockKey = "TransitionBlock"

	// migrationName -> serialized SyncStageProgress and SyncStageUnwind buckets
	// stores stage progress to understand in which context a migration was executed
	// in case of a bug report, a developer can ask for the contents of this bucket
	Migrations = "Migration"

	Sequence = "Sequence" // tbl_name -> seq_u64

	Epoch        = "DevEpoch"        // block_num_u64+block_hash->transition_proof
	PendingEpoch = "DevPendingEpoch" // block_num_u64+block_hash->transition_proof

	Issuance = "Issuance" // block_num_u64->RLP(issuance+burnt[0 if < london])

	StateAccounts   = "StateAccounts"
	StateStorage    = "StateStorage"
	StateCode       = "StateCode"
	StateCommitment = "StateCommitment"

	BorReceipts = "BorReceipt"
	BorTxLookup = "BlockBorTransactionLookup" // transaction_hash -> block_num_u64
	BorSeparate = "BorSeparate"

	// Downloader
	BittorrentCompletion = "BittorrentCompletion"
	BittorrentInfo       = "BittorrentInfo"

	// Domains and Inverted Indices
	AccountKeys        = "AccountKeys"
	AccountVals        = "AccountVals"
	AccountHistoryKeys = "AccountHistoryKeys"
	AccountHistoryVals = "AccountHistoryVals"
	AccountSettings    = "AccountSettings"
	AccountIdx         = "AccountIdx"

	StorageKeys        = "StorageKeys"
	StorageVals        = "StorageVals"
	StorageHistoryKeys = "StorageHistoryKeys"
	StorageHistoryVals = "StorageHistoryVals"
	StorageSettings    = "StorageSettings"
	StorageIdx         = "StorageIdx"

	CodeKeys        = "CodeKeys"
	CodeVals        = "CodeVals"
	CodeHistoryKeys = "CodeHistoryKeys"
	CodeHistoryVals = "CodeHistoryVals"
	CodeSettings    = "CodeSettings"
	CodeIdx         = "CodeIdx"

	CommitmentKeys        = "CommitmentKeys"
	CommitmentVals        = "CommitmentVals"
	CommitmentHistoryKeys = "CommitmentHistoryKeys"
	CommitmentHistoryVals = "CommitmentHistoryVals"
	CommitmentSettings    = "CommitmentSettings"
	CommitmentIdx         = "CommitmentIdx"

	LogAddressKeys = "LogAddressKeys"
	LogAddressIdx  = "LogAddressIdx"
	LogTopicsKeys  = "LogTopicsKeys"
	LogTopicsIdx   = "LogTopicsIdx"

	TracesFromKeys = "TracesFromKeys"
	TracesFromIdx  = "TracesFromIdx"
	TracesToKeys   = "TracesToKeys"
	TracesToIdx    = "TracesToIdx"

	Snapshots = "Snapshots" // name -> hash

	RAccountKeys = "RAccountKeys"
	RAccountIdx  = "RAccountIdx"
	RStorageKeys = "RStorageKeys"
	RStorageIdx  = "RStorageIdx"
	RCodeKeys    = "RCodeKeys"
	RCodeIdx     = "RCodeIdx"

	PlainStateR    = "PlainStateR"    // temporary table for PlainState reconstitution
	PlainStateD    = "PlainStateD"    // temporary table for PlainState reconstitution, deletes
	CodeR          = "CodeR"          // temporary table for Code reconstitution
	CodeD          = "CodeD"          // temporary table for Code reconstitution, deletes
	PlainContractR = "PlainContractR" // temporary table for PlainContract reconstitution
	PlainContractD = "PlainContractD" // temporary table for PlainContract reconstitution, deletes

	// Erigon-CL
	BeaconState = "BeaconState"
	// [slot + block root] => [signature + block without execution payload]
	BeaconBlocks = "BeaconBlock"

	// LightClientStore => LightClientStore object
	// LightClientFinalityUpdate => latest finality update
	// LightClientOptimisticUpdate => latest optimistic update
	LightClient = "LightClient"
	// Period (one every 27 hours) => LightClientUpdate
	LightClientUpdates = "LightClientUpdates"
)
const (
	RecentLocalTransaction = "RecentLocalTransaction" // sequence_u64 -> tx_hash
	PoolTransaction        = "PoolTransaction"        // txHash -> sender_id_u64+tx_rlp
	PoolInfo               = "PoolInfo"               // option_key -> option_value
)
const AccountChangeSet = "AccountChangeSet"

AccountChangeSet and StorageChangeSet - for block N, they store the values of state before block N changed them (the values "after" the change are stored in PlainState). Logical format:

key - blockNum_u64 + key_in_plain_state
value - value_in_plain_state_before_blockNum_changes

Example: If block N changed account A from value X to Y. Then:

AccountChangeSet has record: bigEndian(N) + A -> X
PlainState has record: A -> Y

See also: docs/programmers_guide/db_walkthrough.MD#table-history-of-accounts

As you can see, if block N changes many accounts, then all records share the repetitive prefix `bigEndian(N)`. MDBX can store such prefixes only once - via the DupSort feature (see `docs/programmers_guide/dupsort.md`). Both buckets are DupSort-ed and have the following physical format. AccountChangeSet:

key - blockNum_u64
value - address + account(encoded)

StorageChangeSet:

key - blockNum_u64 + address + incarnation_u64
value - plain_storage_key + value
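
For illustration, here is how the logical and physical AccountChangeSet keys from the example above could be composed; these helpers are hypothetical, not part of the package:

package example

import "encoding/binary"

// logicalChangeSetKey builds bigEndian(blockNum) + key_in_plain_state.
func logicalChangeSetKey(blockNum uint64, plainKey []byte) []byte {
	k := make([]byte, 8+len(plainKey))
	binary.BigEndian.PutUint64(k, blockNum)
	copy(k[8:], plainKey)
	return k
}

// physicalAccountChangeSet splits the same data the DupSort way:
// key = bigEndian(blockNum), value = address + account(encoded),
// so the repeated block-number prefix is stored only once.
func physicalAccountChangeSet(blockNum uint64, address, account []byte) (k, v []byte) {
	k = make([]byte, 8)
	binary.BigEndian.PutUint64(k, blockNum)
	v = append(append([]byte{}, address...), account...)
	return k, v
}
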
const AccountsHistory = "AccountHistory"

AccountsHistory and StorageHistory are indices designed to serve the following 2 types of requests: 1. what is the smallest block number >= X where account A changed; 2. get the last shard of A, to append new block numbers to it.

Task 1. is part of "get historical state" operation (see `core/state:GetAsOf`): If `db.Seek(A+bigEndian(X))` returns non-last shard -

then get block number from shard value Y := RoaringBitmap(shard_value).GetGte(X)
and with Y go to ChangeSets: db.Get(ChangeSets, Y+A)

If `db.Seek(A+bigEndian(X))` returns last shard -

then we go to PlainState: db.Get(PlainState, A)

Format:

  • the index is split into shards of 2Kb - each a RoaringBitmap-encoded sorted list of block numbers (to avoid performance degradation for popular accounts or deep looks into history; 2Kb also avoids Overflow pages inside the DB)
  • if a shard is not the last one - its key has an 8-byte suffix = bigEndian(max_block_num_in_this_shard)
  • if a shard is the last one - its key has an 8-byte suffix = 0xFF

It allows:

  • serving task 1. with 1 db operation: db.Seek(A+bigEndian(X))
  • serving task 2. with 1 db operation: db.Get(A+0xFF)

see also: docs/programmers_guide/db_walkthrough.MD#table-change-sets

AccountsHistory:

key - address + shard_id_u64
value - roaring bitmap - list of blocks where it changed

StorageHistory

key - address + storage_key + shard_id_u64
value - roaring bitmap - list of blocks where it changed
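
A sketch of serving task 1 with the single Seek described above (the import path is assumed; decoding the roaring bitmap in the shard value is left to the caller, since the concrete bitmap library is an implementation detail):

package example

import (
	"bytes"
	"encoding/binary"

	"github.com/ledgerwatch/erigon-lib/kv" // assumed import path
)

// seekHistoryShard returns the AccountsHistory shard covering block x for
// address addr, plus whether it is the last shard (key suffix of 0xFF bytes).
func seekHistoryShard(tx kv.Tx, addr []byte, x uint64) (shard []byte, isLast bool, err error) {
	seek := make([]byte, len(addr)+8)
	copy(seek, addr)
	binary.BigEndian.PutUint64(seek[len(addr):], x)
	c, err := tx.Cursor(kv.AccountsHistory)
	if err != nil {
		return nil, false, err
	}
	defer c.Close()
	k, v, err := c.Seek(seek)
	if err != nil {
		return nil, false, err
	}
	if k == nil || !bytes.HasPrefix(k, addr) {
		return nil, false, nil // account has no shard covering X
	}
	isLast = bytes.Equal(k[len(addr):], bytes.Repeat([]byte{0xFF}, 8))
	return v, isLast, nil
}

If the returned shard is not the last one, take Y = the smallest bitmap value >= X and look it up in the changesets; if it is the last one, read the current value from PlainState, as described above.
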
const PlainContractCode = "PlainCodeHash"

PlainContractCode:

key   - address + incarnation
value - code hash

const PlainState = "PlainState"

PlainState logical layout:

Contains Accounts:
  key - address (unhashed)
  value - account encoded for storage
Contains Storage:
  key - address (unhashed) + incarnation + storage key (unhashed)
  value - storage value(common.hash)

Physical layout:

PlainState and HashedStorage utilise the DupSort feature of MDBX (storing multiple values inside 1 key).

-------------------------------------------------------------
         key             |             value
-------------------------------------------------------------
[acc_hash]               | [acc_value]
[acc_hash]+[inc]         | [storage1_hash]+[storage1_value]
                         | [storage2_hash]+[storage2_value] // this value has no own key. it's the 2nd value of the [acc_hash]+[inc] key.
                         | [storage3_hash]+[storage3_value]
                         | ...
[acc_hash]+[old_inc]     | [storage1_hash]+[storage1_value]
                         | ...
[acc2_hash]              | [acc2_value]
                         | ...
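
A sketch of reading a single storage slot through this physical layout with the DupSort API documented below (SeekBothRange does an exact match on the key and a range match on the value; the import path is an assumption):

package example

import (
	"bytes"
	"encoding/binary"

	"github.com/ledgerwatch/erigon-lib/kv" // assumed import path
)

// readStorageSlot uses the physical PlainState layout above:
// key = address + incarnation, value = storage_key + storage_value.
func readStorageSlot(tx kv.Tx, addr []byte, incarnation uint64, storageKey []byte) ([]byte, error) {
	c, err := tx.CursorDupSort(kv.PlainState)
	if err != nil {
		return nil, err
	}
	defer c.Close()
	k := make([]byte, len(addr)+8)
	copy(k, addr)
	binary.BigEndian.PutUint64(k[len(addr):], incarnation)
	v, err := c.SeekBothRange(k, storageKey)
	if err != nil {
		return nil, err
	}
	if v == nil || !bytes.HasPrefix(v, storageKey) {
		return nil, nil // slot not present
	}
	return v[len(storageKey):], nil // strip the storage_key prefix
}
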
const ReadersLimit = 32000 // MDBX_READERS_LIMIT=32767
const StorageChangeSet = "StorageChangeSet"
const StorageHistory = "StorageHistory"
const TrieOfAccounts = "TrieAccount"

TrieOfAccounts and TrieOfStorage:

hasState, groups - mark prefixes existing in the hashed_account table
hasTree - mark prefixes existing in the trie_account table (not related to branchNodes)
hasHash - mark prefixes whose hashes are saved in the current trie_account record (actually only hashes of branchNodes can be saved)
@see UnmarshalTrieNode
@see integrity.Trie

+-----------------------------------------------------------------------------------------------------+
| DB record: 0x0B, hasState: 0b1011, hasTree: 0b1001, hasHash: 0b1001, hashes: [x,x]                   |
+-----------------------------------------------------------------------------------------------------+
                |                                           |                               |
                v                                           |                               v
+---------------------------------------------+             |            +--------------------------------------+
| DB record: 0x0B00, hasState: 0b10001        |             |            | DB record: 0x0B03, hasState: 0b10010 |
| hasTree: 0, hasHash: 0b10000, hashes: [x]   |             |            | hasTree: 0, hasHash: 0, hashes: []   |
+---------------------------------------------+             |            +--------------------------------------+
        |                    |                              |                         |                  |
        v                    v                              v                         v                  v
+------------------+  +----------------------+     +---------------+        +---------------+  +---------------+
| Account:         |  | BranchNode: 0x0B0004 |     | Account:      |        | Account:      |  | Account:      |
| 0x0B0000...      |  | has no record in     |     | 0x0B01...     |        | 0x0B0301...   |  | 0x0B0304...   |
| in HashedAccount |  |     TrieAccount      |     |               |        |               |  |               |
+------------------+  +----------------------+     +---------------+        +---------------+  +---------------+
                            |                |
                            v                v
                   +---------------+  +---------------+
                   | Account:      |  | Account:      |
                   | 0x0B000400... |  | 0x0B000401... |
                   +---------------+  +---------------+

Invariants:

  • hasTree is a subset of hasState
  • hasHash is a subset of hasState
  • the first level in account_trie always exists if hasState > 0
  • the TrieStorage record of account.root (length=40) must have +1 hash - it's account.root
  • each record in the TrieAccount table must have a parent (may be not direct), and this parent must have the correct bit set in its hasTree bitmap
  • if hasState has a bit set - then the HashedAccount table must have a record matching this bit
  • each TrieAccount record must cover some state (meaning hasState is always > 0)
  • TrieAccount records with length=1 can satisfy the (hasBranch==0 && hasHash==0) condition
  • other records in TrieAccount and TrieStorage must have (hasTree!=0 || hasHash!=0)

const TrieOfStorage = "TrieStorage"
const VerkleRoots = "VerkleRoots"

Mapping [block number] => [Verkle Root]

const VerkleTrie = "VerkleTrie"

Mapping [Verkle Root] => [Rlp-Encoded Verkle Node]

Variables

var (
	ErrAttemptToDeleteNonDeprecatedBucket = errors.New("only buckets from dbutils.ChaindataDeprecatedTables can be deleted")
	ErrUnknownBucket                      = errors.New("unknown bucket. add it to dbutils.ChaindataTables")

	DbSize    = metrics.NewCounter(`db_size`)    //nolint
	TxLimit   = metrics.NewCounter(`tx_limit`)   //nolint
	TxSpill   = metrics.NewCounter(`tx_spill`)   //nolint
	TxUnspill = metrics.NewCounter(`tx_unspill`) //nolint
	TxDirty   = metrics.NewCounter(`tx_dirty`)   //nolint

	DbCommitPreparation = metrics.GetOrCreateSummary(`db_commit_seconds{phase="preparation"}`) //nolint
	DbCommitGc          = metrics.GetOrCreateSummary(`db_commit_seconds{phase="gc"}`)          //nolint
	DbCommitAudit       = metrics.GetOrCreateSummary(`db_commit_seconds{phase="audit"}`)       //nolint
	DbCommitWrite       = metrics.GetOrCreateSummary(`db_commit_seconds{phase="write"}`)       //nolint
	DbCommitSync        = metrics.GetOrCreateSummary(`db_commit_seconds{phase="sync"}`)        //nolint
	DbCommitEnding      = metrics.GetOrCreateSummary(`db_commit_seconds{phase="ending"}`)      //nolint
	DbCommitTotal       = metrics.GetOrCreateSummary(`db_commit_seconds{phase="total"}`)       //nolint

	DbPgopsNewly   = metrics.NewCounter(`db_pgops_newly`)           //nolint
	DbPgopsCow     = metrics.NewCounter(`db_pgops_cow`)             //nolint
	DbPgopsClone   = metrics.NewCounter(`db_pgops_clone`)           //nolint
	DbPgopsSplit   = metrics.NewCounter(`db_pgops_split`)           //nolint
	DbPgopsMerge   = metrics.NewCounter(`db_pgops_merge`)           //nolint
	DbPgopsSpill   = metrics.NewCounter(`db_pgops_spill`)           //nolint
	DbPgopsUnspill = metrics.NewCounter(`db_pgops_unspill`)         //nolint
	DbPgopsWops    = metrics.NewCounter(`db_pgops_wops`)            //nolint
	DbPgopsGcrtime = metrics.GetOrCreateSummary(`db_pgops_gcrtime`) //nolint

	GcLeafMetric     = metrics.NewCounter(`db_gc_leaf`)     //nolint
	GcOverflowMetric = metrics.NewCounter(`db_gc_overflow`) //nolint
	GcPagesMetric    = metrics.NewCounter(`db_gc_pages`)    //nolint

)
var (
	//StorageModeTEVM - does not translate EVM to TEVM
	StorageModeTEVM = []byte("smTEVM")

	PruneTypeOlder  = []byte("older")
	PruneTypeBefore = []byte("before")

	PruneHistory        = []byte("pruneHistory")
	PruneHistoryType    = []byte("pruneHistoryType")
	PruneReceipts       = []byte("pruneReceipts")
	PruneReceiptsType   = []byte("pruneReceiptsType")
	PruneTxIndex        = []byte("pruneTxIndex")
	PruneTxIndexType    = []byte("pruneTxIndexType")
	PruneCallTraces     = []byte("pruneCallTraces")
	PruneCallTracesType = []byte("pruneCallTracesType")

	DBSchemaVersionKey = []byte("dbVersion")

	BittorrentPeerID            = "peerID"
	CurrentHeadersSnapshotHash  = []byte("CurrentHeadersSnapshotHash")
	CurrentHeadersSnapshotBlock = []byte("CurrentHeadersSnapshotBlock")
	CurrentBodiesSnapshotHash   = []byte("CurrentBodiesSnapshotHash")
	CurrentBodiesSnapshotBlock  = []byte("CurrentBodiesSnapshotBlock")
	PlainStateVersion           = []byte("PlainStateVersion")

	LightClientStore            = []byte("LightClientStore")
	LightClientFinalityUpdate   = []byte("LightClientFinalityUpdate")
	LightClientOptimisticUpdate = []byte("LightClientOptimisticUpdate")
)

Keys

var ChaindataDeprecatedTables = []string{
	Clique,
	TransitionBlockKey,
}

ChaindataDeprecatedTables - list of buckets which can be programmatically deleted - for example after migration

var ChaindataTables = []string{}/* 101 elements not displayed */

ChaindataTables - list of all buckets. The app will panic if some bucket is not in this list. This list is sorted in the `init` method. ChaindataTablesCfg can be used to find a table's index by name in the sorted version of the ChaindataTables list.

var ChaindataTablesCfg = TableCfg{
	HashedStorage: {
		Flags:                     DupSort,
		AutoDupSortKeysConversion: true,
		DupFromLen:                72,
		DupToLen:                  40,
	},
	AccountChangeSet: {Flags: DupSort},
	StorageChangeSet: {Flags: DupSort},
	PlainState: {
		Flags:                     DupSort,
		AutoDupSortKeysConversion: true,
		DupFromLen:                60,
		DupToLen:                  28,
	},
	CallTraceSet: {Flags: DupSort},

	AccountKeys:           {Flags: DupSort},
	AccountHistoryKeys:    {Flags: DupSort},
	AccountIdx:            {Flags: DupSort},
	StorageKeys:           {Flags: DupSort},
	StorageHistoryKeys:    {Flags: DupSort},
	StorageIdx:            {Flags: DupSort},
	CodeKeys:              {Flags: DupSort},
	CodeHistoryKeys:       {Flags: DupSort},
	CodeIdx:               {Flags: DupSort},
	CommitmentKeys:        {Flags: DupSort},
	CommitmentHistoryKeys: {Flags: DupSort},
	CommitmentIdx:         {Flags: DupSort},
	LogAddressKeys:        {Flags: DupSort},
	LogAddressIdx:         {Flags: DupSort},
	LogTopicsKeys:         {Flags: DupSort},
	LogTopicsIdx:          {Flags: DupSort},
	TracesFromKeys:        {Flags: DupSort},
	TracesFromIdx:         {Flags: DupSort},
	TracesToKeys:          {Flags: DupSort},
	TracesToIdx:           {Flags: DupSort},
	RAccountKeys:          {Flags: DupSort},
	RAccountIdx:           {Flags: DupSort},
	RStorageKeys:          {Flags: DupSort},
	RStorageIdx:           {Flags: DupSort},
	RCodeKeys:             {Flags: DupSort},
	RCodeIdx:              {Flags: DupSort},
}
var DBSchemaVersion = types.VersionReply{Major: 6, Minor: 0, Patch: 0}

DBSchemaVersion versions list:

5.0 - BlockTransaction table now has canonical ids (txs of non-canonical blocks are moved to the NonCanonicalTransaction table)
6.0 - BlockTransaction table now has system-txs before and after each block (records are absent if a block has no system-tx, but the sequence keeps increasing)

var DownloaderTables = []string{
	BittorrentCompletion,
	BittorrentInfo,
}
var DownloaderTablesCfg = TableCfg{}
var ErrChanged = fmt.Errorf("key must not change")
var ErrNotSupported = errors.New("not supported")
var ReconTablesCfg = TableCfg{
	PlainStateD:    {Flags: DupSort},
	CodeD:          {Flags: DupSort},
	PlainContractD: {Flags: DupSort},
}
var SentryTables = []string{}
var SentryTablesCfg = TableCfg{}
var TxpoolTablesCfg = TableCfg{}

Functions

func BigChunks

func BigChunks(db RoDB, table string, from []byte, walker func(tx Tx, k, v []byte) (bool, error)) error

BigChunks - reads `table` in big chunks, restarting the read transaction every 1 minute
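
A possible usage sketch based on the signature above (the import path is an assumption, and the boolean return is assumed to mean "continue iterating"):

package example

import "github.com/ledgerwatch/erigon-lib/kv" // assumed import path

// scanTable walks a whole table via BigChunks; the periodic transaction
// restart keeps long scans from pinning an old reader snapshot.
func scanTable(db kv.RoDB) error {
	return kv.BigChunks(db, kv.BlockBody, nil, func(tx kv.Tx, k, v []byte) (bool, error) {
		// ... process k, v
		return true, nil // assumed: true = continue, false = stop early
	})
}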

func DefaultPageSize

func DefaultPageSize() uint64

func EnsureNotChangedBool

func EnsureNotChangedBool(tx GetPut, bucket string, k []byte, value bool) (ok, enabled bool, err error)

EnsureNotChangedBool - used to store immutable config flags in the db. Protects from human mistakes.

func FirstKey

func FirstKey(tx Tx, table string) ([]byte, error)

FirstKey - a candidate to move to the kv.Tx interface

func GetBool

func GetBool(tx Getter, bucket string, k []byte) (enabled bool, err error)

func LastKey

func LastKey(tx Tx, table string) ([]byte, error)

LastKey - a candidate to move to the kv.Tx interface

func ReadAhead

func ReadAhead(ctx context.Context, db RoDB, progress *atomic.Bool, table string, from []byte, amount uint32)

Types

type Bucket

type Bucket string

type BucketMigrator

type BucketMigrator interface {
	DropBucket(string) error
	CreateBucket(string) error
	ExistsBucket(string) (bool, error)
	ClearBucket(string) error
	ListBuckets() ([]string, error)
}

BucketMigrator is used for bucket migration; don't use it in usual app code

type Closer

type Closer interface {
	Close()
}

type CmpFunc

type CmpFunc func(k1, k2, v1, v2 []byte) int

type Cursor

type Cursor interface {
	First() ([]byte, []byte, error)               // First - position at first key/data item
	Seek(seek []byte) ([]byte, []byte, error)     // Seek - position at first key greater than or equal to specified key
	SeekExact(key []byte) ([]byte, []byte, error) // SeekExact - position at exact matching key if exists
	Next() ([]byte, []byte, error)                // Next - position at next key/value (can iterate over DupSort key/values automatically)
	Prev() ([]byte, []byte, error)                // Prev - position at previous key
	Last() ([]byte, []byte, error)                // Last - position at last key and last possible value
	Current() ([]byte, []byte, error)             // Current - return key/data at current cursor position

	Count() (uint64, error) // Count - fast way to calculate amount of keys in bucket. It counts all keys even if Prefix was set.

	Close()
}

Cursor - class for navigating through a database. CursorDupSort inherits this interface.

If methods (like First/Next/Seek) return an error, then the returned key SHOULD not be nil (it can be []byte{}, for example). Then looping code will look like:

c := kv.Cursor(bucketName)
for k, v, err := c.First(); k != nil; k, v, err = c.Next() {
   if err != nil {
       return err
   }
   ... logic
}

type CursorDupSort

type CursorDupSort interface {
	Cursor

	// SeekBothExact -
	// second parameter can be nil only if the searched key has no duplicates, otherwise an error is returned
	SeekBothExact(key, value []byte) ([]byte, []byte, error)
	SeekBothRange(key, value []byte) ([]byte, error) // SeekBothRange - exact match of the key, but range match of the value
	FirstDup() ([]byte, error)                       // FirstDup - position at first data item of current key
	NextDup() ([]byte, []byte, error)                // NextDup - position at next data item of current key
	NextNoDup() ([]byte, []byte, error)              // NextNoDup - position at first data item of next key
	LastDup() ([]byte, error)                        // LastDup - position at last data item of current key

	CountDuplicates() (uint64, error) // CountDuplicates - number of duplicates for the current key
}
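
A sketch of the usual traversal with these methods - NextNoDup jumps between distinct keys, NextDup walks the duplicates of the current key (import path is an assumption):

package example

import (
	"fmt"

	"github.com/ledgerwatch/erigon-lib/kv" // assumed import path
)

// walkDupSorted prints every (key, duplicate-value) pair of a DupSort table.
func walkDupSorted(tx kv.Tx, table string) error {
	c, err := tx.CursorDupSort(table)
	if err != nil {
		return err
	}
	defer c.Close()
	for k, v, err := c.First(); k != nil; k, v, err = c.NextNoDup() {
		if err != nil {
			return err
		}
		// inner loop: all duplicate values stored under the current key
		for ; v != nil; _, v, err = c.NextDup() {
			if err != nil {
				return err
			}
			fmt.Printf("%x => %x\n", k, v)
		}
		if err != nil { // error that ended the inner loop
			return err
		}
	}
	return nil
}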

type DBI

type DBI uint

type DBVerbosityLvl

type DBVerbosityLvl int8

type Deleter

type Deleter interface {
	// Delete removes a single entry.
	Delete(table string, k []byte) error
}

Deleter wraps the database delete operations.

type GetPut

type GetPut interface {
	Getter
	Putter
}

type Getter

type Getter interface {
	Has

	// GetOne references a readonly section of memory that must not be accessed after txn has terminated
	GetOne(bucket string, key []byte) (val []byte, err error)

	// ForEach iterates over entries with keys greater or equal to fromPrefix.
	// walker is called for each eligible entry.
	// If walker returns an error:
	//   - implementations of local db - stop
	//   - implementations of remote db - do not handle this error and may finish (send all entries to client) before error happen.
	ForEach(bucket string, fromPrefix []byte, walker func(k, v []byte) error) error
	ForPrefix(bucket string, prefix []byte, walker func(k, v []byte) error) error
	ForAmount(bucket string, prefix []byte, amount uint32, walker func(k, v []byte) error) error
}

type Has

type Has interface {
	// Has indicates whether a key exists in the database.
	Has(bucket string, key []byte) (bool, error)
}

type Label

type Label uint8
const (
	ChainDB      Label = 0
	TxPoolDB     Label = 1
	SentryDB     Label = 2
	ConsensusDB  Label = 3
	DownloaderDB Label = 4
)

func (Label) String

func (l Label) String() string

type Putter

type Putter interface {
	// Put inserts or updates a single entry.
	Put(table string, k, v []byte) error
}

Putter wraps the database write operations.

type RoDB

type RoDB interface {
	Closer

	View(ctx context.Context, f func(tx Tx) error) error

	// BeginRo - creates transaction
	// 	tx may be discarded by .Rollback() method
	//
	// A transaction and its cursors must only be used by a single
	// 	thread (not goroutine), and a thread may only have a single transaction at a time.
	//  This happens automatically, because this method calls runtime.LockOSThread() inside (Rollback/Commit releases it).
	//  For this reason application code can't call runtime.UnlockOSThread() - it leads to undefined behavior.
	//
	// If this `parent` is non-NULL, the new transaction
	//	will be a nested transaction, with the transaction indicated by parent
	//	as its parent. Transactions may be nested to any level. A parent
	//	transaction and its cursors may not issue any other operations than
	//	Commit and Rollback while it has active child transactions.
	BeginRo(ctx context.Context) (Tx, error)
	AllBuckets() TableCfg
	PageSize() uint64
}

RoDB - Read-only version of KV.

type RwCursor

type RwCursor interface {
	Cursor

	Put(k, v []byte) error           // Put - based on order
	Append(k []byte, v []byte) error // Append - append the given key/data pair to the end of the database. This option allows fast bulk loading when keys are already known to be in the correct order.
	Delete(k []byte) error           // Delete - short version of SeekExact+DeleteCurrent or SeekBothExact+DeleteCurrent

	// DeleteCurrent This function deletes the key/data pair to which the cursor refers.
	// This does not invalidate the cursor, so operations such as MDB_NEXT
	// can still be used on it.
	// Both MDB_NEXT and MDB_GET_CURRENT will return the same record after
	// this operation.
	DeleteCurrent() error
}
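
For the Append fast path above, a small bulk-load sketch (assumes keys arrive pre-sorted in the table's key order; import path is an assumption):

package example

import "github.com/ledgerwatch/erigon-lib/kv" // assumed import path

// bulkLoad appends pre-sorted pairs. Append skips the usual ordered-insert
// search, so it is the fast path for bulk loading.
func bulkLoad(tx kv.RwTx, table string, sortedKeys, vals [][]byte) error {
	c, err := tx.RwCursor(table)
	if err != nil {
		return err
	}
	defer c.Close()
	for i := range sortedKeys {
		if err := c.Append(sortedKeys[i], vals[i]); err != nil {
			return err // Append fails if a key is out of order
		}
	}
	return nil
}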

type RwCursorDupSort

type RwCursorDupSort interface {
	CursorDupSort
	RwCursor

	PutNoDupData(key, value []byte) error // PutNoDupData - inserts key without dupsort
	DeleteCurrentDuplicates() error       // DeleteCurrentDuplicates - deletes all of the data items for the current key
	DeleteExact(k1, k2 []byte) error      // DeleteExact - delete 1 value from given key
	AppendDup(key, value []byte) error    // AppendDup - same as Append, but for sorted dup data
}

type RwDB

type RwDB interface {
	RoDB

	Update(ctx context.Context, f func(tx RwTx) error) error

	BeginRw(ctx context.Context) (RwTx, error)
	BeginRwAsync(ctx context.Context) (RwTx, error)
}

RwDB - low-level database interface; its main goal is to provide a common abstraction on top of MDBX and RemoteKV.

Common pattern for short-living transactions:

if err := db.View(ctx, func(tx ethdb.Tx) error {
	// ... code which uses database in transaction
	return nil
}); err != nil {
	return err
}

Common pattern for long-living transactions:

tx, err := db.BeginRw(ctx)
if err != nil {
	return err
}
defer tx.Rollback()

... code which uses database in transaction

err = tx.Commit()
if err != nil {
	return err
}

type RwTx

type RwTx interface {
	Tx
	StatelessWriteTx
	BucketMigrator

	RwCursor(bucket string) (RwCursor, error)
	RwCursorDupSort(bucket string) (RwCursorDupSort, error)

	// CollectMetrics - collects all DB-related and Tx-related metrics
	// this method exists only in RwTx to avoid concurrency
	CollectMetrics()
	Reset() error
}

RwTx

WARNING:

  • RwTx is not threadsafe and may only be used in the goroutine that created it.
  • ReadOnly transactions do not lock the goroutine to a thread; RwTx does
  • The user can't call runtime.LockOSThread/runtime.UnlockOSThread in the same goroutine until RwTx Commit/Rollback

type StatelessReadTx

type StatelessReadTx interface {
	Getter

	Commit() error // Commit all the operations of a transaction into the database.
	Rollback()     // Rollback - abandon all the operations of the transaction instead of saving them.

	// ReadSequence - allows creating a linear sequence of unique positive integers for each table.
	// Can be called for a read transaction to retrieve the current sequence value, and the increment must be zero.
	// Sequence changes become visible outside the current write transaction after it is committed, and discarded on abort.
	// Starts from 0.
	ReadSequence(bucket string) (uint64, error)

	BucketSize(bucket string) (uint64, error)
}

type StatelessRwTx

type StatelessRwTx interface {
	StatelessReadTx
	StatelessWriteTx
}

type StatelessWriteTx

type StatelessWriteTx interface {
	Putter
	Deleter

	/*
		// if need N id's:
		baseId, err := tx.IncrementSequence(bucket, N)
		if err != nil {
		   return err
		}
		for i := 0; i < N; i++ {    // if N == 0, it will work as expected
		    id := baseId + i
		    // use id
		}


		// or if need only 1 id:
		id, err := tx.IncrementSequence(bucket, 1)
		if err != nil {
		    return err
		}
		// use id
	*/
	IncrementSequence(bucket string, amount uint64) (uint64, error)
	Append(bucket string, k, v []byte) error
	AppendDup(bucket string, k, v []byte) error
}

type TableCfg

type TableCfg map[string]TableCfgItem

type TableCfgItem

type TableCfgItem struct {
	Flags TableFlags
	// AutoDupSortKeysConversion - enables some keys transformation - to change db layout without changing app code.
	// Use it wisely - it helps to do experiments with DB format faster, but better reduce amount of Magic in app.
	// If good DB format found, push app code to accept this format and then disable this property.
	AutoDupSortKeysConversion bool
	IsDeprecated              bool
	DBI                       DBI
	// DupFromLen - if the user provides a key of this length, then the following transformation is applied:
	// v = append(k[DupToLen:], v...)
	// k = k[:DupToLen]
	// And opposite at retrieval
	// Works only if AutoDupSortKeysConversion enabled
	DupFromLen int
	DupToLen   int
}
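
The transformation described in the comment above, written out as a hypothetical helper (for PlainState the configured lengths are DupFromLen=60 and DupToLen=28, per ChaindataTablesCfg):

package example

// splitForDupSort shows AutoDupSortKeysConversion: a key of DupFromLen bytes
// is shortened to DupToLen bytes and the cut-off tail is prepended to the
// value; retrieval applies the opposite transformation.
func splitForDupSort(k, v []byte, dupFromLen, dupToLen int) (newK, newV []byte) {
	if len(k) != dupFromLen {
		return k, v // conversion applies only to keys of exactly DupFromLen
	}
	newV = append(append([]byte{}, k[dupToLen:]...), v...)
	newK = k[:dupToLen]
	return newK, newV
}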

type TableFlags

type TableFlags uint
const (
	Default    TableFlags = 0x00
	ReverseKey TableFlags = 0x02
	DupSort    TableFlags = 0x04
	IntegerKey TableFlags = 0x08
	IntegerDup TableFlags = 0x20
	ReverseDup TableFlags = 0x40
)

type Tx

type Tx interface {
	StatelessReadTx

	// ID returns the identifier associated with this transaction. For a
	// read-only transaction, this corresponds to the snapshot being read;
	// concurrent readers will frequently have the same transaction ID.
	ViewID() uint64

	// Cursor - creates cursor object on top of given bucket. Type of cursor - depends on bucket configuration.
	// If bucket was created with mdbx.DupSort flag, then cursor with interface CursorDupSort created
	// Otherwise - object of interface Cursor created
	//
	// Cursor, also provides a grain of magic - it can use a declarative configuration - and automatically break
	// long keys into DupSort key/values. See docs for `bucket.go:TableCfgItem`
	Cursor(bucket string) (Cursor, error)
	CursorDupSort(bucket string) (CursorDupSort, error) // CursorDupSort - can be used if bucket has mdbx.DupSort flag

	ForEach(bucket string, fromPrefix []byte, walker func(k, v []byte) error) error
	ForPrefix(bucket string, prefix []byte, walker func(k, v []byte) error) error
	ForAmount(bucket string, prefix []byte, amount uint32, walker func(k, v []byte) error) error

	DBSize() (uint64, error)
}

Tx WARNING:

  • Tx is not threadsafe and may only be used in the goroutine that created it
  • ReadOnly transactions do not lock the goroutine to a thread; RwTx does

Directories

Path Synopsis
temporal
