kv

package v0.0.0-...-c80a3f0

Published: Dec 20, 2022 License: Apache-2.0 Imports: 12 Imported by: 0

README

The Ethdb package holds a bouquet of objects to access the DB

Words "KV" and "DB" have special meaning here:

  • KV - key-value-style API to access data: let developer manage transactions, stateful cursors.
  • DB - object-oriented-style API to access data: Get/Put/Delete/WalkOverTable/MultiPut, managing transactions internally.

So, the DB abstraction fits 95% of cases and leads to more maintainable code - because it looks stateless.

About "key-value-style": Modern key-value databases don't provide Get/Put/Delete methods, because it's very hard-drive-unfriendly - it pushes developers do random-disk-access which is order of magnitude slower than sequential read. To enforce sequential-reads - introduced stateful cursors/iterators - they intentionally look as file-api: open_cursor/seek/write_data_from_current_position/move_to_end/step_back/step_forward/delete_key_on_current_position/append.

Class diagram:

// This is not a call graph - it just shows classes from low-level to high-level,
// and which classes satisfy which interfaces.

                    +-----------------------------------+   +-----------------------------------+ 
                    |  github.com/torquem-ch/mdbx-go    |   | google.golang.org/grpc.ClientConn |                    
                    |  (app-agnostic MDBX go bindings)  |   | (app-agnostic RPC and streaming)  |
                    +-----------------------------------+   +-----------------------------------+
                                      |                                      |
                                      |                                      |
                                      v                                      v
                    +-----------------------------------+   +-----------------------------------+
                    |       ethdb/kv_mdbx.go            |   |       ethdb/kv_remote.go          |                
                    | (tg-specific MDBX implementation) |   |   (tg-specific remote DB access)  |
                    +-----------------------------------+   +-----------------------------------+
                                      |                                      |
                                      |                                      |
                                      v                                      v
            +----------------------------------------------------------------------------------------------+
            |                                       ethdb/kv_abstract.go                                   |  
            |         (Common KV interface. DB-friendly, disk-friendly, cpu-cache-friendly.                |
            |           Same app code can work with local or remote database.                              |
            |           Allows experimenting with other database implementations.                          |
            |          Supports context.Context for cancellation. Any operation can return error)          |
            +----------------------------------------------------------------------------------------------+

ethdb.AbstractKV design:

  • InMemory, ReadOnly: NewMDBX().Flags(mdbx.ReadOnly).InMem().Open()

  • MultipleDatabases, Customization: NewMDBX().Path(path).WithBucketsConfig(config).Open()

  • 1 Transaction object can be used only within 1 goroutine.

  • Only 1 write transaction can be active at a time (others will wait).

  • Unlimited read transactions can be active concurrently (not blocked by write transaction).

  • Methods db.Update, db.View - can be used to open and close short transactions (see the View/Update sketch after this list).

  • Methods Begin/Commit/Rollback - for long transactions.

  • It's safe to call .Rollback() after .Commit(); multiple rollbacks are also safe. Common transaction pattern:

tx, err := db.Begin(true, ethdb.RW)
if err != nil {
    return err
}
defer tx.Rollback() // important to avoid transactions leak at panic or early return

// ... code which uses database in transaction
 
err = tx.Commit()
if err != nil {
    return err
}
  • No internal copies/allocations. This means: 1. the app must copy keys/values before putting them into the database; 2. data read from the DB is valid only during the current transaction - copy it if you plan to use it after the transaction's Commit/Rollback.

  • Methods .Bucket() and .Cursor() can't return nil and can't return an error.

  • Bucket and Cursor are interfaces - meaning different classes can satisfy them: for example, the MdbxCursor and MdbxDupSortCursor classes do. If you are not familiar with the "DupSort" concept, please read dupsort.md first.

  • If a cursor method returns err != nil, then the returned key SHOULD be != nil (it can be []byte{}, for example). Then traversal code looks like:

for k, v, err := c.First(); k != nil; k, v, err = c.Next() {
    if err != nil {
        return err
    }
    // logic
}
  • Move cursor: cursor.Seek(key)
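
The db.Update/db.View methods mentioned in the list above wrap the Begin/Rollback/Commit pattern for short transactions. A minimal sketch, assuming a kv.RwDB handle and an example table name (import path is an assumption):

package example

import (
	"context"

	"github.com/ledgerwatch/erigon-lib/kv" // assumed import path
)

// roundTrip writes a pair in one short transaction and reads it back in
// another. Update/View open and close the transaction internally.
func roundTrip(ctx context.Context, db kv.RwDB) ([]byte, error) {
	if err := db.Update(ctx, func(tx kv.RwTx) error {
		return tx.Put("MyTable", []byte("k"), []byte("v"))
	}); err != nil {
		return nil, err
	}
	var v []byte
	if err := db.View(ctx, func(tx kv.Tx) error {
		val, err := tx.GetOne("MyTable", []byte("k"))
		if err != nil {
			return err
		}
		v = append([]byte{}, val...) // copy: data is valid only inside the tx
		return nil
	}); err != nil {
		return nil, err
	}
	return v, nil
}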

ethdb.Database design:

  • Allows passing multiple implementations
  • Allows traversing tables via db.Walk

ethdb.TxDb design:

  • holds 1 long-running transaction and 1 cursor per table inside
  • the Begin method DOESN'T create a new TxDb object, meaning this object can be passed into other objects by pointer, and high-level app code can start/commit transactions when it needs to without re-creating all objects that hold the TxDb pointer
  • this is the reason why the txDb.CommitAndBegin() method works: it creates a new transaction object inside, and the pointer to TxDb stays valid

How to dump/load table

Install all database tools: make db-tools

./build/bin/mdbx_dump -a <datadir>/erigon/chaindata | lz4 > dump.lz4
lz4 -d < dump.lz4 | ./build/bin/mdbx_load -an <datadir>/erigon/chaindata

How to get table checksum

./build/bin/mdbx_dump -s table_name <datadir>/erigon/chaindata | tail -n +4 | sha256sum # tail here is for excluding header 

Header example:
VERSION=3
geometry=l268435456,c268435456,u25769803776,s268435456,g268435456
mapsize=756375552
maxreaders=120
format=bytevalue
database=TBL0001
type=btree
db_pagesize=4096
duplicates=1
dupsort=1
HEADER=END

Documentation


Constants

const (

	//HashedAccounts
	// key - address hash
	// value - account encoded for storage
	// Contains Storage:
	//key - address hash + incarnation + storage key hash
	//value - storage value(common.hash)
	HashedAccounts = "HashedAccount"
	HashedStorage  = "HashedStorage"
)
const (

	//key - contract code hash
	//value - contract code
	Code = "Code"

	//key - addressHash+incarnation
	//value - code hash
	ContractCode = "HashedCodeHash"

	// IncarnationMap for deleted accounts
	//key - address
	//value - incarnation of account when it was last deleted
	IncarnationMap = "IncarnationMap"

	//TEVMCode -
	//key - contract code hash
	//value - contract TEVM code
	ContractTEVMCode = "TEVMCode"
)
const (
	// DatabaseInfo is used to store information about data layout.
	DatabaseInfo = "DbInfo"

	// Data item prefixes (use single byte to avoid mixing data types, avoid `i`, used for indexes).
	HeaderNumber = "HeaderNumber" // header_hash -> num_u64

	HeaderCanonical = "CanonicalHeader"        // block_num_u64 -> header hash
	Headers         = "Header"                 // block_num_u64 + hash -> header (RLP)
	HeaderTD        = "HeadersTotalDifficulty" // block_num_u64 + hash -> td (RLP)

	BlockBody = "BlockBody" // block_num_u64 + hash -> block body

	// EthTx - stores only txs of canonical blocks. As a result, ids used in this table are also
	// canonical - the same across all nodes in the network, regardless of reorgs. Transactions of
	// non-canonical blocks are not removed but moved to NonCanonicalTransaction - so during a re-org
	// there is no need to re-download blocks from the network.
	// This table also has system-txs before and after each block: if a
	// block has no system-tx - records are absent, but the sequence keeps increasing
	EthTx           = "BlockTransaction"        // tbl_sequence_u64 -> rlp(tx)
	NonCanonicalTxs = "NonCanonicalTransaction" // tbl_sequence_u64 -> rlp(tx)
	MaxTxNum        = "MaxTxNum"                // block_number_u64 -> max_tx_num_in_block_u64

	Receipts = "Receipt"        // block_num_u64 -> canonical block receipts (non-canonical are not stored)
	Log      = "TransactionLog" // block_num_u64 + txId -> logs of transaction

	// Stores bitmap indices - in which block numbers logs of a given 'address' or 'topic' were seen
	// [addr or topic] + [2 bytes inverted shard number] -> bitmap(blockN)
	// indices are sharded because some bitmaps are >1Mb, and processing a new incoming block
	//	 updates ~300 bitmaps by appending a small amount of new values. This causes large writes (MDBX does copy-on-write).
	//
	// if the last existing shard is small - merge it with the delta
	// if the serialized size of the delta > ShardLimit - break it down into multiple shards
	// shard number - the biggest value in the bitmap
	LogTopicIndex   = "LogTopicIndex"
	LogAddressIndex = "LogAddressIndex"

	// CallTraceSet is the name of the table that contains the mapping of block number to the set (sorted) of all accounts
	// touched by call traces. It is a DupSort-ed table
	// 8-byte BE block number -> account address -> two bits (one for "from", another for "to")
	CallTraceSet = "CallTraceSet"
	// Indices for call traces - have the same format as LogTopicIndex and LogAddressIndex
	// Store bitmap indices - in which block number we saw calls from (CallFromIndex) or to (CallToIndex) some addresses
	CallFromIndex = "CallFromIndex"
	CallToIndex   = "CallToIndex"

	// Cumulative indexes for estimation of stage execution
	CumulativeGasIndex         = "CumulativeGasIndex"
	CumulativeTransactionIndex = "CumulativeTransactionIndex"

	TxLookup = "BlockTransactionLookup" // hash -> transaction/receipt lookup metadata

	ConfigTable = "Config" // config prefix for the db

	// Progress of sync stages: stageName -> stageData
	SyncStageProgress = "SyncStage"

	Clique             = "Clique"
	CliqueSeparate     = "CliqueSeparate"
	CliqueSnapshot     = "CliqueSnapshot"
	CliqueLastSnapshot = "CliqueLastSnapshot"

	// Snapshot table used for Binance Smart Chain's consensus engine Parlia
	// Schema of key/value pairs containing:
	// Key (string): SnapshotFullKey = SnapshotBucket + num (uint64 big endian) + hash
	// Value (JSON blob):
	// {
	//     "number"             // Block number where the snapshot was created
	//     "hash"               // Block hash where the snapshot was created
	//     "validators"         // Set of authorized validators at this moment
	//     "recents"            // Set of recent validators for spam protections
	//     "recent_fork_hashes" // Set of recent forkHash
	// }
	ParliaSnapshot = "ParliaSnapshot"

	// Proof-of-stake
	// Beacon chain head that is being executed at the current time
	CurrentExecutionPayload = "CurrentExecutionPayload"

	// NodeRecords stores P2P node records (ENR)
	NodeRecords = "NodeRecord"
	// Inodes stores P2P discovery service info about the nodes
	Inodes = "Inode"

	// Transaction senders - stored separately from the block bodies
	Senders = "TxSender" // block_num_u64 + blockHash -> sendersList (no serialization format, every 20 bytes is new sender)

	// headBlockKey tracks the latest known full block's hash.
	HeadBlockKey = "LastBlock"

	HeadHeaderKey = "LastHeader"

	// headBlockHash, safeBlockHash, finalizedBlockHash of the latest Engine API forkchoice
	LastForkchoice = "LastForkchoice"

	// TransitionBlockKey tracks the last proof-of-work block
	TransitionBlockKey = "TransitionBlock"

	// migrationName -> serialized SyncStageProgress and SyncStageUnwind buckets
	// stores stage progress to understand in which context a migration was executed
	// in case of a bug report, a developer can ask for the contents of this bucket
	Migrations = "Migration"

	Sequence = "Sequence" // tbl_name -> seq_u64

	Epoch        = "DevEpoch"        // block_num_u64+block_hash->transition_proof
	PendingEpoch = "DevPendingEpoch" // block_num_u64+block_hash->transition_proof

	Issuance = "Issuance" // block_num_u64->RLP(issuance+burnt[0 if < london])

	StateAccounts   = "StateAccounts"
	StateStorage    = "StateStorage"
	StateCode       = "StateCode"
	StateCommitment = "StateCommitment"

	BorReceipts = "BorReceipt"
	BorTxLookup = "BlockBorTransactionLookup" // transaction_hash -> block_num_u64
	BorSeparate = "BorSeparate"

	// Downloader
	BittorrentCompletion = "BittorrentCompletion"
	BittorrentInfo       = "BittorrentInfo"

	// Domains and Inverted Indices
	AccountKeys        = "AccountKeys"
	AccountVals        = "AccountVals"
	AccountHistoryKeys = "AccountHistoryKeys"
	AccountHistoryVals = "AccountHistoryVals"
	AccountSettings    = "AccountSettings"
	AccountIdx         = "AccountIdx"

	StorageKeys        = "StorageKeys"
	StorageVals        = "StorageVals"
	StorageHistoryKeys = "StorageHistoryKeys"
	StorageHistoryVals = "StorageHistoryVals"
	StorageSettings    = "StorageSettings"
	StorageIdx         = "StorageIdx"

	CodeKeys        = "CodeKeys"
	CodeVals        = "CodeVals"
	CodeHistoryKeys = "CodeHistoryKeys"
	CodeHistoryVals = "CodeHistoryVals"
	CodeSettings    = "CodeSettings"
	CodeIdx         = "CodeIdx"

	CommitmentKeys        = "CommitmentKeys"
	CommitmentVals        = "CommitmentVals"
	CommitmentHistoryKeys = "CommitmentHistoryKeys"
	CommitmentHistoryVals = "CommitmentHistoryVals"
	CommitmentSettings    = "CommitmentSettings"
	CommitmentIdx         = "CommitmentIdx"

	LogAddressKeys = "LogAddressKeys"
	LogAddressIdx  = "LogAddressIdx"
	LogTopicsKeys  = "LogTopicsKeys"
	LogTopicsIdx   = "LogTopicsIdx"

	TracesFromKeys = "TracesFromKeys"
	TracesFromIdx  = "TracesFromIdx"
	TracesToKeys   = "TracesToKeys"
	TracesToIdx    = "TracesToIdx"

	Snapshots = "Snapshots" // name -> hash

	RAccountKeys = "RAccountKeys"
	RAccountIdx  = "RAccountIdx"
	RStorageKeys = "RStorageKeys"
	RStorageIdx  = "RStorageIdx"
	RCodeKeys    = "RCodeKeys"
	RCodeIdx     = "RCodeIdx"

	PlainStateR    = "PlainStateR"    // temporary table for PlainState reconstitution
	PlainStateD    = "PlainStateD"    // temporary table for PlainState reconstitution, deletes
	CodeR          = "CodeR"          // temporary table for Code reconstitution
	CodeD          = "CodeD"          // temporary table for Code reconstitution, deletes
	PlainContractR = "PlainContractR" // temporary table for PlainContract reconstitution
	PlainContractD = "PlainContractD" // temporary table for PlainContract reconstitution, deletes

	// Erigon-CL
	BeaconState = "BeaconState"
	// [slot + block root] => [signature + block without execution payload]
	BeaconBlocks = "BeaconBlock"

	// LightClientStore => LightClientStore object
	// LightClientFinalityUpdate => latest finality update
	// LightClientOptimisticUpdate => latest optimistic update
	LightClient = "LightClient"
	// Period (one every 27 hours) => LightClientUpdate
	LightClientUpdates = "LightClientUpdates"
)
const (
	RecentLocalTransaction = "RecentLocalTransaction" // sequence_u64 -> tx_hash
	PoolTransaction        = "PoolTransaction"        // txHash -> sender_id_u64+tx_rlp
	PoolInfo               = "PoolInfo"               // option_key -> option_value
)
const AccountChangeSet = "AccountChangeSet"

AccountChangeSet and StorageChangeSet - for block N, they store the values of state before block N changed them (the values "after" the change are stored in PlainState). Logical format:

key - blockNum_u64 + key_in_plain_state
value - value_in_plain_state_before_blockNum_changes

Example: If block N changed account A from value X to Y. Then:

AccountChangeSet has record: bigEndian(N) + A -> X
PlainState has record: A -> Y

See also: docs/programmers_guide/db_walkthrough.MD#table-history-of-accounts

As you can see, if block N changes many accounts, then all records share the repetitive prefix `bigEndian(N)`. MDBX can store such prefixes only once - via the DupSort feature (see `docs/programmers_guide/dupsort.md`). Both buckets are DupSort-ed and have the following physical format. AccountChangeSet:

key - blockNum_u64
value - address + account(encoded)

StorageChangeSet:

key - blockNum_u64 + address + incarnation_u64
value - plain_storage_key + value
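
For illustration, here is how the logical and physical AccountChangeSet keys from the example above could be composed; these helpers are hypothetical, not part of the package:

package example

import "encoding/binary"

// logicalChangeSetKey builds bigEndian(blockNum) + key_in_plain_state.
func logicalChangeSetKey(blockNum uint64, plainKey []byte) []byte {
	k := make([]byte, 8+len(plainKey))
	binary.BigEndian.PutUint64(k, blockNum)
	copy(k[8:], plainKey)
	return k
}

// physicalAccountChangeSet splits the same data the DupSort way:
// key = bigEndian(blockNum), value = address + account(encoded),
// so the repeated block-number prefix is stored only once.
func physicalAccountChangeSet(blockNum uint64, address, account []byte) (k, v []byte) {
	k = make([]byte, 8)
	binary.BigEndian.PutUint64(k, blockNum)
	v = append(append([]byte{}, address...), account...)
	return k, v
}
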
const AccountsHistory = "AccountHistory"

AccountsHistory and StorageHistory are indices designed to serve the following 2 types of requests: 1. what is the smallest block number >= X where account A changed; 2. get the last shard of A, to append new block numbers to it.

Task 1. is part of "get historical state" operation (see `core/state:GetAsOf`): If `db.Seek(A+bigEndian(X))` returns non-last shard -

then get block number from shard value Y := RoaringBitmap(shard_value).GetGte(X)
and with Y go to ChangeSets: db.Get(ChangeSets, Y+A)

If `db.Seek(A+bigEndian(X))` returns last shard -

then we go to PlainState: db.Get(PlainState, A)

Format:

  • the index is split into shards of 2Kb - each a RoaringBitmap-encoded sorted list of block numbers (to avoid performance degradation for popular accounts or deep looks into history; 2Kb also avoids Overflow pages inside the DB)
  • if a shard is not the last one - its key has an 8-byte suffix = bigEndian(max_block_num_in_this_shard)
  • if a shard is the last one - its key has an 8-byte suffix = 0xFF

It allows:

  • serving task 1. with 1 db operation: db.Seek(A+bigEndian(X))
  • serving task 2. with 1 db operation: db.Get(A+0xFF)

see also: docs/programmers_guide/db_walkthrough.MD#table-change-sets

AccountsHistory:

key - address + shard_id_u64
value - roaring bitmap - list of blocks where it changed

StorageHistory

key - address + storage_key + shard_id_u64
value - roaring bitmap - list of blocks where it changed
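
A sketch of serving task 1 with the single Seek described above (the import path is assumed; decoding the roaring bitmap in the shard value is left to the caller, since the concrete bitmap library is an implementation detail):

package example

import (
	"bytes"
	"encoding/binary"

	"github.com/ledgerwatch/erigon-lib/kv" // assumed import path
)

// seekHistoryShard returns the AccountsHistory shard covering block x for
// address addr, plus whether it is the last shard (key suffix of 0xFF bytes).
func seekHistoryShard(tx kv.Tx, addr []byte, x uint64) (shard []byte, isLast bool, err error) {
	seek := make([]byte, len(addr)+8)
	copy(seek, addr)
	binary.BigEndian.PutUint64(seek[len(addr):], x)
	c, err := tx.Cursor(kv.AccountsHistory)
	if err != nil {
		return nil, false, err
	}
	defer c.Close()
	k, v, err := c.Seek(seek)
	if err != nil {
		return nil, false, err
	}
	if k == nil || !bytes.HasPrefix(k, addr) {
		return nil, false, nil // account has no shard covering X
	}
	isLast = bytes.Equal(k[len(addr):], bytes.Repeat([]byte{0xFF}, 8))
	return v, isLast, nil
}

If the returned shard is not the last one, take Y = the smallest bitmap value >= X and look it up in the changesets; if it is the last one, read the current value from PlainState, as described above.
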
const PlainContractCode = "PlainCodeHash"

PlainContractCode:

key   - address + incarnation
value - code hash

const PlainState = "PlainState"

PlainState logical layout:

Contains Accounts:
  key - address (unhashed)
  value - account encoded for storage
Contains Storage:
  key - address (unhashed) + incarnation + storage key (unhashed)
  value - storage value(common.hash)

Physical layout:

PlainState and HashedStorage utilise the DupSort feature of MDBX (storing multiple values inside 1 key).

-------------------------------------------------------------
         key             |             value
-------------------------------------------------------------
[acc_hash]               | [acc_value]
[acc_hash]+[inc]         | [storage1_hash]+[storage1_value]
                         | [storage2_hash]+[storage2_value] // this value has no own key. it's the 2nd value of the [acc_hash]+[inc] key.
                         | [storage3_hash]+[storage3_value]
                         | ...
[acc_hash]+[old_inc]     | [storage1_hash]+[storage1_value]
                         | ...
[acc2_hash]              | [acc2_value]
                         | ...
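
A sketch of reading a single storage slot through this physical layout with the DupSort API documented below (SeekBothRange does an exact match on the key and a range match on the value; the import path is an assumption):

package example

import (
	"bytes"
	"encoding/binary"

	"github.com/ledgerwatch/erigon-lib/kv" // assumed import path
)

// readStorageSlot uses the physical PlainState layout above:
// key = address + incarnation, value = storage_key + storage_value.
func readStorageSlot(tx kv.Tx, addr []byte, incarnation uint64, storageKey []byte) ([]byte, error) {
	c, err := tx.CursorDupSort(kv.PlainState)
	if err != nil {
		return nil, err
	}
	defer c.Close()
	k := make([]byte, len(addr)+8)
	copy(k, addr)
	binary.BigEndian.PutUint64(k[len(addr):], incarnation)
	v, err := c.SeekBothRange(k, storageKey)
	if err != nil {
		return nil, err
	}
	if v == nil || !bytes.HasPrefix(v, storageKey) {
		return nil, nil // slot not present
	}
	return v[len(storageKey):], nil // strip the storage_key prefix
}
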
const ReadersLimit = 32000 // MDBX_READERS_LIMIT=32767
const StorageChangeSet = "StorageChangeSet"
const StorageHistory = "StorageHistory"
const TrieOfAccounts = "TrieAccount"

TrieOfAccounts and TrieOfStorage:

hasState, groups - mark prefixes existing in the hashed_account table
hasTree - mark prefixes existing in the trie_account table (not related to branchNodes)
hasHash - mark prefixes whose hashes are saved in the current trie_account record (actually only hashes of branchNodes can be saved)
@see UnmarshalTrieNode
@see integrity.Trie

+-----------------------------------------------------------------------------------------------------+
| DB record: 0x0B, hasState: 0b1011, hasTree: 0b1001, hasHash: 0b1001, hashes: [x,x]                   |
+-----------------------------------------------------------------------------------------------------+
                |                                           |                               |
                v                                           |                               v
+---------------------------------------------+             |            +--------------------------------------+
| DB record: 0x0B00, hasState: 0b10001        |             |            | DB record: 0x0B03, hasState: 0b10010 |
| hasTree: 0, hasHash: 0b10000, hashes: [x]   |             |            | hasTree: 0, hasHash: 0, hashes: []   |
+---------------------------------------------+             |            +--------------------------------------+
        |                    |                              |                         |                  |
        v                    v                              v                         v                  v
+------------------+  +----------------------+     +---------------+        +---------------+  +---------------+
| Account:         |  | BranchNode: 0x0B0004 |     | Account:      |        | Account:      |  | Account:      |
| 0x0B0000...      |  | has no record in     |     | 0x0B01...     |        | 0x0B0301...   |  | 0x0B0304...   |
| in HashedAccount |  |     TrieAccount      |     |               |        |               |  |               |
+------------------+  +----------------------+     +---------------+        +---------------+  +---------------+
                            |                |
                            v                v
                   +---------------+  +---------------+
                   | Account:      |  | Account:      |
                   | 0x0B000400... |  | 0x0B000401... |
                   +---------------+  +---------------+

Invariants:

  • hasTree is a subset of hasState
  • hasHash is a subset of hasState
  • the first level in account_trie always exists if hasState > 0
  • the TrieStorage record of account.root (length=40) must have +1 hash - it's account.root
  • each record in the TrieAccount table must have a parent (may be not direct), and this parent must have the correct bit set in its hasTree bitmap
  • if hasState has a bit set - then the HashedAccount table must have a record matching this bit
  • each TrieAccount record must cover some state (meaning hasState is always > 0)
  • TrieAccount records with length=1 can satisfy the (hasBranch==0 && hasHash==0) condition
  • other records in TrieAccount and TrieStorage must have (hasTree!=0 || hasHash!=0)

const TrieOfStorage = "TrieStorage"
const VerkleRoots = "VerkleRoots"

Mapping [block number] => [Verkle Root]

const VerkleTrie = "VerkleTrie"

Mapping [Verkle Root] => [Rlp-Encoded Verkle Node]

Variables

var (
	ErrAttemptToDeleteNonDeprecatedBucket = errors.New("only buckets from dbutils.ChaindataDeprecatedTables can be deleted")
	ErrUnknownBucket                      = errors.New("unknown bucket. add it to dbutils.ChaindataTables")

	DbSize    = metrics.NewCounter(`db_size`)    //nolint
	TxLimit   = metrics.NewCounter(`tx_limit`)   //nolint
	TxSpill   = metrics.NewCounter(`tx_spill`)   //nolint
	TxUnspill = metrics.NewCounter(`tx_unspill`) //nolint
	TxDirty   = metrics.NewCounter(`tx_dirty`)   //nolint

	DbCommitPreparation = metrics.GetOrCreateSummary(`db_commit_seconds{phase="preparation"}`) //nolint
	DbCommitGc          = metrics.GetOrCreateSummary(`db_commit_seconds{phase="gc"}`)          //nolint
	DbCommitAudit       = metrics.GetOrCreateSummary(`db_commit_seconds{phase="audit"}`)       //nolint
	DbCommitWrite       = metrics.GetOrCreateSummary(`db_commit_seconds{phase="write"}`)       //nolint
	DbCommitSync        = metrics.GetOrCreateSummary(`db_commit_seconds{phase="sync"}`)        //nolint
	DbCommitEnding      = metrics.GetOrCreateSummary(`db_commit_seconds{phase="ending"}`)      //nolint
	DbCommitTotal       = metrics.GetOrCreateSummary(`db_commit_seconds{phase="total"}`)       //nolint

	DbPgopsNewly   = metrics.NewCounter(`db_pgops_newly`)           //nolint
	DbPgopsCow     = metrics.NewCounter(`db_pgops_cow`)             //nolint
	DbPgopsClone   = metrics.NewCounter(`db_pgops_clone`)           //nolint
	DbPgopsSplit   = metrics.NewCounter(`db_pgops_split`)           //nolint
	DbPgopsMerge   = metrics.NewCounter(`db_pgops_merge`)           //nolint
	DbPgopsSpill   = metrics.NewCounter(`db_pgops_spill`)           //nolint
	DbPgopsUnspill = metrics.NewCounter(`db_pgops_unspill`)         //nolint
	DbPgopsWops    = metrics.NewCounter(`db_pgops_wops`)            //nolint
	DbPgopsGcrtime = metrics.GetOrCreateSummary(`db_pgops_gcrtime`) //nolint

	GcLeafMetric     = metrics.NewCounter(`db_gc_leaf`)     //nolint
	GcOverflowMetric = metrics.NewCounter(`db_gc_overflow`) //nolint
	GcPagesMetric    = metrics.NewCounter(`db_gc_pages`)    //nolint

)
var (
	//StorageModeTEVM - does not translate EVM to TEVM
	StorageModeTEVM = []byte("smTEVM")

	PruneTypeOlder  = []byte("older")
	PruneTypeBefore = []byte("before")

	PruneHistory        = []byte("pruneHistory")
	PruneHistoryType    = []byte("pruneHistoryType")
	PruneReceipts       = []byte("pruneReceipts")
	PruneReceiptsType   = []byte("pruneReceiptsType")
	PruneTxIndex        = []byte("pruneTxIndex")
	PruneTxIndexType    = []byte("pruneTxIndexType")
	PruneCallTraces     = []byte("pruneCallTraces")
	PruneCallTracesType = []byte("pruneCallTracesType")

	DBSchemaVersionKey = []byte("dbVersion")

	BittorrentPeerID            = "peerID"
	CurrentHeadersSnapshotHash  = []byte("CurrentHeadersSnapshotHash")
	CurrentHeadersSnapshotBlock = []byte("CurrentHeadersSnapshotBlock")
	CurrentBodiesSnapshotHash   = []byte("CurrentBodiesSnapshotHash")
	CurrentBodiesSnapshotBlock  = []byte("CurrentBodiesSnapshotBlock")
	PlainStateVersion           = []byte("PlainStateVersion")

	LightClientStore            = []byte("LightClientStore")
	LightClientFinalityUpdate   = []byte("LightClientFinalityUpdate")
	LightClientOptimisticUpdate = []byte("LightClientOptimisticUpdate")
)

Keys

var ChaindataDeprecatedTables = []string{
	Clique,
	TransitionBlockKey,
}

ChaindataDeprecatedTables - list of buckets which can be programmatically deleted - for example after migration

var ChaindataTables = []string{}/* 101 elements not displayed */

ChaindataTables - list of all buckets. The app will panic if some bucket is not in this list. This list is sorted in the `init` method. ChaindataTablesCfg can be used to find a table's index by name in the sorted version of the ChaindataTables list.

var ChaindataTablesCfg = TableCfg{
	HashedStorage: {
		Flags:                     DupSort,
		AutoDupSortKeysConversion: true,
		DupFromLen:                72,
		DupToLen:                  40,
	},
	AccountChangeSet: {Flags: DupSort},
	StorageChangeSet: {Flags: DupSort},
	PlainState: {
		Flags:                     DupSort,
		AutoDupSortKeysConversion: true,
		DupFromLen:                60,
		DupToLen:                  28,
	},
	CallTraceSet: {Flags: DupSort},

	AccountKeys:           {Flags: DupSort},
	AccountHistoryKeys:    {Flags: DupSort},
	AccountIdx:            {Flags: DupSort},
	StorageKeys:           {Flags: DupSort},
	StorageHistoryKeys:    {Flags: DupSort},
	StorageIdx:            {Flags: DupSort},
	CodeKeys:              {Flags: DupSort},
	CodeHistoryKeys:       {Flags: DupSort},
	CodeIdx:               {Flags: DupSort},
	CommitmentKeys:        {Flags: DupSort},
	CommitmentHistoryKeys: {Flags: DupSort},
	CommitmentIdx:         {Flags: DupSort},
	LogAddressKeys:        {Flags: DupSort},
	LogAddressIdx:         {Flags: DupSort},
	LogTopicsKeys:         {Flags: DupSort},
	LogTopicsIdx:          {Flags: DupSort},
	TracesFromKeys:        {Flags: DupSort},
	TracesFromIdx:         {Flags: DupSort},
	TracesToKeys:          {Flags: DupSort},
	TracesToIdx:           {Flags: DupSort},
	RAccountKeys:          {Flags: DupSort},
	RAccountIdx:           {Flags: DupSort},
	RStorageKeys:          {Flags: DupSort},
	RStorageIdx:           {Flags: DupSort},
	RCodeKeys:             {Flags: DupSort},
	RCodeIdx:              {Flags: DupSort},
}
var DBSchemaVersion = types.VersionReply{Major: 6, Minor: 0, Patch: 0}

DBSchemaVersion versions list:

5.0 - BlockTransaction table now has canonical ids (txs of non-canonical blocks are moved to the NonCanonicalTransaction table)
6.0 - BlockTransaction table now has system-txs before and after each block (records are absent if a block has no system-tx, but the sequence keeps increasing)

var DownloaderTables = []string{
	BittorrentCompletion,
	BittorrentInfo,
}
var DownloaderTablesCfg = TableCfg{}
var ErrChanged = fmt.Errorf("key must not change")
var ErrNotSupported = errors.New("not supported")
var ReconTablesCfg = TableCfg{
	PlainStateD:    {Flags: DupSort},
	CodeD:          {Flags: DupSort},
	PlainContractD: {Flags: DupSort},
}
var SentryTables = []string{}
var SentryTablesCfg = TableCfg{}
var TxpoolTablesCfg = TableCfg{}

Functions

func BigChunks

func BigChunks(db RoDB, table string, from []byte, walker func(tx Tx, k, v []byte) (bool, error)) error

BigChunks - reads `table` in big chunks, restarting the read transaction every 1 minute
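
A possible usage sketch based on the signature above (the import path is an assumption, and the boolean return is assumed to mean "continue iterating"):

package example

import "github.com/ledgerwatch/erigon-lib/kv" // assumed import path

// scanTable walks a whole table via BigChunks; the periodic transaction
// restart keeps long scans from pinning an old reader snapshot.
func scanTable(db kv.RoDB) error {
	return kv.BigChunks(db, kv.BlockBody, nil, func(tx kv.Tx, k, v []byte) (bool, error) {
		// ... process k, v
		return true, nil // assumed: true = continue, false = stop early
	})
}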

func DefaultPageSize

func DefaultPageSize() uint64

func EnsureNotChangedBool

func EnsureNotChangedBool(tx GetPut, bucket string, k []byte, value bool) (ok, enabled bool, err error)

EnsureNotChangedBool - used to store immutable config flags in the db. Protects from human mistakes.

func FirstKey

func FirstKey(tx Tx, table string) ([]byte, error)

FirstKey - a candidate to move to the kv.Tx interface

func GetBool

func GetBool(tx Getter, bucket string, k []byte) (enabled bool, err error)

func LastKey

func LastKey(tx Tx, table string) ([]byte, error)

LastKey - a candidate to move to the kv.Tx interface

func ReadAhead

func ReadAhead(ctx context.Context, db RoDB, progress *atomic.Bool, table string, from []byte, amount uint32)

Types

type Bucket

type Bucket string

type BucketMigrator

type BucketMigrator interface {
	DropBucket(string) error
	CreateBucket(string) error
	ExistsBucket(string) (bool, error)
	ClearBucket(string) error
	ListBuckets() ([]string, error)
}

BucketMigrator is used for bucket migration; don't use it in usual app code

type Closer

type Closer interface {
	Close()
}

type CmpFunc

type CmpFunc func(k1, k2, v1, v2 []byte) int

type Cursor

type Cursor interface {
	First() ([]byte, []byte, error)               // First - position at first key/data item
	Seek(seek []byte) ([]byte, []byte, error)     // Seek - position at first key greater than or equal to specified key
	SeekExact(key []byte) ([]byte, []byte, error) // SeekExact - position at exact matching key if exists
	Next() ([]byte, []byte, error)                // Next - position at next key/value (can iterate over DupSort key/values automatically)
	Prev() ([]byte, []byte, error)                // Prev - position at previous key
	Last() ([]byte, []byte, error)                // Last - position at last key and last possible value
	Current() ([]byte, []byte, error)             // Current - return key/data at current cursor position

	Count() (uint64, error) // Count - fast way to calculate amount of keys in bucket. It counts all keys even if Prefix was set.

	Close()
}

Cursor - class for navigating through a database. CursorDupSort inherits this interface.

If methods (like First/Next/Seek) return an error, then the returned key SHOULD not be nil (it can be []byte{}, for example). Then looping code will look like:

c := kv.Cursor(bucketName)
for k, v, err := c.First(); k != nil; k, v, err = c.Next() {
   if err != nil {
       return err
   }
   ... logic
}

type CursorDupSort

type CursorDupSort interface {
	Cursor

	// SeekBothExact -
	// second parameter can be nil only if the searched key has no duplicates, otherwise an error is returned
	SeekBothExact(key, value []byte) ([]byte, []byte, error)
	SeekBothRange(key, value []byte) ([]byte, error) // SeekBothRange - exact match of the key, but range match of the value
	FirstDup() ([]byte, error)                       // FirstDup - position at first data item of current key
	NextDup() ([]byte, []byte, error)                // NextDup - position at next data item of current key
	NextNoDup() ([]byte, []byte, error)              // NextNoDup - position at first data item of next key
	LastDup() ([]byte, error)                        // LastDup - position at last data item of current key

	CountDuplicates() (uint64, error) // CountDuplicates - number of duplicates for the current key
}
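
A sketch of the usual traversal with these methods - NextNoDup jumps between distinct keys, NextDup walks the duplicates of the current key (import path is an assumption):

package example

import (
	"fmt"

	"github.com/ledgerwatch/erigon-lib/kv" // assumed import path
)

// walkDupSorted prints every (key, duplicate-value) pair of a DupSort table.
func walkDupSorted(tx kv.Tx, table string) error {
	c, err := tx.CursorDupSort(table)
	if err != nil {
		return err
	}
	defer c.Close()
	for k, v, err := c.First(); k != nil; k, v, err = c.NextNoDup() {
		if err != nil {
			return err
		}
		// inner loop: all duplicate values stored under the current key
		for ; v != nil; _, v, err = c.NextDup() {
			if err != nil {
				return err
			}
			fmt.Printf("%x => %x\n", k, v)
		}
		if err != nil { // error that ended the inner loop
			return err
		}
	}
	return nil
}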

type DBI

type DBI uint

type DBVerbosityLvl

type DBVerbosityLvl int8

type Deleter

type Deleter interface {
	// Delete removes a single entry.
	Delete(table string, k []byte) error
}

Deleter wraps the database delete operations.

type GetPut

type GetPut interface {
	Getter
	Putter
}

type Getter

type Getter interface {
	Has

	// GetOne references a readonly section of memory that must not be accessed after txn has terminated
	GetOne(bucket string, key []byte) (val []byte, err error)

	// ForEach iterates over entries with keys greater or equal to fromPrefix.
	// walker is called for each eligible entry.
	// If walker returns an error:
	//   - implementations of local db - stop
	//   - implementations of remote db - do not handle this error and may finish (send all entries to client) before error happen.
	ForEach(bucket string, fromPrefix []byte, walker func(k, v []byte) error) error
	ForPrefix(bucket string, prefix []byte, walker func(k, v []byte) error) error
	ForAmount(bucket string, prefix []byte, amount uint32, walker func(k, v []byte) error) error
}

type Has

type Has interface {
	// Has indicates whether a key exists in the database.
	Has(bucket string, key []byte) (bool, error)
}

type Label

type Label uint8
const (
	ChainDB      Label = 0
	TxPoolDB     Label = 1
	SentryDB     Label = 2
	ConsensusDB  Label = 3
	DownloaderDB Label = 4
)

func (Label) String

func (l Label) String() string

type Putter

type Putter interface {
	// Put inserts or updates a single entry.
	Put(table string, k, v []byte) error
}

Putter wraps the database write operations.

type RoDB

type RoDB interface {
	Closer

	View(ctx context.Context, f func(tx Tx) error) error

	// BeginRo - creates transaction
	// 	tx may be discarded by .Rollback() method
	//
	// A transaction and its cursors must only be used by a single
	// 	thread (not goroutine), and a thread may only have a single transaction at a time.
	//  This happens automatically, because this method calls runtime.LockOSThread() inside (Rollback/Commit releases it).
	//  For this reason application code can't call runtime.UnlockOSThread() - it leads to undefined behavior.
	//
	// If this `parent` is non-NULL, the new transaction
	//	will be a nested transaction, with the transaction indicated by parent
	//	as its parent. Transactions may be nested to any level. A parent
	//	transaction and its cursors may not issue any other operations than
	//	Commit and Rollback while it has active child transactions.
	BeginRo(ctx context.Context) (Tx, error)
	AllBuckets() TableCfg
	PageSize() uint64
}

RoDB - Read-only version of KV.

type RwCursor

type RwCursor interface {
	Cursor

	Put(k, v []byte) error           // Put - based on order
	Append(k []byte, v []byte) error // Append - append the given key/data pair to the end of the database. This option allows fast bulk loading when keys are already known to be in the correct order.
	Delete(k []byte) error           // Delete - short version of SeekExact+DeleteCurrent or SeekBothExact+DeleteCurrent

	// DeleteCurrent This function deletes the key/data pair to which the cursor refers.
	// This does not invalidate the cursor, so operations such as MDB_NEXT
	// can still be used on it.
	// Both MDB_NEXT and MDB_GET_CURRENT will return the same record after
	// this operation.
	DeleteCurrent() error
}
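
For the Append fast path above, a small bulk-load sketch (assumes keys arrive pre-sorted in the table's key order; import path is an assumption):

package example

import "github.com/ledgerwatch/erigon-lib/kv" // assumed import path

// bulkLoad appends pre-sorted pairs. Append skips the usual ordered-insert
// search, so it is the fast path for bulk loading.
func bulkLoad(tx kv.RwTx, table string, sortedKeys, vals [][]byte) error {
	c, err := tx.RwCursor(table)
	if err != nil {
		return err
	}
	defer c.Close()
	for i := range sortedKeys {
		if err := c.Append(sortedKeys[i], vals[i]); err != nil {
			return err // Append fails if a key is out of order
		}
	}
	return nil
}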

type RwCursorDupSort

type RwCursorDupSort interface {
	CursorDupSort
	RwCursor

	PutNoDupData(key, value []byte) error // PutNoDupData - inserts key without dupsort
	DeleteCurrentDuplicates() error       // DeleteCurrentDuplicates - deletes all of the data items for the current key
	DeleteExact(k1, k2 []byte) error      // DeleteExact - delete 1 value from given key
	AppendDup(key, value []byte) error    // AppendDup - same as Append, but for sorted dup data
}

type RwDB

type RwDB interface {
	RoDB

	Update(ctx context.Context, f func(tx RwTx) error) error

	BeginRw(ctx context.Context) (RwTx, error)
	BeginRwAsync(ctx context.Context) (RwTx, error)
}

RwDB - low-level database interface; its main goal is to provide a common abstraction on top of MDBX and RemoteKV.

Common pattern for short-living transactions:

if err := db.View(ctx, func(tx ethdb.Tx) error {
	// ... code which uses database in transaction
	return nil
}); err != nil {
	return err
}

Common pattern for long-living transactions:

tx, err := db.BeginRw(ctx)
if err != nil {
	return err
}
defer tx.Rollback()

... code which uses database in transaction

err = tx.Commit()
if err != nil {
	return err
}

type RwTx

type RwTx interface {
	Tx
	StatelessWriteTx
	BucketMigrator

	RwCursor(bucket string) (RwCursor, error)
	RwCursorDupSort(bucket string) (RwCursorDupSort, error)

	// CollectMetrics - collects all DB-related and Tx-related metrics
	// this method exists only in RwTx to avoid concurrency
	CollectMetrics()
	Reset() error
}

RwTx

WARNING:

  • RwTx is not threadsafe and may only be used in the goroutine that created it.
  • ReadOnly transactions do not lock the goroutine to a thread; RwTx does
  • The user can't call runtime.LockOSThread/runtime.UnlockOSThread in the same goroutine until RwTx Commit/Rollback

type StatelessReadTx

type StatelessReadTx interface {
	Getter

	Commit() error // Commit all the operations of a transaction into the database.
	Rollback()     // Rollback - abandon all the operations of the transaction instead of saving them.

	// ReadSequence - allows creating a linear sequence of unique positive integers for each table.
	// Can be called for a read transaction to retrieve the current sequence value, and the increment must be zero.
	// Sequence changes become visible outside the current write transaction after it is committed, and discarded on abort.
	// Starts from 0.
	ReadSequence(bucket string) (uint64, error)

	BucketSize(bucket string) (uint64, error)
}

type StatelessRwTx

type StatelessRwTx interface {
	StatelessReadTx
	StatelessWriteTx
}

type StatelessWriteTx

type StatelessWriteTx interface {
	Putter
	Deleter

	/*
		// if need N id's:
		baseId, err := tx.IncrementSequence(bucket, N)
		if err != nil {
		   return err
		}
		for i := 0; i < N; i++ {    // if N == 0, it will work as expected
		    id := baseId + i
		    // use id
		}


		// or if need only 1 id:
		id, err := tx.IncrementSequence(bucket, 1)
		if err != nil {
		    return err
		}
		// use id
	*/
	IncrementSequence(bucket string, amount uint64) (uint64, error)
	Append(bucket string, k, v []byte) error
	AppendDup(bucket string, k, v []byte) error
}

type TableCfg

type TableCfg map[string]TableCfgItem

type TableCfgItem

type TableCfgItem struct {
	Flags TableFlags
	// AutoDupSortKeysConversion - enables some keys transformation - to change db layout without changing app code.
	// Use it wisely - it helps to do experiments with DB format faster, but better reduce amount of Magic in app.
	// If good DB format found, push app code to accept this format and then disable this property.
	AutoDupSortKeysConversion bool
	IsDeprecated              bool
	DBI                       DBI
	// DupFromLen - if the user provides a key of this length, then the following transformation is applied:
	// v = append(k[DupToLen:], v...)
	// k = k[:DupToLen]
	// And opposite at retrieval
	// Works only if AutoDupSortKeysConversion enabled
	DupFromLen int
	DupToLen   int
}
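
The transformation described in the comment above, written out as a hypothetical helper (for PlainState the configured lengths are DupFromLen=60 and DupToLen=28, per ChaindataTablesCfg):

package example

// splitForDupSort shows AutoDupSortKeysConversion: a key of DupFromLen bytes
// is shortened to DupToLen bytes and the cut-off tail is prepended to the
// value; retrieval applies the opposite transformation.
func splitForDupSort(k, v []byte, dupFromLen, dupToLen int) (newK, newV []byte) {
	if len(k) != dupFromLen {
		return k, v // conversion applies only to keys of exactly DupFromLen
	}
	newV = append(append([]byte{}, k[dupToLen:]...), v...)
	newK = k[:dupToLen]
	return newK, newV
}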

type TableFlags

type TableFlags uint
const (
	Default    TableFlags = 0x00
	ReverseKey TableFlags = 0x02
	DupSort    TableFlags = 0x04
	IntegerKey TableFlags = 0x08
	IntegerDup TableFlags = 0x20
	ReverseDup TableFlags = 0x40
)

type Tx

type Tx interface {
	StatelessReadTx

	// ID returns the identifier associated with this transaction. For a
	// read-only transaction, this corresponds to the snapshot being read;
	// concurrent readers will frequently have the same transaction ID.
	ViewID() uint64

	// Cursor - creates cursor object on top of given bucket. Type of cursor - depends on bucket configuration.
	// If bucket was created with mdbx.DupSort flag, then cursor with interface CursorDupSort created
	// Otherwise - object of interface Cursor created
	//
	// Cursor, also provides a grain of magic - it can use a declarative configuration - and automatically break
	// long keys into DupSort key/values. See docs for `bucket.go:TableCfgItem`
	Cursor(bucket string) (Cursor, error)
	CursorDupSort(bucket string) (CursorDupSort, error) // CursorDupSort - can be used if bucket has mdbx.DupSort flag

	ForEach(bucket string, fromPrefix []byte, walker func(k, v []byte) error) error
	ForPrefix(bucket string, prefix []byte, walker func(k, v []byte) error) error
	ForAmount(bucket string, prefix []byte, amount uint32, walker func(k, v []byte) error) error

	DBSize() (uint64, error)
}

Tx WARNING:

  • Tx is not threadsafe and may only be used in the goroutine that created it
  • ReadOnly transactions do not lock the goroutine to a thread; RwTx does

Directories

Path Synopsis
temporal
