Newbie here!

I'm still not sure I understand how some of Ethereum's structures are physically stored (assuming the Geth implementation):

  • State Trie: a single off-chain Merkle Patricia Trie, stored using LevelDB;

  • Storage Trie: one Merkle Patricia Trie per account; stored off-chain together with the State Trie using LevelDB;

  • Transaction Tries: not really physically stored; a Merkle Patricia Trie is created on the fly when needed using the block transaction list;

  • Receipts Trie: not a clue;

  • Blocks: the State Trie and Storage Tries are stored in .ldb files (LevelDB), but where can I find the block files, and in which format are they stored?

1 Answer

The low-level Geth database key schema is:

var databaseVerisionKey = Buffer.from("DatabaseVersion"); // databaseVerisionKey tracks the current database version.
var headHeaderKey = Buffer.from("LastHeader"); // headHeaderKey tracks the latest known header's hash.
var headBlockKey = Buffer.from("LastBlock"); // headBlockKey tracks the latest known full block's hash.
var headFastBlockKey = Buffer.from("LastFast"); // headFastBlockKey tracks the latest known incomplete block's hash during fast sync.
var fastTrieProgressKey = Buffer.from("TrieSync"); // fastTrieProgressKey tracks the number of trie entries imported during fast sync.

// Data item prefixes (use single byte to avoid mixing data types, avoid `i`, used for indexes).
var headerPrefix = Buffer.from("h"); // headerPrefix + num (uint64 big endian) + hash -> header
var headerTDSuffix = Buffer.from("t"); // headerPrefix + num (uint64 big endian) + hash + headerTDSuffix -> td
var headerHashSuffix = Buffer.from("n"); // headerPrefix + num (uint64 big endian) + headerHashSuffix -> hash
var headerNumberPrefix = Buffer.from("H"); // headerNumberPrefix + hash -> num (uint64 big endian)
var blockBodyPrefix = Buffer.from("b"); // blockBodyPrefix + num (uint64 big endian) + hash -> block body
var blockReceiptsPrefix = Buffer.from("r"); // blockReceiptsPrefix + num (uint64 big endian) + hash -> block receipts
var txLookupPrefix = Buffer.from("l"); // txLookupPrefix + hash -> transaction/receipt lookup metadata
var bloomBitsPrefix = Buffer.from("B"); // bloomBitsPrefix + bit (uint16 big endian) + section (uint64 big endian) + hash -> bloom bits
var preimagePrefix = Buffer.from("secure-key-"); // preimagePrefix + hash -> preimage
var configPrefix = Buffer.from("ethereum-config-"); // config prefix for the db

// Chain index prefixes (use `i` + single byte to avoid mixing data types).
var BloomBitsIndexPrefix = Buffer.from("iB"); // BloomBitsIndexPrefix is the data table of a chain indexer to track its progress.
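
As a rough illustration of how the layouts in those comments turn into actual LevelDB keys, here is a minimal sketch. The helper names (encodeBlockNumber, headerKey, canonicalHashKey, blockBodyKey) and the placeholder hash are my own assumptions for the example, not Geth functions; only the prefix bytes and the "prefix + num (uint64 big endian) + hash" layout come from the schema above.

```javascript
// Prefix bytes taken from the schema listing; everything else is illustrative.
const headerPrefix = Buffer.from("h");
const headerHashSuffix = Buffer.from("n");
const blockBodyPrefix = Buffer.from("b");

// Encode a block number as an 8-byte big-endian unsigned integer.
function encodeBlockNumber(num) {
  const buf = Buffer.alloc(8);
  buf.writeBigUInt64BE(BigInt(num));
  return buf;
}

// headerPrefix + num (uint64 big endian) + hash -> header
function headerKey(num, hash) {
  return Buffer.concat([headerPrefix, encodeBlockNumber(num), hash]);
}

// headerPrefix + num (uint64 big endian) + headerHashSuffix -> hash
function canonicalHashKey(num) {
  return Buffer.concat([headerPrefix, encodeBlockNumber(num), headerHashSuffix]);
}

// blockBodyPrefix + num (uint64 big endian) + hash -> block body
function blockBodyKey(num, hash) {
  return Buffer.concat([blockBodyPrefix, encodeBlockNumber(num), hash]);
}

// Made-up 32-byte value standing in for a real block hash.
const exampleHash = Buffer.alloc(32, 0xab);

console.log(canonicalHashKey(1).toString("hex")); // 6800000000000000016e
console.log(headerKey(1, exampleHash).length);    // 41 = 1 + 8 + 32 bytes
```

So finding a block by number would take two lookups under this scheme: first the number-to-hash key, then the header or body key built from both the number and the hash.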

To get data out, you have to rebuild the tries recursively from these entries. Knowing the hash of the state root, you can look up the state root node; it contains the hashes of its children, so you can look those up too, and so on down to the leaves.
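
The recursive lookup can be sketched roughly like this. It is a toy model, not Geth's code: a Map stands in for LevelDB, nodes are JSON instead of RLP-encoded branch/extension/leaf nodes, and SHA-256 stands in for Keccak-256 only because it ships with Node. All function names are hypothetical.

```javascript
const crypto = require("crypto");

// Assumption: SHA-256 as a stand-in hash; Geth uses Keccak-256.
function hashOf(data) {
  return crypto.createHash("sha256").update(data).digest("hex");
}

// In-memory stand-in for LevelDB: hash -> serialized node.
const db = new Map();

// Store a node under the hash of its serialized form and return that hash.
function putNode(node) {
  const encoded = JSON.stringify(node);
  const hash = hashOf(encoded);
  db.set(hash, encoded);
  return hash;
}

// Rebuild the subtree under `hash` by following child hashes recursively.
function resolve(hash) {
  const node = JSON.parse(db.get(hash));
  if (node.children === undefined) return node; // leaf: nothing left to follow
  return { children: node.children.map((child) => resolve(child)) };
}

const leafA = putNode({ value: "account A" });
const leafB = putNode({ value: "account B" });
const leafB2 = putNode({ value: "account B, updated" });

// Two roots (two "world states") that both reference leafA by its hash.
const rootOld = putNode({ children: [leafA, leafB] });
const rootNew = putNode({ children: [leafA, leafB2] });

console.log(resolve(rootOld));
console.log(resolve(rootNew));
console.log(db.size); // 5: the shared leaf is stored only once
```

Because every node is keyed by its own hash, the old and the new root can live side by side in the same store, and the unchanged leaf is not duplicated.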

Depending on the Geth options (--syncmode fast|full|light and --gcmode full|archive; you can also specify how many blocks you want to remember), Geth either keeps or discards some tries.

The different tries are the world state trie (which links to accounts), the storage tries (account data), and the receipt tries (for transaction receipts).

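To look up a key such as "sample value" (for example, a contract address), you walk down the trie along the path given by sha3("sample value"). The hash is 32 bytes long, i.e. 64 hex characters, and each hex character selects the next branch (extension and leaf nodes can cover several characters at once).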
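
To make the path idea concrete, here is a small sketch that turns a key into the sequence of 4-bit steps (nibbles) you would follow. Note the assumption: it uses SHA-256 as a stand-in, since Node's crypto module is not guaranteed to offer Keccak-256; both produce 32 bytes, so the path length is the same. The function names are made up for the example.

```javascript
const crypto = require("crypto");

// Assumption: SHA-256 stands in for Keccak-256 (sha3 in Ethereum terms).
function hashKey(key) {
  return crypto.createHash("sha256").update(key).digest();
}

// Split each byte into its high and low 4-bit nibble.
function toNibbles(buf) {
  const nibbles = [];
  for (const byte of buf) {
    nibbles.push(byte >> 4, byte & 0x0f);
  }
  return nibbles;
}

const path = toNibbles(hashKey("sample value"));

console.log(path.length); // 64 nibbles for a 32-byte hash
console.log(path.slice(0, 4)); // the first four branch choices
```

Each nibble is a number from 0 to 15, which matches the 16 child slots of a branch node.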

To better understand which data is stored in the DB and how the tries are built, look at these two pictures:

enter image description here

enter image description here

  • Interesting, but it's still not clear to me whether Transaction Tries are persisted or created on the fly when needed. Also, where is the Receipts Trie data stored? Is it stored in LevelDB? Is it stored inside the blocks? Are the blocks stored in LevelDB as well? If not, where can I find the files? Commented Sep 1, 2018 at 19:01
  • Tries are stored in the DB as trie nodes. When Geth needs a trie, for example when data is requested from Geth through the web3 API, it builds the trie up recursively and keeps it in a RAM cache until you shut Geth down. Transactions are also stored as an array in the block body, so in the transaction case you don't need to build a trie to get the transactions of a given block. Everything shown in the pictures is in LevelDB, if archive mode is enabled. So, to summarize the answers: created on the fly when needed and cached in RAM; yes; yes; yes (the block points to it by hash); yes.
    – jabone
    Commented Sep 1, 2018 at 20:06
  • OK, and what about when the State Trie is updated: are the node values overwritten, or are new nodes stored with the new values? Commented Sep 1, 2018 at 20:31
  • The probability of two different pieces of data having the same hash is so small that Geth assumes it never happens. So different world states can coexist in the same format in one DB, because nodes are referenced by their hash. Common nodes are, of course, stored only once.
    – jabone
    Commented Sep 1, 2018 at 22:12
  • @EtherswornCanonist, LevelDB is a key->value database. Everything is stored with the hash as the key and the corresponding data structure as the value. That's it. Why would you need to dig into the structure of the database? Just use the functions Ethereum provides to get the data you need.
    – Nulik
    Commented Sep 1, 2018 at 23:46
