# Tree Chunking

In order to securely sync a Merkle tree, we must have some algorithm for
splitting it into chunks that remain verifiably part of the Merkle tree.
Whether the chunks are computed eagerly ahead of time,
or lazily in real time, is an orthogonal consideration.

We refer to the node computing the chunks from a complete Merkle tree as the
"chunker", and the node reassembling the Merkle tree from received chunks as the
"chunkee".

The main issue with various chunking strategies is how to verify that a given
chunk corresponds to a given chunk index. Since we want chunkees to be able to
apply chunks in any order, we need to ensure that applying an out-of-order
chunk is fully verifiable - i.e. the chunk is a valid part of the Merkle tree,
and correctly corresponds to its chunk index. It turns out this may not be
possible in general.
## Arbitrary Chunking

There are two general approaches here - breadth-first and depth-first.
In the former case, we descend from the root one layer at a time, accumulating
full internal nodes into chunks. In the latter case, we traverse the tree from,
say, left to right, accumulating full key-value pairs into chunks. In each case,
we require additional proof data, which consists of internal nodes that prove
all components of a chunk (whether nodes or key-value pairs) are valid members of
the tree.

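As a rough illustration of the depth-first variant, here is a minimal sketch
in Go; the `Node` and `Chunk` types and the `ChunkDepthFirst` function are
hypothetical and not part of any existing Tendermint API:

```go
package chunker

// Node is a hypothetical in-memory Merkle tree node. Leaves carry a
// key-value pair; inner nodes carry only a hash and children.
type Node struct {
	Key, Value  []byte // set only on leaves
	Hash        []byte
	Left, Right *Node
}

// Chunk holds a contiguous run of key-value pairs plus the hashes of inner
// nodes visited along the way (the "proof data" needed to show the pairs,
// and the tree structure, are part of the tree).
type Chunk struct {
	Index int
	Pairs [][2][]byte
	Proof [][]byte
}

// ChunkDepthFirst traverses the tree left to right, starting a new chunk
// whenever the current one reaches maxPairs entries. Any chunker following
// this strategy on the same tree produces the same chunks with the same
// sequential indices.
func ChunkDepthFirst(root *Node, maxPairs int) []Chunk {
	var chunks []Chunk
	cur := Chunk{Index: 0}

	var walk func(n *Node)
	walk = func(n *Node) {
		if n == nil {
			return
		}
		if n.Left == nil && n.Right == nil { // leaf: accumulate its key-value pair
			if len(cur.Pairs) == maxPairs {
				chunks = append(chunks, cur)
				cur = Chunk{Index: len(chunks)}
			}
			cur.Pairs = append(cur.Pairs, [2][]byte{n.Key, n.Value})
			return
		}
		// inner node: record its hash in the current chunk's proof data so
		// the chunkee can reconstruct the tree structure (important for
		// trees whose structure depends on insertion order)
		cur.Proof = append(cur.Proof, n.Hash)
		walk(n.Left)
		walk(n.Right)
	}
	walk(root)
	if len(cur.Pairs) > 0 || len(cur.Proof) > 0 {
		chunks = append(chunks, cur)
	}
	return chunks
}
```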
Note that for AVL trees, which are insertion-order dependent, the
depth-first approach must include all internal nodes in order to properly
communicate the structure of the tree, which cannot be derived from key-value
pairs alone. Such nodes would be part of the proof structure.

The chunker can assign each chunk a unique index, and every chunker will
compute the same index for each chunk, assuming they follow the same
strategy. However, it is not clear that, in general, a chunkee can verify the
chunk index, as there is no explicit map from chunk index to chunk or vice
versa.

This raises two concerns for a state sync protocol:

- how to handle mis-indexed chunks (i.e. requesting chunk 3 and receiving
  chunk 4)
- how to handle non-indexed chunks (i.e. receiving a chunk that doesn't
  correspond to any index from the given chunker strategy)

In each case, it appears that liveness is made much more difficult by the need
to figure out what went wrong with the applied chunks. However, these problems
can be eliminated by requiring chunks to be applied in-order.

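To illustrate the in-order alternative, here is a minimal sketch in Go,
assuming hypothetical `fetch` and `applyAndVerify` callbacks that are not part
of any existing API: because chunk i+1 is only requested after chunk i has been
applied and verified, a bad chunk is rejected the moment it arrives, from the
peer that sent it.

```go
package statesync

import "fmt"

// Chunk is a hypothetical container for one chunk; its contents depend on
// the chunking strategy used by the chunker.
type Chunk struct {
	Index int
	Data  []byte
}

// syncInOrder requests chunks strictly in sequence. A mis-indexed or
// non-indexed chunk fails verification as soon as it is received, so the
// faulty peer is identified immediately and no later reconciliation of
// already-applied chunks is needed.
func syncInOrder(
	numChunks int,
	fetch func(index int) (Chunk, error), // hypothetical network request to a peer
	applyAndVerify func(Chunk) error, // hypothetical: apply chunk and check it against the root hash
) error {
	for i := 0; i < numChunks; i++ {
		chunk, err := fetch(i)
		if err != nil {
			return fmt.Errorf("fetching chunk %d: %w", i, err)
		}
		if err := applyAndVerify(chunk); err != nil {
			return fmt.Errorf("chunk %d failed verification: %w", i, err)
		}
	}
	return nil
}
```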
### Mis-Indexed Chunks

Suppose an honest chunker follows a strategy creating chunks 1,2,3,4,5.
Suppose an honest chunkee requests chunk 3 from a malicious chunker,
who responds with chunk 4. The chunkee applies chunk 4 to its Merkle tree
and verifies that it's correct. At this point the chunkee believes it
successfully applied chunk 3. This raises two questions:

- When the chunkee actually requests and receives chunk 4, and then discovers
  it already applied that chunk, how will it know which peer was malicious,
  the original one or the new one?
- How will the chunkee discover that it is missing the real chunk 3?
### Non-Indexed Chunks

Suppose an honest chunker follows a strategy creating chunks 1,2,3,4,5.
Suppose a malicious chunker follows an alternative strategy, creating chunks
Q,R,S,T,U, none of which correspond to chunks 1,2,3,4,5, but all of which are
valid chunks in the tree.
Suppose an honest chunkee requests chunk 3 from the malicious chunker,
who responds with chunk R. The chunkee applies chunk R to its Merkle tree and
verifies that it's correct. At this point the chunkee believes it
successfully applied chunk 3. This raises two questions:

- When the chunkee receives another chunk that overlaps with chunk R, how will
  it know which peer was malicious, the original one or the new one?
- How will the chunkee determine which real chunks it is missing?
## Provably Indexed Chunking

An alternative approach would be a strategy that chunks the tree in a manner
such that all chunks have a verifiable index. Certain tree structures may be
able to incorporate this requirement into their design. In the extreme case,
the map from chunks to indices (or vice versa) could be committed into the tree
itself. But generally, we would like to avoid such requirements on the tree and
state and instead pursue a general strategy for chunking trees that preserves
verifiable chunk indices.

Consider an example strategy for illustration of how this would work
(a verification sketch follows the list):

- the first chunk (chunk 0) contains all nodes in the top 10 layers of the tree
- the nodes in layer 10, which are part of chunk 0, form the roots for 1024
  sub-trees that contain the rest of the tree
- each of these sub-trees is its own chunk, numbered from 1 to 1024
- so long as chunk 0 is received first, each additional chunk can be verified
  with its index. For instance, if we receive chunk 5, the root hash of the
  sub-tree contained therein should correspond to the 5th node in layer 10

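A minimal sketch of that verification step, in Go, assuming a hypothetical
`SubTreeChunk` type and that the hashes of the layer-10 nodes were already
received and verified as part of chunk 0:

```go
package statesync

import "bytes"

// SubTreeChunk is a hypothetical chunk holding one of the sub-trees rooted
// at layer 10, together with the index it claims to occupy (1..1024 in the
// example above).
type SubTreeChunk struct {
	Index    int
	RootHash []byte   // root hash of the sub-tree contained in this chunk
	Nodes    [][]byte // the sub-tree's nodes, not checked in this sketch
}

// verifyChunkIndex checks that the chunk's sub-tree hangs off the expected
// position in the tree: its root hash must equal the hash of the
// corresponding node in layer 10, which was received as part of chunk 0.
func verifyChunkIndex(chunk SubTreeChunk, layer10Hashes [][]byte) bool {
	if chunk.Index < 1 || chunk.Index > len(layer10Hashes) {
		return false
	}
	// chunk i corresponds to the i-th node (left to right) in layer 10
	return bytes.Equal(chunk.RootHash, layer10Hashes[chunk.Index-1])
}
```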
While this example provides the intuition for how the design might work,
we now need to generalize it to handle arbitrary tree sizes and bounded
chunk sizes. With the example above, we were only required to receive chunk 0 up
front, and could then receive the other chunks in any order. However, to
generalize, we may be required to receive multiple chunks in order first, before
we get to a point where we can receive them in any order.

It may be the case that we need to receive all chunks in order until we get to a
depth where the remaining chunks are complete sub-trees and can be received in
any order. It is not clear how the chunkee can determine for itself when it
reaches this state; while it can verify it has all nodes up to a certain depth,
it does not necessarily know that each remaining sub-tree can fit in a chunk.

Thus it may not be possible to achieve this sort of out-of-order chunking.
## Conclusion

The above considerations demonstrate the challenges of applying chunks of a
Merkle tree out-of-order. While it may be possible with more complex protocol
design, it may be worth first considering a protocol that proceeds in-order, as
this should dramatically reduce the protocol complexity while maintaining the
ability to verify each chunk fully. This would also be very similar to the
blockchain reactor, which must process blocks serially. Since any form of state sync
is likely to be orders of magnitude faster than the blockchain sync, the
performance loss from processing chunks in order is likely to be relatively small
compared to a more optimized, out-of-order approach.