mirror of
https://github.com/fluencelabs/tendermint
synced 2025-04-25 06:42:16 +00:00
Deprecate/refactor content in docs/specification (#1934)
* docs: deprecate specification dir, closes #1814 * update genesis * old spec dir, deprecation complete * rm a file
This commit is contained in:
parent
4f7fac8076
commit
bbf2bd1d81
@ -1,8 +1,4 @@
|
||||
Fast Sync
|
||||
=========
|
||||
|
||||
Background
|
||||
----------
|
||||
# Fast Sync
|
||||
|
||||
In a proof of work blockchain, syncing with the chain is the same
|
||||
process as staying up-to-date with the consensus: download blocks, and
|
||||
@ -14,21 +10,19 @@ scratch can take a very long time. It's much faster to just download
|
||||
blocks and check the merkle tree of validators than to run the real-time
|
||||
consensus gossip protocol.
|
||||
|
||||
Fast Sync
|
||||
---------
|
||||
## Using Fast Sync
|
||||
|
||||
To support faster syncing, tendermint offers a ``fast-sync`` mode, which
|
||||
is enabled by default, and can be toggled in the ``config.toml`` or via
|
||||
``--fast_sync=false``.
|
||||
To support faster syncing, tendermint offers a `fast-sync` mode, which
|
||||
is enabled by default, and can be toggled in the `config.toml` or via
|
||||
`--fast_sync=false`.
|
||||
|
||||
In this mode, the tendermint daemon will sync hundreds of times faster
|
||||
than if it used the real-time consensus process. Once caught up, the
|
||||
daemon will switch out of fast sync and into the normal consensus mode.
|
||||
After running for some time, the node is considered ``caught up`` if it
|
||||
After running for some time, the node is considered `caught up` if it
|
||||
has at least one peer and it's height is at least as high as the max
|
||||
reported peer height. See `the IsCaughtUp
|
||||
method <https://github.com/tendermint/tendermint/blob/b467515719e686e4678e6da4e102f32a491b85a0/blockchain/pool.go#L128>`__.
|
||||
reported peer height. See [the IsCaughtUp
|
||||
method](https://github.com/tendermint/tendermint/blob/b467515719e686e4678e6da4e102f32a491b85a0/blockchain/pool.go#L128).
|
||||
|
||||
If we're lagging sufficiently, we should go back to fast syncing, but
|
||||
this is an open issue:
|
||||
https://github.com/tendermint/tendermint/issues/129
|
||||
this is an [open issue](https://github.com/tendermint/tendermint/issues/129).
|
@ -149,7 +149,33 @@ func MakeParts(obj interface{}, partSize int) []Part
|
||||
|
||||
## Merkle Trees
|
||||
|
||||
Simple Merkle trees are used in numerous places in Tendermint to compute a cryptographic digest of a data structure.
|
||||
For an overview of Merkle trees, see
|
||||
[wikipedia](https://en.wikipedia.org/wiki/Merkle_tree)
|
||||
|
||||
|
||||
A Simple Tree is a simple compact binary tree for a static list of items. Simple Merkle trees are used in numerous places in Tendermint to compute a cryptographic digest of a data structure. In a Simple Tree, the transactions and validation signatures of a block are hashed using this simple merkle tree logic.
|
||||
|
||||
If the number of items is not a power of two, the tree will not be full
|
||||
and some leaf nodes will be at different levels. Simple Tree tries to
|
||||
keep both sides of the tree the same size, but the left side may be one
|
||||
greater, for example:
|
||||
|
||||
```
|
||||
Simple Tree with 6 items Simple Tree with 7 items
|
||||
|
||||
* *
|
||||
/ \ / \
|
||||
/ \ / \
|
||||
/ \ / \
|
||||
/ \ / \
|
||||
* * * *
|
||||
/ \ / \ / \ / \
|
||||
/ \ / \ / \ / \
|
||||
/ \ / \ / \ / \
|
||||
* h2 * h5 * * * h6
|
||||
/ \ / \ / \ / \ / \
|
||||
h0 h1 h3 h4 h0 h1 h2 h3 h4 h5
|
||||
```
|
||||
|
||||
Tendermint always uses the `TMHASH` hash function, which is the first 20-bytes
|
||||
of the SHA256:
|
||||
@ -235,6 +261,18 @@ func computeHashFromAunts(index, total int, leafHash []byte, innerHashes [][]byt
|
||||
}
|
||||
```
|
||||
|
||||
### Simple Tree with Dictionaries
|
||||
|
||||
The Simple Tree is used to merkelize a list of items, so to merkelize a
|
||||
(short) dictionary of key-value pairs, encode the dictionary as an
|
||||
ordered list of ``KVPair`` structs. The block hash is such a hash
|
||||
derived from all the fields of the block ``Header``. The state hash is
|
||||
similarly derived.
|
||||
|
||||
### IAVL+ Tree
|
||||
|
||||
Because Tendermint only uses a Simple Merkle Tree, application developers are expect to use their own Merkle tree in their applications. For example, the IAVL+ Tree - an immutable self-balancing binary tree for persisting application state is used by the [Cosmos SDK](https://github.com/cosmos/cosmos-sdk/blob/develop/docs/core/multistore.md)
|
||||
|
||||
## JSON
|
||||
|
||||
### Amino
|
||||
|
@ -1,9 +1,329 @@
|
||||
We are working to finalize an updated Tendermint specification with formal
|
||||
proofs of safety and liveness.
|
||||
# Byzantine Consensus Algorithm
|
||||
|
||||
In the meantime, see the [description in the
|
||||
docs](http://tendermint.readthedocs.io/en/master/specification/byzantine-consensus-algorithm.html).
|
||||
## Terms
|
||||
|
||||
There are also relevant but somewhat outdated descriptions in Jae Kwon's [original
|
||||
whitepaper](https://tendermint.com/static/docs/tendermint.pdf) and Ethan Buchman's [master's
|
||||
thesis](https://atrium.lib.uoguelph.ca/xmlui/handle/10214/9769).
|
||||
- The network is composed of optionally connected *nodes*. Nodes
|
||||
directly connected to a particular node are called *peers*.
|
||||
- The consensus process in deciding the next block (at some *height*
|
||||
`H`) is composed of one or many *rounds*.
|
||||
- `NewHeight`, `Propose`, `Prevote`, `Precommit`, and `Commit`
|
||||
represent state machine states of a round. (aka `RoundStep` or
|
||||
just "step").
|
||||
- A node is said to be *at* a given height, round, and step, or at
|
||||
`(H,R,S)`, or at `(H,R)` in short to omit the step.
|
||||
- To *prevote* or *precommit* something means to broadcast a [prevote
|
||||
vote](https://godoc.org/github.com/tendermint/tendermint/types#Vote)
|
||||
or [first precommit
|
||||
vote](https://godoc.org/github.com/tendermint/tendermint/types#FirstPrecommit)
|
||||
for something.
|
||||
- A vote *at* `(H,R)` is a vote signed with the bytes for `H` and `R`
|
||||
included in its [sign-bytes](block-structure.html#vote-sign-bytes).
|
||||
- *+2/3* is short for "more than 2/3"
|
||||
- *1/3+* is short for "1/3 or more"
|
||||
- A set of +2/3 of prevotes for a particular block or `<nil>` at
|
||||
`(H,R)` is called a *proof-of-lock-change* or *PoLC* for short.
|
||||
|
||||
## State Machine Overview
|
||||
|
||||
At each height of the blockchain a round-based protocol is run to
|
||||
determine the next block. Each round is composed of three *steps*
|
||||
(`Propose`, `Prevote`, and `Precommit`), along with two special steps
|
||||
`Commit` and `NewHeight`.
|
||||
|
||||
In the optimal scenario, the order of steps is:
|
||||
|
||||
```
|
||||
NewHeight -> (Propose -> Prevote -> Precommit)+ -> Commit -> NewHeight ->...
|
||||
```
|
||||
|
||||
The sequence `(Propose -> Prevote -> Precommit)` is called a *round*.
|
||||
There may be more than one round required to commit a block at a given
|
||||
height. Examples for why more rounds may be required include:
|
||||
|
||||
- The designated proposer was not online.
|
||||
- The block proposed by the designated proposer was not valid.
|
||||
- The block proposed by the designated proposer did not propagate
|
||||
in time.
|
||||
- The block proposed was valid, but +2/3 of prevotes for the proposed
|
||||
block were not received in time for enough validator nodes by the
|
||||
time they reached the `Precommit` step. Even though +2/3 of prevotes
|
||||
are necessary to progress to the next step, at least one validator
|
||||
may have voted `<nil>` or maliciously voted for something else.
|
||||
- The block proposed was valid, and +2/3 of prevotes were received for
|
||||
enough nodes, but +2/3 of precommits for the proposed block were not
|
||||
received for enough validator nodes.
|
||||
|
||||
Some of these problems are resolved by moving onto the next round &
|
||||
proposer. Others are resolved by increasing certain round timeout
|
||||
parameters over each successive round.
|
||||
|
||||
## State Machine Diagram
|
||||
|
||||
```
|
||||
+-------------------------------------+
|
||||
v |(Wait til `CommmitTime+timeoutCommit`)
|
||||
+-----------+ +-----+-----+
|
||||
+----------> | Propose +--------------+ | NewHeight |
|
||||
| +-----------+ | +-----------+
|
||||
| | ^
|
||||
|(Else, after timeoutPrecommit) v |
|
||||
+-----+-----+ +-----------+ |
|
||||
| Precommit | <------------------------+ Prevote | |
|
||||
+-----+-----+ +-----------+ |
|
||||
|(When +2/3 Precommits for block found) |
|
||||
v |
|
||||
+--------------------------------------------------------------------+
|
||||
| Commit |
|
||||
| |
|
||||
| * Set CommitTime = now; |
|
||||
| * Wait for block, then stage/save/commit block; |
|
||||
+--------------------------------------------------------------------+
|
||||
```
|
||||
|
||||
Background Gossip
|
||||
=================
|
||||
|
||||
A node may not have a corresponding validator private key, but it
|
||||
nevertheless plays an active role in the consensus process by relaying
|
||||
relevant meta-data, proposals, blocks, and votes to its peers. A node
|
||||
that has the private keys of an active validator and is engaged in
|
||||
signing votes is called a *validator-node*. All nodes (not just
|
||||
validator-nodes) have an associated state (the current height, round,
|
||||
and step) and work to make progress.
|
||||
|
||||
Between two nodes there exists a `Connection`, and multiplexed on top of
|
||||
this connection are fairly throttled `Channel`s of information. An
|
||||
epidemic gossip protocol is implemented among some of these channels to
|
||||
bring peers up to speed on the most recent state of consensus. For
|
||||
example,
|
||||
|
||||
- Nodes gossip `PartSet` parts of the current round's proposer's
|
||||
proposed block. A LibSwift inspired algorithm is used to quickly
|
||||
broadcast blocks across the gossip network.
|
||||
- Nodes gossip prevote/precommit votes. A node `NODE_A` that is ahead
|
||||
of `NODE_B` can send `NODE_B` prevotes or precommits for `NODE_B`'s
|
||||
current (or future) round to enable it to progress forward.
|
||||
- Nodes gossip prevotes for the proposed PoLC (proof-of-lock-change)
|
||||
round if one is proposed.
|
||||
- Nodes gossip to nodes lagging in blockchain height with block
|
||||
[commits](https://godoc.org/github.com/tendermint/tendermint/types#Commit)
|
||||
for older blocks.
|
||||
- Nodes opportunistically gossip `HasVote` messages to hint peers what
|
||||
votes it already has.
|
||||
- Nodes broadcast their current state to all neighboring peers. (but
|
||||
is not gossiped further)
|
||||
|
||||
There's more, but let's not get ahead of ourselves here.
|
||||
|
||||
## Proposals
|
||||
|
||||
A proposal is signed and published by the designated proposer at each
|
||||
round. The proposer is chosen by a deterministic and non-choking round
|
||||
robin selection algorithm that selects proposers in proportion to their
|
||||
voting power (see
|
||||
[implementation](https://github.com/tendermint/tendermint/blob/develop/types/validator_set.go)).
|
||||
|
||||
A proposal at `(H,R)` is composed of a block and an optional latest
|
||||
`PoLC-Round < R` which is included iff the proposer knows of one. This
|
||||
hints the network to allow nodes to unlock (when safe) to ensure the
|
||||
liveness property.
|
||||
|
||||
## State Machine Spec
|
||||
|
||||
### Propose Step (height:H,round:R)
|
||||
|
||||
Upon entering `Propose`: - The designated proposer proposes a block at
|
||||
`(H,R)`.
|
||||
|
||||
The `Propose` step ends: - After `timeoutProposeR` after entering
|
||||
`Propose`. --> goto `Prevote(H,R)` - After receiving proposal block
|
||||
and all prevotes at `PoLC-Round`. --> goto `Prevote(H,R)` - After
|
||||
[common exit conditions](#common-exit-conditions)
|
||||
|
||||
### Prevote Step (height:H,round:R)
|
||||
|
||||
Upon entering `Prevote`, each validator broadcasts its prevote vote.
|
||||
|
||||
- First, if the validator is locked on a block since `LastLockRound`
|
||||
but now has a PoLC for something else at round `PoLC-Round` where
|
||||
`LastLockRound < PoLC-Round < R`, then it unlocks.
|
||||
- If the validator is still locked on a block, it prevotes that.
|
||||
- Else, if the proposed block from `Propose(H,R)` is good, it
|
||||
prevotes that.
|
||||
- Else, if the proposal is invalid or wasn't received on time, it
|
||||
prevotes `<nil>`.
|
||||
|
||||
The `Prevote` step ends: - After +2/3 prevotes for a particular block or
|
||||
`<nil>`. -->; goto `Precommit(H,R)` - After `timeoutPrevote` after
|
||||
receiving any +2/3 prevotes. --> goto `Precommit(H,R)` - After
|
||||
[common exit conditions](#common-exit-conditions)
|
||||
|
||||
### Precommit Step (height:H,round:R)
|
||||
|
||||
Upon entering `Precommit`, each validator broadcasts its precommit vote.
|
||||
- If the validator has a PoLC at `(H,R)` for a particular block `B`, it
|
||||
(re)locks (or changes lock to) and precommits `B` and sets
|
||||
`LastLockRound = R`. - Else, if the validator has a PoLC at `(H,R)` for
|
||||
`<nil>`, it unlocks and precommits `<nil>`. - Else, it keeps the lock
|
||||
unchanged and precommits `<nil>`.
|
||||
|
||||
A precommit for `<nil>` means "I didn’t see a PoLC for this round, but I
|
||||
did get +2/3 prevotes and waited a bit".
|
||||
|
||||
The Precommit step ends: - After +2/3 precommits for `<nil>`. -->
|
||||
goto `Propose(H,R+1)` - After `timeoutPrecommit` after receiving any
|
||||
+2/3 precommits. --> goto `Propose(H,R+1)` - After [common exit
|
||||
conditions](#common-exit-conditions)
|
||||
|
||||
### Common exit conditions
|
||||
|
||||
- After +2/3 precommits for a particular block. --> goto
|
||||
`Commit(H)`
|
||||
- After any +2/3 prevotes received at `(H,R+x)`. --> goto
|
||||
`Prevote(H,R+x)`
|
||||
- After any +2/3 precommits received at `(H,R+x)`. --> goto
|
||||
`Precommit(H,R+x)`
|
||||
|
||||
### Commit Step (height:H)
|
||||
|
||||
- Set `CommitTime = now()`
|
||||
- Wait until block is received. --> goto `NewHeight(H+1)`
|
||||
|
||||
### NewHeight Step (height:H)
|
||||
|
||||
- Move `Precommits` to `LastCommit` and increment height.
|
||||
- Set `StartTime = CommitTime+timeoutCommit`
|
||||
- Wait until `StartTime` to receive straggler commits. --> goto
|
||||
`Propose(H,0)`
|
||||
|
||||
## Proofs
|
||||
|
||||
### Proof of Safety
|
||||
|
||||
Assume that at most -1/3 of the voting power of validators is byzantine.
|
||||
If a validator commits block `B` at round `R`, it's because it saw +2/3
|
||||
of precommits at round `R`. This implies that 1/3+ of honest nodes are
|
||||
still locked at round `R' > R`. These locked validators will remain
|
||||
locked until they see a PoLC at `R' > R`, but this won't happen because
|
||||
1/3+ are locked and honest, so at most -2/3 are available to vote for
|
||||
anything other than `B`.
|
||||
|
||||
### Proof of Liveness
|
||||
|
||||
If 1/3+ honest validators are locked on two different blocks from
|
||||
different rounds, a proposers' `PoLC-Round` will eventually cause nodes
|
||||
locked from the earlier round to unlock. Eventually, the designated
|
||||
proposer will be one that is aware of a PoLC at the later round. Also,
|
||||
`timeoutProposalR` increments with round `R`, while the size of a
|
||||
proposal are capped, so eventually the network is able to "fully gossip"
|
||||
the whole proposal (e.g. the block & PoLC).
|
||||
|
||||
### Proof of Fork Accountability
|
||||
|
||||
Define the JSet (justification-vote-set) at height `H` of a validator
|
||||
`V1` to be all the votes signed by the validator at `H` along with
|
||||
justification PoLC prevotes for each lock change. For example, if `V1`
|
||||
signed the following precommits: `Precommit(B1 @ round 0)`,
|
||||
`Precommit(<nil> @ round 1)`, `Precommit(B2 @ round 4)` (note that no
|
||||
precommits were signed for rounds 2 and 3, and that's ok),
|
||||
`Precommit(B1 @ round 0)` must be justified by a PoLC at round 0, and
|
||||
`Precommit(B2 @ round 4)` must be justified by a PoLC at round 4; but
|
||||
the precommit for `<nil>` at round 1 is not a lock-change by definition
|
||||
so the JSet for `V1` need not include any prevotes at round 1, 2, or 3
|
||||
(unless `V1` happened to have prevoted for those rounds).
|
||||
|
||||
Further, define the JSet at height `H` of a set of validators `VSet` to
|
||||
be the union of the JSets for each validator in `VSet`. For a given
|
||||
commit by honest validators at round `R` for block `B` we can construct
|
||||
a JSet to justify the commit for `B` at `R`. We say that a JSet
|
||||
*justifies* a commit at `(H,R)` if all the committers (validators in the
|
||||
commit-set) are each justified in the JSet with no duplicitous vote
|
||||
signatures (by the committers).
|
||||
|
||||
- **Lemma**: When a fork is detected by the existence of two
|
||||
conflicting [commits](./validators.html#commiting-a-block), the
|
||||
union of the JSets for both commits (if they can be compiled) must
|
||||
include double-signing by at least 1/3+ of the validator set.
|
||||
**Proof**: The commit cannot be at the same round, because that
|
||||
would immediately imply double-signing by 1/3+. Take the union of
|
||||
the JSets of both commits. If there is no double-signing by at least
|
||||
1/3+ of the validator set in the union, then no honest validator
|
||||
could have precommitted any different block after the first commit.
|
||||
Yet, +2/3 did. Reductio ad absurdum.
|
||||
|
||||
As a corollary, when there is a fork, an external process can determine
|
||||
the blame by requiring each validator to justify all of its round votes.
|
||||
Either we will find 1/3+ who cannot justify at least one of their votes,
|
||||
and/or, we will find 1/3+ who had double-signed.
|
||||
|
||||
### Alternative algorithm
|
||||
|
||||
Alternatively, we can take the JSet of a commit to be the "full commit".
|
||||
That is, if light clients and validators do not consider a block to be
|
||||
committed unless the JSet of the commit is also known, then we get the
|
||||
desirable property that if there ever is a fork (e.g. there are two
|
||||
conflicting "full commits"), then 1/3+ of the validators are immediately
|
||||
punishable for double-signing.
|
||||
|
||||
There are many ways to ensure that the gossip network efficiently share
|
||||
the JSet of a commit. One solution is to add a new message type that
|
||||
tells peers that this node has (or does not have) a +2/3 majority for B
|
||||
(or) at (H,R), and a bitarray of which votes contributed towards that
|
||||
majority. Peers can react by responding with appropriate votes.
|
||||
|
||||
We will implement such an algorithm for the next iteration of the
|
||||
Tendermint consensus protocol.
|
||||
|
||||
Other potential improvements include adding more data in votes such as
|
||||
the last known PoLC round that caused a lock change, and the last voted
|
||||
round/step (or, we may require that validators not skip any votes). This
|
||||
may make JSet verification/gossip logic easier to implement.
|
||||
|
||||
### Censorship Attacks
|
||||
|
||||
Due to the definition of a block
|
||||
[commit](../../tendermint-core/validator.md#commiting-a-block), any 1/3+ coalition of
|
||||
validators can halt the blockchain by not broadcasting their votes. Such
|
||||
a coalition can also censor particular transactions by rejecting blocks
|
||||
that include these transactions, though this would result in a
|
||||
significant proportion of block proposals to be rejected, which would
|
||||
slow down the rate of block commits of the blockchain, reducing its
|
||||
utility and value. The malicious coalition might also broadcast votes in
|
||||
a trickle so as to grind blockchain block commits to a near halt, or
|
||||
engage in any combination of these attacks.
|
||||
|
||||
If a global active adversary were also involved, it can partition the
|
||||
network in such a way that it may appear that the wrong subset of
|
||||
validators were responsible for the slowdown. This is not just a
|
||||
limitation of Tendermint, but rather a limitation of all consensus
|
||||
protocols whose network is potentially controlled by an active
|
||||
adversary.
|
||||
|
||||
### Overcoming Forks and Censorship Attacks
|
||||
|
||||
For these types of attacks, a subset of the validators through external
|
||||
means should coordinate to sign a reorg-proposal that chooses a fork
|
||||
(and any evidence thereof) and the initial subset of validators with
|
||||
their signatures. Validators who sign such a reorg-proposal forego its
|
||||
collateral on all other forks. Clients should verify the signatures on
|
||||
the reorg-proposal, verify any evidence, and make a judgement or prompt
|
||||
the end-user for a decision. For example, a phone wallet app may prompt
|
||||
the user with a security warning, while a refrigerator may accept any
|
||||
reorg-proposal signed by +1/2 of the original validators.
|
||||
|
||||
No non-synchronous Byzantine fault-tolerant algorithm can come to
|
||||
consensus when 1/3+ of validators are dishonest, yet a fork assumes that
|
||||
1/3+ of validators have already been dishonest by double-signing or
|
||||
lock-changing without justification. So, signing the reorg-proposal is a
|
||||
coordination problem that cannot be solved by any non-synchronous
|
||||
protocol (i.e. automatically, and without making assumptions about the
|
||||
reliability of the underlying network). It must be provided by means
|
||||
external to the weakly-synchronous Tendermint consensus algorithm. For
|
||||
now, we leave the problem of reorg-proposal coordination to human
|
||||
coordination via internet media. Validators must take care to ensure
|
||||
that there are no significant network partitions, to avoid situations
|
||||
where two conflicting reorg-proposals are signed.
|
||||
|
||||
Assuming that the external coordination medium and protocol is robust,
|
||||
it follows that forks are less of a concern than [censorship
|
||||
attacks](#censorship-attacks).
|
||||
|
@ -1,218 +0,0 @@
|
||||
Block Structure
|
||||
===============
|
||||
|
||||
The tendermint consensus engine records all agreements by a
|
||||
supermajority of nodes into a blockchain, which is replicated among all
|
||||
nodes. This blockchain is accessible via various rpc endpoints, mainly
|
||||
``/block?height=`` to get the full block, as well as
|
||||
``/blockchain?minHeight=_&maxHeight=_`` to get a list of headers. But
|
||||
what exactly is stored in these blocks?
|
||||
|
||||
Block
|
||||
~~~~~
|
||||
|
||||
A
|
||||
`Block <https://godoc.org/github.com/tendermint/tendermint/types#Block>`__
|
||||
contains:
|
||||
|
||||
- a `Header <#header>`__ contains merkle hashes for various chain
|
||||
states
|
||||
- the
|
||||
`Data <https://godoc.org/github.com/tendermint/tendermint/types#Data>`__
|
||||
is all transactions which are to be processed
|
||||
- the `LastCommit <#commit>`__ > 2/3 signatures for the last block
|
||||
|
||||
The signatures returned along with block ``H`` are those validating
|
||||
block ``H-1``. This can be a little confusing, but we must also consider
|
||||
that the ``Header`` also contains the ``LastCommitHash``. It would be
|
||||
impossible for a Header to include the commits that sign it, as it would
|
||||
cause an infinite loop here. But when we get block ``H``, we find
|
||||
``Header.LastCommitHash``, which must match the hash of ``LastCommit``.
|
||||
|
||||
Header
|
||||
~~~~~~
|
||||
|
||||
The
|
||||
`Header <https://godoc.org/github.com/tendermint/tendermint/types#Header>`__
|
||||
contains lots of information (follow link for up-to-date info). Notably,
|
||||
it maintains the ``Height``, the ``LastBlockID`` (to make it a chain),
|
||||
and hashes of the data, the app state, and the validator set. This is
|
||||
important as the only item that is signed by the validators is the
|
||||
``Header``, and all other data must be validated against one of the
|
||||
merkle hashes in the ``Header``.
|
||||
|
||||
The ``DataHash`` can provide a nice check on the
|
||||
`Data <https://godoc.org/github.com/tendermint/tendermint/types#Data>`__
|
||||
returned in this same block. If you are subscribed to new blocks, via
|
||||
tendermint RPC, in order to display or process the new transactions you
|
||||
should at least validate that the ``DataHash`` is valid. If it is
|
||||
important to verify autheniticity, you must wait for the ``LastCommit``
|
||||
from the next block to make sure the block header (including
|
||||
``DataHash``) was properly signed.
|
||||
|
||||
The ``ValidatorHash`` contains a hash of the current
|
||||
`Validators <https://godoc.org/github.com/tendermint/tendermint/types#Validator>`__.
|
||||
Tracking all changes in the validator set is complex, but a client can
|
||||
quickly compare this hash with the `hash of the currently known
|
||||
validators <https://godoc.org/github.com/tendermint/tendermint/types#ValidatorSet.Hash>`__
|
||||
to see if there have been changes.
|
||||
|
||||
The ``AppHash`` serves as the basis for validating any merkle proofs
|
||||
that come from the ABCI application. It represents the
|
||||
state of the actual application, rather that the state of the blockchain
|
||||
itself. This means it's necessary in order to perform any business
|
||||
logic, such as verifying an account balance.
|
||||
|
||||
**Note** After the transactions are committed to a block, they still
|
||||
need to be processed in a separate step, which happens between the
|
||||
blocks. If you find a given transaction in the block at height ``H``,
|
||||
the effects of running that transaction will be first visible in the
|
||||
``AppHash`` from the block header at height ``H+1``.
|
||||
|
||||
Like the ``LastCommit`` issue, this is a requirement of the immutability
|
||||
of the block chain, as the application only applies transactions *after*
|
||||
they are commited to the chain.
|
||||
|
||||
Commit
|
||||
~~~~~~
|
||||
|
||||
The
|
||||
`Commit <https://godoc.org/github.com/tendermint/tendermint/types#Commit>`__
|
||||
contains a set of
|
||||
`Votes <https://godoc.org/github.com/tendermint/tendermint/types#Vote>`__
|
||||
that were made by the validator set to reach consensus on this block.
|
||||
This is the key to the security in any PoS system, and actually no data
|
||||
that cannot be traced back to a block header with a valid set of Votes
|
||||
can be trusted. Thus, getting the Commit data and verifying the votes is
|
||||
extremely important.
|
||||
|
||||
As mentioned above, in order to find the ``precommit votes`` for block
|
||||
header ``H``, we need to query block ``H+1``. Then we need to check the
|
||||
votes, make sure they really are for that block, and properly formatted.
|
||||
Much of this code is implemented in Go in the
|
||||
`light-client <https://github.com/tendermint/light-client>`__ package.
|
||||
If you look at the code, you will notice that we need to provide the
|
||||
``chainID`` of the blockchain in order to properly calculate the votes.
|
||||
This is to protect anyone from swapping votes between chains to fake (or
|
||||
frame) a validator. Also note that this ``chainID`` is in the
|
||||
``genesis.json`` from *Tendermint*, not the ``genesis.json`` from the
|
||||
basecoin app (`that is a different
|
||||
chainID... <https://github.com/cosmos/cosmos-sdk/issues/32>`__).
|
||||
|
||||
Once we have those votes, and we calculated the proper `sign
|
||||
bytes <https://godoc.org/github.com/tendermint/tendermint/types#Vote.WriteSignBytes>`__
|
||||
using the chainID and a `nice helper
|
||||
function <https://godoc.org/github.com/tendermint/tendermint/types#SignBytes>`__,
|
||||
we can verify them. The light client is responsible for maintaining a
|
||||
set of validators that we trust. Each vote only stores the validators
|
||||
``Address``, as well as the ``Signature``. Assuming we have a local copy
|
||||
of the trusted validator set, we can look up the ``Public Key`` of the
|
||||
validator given its ``Address``, then verify that the ``Signature``
|
||||
matches the ``SignBytes`` and ``Public Key``. Then we sum up the total
|
||||
voting power of all validators, whose votes fulfilled all these
|
||||
stringent requirements. If the total number of voting power for a single
|
||||
block is greater than 2/3 of all voting power, then we can finally trust
|
||||
the block header, the AppHash, and the proof we got from the ABCI
|
||||
application.
|
||||
|
||||
Vote Sign Bytes
|
||||
^^^^^^^^^^^^^^^
|
||||
|
||||
The ``sign-bytes`` of a vote is produced by taking a
|
||||
`stable-json <https://github.com/substack/json-stable-stringify>`__-like
|
||||
deterministic JSON `wire <./wire-protocol.html>`__ encoding of
|
||||
the vote (excluding the ``Signature`` field), and wrapping it with
|
||||
``{"chain_id":"my_chain","vote":...}``.
|
||||
|
||||
For example, a precommit vote might have the following ``sign-bytes``:
|
||||
|
||||
.. code:: json
|
||||
|
||||
{"chain_id":"my_chain","vote":{"block_hash":"611801F57B4CE378DF1A3FFF1216656E89209A99","block_parts_header":{"hash":"B46697379DBE0774CC2C3B656083F07CA7E0F9CE","total":123},"height":1234,"round":1,"type":2}}
|
||||
|
||||
Block Hash
|
||||
~~~~~~~~~~
|
||||
|
||||
The `block
|
||||
hash <https://godoc.org/github.com/tendermint/tendermint/types#Block.Hash>`__
|
||||
is the `Simple Tree hash <./merkle.html#simple-tree-with-dictionaries>`__
|
||||
of the fields of the block ``Header`` encoded as a list of
|
||||
``KVPair``\ s.
|
||||
|
||||
Transaction
|
||||
~~~~~~~~~~~
|
||||
|
||||
A transaction is any sequence of bytes. It is up to your
|
||||
ABCI application to accept or reject transactions.
|
||||
|
||||
BlockID
|
||||
~~~~~~~
|
||||
|
||||
Many of these data structures refer to the
|
||||
`BlockID <https://godoc.org/github.com/tendermint/tendermint/types#BlockID>`__,
|
||||
which is the ``BlockHash`` (hash of the block header, also referred to
|
||||
by the next block) along with the ``PartSetHeader``. The
|
||||
``PartSetHeader`` is explained below and is used internally to
|
||||
orchestrate the p2p propogation. For clients, it is basically opaque
|
||||
bytes, but they must match for all votes.
|
||||
|
||||
PartSetHeader
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
The
|
||||
`PartSetHeader <https://godoc.org/github.com/tendermint/tendermint/types#PartSetHeader>`__
|
||||
contains the total number of pieces in a
|
||||
`PartSet <https://godoc.org/github.com/tendermint/tendermint/types#PartSet>`__,
|
||||
and the Merkle root hash of those pieces.
|
||||
|
||||
PartSet
|
||||
~~~~~~~
|
||||
|
||||
PartSet is used to split a byteslice of data into parts (pieces) for
|
||||
transmission. By splitting data into smaller parts and computing a
|
||||
Merkle root hash on the list, you can verify that a part is legitimately
|
||||
part of the complete data, and the part can be forwarded to other peers
|
||||
before all the parts are known. In short, it's a fast way to securely
|
||||
propagate a large chunk of data (like a block) over a gossip network.
|
||||
|
||||
PartSet was inspired by the LibSwift project.
|
||||
|
||||
Usage:
|
||||
|
||||
.. code:: go
|
||||
|
||||
data := RandBytes(2 << 20) // Something large
|
||||
|
||||
partSet := NewPartSetFromData(data)
|
||||
partSet.Total() // Total number of 4KB parts
|
||||
partSet.Count() // Equal to the Total, since we already have all the parts
|
||||
partSet.Hash() // The Merkle root hash
|
||||
partSet.BitArray() // A BitArray of partSet.Total() 1's
|
||||
|
||||
header := partSet.Header() // Send this to the peer
|
||||
header.Total // Total number of parts
|
||||
header.Hash // The merkle root hash
|
||||
|
||||
// Now we'll reconstruct the data from the parts
|
||||
partSet2 := NewPartSetFromHeader(header)
|
||||
partSet2.Total() // Same total as partSet.Total()
|
||||
partSet2.Count() // Zero, since this PartSet doesn't have any parts yet.
|
||||
partSet2.Hash() // Same hash as in partSet.Hash()
|
||||
partSet2.BitArray() // A BitArray of partSet.Total() 0's
|
||||
|
||||
// In a gossip network the parts would arrive in arbitrary order, perhaps
|
||||
// in response to explicit requests for parts, or optimistically in response
|
||||
// to the receiving peer's partSet.BitArray().
|
||||
for !partSet2.IsComplete() {
|
||||
part := receivePartFromGossipNetwork()
|
||||
added, err := partSet2.AddPart(part)
|
||||
if err != nil {
|
||||
// A wrong part,
|
||||
// the merkle trail does not hash to partSet2.Hash()
|
||||
} else if !added {
|
||||
// A duplicate part already received
|
||||
}
|
||||
}
|
||||
|
||||
data2, _ := ioutil.ReadAll(partSet2.GetReader())
|
||||
bytes.Equal(data, data2) // true
|
@ -1,349 +0,0 @@
|
||||
Byzantine Consensus Algorithm
|
||||
=============================
|
||||
|
||||
Terms
|
||||
-----
|
||||
|
||||
- The network is composed of optionally connected *nodes*. Nodes
|
||||
directly connected to a particular node are called *peers*.
|
||||
- The consensus process in deciding the next block (at some *height*
|
||||
``H``) is composed of one or many *rounds*.
|
||||
- ``NewHeight``, ``Propose``, ``Prevote``, ``Precommit``, and
|
||||
``Commit`` represent state machine states of a round. (aka
|
||||
``RoundStep`` or just "step").
|
||||
- A node is said to be *at* a given height, round, and step, or at
|
||||
``(H,R,S)``, or at ``(H,R)`` in short to omit the step.
|
||||
- To *prevote* or *precommit* something means to broadcast a `prevote
|
||||
vote <https://godoc.org/github.com/tendermint/tendermint/types#Vote>`__
|
||||
or `first precommit
|
||||
vote <https://godoc.org/github.com/tendermint/tendermint/types#FirstPrecommit>`__
|
||||
for something.
|
||||
- A vote *at* ``(H,R)`` is a vote signed with the bytes for ``H`` and
|
||||
``R`` included in its
|
||||
`sign-bytes <block-structure.html#vote-sign-bytes>`__.
|
||||
- *+2/3* is short for "more than 2/3"
|
||||
- *1/3+* is short for "1/3 or more"
|
||||
- A set of +2/3 of prevotes for a particular block or ``<nil>`` at
|
||||
``(H,R)`` is called a *proof-of-lock-change* or *PoLC* for short.
|
||||
|
||||
State Machine Overview
|
||||
----------------------
|
||||
|
||||
At each height of the blockchain a round-based protocol is run to
|
||||
determine the next block. Each round is composed of three *steps*
|
||||
(``Propose``, ``Prevote``, and ``Precommit``), along with two special
|
||||
steps ``Commit`` and ``NewHeight``.
|
||||
|
||||
In the optimal scenario, the order of steps is:
|
||||
|
||||
::
|
||||
|
||||
NewHeight -> (Propose -> Prevote -> Precommit)+ -> Commit -> NewHeight ->...
|
||||
|
||||
The sequence ``(Propose -> Prevote -> Precommit)`` is called a *round*.
|
||||
There may be more than one round required to commit a block at a given
|
||||
height. Examples for why more rounds may be required include:
|
||||
|
||||
- The designated proposer was not online.
|
||||
- The block proposed by the designated proposer was not valid.
|
||||
- The block proposed by the designated proposer did not propagate in
|
||||
time.
|
||||
- The block proposed was valid, but +2/3 of prevotes for the proposed
|
||||
block were not received in time for enough validator nodes by the
|
||||
time they reached the ``Precommit`` step. Even though +2/3 of
|
||||
prevotes are necessary to progress to the next step, at least one
|
||||
validator may have voted ``<nil>`` or maliciously voted for something
|
||||
else.
|
||||
- The block proposed was valid, and +2/3 of prevotes were received for
|
||||
enough nodes, but +2/3 of precommits for the proposed block were not
|
||||
received for enough validator nodes.
|
||||
|
||||
Some of these problems are resolved by moving onto the next round &
|
||||
proposer. Others are resolved by increasing certain round timeout
|
||||
parameters over each successive round.
|
||||
|
||||
State Machine Diagram
|
||||
---------------------
|
||||
|
||||
::
|
||||
|
||||
+-------------------------------------+
|
||||
v |(Wait til `CommmitTime+timeoutCommit`)
|
||||
+-----------+ +-----+-----+
|
||||
+----------> | Propose +--------------+ | NewHeight |
|
||||
| +-----------+ | +-----------+
|
||||
| | ^
|
||||
|(Else, after timeoutPrecommit) v |
|
||||
+-----+-----+ +-----------+ |
|
||||
| Precommit | <------------------------+ Prevote | |
|
||||
+-----+-----+ +-----------+ |
|
||||
|(When +2/3 Precommits for block found) |
|
||||
v |
|
||||
+--------------------------------------------------------------------+
|
||||
| Commit |
|
||||
| |
|
||||
| * Set CommitTime = now; |
|
||||
| * Wait for block, then stage/save/commit block; |
|
||||
+--------------------------------------------------------------------+
|
||||
|
||||
Background Gossip
|
||||
-----------------
|
||||
|
||||
A node may not have a corresponding validator private key, but it
|
||||
nevertheless plays an active role in the consensus process by relaying
|
||||
relevant meta-data, proposals, blocks, and votes to its peers. A node
|
||||
that has the private keys of an active validator and is engaged in
|
||||
signing votes is called a *validator-node*. All nodes (not just
|
||||
validator-nodes) have an associated state (the current height, round,
|
||||
and step) and work to make progress.
|
||||
|
||||
Between two nodes there exists a ``Connection``, and multiplexed on top
|
||||
of this connection are fairly throttled ``Channel``\ s of information.
|
||||
An epidemic gossip protocol is implemented among some of these channels
|
||||
to bring peers up to speed on the most recent state of consensus. For
|
||||
example,
|
||||
|
||||
- Nodes gossip ``PartSet`` parts of the current round's proposer's
|
||||
proposed block. A LibSwift inspired algorithm is used to quickly
|
||||
broadcast blocks across the gossip network.
|
||||
- Nodes gossip prevote/precommit votes. A node NODE\_A that is ahead of
|
||||
NODE\_B can send NODE\_B prevotes or precommits for NODE\_B's current
|
||||
(or future) round to enable it to progress forward.
|
||||
- Nodes gossip prevotes for the proposed PoLC (proof-of-lock-change)
|
||||
round if one is proposed.
|
||||
- Nodes gossip to nodes lagging in blockchain height with block
|
||||
`commits <https://godoc.org/github.com/tendermint/tendermint/types#Commit>`__
|
||||
for older blocks.
|
||||
- Nodes opportunistically gossip ``HasVote`` messages to hint peers
|
||||
what votes it already has.
|
||||
- Nodes broadcast their current state to all neighboring peers. (but is
|
||||
not gossiped further)
|
||||
|
||||
There's more, but let's not get ahead of ourselves here.
|
||||
|
||||
Proposals
|
||||
---------
|
||||
|
||||
A proposal is signed and published by the designated proposer at each
|
||||
round. The proposer is chosen by a deterministic and non-choking round
|
||||
robin selection algorithm that selects proposers in proportion to their
|
||||
voting power. (see
|
||||
`implementation <https://github.com/tendermint/tendermint/blob/develop/types/validator_set.go>`__)
|
||||
|
||||
A proposal at ``(H,R)`` is composed of a block and an optional latest
|
||||
``PoLC-Round < R`` which is included iff the proposer knows of one. This
|
||||
hints the network to allow nodes to unlock (when safe) to ensure the
|
||||
liveness property.
|
||||
|
||||
State Machine Spec
|
||||
------------------
|
||||
|
||||
Propose Step (height:H,round:R)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Upon entering ``Propose``: - The designated proposer proposes a block at
|
||||
``(H,R)``.
|
||||
|
||||
The ``Propose`` step ends: - After ``timeoutProposeR`` after entering
|
||||
``Propose``. --> goto ``Prevote(H,R)`` - After receiving proposal block
|
||||
and all prevotes at ``PoLC-Round``. --> goto ``Prevote(H,R)`` - After
|
||||
`common exit conditions <#common-exit-conditions>`__
|
||||
|
||||
Prevote Step (height:H,round:R)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Upon entering ``Prevote``, each validator broadcasts its prevote vote.
|
||||
|
||||
- First, if the validator is locked on a block since ``LastLockRound``
|
||||
but now has a PoLC for something else at round ``PoLC-Round`` where
|
||||
``LastLockRound < PoLC-Round < R``, then it unlocks.
|
||||
- If the validator is still locked on a block, it prevotes that.
|
||||
- Else, if the proposed block from ``Propose(H,R)`` is good, it
|
||||
prevotes that.
|
||||
- Else, if the proposal is invalid or wasn't received on time, it
|
||||
prevotes ``<nil>``.
|
||||
|
||||
The ``Prevote`` step ends: - After +2/3 prevotes for a particular block
|
||||
or ``<nil>``. --> goto ``Precommit(H,R)`` - After ``timeoutPrevote``
|
||||
after receiving any +2/3 prevotes. --> goto ``Precommit(H,R)`` - After
|
||||
`common exit conditions <#common-exit-conditions>`__
|
||||
|
||||
Precommit Step (height:H,round:R)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Upon entering ``Precommit``, each validator broadcasts its precommit
|
||||
vote. - If the validator has a PoLC at ``(H,R)`` for a particular block
|
||||
``B``, it (re)locks (or changes lock to) and precommits ``B`` and sets
|
||||
``LastLockRound = R``. - Else, if the validator has a PoLC at ``(H,R)``
|
||||
for ``<nil>``, it unlocks and precommits ``<nil>``. - Else, it keeps the
|
||||
lock unchanged and precommits ``<nil>``.
|
||||
|
||||
A precommit for ``<nil>`` means "I didn’t see a PoLC for this round, but
|
||||
I did get +2/3 prevotes and waited a bit".
|
||||
|
||||
The Precommit step ends: - After +2/3 precommits for ``<nil>``. --> goto
|
||||
``Propose(H,R+1)`` - After ``timeoutPrecommit`` after receiving any +2/3
|
||||
precommits. --> goto ``Propose(H,R+1)`` - After `common exit
|
||||
conditions <#common-exit-conditions>`__
|
||||
|
||||
common exit conditions
|
||||
^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
- After +2/3 precommits for a particular block. --> goto ``Commit(H)``
|
||||
- After any +2/3 prevotes received at ``(H,R+x)``. --> goto
|
||||
``Prevote(H,R+x)``
|
||||
- After any +2/3 precommits received at ``(H,R+x)``. --> goto
|
||||
``Precommit(H,R+x)``
|
||||
|
||||
Commit Step (height:H)
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- Set ``CommitTime = now()``
|
||||
- Wait until block is received. --> goto ``NewHeight(H+1)``
|
||||
|
||||
NewHeight Step (height:H)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- Move ``Precommits`` to ``LastCommit`` and increment height.
|
||||
- Set ``StartTime = CommitTime+timeoutCommit``
|
||||
- Wait until ``StartTime`` to receive straggler commits. --> goto
|
||||
``Propose(H,0)``
|
||||
|
||||
Proofs
|
||||
------
|
||||
|
||||
Proof of Safety
|
||||
~~~~~~~~~~~~~~~
|
||||
|
||||
Assume that at most -1/3 of the voting power of validators is byzantine.
|
||||
If a validator commits block ``B`` at round ``R``, it's because it saw
|
||||
+2/3 of precommits at round ``R``. This implies that 1/3+ of honest
|
||||
nodes are still locked at round ``R' > R``. These locked validators will
|
||||
remain locked until they see a PoLC at ``R' > R``, but this won't happen
|
||||
because 1/3+ are locked and honest, so at most -2/3 are available to
|
||||
vote for anything other than ``B``.
|
||||
|
||||
Proof of Liveness
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
If 1/3+ honest validators are locked on two different blocks from
|
||||
different rounds, a proposers' ``PoLC-Round`` will eventually cause
|
||||
nodes locked from the earlier round to unlock. Eventually, the
|
||||
designated proposer will be one that is aware of a PoLC at the later
|
||||
round. Also, ``timeoutProposalR`` increments with round ``R``, while the
|
||||
size of a proposal are capped, so eventually the network is able to
|
||||
"fully gossip" the whole proposal (e.g. the block & PoLC).
|
||||
|
||||
Proof of Fork Accountability
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Define the JSet (justification-vote-set) at height ``H`` of a validator
|
||||
``V1`` to be all the votes signed by the validator at ``H`` along with
|
||||
justification PoLC prevotes for each lock change. For example, if ``V1``
|
||||
signed the following precommits: ``Precommit(B1 @ round 0)``,
|
||||
``Precommit(<nil> @ round 1)``, ``Precommit(B2 @ round 4)`` (note that
|
||||
no precommits were signed for rounds 2 and 3, and that's ok),
|
||||
``Precommit(B1 @ round 0)`` must be justified by a PoLC at round 0, and
|
||||
``Precommit(B2 @ round 4)`` must be justified by a PoLC at round 4; but
|
||||
the precommit for ``<nil>`` at round 1 is not a lock-change by
|
||||
definition so the JSet for ``V1`` need not include any prevotes at round
|
||||
1, 2, or 3 (unless ``V1`` happened to have prevoted for those rounds).
|
||||
|
||||
Further, define the JSet at height ``H`` of a set of validators ``VSet``
|
||||
to be the union of the JSets for each validator in ``VSet``. For a given
|
||||
commit by honest validators at round ``R`` for block ``B`` we can
|
||||
construct a JSet to justify the commit for ``B`` at ``R``. We say that a
|
||||
JSet *justifies* a commit at ``(H,R)`` if all the committers (validators
|
||||
in the commit-set) are each justified in the JSet with no duplicitous
|
||||
vote signatures (by the committers).
|
||||
|
||||
- **Lemma**: When a fork is detected by the existence of two
|
||||
conflicting `commits <./validators.html#commiting-a-block>`__,
|
||||
the union of the JSets for both commits (if they can be compiled)
|
||||
must include double-signing by at least 1/3+ of the validator set.
|
||||
**Proof**: The commit cannot be at the same round, because that would
|
||||
immediately imply double-signing by 1/3+. Take the union of the JSets
|
||||
of both commits. If there is no double-signing by at least 1/3+ of
|
||||
the validator set in the union, then no honest validator could have
|
||||
precommitted any different block after the first commit. Yet, +2/3
|
||||
did. Reductio ad absurdum.
|
||||
|
||||
As a corollary, when there is a fork, an external process can determine
|
||||
the blame by requiring each validator to justify all of its round votes.
|
||||
Either we will find 1/3+ who cannot justify at least one of their votes,
|
||||
and/or, we will find 1/3+ who had double-signed.
|
||||
|
||||
Alternative algorithm
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Alternatively, we can take the JSet of a commit to be the "full commit".
|
||||
That is, if light clients and validators do not consider a block to be
|
||||
committed unless the JSet of the commit is also known, then we get the
|
||||
desirable property that if there ever is a fork (e.g. there are two
|
||||
conflicting "full commits"), then 1/3+ of the validators are immediately
|
||||
punishable for double-signing.
|
||||
|
||||
There are many ways to ensure that the gossip network efficiently share
|
||||
the JSet of a commit. One solution is to add a new message type that
|
||||
tells peers that this node has (or does not have) a +2/3 majority for B
|
||||
(or ) at (H,R), and a bitarray of which votes contributed towards that
|
||||
majority. Peers can react by responding with appropriate votes.
|
||||
|
||||
We will implement such an algorithm for the next iteration of the
|
||||
Tendermint consensus protocol.
|
||||
|
||||
Other potential improvements include adding more data in votes such as
|
||||
the last known PoLC round that caused a lock change, and the last voted
|
||||
round/step (or, we may require that validators not skip any votes). This
|
||||
may make JSet verification/gossip logic easier to implement.
|
||||
|
||||
Censorship Attacks
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Due to the definition of a block
|
||||
`commit <validators.html#commiting-a-block>`__, any 1/3+
|
||||
coalition of validators can halt the blockchain by not broadcasting
|
||||
their votes. Such a coalition can also censor particular transactions by
|
||||
rejecting blocks that include these transactions, though this would
|
||||
result in a significant proportion of block proposals to be rejected,
|
||||
which would slow down the rate of block commits of the blockchain,
|
||||
reducing its utility and value. The malicious coalition might also
|
||||
broadcast votes in a trickle so as to grind blockchain block commits to
|
||||
a near halt, or engage in any combination of these attacks.
|
||||
|
||||
If a global active adversary were also involved, it can partition the
|
||||
network in such a way that it may appear that the wrong subset of
|
||||
validators were responsible for the slowdown. This is not just a
|
||||
limitation of Tendermint, but rather a limitation of all consensus
|
||||
protocols whose network is potentially controlled by an active
|
||||
adversary.
|
||||
|
||||
Overcoming Forks and Censorship Attacks
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
For these types of attacks, a subset of the validators through external
|
||||
means should coordinate to sign a reorg-proposal that chooses a fork
|
||||
(and any evidence thereof) and the initial subset of validators with
|
||||
their signatures. Validators who sign such a reorg-proposal forego its
|
||||
collateral on all other forks. Clients should verify the signatures on
|
||||
the reorg-proposal, verify any evidence, and make a judgement or prompt
|
||||
the end-user for a decision. For example, a phone wallet app may prompt
|
||||
the user with a security warning, while a refrigerator may accept any
|
||||
reorg-proposal signed by +1/2 of the original validators.
|
||||
|
||||
No non-synchronous Byzantine fault-tolerant algorithm can come to
|
||||
consensus when 1/3+ of validators are dishonest, yet a fork assumes that
|
||||
1/3+ of validators have already been dishonest by double-signing or
|
||||
lock-changing without justification. So, signing the reorg-proposal is a
|
||||
coordination problem that cannot be solved by any non-synchronous
|
||||
protocol (i.e. automatically, and without making assumptions about the
|
||||
reliability of the underlying network). It must be provided by means
|
||||
external to the weakly-synchronous Tendermint consensus algorithm. For
|
||||
now, we leave the problem of reorg-proposal coordination to human
|
||||
coordination via internet media. Validators must take care to ensure
|
||||
that there are no significant network partitions, to avoid situations
|
||||
where two conflicting reorg-proposals are signed.
|
||||
|
||||
Assuming that the external coordination medium and protocol is robust,
|
||||
it follows that forks are less of a concern than `censorship
|
||||
attacks <#censorship-attacks>`__.
|
@ -1,70 +0,0 @@
|
||||
Corruption
|
||||
==========
|
||||
|
||||
Important step
|
||||
--------------
|
||||
|
||||
Make sure you have a backup of the Tendermint data directory.
|
||||
|
||||
Possible causes
|
||||
---------------
|
||||
|
||||
Remember that most corruption is caused by hardware issues:
|
||||
|
||||
- RAID controllers with faulty / worn out battery backup, and an unexpected power loss
|
||||
- Hard disk drives with write-back cache enabled, and an unexpected power loss
|
||||
- Cheap SSDs with insufficient power-loss protection, and an unexpected power-loss
|
||||
- Defective RAM
|
||||
- Defective or overheating CPU(s)
|
||||
|
||||
Other causes can be:
|
||||
|
||||
- Database systems configured with fsync=off and an OS crash or power loss
|
||||
- Filesystems configured to use write barriers plus a storage layer that ignores write barriers. LVM is a particular culprit.
|
||||
- Tendermint bugs
|
||||
- Operating system bugs
|
||||
- Admin error
|
||||
- directly modifying Tendermint data-directory contents
|
||||
|
||||
(Source: https://wiki.postgresql.org/wiki/Corruption)
|
||||
|
||||
WAL Corruption
|
||||
--------------
|
||||
|
||||
If consensus WAL is corrupted at the lastest height and you are trying to start
|
||||
Tendermint, replay will fail with panic.
|
||||
|
||||
Recovering from data corruption can be hard and time-consuming. Here are two approaches you can take:
|
||||
|
||||
1) Delete the WAL file and restart Tendermint. It will attempt to sync with other peers.
|
||||
2) Try to repair the WAL file manually:
|
||||
|
||||
1. Create a backup of the corrupted WAL file:
|
||||
|
||||
.. code:: bash
|
||||
|
||||
cp "$TMHOME/data/cs.wal/wal" > /tmp/corrupted_wal_backup
|
||||
|
||||
2. Use ./scripts/wal2json to create a human-readable version
|
||||
|
||||
.. code:: bash
|
||||
|
||||
./scripts/wal2json/wal2json "$TMHOME/data/cs.wal/wal" > /tmp/corrupted_wal
|
||||
|
||||
3. Search for a "CORRUPTED MESSAGE" line.
|
||||
4. By looking at the previous message and the message after the corrupted one
|
||||
and looking at the logs, try to rebuild the message. If the consequent
|
||||
messages are marked as corrupted too (this may happen if length header
|
||||
got corrupted or some writes did not make it to the WAL ~ truncation),
|
||||
then remove all the lines starting from the corrupted one and restart
|
||||
Tendermint.
|
||||
|
||||
.. code:: bash
|
||||
|
||||
$EDITOR /tmp/corrupted_wal
|
||||
|
||||
5. After editing, convert this file back into binary form by running:
|
||||
|
||||
.. code:: bash
|
||||
|
||||
./scripts/json2wal/json2wal /tmp/corrupted_wal > "$TMHOME/data/cs.wal/wal"
|
@ -1,71 +0,0 @@
|
||||
Genesis
|
||||
=======
|
||||
|
||||
The genesis.json file in ``$TMHOME/config`` defines the initial TendermintCore
|
||||
state upon genesis of the blockchain (`see
|
||||
definition <https://github.com/tendermint/tendermint/blob/master/types/genesis.go>`__).
|
||||
|
||||
Fields
|
||||
~~~~~~
|
||||
|
||||
- ``genesis_time``: Official time of blockchain start.
|
||||
- ``chain_id``: ID of the blockchain. This must be unique for every
|
||||
blockchain. If your testnet blockchains do not have unique chain IDs,
|
||||
you will have a bad time.
|
||||
- ``validators``:
|
||||
- ``pub_key``: The first element specifies the pub\_key type. 1 ==
|
||||
Ed25519. The second element are the pubkey bytes.
|
||||
- ``power``: The validator's voting power.
|
||||
- ``name``: Name of the validator (optional).
|
||||
- ``app_hash``: The expected application hash (as returned by the
|
||||
``ResponseInfo`` ABCI message) upon genesis. If the app's hash does not
|
||||
match, Tendermint will panic.
|
||||
- ``app_state``: The application state (e.g. initial distribution of tokens).
|
||||
|
||||
Sample genesis.json
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. code:: json
|
||||
|
||||
{
|
||||
"genesis_time": "2016-02-05T06:02:31.526Z",
|
||||
"chain_id": "chain-tTH4mi",
|
||||
"validators": [
|
||||
{
|
||||
"pub_key": [
|
||||
1,
|
||||
"9BC5112CB9614D91CE423FA8744885126CD9D08D9FC9D1F42E552D662BAA411E"
|
||||
],
|
||||
"power": 1,
|
||||
"name": "mach1"
|
||||
},
|
||||
{
|
||||
"pub_key": [
|
||||
1,
|
||||
"F46A5543D51F31660D9F59653B4F96061A740FF7433E0DC1ECBC30BE8494DE06"
|
||||
],
|
||||
"power": 1,
|
||||
"name": "mach2"
|
||||
},
|
||||
{
|
||||
"pub_key": [
|
||||
1,
|
||||
"0E7B423C1635FD07C0FC3603B736D5D27953C1C6CA865BB9392CD79DE1A682BB"
|
||||
],
|
||||
"power": 1,
|
||||
"name": "mach3"
|
||||
},
|
||||
{
|
||||
"pub_key": [
|
||||
1,
|
||||
"4F49237B9A32EB50682EDD83C48CE9CDB1D02A7CFDADCFF6EC8C1FAADB358879"
|
||||
],
|
||||
"power": 1,
|
||||
"name": "mach4"
|
||||
}
|
||||
],
|
||||
"app_hash": "15005165891224E721CB664D15CB972240F5703F",
|
||||
"app_state": {
|
||||
{"account": "Bob", "coins": 5000}
|
||||
}
|
||||
}
|
@ -1,33 +0,0 @@
|
||||
Light Client Protocol
|
||||
=====================
|
||||
|
||||
Light clients are an important part of the complete blockchain system
|
||||
for most applications. Tendermint provides unique speed and security
|
||||
properties for light client applications.
|
||||
|
||||
See our `lite package
|
||||
<https://godoc.org/github.com/tendermint/tendermint/lite>`__.
|
||||
|
||||
Overview
|
||||
--------
|
||||
|
||||
The objective of the light client protocol is to get a
|
||||
`commit <./validators.html#committing-a-block>`__ for a recent
|
||||
`block hash <./block-structure.html#block-hash>`__ where the commit
|
||||
includes a majority of signatures from the last known validator set.
|
||||
From there, all the application state is verifiable with `merkle
|
||||
proofs <./merkle.html#iavl-tree>`__.
|
||||
|
||||
Properties
|
||||
----------
|
||||
|
||||
- You get the full collateralized security benefits of Tendermint; No
|
||||
need to wait for confirmations.
|
||||
- You get the full speed benefits of Tendermint; transactions commit
|
||||
instantly.
|
||||
- You can get the most recent version of the application state
|
||||
non-interactively (without committing anything to the blockchain).
|
||||
For example, this means that you can get the most recent value of a
|
||||
name from the name-registry without worrying about fork censorship
|
||||
attacks, without posting a commit and waiting for confirmations. It's
|
||||
fast, secure, and free!
|
@ -1,88 +0,0 @@
|
||||
Merkle
|
||||
======
|
||||
|
||||
For an overview of Merkle trees, see
|
||||
`wikipedia <https://en.wikipedia.org/wiki/Merkle_tree>`__.
|
||||
|
||||
There are two types of Merkle trees used in Tendermint.
|
||||
|
||||
- **IAVL+ Tree**: An immutable self-balancing binary
|
||||
tree for persistent application state
|
||||
- **Simple Tree**: A simple compact binary tree for
|
||||
a static list of items
|
||||
|
||||
IAVL+ Tree
|
||||
----------
|
||||
|
||||
The purpose of this data structure is to provide persistent storage for
|
||||
key-value pairs (e.g. account state, name-registrar data, and
|
||||
per-contract data) such that a deterministic merkle root hash can be
|
||||
computed. The tree is balanced using a variant of the `AVL
|
||||
algorithm <http://en.wikipedia.org/wiki/AVL_tree>`__ so all operations
|
||||
are O(log(n)).
|
||||
|
||||
Nodes of this tree are immutable and indexed by its hash. Thus any node
|
||||
serves as an immutable snapshot which lets us stage uncommitted
|
||||
transactions from the mempool cheaply, and we can instantly roll back to
|
||||
the last committed state to process transactions of a newly committed
|
||||
block (which may not be the same set of transactions as those from the
|
||||
mempool).
|
||||
|
||||
In an AVL tree, the heights of the two child subtrees of any node differ
|
||||
by at most one. Whenever this condition is violated upon an update, the
|
||||
tree is rebalanced by creating O(log(n)) new nodes that point to
|
||||
unmodified nodes of the old tree. In the original AVL algorithm, inner
|
||||
nodes can also hold key-value pairs. The AVL+ algorithm (note the plus)
|
||||
modifies the AVL algorithm to keep all values on leaf nodes, while only
|
||||
using branch-nodes to store keys. This simplifies the algorithm while
|
||||
minimizing the size of merkle proofs
|
||||
|
||||
In Ethereum, the analog is the `Patricia
|
||||
trie <http://en.wikipedia.org/wiki/Radix_tree>`__. There are tradeoffs.
|
||||
Keys do not need to be hashed prior to insertion in IAVL+ trees, so this
|
||||
provides faster iteration in the key space which may benefit some
|
||||
applications. The logic is simpler to implement, requiring only two
|
||||
types of nodes -- inner nodes and leaf nodes. The IAVL+ tree is a binary
|
||||
tree, so merkle proofs are much shorter than the base 16 Patricia trie.
|
||||
On the other hand, while IAVL+ trees provide a deterministic merkle root
|
||||
hash, it depends on the order of updates. In practice this shouldn't be
|
||||
a problem, since you can efficiently encode the tree structure when
|
||||
serializing the tree contents.
|
||||
|
||||
Simple Tree
|
||||
-----------
|
||||
|
||||
For merkelizing smaller static lists, use the Simple Tree. The
|
||||
transactions and validation signatures of a block are hashed using this
|
||||
simple merkle tree logic.
|
||||
|
||||
If the number of items is not a power of two, the tree will not be full
|
||||
and some leaf nodes will be at different levels. Simple Tree tries to
|
||||
keep both sides of the tree the same size, but the left side may be one
|
||||
greater.
|
||||
|
||||
::
|
||||
|
||||
Simple Tree with 6 items Simple Tree with 7 items
|
||||
|
||||
* *
|
||||
/ \ / \
|
||||
/ \ / \
|
||||
/ \ / \
|
||||
/ \ / \
|
||||
* * * *
|
||||
/ \ / \ / \ / \
|
||||
/ \ / \ / \ / \
|
||||
/ \ / \ / \ / \
|
||||
* h2 * h5 * * * h6
|
||||
/ \ / \ / \ / \ / \
|
||||
h0 h1 h3 h4 h0 h1 h2 h3 h4 h5
|
||||
|
||||
Simple Tree with Dictionaries
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The Simple Tree is used to merkelize a list of items, so to merkelize a
|
||||
(short) dictionary of key-value pairs, encode the dictionary as an
|
||||
ordered list of ``KVPair`` structs. The block hash is such a hash
|
||||
derived from all the fields of the block ``Header``. The state hash is
|
||||
similarly derived.
|
@ -1 +0,0 @@
|
||||
Spec moved to [docs/spec](https://github.com/tendermint/tendermint/tree/master/docs/spec).
|
@ -1,172 +0,0 @@
|
||||
Wire Protocol
|
||||
=============
|
||||
|
||||
The `Tendermint wire protocol <https://github.com/tendermint/go-wire>`__
|
||||
encodes data in `c-style binary <#binary>`__ and `JSON <#json>`__ form.
|
||||
|
||||
Supported types
|
||||
---------------
|
||||
|
||||
- Primitive types
|
||||
- ``uint8`` (aka ``byte``), ``uint16``, ``uint32``, ``uint64``
|
||||
- ``int8``, ``int16``, ``int32``, ``int64``
|
||||
- ``uint``, ``int``: variable length (un)signed integers
|
||||
- ``string``, ``[]byte``
|
||||
- ``time``
|
||||
- Derived types
|
||||
- structs
|
||||
- var-length arrays of a particular type
|
||||
- fixed-length arrays of a particular type
|
||||
- interfaces: registered union types preceded by a ``type byte``
|
||||
- pointers
|
||||
|
||||
Binary
|
||||
------
|
||||
|
||||
**Fixed-length primitive types** are encoded with 1,2,3, or 4 big-endian
|
||||
bytes. - ``uint8`` (aka ``byte``), ``uint16``, ``uint32``, ``uint64``:
|
||||
takes 1,2,3, and 4 bytes respectively - ``int8``, ``int16``, ``int32``,
|
||||
``int64``: takes 1,2,3, and 4 bytes respectively - ``time``: ``int64``
|
||||
representation of nanoseconds since epoch
|
||||
|
||||
**Variable-length integers** are encoded with a single leading byte
|
||||
representing the length of the following big-endian bytes. For signed
|
||||
negative integers, the most significant bit of the leading byte is a 1.
|
||||
|
||||
- ``uint``: 1-byte length prefixed variable-size (0 ~ 255 bytes)
|
||||
unsigned integers
|
||||
- ``int``: 1-byte length prefixed variable-size (0 ~ 127 bytes) signed
|
||||
integers
|
||||
|
||||
NOTE: While the number 0 (zero) is encoded with a single byte ``x00``,
|
||||
the number 1 (one) takes two bytes to represent: ``x0101``. This isn't
|
||||
the most efficient representation, but the rules are easier to remember.
|
||||
|
||||
+---------------+----------------+----------------+
|
||||
| number | binary | binary ``int`` |
|
||||
| | ``uint`` | |
|
||||
+===============+================+================+
|
||||
| 0 | ``x00`` | ``x00`` |
|
||||
+---------------+----------------+----------------+
|
||||
| 1 | ``x0101`` | ``x0101`` |
|
||||
+---------------+----------------+----------------+
|
||||
| 2 | ``x0102`` | ``x0102`` |
|
||||
+---------------+----------------+----------------+
|
||||
| 256 | ``x020100`` | ``x020100`` |
|
||||
+---------------+----------------+----------------+
|
||||
| 2^(127\ *8)-1 | ``x800100...`` | overflow |
|
||||
| \| | | |
|
||||
| ``x7FFFFF...` | | |
|
||||
| ` | | |
|
||||
| \| | | |
|
||||
| ``x7FFFFF...` | | |
|
||||
| ` | | |
|
||||
| \| \| | | |
|
||||
| 2^(127*\ 8) | | |
|
||||
+---------------+----------------+----------------+
|
||||
| 2^(255\*8)-1 |
|
||||
| \| |
|
||||
| ``xFFFFFF...` |
|
||||
| ` |
|
||||
| \| overflow |
|
||||
| \| \| -1 \| |
|
||||
| n/a \| |
|
||||
| ``x8101`` \| |
|
||||
| \| -2 \| n/a |
|
||||
| \| ``x8102`` |
|
||||
| \| \| -256 \| |
|
||||
| n/a \| |
|
||||
| ``x820100`` |
|
||||
| \| |
|
||||
+---------------+----------------+----------------+
|
||||
|
||||
**Structures** are encoded by encoding the field values in order of
|
||||
declaration.
|
||||
|
||||
.. code:: go
|
||||
|
||||
type Foo struct {
|
||||
MyString string
|
||||
MyUint32 uint32
|
||||
}
|
||||
var foo = Foo{"626172", math.MaxUint32}
|
||||
|
||||
/* The binary representation of foo:
|
||||
0103626172FFFFFFFF
|
||||
0103: `int` encoded length of string, here 3
|
||||
626172: 3 bytes of string "bar"
|
||||
FFFFFFFF: 4 bytes of uint32 MaxUint32
|
||||
*/
|
||||
|
||||
**Variable-length arrays** are encoded with a leading ``int`` denoting
|
||||
the length of the array followed by the binary representation of the
|
||||
items. **Fixed-length arrays** are similar but aren't preceded by the
|
||||
leading ``int``.
|
||||
|
||||
.. code:: go
|
||||
|
||||
foos := []Foo{foo, foo}
|
||||
|
||||
/* The binary representation of foos:
|
||||
01020103626172FFFFFFFF0103626172FFFFFFFF
|
||||
0102: `int` encoded length of array, here 2
|
||||
0103626172FFFFFFFF: the first `foo`
|
||||
0103626172FFFFFFFF: the second `foo`
|
||||
*/
|
||||
|
||||
foos := [2]Foo{foo, foo} // fixed-length array
|
||||
|
||||
/* The binary representation of foos:
|
||||
0103626172FFFFFFFF0103626172FFFFFFFF
|
||||
0103626172FFFFFFFF: the first `foo`
|
||||
0103626172FFFFFFFF: the second `foo`
|
||||
*/
|
||||
|
||||
**Interfaces** can represent one of any number of concrete types. The
|
||||
concrete types of an interface must first be declared with their
|
||||
corresponding ``type byte``. An interface is then encoded with the
|
||||
leading ``type byte``, then the binary encoding of the underlying
|
||||
concrete type.
|
||||
|
||||
NOTE: The byte ``x00`` is reserved for the ``nil`` interface value and
|
||||
``nil`` pointer values.
|
||||
|
||||
.. code:: go
|
||||
|
||||
type Animal interface{}
|
||||
type Dog uint32
|
||||
type Cat string
|
||||
|
||||
RegisterInterface(
|
||||
struct{ Animal }{}, // Convenience for referencing the 'Animal' interface
|
||||
ConcreteType{Dog(0), 0x01}, // Register the byte 0x01 to denote a Dog
|
||||
ConcreteType{Cat(""), 0x02}, // Register the byte 0x02 to denote a Cat
|
||||
)
|
||||
|
||||
var animal Animal = Dog(02)
|
||||
|
||||
/* The binary representation of animal:
|
||||
010102
|
||||
01: the type byte for a `Dog`
|
||||
0102: the bytes of Dog(02)
|
||||
*/
|
||||
|
||||
**Pointers** are encoded with a single leading byte ``x00`` for ``nil``
|
||||
pointers, otherwise encoded with a leading byte ``x01`` followed by the
|
||||
binary encoding of the value pointed to.
|
||||
|
||||
NOTE: It's easy to convert pointer types into interface types, since the
|
||||
``type byte`` ``x00`` is always ``nil``.
|
||||
|
||||
JSON
|
||||
----
|
||||
|
||||
The JSON codec is compatible with the ```binary`` <#binary>`__ codec,
|
||||
and is fairly intuitive if you're already familiar with golang's JSON
|
||||
encoding. Some quirks are noted below:
|
||||
|
||||
- variable-length and fixed-length bytes are encoded as uppercase
|
||||
hexadecimal strings
|
||||
- interface values are encoded as an array of two items:
|
||||
``[type_byte, concrete_value]``
|
||||
- times are encoded as rfc2822 strings
|
206
docs/tendermint-core/block-structure.md
Normal file
206
docs/tendermint-core/block-structure.md
Normal file
@ -0,0 +1,206 @@
|
||||
# Block Structure
|
||||
|
||||
The tendermint consensus engine records all agreements by a
|
||||
supermajority of nodes into a blockchain, which is replicated among all
|
||||
nodes. This blockchain is accessible via various rpc endpoints, mainly
|
||||
`/block?height=` to get the full block, as well as
|
||||
`/blockchain?minHeight=_&maxHeight=_` to get a list of headers. But what
|
||||
exactly is stored in these blocks?
|
||||
|
||||
## Block
|
||||
|
||||
A
|
||||
[Block](https://godoc.org/github.com/tendermint/tendermint/types#Block)
|
||||
contains:
|
||||
|
||||
- a [Header](#header) contains merkle hashes for various chain states
|
||||
- the
|
||||
[Data](https://godoc.org/github.com/tendermint/tendermint/types#Data)
|
||||
is all transactions which are to be processed
|
||||
- the [LastCommit](#commit) > 2/3 signatures for the last block
|
||||
|
||||
The signatures returned along with block `H` are those validating block
|
||||
`H-1`. This can be a little confusing, but we must also consider that
|
||||
the `Header` also contains the `LastCommitHash`. It would be impossible
|
||||
for a Header to include the commits that sign it, as it would cause an
|
||||
infinite loop here. But when we get block `H`, we find
|
||||
`Header.LastCommitHash`, which must match the hash of `LastCommit`.
|
||||
|
||||
## Header
|
||||
|
||||
The
|
||||
[Header](https://godoc.org/github.com/tendermint/tendermint/types#Header)
|
||||
contains lots of information (follow link for up-to-date info). Notably,
|
||||
it maintains the `Height`, the `LastBlockID` (to make it a chain), and
|
||||
hashes of the data, the app state, and the validator set. This is
|
||||
important as the only item that is signed by the validators is the
|
||||
`Header`, and all other data must be validated against one of the merkle
|
||||
hashes in the `Header`.
|
||||
|
||||
The `DataHash` can provide a nice check on the
|
||||
[Data](https://godoc.org/github.com/tendermint/tendermint/types#Data)
|
||||
returned in this same block. If you are subscribed to new blocks, via
|
||||
tendermint RPC, in order to display or process the new transactions you
|
||||
should at least validate that the `DataHash` is valid. If it is
|
||||
important to verify autheniticity, you must wait for the `LastCommit`
|
||||
from the next block to make sure the block header (including `DataHash`)
|
||||
was properly signed.
|
||||
|
||||
The `ValidatorHash` contains a hash of the current
|
||||
[Validators](https://godoc.org/github.com/tendermint/tendermint/types#Validator).
|
||||
Tracking all changes in the validator set is complex, but a client can
|
||||
quickly compare this hash with the [hash of the currently known
|
||||
validators](https://godoc.org/github.com/tendermint/tendermint/types#ValidatorSet.Hash)
|
||||
to see if there have been changes.
|
||||
|
||||
The `AppHash` serves as the basis for validating any merkle proofs that
|
||||
come from the ABCI application. It represents the state of the actual
|
||||
application, rather that the state of the blockchain itself. This means
|
||||
it's necessary in order to perform any business logic, such as verifying
|
||||
an account balance.
|
||||
|
||||
**Note** After the transactions are committed to a block, they still
|
||||
need to be processed in a separate step, which happens between the
|
||||
blocks. If you find a given transaction in the block at height `H`, the
|
||||
effects of running that transaction will be first visible in the
|
||||
`AppHash` from the block header at height `H+1`.
|
||||
|
||||
Like the `LastCommit` issue, this is a requirement of the immutability
|
||||
of the block chain, as the application only applies transactions *after*
|
||||
they are commited to the chain.
|
||||
|
||||
## Commit
|
||||
|
||||
The
|
||||
[Commit](https://godoc.org/github.com/tendermint/tendermint/types#Commit)
|
||||
contains a set of
|
||||
[Votes](https://godoc.org/github.com/tendermint/tendermint/types#Vote)
|
||||
that were made by the validator set to reach consensus on this block.
|
||||
This is the key to the security in any PoS system, and actually no data
|
||||
that cannot be traced back to a block header with a valid set of Votes
|
||||
can be trusted. Thus, getting the Commit data and verifying the votes is
|
||||
extremely important.
|
||||
|
||||
As mentioned above, in order to find the `precommit votes` for block
|
||||
header `H`, we need to query block `H+1`. Then we need to check the
|
||||
votes, make sure they really are for that block, and properly formatted.
|
||||
Much of this code is implemented in Go in the
|
||||
[light-client](https://github.com/tendermint/light-client) package. If
|
||||
you look at the code, you will notice that we need to provide the
|
||||
`chainID` of the blockchain in order to properly calculate the votes.
|
||||
This is to protect anyone from swapping votes between chains to fake (or
|
||||
frame) a validator. Also note that this `chainID` is in the
|
||||
`genesis.json` from *Tendermint*, not the `genesis.json` from the
|
||||
basecoin app ([that is a different
|
||||
chainID...](https://github.com/cosmos/cosmos-sdk/issues/32)).
|
||||
|
||||
Once we have those votes, and we calculated the proper [sign
|
||||
bytes](https://godoc.org/github.com/tendermint/tendermint/types#Vote.WriteSignBytes)
|
||||
using the chainID and a [nice helper
|
||||
function](https://godoc.org/github.com/tendermint/tendermint/types#SignBytes),
|
||||
we can verify them. The light client is responsible for maintaining a
|
||||
set of validators that we trust. Each vote only stores the validators
|
||||
`Address`, as well as the `Signature`. Assuming we have a local copy of
|
||||
the trusted validator set, we can look up the `Public Key` of the
|
||||
validator given its `Address`, then verify that the `Signature` matches
|
||||
the `SignBytes` and `Public Key`. Then we sum up the total voting power
|
||||
of all validators, whose votes fulfilled all these stringent
|
||||
requirements. If the total number of voting power for a single block is
|
||||
greater than 2/3 of all voting power, then we can finally trust the
|
||||
block header, the AppHash, and the proof we got from the ABCI
|
||||
application.
|
||||
|
||||
### Vote Sign Bytes
|
||||
|
||||
The `sign-bytes` of a vote is produced by taking a
|
||||
[stable-json](https://github.com/substack/json-stable-stringify)-like
|
||||
deterministic JSON [wire](./wire-protocol.html) encoding of the vote
|
||||
(excluding the `Signature` field), and wrapping it with
|
||||
`{"chain_id":"my_chain","vote":...}`.
|
||||
|
||||
For example, a precommit vote might have the following `sign-bytes`:
|
||||
|
||||
```
|
||||
{"chain_id":"my_chain","vote":{"block_hash":"611801F57B4CE378DF1A3FFF1216656E89209A99","block_parts_header":{"hash":"B46697379DBE0774CC2C3B656083F07CA7E0F9CE","total":123},"height":1234,"round":1,"type":2}}
|
||||
```
|
||||
|
||||
## Block Hash
|
||||
|
||||
The [block
|
||||
hash](https://godoc.org/github.com/tendermint/tendermint/types#Block.Hash)
|
||||
is the [Simple Tree hash](./merkle.html#simple-tree-with-dictionaries)
|
||||
of the fields of the block `Header` encoded as a list of `KVPair`s.
|
||||
|
||||
## Transaction
|
||||
|
||||
A transaction is any sequence of bytes. It is up to your ABCI
|
||||
application to accept or reject transactions.
|
||||
|
||||
## BlockID
|
||||
|
||||
Many of these data structures refer to the
|
||||
[BlockID](https://godoc.org/github.com/tendermint/tendermint/types#BlockID),
|
||||
which is the `BlockHash` (hash of the block header, also referred to by
|
||||
the next block) along with the `PartSetHeader`. The `PartSetHeader` is
|
||||
explained below and is used internally to orchestrate the p2p
|
||||
propogation. For clients, it is basically opaque bytes, but they must
|
||||
match for all votes.
|
||||
|
||||
## PartSetHeader
|
||||
|
||||
The
|
||||
[PartSetHeader](https://godoc.org/github.com/tendermint/tendermint/types#PartSetHeader)
|
||||
contains the total number of pieces in a
|
||||
[PartSet](https://godoc.org/github.com/tendermint/tendermint/types#PartSet),
|
||||
and the Merkle root hash of those pieces.
|
||||
|
||||
## PartSet
|
||||
|
||||
PartSet is used to split a byteslice of data into parts (pieces) for
|
||||
transmission. By splitting data into smaller parts and computing a
|
||||
Merkle root hash on the list, you can verify that a part is legitimately
|
||||
part of the complete data, and the part can be forwarded to other peers
|
||||
before all the parts are known. In short, it's a fast way to securely
|
||||
propagate a large chunk of data (like a block) over a gossip network.
|
||||
|
||||
PartSet was inspired by the LibSwift project.
|
||||
|
||||
Usage:
|
||||
|
||||
```
|
||||
data := RandBytes(2 << 20) // Something large
|
||||
|
||||
partSet := NewPartSetFromData(data)
|
||||
partSet.Total() // Total number of 4KB parts
|
||||
partSet.Count() // Equal to the Total, since we already have all the parts
|
||||
partSet.Hash() // The Merkle root hash
|
||||
partSet.BitArray() // A BitArray of partSet.Total() 1's
|
||||
|
||||
header := partSet.Header() // Send this to the peer
|
||||
header.Total // Total number of parts
|
||||
header.Hash // The merkle root hash
|
||||
|
||||
// Now we'll reconstruct the data from the parts
|
||||
partSet2 := NewPartSetFromHeader(header)
|
||||
partSet2.Total() // Same total as partSet.Total()
|
||||
partSet2.Count() // Zero, since this PartSet doesn't have any parts yet.
|
||||
partSet2.Hash() // Same hash as in partSet.Hash()
|
||||
partSet2.BitArray() // A BitArray of partSet.Total() 0's
|
||||
|
||||
// In a gossip network the parts would arrive in arbitrary order, perhaps
|
||||
// in response to explicit requests for parts, or optimistically in response
|
||||
// to the receiving peer's partSet.BitArray().
|
||||
for !partSet2.IsComplete() {
|
||||
part := receivePartFromGossipNetwork()
|
||||
added, err := partSet2.AddPart(part)
|
||||
if err != nil {
|
||||
// A wrong part,
|
||||
// the merkle trail does not hash to partSet2.Hash()
|
||||
} else if !added {
|
||||
// A duplicate part already received
|
||||
}
|
||||
}
|
||||
|
||||
data2, _ := ioutil.ReadAll(partSet2.GetReader())
|
||||
bytes.Equal(data, data2) // true
|
||||
```
|
30
docs/tendermint-core/light-client-protocol.md
Normal file
30
docs/tendermint-core/light-client-protocol.md
Normal file
@ -0,0 +1,30 @@
|
||||
# Light Client Protocol
|
||||
|
||||
Light clients are an important part of the complete blockchain system
|
||||
for most applications. Tendermint provides unique speed and security
|
||||
properties for light client applications.
|
||||
|
||||
See our [lite
|
||||
package](https://godoc.org/github.com/tendermint/tendermint/lite).
|
||||
|
||||
## Overview
|
||||
|
||||
The objective of the light client protocol is to get a
|
||||
[commit](./validators.md#committing-a-block) for a recent [block
|
||||
hash](../spec/consensus/consensus.md.md#block-hash) where the commit includes a
|
||||
majority of signatures from the last known validator set. From there,
|
||||
all the application state is verifiable with [merkle
|
||||
proofs](./merkle.md#iavl-tree).
|
||||
|
||||
## Properties
|
||||
|
||||
- You get the full collateralized security benefits of Tendermint; No
|
||||
need to wait for confirmations.
|
||||
- You get the full speed benefits of Tendermint; transactions
|
||||
commit instantly.
|
||||
- You can get the most recent version of the application state
|
||||
non-interactively (without committing anything to the blockchain).
|
||||
For example, this means that you can get the most recent value of a
|
||||
name from the name-registry without worrying about fork censorship
|
||||
attacks, without posting a commit and waiting for confirmations.
|
||||
It's fast, secure, and free!
|
@ -104,6 +104,69 @@ signals we use the default behaviour in Go: [Default behavior of signals
|
||||
in Go
|
||||
programs](https://golang.org/pkg/os/signal/#hdr-Default_behavior_of_signals_in_Go_programs).
|
||||
|
||||
## Corruption
|
||||
|
||||
**NOTE:** Make sure you have a backup of the Tendermint data directory.
|
||||
|
||||
### Possible causes
|
||||
|
||||
Remember that most corruption is caused by hardware issues:
|
||||
|
||||
- RAID controllers with faulty / worn out battery backup, and an unexpected power loss
|
||||
- Hard disk drives with write-back cache enabled, and an unexpected power loss
|
||||
- Cheap SSDs with insufficient power-loss protection, and an unexpected power-loss
|
||||
- Defective RAM
|
||||
- Defective or overheating CPU(s)
|
||||
|
||||
Other causes can be:
|
||||
|
||||
- Database systems configured with fsync=off and an OS crash or power loss
|
||||
- Filesystems configured to use write barriers plus a storage layer that ignores write barriers. LVM is a particular culprit.
|
||||
- Tendermint bugs
|
||||
- Operating system bugs
|
||||
- Admin error (e.g., directly modifying Tendermint data-directory contents)
|
||||
|
||||
(Source: https://wiki.postgresql.org/wiki/Corruption)
|
||||
|
||||
### WAL Corruption
|
||||
|
||||
If consensus WAL is corrupted at the lastest height and you are trying to start
|
||||
Tendermint, replay will fail with panic.
|
||||
|
||||
Recovering from data corruption can be hard and time-consuming. Here are two approaches you can take:
|
||||
|
||||
1) Delete the WAL file and restart Tendermint. It will attempt to sync with other peers.
|
||||
2) Try to repair the WAL file manually:
|
||||
|
||||
1. Create a backup of the corrupted WAL file:
|
||||
|
||||
```
|
||||
cp "$TMHOME/data/cs.wal/wal" > /tmp/corrupted_wal_backup
|
||||
```
|
||||
|
||||
2. Use `./scripts/wal2json` to create a human-readable version
|
||||
|
||||
```
|
||||
./scripts/wal2json/wal2json "$TMHOME/data/cs.wal/wal" > /tmp/corrupted_wal
|
||||
```
|
||||
|
||||
3. Search for a "CORRUPTED MESSAGE" line.
|
||||
4. By looking at the previous message and the message after the corrupted one
|
||||
and looking at the logs, try to rebuild the message. If the consequent
|
||||
messages are marked as corrupted too (this may happen if length header
|
||||
got corrupted or some writes did not make it to the WAL ~ truncation),
|
||||
then remove all the lines starting from the corrupted one and restart
|
||||
Tendermint.
|
||||
|
||||
```
|
||||
$EDITOR /tmp/corrupted_wal
|
||||
```
|
||||
5. After editing, convert this file back into binary form by running:
|
||||
|
||||
```
|
||||
./scripts/json2wal/json2wal /tmp/corrupted_wal > "$TMHOME/data/cs.wal/wal"
|
||||
```
|
||||
|
||||
## Hardware
|
||||
|
||||
### Processor and Memory
|
||||
|
@ -1,12 +1,11 @@
|
||||
Secure P2P
|
||||
==========
|
||||
# Secure P2P
|
||||
|
||||
The Tendermint p2p protocol uses an authenticated encryption scheme
|
||||
based on the `Station-to-Station
|
||||
Protocol <https://en.wikipedia.org/wiki/Station-to-Station_protocol>`__.
|
||||
based on the [Station-to-Station
|
||||
Protocol](https://en.wikipedia.org/wiki/Station-to-Station_protocol).
|
||||
The implementation uses
|
||||
`golang's <https://godoc.org/golang.org/x/crypto/nacl/box>`__ `nacl
|
||||
box <http://nacl.cr.yp.to/box.html>`__ for the actual authenticated
|
||||
[golang's](https://godoc.org/golang.org/x/crypto/nacl/box) [nacl
|
||||
box](http://nacl.cr.yp.to/box.html) for the actual authenticated
|
||||
encryption algorithm.
|
||||
|
||||
Each peer generates an ED25519 key-pair to use as a persistent
|
||||
@ -19,10 +18,9 @@ their respective ephemeral public keys. This happens in the clear.
|
||||
They then each compute the shared secret. The shared secret is the
|
||||
multiplication of the peer's ephemeral private key by the other peer's
|
||||
ephemeral public key. The result is the same for both peers by the magic
|
||||
of `elliptic
|
||||
curves <https://en.wikipedia.org/wiki/Elliptic_curve_cryptography>`__.
|
||||
The shared secret is used as the symmetric key for the encryption
|
||||
algorithm.
|
||||
of [elliptic
|
||||
curves](https://en.wikipedia.org/wiki/Elliptic_curve_cryptography). The
|
||||
shared secret is used as the symmetric key for the encryption algorithm.
|
||||
|
||||
The two ephemeral public keys are sorted to establish a canonical order.
|
||||
Then a 24-byte nonce is generated by concatenating the public keys and
|
||||
@ -52,8 +50,7 @@ time it is used. The communications maintain Perfect Forward Secrecy, as
|
||||
the persistent key pair was not used for generating secrets - only for
|
||||
authenticating.
|
||||
|
||||
Caveat
|
||||
------
|
||||
## Caveat
|
||||
|
||||
This system is still vulnerable to a Man-In-The-Middle attack if the
|
||||
persistent public key of the remote node is not known in advance. The
|
||||
@ -62,17 +59,15 @@ such as the Web-of-Trust or Certificate Authorities. In our case, we can
|
||||
use the blockchain itself as a certificate authority to ensure that we
|
||||
are connected to at least one validator.
|
||||
|
||||
Config
|
||||
------
|
||||
## Config
|
||||
|
||||
Authenticated encryption is enabled by default.
|
||||
|
||||
Additional Reading
|
||||
------------------
|
||||
## Additional Reading
|
||||
|
||||
- `Implementation <https://github.com/tendermint/go-p2p/blob/master/secret_connection.go#L49>`__
|
||||
- `Original STS paper by Whitfield Diffie, Paul C. van Oorschot and
|
||||
Michael J.
|
||||
Wiener <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.216.6107&rep=rep1&type=pdf>`__
|
||||
- `Further work on secret
|
||||
handshakes <https://dominictarr.github.io/secret-handshake-paper/shs.pdf>`__
|
||||
- [Implementation](https://github.com/tendermint/tendermint/blob/64bae01d007b5bee0d0827ab53259ffd5910b4e6/p2p/conn/secret_connection.go#L47)
|
||||
- [Original STS paper by Whitfield Diffie, Paul C. van Oorschot and
|
||||
Michael J.
|
||||
Wiener](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.216.6107&rep=rep1&type=pdf)
|
||||
- [Further work on secret
|
||||
handshakes](https://dominictarr.github.io/secret-handshake-paper/shs.pdf)
|
@ -31,6 +31,73 @@ For more elaborate initialization, see the tesnet command:
|
||||
tendermint testnet --help
|
||||
```
|
||||
|
||||
### Genesis
|
||||
|
||||
The `genesis.json` file in `$TMHOME/config/` defines the initial
|
||||
TendermintCore state upon genesis of the blockchain ([see
|
||||
definition](https://github.com/tendermint/tendermint/blob/master/types/genesis.go)).
|
||||
|
||||
#### Fields
|
||||
|
||||
- `genesis_time`: Official time of blockchain start.
|
||||
- `chain_id`: ID of the blockchain. This must be unique for
|
||||
every blockchain. If your testnet blockchains do not have unique
|
||||
chain IDs, you will have a bad time.
|
||||
- `validators`:
|
||||
- `pub_key`: The first element specifies the `pub_key` type. 1
|
||||
== Ed25519. The second element are the pubkey bytes.
|
||||
- `power`: The validator's voting power.
|
||||
- `name`: Name of the validator (optional).
|
||||
- `app_hash`: The expected application hash (as returned by the
|
||||
`ResponseInfo` ABCI message) upon genesis. If the app's hash does
|
||||
not match, Tendermint will panic.
|
||||
- `app_state`: The application state (e.g. initial distribution
|
||||
of tokens).
|
||||
|
||||
#### Sample genesis.json
|
||||
|
||||
```
|
||||
{
|
||||
"genesis_time": "2018-07-09T22:43:06.255718641Z",
|
||||
"chain_id": "chain-IAkWsK",
|
||||
"validators": [
|
||||
{
|
||||
"pub_key": {
|
||||
"type": "tendermint/PubKeyEd25519",
|
||||
"value": "oX8HhKsErMluxI0QWNSR8djQMSupDvHdAYrHwP7n73k="
|
||||
},
|
||||
"power": "1",
|
||||
"name": "node0"
|
||||
},
|
||||
{
|
||||
"pub_key": {
|
||||
"type": "tendermint/PubKeyEd25519",
|
||||
"value": "UZNSJA9zmeFQj36Rs296lY+WFQ4Rt6s7snPpuKypl5I="
|
||||
},
|
||||
"power": "1",
|
||||
"name": "node1"
|
||||
},
|
||||
{
|
||||
"pub_key": {
|
||||
"type": "tendermint/PubKeyEd25519",
|
||||
"value": "i9GrM6/MHB4zjCelMZBUYHNXYIzl4n0RkDCVmmLhS/o="
|
||||
},
|
||||
"power": "1",
|
||||
"name": "node2"
|
||||
},
|
||||
{
|
||||
"pub_key": {
|
||||
"type": "tendermint/PubKeyEd25519",
|
||||
"value": "0qq7954l87trEqbQV9c7d1gurnjTGMxreXc848ZZ5aw="
|
||||
},
|
||||
"power": "1",
|
||||
"name": "node3"
|
||||
}
|
||||
],
|
||||
"app_hash": ""
|
||||
}
|
||||
```
|
||||
|
||||
## Run
|
||||
|
||||
To run a Tendermint node, use
|
||||
|
@ -1,5 +1,4 @@
|
||||
Validators
|
||||
==========
|
||||
# Validators
|
||||
|
||||
Validators are responsible for committing new blocks in the blockchain.
|
||||
These validators participate in the consensus protocol by broadcasting
|
||||
@ -19,25 +18,22 @@ to post any collateral at all.
|
||||
Validators have a cryptographic key-pair and an associated amount of
|
||||
"voting power". Voting power need not be the same.
|
||||
|
||||
Becoming a Validator
|
||||
--------------------
|
||||
## Becoming a Validator
|
||||
|
||||
There are two ways to become validator.
|
||||
|
||||
1. They can be pre-established in the `genesis
|
||||
state <./genesis.html>`__
|
||||
2. The ABCI app responds to the EndBlock message with changes to the
|
||||
existing validator set.
|
||||
1. They can be pre-established in the [genesis state](../../tendermint-core/using-tendermint.md#genesis)
|
||||
2. The ABCI app responds to the EndBlock message with changes to the
|
||||
existing validator set.
|
||||
|
||||
Committing a Block
|
||||
------------------
|
||||
## Committing a Block
|
||||
|
||||
*+2/3 is short for "more than 2/3"*
|
||||
|
||||
A block is committed when +2/3 of the validator set sign `precommit
|
||||
votes <./block-structure.html#vote>`__ for that block at the same
|
||||
``round``. The +2/3 set of precommit votes is
|
||||
called a `*commit* <./block-structure.html#commit>`__. While any
|
||||
+2/3 set of precommits for the same block at the same height&round can
|
||||
serve as validation, the canonical commit is included in the next block
|
||||
(see `LastCommit <./block-structure.html>`__).
|
||||
A block is committed when +2/3 of the validator set sign [precommit
|
||||
votes](../spec/blockchain/blockchain.md#vote) for that block at the same `round`.
|
||||
The +2/3 set of precommit votes is called a
|
||||
[*commit*](../spec/blockchain/blockchain.md#commit). While any +2/3 set of
|
||||
precommits for the same block at the same height&round can serve as
|
||||
validation, the canonical commit is included in the next block (see
|
||||
[LastCommit](../spec/blockchain/blockchain.md#last-commit)).
|
Loading…
x
Reference in New Issue
Block a user