spec: move to final location (#1576)
docs/spec/reactors/block_sync/impl.md
## Blockchain Reactor

* coordinates the pool for syncing
* coordinates the store for persistence
* coordinates the playing of blocks towards the app using a sm.BlockExecutor
* handles switching between fastsync and consensus
* it is a p2p.BaseReactor
* starts the pool.Start() and its poolRoutine()
* registers all the concrete types and interfaces for serialisation

### poolRoutine

* listens to these channels (see the sketch after this list):
  * pool requests blocks from a specific peer by posting to requestsCh; the block reactor then sends
    a &bcBlockRequestMessage for a specific height
  * pool signals timeout of a specific peer by posting to timeoutsCh
  * switchToConsensusTicker to periodically try and switch to consensus
  * trySyncTicker to periodically check if we have fallen behind and then catch-up sync
    * if there aren't any new blocks available on the pool it skips syncing
* tries to sync the app by taking downloaded blocks from the pool, giving them to the app and storing
  them on disk
* implements Receive, which is called by the switch/peer
  * calls AddBlock on the pool when it receives a new block from a peer
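The select loop at the heart of poolRoutine might be sketched as follows. This is only an illustration of the channel handling described above; the ticker intervals and helper types are assumptions, not the actual implementation:

```go
package blocksync

import (
	"fmt"
	"time"
)

// Simplified stand-ins for the types mentioned above (assumed, not the real ones).
type BlockRequest struct {
	Height int64
	PeerID string
}

type bcBlockRequestMessage struct{ Height int64 }

// poolRoutine sketches the event loop described above: serve the pool's block
// requests, handle peer timeouts, and run the two tickers.
func poolRoutine(requestsCh <-chan BlockRequest, timeoutsCh <-chan string, quit <-chan struct{}) {
	trySyncTicker := time.NewTicker(10 * time.Millisecond)     // assumed interval
	switchToConsensusTicker := time.NewTicker(1 * time.Second) // assumed interval
	defer trySyncTicker.Stop()
	defer switchToConsensusTicker.Stop()

	for {
		select {
		case req := <-requestsCh:
			// The pool wants a block: send a bcBlockRequestMessage to that peer.
			fmt.Printf("send %+v to peer %s\n", bcBlockRequestMessage{req.Height}, req.PeerID)
		case peerID := <-timeoutsCh:
			// The pool signalled a peer timeout: stop the peer for error.
			fmt.Printf("stop peer %s for timeout\n", peerID)
		case <-switchToConsensusTicker.C:
			// Periodically check whether we have caught up; if so, switch to the consensus reactor.
		case <-trySyncTicker.C:
			// Take downloaded blocks from the pool, execute them against the app, store them on disk.
			// If there aren't any new blocks available on the pool, skip syncing.
		case <-quit:
			return
		}
	}
}
```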
## Block Pool

* responsible for downloading blocks from peers
* makeRequestersRoutine()
  * removes timed-out peers
  * starts new requesters by calling makeNextRequester()
* requestRoutine():
  * picks a peer and sends the request, then blocks until:
    * pool is stopped by listening to pool.Quit
    * requester is stopped by listening to Quit
    * request is redone
    * we receive a block
  * gotBlockCh is strange

## Block Store

* persists blocks to disk

# TODO

* How does the switch from bcR to conR happen? Does conR persist blocks to disk too?
* What is the interaction between the consensus and blockchain reactors?
docs/spec/reactors/block_sync/reactor.md
# Blockchain Reactor

The Blockchain Reactor's high level responsibility is to enable peers who are
far behind the current state of the consensus to quickly catch up by downloading
many blocks in parallel, verifying their commits, and executing them against the
ABCI application.

Tendermint full nodes run the Blockchain Reactor as a service to provide blocks
to new nodes. New nodes run the Blockchain Reactor in "fast_sync" mode,
where they actively make requests for more blocks until they sync up.
Once caught up, "fast_sync" mode is disabled and the node switches to
using (and turns on) the Consensus Reactor.

## Message Types

```go
const (
    msgTypeBlockRequest    = byte(0x10)
    msgTypeBlockResponse   = byte(0x11)
    msgTypeNoBlockResponse = byte(0x12)
    msgTypeStatusResponse  = byte(0x20)
    msgTypeStatusRequest   = byte(0x21)
)
```

```go
type bcBlockRequestMessage struct {
    Height int64
}

type bcNoBlockResponseMessage struct {
    Height int64
}

type bcBlockResponseMessage struct {
    Block Block
}

type bcStatusRequestMessage struct {
    Height int64
}

type bcStatusResponseMessage struct {
    Height int64
}
```

## Protocol

TODO
docs/spec/reactors/consensus/consensus-reactor.md
# Consensus Reactor

Consensus Reactor defines a reactor for the consensus service. It contains the ConsensusState service that
manages the state of the Tendermint consensus internal state machine.
When Consensus Reactor is started, it starts Broadcast Routine which starts ConsensusState service.
Furthermore, for each peer that is added to the Consensus Reactor, it creates (and manages) the known peer state
(that is used extensively in gossip routines) and starts the following three routines for the peer p:
Gossip Data Routine, Gossip Votes Routine and QueryMaj23Routine. Finally, Consensus Reactor is responsible
for decoding messages received from a peer and for adequate processing of the message depending on its type and content.
The processing normally consists of updating the known peer state and, for some messages
(`ProposalMessage`, `BlockPartMessage` and `VoteMessage`), also forwarding the message to the ConsensusState module
for further processing. In the following text we specify the core functionality of those separate units of execution
that are part of the Consensus Reactor.

## ConsensusState service

Consensus State handles execution of the Tendermint BFT consensus algorithm. It processes votes and proposals,
and upon reaching agreement, commits blocks to the chain and executes them against the application.
The internal state machine receives input from peers, the internal validator and from a timer.

Inside Consensus State we have the following units of execution: Timeout Ticker and Receive Routine.
Timeout Ticker is a timer that schedules timeouts conditional on the height/round/step that are processed
by the Receive Routine.
### Receive Routine of the ConsensusState service

Receive Routine of the ConsensusState handles messages which may cause internal consensus state transitions.
It is the only routine that updates RoundState, which contains the internal consensus state.
Updates (state transitions) happen on timeouts, complete proposals, and 2/3 majorities.
It receives messages from peers, internal validators and from the Timeout Ticker
and invokes the corresponding handlers, potentially updating the RoundState.
The details of the protocol (together with formal proofs of correctness) implemented by the Receive Routine are
discussed in a separate document. For the understanding of this document
it is sufficient to understand that the Receive Routine manages and updates the RoundState data structure that is
then extensively used by the gossip routines to determine what information should be sent to peer processes.
## Round State

RoundState defines the internal consensus state. It contains height, round, round step, a current validator set,
a proposal and proposal block for the current round, locked round and block (if some block is being locked), the set of
received votes, the last commit and the last validator set.

```golang
type RoundState struct {
    Height             int64
    Round              int
    Step               RoundStepType
    Validators         ValidatorSet
    Proposal           Proposal
    ProposalBlock      Block
    ProposalBlockParts PartSet
    LockedRound        int
    LockedBlock        Block
    LockedBlockParts   PartSet
    Votes              HeightVoteSet
    LastCommit         VoteSet
    LastValidators     ValidatorSet
}
```
Internally, consensus will run as a state machine with the following states:

- RoundStepNewHeight
- RoundStepNewRound
- RoundStepPropose
- RoundStepProposeWait
- RoundStepPrevote
- RoundStepPrevoteWait
- RoundStepPrecommit
- RoundStepPrecommitWait
- RoundStepCommit

## Peer Round State

Peer round state contains the known state of a peer. It is updated by the Receive routine of
the Consensus Reactor and by the gossip routines upon sending a message to the peer.
```golang
type PeerRoundState struct {
    Height                   int64         // Height peer is at
    Round                    int           // Round peer is at, -1 if unknown.
    Step                     RoundStepType // Step peer is at
    Proposal                 bool          // True if peer has proposal for this round
    ProposalBlockPartsHeader PartSetHeader
    ProposalBlockParts       BitArray
    ProposalPOLRound         int      // Proposal's POL round. -1 if none.
    ProposalPOL              BitArray // nil until ProposalPOLMessage received.
    Prevotes                 BitArray // All votes peer has for this round
    Precommits               BitArray // All precommits peer has for this round
    LastCommitRound          int      // Round of commit for last height. -1 if none.
    LastCommit               BitArray // All commit precommits of commit for last height.
    CatchupCommitRound       int      // Round that we have commit for. Not necessarily unique. -1 if none.
    CatchupCommit            BitArray // All commit precommits peer has for this height & CatchupCommitRound
}
```
## Receive method of Consensus reactor

The entry point of the Consensus reactor is the Receive method. When a message is received from a peer p,
normally the peer round state is updated correspondingly, and some messages
are passed for further processing, for example to the ConsensusState service. We now specify the processing of messages
in the Receive method of the Consensus reactor for each message type. In the following message handlers, `rs` and `prs` denote
`RoundState` and `PeerRoundState`, respectively. A sketch of the dispatching switch appears below.
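As a rough illustration of this dispatch, the sketch below separates the state-only messages from those that are also forwarded to the ConsensusState service. The function shape and helper names are assumptions for illustration, not the actual implementation:

```go
package consensus

// Simplified stand-ins for the message types handled by Receive.
type NewRoundStepMessage struct{ Height int64 }
type HasVoteMessage struct{ Height int64 }
type ProposalMessage struct{ Height int64 }
type BlockPartMessage struct{ Height int64 }
type VoteMessage struct{ Height int64 }

// receive dispatches a decoded message: state-only messages just update the
// known peer state (prs), while Proposal/BlockPart/Vote messages are also
// forwarded to the ConsensusState service via an internal peer message queue.
func receive(msg interface{}, updatePeerState func(interface{}), peerMsgQueue chan<- interface{}) {
	switch m := msg.(type) {
	case *NewRoundStepMessage, *HasVoteMessage:
		updatePeerState(m) // update prs only
	case *ProposalMessage, *BlockPartMessage, *VoteMessage:
		updatePeerState(m) // update prs ...
		peerMsgQueue <- m  // ... then forward for further processing
	}
}
```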
### NewRoundStepMessage handler

```
handleMessage(msg):
    if msg is from smaller height/round/step then return
    // Just remember these values.
    prsHeight = prs.Height
    prsRound = prs.Round
    prsCatchupCommitRound = prs.CatchupCommitRound
    prsCatchupCommit = prs.CatchupCommit

    Update prs with values from msg
    if prs.Height or prs.Round has been updated then
        reset Proposal related fields of the peer state
    if prs.Round has been updated and msg.Round == prsCatchupCommitRound then
        prs.Precommits = prsCatchupCommit
    if prs.Height has been updated then
        if prsHeight+1 == msg.Height and prsRound == msg.LastCommitRound then
            prs.LastCommitRound = msg.LastCommitRound
            prs.LastCommit = prs.Precommits
        else
            prs.LastCommitRound = msg.LastCommitRound
            prs.LastCommit = nil
        Reset prs.CatchupCommitRound and prs.CatchupCommit
```
### CommitStepMessage handler

```
handleMessage(msg):
    if prs.Height == msg.Height then
        prs.ProposalBlockPartsHeader = msg.BlockPartsHeader
        prs.ProposalBlockParts = msg.BlockParts
```

### HasVoteMessage handler

```
handleMessage(msg):
    if prs.Height == msg.Height then
        prs.setHasVote(msg.Height, msg.Round, msg.Type, msg.Index)
```
### VoteSetMaj23Message handler

```
handleMessage(msg):
    if prs.Height == msg.Height then
        Record in rs that the peer claims to have a 2/3 majority for msg.BlockID
        Send VoteSetBitsMessage showing votes node has for that BlockID
```
### ProposalMessage handler

```
handleMessage(msg):
    if prs.Height != msg.Height or prs.Round != msg.Round or prs.Proposal then return
    prs.Proposal = true
    prs.ProposalBlockPartsHeader = msg.BlockPartsHeader
    prs.ProposalBlockParts = empty set
    prs.ProposalPOLRound = msg.POLRound
    prs.ProposalPOL = nil
    Send msg through internal peerMsgQueue to ConsensusState service
```

### ProposalPOLMessage handler

```
handleMessage(msg):
    if prs.Height != msg.Height or prs.ProposalPOLRound != msg.ProposalPOLRound then return
    prs.ProposalPOL = msg.ProposalPOL
```
### BlockPartMessage handler

```
handleMessage(msg):
    if prs.Height != msg.Height or prs.Round != msg.Round then return
    Record in prs that the peer has block part msg.Part.Index
    Send msg through internal peerMsgQueue to ConsensusState service
```

### VoteMessage handler

```
handleMessage(msg):
    Record in prs that the peer knows the vote with index msg.vote.ValidatorIndex for the particular height and round
    Send msg through internal peerMsgQueue to ConsensusState service
```
### VoteSetBitsMessage handler

```
handleMessage(msg):
    Update prs for the bit-array of votes the peer claims to have for msg.BlockID
```
## Gossip Data Routine

It is used to send the following messages to the peer: `BlockPartMessage`, `ProposalMessage` and
`ProposalPOLMessage` on the DataChannel. The gossip data routine is based on the local RoundState (`rs`)
and the known PeerRoundState (`prs`). The routine repeats forever the logic shown below:

```
1a) if rs.ProposalBlockPartsHeader == prs.ProposalBlockPartsHeader and the peer does not have all the proposal parts then
        Part = pick a random proposal block part the peer does not have
        Send BlockPartMessage(rs.Height, rs.Round, Part) to the peer on the DataChannel
        if send returns true, record that the peer knows the corresponding block Part
        Continue

1b) if (0 < prs.Height) and (prs.Height < rs.Height) then
        help peer catch up using gossipDataForCatchup function
        Continue

1c) if (rs.Height != prs.Height) or (rs.Round != prs.Round) then
        Sleep PeerGossipSleepDuration
        Continue

    // at this point rs.Height == prs.Height and rs.Round == prs.Round
1d) if (rs.Proposal != nil and !prs.Proposal) then
        Send ProposalMessage(rs.Proposal) to the peer
        if send returns true, record that the peer knows Proposal
        if 0 <= rs.Proposal.POLRound then
            polRound = rs.Proposal.POLRound
            prevotesBitArray = rs.Votes.Prevotes(polRound).BitArray()
            Send ProposalPOLMessage(rs.Height, polRound, prevotesBitArray)
        Continue

2)  Sleep PeerGossipSleepDuration
```
### Gossip Data For Catchup

This function is responsible for helping the peer catch up if it is at a smaller height (prs.Height < rs.Height).
The function executes the following logic:

```
if peer does not have all block parts for prs.ProposalBlockPartsHeader then
    blockMeta = Load Block Metadata for height prs.Height from blockStore
    if blockMeta.BlockID.PartsHeader != prs.ProposalBlockPartsHeader then
        Sleep PeerGossipSleepDuration
        return
    Part = pick a random proposal block part the peer does not have
    Send BlockPartMessage(prs.Height, prs.Round, Part) to the peer on the DataChannel
    if send returns true, record that the peer knows the corresponding block Part
    return
else
    Sleep PeerGossipSleepDuration
```
## Gossip Votes Routine

It is used to send the following message: `VoteMessage` on the VoteChannel.
The gossip votes routine is based on the local RoundState (`rs`)
and the known PeerRoundState (`prs`). The routine repeats forever the logic shown below:

```
1a) if rs.Height == prs.Height then
        if prs.Step == RoundStepNewHeight then
            vote = random vote from rs.LastCommit the peer does not have
            Send VoteMessage(vote) to the peer
            if send returns true, continue

        if prs.Step <= RoundStepPrevote and prs.Round != -1 and prs.Round <= rs.Round then
            Prevotes = rs.Votes.Prevotes(prs.Round)
            vote = random vote from Prevotes the peer does not have
            Send VoteMessage(vote) to the peer
            if send returns true, continue

        if prs.Step <= RoundStepPrecommit and prs.Round != -1 and prs.Round <= rs.Round then
            Precommits = rs.Votes.Precommits(prs.Round)
            vote = random vote from Precommits the peer does not have
            Send VoteMessage(vote) to the peer
            if send returns true, continue

        if prs.ProposalPOLRound != -1 then
            PolPrevotes = rs.Votes.Prevotes(prs.ProposalPOLRound)
            vote = random vote from PolPrevotes the peer does not have
            Send VoteMessage(vote) to the peer
            if send returns true, continue

1b) if prs.Height != 0 and rs.Height == prs.Height+1 then
        vote = random vote from rs.LastCommit the peer does not have
        Send VoteMessage(vote) to the peer
        if send returns true, continue

1c) if prs.Height != 0 and rs.Height >= prs.Height+2 then
        Commit = get commit from BlockStore for prs.Height
        vote = random vote from Commit the peer does not have
        Send VoteMessage(vote) to the peer
        if send returns true, continue

2)  Sleep PeerGossipSleepDuration
```
## QueryMaj23Routine

It is used to send the following message: `VoteSetMaj23Message`. `VoteSetMaj23Message` is sent to indicate that a given
BlockID has seen +2/3 votes. This routine is based on the local RoundState (`rs`) and the known PeerRoundState
(`prs`). The routine repeats forever the logic shown below.

```
1a) if rs.Height == prs.Height then
        Prevotes = rs.Votes.Prevotes(prs.Round)
        if there is a 2/3 majority for some blockId in Prevotes then
            m = VoteSetMaj23Message(prs.Height, prs.Round, Prevote, blockId)
            Send m to peer
            Sleep PeerQueryMaj23SleepDuration

1b) if rs.Height == prs.Height then
        Precommits = rs.Votes.Precommits(prs.Round)
        if there is a 2/3 majority for some blockId in Precommits then
            m = VoteSetMaj23Message(prs.Height, prs.Round, Precommit, blockId)
            Send m to peer
            Sleep PeerQueryMaj23SleepDuration

1c) if rs.Height == prs.Height and prs.ProposalPOLRound >= 0 then
        Prevotes = rs.Votes.Prevotes(prs.ProposalPOLRound)
        if there is a 2/3 majority for some blockId in Prevotes then
            m = VoteSetMaj23Message(prs.Height, prs.ProposalPOLRound, Prevote, blockId)
            Send m to peer
            Sleep PeerQueryMaj23SleepDuration

1d) if prs.CatchupCommitRound != -1 and 0 < prs.Height and
        prs.Height <= blockStore.Height() then
        Commit = LoadCommit(prs.Height)
        m = VoteSetMaj23Message(prs.Height, Commit.Round, Precommit, Commit.blockId)
        Send m to peer
        Sleep PeerQueryMaj23SleepDuration

2)  Sleep PeerQueryMaj23SleepDuration
```
## Broadcast routine

The Broadcast routine subscribes to an internal event bus to receive new round steps, vote messages and proposal
heartbeat messages, and broadcasts messages to peers upon receiving those events.
It broadcasts `NewRoundStepMessage` or `CommitStepMessage` upon a new round state event. Note that
broadcasting these messages does not depend on the PeerRoundState; they are sent on the StateChannel.
Upon receiving a VoteMessage it broadcasts a `HasVoteMessage` to its peers on the StateChannel.
`ProposalHeartbeatMessage` is sent the same way on the StateChannel.
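A minimal sketch of this routine, assuming a channel-based event bus and a broadcast callback (the event shape and the StateChannel ID are assumptions, not the actual implementation):

```go
package consensus

// event is a simplified stand-in for what the internal event bus delivers.
type event struct {
	kind string      // "round-step", "vote" or "heartbeat"
	msg  interface{} // the message to broadcast
}

// broadcastRoutine fans events out to all peers on the StateChannel,
// independently of any PeerRoundState.
func broadcastRoutine(events <-chan event, broadcast func(chID byte, msg interface{})) {
	const StateChannel = byte(0x20) // assumed channel ID
	for ev := range events {
		switch ev.kind {
		case "round-step":
			broadcast(StateChannel, ev.msg) // NewRoundStepMessage or CommitStepMessage
		case "vote":
			broadcast(StateChannel, ev.msg) // HasVoteMessage
		case "heartbeat":
			broadcast(StateChannel, ev.msg) // ProposalHeartbeatMessage
		}
	}
}
```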
docs/spec/reactors/consensus/consensus.md
# Tendermint Consensus Reactor

Tendermint Consensus is a distributed protocol executed by validator processes to agree on
the next block to be added to the Tendermint blockchain. The protocol proceeds in rounds, where
each round is an attempt to reach agreement on the next block. A round starts by having a dedicated
process (called proposer) suggest to the other processes, with the `ProposalMessage`, what the
next block should be.
The processes respond by voting for a block with `VoteMessage` (there are two kinds of vote
messages, prevote and precommit votes). Note that a proposal message is just a suggestion of what the
next block should be; a validator might vote with a `VoteMessage` for a different block. If, in some
round, enough processes vote for the same block, then this block is committed and later
added to the blockchain. `ProposalMessage` and `VoteMessage` are signed by the private key of the
validator. The internals of the protocol and how it ensures safety and liveness properties are
explained in a forthcoming document.

For efficiency reasons, validators in the Tendermint consensus protocol do not agree directly on the
block, as blocks can be large, i.e., they don't embed the block inside `Proposal` and
`VoteMessage`. Instead, they reach agreement on the `BlockID` (see the `BlockID` definition in the
[Blockchain](blockchain.md) section) that uniquely identifies each block. The block itself is
disseminated to validator processes using a peer-to-peer gossiping protocol. It starts by having the
proposer first split a block into a number of block parts, which are then gossiped between
processes using `BlockPartMessage`.

Validators in Tendermint communicate via a peer-to-peer gossiping protocol. Each validator is connected
only to a subset of processes called peers. Through the gossiping protocol, a validator sends to its peers
all needed information (`ProposalMessage`, `VoteMessage` and `BlockPartMessage`) so they can
reach agreement on some block, and also obtain the content of the chosen block (block parts). As
part of the gossiping protocol, processes also send auxiliary messages that inform peers about the
executed steps of the core consensus algorithm (`NewRoundStepMessage` and `CommitStepMessage`), and
also messages that inform peers what votes the process has seen (`HasVoteMessage`,
`VoteSetMaj23Message` and `VoteSetBitsMessage`). These messages are then used in the gossiping
protocol to determine what messages a process should send to its peers.

We now describe the content of each message exchanged during the Tendermint consensus protocol.
## ProposalMessage

ProposalMessage is sent when a new block is proposed. It is a suggestion of what the
next block in the blockchain should be.

```go
type ProposalMessage struct {
    Proposal Proposal
}
```

### Proposal

Proposal contains height and round for which this proposal is made, BlockID as a unique identifier
of the proposed block, timestamp, and two fields (POLRound and POLBlockID) that are needed for
termination of the consensus. The message is signed by the validator private key.

```go
type Proposal struct {
    Height     int64
    Round      int
    Timestamp  Time
    BlockID    BlockID
    POLRound   int
    POLBlockID BlockID
    Signature  Signature
}
```

NOTE: In the current version of Tendermint, the consensus value in the proposal is represented with
PartSetHeader, and with BlockID in the vote message. It should be aligned as suggested in this spec, as
BlockID contains PartSetHeader.
## VoteMessage

VoteMessage is sent to vote for some block (or to inform others that a process does not vote in the
current round). Vote is defined in the [Blockchain](blockchain.md) section and contains the validator's
information (validator address and index), the height and round for which the vote is sent, the vote type,
the blockID if the process votes for some block (`nil` otherwise) and a timestamp when the vote was sent. The
message is signed by the validator private key.

```go
type VoteMessage struct {
    Vote Vote
}
```
## BlockPartMessage

BlockPartMessage is sent when gossiping a piece of the proposed block. It contains height, round
and the block part.

```go
type BlockPartMessage struct {
    Height int64
    Round  int
    Part   Part
}
```
## ProposalHeartbeatMessage

ProposalHeartbeatMessage is sent to signal that a node is alive and waiting for transactions
to be able to create the next block proposal.

```go
type ProposalHeartbeatMessage struct {
    Heartbeat Heartbeat
}
```

### Heartbeat

Heartbeat contains validator information (address and index),
height, round and sequence number. It is signed by the private key of the validator.

```go
type Heartbeat struct {
    ValidatorAddress []byte
    ValidatorIndex   int
    Height           int64
    Round            int
    Sequence         int
    Signature        Signature
}
```
## NewRoundStepMessage

NewRoundStepMessage is sent for every step transition during the core consensus algorithm execution.
It is used in the gossip part of the Tendermint protocol to inform peers about the current
height/round/step a process is in.

```go
type NewRoundStepMessage struct {
    Height                int64
    Round                 int
    Step                  RoundStepType
    SecondsSinceStartTime int
    LastCommitRound       int
}
```
## CommitStepMessage

CommitStepMessage is sent when an agreement on some block is reached. It contains the height for which
agreement is reached, the block parts header that describes the decided block and is used to obtain all
block parts, and a bit array of the block parts the process currently has, so its peers know which
parts it is missing and can send them.

```go
type CommitStepMessage struct {
    Height     int64
    BlockID    BlockID
    BlockParts BitArray
}
```

TODO: We use BlockID instead of BlockPartsHeader (in current implementation) for symmetry.
## ProposalPOLMessage

ProposalPOLMessage is sent when a previous block is re-proposed.
It is used to inform peers in which round the process learned of this block (ProposalPOLRound),
and what prevotes for the re-proposed block the process has.

```go
type ProposalPOLMessage struct {
    Height           int64
    ProposalPOLRound int
    ProposalPOL      BitArray
}
```
## HasVoteMessage

HasVoteMessage is sent to indicate that a particular vote has been received. It contains height,
round, vote type and the index of the validator that is the originator of the corresponding vote.

```go
type HasVoteMessage struct {
    Height int64
    Round  int
    Type   byte
    Index  int
}
```
## VoteSetMaj23Message

VoteSetMaj23Message is sent to indicate that a process has seen +2/3 votes for some BlockID.
It contains height, round, vote type and the BlockID.

```go
type VoteSetMaj23Message struct {
    Height  int64
    Round   int
    Type    byte
    BlockID BlockID
}
```
## VoteSetBitsMessage

VoteSetBitsMessage is sent to communicate the bit-array of votes a process has seen for a given
BlockID. It contains height, round, vote type, BlockID and a bit array of
the votes a process has.

```go
type VoteSetBitsMessage struct {
    Height  int64
    Round   int
    Type    byte
    BlockID BlockID
    Votes   BitArray
}
```
docs/spec/reactors/consensus/proposer-selection.md
# Proposer selection procedure in Tendermint

This document specifies the Proposer Selection Procedure that is used in Tendermint to choose a round proposer.
As Tendermint is a "leader-based" protocol, the proposer selection is critical for its correct functioning.
Let `proposer_p(h,r)` denote the process returned by the Proposer Selection Procedure at process p, at height h
and round r. Then the Proposer Selection procedure should fulfill the following properties:

`Agreement`: Given a validator set V, and two honest validators,
p and q, for each height h, and each round r,
proposer_p(h,r) = proposer_q(h,r)

`Liveness`: In every consecutive sequence of rounds of size K (K is a system parameter), at least a
single round has an honest proposer.

`Fairness`: The proposer selection is proportional to the validator voting power, i.e., a validator with more
voting power is selected more frequently, proportional to its power. More precisely, given a set of processes
with total voting power N, during a sequence of rounds of size N, every process is proposer in a number of rounds
equal to its voting power.
We now look at a few particular cases to understand better how fairness should be implemented.
If we have 4 processes with the following voting power distribution (p0,4), (p1, 2), (p2, 2), (p3, 2) at some round r,
we have the following sequence of proposer selections in the following rounds:

`p0, p1, p2, p3, p0, p0, p1, p2, p3, p0, p0, p1, p2, p3, p0, p0, p1, p2, p3, p0, etc`

Let us now consider a scenario where the total voting power of faulty processes is aggregated in a single process
p0: (p0,3), (p1, 1), (p2, 1), (p3, 1), (p4, 1), (p5, 1), (p6, 1), (p7, 1).
In this case the sequence of proposer selections looks like this:

`p0, p1, p2, p3, p0, p4, p5, p6, p7, p0, p0, p1, p2, p3, p0, p4, p5, p6, p7, p0, etc`

In this case, we see that the number of rounds coordinated by a faulty process is proportional to its voting power.
We also consider the case where voting power is uniformly distributed among processes, i.e., we have 10 processes
each with voting power of 1. Assume that there are 3 faulty processes with consecutive addresses,
for example the first 3 processes are faulty. Then the sequence looks like this:

`p0, p1, p2, p3, p4, p5, p6, p7, p8, p9, p0, p1, p2, p3, p4, p5, p6, p7, p8, p9, etc`

In this case, we have 3 consecutive rounds with a faulty proposer.
One special case we consider is the case where a single honest process p0 has most of the voting power, for example:
(p0,100), (p1, 2), (p2, 3), (p3, 4). Then the sequence of proposer selections looks like this:

`p0, p0, p0, p0, p0, p0, p0, p0, p0, p0, p0, p0, p0, p1, p0, p0, p0, p0, p0, etc`

This basically means that almost all rounds have the same proposer. But in this case, the process p0 anyway has enough
voting power to decide whatever it wants, so the fact that it coordinates almost all rounds seems correct.
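This document does not fix a particular algorithm, but the sequences above are consistent with a simple accumulator-based weighted round robin, sketched below. All names and the tie-breaking rule are assumptions for illustration, not a specification:

```go
package main

import "fmt"

// validator holds a process name, its voting power, and a running accumulator.
type validator struct {
	name  string
	power int64
	accum int64
}

// selectProposer credits every validator its voting power, picks the validator
// with the highest accumulator, and debits the chosen one by the total voting
// power. Ties go to the lowest index (a stand-in for address-based tie-breaking).
func selectProposer(vals []validator) int {
	var total int64
	for i := range vals {
		vals[i].accum += vals[i].power
		total += vals[i].power
	}
	best := 0
	for i := range vals {
		if vals[i].accum > vals[best].accum {
			best = i
		}
	}
	vals[best].accum -= total
	return best
}

func main() {
	vals := []validator{{"p0", 4, 0}, {"p1", 2, 0}, {"p2", 2, 0}, {"p3", 2, 0}}
	for r := 0; r < 10; r++ {
		fmt.Printf("%s ", vals[selectProposer(vals)].name)
	}
	fmt.Println() // prints: p0 p1 p2 p3 p0 p0 p1 p2 p3 p0 (the first sequence above)
}
```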
docs/spec/reactors/mempool/README.md
# Mempool Specification

This package contains documents specifying the functionality
of the mempool module.

Components:

* [Config](./config.md) - how to configure it
* [External Messages](./messages.md) - the messages we accept over the p2p and rpc interfaces
* [Functionality](./functionality.md) - high-level description of the functionality it provides
* [Concurrency Model](./concurrency.md) - what guarantees we provide, what locks we require
docs/spec/reactors/mempool/concurrency.md
# Mempool Concurrency

Look at the concurrency model this uses...

* Receiving CheckTx
* Broadcasting new txs
* Interfaces with the consensus engine, reap/update while checking
* Calling the ABCI app (ordering, callbacks, how the proxy works alongside the blockchain proxy which actually writes blocks)
docs/spec/reactors/mempool/config.md
# Mempool Configuration

Here we describe configuration options around the mempool.
For the purposes of this document, they are described
as command-line flags, but they can also be passed in as
environment variables or in the config.toml file. The
following are all equivalent:

Flag: `--mempool.recheck_empty=false`

Environment: `TM_MEMPOOL_RECHECK_EMPTY=false`

Config:

```
[mempool]
recheck_empty = false
```
## Recheck

`--mempool.recheck=false` (default: true)

`--mempool.recheck_empty=false` (default: true)

Recheck determines if the mempool rechecks all pending
transactions after a block is committed. Once a block
is committed, the mempool removes all valid transactions
that were successfully included in the block.

If `recheck` is true, then it will rerun CheckTx on
all remaining transactions with the new block state.

If the block contained no transactions, it will skip the
recheck unless `recheck_empty` is true.
## Broadcast

`--mempool.broadcast=false` (default: true)

Determines whether this node gossips any valid transactions
that arrive in the mempool. Default is to gossip anything that
passes CheckTx. If this is disabled, transactions are not
gossiped, but instead stored locally and added to the next
block for which this node is the proposer.
## WalDir

`--mempool.wal_dir=/tmp/gaia/mempool.wal` (default: $TM_HOME/data/mempool.wal)

This defines the directory where the mempool writes its write-ahead
logs. These files can be used to reload unbroadcasted
transactions if the node crashes.

If the directory passed in is an absolute path, the wal file is
created there. If the directory is a relative path, the path is
appended to the home directory of the tendermint process to
generate an absolute path to the wal directory
(default `$HOME/.tendermint`, or set via `TM_HOME` or `--home`).
docs/spec/reactors/mempool/functionality.md
# Mempool Functionality

The mempool maintains a list of potentially valid transactions,
both to broadcast to other nodes, as well as to provide to the
consensus reactor when it is selected as the block proposer.

There are two sides to the mempool state:

* External: get, check, and broadcast new transactions
* Internal: return valid transactions, update list after block commit

## External functionality

External functionality is exposed via network interfaces
to potentially untrusted actors.

* CheckTx - triggered via RPC or P2P
* Broadcast - gossip messages after a successful check

## Internal functionality

Internal functionality is exposed via method calls to other
code compiled into the tendermint binary.

* Reap - get txs to propose in the next block
* Update - remove txs that were included in the last block
* ABCI.CheckTx - call the ABCI app to validate the tx
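A sketch of this internal surface as a Go interface; the method names follow the list above, but the exact signatures are assumptions for illustration, not the actual tendermint API:

```go
package mempool

// Tx is a raw transaction as seen by the mempool.
type Tx []byte

// Mempool sketches the internal functionality described above.
type Mempool interface {
	// CheckTx runs the ABCI app's CheckTx on the transaction and, if it
	// passes, adds it to the pool (and gossips it, if broadcast is enabled).
	CheckTx(tx Tx) error

	// Reap returns up to maxTxs transactions for the proposer to put in the
	// next block.
	Reap(maxTxs int) []Tx

	// Update removes the given transactions (those included in the last
	// block) and, depending on config, re-runs CheckTx on the remainder.
	Update(height int64, txs []Tx) error
}
```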
What does it provide the consensus reactor?
What guarantees does it need from the ABCI app?
(talk about interleaving processes in concurrency)

## Optimizations

Talk about the LRU cache to make sure we don't process any
tx that we have seen before
docs/spec/reactors/mempool/messages.md
# Mempool Messages

## P2P Messages

There is currently only one message that Mempool broadcasts
and receives over the p2p gossip network (via the reactor):
`TxMessage`

```go
// TxMessage is a MempoolMessage containing a transaction.
type TxMessage struct {
    Tx types.Tx
}
```
TxMessage is go-wire encoded and prepended with `0x1` as a
"type byte". This is followed by a go-wire encoded byte-slice.
The prefix of a 40 (0x28) byte tx is `0x010128...`, followed by
the actual 40-byte tx. The prefix of a 350 (0x015e) byte tx is
`0x0102015e...`, followed by the actual 350-byte tx.

(Please see the [go-wire repo](https://github.com/tendermint/go-wire#an-interface-example) for more information)
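For illustration, the prefix computation described above can be sketched as follows. This is a hand-rolled reading of the byte layout in the text, not a call into the real go-wire library:

```go
package mempool

// txMessagePrefix builds the TxMessage prefix for a tx of the given length:
// a 0x1 type byte, then a varint length encoded as [number of length bytes]
// followed by the length itself in big-endian bytes, then the raw tx follows.
func txMessagePrefix(txLen int) []byte {
	// big-endian bytes of txLen, without leading zeros
	var lenBytes []byte
	for v := txLen; v > 0; v >>= 8 {
		lenBytes = append([]byte{byte(v & 0xff)}, lenBytes...)
	}
	prefix := []byte{0x01}                       // type byte for TxMessage
	prefix = append(prefix, byte(len(lenBytes))) // number of length bytes
	return append(prefix, lenBytes...)           // the length itself
}

// txMessagePrefix(40)  -> 0x01 0x01 0x28      (matches "0x010128...")
// txMessagePrefix(350) -> 0x01 0x02 0x01 0x5e (matches "0x0102015e...")
```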
## RPC Messages

Mempool exposes `CheckTx([]byte)` over the RPC interface.

It can be posted via `broadcast_tx_commit`, `broadcast_tx_sync` or
`broadcast_tx_async`. They all parse a message with one argument,
`"tx": "HEX_ENCODED_BINARY"`, and differ only in how long they
wait before returning (sync makes sure CheckTx passes, commit
makes sure it was included in a signed block).
Request (`POST http://gaia.zone:46657/`):

```json
{
  "id": "",
  "jsonrpc": "2.0",
  "method": "broadcast_tx_sync",
  "params": {
    "tx": "F012A4BC68..."
  }
}
```

Response:

```json
{
  "error": "",
  "result": {
    "hash": "E39AAB7A537ABAA237831742DCE1117F187C3C52",
    "log": "",
    "data": "",
    "code": 0
  },
  "id": "",
  "jsonrpc": "2.0"
}
```
docs/spec/reactors/pex/pex.md
# Peer Strategy and Exchange

Here we outline the design of the AddressBook
and how it is used by the Peer Exchange Reactor (PEX) to ensure we are connected
to good peers and to gossip peers to others.

## Peer Types

Certain peers are special in that they are specified by the user as `persistent`,
which means we auto-redial them if the connection fails, or if we fail to dial
them.
Some peers can be marked as `private`, which means
we will not put them in the address book or gossip them to others.

All peers except private peers are tracked using the address book.
## Discovery

Peer discovery begins with a list of seeds.
When we have no peers, or have been unable to find enough peers from existing ones,
we dial a randomly selected seed to get a list of peers to dial.

On startup, we will also immediately dial the given list of `persistent_peers`,
and will attempt to maintain persistent connections with them. If the connections die, or we fail to dial,
we will redial every 5s for a few minutes, then switch to an exponential backoff schedule,
and after about a day of trying, stop dialing the peer.

So long as we have less than `MinNumOutboundPeers`, we periodically request additional peers
from each of our own peers. If sufficient time goes by and we still can't find enough peers,
we try the seeds again.
## Listening

Peers listen on a configurable ListenAddr that they self-report in their
NodeInfo during handshakes with other peers. Peers accept up to (MaxNumPeers -
MinNumOutboundPeers) incoming peers.
## Address Book

Peers are tracked via their ID (their PubKey.Address()).
Peers are added to the address book from the PEX when they first connect to us or
when we hear about them from other peers.

The address book is arranged in sets of buckets, and distinguishes between
vetted (old) and unvetted (new) peers. It keeps different sets of buckets for vetted and
unvetted peers. Buckets provide randomization over peer selection. Peers are put
in buckets according to their IP groups.

A vetted peer can only be in one bucket. An unvetted peer can be in multiple buckets, and
each instance of the peer can have a different IP:PORT.

If we're trying to add a new peer but there's no space in its bucket, we'll
remove the worst peer from that bucket to make room.
## Vetting

When a peer is first added, it is unvetted.
Marking a peer as vetted is outside the scope of the `p2p` package.
For Tendermint, a peer becomes vetted once it has contributed sufficiently
at the consensus layer; i.e., once it has sent us valid and not-yet-known
votes and/or block parts for `NumBlocksForVetted` blocks.
Other users of the p2p package can determine their own conditions for when a peer is marked vetted.

If a peer becomes vetted but there are already too many vetted peers,
a randomly selected one of the vetted peers becomes unvetted.

If a peer becomes unvetted (either a new peer, or one that was previously vetted),
a randomly selected one of the unvetted peers is removed from the address book.

More fine-grained tracking of peer behaviour can be done using
a trust metric (see below), but it's best to start with something simple.
## Select Peers to Dial

When we need more peers, we pick them randomly from the addrbook with some
configurable bias for unvetted peers. The bias should be lower when we have fewer peers
and can increase as we obtain more, ensuring that our first peers are more trustworthy,
but always giving us the chance to discover new good peers.

We track the last time we dialed a peer and the number of unsuccessful attempts
we've made. If too many attempts are made, we mark the peer as bad.

Connection attempts are made with exponential backoff (plus jitter); a sketch follows below. Because
the selection process happens every `ensurePeersPeriod`, we might not end up
dialing a peer for much longer than the backoff duration.

If we fail to connect to the peer after 16 tries (with exponential backoff), we remove the peer from the address book completely.
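A sketch of what such a backoff schedule might look like; the base delay, cap, and jitter range are assumptions for illustration, not the actual constants:

```go
package pex

import (
	"math/rand"
	"time"
)

// dialBackoff returns how long to wait before redial attempt n (0-based,
// small per the text's 16-try limit). The delay doubles on every failed
// attempt, is capped at a day, and gets up to a second of jitter so peers
// don't all redial in lockstep.
func dialBackoff(attempt int) time.Duration {
	const base = 5 * time.Second
	backoff := base << uint(attempt)
	if backoff > 24*time.Hour || backoff <= 0 { // enforce the cap (and guard shift overflow)
		backoff = 24 * time.Hour
	}
	jitter := time.Duration(rand.Int63n(int64(time.Second)))
	return backoff + jitter
}
```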
## Select Peers to Exchange

When we're asked for peers, we select them as follows:

- select at most `maxGetSelection` peers
- try to select at least `minGetSelection` peers - if we have less than that, select them all
- select a random, unbiased `getSelectionPercent` of the peers

Send the selected peers. Note we select peers for sending without bias for vetted/unvetted.
## Preventing Spam

There are various cases where we decide a peer has misbehaved and we disconnect from them.
When this happens, the peer is removed from the address book and blacklisted for
some amount of time. We call this "Disconnect and Mark".
Note that the bad behaviour may be detected outside the PEX reactor itself
(for instance, in the mconnection, or another reactor), but it must be communicated to the PEX reactor
so it can remove and mark the peer.

In the PEX, if a peer sends us an unsolicited list of peers,
or if the peer sends a request too soon after another one,
we Disconnect and MarkBad.
## Trust Metric

The quality of peers can be tracked in more fine-grained detail using a
Proportional-Integral-Derivative (PID) controller that incorporates
current, past, and rate-of-change data to inform peer quality.

While a PID trust metric has been implemented, it remains for future work
to use it in the PEX.

See the [trustmetric](../../../architecture/adr-006-trust-metric.md)
and [trustmetric usage](../../../architecture/adr-007-trust-metric-usage.md)
architecture docs for more details.