spec: move to final location (#1576)

This commit is contained in:
Zach
2018-05-17 01:58:15 -04:00
committed by Anton Kaliaev
parent 775b015173
commit 754be1887c
25 changed files with 1 additions and 1 deletions

View File

@ -0,0 +1,46 @@
## Blockchain Reactor
* coordinates the pool for syncing
* coordinates the store for persistence
* coordinates the playing of blocks towards the app using a sm.BlockExecutor
* handles switching between fastsync and consensus
* it is a p2p.BaseReactor
* starts the pool.Start() and its poolRoutine()
* registers all the concrete types and interfaces for serialisation
### poolRoutine
* listens to these channels:
* pool requests blocks from a specific peer by posting to requestsCh, block reactor then sends
a &bcBlockRequestMessage for a specific height
* pool signals timeout of a specific peer by posting to timeoutsCh
* switchToConsensusTicker to periodically try and switch to consensus
* trySyncTicker to periodically check if we have fallen behind and then catch-up sync
* if there aren't any new blocks available on the pool it skips syncing
* tries to sync the app by taking downloaded blocks from the pool, gives them to the app and stores
them on disk
* implements Receive which is called by the switch/peer
* calls AddBlock on the pool when it receives a new block from a peer
## Block Pool
* responsible for downloading blocks from peers
* makeRequestersRoutine()
* removes timeout peers
* starts new requesters by calling makeNextRequester()
* requestRoutine():
* picks a peer and sends the request, then blocks until:
* pool is stopped by listening to pool.Quit
* requester is stopped by listening to Quit
* request is redone
* we receive a block
* gotBlockCh is strange
## Block Store
* persists blocks to disk
# TODO
* How does the switch from bcR to conR happen? Does conR persist blocks to disk too?
* What is the interaction between the consensus and blockchain reactors?

View File

@ -0,0 +1,49 @@
# Blockchain Reactor
The Blockchain Reactor's high level responsibility is to enable peers who are
far behind the current state of the consensus to quickly catch up by downloading
many blocks in parallel, verifying their commits, and executing them against the
ABCI application.
Tendermint full nodes run the Blockchain Reactor as a service to provide blocks
to new nodes. New nodes run the Blockchain Reactor in "fast_sync" mode,
where they actively make requests for more blocks until they sync up.
Once caught up, "fast_sync" mode is disabled and the node switches to
using (and turns on) the Consensus Reactor.
## Message Types
```go
const (
msgTypeBlockRequest = byte(0x10)
msgTypeBlockResponse = byte(0x11)
msgTypeNoBlockResponse = byte(0x12)
msgTypeStatusResponse = byte(0x20)
msgTypeStatusRequest = byte(0x21)
)
```
```go
type bcBlockRequestMessage struct {
Height int64
}
type bcNoBlockResponseMessage struct {
Height int64
}
type bcBlockResponseMessage struct {
Block Block
}
type bcStatusRequestMessage struct {
Height int64
type bcStatusResponseMessage struct {
Height int64
}
```
## Protocol
TODO

View File

@ -0,0 +1,344 @@
# Consensus Reactor
Consensus Reactor defines a reactor for the consensus service. It contains the ConsensusState service that
manages the state of the Tendermint consensus internal state machine.
When Consensus Reactor is started, it starts Broadcast Routine which starts ConsensusState service.
Furthermore, for each peer that is added to the Consensus Reactor, it creates (and manages) the known peer state
(that is used extensively in gossip routines) and starts the following three routines for the peer p:
Gossip Data Routine, Gossip Votes Routine and QueryMaj23Routine. Finally, Consensus Reactor is responsible
for decoding messages received from a peer and for adequate processing of the message depending on its type and content.
The processing normally consists of updating the known peer state and for some messages
(`ProposalMessage`, `BlockPartMessage` and `VoteMessage`) also forwarding message to ConsensusState module
for further processing. In the following text we specify the core functionality of those separate unit of executions
that are part of the Consensus Reactor.
## ConsensusState service
Consensus State handles execution of the Tendermint BFT consensus algorithm. It processes votes and proposals,
and upon reaching agreement, commits blocks to the chain and executes them against the application.
The internal state machine receives input from peers, the internal validator and from a timer.
Inside Consensus State we have the following units of execution: Timeout Ticker and Receive Routine.
Timeout Ticker is a timer that schedules timeouts conditional on the height/round/step that are processed
by the Receive Routine.
### Receive Routine of the ConsensusState service
Receive Routine of the ConsensusState handles messages which may cause internal consensus state transitions.
It is the only routine that updates RoundState that contains internal consensus state.
Updates (state transitions) happen on timeouts, complete proposals, and 2/3 majorities.
It receives messages from peers, internal validators and from Timeout Ticker
and invokes the corresponding handlers, potentially updating the RoundState.
The details of the protocol (together with formal proofs of correctness) implemented by the Receive Routine are
discussed in separate document. For understanding of this document
it is sufficient to understand that the Receive Routine manages and updates RoundState data structure that is
then extensively used by the gossip routines to determine what information should be sent to peer processes.
## Round State
RoundState defines the internal consensus state. It contains height, round, round step, a current validator set,
a proposal and proposal block for the current round, locked round and block (if some block is being locked), set of
received votes and last commit and last validators set.
```golang
type RoundState struct {
Height int64
Round int
Step RoundStepType
Validators ValidatorSet
Proposal Proposal
ProposalBlock Block
ProposalBlockParts PartSet
LockedRound int
LockedBlock Block
LockedBlockParts PartSet
Votes HeightVoteSet
LastCommit VoteSet
LastValidators ValidatorSet
}
```
Internally, consensus will run as a state machine with the following states:
- RoundStepNewHeight
- RoundStepNewRound
- RoundStepPropose
- RoundStepProposeWait
- RoundStepPrevote
- RoundStepPrevoteWait
- RoundStepPrecommit
- RoundStepPrecommitWait
- RoundStepCommit
## Peer Round State
Peer round state contains the known state of a peer. It is being updated by the Receive routine of
Consensus Reactor and by the gossip routines upon sending a message to the peer.
```golang
type PeerRoundState struct {
Height int64 // Height peer is at
Round int // Round peer is at, -1 if unknown.
Step RoundStepType // Step peer is at
Proposal bool // True if peer has proposal for this round
ProposalBlockPartsHeader PartSetHeader
ProposalBlockParts BitArray
ProposalPOLRound int // Proposal's POL round. -1 if none.
ProposalPOL BitArray // nil until ProposalPOLMessage received.
Prevotes BitArray // All votes peer has for this round
Precommits BitArray // All precommits peer has for this round
LastCommitRound int // Round of commit for last height. -1 if none.
LastCommit BitArray // All commit precommits of commit for last height.
CatchupCommitRound int // Round that we have commit for. Not necessarily unique. -1 if none.
CatchupCommit BitArray // All commit precommits peer has for this height & CatchupCommitRound
}
```
## Receive method of Consensus reactor
The entry point of the Consensus reactor is a receive method. When a message is received from a peer p,
normally the peer round state is updated correspondingly, and some messages
are passed for further processing, for example to ConsensusState service. We now specify the processing of messages
in the receive method of Consensus reactor for each message type. In the following message handler, `rs` and `prs` denote
`RoundState` and `PeerRoundState`, respectively.
### NewRoundStepMessage handler
```
handleMessage(msg):
if msg is from smaller height/round/step then return
// Just remember these values.
prsHeight = prs.Height
prsRound = prs.Round
prsCatchupCommitRound = prs.CatchupCommitRound
prsCatchupCommit = prs.CatchupCommit
Update prs with values from msg
if prs.Height or prs.Round has been updated then
reset Proposal related fields of the peer state
if prs.Round has been updated and msg.Round == prsCatchupCommitRound then
prs.Precommits = psCatchupCommit
if prs.Height has been updated then
if prsHeight+1 == msg.Height && prsRound == msg.LastCommitRound then
prs.LastCommitRound = msg.LastCommitRound
prs.LastCommit = prs.Precommits
} else {
prs.LastCommitRound = msg.LastCommitRound
prs.LastCommit = nil
}
Reset prs.CatchupCommitRound and prs.CatchupCommit
```
### CommitStepMessage handler
```
handleMessage(msg):
if prs.Height == msg.Height then
prs.ProposalBlockPartsHeader = msg.BlockPartsHeader
prs.ProposalBlockParts = msg.BlockParts
```
### HasVoteMessage handler
```
handleMessage(msg):
if prs.Height == msg.Height then
prs.setHasVote(msg.Height, msg.Round, msg.Type, msg.Index)
```
### VoteSetMaj23Message handler
```
handleMessage(msg):
if prs.Height == msg.Height then
Record in rs that a peer claim to have majority for msg.BlockID
Send VoteSetBitsMessage showing votes node has for that BlockId
```
### ProposalMessage handler
```
handleMessage(msg):
if prs.Height != msg.Height || prs.Round != msg.Round || prs.Proposal then return
prs.Proposal = true
prs.ProposalBlockPartsHeader = msg.BlockPartsHeader
prs.ProposalBlockParts = empty set
prs.ProposalPOLRound = msg.POLRound
prs.ProposalPOL = nil
Send msg through internal peerMsgQueue to ConsensusState service
```
### ProposalPOLMessage handler
```
handleMessage(msg):
if prs.Height != msg.Height or prs.ProposalPOLRound != msg.ProposalPOLRound then return
prs.ProposalPOL = msg.ProposalPOL
```
### BlockPartMessage handler
```
handleMessage(msg):
if prs.Height != msg.Height || prs.Round != msg.Round then return
Record in prs that peer has block part msg.Part.Index
Send msg trough internal peerMsgQueue to ConsensusState service
```
### VoteMessage handler
```
handleMessage(msg):
Record in prs that a peer knows vote with index msg.vote.ValidatorIndex for particular height and round
Send msg trough internal peerMsgQueue to ConsensusState service
```
### VoteSetBitsMessage handler
```
handleMessage(msg):
Update prs for the bit-array of votes peer claims to have for the msg.BlockID
```
## Gossip Data Routine
It is used to send the following messages to the peer: `BlockPartMessage`, `ProposalMessage` and
`ProposalPOLMessage` on the DataChannel. The gossip data routine is based on the local RoundState (`rs`)
and the known PeerRoundState (`prs`). The routine repeats forever the logic shown below:
```
1a) if rs.ProposalBlockPartsHeader == prs.ProposalBlockPartsHeader and the peer does not have all the proposal parts then
Part = pick a random proposal block part the peer does not have
Send BlockPartMessage(rs.Height, rs.Round, Part) to the peer on the DataChannel
if send returns true, record that the peer knows the corresponding block Part
Continue
1b) if (0 < prs.Height) and (prs.Height < rs.Height) then
help peer catch up using gossipDataForCatchup function
Continue
1c) if (rs.Height != prs.Height) or (rs.Round != prs.Round) then
Sleep PeerGossipSleepDuration
Continue
// at this point rs.Height == prs.Height and rs.Round == prs.Round
1d) if (rs.Proposal != nil and !prs.Proposal) then
Send ProposalMessage(rs.Proposal) to the peer
if send returns true, record that the peer knows Proposal
if 0 <= rs.Proposal.POLRound then
polRound = rs.Proposal.POLRound
prevotesBitArray = rs.Votes.Prevotes(polRound).BitArray()
Send ProposalPOLMessage(rs.Height, polRound, prevotesBitArray)
Continue
2) Sleep PeerGossipSleepDuration
```
### Gossip Data For Catchup
This function is responsible for helping peer catch up if it is at the smaller height (prs.Height < rs.Height).
The function executes the following logic:
if peer does not have all block parts for prs.ProposalBlockPart then
blockMeta = Load Block Metadata for height prs.Height from blockStore
if (!blockMeta.BlockID.PartsHeader == prs.ProposalBlockPartsHeader) then
Sleep PeerGossipSleepDuration
return
Part = pick a random proposal block part the peer does not have
Send BlockPartMessage(prs.Height, prs.Round, Part) to the peer on the DataChannel
if send returns true, record that the peer knows the corresponding block Part
return
else Sleep PeerGossipSleepDuration
## Gossip Votes Routine
It is used to send the following message: `VoteMessage` on the VoteChannel.
The gossip votes routine is based on the local RoundState (`rs`)
and the known PeerRoundState (`prs`). The routine repeats forever the logic shown below:
```
1a) if rs.Height == prs.Height then
if prs.Step == RoundStepNewHeight then
vote = random vote from rs.LastCommit the peer does not have
Send VoteMessage(vote) to the peer
if send returns true, continue
if prs.Step <= RoundStepPrevote and prs.Round != -1 and prs.Round <= rs.Round then
Prevotes = rs.Votes.Prevotes(prs.Round)
vote = random vote from Prevotes the peer does not have
Send VoteMessage(vote) to the peer
if send returns true, continue
if prs.Step <= RoundStepPrecommit and prs.Round != -1 and prs.Round <= rs.Round then
Precommits = rs.Votes.Precommits(prs.Round)
vote = random vote from Precommits the peer does not have
Send VoteMessage(vote) to the peer
if send returns true, continue
if prs.ProposalPOLRound != -1 then
PolPrevotes = rs.Votes.Prevotes(prs.ProposalPOLRound)
vote = random vote from PolPrevotes the peer does not have
Send VoteMessage(vote) to the peer
if send returns true, continue
1b) if prs.Height != 0 and rs.Height == prs.Height+1 then
vote = random vote from rs.LastCommit peer does not have
Send VoteMessage(vote) to the peer
if send returns true, continue
1c) if prs.Height != 0 and rs.Height >= prs.Height+2 then
Commit = get commit from BlockStore for prs.Height
vote = random vote from Commit the peer does not have
Send VoteMessage(vote) to the peer
if send returns true, continue
2) Sleep PeerGossipSleepDuration
```
## QueryMaj23Routine
It is used to send the following message: `VoteSetMaj23Message`. `VoteSetMaj23Message` is sent to indicate that a given
BlockID has seen +2/3 votes. This routine is based on the local RoundState (`rs`) and the known PeerRoundState
(`prs`). The routine repeats forever the logic shown below.
```
1a) if rs.Height == prs.Height then
Prevotes = rs.Votes.Prevotes(prs.Round)
if there is a ⅔ majority for some blockId in Prevotes then
m = VoteSetMaj23Message(prs.Height, prs.Round, Prevote, blockId)
Send m to peer
Sleep PeerQueryMaj23SleepDuration
1b) if rs.Height == prs.Height then
Precommits = rs.Votes.Precommits(prs.Round)
if there is a ⅔ majority for some blockId in Precommits then
m = VoteSetMaj23Message(prs.Height,prs.Round,Precommit,blockId)
Send m to peer
Sleep PeerQueryMaj23SleepDuration
1c) if rs.Height == prs.Height and prs.ProposalPOLRound >= 0 then
Prevotes = rs.Votes.Prevotes(prs.ProposalPOLRound)
if there is a ⅔ majority for some blockId in Prevotes then
m = VoteSetMaj23Message(prs.Height,prs.ProposalPOLRound,Prevotes,blockId)
Send m to peer
Sleep PeerQueryMaj23SleepDuration
1d) if prs.CatchupCommitRound != -1 and 0 < prs.Height and
prs.Height <= blockStore.Height() then
Commit = LoadCommit(prs.Height)
m = VoteSetMaj23Message(prs.Height,Commit.Round,Precommit,Commit.blockId)
Send m to peer
Sleep PeerQueryMaj23SleepDuration
2) Sleep PeerQueryMaj23SleepDuration
```
## Broadcast routine
The Broadcast routine subscribes to an internal event bus to receive new round steps, votes messages and proposal
heartbeat messages, and broadcasts messages to peers upon receiving those events.
It broadcasts `NewRoundStepMessage` or `CommitStepMessage` upon new round state event. Note that
broadcasting these messages does not depend on the PeerRoundState; it is sent on the StateChannel.
Upon receiving VoteMessage it broadcasts `HasVoteMessage` message to its peers on the StateChannel.
`ProposalHeartbeatMessage` is sent the same way on the StateChannel.

View File

@ -0,0 +1,212 @@
# Tendermint Consensus Reactor
Tendermint Consensus is a distributed protocol executed by validator processes to agree on
the next block to be added to the Tendermint blockchain. The protocol proceeds in rounds, where
each round is a try to reach agreement on the next block. A round starts by having a dedicated
process (called proposer) suggesting to other processes what should be the next block with
the `ProposalMessage`.
The processes respond by voting for a block with `VoteMessage` (there are two kinds of vote
messages, prevote and precommit votes). Note that a proposal message is just a suggestion what the
next block should be; a validator might vote with a `VoteMessage` for a different block. If in some
round, enough number of processes vote for the same block, then this block is committed and later
added to the blockchain. `ProposalMessage` and `VoteMessage` are signed by the private key of the
validator. The internals of the protocol and how it ensures safety and liveness properties are
explained in a forthcoming document.
For efficiency reasons, validators in Tendermint consensus protocol do not agree directly on the
block as the block size is big, i.e., they don't embed the block inside `Proposal` and
`VoteMessage`. Instead, they reach agreement on the `BlockID` (see `BlockID` definition in
[Blockchain](blockchain.md) section) that uniquely identifies each block. The block itself is
disseminated to validator processes using peer-to-peer gossiping protocol. It starts by having a
proposer first splitting a block into a number of block parts, that are then gossiped between
processes using `BlockPartMessage`.
Validators in Tendermint communicate by peer-to-peer gossiping protocol. Each validator is connected
only to a subset of processes called peers. By the gossiping protocol, a validator send to its peers
all needed information (`ProposalMessage`, `VoteMessage` and `BlockPartMessage`) so they can
reach agreement on some block, and also obtain the content of the chosen block (block parts). As
part of the gossiping protocol, processes also send auxiliary messages that inform peers about the
executed steps of the core consensus algorithm (`NewRoundStepMessage` and `CommitStepMessage`), and
also messages that inform peers what votes the process has seen (`HasVoteMessage`,
`VoteSetMaj23Message` and `VoteSetBitsMessage`). These messages are then used in the gossiping
protocol to determine what messages a process should send to its peers.
We now describe the content of each message exchanged during Tendermint consensus protocol.
## ProposalMessage
ProposalMessage is sent when a new block is proposed. It is a suggestion of what the
next block in the blockchain should be.
```go
type ProposalMessage struct {
Proposal Proposal
}
```
### Proposal
Proposal contains height and round for which this proposal is made, BlockID as a unique identifier
of proposed block, timestamp, and two fields (POLRound and POLBlockID) that are needed for
termination of the consensus. The message is signed by the validator private key.
```go
type Proposal struct {
Height int64
Round int
Timestamp Time
BlockID BlockID
POLRound int
POLBlockID BlockID
Signature Signature
}
```
NOTE: In the current version of the Tendermint, the consensus value in proposal is represented with
PartSetHeader, and with BlockID in vote message. It should be aligned as suggested in this spec as
BlockID contains PartSetHeader.
## VoteMessage
VoteMessage is sent to vote for some block (or to inform others that a process does not vote in the
current round). Vote is defined in [Blockchain](blockchain.md) section and contains validator's
information (validator address and index), height and round for which the vote is sent, vote type,
blockID if process vote for some block (`nil` otherwise) and a timestamp when the vote is sent. The
message is signed by the validator private key.
```go
type VoteMessage struct {
Vote Vote
}
```
## BlockPartMessage
BlockPartMessage is sent when gossipping a piece of the proposed block. It contains height, round
and the block part.
```go
type BlockPartMessage struct {
Height int64
Round int
Part Part
}
```
## ProposalHeartbeatMessage
ProposalHeartbeatMessage is sent to signal that a node is alive and waiting for transactions
to be able to create a next block proposal.
```go
type ProposalHeartbeatMessage struct {
Heartbeat Heartbeat
}
```
### Heartbeat
Heartbeat contains validator information (address and index),
height, round and sequence number. It is signed by the private key of the validator.
```go
type Heartbeat struct {
ValidatorAddress []byte
ValidatorIndex int
Height int64
Round int
Sequence int
Signature Signature
}
```
## NewRoundStepMessage
NewRoundStepMessage is sent for every step transition during the core consensus algorithm execution.
It is used in the gossip part of the Tendermint protocol to inform peers about a current
height/round/step a process is in.
```go
type NewRoundStepMessage struct {
Height int64
Round int
Step RoundStepType
SecondsSinceStartTime int
LastCommitRound int
}
```
## CommitStepMessage
CommitStepMessage is sent when an agreement on some block is reached. It contains height for which
agreement is reached, block parts header that describes the decided block and is used to obtain all
block parts, and a bit array of the block parts a process currently has, so its peers can know what
parts it is missing so they can send them.
```go
type CommitStepMessage struct {
Height int64
BlockID BlockID
BlockParts BitArray
}
```
TODO: We use BlockID instead of BlockPartsHeader (in current implementation) for symmetry.
## ProposalPOLMessage
ProposalPOLMessage is sent when a previous block is re-proposed.
It is used to inform peers in what round the process learned for this block (ProposalPOLRound),
and what prevotes for the re-proposed block the process has.
```go
type ProposalPOLMessage struct {
Height int64
ProposalPOLRound int
ProposalPOL BitArray
}
```
## HasVoteMessage
HasVoteMessage is sent to indicate that a particular vote has been received. It contains height,
round, vote type and the index of the validator that is the originator of the corresponding vote.
```go
type HasVoteMessage struct {
Height int64
Round int
Type byte
Index int
}
```
## VoteSetMaj23Message
VoteSetMaj23Message is sent to indicate that a process has seen +2/3 votes for some BlockID.
It contains height, round, vote type and the BlockID.
```go
type VoteSetMaj23Message struct {
Height int64
Round int
Type byte
BlockID BlockID
}
```
## VoteSetBitsMessage
VoteSetBitsMessage is sent to communicate the bit-array of votes a process has seen for a given
BlockID. It contains height, round, vote type, BlockID and a bit array of
the votes a process has.
```go
type VoteSetBitsMessage struct {
Height int64
Round int
Type byte
BlockID BlockID
Votes BitArray
}
```

View File

@ -0,0 +1,47 @@
# Proposer selection procedure in Tendermint
This document specifies the Proposer Selection Procedure that is used in Tendermint to choose a round proposer.
As Tendermint is “leader-based protocol”, the proposer selection is critical for its correct functioning.
Let denote with `proposer_p(h,r)` a process returned by the Proposer Selection Procedure at the process p, at height h
and round r. Then the Proposer Selection procedure should fulfill the following properties:
`Agreement`: Given a validator set V, and two honest validators,
p and q, for each height h, and each round r,
proposer_p(h,r) = proposer_q(h,r)
`Liveness`: In every consecutive sequence of rounds of size K (K is system parameter), at least a
single round has an honest proposer.
`Fairness`: The proposer selection is proportional to the validator voting power, i.e., a validator with more
voting power is selected more frequently, proportional to its power. More precisely, given a set of processes
with the total voting power N, during a sequence of rounds of size N, every process is proposer in a number of rounds
equal to its voting power.
We now look at a few particular cases to understand better how fairness should be implemented.
If we have 4 processes with the following voting power distribution (p0,4), (p1, 2), (p2, 2), (p3, 2) at some round r,
we have the following sequence of proposer selections in the following rounds:
`p0, p1, p2, p3, p0, p0, p1, p2, p3, p0, p0, p1, p2, p3, p0, p0, p1, p2, p3, p0, etc`
Let consider now the following scenario where a total voting power of faulty processes is aggregated in a single process
p0: (p0,3), (p1, 1), (p2, 1), (p3, 1), (p4, 1), (p5, 1), (p6, 1), (p7, 1).
In this case the sequence of proposer selections looks like this:
`p0, p1, p2, p3, p0, p4, p5, p6, p7, p0, p0, p1, p2, p3, p0, p4, p5, p6, p7, p0, etc`
In this case, we see that a number of rounds coordinated by a faulty process is proportional to its voting power.
We consider also the case where we have voting power uniformly distributed among processes, i.e., we have 10 processes
each with voting power of 1. And let consider that there are 3 faulty processes with consecutive addresses,
for example the first 3 processes are faulty. Then the sequence looks like this:
`p0, p1, p2, p3, p4, p5, p6, p7, p8, p9, p0, p1, p2, p3, p4, p5, p6, p7, p8, p9, etc`
In this case, we have 3 consecutive rounds with a faulty proposer.
One special case we consider is the case where a single honest process p0 has most of the voting power, for example:
(p0,100), (p1, 2), (p2, 3), (p3, 4). Then the sequence of proposer selection looks like this:
p0, p0, p0, p0, p0, p0, p0, p0, p0, p0, p0, p0, p0, p1, p0, p0, p0, p0, p0, etc
This basically means that almost all rounds have the same proposer. But in this case, the process p0 has anyway enough
voting power to decide whatever he wants, so the fact that he coordinates almost all rounds seems correct.

View File

@ -0,0 +1,11 @@
# Mempool Specification
This package contains documents specifying the functionality
of the mempool module.
Components:
* [Config](./config.md) - how to configure it
* [External Messages](./messages.md) - The messages we accept over p2p and rpc interfaces
* [Functionality](./functionality.md) - high-level description of the functionality it provides
* [Concurrency Model](./concurrency.md) - What guarantees we provide, what locks we require.

View File

@ -0,0 +1,8 @@
# Mempool Concurrency
Look at the concurrency model this uses...
* Receiving CheckTx
* Broadcasting new tx
* Interfaces with consensus engine, reap/update while checking
* Calling the ABCI app (ordering. callbacks. how proxy works alongside the blockchain proxy which actually writes blocks)

View File

@ -0,0 +1,59 @@
# Mempool Configuration
Here we describe configuration options around mempool.
For the purposes of this document, they are described
as command-line flags, but they can also be passed in as
environmental variables or in the config.toml file. The
following are all equivalent:
Flag: `--mempool.recheck_empty=false`
Environment: `TM_MEMPOOL_RECHECK_EMPTY=false`
Config:
```
[mempool]
recheck_empty = false
```
## Recheck
`--mempool.recheck=false` (default: true)
`--mempool.recheck_empty=false` (default: true)
Recheck determines if the mempool rechecks all pending
transactions after a block was committed. Once a block
is committed, the mempool removes all valid transactions
that were successfully included in the block.
If `recheck` is true, then it will rerun CheckTx on
all remaining transactions with the new block state.
If the block contained no transactions, it will skip the
recheck unless `recheck_empty` is true.
## Broadcast
`--mempool.broadcast=false` (default: true)
Determines whether this node gossips any valid transactions
that arrive in mempool. Default is to gossip anything that
passes checktx. If this is disabled, transactions are not
gossiped, but instead stored locally and added to the next
block this node is the proposer.
## WalDir
`--mempool.wal_dir=/tmp/gaia/mempool.wal` (default: $TM_HOME/data/mempool.wal)
This defines the directory where mempool writes the write-ahead
logs. These files can be used to reload unbroadcasted
transactions if the node crashes.
If the directory passed in is an absolute path, the wal file is
created there. If the directory is a relative path, the path is
appended to home directory of the tendermint process to
generate an absolute path to the wal directory
(default `$HOME/.tendermint` or set via `TM_HOME` or `--home``)

View File

@ -0,0 +1,37 @@
# Mempool Functionality
The mempool maintains a list of potentially valid transactions,
both to broadcast to other nodes, as well as to provide to the
consensus reactor when it is selected as the block proposer.
There are two sides to the mempool state:
* External: get, check, and broadcast new transactions
* Internal: return valid transaction, update list after block commit
## External functionality
External functionality is exposed via network interfaces
to potentially untrusted actors.
* CheckTx - triggered via RPC or P2P
* Broadcast - gossip messages after a successful check
## Internal functionality
Internal functionality is exposed via method calls to other
code compiled into the tendermint binary.
* Reap - get tx to propose in next block
* Update - remove tx that were included in last block
* ABCI.CheckTx - call ABCI app to validate the tx
What does it provide the consensus reactor?
What guarantees does it need from the ABCI app?
(talk about interleaving processes in concurrency)
## Optimizations
Talk about the LRU cache to make sure we don't process any
tx that we have seen before

View File

@ -0,0 +1,60 @@
# Mempool Messages
## P2P Messages
There is currently only one message that Mempool broadcasts
and receives over the p2p gossip network (via the reactor):
`TxMessage`
```go
// TxMessage is a MempoolMessage containing a transaction.
type TxMessage struct {
Tx types.Tx
}
```
TxMessage is go-wire encoded and prepended with `0x1` as a
"type byte". This is followed by a go-wire encoded byte-slice.
Prefix of 40=0x28 byte tx is: `0x010128...` followed by
the actual 40-byte tx. Prefix of 350=0x015e byte tx is:
`0x0102015e...` followed by the actual 350 byte tx.
(Please see the [go-wire repo](https://github.com/tendermint/go-wire#an-interface-example) for more information)
## RPC Messages
Mempool exposes `CheckTx([]byte)` over the RPC interface.
It can be posted via `broadcast_commit`, `broadcast_sync` or
`broadcast_async`. They all parse a message with one argument,
`"tx": "HEX_ENCODED_BINARY"` and differ in only how long they
wait before returning (sync makes sure CheckTx passes, commit
makes sure it was included in a signed block).
Request (`POST http://gaia.zone:46657/`):
```json
{
"id": "",
"jsonrpc": "2.0",
"method": "broadcast_sync",
"params": {
"tx": "F012A4BC68..."
}
}
```
Response:
```json
{
"error": "",
"result": {
"hash": "E39AAB7A537ABAA237831742DCE1117F187C3C52",
"log": "",
"data": "",
"code": 0
},
"id": "",
"jsonrpc": "2.0"
}
```

View File

@ -0,0 +1,123 @@
# Peer Strategy and Exchange
Here we outline the design of the AddressBook
and how it used by the Peer Exchange Reactor (PEX) to ensure we are connected
to good peers and to gossip peers to others.
## Peer Types
Certain peers are special in that they are specified by the user as `persistent`,
which means we auto-redial them if the connection fails, or if we fail to dial
them.
Some peers can be marked as `private`, which means
we will not put them in the address book or gossip them to others.
All peers except private peers are tracked using the address book.
## Discovery
Peer discovery begins with a list of seeds.
When we have no peers, or have been unable to find enough peers from existing ones,
we dial a randomly selected seed to get a list of peers to dial.
On startup, we will also immediately dial the given list of `persistent_peers`,
and will attempt to maintain persistent connections with them. If the connections die, or we fail to dial,
we will redial every 5s for a few minutes, then switch to an exponential backoff schedule,
and after about a day of trying, stop dialing the peer.
So long as we have less than `MinNumOutboundPeers`, we periodically request additional peers
from each of our own. If sufficient time goes by and we still can't find enough peers,
we try the seeds again.
## Listening
Peers listen on a configurable ListenAddr that they self-report in their
NodeInfo during handshakes with other peers. Peers accept up to (MaxNumPeers -
MinNumOutboundPeers) incoming peers.
## Address Book
Peers are tracked via their ID (their PubKey.Address()).
Peers are added to the address book from the PEX when they first connect to us or
when we hear about them from other peers.
The address book is arranged in sets of buckets, and distinguishes between
vetted (old) and unvetted (new) peers. It keeps different sets of buckets for vetted and
unvetted peers. Buckets provide randomization over peer selection. Peers are put
in buckets according to their IP groups.
A vetted peer can only be in one bucket. An unvetted peer can be in multiple buckets, and
each instance of the peer can have a different IP:PORT.
If we're trying to add a new peer but there's no space in its bucket, we'll
remove the worst peer from that bucket to make room.
## Vetting
When a peer is first added, it is unvetted.
Marking a peer as vetted is outside the scope of the `p2p` package.
For Tendermint, a Peer becomes vetted once it has contributed sufficiently
at the consensus layer; ie. once it has sent us valid and not-yet-known
votes and/or block parts for `NumBlocksForVetted` blocks.
Other users of the p2p package can determine their own conditions for when a peer is marked vetted.
If a peer becomes vetted but there are already too many vetted peers,
a randomly selected one of the vetted peers becomes unvetted.
If a peer becomes unvetted (either a new peer, or one that was previously vetted),
a randomly selected one of the unvetted peers is removed from the address book.
More fine-grained tracking of peer behaviour can be done using
a trust metric (see below), but it's best to start with something simple.
## Select Peers to Dial
When we need more peers, we pick them randomly from the addrbook with some
configurable bias for unvetted peers. The bias should be lower when we have fewer peers
and can increase as we obtain more, ensuring that our first peers are more trustworthy,
but always giving us the chance to discover new good peers.
We track the last time we dialed a peer and the number of unsuccessful attempts
we've made. If too many attempts are made, we mark the peer as bad.
Connection attempts are made with exponential backoff (plus jitter). Because
the selection process happens every `ensurePeersPeriod`, we might not end up
dialing a peer for much longer than the backoff duration.
If we fail to connect to the peer after 16 tries (with exponential backoff), we remove from address book completely.
## Select Peers to Exchange
When were asked for peers, we select them as follows:
- select at most `maxGetSelection` peers
- try to select at least `minGetSelection` peers - if we have less than that, select them all.
- select a random, unbiased `getSelectionPercent` of the peers
Send the selected peers. Note we select peers for sending without bias for vetted/unvetted.
## Preventing Spam
There are various cases where we decide a peer has misbehaved and we disconnect from them.
When this happens, the peer is removed from the address book and black listed for
some amount of time. We call this "Disconnect and Mark".
Note that the bad behaviour may be detected outside the PEX reactor itself
(for instance, in the mconnection, or another reactor), but it must be communicated to the PEX reactor
so it can remove and mark the peer.
In the PEX, if a peer sends us an unsolicited list of peers,
or if the peer sends a request too soon after another one,
we Disconnect and MarkBad.
## Trust Metric
The quality of peers can be tracked in more fine-grained detail using a
Proportional-Integral-Derivative (PID) controller that incorporates
current, past, and rate-of-change data to inform peer quality.
While a PID trust metric has been implemented, it remains for future work
to use it in the PEX.
See the [trustmetric](../../../architecture/adr-006-trust-metric.md )
and [trustmetric useage](../../../architecture/adr-007-trust-metric-usage.md )
architecture docs for more details.