Merge branch 'develop' into config

This commit is contained in:
Zach
2018-01-06 05:30:38 +00:00
committed by GitHub
51 changed files with 2152 additions and 1328 deletions

View File

@@ -9,6 +9,7 @@ It contains the following components:
- [Encoding and Digests](encoding.md)
- [Blockchain](blockchain.md)
- [State](state.md)
- [P2P](p2p/node.md)
## Overview

View File

@@ -0,0 +1,197 @@
# Tendermint Consensus
Tendermint consensus is a distributed protocol executed by validator processes to agree on
the next block to be added to the Tendermint blockchain. The protocol proceeds in rounds, where
each round is a try to reach agreement on the next block. A round starts by having a dedicated
process (called proposer) suggesting to other processes what should be the next block with
the `ProposalMessage`.
The processes respond by voting for a block with `VoteMessage` (there are two kinds of vote messages, prevote
and precommit votes). Note that a proposal message is just a suggestion what the next block should be; a
validator might vote with a `VoteMessage` for a different block. If in some round, enough number
of processes vote for the same block, then this block is committed and later added to the blockchain.
`ProposalMessage` and `VoteMessage` are signed by the private key of the validator.
The internals of the protocol and how it ensures safety and liveness properties are
explained [here](https://github.com/tendermint/spec).
For efficiency reasons, validators in Tendermint consensus protocol do not agree directly on the block
as the block size is big, i.e., they don't embed the block inside `Proposal` and `VoteMessage`. Instead, they
reach agreement on the `BlockID` (see `BlockID` definition in [Blockchain](blockchain.md) section)
that uniquely identifies each block. The block itself is disseminated to validator processes using
peer-to-peer gossiping protocol. It starts by having a proposer first splitting a block into a
number of block parts, that are then gossiped between processes using `BlockPartMessage`.
Validators in Tendermint communicate by peer-to-peer gossiping protocol. Each validator is connected
only to a subset of processes called peers. By the gossiping protocol, a validator send to its peers
all needed information (`ProposalMessage`, `VoteMessage` and `BlockPartMessage`) so they can
reach agreement on some block, and also obtain the content of the chosen block (block parts). As part of
the gossiping protocol, processes also send auxiliary messages that inform peers about the
executed steps of the core consensus algorithm (`NewRoundStepMessage` and `CommitStepMessage`), and
also messages that inform peers what votes the process has seen (`HasVoteMessage`,
`VoteSetMaj23Message` and `VoteSetBitsMessage`). These messages are then used in the gossiping protocol
to determine what messages a process should send to its peers.
We now describe the content of each message exchanged during Tendermint consensus protocol.
## ProposalMessage
ProposalMessage is sent when a new block is proposed. It is a suggestion of what the
next block in the blockchain should be.
```
type ProposalMessage struct {
Proposal Proposal
}
```
### Proposal
Proposal contains height and round for which this proposal is made, BlockID as a unique identifier of proposed
block, timestamp, and two fields (POLRound and POLBlockID) that are needed for termination of the consensus.
The message is signed by the validator private key.
```
type Proposal struct {
Height int64
Round int
Timestamp Time
BlockID BlockID
POLRound int
POLBlockID BlockID
Signature Signature
}
```
NOTE: In the current version of the Tendermint, the consensus value in proposal is represented with
PartSetHeader, and with BlockID in vote message. It should be aligned as suggested in this spec as
BlockID contains PartSetHeader.
## VoteMessage
VoteMessage is sent to vote for some block (or to inform others that a process does not vote in the current round).
Vote is defined in [Blockchain](blockchain.md) section and contains validator's information (validator address
and index), height and round for which the vote is sent, vote type, blockID if process vote for some
block (`nil` otherwise) and a timestamp when the vote is sent. The message is signed by the validator private key.
```
type VoteMessage struct {
Vote Vote
}
```
## BlockPartMessage
BlockPartMessage is sent when gossipping a piece of the proposed block. It contains height, round
and the block part.
```
type BlockPartMessage struct {
Height int64
Round int
Part Part
}
```
## ProposalHeartbeatMessage
ProposalHeartbeatMessage is sent to signal that a node is alive and waiting for transactions
to be able to create a next block proposal.
```
type ProposalHeartbeatMessage struct {
Heartbeat Heartbeat
}
```
### Heartbeat
Heartbeat contains validator information (address and index),
height, round and sequence number. It is signed by the private key of the validator.
```
type Heartbeat struct {
ValidatorAddress []byte
ValidatorIndex int
Height int64
Round int
Sequence int
Signature Signature
}
```
## NewRoundStepMessage
NewRoundStepMessage is sent for every step transition during the core consensus algorithm execution. It is
used in the gossip part of the Tendermint protocol to inform peers about a current height/round/step
a process is in.
```
type NewRoundStepMessage struct {
Height int64
Round int
Step RoundStepType
SecondsSinceStartTime int
LastCommitRound int
}
```
## CommitStepMessage
CommitStepMessage is sent when an agreement on some block is reached. It contains height for which agreement
is reached, block parts header that describes the decided block and is used to obtain all block parts,
and a bit array of the block parts a process currently has, so its peers can know what parts
it is missing so they can send them.
```
type CommitStepMessage struct {
Height int64
BlockID BlockID
BlockParts BitArray
}
```
TODO: We use BlockID instead of BlockPartsHeader (in current implementation) for symmetry.
## ProposalPOLMessage
ProposalPOLMessage is sent when a previous block is re-proposed.
It is used to inform peers in what round the process learned for this block (ProposalPOLRound),
and what prevotes for the re-proposed block the process has.
```
type ProposalPOLMessage struct {
Height int64
ProposalPOLRound int
ProposalPOL BitArray
}
```
## HasVoteMessage
HasVoteMessage is sent to indicate that a particular vote has been received. It contains height,
round, vote type and the index of the validator that is the originator of the corresponding vote.
```
type HasVoteMessage struct {
Height int64
Round int
Type byte
Index int
}
```
## VoteSetMaj23Message
VoteSetMaj23Message is sent to indicate that a process has seen +2/3 votes for some BlockID.
It contains height, round, vote type and the BlockID.
```
type VoteSetMaj23Message struct {
Height int64
Round int
Type byte
BlockID BlockID
}
```
## VoteSetBitsMessage
VoteSetBitsMessage is sent to communicate the bit-array of votes a process has seen for a given
BlockID. It contains height, round, vote type, BlockID and a bit array of
the votes a process has.
```
type VoteSetBitsMessage struct {
Height int64
Round int
Type byte
BlockID BlockID
Votes BitArray
}
```

View File

@@ -93,6 +93,17 @@ encode([]int{1, 2, 3, 4}) == [0x01, 0x04, 0x01, 0x01, 0x01, 0x02, 0x01, 0x
encode([]string{"abc", "efg"}) == [0x01, 0x02, 0x01, 0x03, 0x61, 0x62, 0x63, 0x01, 0x03, 0x65, 0x66, 0x67]
```
### BitArray
BitArray is encoded as an `int` of the number of bits, and with an array of `uint64` to encode
value of each array element.
```
type BitArray struct {
Bits int
Elems []uint64
}
```
### Time
Time is encoded as an `int64` of the number of nanoseconds since January 1, 1970,
@@ -176,3 +187,13 @@ TMBIN encode an object and slice it into parts.
```
MakeParts(object, partSize)
```
### Part
```
type Part struct {
Index int
Bytes byte[]
Proof byte[]
}
```

View File

@@ -0,0 +1,39 @@
# P2P Config
Here we describe configuration options around the Peer Exchange.
## Seed Mode
`--p2p.seed_mode`
The node operates in seed mode. It will kick incoming peers after sharing some peers.
It will continually crawl the network for peers.
## Seeds
`--p2p.seeds “1.2.3.4:466656,2.3.4.5:4444”`
Dials these seeds when we need more peers. They will return a list of peers and then disconnect.
If we already have enough peers in the address book, we may never need to dial them.
## Persistent Peers
`--p2p.persistent_peers “1.2.3.4:46656,2.3.4.5:466656”`
Dial these peers and auto-redial them if the connection fails.
These are intended to be trusted persistent peers that can help
anchor us in the p2p network.
Note that the auto-redial uses exponential backoff and will give up
after a day of trying to connect.
NOTE: If `dial_seeds` and `persistent_peers` intersect,
the user will be WARNED that seeds may auto-close connections
and the node may not be able to keep the connection persistent.
## Private Persistent Peers
`--p2p.private_persistent_peers “1.2.3.4:46656,2.3.4.5:466656”`
These are persistent peers that we do not add to the address book or
gossip to other peers. They stay private to us.

View File

@@ -0,0 +1,116 @@
## MConnection
`MConnection` is a multiplex connection:
__multiplex__ *noun* a system or signal involving simultaneous transmission of
several messages along a single channel of communication.
Each `MConnection` handles message transmission on multiple abstract communication
`Channel`s. Each channel has a globally unique byte id.
The byte id and the relative priorities of each `Channel` are configured upon
initialization of the connection.
The `MConnection` supports three packet types: Ping, Pong, and Msg.
### Ping and Pong
The ping and pong messages consist of writing a single byte to the connection; 0x1 and 0x2, respectively
When we haven't received any messages on an `MConnection` in a time `pingTimeout`, we send a ping message.
When a ping is received on the `MConnection`, a pong is sent in response.
If a pong is not received in sufficient time, the peer's score should be decremented (TODO).
### Msg
Messages in channels are chopped into smaller msgPackets for multiplexing.
```
type msgPacket struct {
ChannelID byte
EOF byte // 1 means message ends here.
Bytes []byte
}
```
The msgPacket is serialized using go-wire, and prefixed with a 0x3.
The received `Bytes` of a sequential set of packets are appended together
until a packet with `EOF=1` is received, at which point the complete serialized message
is returned for processing by the corresponding channels `onReceive` function.
### Multiplexing
Messages are sent from a single `sendRoutine`, which loops over a select statement that results in the sending
of a ping, a pong, or a batch of data messages. The batch of data messages may include messages from multiple channels.
Message bytes are queued for sending in their respective channel, with each channel holding one unsent message at a time.
Messages are chosen for a batch one a time from the channel with the lowest ratio of recently sent bytes to channel priority.
## Sending Messages
There are two methods for sending messages:
```go
func (m MConnection) Send(chID byte, msg interface{}) bool {}
func (m MConnection) TrySend(chID byte, msg interface{}) bool {}
```
`Send(chID, msg)` is a blocking call that waits until `msg` is successfully queued
for the channel with the given id byte `chID`. The message `msg` is serialized
using the `tendermint/wire` submodule's `WriteBinary()` reflection routine.
`TrySend(chID, msg)` is a nonblocking call that returns false if the channel's
queue is full.
`Send()` and `TrySend()` are also exposed for each `Peer`.
## Peer
Each peer has one `MConnection` instance, and includes other information such as whether the connection
was outbound, whether the connection should be recreated if it closes, various identity information about the node,
and other higher level thread-safe data used by the reactors.
## Switch/Reactor
The `Switch` handles peer connections and exposes an API to receive incoming messages
on `Reactors`. Each `Reactor` is responsible for handling incoming messages of one
or more `Channels`. So while sending outgoing messages is typically performed on the peer,
incoming messages are received on the reactor.
```go
// Declare a MyReactor reactor that handles messages on MyChannelID.
type MyReactor struct{}
func (reactor MyReactor) GetChannels() []*ChannelDescriptor {
return []*ChannelDescriptor{ChannelDescriptor{ID:MyChannelID, Priority: 1}}
}
func (reactor MyReactor) Receive(chID byte, peer *Peer, msgBytes []byte) {
r, n, err := bytes.NewBuffer(msgBytes), new(int64), new(error)
msgString := ReadString(r, n, err)
fmt.Println(msgString)
}
// Other Reactor methods omitted for brevity
...
switch := NewSwitch([]Reactor{MyReactor{}})
...
// Send a random message to all outbound connections
for _, peer := range switch.Peers().List() {
if peer.IsOutbound() {
peer.Send(MyChannelID, "Here's a random message")
}
}
```
### PexReactor/AddrBook
A `PEXReactor` reactor implementation is provided to automate peer discovery.
```go
book := p2p.NewAddrBook(addrBookFilePath)
pexReactor := p2p.NewPEXReactor(book)
...
switch := NewSwitch([]Reactor{pexReactor, myReactor, ...})
```

View File

@@ -0,0 +1,62 @@
# Tendermint Peer Discovery
A Tendermint P2P network has different kinds of nodes with different requirements for connectivity to others.
This document describes what kind of nodes Tendermint should enable and how they should work.
## Seeds
Seeds are the first point of contact for a new node.
They return a list of known active peers and disconnect.
Seeds should operate full nodes, and with the PEX reactor in a "crawler" mode
that continuously explores to validate the availability of peers.
Seeds should only respond with some top percentile of the best peers it knows about.
See [reputation] for details on peer quality.
## New Full Node
A new node needs a few things to connect to the network:
- a list of seeds, which can be provided to Tendermint via config file or flags,
or hardcoded into the software by in-process apps
- a `ChainID`, also called `Network` at the p2p layer
- a recent block height, H, and hash, HASH for the blockchain.
The values `H` and `HASH` must be received and corroborated by means external to Tendermint, and specific to the user - ie. via the user's trusted social consensus.
This requirement to validate `H` and `HASH` out-of-band and via social consensus
is the essential difference in security models between Proof-of-Work and Proof-of-Stake blockchains.
With the above, the node then queries some seeds for peers for its chain,
dials those peers, and runs the Tendermint protocols with those it successfully connects to.
When the peer catches up to height H, it ensures the block hash matches HASH.
If not, Tendermint will exit, and the user must try again - either they are connected
to bad peers or their social consensus was invalidated.
## Restarted Full Node
A node checks its address book on startup and attempts to connect to peers from there.
If it can't connect to any peers after some time, it falls back to the seeds to find more.
Restarted full nodes can run the `blockchain` or `consensus` reactor protocols to sync up
to the latest state of the blockchain, assuming they aren't too far behind.
If they are too far behind, they may need to validate a recent `H` and `HASH` out-of-band again.
## Validator Node
A validator node is a node that interfaces with a validator signing key.
These nodes require the highest security, and should not accept incoming connections.
They should maintain outgoing connections to a controlled set of "Sentry Nodes" that serve
as their proxy shield to the rest of the network.
Validators that know and trust each other can accept incoming connections from one another and maintain direct private connectivity via VPN.
## Sentry Node
Sentry nodes are guardians of a validator node and provide it access to the rest of the network.
Sentry nodes may be dynamic, but should maintain persistent connections to some evolving random subset of each other.
They should always expect to have direct incoming connections from the validator node and its backup/s.
They do not report the validator node's address in the PEX.
They may be more strict about the quality of peers they keep.
Sentry nodes belonging to validators that trust each other may wish to maintain persistent connections via VPN with one another, but only report each other sparingly in the PEX.

View File

@@ -0,0 +1,118 @@
# Tendermint Peers
This document explains how Tendermint Peers are identified, how they connect to one another,
and how other peers are found.
## Peer Identity
Tendermint peers are expected to maintain long-term persistent identities in the form of a private key.
Each peer has an ID defined as `peer.ID == peer.PrivKey.Address()`, where `Address` uses the scheme defined in go-crypto.
Peer ID's must come with some Proof-of-Work; that is,
they must satisfy `peer.PrivKey.Address() < target` for some difficulty target.
This ensures they are not too easy to generate. To begin, let `target == 2^240`.
A single peer ID can have multiple IP addresses associated with it.
For simplicity, we only keep track of the latest one.
When attempting to connect to a peer, we use the PeerURL: `<ID>@<IP>:<PORT>`.
We will attempt to connect to the peer at IP:PORT, and verify,
via authenticated encryption, that it is in possession of the private key
corresponding to `<ID>`. This prevents man-in-the-middle attacks on the peer layer.
Peers can also be connected to without specifying an ID, ie. just `<IP>:<PORT>`.
In this case, the peer must be authenticated out-of-band of Tendermint,
for instance via VPN
## Connections
All p2p connections use TCP.
Upon establishing a successful TCP connection with a peer,
two handhsakes are performed: one for authenticated encryption, and one for Tendermint versioning.
Both handshakes have configurable timeouts (they should complete quickly).
### Authenticated Encryption Handshake
Tendermint implements the Station-to-Station protocol
using ED25519 keys for Diffie-Helman key-exchange and NACL SecretBox for encryption.
It goes as follows:
- generate an emphemeral ED25519 keypair
- send the ephemeral public key to the peer
- wait to receive the peer's ephemeral public key
- compute the Diffie-Hellman shared secret using the peers ephemeral public key and our ephemeral private key
- generate two nonces to use for encryption (sending and receiving) as follows:
- sort the ephemeral public keys in ascending order and concatenate them
- RIPEMD160 the result
- append 4 empty bytes (extending the hash to 24-bytes)
- the result is nonce1
- flip the last bit of nonce1 to get nonce2
- if we had the smaller ephemeral pubkey, use nonce1 for receiving, nonce2 for sending;
else the opposite
- all communications from now on are encrypted using the shared secret and the nonces, where each nonce
- we now have an encrypted channel, but still need to authenticate
increments by 2 every time it is used
- generate a common challenge to sign:
- SHA256 of the sorted (lowest first) and concatenated ephemeral pub keys
- sign the common challenge with our persistent private key
- send the go-wire encoded persistent pubkey and signature to the peer
- wait to receive the persistent public key and signature from the peer
- verify the signature on the challenge using the peer's persistent public key
If this is an outgoing connection (we dialed the peer) and we used a peer ID,
then finally verify that the peer's persistent public key corresponds to the peer ID we dialed,
ie. `peer.PubKey.Address() == <ID>`.
The connection has now been authenticated. All traffic is encrypted.
Note that only the dialer can authenticate the identity of the peer,
but this is what we care about since when we join the network we wish to
ensure we have reached the intended peer (and are not being MITMd).
### Peer Filter
Before continuing, we check if the new peer has the same ID as ourselves or
an existing peer. If so, we disconnect.
We also check the peer's address and public key against
an optional whitelist which can be managed through the ABCI app -
if the whitelist is enabled and the peer does not qualigy, the connection is
terminated.
### Tendermint Version Handshake
The Tendermint Version Handshake allows the peers to exchange their NodeInfo:
```
type NodeInfo struct {
PubKey crypto.PubKey `json:"pub_key"`
Moniker string `json:"moniker"`
Network string `json:"network"`
RemoteAddr string `json:"remote_addr"`
ListenAddr string `json:"listen_addr"` // accepting in
Version string `json:"version"` // major.minor.revision
Channels []int8 `json:"channels"` // active reactor channels
Other []string `json:"other"` // other application specific data
}
```
The connection is disconnected if:
- `peer.NodeInfo.PubKey != peer.PubKey`
- `peer.NodeInfo.Version` is not formatted as `X.X.X` where X are integers known as Major, Minor, and Revision
- `peer.NodeInfo.Version` Major is not the same as ours
- `peer.NodeInfo.Version` Minor is not the same as ours
- `peer.NodeInfo.Network` is not the same as ours
- `peer.Channels` does not intersect with our known Channels.
At this point, if we have not disconnected, the peer is valid.
It is added to the switch and hence all reactors via the `AddPeer` method.
Note that each reactor may handle multiple channels.
## Connection Activity
Once a peer is added, incoming messages for a given reactor are handled through
that reactor's `Receive` method, and output messages are sent directly by the Reactors
on each peer. A typical reactor maintains per-peer go-routine/s that handle this.

View File

@@ -0,0 +1,94 @@
# Peer Strategy and Exchange
Here we outline the design of the AddressBook
and how it used by the Peer Exchange Reactor (PEX) to ensure we are connected
to good peers and to gossip peers to others.
## Peer Types
Certain peers are special in that they are specified by the user as `persistent`,
which means we auto-redial them if the connection fails.
Some such peers can additional be marked as `private`, which means
we will not gossip them to others.
All others peers are tracked using an address book.
## Discovery
Peer discovery begins with a list of seeds.
When we have no peers, or have been unable to find enough peers from existing ones,
we dial a randomly selected seed to get a list of peers to dial.
So long as we have less than `MaxPeers`, we periodically request additional peers
from each of our own. If sufficient time goes by and we still can't find enough peers,
we try the seeds again.
## Address Book
Peers are tracked via their ID (their PubKey.Address()).
For each ID, the address book keeps the most recent IP:PORT.
Peers are added to the address book from the PEX when they first connect to us or
when we hear about them from other peers.
The address book is arranged in sets of buckets, and distinguishes between
vetted and unvetted peers. It keeps different sets of buckets for vetted and
unvetted peers. Buckets provide randomization over peer selection.
A vetted peer can only be in one bucket. An unvetted peer can be in multiple buckets.
## Vetting
When a peer is first added, it is unvetted.
Marking a peer as vetted is outside the scope of the `p2p` package.
For Tendermint, a Peer becomes vetted once it has contributed sufficiently
at the consensus layer; ie. once it has sent us valid and not-yet-known
votes and/or block parts for `NumBlocksForVetted` blocks.
Other users of the p2p package can determine their own conditions for when a peer is marked vetted.
If a peer becomes vetted but there are already too many vetted peers,
a randomly selected one of the vetted peers becomes unvetted.
If a peer becomes unvetted (either a new peer, or one that was previously vetted),
a randomly selected one of the unvetted peers is removed from the address book.
More fine-grained tracking of peer behaviour can be done using
a Trust Metric, but it's best to start with something simple.
## Select Peers to Dial
When we need more peers, we pick them randomly from the addrbook with some
configurable bias for unvetted peers. The bias should be lower when we have fewer peers,
and can increase as we obtain more, ensuring that our first peers are more trustworthy,
but always giving us the chance to discover new good peers.
## Select Peers to Exchange
When were asked for peers, we select them as follows:
- select at most `maxGetSelection` peers
- try to select at least `minGetSelection` peers - if we have less than that, select them all.
- select a random, unbiased `getSelectionPercent` of the peers
Send the selected peers. Note we select peers for sending without bias for vetted/unvetted.
## Preventing Spam
There are various cases where we decide a peer has misbehaved and we disconnect from them.
When this happens, the peer is removed from the address book and black listed for
some amount of time. We call this "Disconnect and Mark".
Note that the bad behaviour may be detected outside the PEX reactor itseld
(for instance, in the mconnection, or another reactor), but it must be communicated to the PEX reactor
so it can remove and mark the peer.
In the PEX, if a peer sends us unsolicited lists of peers,
or if the peer sends too many requests for more peers in a given amount of time,
we Disconnect and Mark.
## Trust Metric
The quality of peers can be tracked in more fine-grained detail using a
Proportional-Integral-Derrivative (PID) controller that incorporates
current, past, and rate-of-change data to inform peer quality.
While a PID trust metric has been implemented, it remains for future work
to use it in the PEX.

View File

@@ -0,0 +1,16 @@
The trust metric tracks the quality of the peers.
When a peer exceeds a certain quality for a certain amount of time,
it is marked as vetted in the addrbook.
If a vetted peer's quality degrades sufficiently, it is booted, and must prove itself from scratch.
If we need to make room for a new vetted peer, we move the lowest scoring vetted peer back to unvetted.
If we need to make room for a new unvetted peer, we remove the lowest scoring unvetted peer -
possibly only if its below some absolute minimum ?
Peer quality is tracked in the connection and across the reactors.
Behaviours are defined as one of:
- fatal - something outright malicious. we should disconnect and remember them.
- bad - any kind of timeout, msgs that dont unmarshal, or fail other validity checks, or msgs we didn't ask for or arent expecting
- neutral - normal correct behaviour. unknown channels/msg types (version upgrades).
- good - some random majority of peers per reactor sending us useful messages