Merge branch 'develop' into abci-spec-docs

2025-07-31 20:21:56 +00:00 · 2018-01-04 13:57:08 -05:00
parent a573b20888 6855e62f0a
commit ecb7303e35
8 changed files with 190 additions and 40 deletions
--- a/docs/specification/new-spec/README.md
+++ b/docs/specification/new-spec/README.md
@@ -9,6 +9,7 @@ It contains the following components:
 - [Encoding and Digests](encoding.md)
 - [Blockchain](blockchain.md)
 - [State](state.md)
+- [P2P](p2p/node.md)

 ## Overview

--- a/docs/specification/new-spec/p2p/config.md
+++ b/docs/specification/new-spec/p2p/config.md
@@ -0,0 +1,39 @@
+# P2P Config
+
+Here we describe configuration options around the Peer Exchange.
+
+## Seed Mode
+
+`--p2p.seed_mode`
+
+The node operates in seed mode. It will kick incoming peers after sharing some peers.
+It will continually crawl the network for peers.
+
+## Seeds
+
+`--p2p.seeds "1.2.3.4:46656,2.3.4.5:4444"`
+
+Dial these seeds when we need more peers. They will return a list of peers and then disconnect.
+If we already have enough peers in the address book, we may never need to dial them.
+
+## Persistent Peers
+
+`--p2p.persistent_peers "1.2.3.4:46656,2.3.4.5:46656"`
+
+Dial these peers and auto-redial them if the connection fails.
+These are intended to be trusted persistent peers that can help
+anchor us in the p2p network.
+
+Note that the auto-redial uses exponential backoff and will give up
+after a day of trying to connect.
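The redial behaviour described above can be sketched as a schedule of waits that doubles on each attempt and stops once a day has been used up. This is a minimal sketch only: the function name `redialSchedule`, the one-second base, and the plain doubling rule are assumptions for illustration, not the values or code Tendermint uses.

```go
package main

import (
	"fmt"
	"time"
)

// redialSchedule returns the waits between redial attempts. Each wait is
// double the previous one, and the schedule stops before the cumulative
// wait would exceed giveUpAfter. Both parameters are illustrative.
func redialSchedule(base, giveUpAfter time.Duration) []time.Duration {
	if base <= 0 {
		return nil // a non-positive base would never make progress
	}
	var waits []time.Duration
	total := time.Duration(0)
	for wait := base; total+wait <= giveUpAfter; wait *= 2 {
		waits = append(waits, wait)
		total += wait
	}
	return waits
}

func main() {
	waits := redialSchedule(time.Second, 24*time.Hour)
	fmt.Println("attempts before giving up:", len(waits))
	fmt.Println("longest wait:", waits[len(waits)-1])
}
```

With doubling, the number of attempts grows only logarithmically with the give-up window, so a persistent peer that stays down for hours costs very few dials.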
+
+NOTE: If `seeds` and `persistent_peers` intersect,
+the user will be WARNED that seeds may auto-close connections
+and the node may not be able to keep the connection persistent.
+
+## Private Persistent Peers
+
+`--p2p.private_persistent_peers "1.2.3.4:46656,2.3.4.5:46656"`
+
+These are persistent peers that we do not add to the address book or
+gossip to other peers. They stay private to us.
--- a/docs/specification/new-spec/p2p/connection.md
+++ b/docs/specification/new-spec/p2p/connection.md
@@ -0,0 +1,116 @@
+## MConnection
+
+`MConnection` is a multiplex connection:
+
+__multiplex__ *noun* a system or signal involving simultaneous transmission of
+several messages along a single channel of communication.
+
+Each `MConnection` handles message transmission on multiple abstract communication
+`Channel`s.  Each channel has a globally unique byte id.
+The byte id and the relative priorities of each `Channel` are configured upon
+initialization of the connection.
+
+The `MConnection` supports three packet types: Ping, Pong, and Msg.
+
+### Ping and Pong
+
+The ping and pong messages each consist of a single byte written to the connection: 0x1 and 0x2, respectively.
+
+When we haven't received any messages on an `MConnection` for a duration of `pingTimeout`, we send a ping message.
+When a ping is received on the `MConnection`, a pong is sent in response.
+
+If a pong is not received in sufficient time, the peer's score should be decremented (TODO).
+
+### Msg
+
+Messages in channels are chopped into smaller msgPackets for multiplexing.
+
+```go
+type msgPacket struct {
+	ChannelID byte
+	EOF       byte // 1 means message ends here.
+	Bytes     []byte
+}
+```
+
+The msgPacket is serialized using go-wire, and prefixed with a 0x3.
+The received `Bytes` of a sequential set of packets are appended together
+until a packet with `EOF=1` is received, at which point the complete serialized message
+is returned for processing by the corresponding channel's `onReceive` function.
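The reassembly rule above can be sketched as a per-channel buffer that is flushed when a packet with `EOF=1` arrives. The `reassembler` type and its `add` method are hypothetical helpers for illustration. They are not the `MConnection` implementation, and go-wire decoding of the packets is left out.

```go
package main

import "fmt"

// msgPacket mirrors the struct shown in the spec text.
type msgPacket struct {
	ChannelID byte
	EOF       byte // 1 means message ends here.
	Bytes     []byte
}

// reassembler buffers packet bytes per channel until EOF=1 arrives.
// Hypothetical helper, not part of the p2p package.
type reassembler struct {
	partial map[byte][]byte
}

func newReassembler() *reassembler {
	return &reassembler{partial: make(map[byte][]byte)}
}

// add appends the packet's bytes to the buffer of its channel. It returns
// the complete message and true once a packet with EOF=1 has been added.
func (r *reassembler) add(p msgPacket) ([]byte, bool) {
	r.partial[p.ChannelID] = append(r.partial[p.ChannelID], p.Bytes...)
	if p.EOF != 1 {
		return nil, false
	}
	msg := r.partial[p.ChannelID]
	delete(r.partial, p.ChannelID)
	return msg, true
}

func main() {
	r := newReassembler()
	r.add(msgPacket{ChannelID: 0x20, EOF: 0, Bytes: []byte("hel")})
	msg, done := r.add(msgPacket{ChannelID: 0x20, EOF: 1, Bytes: []byte("lo")})
	fmt.Println(string(msg), done)
}
```

Keeping one buffer per channel is what lets packets from different channels interleave on the wire without corrupting each other's messages.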
+
+### Multiplexing
+
+Messages are sent from a single `sendRoutine`, which loops over a select statement that results in the sending
+of a ping, a pong, or a batch of data messages. The batch of data messages may include messages from multiple channels.
+Message bytes are queued for sending in their respective channel, with each channel holding one unsent message at a time.
+Messages are chosen for a batch one at a time from the channel with the lowest ratio of recently sent bytes to channel priority.
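The selection rule above (lowest ratio of recently sent bytes to priority) can be sketched as follows. The `channel` struct and its field names are assumptions made for this example. The real channel type tracks more state, and the way "recently sent" decays over time is not modelled here.

```go
package main

import "fmt"

// channel is a hypothetical stand-in for an MConnection channel.
type channel struct {
	id           byte
	priority     int   // relative priority, assumed to be > 0
	recentlySent int64 // bytes recently sent on this channel
	pending      bool  // true if a message is queued for sending
}

// pickChannel returns the index of the pending channel with the lowest
// ratio recentlySent/priority, or -1 if no channel has anything queued.
func pickChannel(chs []channel) int {
	best := -1
	var bestRatio float64
	for i, ch := range chs {
		if !ch.pending {
			continue
		}
		ratio := float64(ch.recentlySent) / float64(ch.priority)
		if best == -1 || ratio < bestRatio {
			best, bestRatio = i, ratio
		}
	}
	return best
}

func main() {
	chs := []channel{
		{id: 0x20, priority: 5, recentlySent: 1000, pending: true},
		{id: 0x30, priority: 1, recentlySent: 100, pending: true},
		{id: 0x40, priority: 10, recentlySent: 0, pending: false},
	}
	fmt.Printf("next channel: %#x\n", chs[pickChannel(chs)].id)
}
```

Dividing by priority means a high-priority channel may send proportionally more bytes before a lower-priority channel gets its turn, while an idle channel is never picked.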
+
+## Sending Messages
+
+There are two methods for sending messages:
+```go
+func (m MConnection) Send(chID byte, msg interface{}) bool {}
+func (m MConnection) TrySend(chID byte, msg interface{}) bool {}
+```
+
+`Send(chID, msg)` is a blocking call that waits until `msg` is successfully queued
+for the channel with the given id byte `chID`.  The message `msg` is serialized
+using the `tendermint/wire` submodule's `WriteBinary()` reflection routine.
+
+`TrySend(chID, msg)` is a nonblocking call that returns false if the channel's
+queue is full.
+
+`Send()` and `TrySend()` are also exposed for each `Peer`.
+
+## Peer
+
+Each peer has one `MConnection` instance, and includes other information such as whether the connection
+was outbound, whether the connection should be recreated if it closes, various identity information about the node,
+and other higher level thread-safe data used by the reactors.
+
+## Switch/Reactor
+
+The `Switch` handles peer connections and exposes an API to receive incoming messages
+on `Reactors`.  Each `Reactor` is responsible for handling incoming messages of one
+or more `Channels`.  So while sending outgoing messages is typically performed on the peer,
+incoming messages are received on the reactor.
+
+```go
+// Declare a MyReactor reactor that handles messages on MyChannelID.
+type MyReactor struct{}
+
+func (reactor MyReactor) GetChannels() []*ChannelDescriptor {
+    return []*ChannelDescriptor{&ChannelDescriptor{ID: MyChannelID, Priority: 1}}
+}
+
+func (reactor MyReactor) Receive(chID byte, peer *Peer, msgBytes []byte) {
+    r, n, err := bytes.NewBuffer(msgBytes), new(int64), new(error)
+    msgString := ReadString(r, n, err)
+    fmt.Println(msgString)
+}
+
+// Other Reactor methods omitted for brevity
+...
+
+sw := NewSwitch([]Reactor{MyReactor{}}) // `switch` is a Go keyword, so use `sw`
+
+...
+
+// Send a random message to all outbound connections
+for _, peer := range sw.Peers().List() {
+    if peer.IsOutbound() {
+        peer.Send(MyChannelID, "Here's a random message")
+    }
+}
+```
+
+### PEXReactor/AddrBook
+
+A `PEXReactor` reactor implementation is provided to automate peer discovery.
+
+```go
+book := p2p.NewAddrBook(addrBookFilePath)
+pexReactor := p2p.NewPEXReactor(book)
+...
+sw := NewSwitch([]Reactor{pexReactor, myReactor, ...})
+```
--- a/docs/specification/new-spec/p2p/node.md
+++ b/docs/specification/new-spec/p2p/node.md
@@ -0,0 +1,62 @@
+# Tendermint Peer Discovery
+
+A Tendermint P2P network has different kinds of nodes with different requirements for connectivity to others.
+This document describes what kind of nodes Tendermint should enable and how they should work.
+
+## Seeds
+
+Seeds are the first point of contact for a new node.
+They return a list of known active peers and disconnect.
+
+Seeds should operate as full nodes, with the PEX reactor in a "crawler" mode
+that continuously explores the network to validate the availability of peers.
+
+Seeds should only respond with some top percentile of the best peers they know about.
+See [reputation] for details on peer quality.
+
+## New Full Node
+
+A new node needs a few things to connect to the network:
+- a list of seeds, which can be provided to Tendermint via config file or flags,
+or hardcoded into the software by in-process apps
+- a `ChainID`, also called `Network` at the p2p layer
+- a recent block height, H, and hash, HASH for the blockchain.
+
+The values `H` and `HASH` must be received and corroborated by means external to Tendermint, and specific to the user - ie. via the user's trusted social consensus.
+This requirement to validate `H` and `HASH` out-of-band and via social consensus
+is the essential difference in security models between Proof-of-Work and Proof-of-Stake blockchains.
+
+With the above, the node then queries some seeds for peers for its chain,
+dials those peers, and runs the Tendermint protocols with those it successfully connects to.
+
+When the node catches up to height H, it ensures the block hash matches HASH.
+If not, Tendermint will exit, and the user must try again - either they are connected
+to bad peers or their social consensus was invalidated.
+
+## Restarted Full Node
+
+A node checks its address book on startup and attempts to connect to peers from there.
+If it can't connect to any peers after some time, it falls back to the seeds to find more.
+
+Restarted full nodes can run the `blockchain` or `consensus` reactor protocols to sync up
+to the latest state of the blockchain, assuming they aren't too far behind.
+If they are too far behind, they may need to validate a recent `H` and `HASH` out-of-band again.
+
+## Validator Node
+
+A validator node is a node that interfaces with a validator signing key.
+These nodes require the highest security, and should not accept incoming connections.
+They should maintain outgoing connections to a controlled set of "Sentry Nodes" that serve
+as their proxy shield to the rest of the network.
+
+Validators that know and trust each other can accept incoming connections from one another and maintain direct private connectivity via VPN.
+
+## Sentry Node
+
+Sentry nodes are guardians of a validator node and provide it access to the rest of the network.
+Sentry nodes may be dynamic, but should maintain persistent connections to some evolving random subset of each other.
+They should always expect to have direct incoming connections from the validator node and its backup/s.
+They do not report the validator node's address in the PEX.
+They may be more strict about the quality of peers they keep.
+
+Sentry nodes belonging to validators that trust each other may wish to maintain persistent connections via VPN with one another, but only report each other sparingly in the PEX.
--- a/docs/specification/new-spec/p2p/peer.md
+++ b/docs/specification/new-spec/p2p/peer.md
@@ -0,0 +1,118 @@
+# Tendermint Peers
+
+This document explains how Tendermint Peers are identified, how they connect to one another,
+and how other peers are found.
+
+## Peer Identity
+
+Tendermint peers are expected to maintain long-term persistent identities in the form of a private key.
+Each peer has an ID defined as `peer.ID == peer.PrivKey.Address()`, where `Address` uses the scheme defined in go-crypto.
+
+Peer IDs must come with some Proof-of-Work; that is,
+they must satisfy `peer.PrivKey.Address() < target` for some difficulty target.
+This ensures they are not too easy to generate. To begin, let `target == 2^240`.
+
+A single peer ID can have multiple IP addresses associated with it.
+For simplicity, we only keep track of the latest one.
+
+When attempting to connect to a peer, we use the PeerURL: `<ID>@<IP>:<PORT>`.
+We will attempt to connect to the peer at IP:PORT, and verify,
+via authenticated encryption, that it is in possession of the private key
+corresponding to `<ID>`. This prevents man-in-the-middle attacks on the peer layer.
+
+Peers can also be connected to without specifying an ID, ie. just `<IP>:<PORT>`.
+In this case, the peer must be authenticated out-of-band of Tendermint,
+for instance via VPN.
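Splitting a PeerURL into its parts, with the ID being optional, can be sketched with the standard library. The function name `parsePeerURL` is hypothetical, and the sketch does not validate the format of the ID itself, since the text above does not fix one.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// parsePeerURL splits "<ID>@<IP>:<PORT>" or "<IP>:<PORT>" into its parts.
// The returned id is empty when the address was given without an ID.
func parsePeerURL(s string) (id, host, port string, err error) {
	addr := s
	if i := strings.Index(s, "@"); i >= 0 {
		id, addr = s[:i], s[i+1:]
		if id == "" {
			return "", "", "", fmt.Errorf("empty peer ID in %q", s)
		}
	}
	host, port, err = net.SplitHostPort(addr)
	if err != nil {
		return "", "", "", err
	}
	return id, host, port, nil
}

func main() {
	for _, s := range []string{"deadbeef@1.2.3.4:46656", "2.3.4.5:46656"} {
		id, host, port, err := parsePeerURL(s)
		fmt.Printf("id=%q host=%q port=%q err=%v\n", id, host, port, err)
	}
}
```

An empty `id` tells the caller that the identity check after the handshake must be skipped and that the peer has to be authenticated out-of-band.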
+
+## Connections
+
+All p2p connections use TCP.
+Upon establishing a successful TCP connection with a peer,
+two handshakes are performed: one for authenticated encryption, and one for Tendermint versioning.
+Both handshakes have configurable timeouts (they should complete quickly).
+
+### Authenticated Encryption Handshake
+
+Tendermint implements the Station-to-Station protocol
+using ED25519 keys for Diffie-Hellman key-exchange and NACL SecretBox for encryption.
+It goes as follows:
+- generate an ephemeral ED25519 keypair
+- send the ephemeral public key to the peer
+- wait to receive the peer's ephemeral public key
+- compute the Diffie-Hellman shared secret using the peer's ephemeral public key and our ephemeral private key
+- generate two nonces to use for encryption (sending and receiving) as follows:
+    - sort the ephemeral public keys in ascending order and concatenate them
+    - RIPEMD160 the result
+    - append 4 empty bytes (extending the hash to 24-bytes)
+    - the result is nonce1
+    - flip the last bit of nonce1 to get nonce2
+    - if we had the smaller ephemeral pubkey, use nonce1 for receiving, nonce2 for sending;
+        else the opposite
+- all communications from now on are encrypted using the shared secret and the nonces, where each nonce
+increments by 2 every time it is used
+- we now have an encrypted channel, but still need to authenticate
+- generate a common challenge to sign:
+    - SHA256 of the sorted (lowest first) and concatenated ephemeral pub keys
+- sign the common challenge with our persistent private key
+- send the go-wire encoded persistent pubkey and signature to the peer
+- wait to receive the persistent public key and signature from the peer
+- verify the signature on the challenge using the peer's persistent public key
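The nonce and challenge steps in the list above can be sketched with the standard library alone. This is a sketch under stated assumptions: RIPEMD160 is not in the Go standard library, so the 20-byte hash is passed in as a parameter; "flip the last bit" is read here as toggling the lowest bit of the final byte; and the function names are hypothetical.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// sortKeys returns the two ephemeral public keys in ascending byte order.
func sortKeys(a, b []byte) (lo, hi []byte) {
	if bytes.Compare(a, b) <= 0 {
		return a, b
	}
	return b, a
}

// challenge is the SHA256 of the sorted (lowest first) and concatenated
// ephemeral public keys. Both sides compute the same value.
func challenge(ours, theirs []byte) [32]byte {
	lo, hi := sortKeys(ours, theirs)
	return sha256.Sum256(append(append([]byte{}, lo...), hi...))
}

// nonces derives the receive and send nonces. hash is assumed to be the
// RIPEMD160 of the sorted, concatenated ephemeral public keys.
func nonces(hash [20]byte, ours, theirs []byte) (recv, send [24]byte) {
	var nonce1 [24]byte
	copy(nonce1[:], hash[:]) // the 4 appended bytes stay zero
	nonce2 := nonce1
	nonce2[23] ^= 0x01 // assumption: "last bit" is the lowest bit of the last byte
	if bytes.Compare(ours, theirs) < 0 {
		return nonce1, nonce2 // we hold the smaller ephemeral pubkey
	}
	return nonce2, nonce1
}

func main() {
	ours, theirs := []byte{0x01, 0x02}, []byte{0x03, 0x04}
	c := challenge(ours, theirs)
	recv, send := nonces([20]byte{0xaa}, ours, theirs)
	fmt.Printf("challenge=%x\nrecv=%x\nsend=%x\n", c, recv, send)
}
```

Because the two nonces differ only in their lowest bit and each increments by 2, the sending and receiving sequences never collide, and each side's send nonce is the other side's receive nonce.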
+
+
+If this is an outgoing connection (we dialed the peer) and we used a peer ID,
+then finally verify that the peer's persistent public key corresponds to the peer ID we dialed,
+ie. `peer.PubKey.Address() == <ID>`.
+
+The connection has now been authenticated. All traffic is encrypted.
+
+Note that only the dialer can authenticate the identity of the peer,
+but this is what we care about since when we join the network we wish to
+ensure we have reached the intended peer (and are not being MITMd).
+
+### Peer Filter
+
+Before continuing, we check if the new peer has the same ID as ourselves or
+an existing peer. If so, we disconnect.
+
+We also check the peer's address and public key against
+an optional whitelist which can be managed through the ABCI app -
+if the whitelist is enabled and the peer does not qualify, the connection is
+terminated.
+
+
+### Tendermint Version Handshake
+
+The Tendermint Version Handshake allows the peers to exchange their NodeInfo:
+
+```go
+type NodeInfo struct {
+	PubKey     crypto.PubKey `json:"pub_key"`
+	Moniker    string        `json:"moniker"`
+	Network    string        `json:"network"`
+	RemoteAddr string        `json:"remote_addr"`
+	ListenAddr string        `json:"listen_addr"` // accepting in
+	Version    string        `json:"version"` // major.minor.revision
+	Channels   []int8        `json:"channels"` // active reactor channels
+	Other      []string      `json:"other"`   // other application specific data
+}
+```
+
+The connection is disconnected if:
+- `peer.NodeInfo.PubKey != peer.PubKey`
+- `peer.NodeInfo.Version` is not formatted as `X.X.X` where X are integers known as Major, Minor, and Revision
+- `peer.NodeInfo.Version` Major is not the same as ours
+- `peer.NodeInfo.Version` Minor is not the same as ours
+- `peer.NodeInfo.Network` is not the same as ours
+- `peer.NodeInfo.Channels` does not intersect with our known Channels.
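The version, network, and channel checks in the list above can be sketched as a single compatibility function. The trimmed `nodeInfo` type and the names `majorMinor` and `compatible` are hypothetical, and the `PubKey` comparison is omitted because it depends on go-crypto types.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nodeInfo holds only the NodeInfo fields needed for these checks.
type nodeInfo struct {
	Version  string
	Network  string
	Channels []int8
}

// majorMinor parses a version formatted as X.X.X and returns Major and Minor.
func majorMinor(v string) (major, minor int, err error) {
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return 0, 0, fmt.Errorf("version %q is not formatted as X.X.X", v)
	}
	nums := make([]int, 3)
	for i, p := range parts {
		if nums[i], err = strconv.Atoi(p); err != nil {
			return 0, 0, fmt.Errorf("version %q: %v", v, err)
		}
	}
	return nums[0], nums[1], nil
}

// compatible returns nil if the connection may be kept, or the reason
// for disconnecting otherwise.
func compatible(ours, theirs nodeInfo) error {
	ourMajor, ourMinor, err := majorMinor(ours.Version)
	if err != nil {
		return err
	}
	theirMajor, theirMinor, err := majorMinor(theirs.Version)
	if err != nil {
		return err
	}
	if ourMajor != theirMajor || ourMinor != theirMinor {
		return fmt.Errorf("version %s is incompatible with ours (%s)", theirs.Version, ours.Version)
	}
	if ours.Network != theirs.Network {
		return fmt.Errorf("network %q differs from ours (%q)", theirs.Network, ours.Network)
	}
	for _, a := range ours.Channels {
		for _, b := range theirs.Channels {
			if a == b {
				return nil // at least one channel in common
			}
		}
	}
	return fmt.Errorf("no channels in common")
}

func main() {
	ours := nodeInfo{Version: "0.14.0", Network: "test-chain", Channels: []int8{0x20, 0x30}}
	theirs := nodeInfo{Version: "0.14.2", Network: "test-chain", Channels: []int8{0x30}}
	fmt.Println("disconnect reason:", compatible(ours, theirs))
}
```

Only Major and Minor are compared, so peers that differ in Revision alone stay connected.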
+
+
+At this point, if we have not disconnected, the peer is valid.
+It is added to the switch and hence all reactors via the `AddPeer` method.
+Note that each reactor may handle multiple channels.
+
+## Connection Activity
+
+Once a peer is added, incoming messages for a given reactor are handled through
+that reactor's `Receive` method, and output messages are sent directly by the Reactors
+on each peer. A typical reactor maintains per-peer go-routine/s that handle this.
+
--- a/docs/specification/new-spec/p2p/pex.md
+++ b/docs/specification/new-spec/p2p/pex.md
@@ -0,0 +1,94 @@
+# Peer Strategy and Exchange
+
+Here we outline the design of the AddressBook
+and how it is used by the Peer Exchange Reactor (PEX) to ensure we are connected
+to good peers and to gossip peers to others.
+
+## Peer Types
+
+Certain peers are special in that they are specified by the user as `persistent`,
+which means we auto-redial them if the connection fails.
+Some such peers can additionally be marked as `private`, which means
+we will not gossip them to others.
+
+All other peers are tracked using an address book.
+
+## Discovery
+
+Peer discovery begins with a list of seeds.
+When we have no peers, or have been unable to find enough peers from existing ones,
+we dial a randomly selected seed to get a list of peers to dial.
+
+So long as we have fewer than `MaxPeers` peers, we periodically request additional peers
+from each of our own. If sufficient time goes by and we still can't find enough peers,
+we try the seeds again.
+
+## Address Book
+
+Peers are tracked via their ID (their PubKey.Address()).
+For each ID, the address book keeps the most recent IP:PORT.
+Peers are added to the address book from the PEX when they first connect to us or
+when we hear about them from other peers.
+
+The address book is arranged in sets of buckets, and distinguishes between
+vetted and unvetted peers. It keeps different sets of buckets for vetted and
+unvetted peers. Buckets provide randomization over peer selection.
+
+A vetted peer can only be in one bucket. An unvetted peer can be in multiple buckets.
+
+## Vetting
+
+When a peer is first added, it is unvetted.
+Marking a peer as vetted is outside the scope of the `p2p` package.
+For Tendermint, a Peer becomes vetted once it has contributed sufficiently
+at the consensus layer; ie. once it has sent us valid and not-yet-known
+votes and/or block parts for `NumBlocksForVetted` blocks.
+Other users of the p2p package can determine their own conditions for when a peer is marked vetted.
+
+If a peer becomes vetted but there are already too many vetted peers,
+a randomly selected one of the vetted peers becomes unvetted.
+
+If a peer becomes unvetted (either a new peer, or one that was previously vetted),
+a randomly selected one of the unvetted peers is removed from the address book.
+
+More fine-grained tracking of peer behaviour can be done using
+a Trust Metric, but it's best to start with something simple.
+
+## Select Peers to Dial
+
+When we need more peers, we pick them randomly from the addrbook with some
+configurable bias for unvetted peers. The bias should be lower when we have fewer peers,
+and can increase as we obtain more, ensuring that our first peers are more trustworthy,
+but always giving us the chance to discover new good peers.
+
+## Select Peers to Exchange
+
+When we’re asked for peers, we select them as follows:
+- select at most `maxGetSelection` peers
+- try to select at least `minGetSelection` peers - if we have fewer than that, select them all.
+- select a random, unbiased `getSelectionPercent` of the peers
+
+Send the selected peers. Note we select peers for sending without bias for vetted/unvetted.
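One way to read the three selection rules above is as a clamp on the number of peers, followed by an unbiased random pick. The sketch below assumes that reading: the percentage is applied first, then the minimum, then the maximum, then the size of the address book. The function names and the values 32, 250, and 23 are illustrative only.

```go
package main

import (
	"fmt"
	"math/rand"
)

// numToSelect returns how many peers to send for an address book of
// bookSize entries: percent of the book, raised to minSel, capped at
// maxSel, and never more than the book holds.
func numToSelect(bookSize, minSel, maxSel, percent int) int {
	n := bookSize * percent / 100
	if n < minSel {
		n = minSel
	}
	if n > maxSel {
		n = maxSel
	}
	if n > bookSize {
		n = bookSize
	}
	return n
}

// selectPeers picks that many addresses uniformly at random, with no
// bias for vetted or unvetted peers. The input slice is not modified.
func selectPeers(addrs []string, minSel, maxSel, percent int) []string {
	n := numToSelect(len(addrs), minSel, maxSel, percent)
	shuffled := append([]string(nil), addrs...)
	rand.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	return shuffled[:n]
}

func main() {
	for _, size := range []int{10, 100, 1000, 5000} {
		fmt.Println(size, "->", numToSelect(size, 32, 250, 23))
	}
	fmt.Println(selectPeers([]string{"a", "b", "c", "d"}, 2, 10, 50))
}
```

The minimum matters for small address books, where a bare percentage would return almost nothing, and the maximum bounds the size of a single response.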
+
+## Preventing Spam
+
+There are various cases where we decide a peer has misbehaved and we disconnect from them.
+When this happens, the peer is removed from the address book and blacklisted for
+some amount of time. We call this "Disconnect and Mark".
+Note that the bad behaviour may be detected outside the PEX reactor itself
+(for instance, in the mconnection, or another reactor), but it must be communicated to the PEX reactor
+so it can remove and mark the peer.
+
+In the PEX, if a peer sends us unsolicited lists of peers,
+or if the peer sends too many requests for more peers in a given amount of time,
+we Disconnect and Mark.
+
+## Trust Metric
+
+The quality of peers can be tracked in more fine-grained detail using a
+Proportional-Integral-Derivative (PID) controller that incorporates
+current, past, and rate-of-change data to inform peer quality.
+
+While a PID trust metric has been implemented, it remains for future work
+to use it in the PEX.
+
--- a/docs/specification/new-spec/p2p/trustmetric.md
+++ b/docs/specification/new-spec/p2p/trustmetric.md
@@ -0,0 +1,16 @@
+
+The trust metric tracks the quality of the peers.
+When a peer exceeds a certain quality for a certain amount of time,
+it is marked as vetted in the addrbook.
+If a vetted peer's quality degrades sufficiently, it is booted, and must prove itself from scratch.
+If we need to make room for a new vetted peer, we move the lowest scoring vetted peer back to unvetted.
+If we need to make room for a new unvetted peer, we remove the lowest scoring unvetted peer -
+possibly only if it's below some absolute minimum?
+
+Peer quality is tracked in the connection and across the reactors.
+Behaviours are defined as one of:
+    - fatal - something outright malicious. we should disconnect and remember them.
+    - bad - any kind of timeout, msgs that don't unmarshal or fail other validity checks, or msgs we didn't ask for or aren't expecting
+    - neutral - normal correct behaviour. unknown channels/msg types (version upgrades).
+    - good - some random majority of peers per reactor sending us useful messages
+