* Address some TODOs, refactor queries and public API.
The following left-over issues are addressed:
* The key for FIND_NODE requests is generalised to any Multihash,
instead of just peer IDs.
* All queries get a (configurable) timeout.
* Finishing queries as soon as enough results have been received is simplified
to avoid code duplication.
* No more panics in provider-API-related code paths. The provider API is
however still untested and (I think) still incomplete (e.g. expiration
of provider records).
* Numerous smaller TODOs encountered in the code.
The following public API changes / additions are made:
* Introduce a `KademliaConfig` with new configuration options for
the replication factor and query timeouts.
* Rename `find_node` to `get_closest_peers`.
* Rename `get_value` to `get_record` and `put_value` to `put_record`,
introducing a `Quorum` parameter for both functions, replacing the
existing `num_results` parameter with clearer semantics.
* Rename `add_providing` to `start_providing` and `remove_providing`
to `stop_providing`.
* Add a `bootstrap` function that implements an (almost) standard
Kademlia bootstrapping procedure.
* Rename `KademliaOut` to `KademliaEvent` with an updated list of
constructors (some renaming). All events that report query results
now report a `Result` to uniformly permit reporting of errors.
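To illustrate how a quorum differs from a plain `num_results` count, here is a minimal sketch. The variant names and the `required` helper are assumptions made for this example and are not taken from this change; the actual `Quorum` type may differ.

```rust
use std::num::NonZeroUsize;

/// Illustrative quorum type. The variants are assumptions for this sketch,
/// not the definition introduced by this change.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Quorum {
    One,
    Majority,
    All,
    N(NonZeroUsize),
}

impl Quorum {
    /// Number of successful responses required, given `total` targeted peers
    /// (e.g. the replication factor).
    fn required(self, total: NonZeroUsize) -> NonZeroUsize {
        match self {
            Quorum::One => NonZeroUsize::new(1).expect("1 != 0"),
            Quorum::Majority => NonZeroUsize::new(total.get() / 2 + 1).expect("n / 2 + 1 != 0"),
            Quorum::All => total,
            // Assumption: an explicit count is capped at the number of targeted peers.
            Quorum::N(n) => std::cmp::min(n, total),
        }
    }
}
```

With 20 targeted peers, `Quorum::Majority.required(..)` evaluates to 11 in this sketch, whereas a bare `num_results` would leave open whether it is relative to the replication factor.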
The following refactorings are made:
* Introduce some constants.
* Consolidate `query.rs` and `write.rs` behind a common query interface
to reduce duplication and facilitate better code reuse, introducing
the notion of a query peer iterator. `query/peers/closest.rs`
contains the code that was formerly in `query.rs`. `query/peers/fixed.rs` contains
a modified variant of `write.rs` (which is removed). The new `query.rs`
provides an interface for working with a collection of queries, taking
over some code from `behaviour.rs`.
* Reduce code duplication in tests and use the current_thread runtime for
polling swarms to avoid spurious errors in the test output due to aborted
connections when a test finishes prematurely (e.g. because a quorum of
results has been collected).
* Some additions / improvements to the existing tests.
* Fix test.
* Fix rebase.
* Tweak kad-ipfs example.
* Incorporate some feedback.
* Provide easy access and conversion to keys in error results.
Refactoring of iterative queries (`query.rs`) to improve both
correctness and performance (for larger DHTs):
Correctness:
1. Queries no longer terminate prematurely due to counting results
from peers farther from the target while results from closer
peers are still pending (#1105).
2. Queries no longer ignore reported closer peers that are not duplicates
just because they are currently not among the `num_results` closest.
The current `max_results` closest peers may contain peers marked as failed
or pending / waiting. Hence all reported closer peers that are not
duplicates must be considered candidates that may still end up
among the `num_results` closest that successfully responded.
3. Bounded parallelism based on the `active_counter` was not working
correctly, as new (not yet contacted) peers closer to the target
may be discovered at any time and thus appear in `closer_peers`
before the already active / pending peers.
4. The `Frozen` query mechanism allowed all remaining not-yet contacted
peers to be contacted, but their results were discarded, because
`inject_rpc_result` would only incorporate results while the
query is `Iterating`. The `Frozen` state has been reworked into
a `Stalled` state that implements a slightly more permissive
variant of the following from the paper / specs: "If a round of
FIND_NODEs fails to return a node any closer than the closest
already seen, the initiator resends the FIND_NODE to all of the
k closest nodes it has not already queried.". Importantly, though
not explicitly mentioned, the query can move back to `Iterating`
if it makes further progress again as a result of these requests.
The `Stalled` state thus allows (temporarily) higher parallelism
in an effort to make progress and bring the query to an end.
Performance:
1. Repeated distance calculations between the same peers and the
target are avoided.
2. Enabled by #1108, use of a more appropriate data structure (`BTreeMap`) for
the incrementally updated list of closer peers. The data structure needs
efficient lookups (to avoid duplicates) and insertions at any position,
both of which large(r) vectors are not that good at. Unscientific benchmarks
showed a ~40-60% improvement in somewhat pathological scenarios with at least
20 healthy nodes, each possibly returning a distinct list of 20 closer peers
to the requestor. A previous assumption may have been that the vector always
stays very small, but that is not the case in larger clusters: Even if the
lists of closer peers reported by the 20 contacted peers are heavily overlapping,
typically a lot more than 20 peers have to be (at least temporarily) considered
as closest peers until the query completes. See also issue (2) above.
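The data-structure point above can be sketched as follows. This is a simplified illustration, not the code in `query/peers/closest.rs`: keys are plain `u64` values instead of hashed keys, and `ClosestPeers`, `PeerState` and all method names are hypothetical.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// Hypothetical per-peer state within a query.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PeerState {
    NotContacted,
    Waiting,
    Succeeded,
    Failed,
}

/// Candidate peers of a query, ordered by XOR distance to the target.
struct ClosestPeers {
    target: u64,
    /// For a fixed target the XOR distance identifies a key uniquely,
    /// so the distance doubles as the deduplication key.
    by_distance: BTreeMap<u64, (u64, PeerState)>,
}

impl ClosestPeers {
    fn new(target: u64) -> Self {
        ClosestPeers { target, by_distance: BTreeMap::new() }
    }

    /// Inserts a reported peer; returns `false` if it was already known.
    /// The distance is computed once, on insertion.
    fn insert(&mut self, peer: u64) -> bool {
        match self.by_distance.entry(peer ^ self.target) {
            Entry::Vacant(slot) => {
                slot.insert((peer, PeerState::NotContacted));
                true
            }
            Entry::Occupied(_) => false,
        }
    }

    /// Updates the state of a known peer; unknown peers are ignored.
    fn set_state(&mut self, peer: u64, state: PeerState) {
        if let Some(entry) = self.by_distance.get_mut(&(peer ^ self.target)) {
            entry.1 = state;
        }
    }

    /// The (at most) `n` closest peers that responded successfully.
    fn closest_succeeded(&self, n: usize) -> Vec<u64> {
        self.by_distance
            .values()
            .filter(|(_, state)| *state == PeerState::Succeeded)
            .map(|(peer, _)| *peer)
            .take(n)
            .collect()
    }
}
```

Because failed and pending peers stay in the map, a peer that is not currently among the `n` closest can still end up in the result once closer peers fail, which is the situation described in correctness issue (2).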
New tests are added for:
* Query termination conditions.
* Bounded parallelism.
* Absence of duplicates.
* initial implementation of the records
* move to multihash keys
* correctly process query results
* comments and formatting
* correctly return closer_peers in query
* checking wrong peer id in test
* Apply suggestions from code review
Co-Authored-By: Roman Borschel <romanb@users.noreply.github.com>
* Fix changes from suggestions
* Send responses to PUT_VALUE requests
* Shortcut in get_value
* Update protocols/kad/src/behaviour.rs
Co-Authored-By: Roman Borschel <romanb@users.noreply.github.com>
* Revert "Update protocols/kad/src/behaviour.rs"
This reverts commit 579ce742a7f4c94587f1e1f0866d2a3a37418efb.
* Remove duplicate insertion
* Adds a record to a PUT_VALUE response
* Fix a racy put_value test
* Store value ourselves only if we are in K closest
* Abstract over storage
* Revert "Abstract over storage": bad take
This reverts commit eaebf5b6d915712eaf3b05929577fdf697f204d8.
* Abstract over records storage using hashmap as default
* Constructor for custom records
* New Record type and its traits
* Fix outdated storage name
* Fixes returning an event
* Change FindNodeReq key type to Multihash
* WriteState for a second stage of a PUT_VALUE request
* GET_VALUE should not have a record
* Refactor a match arm
* Add successes and failures counters to PutValueRes
* If value is found no need to return closer peers
* Remove a custom storage from tests
* Rename a test to get_value_not_found
* Adds a TODO to change FindNode request key to Multihash
Co-Authored-By: Roman Borschel <romanb@users.noreply.github.com>
* Move MemoryRecordStorage to record.rs
* Return a Cow-ed Record from get
* Fix incorrect GET_VALUE parsing
* Various fixes with review
* Fixes get_value_not_found
* Fix peerids names in test
* another fix
* PutValue correctly distributes values
* Simplify the test
* Check that results are actually the closest
* Reverts changes to tests
* Fix the test topology and checking the results
* Run put_value test ten times
* Adds a get_value test
* Apply suggestions from code review
Co-Authored-By: Roman Borschel <romanb@users.noreply.github.com>
* Make Record fields public
* Moves WriteState to write.rs
* A couple of minor fixes
* Another few fixes of review
* Simplify the put_value test
* Don't synchronously return an error from put_value
* Formatting fixes and comments
* Collect a bunch of results
* Take exactly as many elements as needed
* Check if the peer is still connected
* Adds a multiple GetValueResults results number test
* Unnecessary mut iterators in put_value
* Ask for num_results in get_value
* Don't allocate twice in get_value
* Don't count same errored peer multiple times
* Apply suggestions from code review
Co-Authored-By: Roman Borschel <romanb@users.noreply.github.com>
* Fix another review
* Apply suggestions from code review
Co-Authored-By: Pierre Krieger <pierre.krieger1708@gmail.com>
* Bring back FromIterator and improve a panic message
* Update protocols/kad/src/behaviour.rs
Co-Authored-By: Pierre Krieger <pierre.krieger1708@gmail.com>
* Fix status updates in KBuckets.
`KBucket::update` currently does not correctly update the `first_connected_pos`
before reinserting the node. This may result in connected nodes being considered
disconnected and thus eligible for replacement by a pending node if the bucket is
full and a new connected node is added.
Tests have been added for checking that `KBucket::update` preserves the status
and ordering of all other nodes in the bucket.
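A minimal sketch of the invariant involved, assuming a bucket that stores disconnected nodes first and connected nodes last, with `first_connected_pos` marking the boundary. The `Bucket` and `Status` types here are simplified stand-ins, not the actual `KBucket` implementation.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Connected,
    Disconnected,
}

/// Simplified bucket: disconnected nodes come first, connected nodes last.
struct Bucket {
    nodes: Vec<u32>,
    /// Position of the first connected node, if any.
    first_connected_pos: Option<usize>,
}

impl Bucket {
    /// The status of a node is implied by its position.
    fn status(&self, pos: usize) -> Status {
        match self.first_connected_pos {
            Some(first) if pos >= first => Status::Connected,
            _ => Status::Disconnected,
        }
    }

    /// Moves `key` to the most recent position for `new_status`.
    fn update(&mut self, key: u32, new_status: Status) {
        let pos = match self.nodes.iter().position(|k| *k == key) {
            Some(pos) => pos,
            None => return,
        };
        let old_status = self.status(pos);
        self.nodes.remove(pos);
        // Adjust the boundary for the removal *before* reinserting the node.
        match old_status {
            Status::Connected => {
                // The removed node was the only connected one.
                if self.first_connected_pos == Some(self.nodes.len()) {
                    self.first_connected_pos = None;
                }
            }
            Status::Disconnected => {
                if let Some(first) = self.first_connected_pos.as_mut() {
                    *first -= 1;
                }
            }
        }
        match new_status {
            Status::Connected => {
                self.nodes.push(key);
                if self.first_connected_pos.is_none() {
                    self.first_connected_pos = Some(self.nodes.len() - 1);
                }
            }
            Status::Disconnected => {
                // Insert at the end of the disconnected section.
                let at = self.first_connected_pos.unwrap_or(self.nodes.len());
                self.nodes.insert(at, key);
                if let Some(first) = self.first_connected_pos.as_mut() {
                    *first += 1;
                }
            }
        }
    }
}
```

Skipping the adjustment after the removal would leave the boundary one position off, so a connected node next to it could be reported as disconnected.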
* Small test improvement.
Set an expectation for the new position, instead of taking the assigned position.
* Kademlia: Optimise iteration over closest entries.
The current implementation for finding the entries whose keys are closest
to some target key in the Kademlia routing table involves copying the
keys of all buckets into a new `Vec` which is then sorted based on the
distances to the target and turned into an iterator from which only a
small number of elements (by default 20) are drawn.
This commit introduces an iterator over buckets for finding the closest
keys to a target that visits the buckets in the optimal order, based on
the information contained in the distance bit-string representing the
distance between the local key and the target.
Correctness is tested against full-table scans.
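The bucket order can be derived from the distance bit string as sketched below. This is a toy version on 8-bit keys with hypothetical function names, not the iterator added by this commit: buckets at set bits of the distance are visited from the most significant bit down, then buckets at unset bits from the least significant bit up.

```rust
/// Bucket of `key` relative to `local`: the position of the highest set bit
/// of their XOR distance, or `None` if the keys are equal.
fn bucket_index(local: u8, key: u8) -> Option<u32> {
    let d = local ^ key;
    if d == 0 {
        None
    } else {
        Some(7 - d.leading_zeros())
    }
}

/// Whether bit `i` of the distance `d` is set.
fn bit(d: u8, i: u32) -> bool {
    (d >> i) & 1 == 1
}

/// Order in which to visit the buckets of `local` so that keys are seen
/// in increasing distance to `target`, bucket by bucket.
fn bucket_visit_order(local: u8, target: u8) -> Vec<u32> {
    let d = local ^ target;
    // Buckets at set bits of `d` hold keys closer to the target than the
    // local key; the most significant set bit gives the closest bucket.
    let zoom_in = (0..8u32).rev().filter(|&i| bit(d, i));
    // Buckets at unset bits hold keys farther from the target than the
    // local key; the least significant unset bit comes first.
    let zoom_out = (0..8u32).filter(|&i| !bit(d, i));
    zoom_in.chain(zoom_out).collect()
}
```

For example, with local key `0` and target `5` (distance `0b101`), the order is bucket 2, bucket 0, then buckets 1, 3, 4, 5, 6, 7. Every key in an earlier bucket is closer to the target than every key in a later one, so no sort over the whole table is needed.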
Also included:
* Updated documentation.
* The `Entry` API was moved to the `kbucket::entry` sub-module for
ease of maintenance.
* The pending node handling has been slightly refactored in order to
bring code and documentation in agreement and clarify the semantics
a little.
* Rewrite pending node handling and add tests.
There are two issues with the current definition and use of Kademlia's
XOR metric:
1. The distance is currently equated with the bucket index, i.e.
`distance(a,b) - 1` is the index of the bucket into which either
peer is put by the other. The result is a metric that is not
unidirectional, as defined in the Kademlia paper and as implemented
in e.g. libp2p-go and libp2p-js, which is to interpret the result
of the XOR as an integer in its entirety.
2. The current `KBucketsPeerId` trait and its instances allow computing
distances between types with differing bit lengths as well as between
types that hash all inputs again (i.e. `KadHash`) and "plain" `PeerId`s
or `Multihash`es. This can result in computed distances that are either
incorrect as per the requirement of the libp2p specs that all distances
are to be computed from the XOR of the SHA256 of the input keys, or
even fall outside of the image of the metric used for the `KBucketsTable`.
In the latter case, such distances are not currently used as a bucket index
- they can only occur in the context of comparing distances for the purpose
of sorting peers - but that still seems undesirable.
These issues are addressed here as follows:
* Unidirectionality of the XOR metric is restored by keeping the "full"
integer representation of the bitwise XOR. The result is an XOR metric
as defined in the paper. This also opens the door to avoiding the
"full table scan" when searching for the keys closest to a given key -
the ideal order in which to visit the buckets can be computed with the
help of the distance bit string.
* As a simplification and to make it easy to "do the right thing", the
XOR metric is only defined on an opaque `kbucket::Key` type, partially
derived from the current `KadHash`. `KadHash` and `KBucketsPeerId`
are removed.
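A minimal sketch of the "full" XOR distance, assuming 32-byte keys. The `Key` and `Distance` types below are illustrative only; in particular, this `Key` wraps raw bytes and does not hash its input, unlike what the specs require.

```rust
/// Illustrative opaque key. Assumption: the bytes would be the SHA256 of
/// the original input; this sketch takes them as given.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Key([u8; 32]);

/// The bitwise XOR of two keys, interpreted as a big-endian 256-bit integer.
/// The derived `Ord` compares the bytes lexicographically, which matches
/// big-endian integer order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Distance([u8; 32]);

impl Key {
    fn distance(&self, other: &Key) -> Distance {
        let mut out = [0u8; 32];
        for (o, (a, b)) in out.iter_mut().zip(self.0.iter().zip(other.0.iter())) {
            *o = a ^ b;
        }
        Distance(out)
    }
}

impl Distance {
    /// The bucket index is the position of the highest set bit,
    /// or `None` for the zero distance (a key compared with itself).
    fn bucket_index(&self) -> Option<u32> {
        let mut zeros = 0u32;
        for byte in self.0.iter() {
            if *byte == 0 {
                zeros += 8;
            } else {
                return Some(255 - (zeros + byte.leading_zeros()));
            }
        }
        None
    }
}
```

Two keys that fall into the same bucket still have different distances to a given key, which is what makes the metric unidirectional; the bucket index is derived from the distance rather than equated with it.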
* Replace `secp256k1` crate with `libsecp256k1`.
Unfortunately we could not implement `AsRef<[u8]>` for `SecretKey`
as the crate does not provide a means to do so.
* Fix `DecodingError` invocation.
* Remove the cc for wasm
* Revert "Remove the cc for wasm"
This reverts commit 3a19db35e62931c6e9ffbff6c21f9b0d7ae5403a.
* Fix CircleCI build
* Integrate use of identity keys into libp2p-noise.
In order to make libp2p-noise usable with a `Swarm`, which requires a
`Transport::Output` that is a pair of a peer ID and an implementation
of `StreamMuxer`, it is necessary to bridge the gap between static
DH public keys and public identity keys from which peer IDs are derived.
Because the DH static keys and the identity keys need not be
related, it is thus generally necessary that the public identity keys are
exchanged as part of the Noise handshake, which the Noise protocol
accommodates through the use of handshake message payloads.
The implementation of the existing (IK, IX, XX) handshake patterns is thus
changed to send the public identity keys in the handshake payloads.
Additionally, to facilitate the use of any identity keypair with Noise
handshakes, the static DH public keys are signed using the identity
keypairs and the signatures sent alongside the public identity key
in handshake payloads, unless the static DH public key is "linked"
to the public identity key by other means, e.g. when an Ed25519 identity
keypair is (re)used as an X25519 keypair.
* libp2p-noise doesn't build for wasm.
Thus the development transport needs to be still constructed with secio
for transport security when building for wasm.
* Documentation tweaks.
* For consistency, avoid wildcard enum imports.
* For consistency, avoid wildcard enum imports.
* Slightly simplify io::handshake::State::finish.
* Simplify creation of 2-byte arrays.
* Remove unnecessary cast and obey 100 char line limit.
* Update protocols/noise/src/protocol.rs
Co-Authored-By: romanb <romanb@users.noreply.github.com>
* Address more review comments.
* Cosmetics
* Cosmetics
* Give authentic DH keypairs a distinct type.
This has a couple of advantages:
* Signing the DH public key only needs to happen once, before
creating a `NoiseConfig` for an authenticated handshake.
* The identity keypair only needs to be borrowed and can be
dropped if it is not used further outside of the Noise
protocol, since it is no longer needed during Noise handshakes.
* It is explicit in the construction of a `NoiseConfig` for
a handshake pattern, whether it operates with a plain `Keypair`
or a keypair that is authentic w.r.t. a public identity key
and future handshake patterns may be built with either.
* The function signatures for constructing `NoiseConfig`s for
handshake patterns are simplified and a few unnecessary trait
bounds removed.
* Post-merge corrections.
* Add note on experimental status of libp2p-noise.
Although not explicitly mentioned in the paper, it seems clear that
including an entry for the requesting peer in a FIND_NODE response
never gives useful information and just occupies a result slot that may
have been better filled with another peer that the requestor may not
know about.
There is one explicit mention that this is the desired behavior
in a somewhat dated design document of another p2p framework [1]:
"The recipient of a FIND_NODE should never return a triple containing
the nodeID of the requestor."
The same reasoning supposedly applies to the libp2p-specific `GET_PROVIDERS`
request.
[1] http://xlattice.sourceforge.net/components/protocol/kademlia/specs.html#FIND_NODE
* Fix self-dialing in Kademlia.
Addresses https://github.com/libp2p/rust-libp2p/issues/341 which is the cause
for one of the observations made in https://github.com/libp2p/rust-libp2p/issues/1053.
However, the latter is not assumed to be fully addressed by these changes and
needs further investigation.
Currently, whenever a search for a key yields a response containing the initiating
peer as one of the closest peers known to the remote, the local node
would attempt to dial itself. That attempt is ignored by the Swarm, but
the Kademlia behaviour now believes it still has a query ongoing which is
always doomed to time out. That timeout delays successful completion of the query.
Hence, any query where a remote responds with the ID of the local node takes at
least as long as the `rpc_timeout` to complete, which possibly affects almost
all queries in smaller clusters where every node knows about every other.
This problem is fixed here by ensuring that Kademlia never tries to dial the local node.
Furthermore, `Discovered` events are no longer emitted for the local node
and it is not inserted into the `untrusted_addresses` from discovery, as described
in #341.
This commit also includes a change to the condition for freezing / terminating
a Kademlia query upon receiving a response. Specifically, the condition is
tightened such that it only applies if in addition to `parallelism`
consecutive responses that failed to yield a peer closer to the target, the
last response must also either not have reported any new peer or the
number of collected peers has already reached the number of desired results.
In effect, a Kademlia query now tries harder to actually return `k`
closest peers.
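The tightened condition can be written as a small predicate. This is a sketch with hypothetical names (`Progress`, `should_freeze`), not the code in the query implementation.

```rust
/// Hypothetical snapshot of a query's progress when a response arrives.
struct Progress {
    /// Consecutive responses that failed to yield a peer closer to the target.
    no_progress_streak: usize,
    /// The configured parallelism of the query.
    parallelism: usize,
    /// Whether the last response reported any peer not seen before.
    last_reported_new_peer: bool,
    /// Number of peers collected so far.
    collected: usize,
    /// Number of desired results.
    num_results: usize,
}

/// Whether the query should be frozen / terminated.
fn should_freeze(p: &Progress) -> bool {
    p.no_progress_streak >= p.parallelism
        && (!p.last_reported_new_peer || p.collected >= p.num_results)
}
```

Under this predicate a query that has stopped getting closer, but still learns about new peers while short of `num_results`, keeps going.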
Tests have been refactored and expanded.
* Add another comment.
* muxing: adds an error type to streammuxer
* Update examples/chat.rs
Co-Authored-By: montekki <fedor.sakharov@gmail.com>
* make the trait error type bound to io error
This is now a very simple option serving multiple purposes:
* It allows for stable (integration) tests involving a Swarm, which
are otherwise subject to race conditions due to the connection being
allowed to terminate at any time with `KeepAlive::No`
(which remains the default).
* It makes for a more entertaining ping example which continuously
sends pings.
* Maybe someone wants to use the ping protocol for application-layer
connection keep-alive after all.
* Bump to 0.7.0
* Update CHANGELOG.md
Co-Authored-By: tomaka <pierre.krieger1708@gmail.com>
* Update for #1078
* New version of multihash and multiaddr as well
* Fix connection & handler shutdown when using `KeepAlive::Now`.
`Delay::new(Instant::now())` is never immediately ready, resulting in
`KeepAlive::Now` having no effect, since the delay is re-created on
every execution of `poll()` in the `NodeHandlerWrapper`. It can also
send the node handler into a busy-loop, since every newly
created Delay will trigger a task wakeup, which creates a new Delay
with Instant::now(), and so forth.
The use of `Delay::new(Instant::now())` for "immediate" connection shutdown
is therefore removed here entirely. An important assumption is thereby
that as long as the node handler has non-empty `negotiating_in` and `negotiating_out`,
the handler is not dependent on such a Delay for task wakeup.
* Correction to the libp2p-ping connection timeout.
The current connection timeout is always short of one `interval`,
because the "countdown" begins with the last received or sent pong
(depending on the policy). In effect, the current default config has
a connection timeout of 5 seconds (20 - 15) from the point when a ping is sent.
Instead, the "countdown" of the connection timeout should always begin
with the next scheduled ping. That also makes all configurations valid,
avoiding pitfalls.
The important properties of the ping handler are now checked to hold for all
configurations, in particular:
* The next ping must be scheduled no earlier than the ping interval
and no later than the connection timeout.
* The "countdown" for the connection timeout starts on the next ping,
i.e. the full connection timeout remains at the instant when the
next ping is sent.
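The arithmetic behind the correction, as a small sketch. The function names are hypothetical, and the assignment of 20 seconds to the timeout and 15 seconds to the interval is inferred from the "(20 - 15)" above.

```rust
use std::time::Duration;

/// Old behaviour: the countdown starts one `interval` before the next ping
/// is sent, so only `timeout - interval` remains at that point.
/// `None` means the configuration was invalid (interval > timeout).
fn old_remaining_at_next_ping(timeout: Duration, interval: Duration) -> Option<Duration> {
    timeout.checked_sub(interval)
}

/// New behaviour: the countdown starts when the next ping is sent, so the
/// full timeout remains regardless of the interval.
fn new_remaining_at_next_ping(timeout: Duration, _interval: Duration) -> Duration {
    timeout
}
```

With a timeout of 20 seconds and an interval of 15 seconds, the old scheme leaves 5 seconds and the new one 20 seconds; an interval longer than the timeout no longer produces an invalid configuration.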
* Do not keep connections alive.
The ping protocol is not supposed to keep otherwise idle connections
alive, only to add an additional condition for terminating them in
the form of a configurable number of consecutive failed ping requests.
In this context, the `PingPolicy` does not seem useful any longer.
* Trigger CI.
The functionality is available through `Multiaddr::replace`.
What we currently call "nat_traversal" is merely a replacement of an IP
address prefix in a `Multiaddr`, hence it can be done directly on
`Multiaddr` values instead of having to go through a `Transport`.
In addition this PR consolidates changes made to `Multiaddr` in
previous commits which resulted in lots of deprecations. It adds some
more (see below for the complete list of API changes) and removes all
deprecated functionality, requiring a minor version bump.
Here are the changes to `multiaddr` compared to the currently published
version:
1. Removed `into_bytes` (use `to_vec` instead).
2. Renamed `to_bytes` to `to_vec`.
3. Removed `from_bytes` (use the `TryFrom` impl instead).
4. Added `with_capacity`.
5. Added `len`.
6. Removed `as_slice` (use `AsRef` impl instead).
7. Removed `encapsulate` (use `push` or `with` instead).
8. Removed `decapsulate` (use `pop` instead).
9. Renamed `append` to `push`.
10. Added `with`.
11. Added `replace`.
12. Removed `ToMultiaddr` trait (use `TryFrom` instead).
* libp2p-ping improvements.
* re #950: Removes use of the `OneShotHandler`, but still sends each
ping over a new substream, which seems to be intentional since #828.
* re #842: Adds an integration test that exercises the ping behaviour through
a Swarm, requiring the RTT to be below a threshold. This requires disabling
Nagle's algorithm as it can interact badly with delayed ACKs (and has been
observed to do so in the context of the new ping example and integration test).
* re #864: Control of the inbound and outbound (sub)stream protocol upgrade
timeouts has been moved from the `NodeHandlerWrapperBuilder` to the
`ProtocolsHandler`. That may also alleviate the need for a custom timeout
on an `OutboundSubstreamRequest` as a `ProtocolsHandler` is now free to
adjust these timeouts over time.
Other changes:
* A new ping example.
* Documentation improvements.
* More documentation improvements.
* Add PingPolicy and ensure no event is dropped.
* Remove inbound_timeout/outbound_timeout.
As per review comment, the inbound timeout is now configured
as part of the `listen_protocol` and the outbound timeout as
part of the `OutboundSubstreamRequest`.
* Simplify and generalise.
Generalise `ListenProtocol` to `SubstreamProtocol`, reusing it in
the context of `ProtocolsHandlerEvent::OutboundSubstreamRequest`.
* Doc comments for SubstreamProtocol.
* Adapt to changes in master.
* Relax upper bound for ping integration test rtt.
For "slow" CI build machines?
Wildcard IP addresses (e.g. 0.0.0.0) are used to listen on all host
interfaces. To report those addresses such that clients know about them
and can actually make use of them we use the `get_if_addrs` crate and
maintain a collection of addresses. We report the whole expansion at the
very beginning of the listener stream with `ListenerEvent::NewAddress`
events and add new addresses should they come to our attention.
What remains to be done is to potentially allow users to filter IP
addresses, for example the local loopback one, and to detect expired
addresses not only if a new address is discovered.
Replace the listener and address pair returned from `Transport::listen_on` with just a listener that produces `ListenerEvent` values which include upgrades as well as address changes.