31 Commits

Author SHA1 Message Date
folex
4d6478a38e Merge branch 'upstream_master' into merge_0.36
# Conflicts:
#	Cargo.toml
#	core/Cargo.toml
#	examples/distributed-key-value-store.rs
#	misc/multiaddr/Cargo.toml
#	misc/multiaddr/src/onion_addr.rs
#	misc/multistream-select/Cargo.toml
#	muxers/mplex/Cargo.toml
#	muxers/yamux/Cargo.toml
#	protocols/floodsub/Cargo.toml
#	protocols/gossipsub/Cargo.toml
#	protocols/identify/Cargo.toml
#	protocols/kad/Cargo.toml
#	protocols/kad/src/addresses.rs
#	protocols/kad/src/behaviour.rs
#	protocols/kad/src/behaviour/test.rs
#	protocols/kad/src/kbucket/bucket.rs
#	protocols/kad/src/kbucket/key.rs
#	protocols/mdns/Cargo.toml
#	protocols/ping/Cargo.toml
#	protocols/request-response/Cargo.toml
#	protocols/secio/Cargo.toml
#	swarm-derive/Cargo.toml
#	swarm/Cargo.toml
#	transports/deflate/Cargo.toml
#	transports/dns/Cargo.toml
#	transports/noise/Cargo.toml
#	transports/plaintext/Cargo.toml
#	transports/tcp/Cargo.toml
#	transports/uds/Cargo.toml
#	transports/wasm-ext/Cargo.toml
#	transports/websocket/Cargo.toml
2021-03-05 15:20:08 +03:00
Roman Borschel
6499e924a3
Make clippy "happy". (#1950)
* Make clippy "happy".

Address all clippy complaints that are not purely stylistic (or even
have corner cases with false positives). Ignore all "style" and "pedantic" lints.

* Fix tests.

* Undo unnecessary API change.
2021-02-15 11:59:51 +01:00
folex
8b6a5db985 Merge branch 'rust_master' into fluence_master
# Conflicts:
#	protocols/kad/src/behaviour.rs
#	protocols/kad/src/query.rs
2020-07-30 16:36:51 +03:00
Demi Obenour
9178459cc8
Automatic fixes by cargo-fix (#1662) 2020-07-27 22:27:33 +02:00
folex
fdeda46898 Disable disjoint queries in tests, add certificate dissemination test 2020-07-07 20:07:36 +03:00
folex
c908e194ef Merge branch 'rust_master' into libp2p_0_20
# Conflicts:
#	.github/workflows/ci.yml
#	Cargo.toml
#	core/Cargo.toml
#	examples/distributed-key-value-store.rs
#	misc/multistream-select/Cargo.toml
#	muxers/mplex/Cargo.toml
#	protocols/deflate/Cargo.toml
#	protocols/gossipsub/Cargo.toml
#	protocols/identify/Cargo.toml
#	protocols/kad/src/behaviour.rs
#	protocols/kad/src/behaviour/test.rs
#	protocols/kad/src/kbucket.rs
#	protocols/kad/src/kbucket/bucket.rs
#	protocols/kad/src/lib.rs
#	protocols/kad/src/query.rs
#	protocols/kad/src/query/peers/closest.rs
#	protocols/mdns/Cargo.toml
#	protocols/ping/Cargo.toml
#	protocols/secio/Cargo.toml
#	transports/tcp/Cargo.toml
#	transports/uds/Cargo.toml
2020-07-01 17:30:41 +03:00
Max Inden
9dd2d662e9
protocols/kad: Implement S-Kademlia's lookup over disjoint paths v2 (#1473)
The extension paper S-Kademlia includes a proposal for lookups over
disjoint paths. Within vanilla Kademlia, queries keep track of the
closest nodes in a single bucket. Any adversary along the path can thus
influence all future paths, in case they can come up with the
next-closest (not overall closest) hops. S-Kademlia tries to solve the
attack above by querying over disjoint paths using multiple buckets.

To adjust the libp2p Kademlia implementation accordingly this change-set
introduces an additional peers iterator: `ClosestDisjointPeersIter`.
This new iterator wraps around a set of `ClosestPeersIter`
`ClosestDisjointPeersIter` enforces that each of the `ClosestPeersIter`
explore disjoint paths by having each peer instantly return that was
queried by a different iterator before.
2020-06-19 12:22:26 +02:00
folex
3e2ed64faa
Weighted routing (#17) 2020-06-05 20:04:53 +03:00
Roman Borschel
3a96ebf57f
More insight into Kademlia queries. (#1567)
* [libp2p-kad] Provide more insight and control into Kademlia queries.

More insight: The API allows iterating over the active queries and
inspecting their state and execution statistics.

More control: The API allows aborting queries prematurely
at any time.

To that end, API operations that initiate new queries return the query ID
and multi-phase queries such as `put_record` retain the query ID across all
phases, each phase being executed by a new (internal) query.

* Cleanup

* Cleanup

* Update examples and re-exports.

* Incorporate review feedback.

* Update CHANGELOG

* Update CHANGELOG

Co-authored-by: Max Inden <mail@max-inden.de>
2020-05-16 10:43:09 +02:00
Max Inden
06721128f1
protocols/kad: Remove unnecessary PeerId conversion for fixed iter (#1573)
`FixedPeersIter` requires the initial set of peers to be passed as
`PeerId`s and not as `Key<PeerId>`s. This commit removes the unnecessary
conversion.
2020-05-13 16:47:02 +02:00
Roman Borschel
cde93f5432
Kademlia: Somewhat complete the records implementation. (#1189)
* Somewhat complete the implementation of Kademlia records.

This commit relates to [libp2p-146] and [libp2p-1089].

  * All records expire (by default, configurable).
  * Provider records are also stored in the RecordStore, and the RecordStore
    API extended.
  * Background jobs for periodic (re-)replication and (re-)publication
    of records. Regular (value-)records are subject to re-replication and
    re-publication as per standard Kademlia. Provider records are only
    subject to re-publication.
  * For standard Kademlia value lookups (quorum = 1), the record is cached
    at the closest peer to the key that did not return the value, as per
    standard Kademlia.
  * Expiration times of regular (value-)records is computed exponentially
    inversely proportional to the number of nodes between the local node
    and the closest node known to the key (beyond the k closest), as per
    standard Kademlia.

The protobuf messages are extended with two fields: `ttl` and `publisher`
in order to implement the different semantics of re-replication (by any
of the k closest peers to the key, not affecting expiry) and re-publication
(by the original publisher, resetting the expiry). This is not done yet in
other libp2p Kademlia implementations, see e.g. [libp2p-go-323]. The new protobuf fields
have been given somewhat unique identifiers to prevent future collision.

Similarly, periodic re-publication of provider records does not seem to
be done yet in other implementations, see e.g. [libp2p-js-98].

[libp2p-146]: https://github.com/libp2p/rust-libp2p/issues/146
[libp2p-1089]: https://github.com/libp2p/rust-libp2p/issues/1089
[libp2p-go-323]: https://github.com/libp2p/go-libp2p-kad-dht/issues/323
[libp2p-js-98]: https://github.com/libp2p/js-libp2p-kad-dht/issues/98

* Tweak kad-ipfs example.

* Add missing files.

* Ensure new delays are polled immediately.

To ensure task notification, since `NotReady` is returned right after.

* Fix ipfs-kad example and use wasm_timer.

* Small cleanup.

* Incorporate some feedback.

* Adjustments after rebase.

* Distinguish events further.

In order for a user to easily distinguish the result of e.g.
a `put_record` operation from the result of a later republication,
different event constructors are used. Furthermore, for now,
re-replication and "caching" of records (at the closest peer to
the key that did not return a value during a successful lookup)
do not yield events for now as they are less interesting.

* Speed up tests for CI.

* Small refinements and more documentation.

  * Guard a node against overriding records for which it considers
    itself to be the publisher.

  * Document the jobs module more extensively.

* More inline docs around removal of "unreachable" addresses.

* Remove wildcard re-exports.

* Use NonZeroUsize for the constants.

* Re-add method lost on merge.

* Add missing 'pub'.

* Further increase the timeout in the ipfs-kad example.

* Readd log dependency to libp2p-kad.

* Simplify RecordStore API slightly.

* Some more commentary.

* Change Addresses::remove to return Result<(),()>.

Change the semantics of `Addresses::remove` so that the error case
is unambiguous, instead of the success case. Use the `Result` for
clearer semantics to that effect.

* Add some documentation to .
2019-07-17 14:40:48 +02:00
Roman Borschel
ef9cb056b2
Kademlia: Address some TODOs - Refactoring - API updates. (#1174)
* Address some TODOs, refactor queries and public API.

The following left-over issues are addressed:

  * The key for FIND_NODE requests is generalised to any Multihash,
    instead of just peer IDs.
  * All queries get a (configurable) timeout.
  * Finishing queries as soon as enough results have been received is simplified
    to avoid code duplication.
  * No more panics in provider-API-related code paths. The provider API is
    however still untested and (I think) still incomplete (e.g. expiration
    of provider records).
  * Numerous smaller TODOs encountered in the code.

The following public API changes / additions are made:

  * Introduce a `KademliaConfig` with new configuration options for
    the replication factor and query timeouts.
  * Rename `find_node` to `get_closest_peers`.
  * Rename `get_value` to `get_record` and `put_value` to `put_record`,
    introducing a `Quorum` parameter for both functions, replacing the
    existing `num_results` parameter with clearer semantics.
  * Rename `add_providing` to `start_providing` and `remove_providing`
    to `stop_providing`.
  * Add a `bootstrap` function that implements a (almost) standard
    Kademlia bootstrapping procedure.
  * Rename `KademliaOut` to `KademliaEvent` with an updated list of
    constructors (some renaming). All events that report query results
    now report a `Result` to uniformly permit reporting of errors.

The following refactorings are made:

  * Introduce some constants.
  * Consolidate `query.rs` and `write.rs` behind a common query interface
    to reduce duplication and facilitate better code reuse, introducing
    the notion of a query peer iterator. `query/peers/closest.rs`
    contains the code that was formerly in `query.rs`. `query/peers/fixed.rs` contains
    a modified variant of `write.rs` (which is removed). The new `query.rs`
    provides an interface for working with a collection of queries, taking
    over some code from `behaviour.rs`.
  * Reduce code duplication in tests and use the current_thread runtime for
    polling swarms to avoid spurious errors in the test output due to aborted
    connections when a test finishes prematurely (e.g. because a quorum of
    results has been collected).
  * Some additions / improvements to the existing tests.

* Fix test.

* Fix rebase.

* Tweak kad-ipfs example.

* Incorporate some feedback.

* Provide easy access and conversion to keys in error results.
2019-07-03 16:16:25 +02:00
Roman Borschel
69bd0dfffb
Refactor iterative queries. (#1154)
Refactoring of iterative queries (`query.rs`) to improve both
correctness and performance (for larger DHTs):

Correctness:

  1. Queries no longer terminate prematurely due to counting results
     from peers farther from the target while results from closer
     peers are still pending. (#1105).

  2. Queries no longer ignore reported closer peers that are not duplicates
     just because they are currently not among the `num_results` closest.
     The currently `max_results` closest may contain peers marked as failed
     or pending / waiting. Hence all reported closer peers that are not
     duplicates must be considered candidates that may still end up
     among the `num_results` closest that successfully responded.

  3. Bounded parallelism based on the `active_counter` was not working
     correctly, as new (not yet contacted) peers closer to the target
     may be discovered at any time and thus appear in `closer_peers`
     before the already active / pending peers.

  4. The `Frozen` query mechanism allowed all remaining not-yet contacted
     peers to be contacted, but their results were discarded, because
     `inject_rpc_result` would only incorporate results while the
     query is `Iterating`. The `Frozen` state has been reworked into
     a `Stalled` state that implements a slightly more permissive
     variant of the following from the paper / specs: "If a round of
     FIND_NODEs fails to return a node any closer than the closest
     already seen, the initiator resends the FIND_NODE to all of the
     k closest nodes it has not already queried.". Importantly, though
     not explicitly mentioned, the query can move back to `Iterating`
     if it makes further progress again as a result of these requests.
     The `Stalled` state thus allows (temporarily) higher parallelism
     in an effort to make progress and bring the query to an end.

Performance:

  1. Repeated distance calculations between the same peers and the
     target is avoided.

  2. Enabled by #1108, use of a more appropriate data structure (`BTreeMap`) for
     the incrementally updated list of closer peers. The data structure needs
     efficient lookups (to avoid duplicates) and insertions at any position,
     both of which large(r) vectors are not that good at. Unscientific benchmarks
     showed a ~40-60% improvement in somewhat pathological scenarios with at least
     20 healthy nodes, each possibly returning a distinct list of closer 20 peers
     to the requestor. A previous assumption may have been that the vector always
     stays very small, but that is not the case in larger clusters: Even if the
     lists of closer peers reported by the 20 contacted peers are heavily overlapping,
     typically a lot more than 20 peers have to be (at least temporarily) considered
     as closest peers until the query completes. See also issue (2) above.

New tests are added for:

  * Query termination conditions.
  * Bounded parallelism.
  * Absence of duplicates.
2019-06-20 13:26:09 +02:00
Roman Borschel
710ce90c6e
Kademlia: Fix premature query termination. (#1155)
* Fix premature query termination.

* Don't skip over 'NotContacted' closest peers.

* Code style tweak.
2019-06-03 12:47:01 +02:00
Roman Borschel
09f54df44d
Kademlia: Optimise iteration over closest keys / entries. (#1117)
* Kademlia: Optimise iteration over closest entries.

The current implementation for finding the entries whose keys are closest
to some target key in the Kademlia routing table involves copying the
keys of all buckets into a new `Vec` which is then sorted based on the
distances to the target and turned into an iterator from which only a
small number of elements (by default 20) are drawn.

This commit introduces an iterator over buckets for finding the closest
keys to a target that visits the buckets in the optimal order, based on
the information contained in the distance bit-string representing the
distance between the local key and the target.

Correctness is tested against full-table scans.

Also included:

  * Updated documentation.
  * The `Entry` API was moved to the `kbucket::entry` sub-module for
    ease of maintenance.
  * The pending node handling has been slightly refactored in order to
    bring code and documentation in agreement and clarify the semantics
    a little.

* Rewrite pending node handling and add tests.
2019-05-22 14:49:38 +02:00
Roman Borschel
c80205454a
Improve XOR metric. (#1108)
There are two issues with the current definition and use of Kademlia's
XOR metric:

  1. The distance is currently equated with the bucket index, i.e.
     `distance(a,b) - 1` is the index of the bucket into which either
     peer is put by the other. The result is a metric that is not
     unidirectional, as defined in the Kademlia paper and as implemented
     in e.g. libp2p-go and libp2p-js, which is to interpret the result
     of the XOR as an integer in its entirety.

  2. The current `KBucketsPeerId` trait and its instances allow computing
     distances between types with differing bit lengths as well as between
     types that hash all inputs again (i.e. `KadHash`) and "plain" `PeerId`s
     or `Multihash`es. This can result in computed distances that are either
     incorrect as per the requirement of the libp2p specs that all distances
     are to be computed from the XOR of the SHA256 of the input keys, or
     even fall outside of the image of the metric used for the `KBucketsTable`.
     In the latter case, such distances are not currently used as a bucket index
     - they can only occur in the context of comparing distances for the purpose
     of sorting peers - but that still seems undesirable.

These issues are addressed here as follows:

  * Unidirectionality of the XOR metric is restored by keeping the "full"
    integer representation of the bitwise XOR. The result is an XOR metric
    as defined in the paper. This also opens the door to avoiding the
    "full table scan" when searching for the keys closest to a given key -
    the ideal order in which to visit the buckets can be computed with the
    help of the distance bit string.

  * As a simplification and to make it easy to "do the right thing", the
    XOR metric is only defined on an opaque `kbucket::Key` type, partially
    derived from the current `KadHash`. `KadHash` and `KBucketsPeerId`
    are removed.
2019-05-17 17:27:57 +02:00
Roman Borschel
61b236172b
Some Kademlia code cleanup. (#1101) 2019-05-08 10:10:58 +02:00
Roman Borschel
808a7a5ef6
Fix self-dialing in Kademlia. (#1097)
* Fix self-dialing in Kademlia.

Addresses https://github.com/libp2p/rust-libp2p/issues/341 which is the cause
for one of the observations made in https://github.com/libp2p/rust-libp2p/issues/1053.
However, the latter is not assumed to be fully addressed by these changes and
needs further investigation.

Currently, whenever a search for a key yields a response containing the initiating
peer as one of the closest peers known to the remote, the local node
would attempt to dial itself. That attempt is ignored by the Swarm, but
the Kademlia behaviour now believes it still has a query ongoing which is
always doomed to time out. That timeout delays successful completion of the query.
Hence, any query where a remote responds with the ID of the local node takes at
least as long as the `rpc_timeout` to complete, which possibly affects almost
all queries in smaller clusters where every node knows about every other.

This problem is fixed here by ensuring that Kademlia never tries to dial the local node.
Furthermore, `Discovered` events are no longer emitted for the local node
and it is not inserted into the `untrusted_addresses` from discovery, as described
in #341.

This commit also includes a change to the condition for freezing / terminating
a Kademlia query upon receiving a response. Specifically, the condition is
tightened such that it only applies if in addition to `parallelism`
consecutive responses that failed to yield a peer closer to the target, the
last response must also either not have reported any new peer or the
number of collected peers has already reached the number of desired results.
In effect, a Kademlia query now tries harder to actually return `k`
closest peers.

Tests have been refactored and expanded.

* Add another comment.
2019-05-02 21:43:29 +02:00
Pierre Krieger
ce4ca3cc75
Switch to wasm-timer (#1071) 2019-04-25 15:08:06 +02:00
Pierre Krieger
fc535f532b
Some Kademlia improvements (#994)
* Move QueryTarget to the behaviour

* Rework query system

* Add a few tests

* Add some Kademlia tests

* More tests

* Don't return self entry

* Fix tests
2019-03-18 18:20:57 +01:00
Roman Borschel
eeed66707b Address edition-2018 idioms. (#929) 2019-02-11 14:58:15 +01:00
Pierre Krieger
c9b7e237b6
Add NetworkBehaviour::inject_replaced (#914)
* Add NetworkBehaviour::inject_replaced

* Address style

* Forgot to call set_disconnected

* Also add incoming addresses to kbuckets
2019-02-04 15:21:50 +01:00
Dan Robertson
6d24596f9f Update protocols and transport to 2018 edition (#875)
Update the protocols and transport subdirectories to the 2018 edition.

NB: The websocket transport cannot be moved to 2018 edition due to
websocket-rs's use of the keyword async as the name of a module.
2019-01-21 11:33:51 +01:00
Pierre Krieger
4bc5dea27d
Fixes to Kademlia queries (#855)
* Fixes to Kademlia queries

* Bump libp2p-kad to 0.2.1

* Fix bucket_num

* Nicer IDs generation in tests
2019-01-15 17:25:09 +01:00
James Ray
f541df391a Chore/semi colons (#799)
* Add helpers for easier Transports creation (#777)

* Add helpers for easier Transports creation

* Fix doctests

* Fix ' ;' occurrences
2018-12-19 23:22:39 +01:00
Pierre Krieger
3aa1fcbdc6
Add a KademliaHandler (#580)
* Rework Kademlia for the new design

* Minor work on protocol.rs

* More work

* Remove QueryTarget::FindValue

* Finish work on query

* Query timeout test

* Work on topology

* More work

* Update protocols/kad/src/topology.rs

Co-Authored-By: tomaka <pierre.krieger1708@gmail.com>

* Fix trailing whitespaces

* Use if let
2018-11-29 12:11:35 +01:00
James Ray
45cd7db6e9 Remove spaces before semicolons (#591) 2018-10-29 10:38:32 +01:00
Toralf Wittner
84b089cacc
Refactor multiaddr crate. (#498)
Refactor multiaddr crate.

- Remove `AddrComponent`. Instead `Protocol` directly contains its
associated data.

- Various smaller changes around conversions to Multiaddr from other
types, e.g. socket addresses.

- Expand tests to include property tests which test encoding/decoding
identity.
2018-09-20 19:51:00 +02:00
Pierre Krieger
304e9c72c8
Report the entire peers after a result (#467) 2018-09-07 16:46:34 +02:00
Pierre Krieger
ea1f172397
Implement Send everywhere (#458) 2018-09-06 09:54:35 +02:00
Benjamin Kampmann
2ea49718f3
Clean up directory structure (#426)
* Remove unused circular-buffer crate
* Move transports into subdirectory
* Move misc into subdirectory
* Move stores into subdirectory
* Move multiplexers
* Move protocols
* Move libp2p top layer
* Fix Test: skip doctest if secio isn't enabled
2018-08-29 11:24:44 +02:00