340 Commits

Author SHA1 Message Date
Roman Borschel
69bd0dfffb
Refactor iterative queries. (#1154)
Refactoring of iterative queries (`query.rs`) to improve both
correctness and performance (for larger DHTs):

Correctness:

  1. Queries no longer terminate prematurely due to counting results
     from peers farther from the target while results from closer
     peers are still pending. (#1105).

  2. Queries no longer ignore reported closer peers that are not duplicates
     just because they are currently not among the `num_results` closest.
     The currently `max_results` closest may contain peers marked as failed
     or pending / waiting. Hence all reported closer peers that are not
     duplicates must be considered candidates that may still end up
     among the `num_results` closest that successfully responded.

  3. Bounded parallelism based on the `active_counter` was not working
     correctly, as new (not yet contacted) peers closer to the target
     may be discovered at any time and thus appear in `closer_peers`
     before the already active / pending peers.

  4. The `Frozen` query mechanism allowed all remaining not-yet contacted
     peers to be contacted, but their results were discarded, because
     `inject_rpc_result` would only incorporate results while the
     query is `Iterating`. The `Frozen` state has been reworked into
     a `Stalled` state that implements a slightly more permissive
     variant of the following from the paper / specs: "If a round of
     FIND_NODEs fails to return a node any closer than the closest
     already seen, the initiator resends the FIND_NODE to all of the
     k closest nodes it has not already queried.". Importantly, though
     not explicitly mentioned, the query can move back to `Iterating`
     if it makes further progress again as a result of these requests.
     The `Stalled` state thus allows (temporarily) higher parallelism
     in an effort to make progress and bring the query to an end.

Performance:

  1. Repeated distance calculations between the same peers and the
     target is avoided.

  2. Enabled by #1108, use of a more appropriate data structure (`BTreeMap`) for
     the incrementally updated list of closer peers. The data structure needs
     efficient lookups (to avoid duplicates) and insertions at any position,
     both of which large(r) vectors are not that good at. Unscientific benchmarks
     showed a ~40-60% improvement in somewhat pathological scenarios with at least
     20 healthy nodes, each possibly returning a distinct list of closer 20 peers
     to the requestor. A previous assumption may have been that the vector always
     stays very small, but that is not the case in larger clusters: Even if the
     lists of closer peers reported by the 20 contacted peers are heavily overlapping,
     typically a lot more than 20 peers have to be (at least temporarily) considered
     as closest peers until the query completes. See also issue (2) above.

New tests are added for:

  * Query termination conditions.
  * Bounded parallelism.
  * Absence of duplicates.
2019-06-20 13:26:09 +02:00
Pierre Krieger
b6378ac526
PollParameters is now a trait (#1177)
* PollParameters is now a trait

* Fix unused variable
2019-06-18 10:23:26 +02:00
Fedor Sakharov
58015d1fb4
Report which key exactly was not found. (#1171) 2019-06-07 17:50:06 +03:00
Pierre Krieger
f51573b628
Publish v0.9.0 (#1160)
* Publish v0.9.0

* Update CHANGELOG.md

Co-Authored-By: Roman Borschel <romanb@users.noreply.github.com>

* Published soketto
2019-06-04 17:17:03 +02:00
Fedor Sakharov
22527e7eb6 Kademlia Records (#1144)
* initial implementation of the records

* move to multihash keys

* correctly process query results

* comments and formatting

* correctly return closer_peers in query

* checking wrong peer id in test

* Apply suggestions from code review

Co-Authored-By: Roman Borschel <romanb@users.noreply.github.com>

* Fix changes from suggestions

* Send responses to PUT_VALUE requests

* Shortcut in get_value

* Update protocols/kad/src/behaviour.rs

Co-Authored-By: Roman Borschel <romanb@users.noreply.github.com>

* Revert "Update protocols/kad/src/behaviour.rs"

This reverts commit 579ce742a7f4c94587f1e1f0866d2a3a37418efb.

* Remove duplicate insertion

* Adds a record to a PUT_VALUE response

* Fix a racy put_value test

* Store value ourselves only if we are in K closest

* Abstract over storage

* Revert "Abstract over storage": bad take

This reverts commit eaebf5b6d915712eaf3b05929577fdf697f204d8.

* Abstract over records storage using hashmap as default

* Constructor for custom records

* New Record type and its traits

* Fix outdated storage name

* Fixes returning an event

* Change FindNodeReq key type to Multihash

* WriteState for a second stage of a PUT_VALUE request

* GET_VALUE should not have a record

* Refactor a match arm

* Add successes and failures counters to PutValueRes

* If value is found no need to return closer peers

* Remove a custo storage from tests

* Rename a test to get_value_not_found

* Adds a TODO to change FindNode request key to Multihash

Co-Authored-By: Roman Borschel <romanb@users.noreply.github.com>

* Move MemoryRecordStorage to record.rs

* Return a Cow-ed Record from get

* Fix incorrect GET_VALUE parsing

* Various fixes with review

* Fixes get_value_not_found

* Fix peerids names in test

* another fix

* PutValue correctly distributes values

* Simplify the test

* Check that results are actually the closest

* Reverts changes to tests

* Fix the test topology and checking the results

* Run put_value test ten times

* Adds a get_value test

* Apply suggestions from code review

Co-Authored-By: Roman Borschel <romanb@users.noreply.github.com>

* Make Record fields public

* Moves WriteState to write.rs

* A couple of minor fixes

* Another few fixes of review

* Simplify the put_value test

* Dont synchronously return an error from put_value

* Formatting fixes and comments

* Collect a bunch of results

* Take exactly as much elements as neede

* Check if the peer is still connected

* Adds a multiple GetValueResults results number test

* Unnecessary mut iterators in put_value

* Ask for num_results in get_value

* Dont allocate twice in get_value

* Dont count same errored peer multiple times

* Apply suggestions from code review

Co-Authored-By: Roman Borschel <romanb@users.noreply.github.com>

* Fix another review

* Apply suggestions from code review

Co-Authored-By: Pierre Krieger <pierre.krieger1708@gmail.com>

* Bring back FromIterator and improve a panic message

* Update protocols/kad/src/behaviour.rs

Co-Authored-By: Pierre Krieger <pierre.krieger1708@gmail.com>
2019-06-04 13:44:24 +02:00
Roman Borschel
710ce90c6e
Kademlia: Fix premature query termination. (#1155)
* Fix premature query termination.

* Don't skip over 'NotContacted' closest peers.

* Code style tweak.
2019-06-03 12:47:01 +02:00
Roman Borschel
3440d1896e
Send on pending RPCs on established connection. (#1156) 2019-06-03 12:08:01 +02:00
Roman Borschel
0bdd637b01
Kademlia: Fix status updates in KBuckets. (#1140)
* Fix status updates in KBuckets.

`KBucket::update` does currently not correctly update the `first_connected_pos`
before reinserting the node. This may result in connected nodes being considered
disconnected and thus eligible for replacement by a pending node if the bucket is
full and a new connected node is added.

Tests have been added for checking that `KBucket::update` preserves the status
and ordering of all other nodes in the bucket.

* Small test improvement.

Set an expectation for the new position, instead of taking the assigned position.
2019-05-23 12:18:13 +02:00
Pierre Krieger
6b78fa0cf7
Update dependencies (#1142)
* parking-lot to 0.8

* zeroize to 0.8
2019-05-23 11:14:34 +02:00
Roman Borschel
b6ab8e64d4
Update Kademlia protobuf. (#1138) 2019-05-22 15:47:46 +02:00
Roman Borschel
09f54df44d
Kademlia: Optimise iteration over closest keys / entries. (#1117)
* Kademlia: Optimise iteration over closest entries.

The current implementation for finding the entries whose keys are closest
to some target key in the Kademlia routing table involves copying the
keys of all buckets into a new `Vec` which is then sorted based on the
distances to the target and turned into an iterator from which only a
small number of elements (by default 20) are drawn.

This commit introduces an iterator over buckets for finding the closest
keys to a target that visits the buckets in the optimal order, based on
the information contained in the distance bit-string representing the
distance between the local key and the target.

Correctness is tested against full-table scans.

Also included:

  * Updated documentation.
  * The `Entry` API was moved to the `kbucket::entry` sub-module for
    ease of maintenance.
  * The pending node handling has been slightly refactored in order to
    bring code and documentation in agreement and clarify the semantics
    a little.

* Rewrite pending node handling and add tests.
2019-05-22 14:49:38 +02:00
Roman Borschel
c80205454a
Improve XOR metric. (#1108)
There are two issues with the current definition and use of Kademlia's
XOR metric:

  1. The distance is currently equated with the bucket index, i.e.
     `distance(a,b) - 1` is the index of the bucket into which either
     peer is put by the other. The result is a metric that is not
     unidirectional, as defined in the Kademlia paper and as implemented
     in e.g. libp2p-go and libp2p-js, which is to interpret the result
     of the XOR as an integer in its entirety.

  2. The current `KBucketsPeerId` trait and its instances allow computing
     distances between types with differing bit lengths as well as between
     types that hash all inputs again (i.e. `KadHash`) and "plain" `PeerId`s
     or `Multihash`es. This can result in computed distances that are either
     incorrect as per the requirement of the libp2p specs that all distances
     are to be computed from the XOR of the SHA256 of the input keys, or
     even fall outside of the image of the metric used for the `KBucketsTable`.
     In the latter case, such distances are not currently used as a bucket index
     - they can only occur in the context of comparing distances for the purpose
     of sorting peers - but that still seems undesirable.

These issues are addressed here as follows:

  * Unidirectionality of the XOR metric is restored by keeping the "full"
    integer representation of the bitwise XOR. The result is an XOR metric
    as defined in the paper. This also opens the door to avoiding the
    "full table scan" when searching for the keys closest to a given key -
    the ideal order in which to visit the buckets can be computed with the
    help of the distance bit string.

  * As a simplification and to make it easy to "do the right thing", the
    XOR metric is only defined on an opaque `kbucket::Key` type, partially
    derived from the current `KadHash`. `KadHash` and `KBucketsPeerId`
    are removed.
2019-05-17 17:27:57 +02:00
Pierre Krieger
bd96b66fb5
Publish version 0.8.0 (#1123) 2019-05-15 16:50:43 +02:00
Pierre Krieger
e9448ec8ca
Allow changing the Kademlia protocol name (#1118)
* Allow changing the Kademlia protocol name

* Expose the method to the behaviour

* Address review
2019-05-15 15:44:51 +02:00
Roman Borschel
61b236172b
Some Kademlia code cleanup. (#1101) 2019-05-08 10:10:58 +02:00
Roman Borschel
e44b443b91
Filter requesting peer from results. (#1102)
Although not explicitly mentioned in the paper, it seems clear that
including an entry for the requesting peer in a FIND_NODE response
never gives useful information and just occupies a result slot that may
have been better filled with another peer that the requestor may not
know about.

There is one explicit mention that this is the desired behavior
in a somewhat dated design document of another p2p framework [1]:

"The recipient of a FIND_NODE should never return a triple containing
the nodeID of the requestor."

The same reasoning supposedly applies to the libp2p-specific `GET_PROVIDERS`
request.

[1] http://xlattice.sourceforge.net/components/protocol/kademlia/specs.html#FIND_NODE
2019-05-06 11:40:13 +02:00
Roman Borschel
808a7a5ef6
Fix self-dialing in Kademlia. (#1097)
* Fix self-dialing in Kademlia.

Addresses https://github.com/libp2p/rust-libp2p/issues/341 which is the cause
for one of the observations made in https://github.com/libp2p/rust-libp2p/issues/1053.
However, the latter is not assumed to be fully addressed by these changes and
needs further investigation.

Currently, whenever a search for a key yields a response containing the initiating
peer as one of the closest peers known to the remote, the local node
would attempt to dial itself. That attempt is ignored by the Swarm, but
the Kademlia behaviour now believes it still has a query ongoing which is
always doomed to time out. That timeout delays successful completion of the query.
Hence, any query where a remote responds with the ID of the local node takes at
least as long as the `rpc_timeout` to complete, which possibly affects almost
all queries in smaller clusters where every node knows about every other.

This problem is fixed here by ensuring that Kademlia never tries to dial the local node.
Furthermore, `Discovered` events are no longer emitted for the local node
and it is not inserted into the `untrusted_addresses` from discovery, as described
in #341.

This commit also includes a change to the condition for freezing / terminating
a Kademlia query upon receiving a response. Specifically, the condition is
tightened such that it only applies if in addition to `parallelism`
consecutive responses that failed to yield a peer closer to the target, the
last response must also either not have reported any new peer or the
number of collected peers has already reached the number of desired results.
In effect, a Kademlia query now tries harder to actually return `k`
closest peers.

Tests have been refactored and expanded.

* Add another comment.
2019-05-02 21:43:29 +02:00
elferdo
585f84c88a Replace PeerId with Multihash for interface consistency (#1095)
* Change a PeerId for a Multihash

* Update protocols/kad/src/behaviour.rs

Co-Authored-By: elferdo <elferdo@gmail.com>

* Update protocols/kad/src/behaviour.rs

Co-Authored-By: elferdo <elferdo@gmail.com>
2019-04-30 19:39:26 +02:00
Pierre Krieger
ce4ca3cc75
Switch to wasm-timer (#1071) 2019-04-25 15:08:06 +02:00
Pierre Krieger
b4345ee8ba
Bump to 0.7.0 (#1081)
* Bump to 0.7.0

* Update CHANGELOG.md

Co-Authored-By: tomaka <pierre.krieger1708@gmail.com>

* Update for #1078

* New version of multihash and multiaddr as well
2019-04-23 13:03:29 +02:00
Roman Borschel
8cde987e6d Rename KeepAlive constructors. (#1078)
* KeepAlive::Now => KeepAlive::No
  * KeepAlive::Forever => KeepAlive::Yes

As suggested in #1072.
2019-04-23 11:58:49 +02:00
Toralf Wittner
ca58f8029c
Remove Transport::nat_traversal and refactor multiaddr. (#1052)
The functionality is available through `Multiaddr::replace`.
What we currently call "nat_traversal" is merley a replacement of an IP
address prefix in a `Multiaddr`, hence it can be done directly on
`Multiaddr` values instead of having to go through a `Transport`.

In addition this PR consolidates changes made to `Multiaddr` in
previous commits which resulted in lots of deprecations. It adds some
more (see below for the complete list of API changes) and removes all
deprecated functionality, requiring a minor version bump.

Here are the changes to `multiaddr` compared to the currently published
version:

1.  Removed `into_bytes` (use `to_vec` instead).
2.  Renamed `to_bytes` to `to_vec`.
3.  Removed `from_bytes` (use the `TryFrom` impl instead).
4.  Added `with_capacity`.
5.  Added `len`.
6.  Removed `as_slice` (use `AsRef` impl instead).
7.  Removed `encapsulate` (use `push` or `with` instead).
8.  Removed `decapsulate` (use `pop` instead).
9.  Renamed `append` to `push`.
10. Added `with`.
11. Added `replace`.
12. Removed `ToMultiaddr` trait (use `TryFrom` instead).
2019-04-17 20:12:31 +02:00
Roman Borschel
bee5c58b27
libp2p-ping improvements. (#1049)
* libp2p-ping improvements.

  * re #950: Removes use of the `OneShotHandler`, but still sending each
    ping over a new substream, as seems to be intentional since #828.

  * re #842: Adds an integration test that exercises the ping behaviour through
    a Swarm, requiring the RTT to be below a threshold. This requires disabling
    Nagle's algorithm as it can interact badly with delayed ACKs (and has been
    observed to do so in the context of the new ping example and integration test).

  * re #864: Control of the inbound and outbound (sub)stream protocol upgrade
    timeouts has been moved from the `NodeHandlerWrapperBuilder` to the
    `ProtocolsHandler`. That may also alleviate the need for a custom timeout
    on an `OutboundSubstreamRequest` as a `ProtocolsHandler` is now free to
    adjust these timeouts over time.

Other changes:

  * A new ping example.
  * Documentation improvements.

* More documentation improvements.

* Add PingPolicy and ensure no event is dropped.

* Remove inbound_timeout/outbound_timeout.

As per review comment, the inbound timeout is now configured
as part of the `listen_protocol` and the outbound timeout as
part of the `OutboundSubstreamRequest`.

* Simplify and generalise.

Generalise `ListenProtocol` to `SubstreamProtocol`, reusing it in
the context of `ProtocolsHandlerEvent::OutboundSubstreamRequest`.

* Doc comments for SubstreamProtocol.

* Adapt to changes in master.

* Relax upper bound for ping integration test rtt.

For "slow" CI build machines?
2019-04-16 15:57:29 +02:00
Toralf Wittner
6917b8f543
Have Transport::Listeners produce ListenerEvents. (#1032)
Replace the listener and address pair returned from `Transport::listen_on` with just a listener that produces `ListenerEvent` values which include upgrades as well as address changes.
2019-04-10 10:29:21 +02:00
Toralf Wittner
98b2517403
Change Multiaddr representation to Bytes. (#1041)
* Change `Multiaddr` representation to `Bytes`.

* Mark several `Multiaddr` methods as deprecated.
2019-04-08 10:57:09 +02:00
Pierre Krieger
235ad98863
Publish v0.6.0 (#1031) 2019-03-29 11:41:42 -03:00
elferdo
603c7be7c2
[Kademlia] Rehash PeerId before inserting in a KBucketsTable (#1025)
Add KadHash as the type to be used as key within KBuckets and replace PeerId.
2019-03-26 16:17:34 +01:00
Pierre Krieger
34db72a080
Split address reach error and node reach error (#1013)
* Split address reach error and node reach error

* Small comments about order of operatoins

* Minor doc change
2019-03-20 20:28:55 +01:00
Pierre Krieger
761c46ef19
Revert accidental change in Kademlia (#1018) 2019-03-20 18:55:09 +01:00
Pierre Krieger
1e8a976af6
Add kbuckets_entries (#1016) 2019-03-20 18:36:01 +01:00
Pierre Krieger
8de3cddfe8
Bugfix in Kademlia disconnected (#1017) 2019-03-20 17:52:35 +01:00
Pierre Krieger
7b2980133d
Simplify the Addresses (#1012)
* Simplify the Addresses

* Remove println
2019-03-20 17:30:00 +01:00
Pierre Krieger
7be97b4f93
Rewrite the Kademlia k-buckets to be more explicit (#996)
* Some k-buckets improvements

* Apply suggestions from code review

Co-Authored-By: tomaka <pierre.krieger1708@gmail.com>

* Use NonZeroUsize for the distance

* Update TODO comment
2019-03-20 17:09:48 +01:00
Pierre Krieger
96e559b503
Wrap multistream-select streams under a Negotiated (#1001) 2019-03-19 17:27:30 +01:00
Pierre Krieger
fc535f532b
Some Kademlia improvements (#994)
* Move QueryTarget to the behaviour

* Rework query system

* Add a few tests

* Add some Kademlia tests

* More tests

* Don't return self entry

* Fix tests
2019-03-18 18:20:57 +01:00
Pierre Krieger
1820bcb5ef
Version 0.5.0 (#999) 2019-03-13 10:14:55 +01:00
Pierre Krieger
8059a693a3
Cleaner shutdown process (#992)
* Cleaner shutdown process

* Finish

* Fix Yamux panic

* Remove irrelevant tests

* Update core/src/nodes/handled_node_tasks.rs

Co-Authored-By: tomaka <pierre.krieger1708@gmail.com>

* Fix yamux error handling

* Update yamux
2019-03-11 17:19:50 +01:00
Pierre Krieger
040d8c8c9a
Bump to v0.4 (#964) 2019-02-20 16:39:30 +01:00
Pierre Krieger
e9535c5c02
Add a proper list of addresses type for Kademlia (#928)
* Add a proper list of addresses type for Kademlia

* Some adjustements
2019-02-12 12:56:39 +01:00
Roman Borschel
eeed66707b Address edition-2018 idioms. (#929) 2019-02-11 14:58:15 +01:00
Pierre Krieger
a22121c8e7
Limit Kademlia messages to 4kiB (#920) 2019-02-06 14:33:45 +01:00
Pierre Krieger
479924f8dc
Bump libp2p, libp2p-core, libp2p-core-derive and libp2p-kad (#916)
* Bump libp2p-core, libp2p-core-derive and libp2p-kad

* Bump libp2p as well
2019-02-04 15:46:08 +01:00
Pierre Krieger
c9b7e237b6
Add NetworkBehaviour::inject_replaced (#914)
* Add NetworkBehaviour::inject_replaced

* Address style

* Forgot to call set_disconnected

* Also add incoming addresses to kbuckets
2019-02-04 15:21:50 +01:00
Pierre Krieger
780f4ddbc1
Avoid duplicate addresses in kbuckets (#906)
* Avoid duplicate addresses in kbuckets

* Bump kad to 0.3.1
2019-01-31 10:37:37 +01:00
Pierre Krieger
fcb2ac36e6
Bump to v0.3.0 (#905) 2019-01-30 16:50:47 +01:00
Pierre Krieger
663ec7e8da
connection_keep_alive() now returns KeepAlive (#899)
* connection_keep_alive() now returns Option<Instant>

* Use KeepAlive instead of Option<Instant>
2019-01-30 16:37:34 +01:00
Toralf Wittner
bbf56c6371
Update protobuf to version 2.3.0 (#904)
Initially I had hoped that the deprecated `#![allow(clippy)]` would no
longer be put into the generated rust files, but -- as of 2019-01-30 --
it still is (see [1] for details). Since we explicitly update the
protobuf files I decided to *manually edit the generated code* and
replace this with `#![allow(clippy:all)]`. Hopefully, by the time we do
the next upgrade, no such manual tweaking would be necessary anymore. I
think the benefit of a less polluted clippy output is worth it this
time.

[1]: https://github.com/stepancheg/rust-protobuf/pull/332
2019-01-30 16:25:45 +01:00
Toralf Wittner
e23b2733e2
Fix some rustc/clippy warnings. (#895) 2019-01-30 15:41:54 +01:00
Pierre Krieger
a77da73010
Add inject_dial_failure and make addresses_of_peer mut (#901)
* Add inject_dial_failure and make addresses_of_peer mut

* Fix tests
2019-01-30 14:55:39 +01:00
Pierre Krieger
df923526ca
Embed the topology in the NetworkBehaviour (#889)
* Embed the topology in the NetworkBehaviour

* Put topologies inside of Floodsub and Kad

* Fix core tests

* Fix chat example

* More work

* Some cleanup

* Restore external addresses system
2019-01-26 23:57:53 +01:00