Improve XOR metric. (#1108)

There are two issues with the current definition and use of Kademlia's
XOR metric:

  1. The distance is currently equated with the bucket index, i.e.
     `distance(a,b) - 1` is the index of the bucket into which either
     peer is put by the other. The resulting metric is not the
     unidirectional metric defined in the Kademlia paper and implemented
     in e.g. libp2p-go and libp2p-js, both of which interpret the result
     of the XOR as an integer in its entirety.

  2. The current `KBucketsPeerId` trait and its instances allow computing
     distances between types with differing bit lengths, as well as between
     types that hash all inputs again (i.e. `KadHash`) and "plain" `PeerId`s
     or `Multihash`es. This can yield computed distances that either are
     incorrect per the requirement of the libp2p specs that all distances
     be computed from the XOR of the SHA-256 of the input keys, or that
     even fall outside the image of the metric used for the `KBucketsTable`.
     In the latter case, such distances are never used as a bucket index -
     they can only occur when comparing distances for the purpose of
     sorting peers - but that still seems undesirable.
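For context, the relationship between the full XOR distance and a bucket index can be sketched as follows. This is illustrative only and uses hypothetical free functions, not the crate's actual types: the bucket index is just the position of the highest set bit of the distance, so it discards all of the lower-order bits that the full integer keeps.

```rust
// Hedged sketch (hypothetical helpers, not the crate's API): the XOR
// distance between two 256-bit keys, and the bucket index derived from it.
fn xor_distance(a: &[u8; 32], b: &[u8; 32]) -> [u8; 32] {
    let mut d = [0u8; 32];
    for i in 0..32 {
        d[i] = a[i] ^ b[i]; // bytewise XOR of the two keys
    }
    d
}

// Bucket index = position of the highest set bit of the distance,
// i.e. 255 - leading_zeros(distance). Many distinct distances map to
// the same index, which is why the index alone is a lossy summary.
fn bucket_index(d: &[u8; 32]) -> Option<usize> {
    for (i, byte) in d.iter().enumerate() {
        if *byte != 0 {
            return Some(255 - (i * 8 + byte.leading_zeros() as usize));
        }
    }
    None // zero distance: a key has no bucket for itself
}

fn main() {
    let a = [0u8; 32];
    let mut b = [0u8; 32];
    b[0] = 0b0100_0000; // differs from `a` in the second-highest bit
    let d = xor_distance(&a, &b);
    assert_eq!(bucket_index(&d), Some(254));
}
```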

These issues are addressed here as follows:

  * Unidirectionality of the XOR metric is restored by keeping the "full"
    integer representation of the bitwise XOR. The result is an XOR metric
    as defined in the paper. This also opens the door to avoiding the
    "full table scan" when searching for the keys closest to a given key -
    the ideal order in which to visit the buckets can be computed with the
    help of the distance bit string.

  * As a simplification and to make it easy to "do the right thing", the
    XOR metric is only defined on an opaque `kbucket::Key` type, partially
    derived from the current `KadHash`. `KadHash` and `KBucketsPeerId`
    are removed.
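The shape of the resulting API can be sketched as follows. The names `Key` and `Distance` appear in the diff below; the internals here are assumptions for illustration. A `Key` wraps a fixed-size digest and `distance` returns the full XOR as an opaque, totally ordered value, so callers can sort peers by distance without ever handling a raw bucket index:

```rust
// Hedged sketch of the new API's shape; internals are illustrative only.
#[derive(Clone)]
struct Key([u8; 32]); // in the real crate this wraps a SHA-256 digest

// Big-endian byte array: the derived lexicographic Ord coincides with
// interpreting the bytes as one 256-bit integer.
#[derive(PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Distance([u8; 32]);

impl Key {
    fn distance(&self, other: &Key) -> Distance {
        let mut d = [0u8; 32];
        for i in 0..32 {
            d[i] = self.0[i] ^ other.0[i];
        }
        Distance(d)
    }
}

fn main() {
    let a = Key([0u8; 32]);
    let mut near = [0u8; 32];
    near[31] = 1; // differs from `a` only in the lowest bit
    let mut far = [0u8; 32];
    far[0] = 0x80; // differs from `a` in the highest bit
    // The full integer order ranks candidates unambiguously.
    assert!(a.distance(&Key(near)) < a.distance(&Key(far)));
}
```

Because `Distance` is ordered as one big integer, sorting a `Vec<Distance>` (as the test below does with `expected_distances.sort()`) ranks peers exactly as the paper's XOR metric prescribes.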

Commit c80205454a (parent 93d89964e1)
Roman Borschel, committed via GitHub, 2019-05-17 17:27:57 +02:00
7 changed files with 310 additions and 363 deletions


@@ -20,7 +20,7 @@
 #![cfg(test)]
 
-use crate::{Kademlia, KademliaOut, kbucket::KBucketsPeerId};
+use crate::{Kademlia, KademliaOut, kbucket::{self, Distance}};
 use futures::{future, prelude::*};
 use libp2p_core::{
     PeerId,
@@ -80,6 +80,13 @@ fn build_nodes(num: usize) -> (u64, Vec<TestSwarm>) {
 
 #[test]
 fn query_iter() {
+    fn distances(key: &kbucket::Key<PeerId>, peers: Vec<PeerId>) -> Vec<Distance> {
+        peers.into_iter()
+            .map(kbucket::Key::from)
+            .map(|k| k.distance(key))
+            .collect()
+    }
+
     fn run(n: usize) {
         // Build `n` nodes. Node `n` knows about node `n-1`, node `n-1` knows about node `n-2`, etc.
         // Node `n` is queried for a random peer and should return nodes `1..n-1` sorted by
@@ -96,14 +103,13 @@ fn query_iter() {
         // Ask the last peer in the list to search a random peer. The search should
         // propagate backwards through the list of peers.
         let search_target = PeerId::random();
+        let search_target_key = kbucket::Key::from(search_target.clone());
         swarms.last_mut().unwrap().find_node(search_target.clone());
 
         // Set up expectations.
         let expected_swarm_id = swarm_ids.last().unwrap().clone();
-        let expected_peer_ids: Vec<_> = swarm_ids
-            .iter().cloned().take(n - 1).collect();
-        let mut expected_distances: Vec<_> = expected_peer_ids
-            .iter().map(|p| p.distance_with(&search_target)).collect();
+        let expected_peer_ids: Vec<_> = swarm_ids.iter().cloned().take(n - 1).collect();
+        let mut expected_distances = distances(&search_target_key, expected_peer_ids.clone());
         expected_distances.sort();
 
         // Run test
@@ -118,10 +124,8 @@ fn query_iter() {
                         assert_eq!(key, search_target);
                         assert_eq!(swarm_ids[i], expected_swarm_id);
                         assert!(expected_peer_ids.iter().all(|p| closer_peers.contains(p)));
-                        assert_eq!(expected_distances,
-                                   closer_peers.iter()
-                                       .map(|p| p.distance_with(&key))
-                                       .collect::<Vec<_>>());
+                        let key = kbucket::Key::from(key);
+                        assert_eq!(expected_distances, distances(&key, closer_peers));
                         return Ok(Async::Ready(()));
                     }
                     Async::Ready(_) => (),