Kademlia: Somewhat complete the records implementation. (#1189)

* Somewhat complete the implementation of Kademlia records.

This commit relates to [libp2p-146] and [libp2p-1089].

  * All records expire by default; the record TTL is configurable.
  * Provider records are also stored in the `RecordStore`, and the
    `RecordStore` API has been extended accordingly.
  * Background jobs for periodic (re-)replication and (re-)publication
    of records. Regular (value-)records are subject to re-replication and
    re-publication as per standard Kademlia. Provider records are only
    subject to re-publication.
  * For standard Kademlia value lookups (quorum = 1), the record is cached
    at the closest peer to the key that did not return the value, as per
    standard Kademlia.
  * Expiration times of regular (value-)records are computed to decrease
    exponentially with the number of nodes between the local node and the
    node known to be closest to the key, counting only nodes beyond the
    k closest, as per standard Kademlia (see the sketch below).
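
To make the last point concrete, here is a minimal sketch under the assumption
that the TTL is simply halved once for every such node; the helper name
`exp_decrease` is illustrative and not necessarily the code introduced by this
commit:

```rust
use std::time::Duration;

/// Halve `ttl` once per node between the local node and the key that is
/// not among the k closest, i.e. compute `ttl / 2^exp`.
fn exp_decrease(ttl: Duration, exp: u32) -> Duration {
    Duration::from_secs(ttl.as_secs().checked_shr(exp).unwrap_or(0))
}

fn main() {
    let ttl = Duration::from_secs(36 * 60 * 60); // e.g. a 36-hour base TTL
    assert_eq!(exp_decrease(ttl, 0), ttl);       // within the k closest: full TTL
    assert_eq!(exp_decrease(ttl, 1), ttl / 2);   // one node beyond: half
    assert_eq!(exp_decrease(ttl, 3), ttl / 8);   // three nodes beyond: one eighth
}
```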

The protobuf messages are extended with two fields, `ttl` and `publisher`,
in order to implement the different semantics of re-replication (performed
by any of the k closest peers to the key, not affecting the expiry) and
re-publication (performed by the original publisher, resetting the expiry).
This is not yet done in other libp2p Kademlia implementations; see e.g.
[libp2p-go-323]. The new protobuf fields have been given somewhat unique
identifiers to reduce the risk of future collisions.
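
On the Rust side this roughly corresponds to a stored record carrying an
optional publisher and an optional absolute expiry; the following is a sketch
with illustrative field names, not necessarily the exact types of this commit:

```rust
use libp2p_core::PeerId;
use std::time::Instant;

/// Sketch of a stored record (field names are illustrative).
pub struct Record {
    pub key: Vec<u8>,
    pub value: Vec<u8>,
    /// Original publisher, carried in the new `publisher` protobuf field.
    /// Only a re-publication by this peer resets the expiry.
    pub publisher: Option<PeerId>,
    /// Absolute expiration time, derived from the received `ttl` field.
    /// Re-replication by any of the k closest peers leaves it untouched.
    pub expires: Option<Instant>,
}
```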

Similarly, periodic re-publication of provider records does not seem to
be done yet in other implementations; see e.g. [libp2p-js-98].

[libp2p-146]: https://github.com/libp2p/rust-libp2p/issues/146
[libp2p-1089]: https://github.com/libp2p/rust-libp2p/issues/1089
[libp2p-go-323]: https://github.com/libp2p/go-libp2p-kad-dht/issues/323
[libp2p-js-98]: https://github.com/libp2p/js-libp2p-kad-dht/issues/98

* Tweak kad-ipfs example.

* Add missing files.

* Ensure new delays are polled immediately.

This ensures the task is registered for notification, since `NotReady` is
returned right afterwards.
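
A minimal sketch of the pattern in futures 0.1 style (the helper and its name
are hypothetical): after swapping in a new delay, poll it once so the current
task is registered for wakeup before `NotReady` propagates to the caller.

```rust
use futures::{Async, Future, Poll};

/// Install a new delay and poll it immediately. Without this first poll the
/// task would never be notified when the delay elapses, because `NotReady`
/// is returned right after the swap.
fn swap_and_poll<D>(slot: &mut D, new_delay: D) -> Poll<(), D::Error>
where
    D: Future<Item = ()>,
{
    *slot = new_delay;
    match slot.poll()? {
        Async::Ready(()) => Ok(Async::Ready(())), // already elapsed
        Async::NotReady => Ok(Async::NotReady),   // task is now registered
    }
}
```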

* Fix ipfs-kad example and use wasm_timer.

* Small cleanup.

* Incorporate some feedback.

* Adjustments after rebase.

* Distinguish events further.

In order for a user to easily distinguish the result of e.g.
a `put_record` operation from the result of a later republication,
different event constructors are used. Furthermore, re-replication and
"caching" of records (at the closest peer to the key that did not return
a value during a successful lookup) do not yield events for now, as they
are less interesting.
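
For illustration, the distinction could look roughly as follows; the variant
and type names are assumptions rather than the exact public API:

```rust
/// Sketch of distinct event constructors (names are illustrative).
pub enum KademliaEvent {
    /// Result of a `put_record` operation initiated by the user.
    PutRecordResult(Result<PutRecordOk, PutRecordError>),
    /// Result of a periodic, automatic re-publication of a record.
    RepublishRecordResult(Result<PutRecordOk, PutRecordError>),
    // ... other events elided
}

pub struct PutRecordOk { pub key: Vec<u8> }
pub struct PutRecordError { pub key: Vec<u8>, pub reason: String }
```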

* Speed up tests for CI.

* Small refinements and more documentation.

  * Guard a node against overwriting records for which it considers
    itself to be the publisher (a sketch follows after this list).

  * Document the jobs module more extensively.
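
A sketch of that guard (hypothetical helper, not the actual code location):

```rust
use libp2p_core::PeerId;

/// Returns `false` if the stored record was published by the local node
/// itself, so that re-replications from other peers cannot overwrite it.
fn may_overwrite(stored_publisher: Option<&PeerId>, local_id: &PeerId) -> bool {
    stored_publisher != Some(local_id)
}
```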

* More inline docs around removal of "unreachable" addresses.

* Remove wildcard re-exports.

* Use NonZeroUsize for the constants.

* Re-add method lost on merge.

* Add missing 'pub'.

* Further increase the timeout in the ipfs-kad example.

* Re-add the log dependency to libp2p-kad.

* Simplify RecordStore API slightly.

* Some more commentary.

* Change Addresses::remove to return Result<(),()>.

Change the semantics of `Addresses::remove` so that it is the error case,
rather than the success case, that is unambiguous, using a `Result` to make
those semantics explicit.
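
A sketch of the resulting shape, under the assumption that the error case is
an attempt to remove the last remaining address of a peer (the internal
storage shown is illustrative):

```rust
use libp2p_core::Multiaddr;

pub struct Addresses {
    addrs: Vec<Multiaddr>,
}

impl Addresses {
    /// Removes the given address. Returns `Err(())` if it is the last
    /// remaining address, since a peer in the routing table must always
    /// have at least one known address.
    pub fn remove(&mut self, addr: &Multiaddr) -> Result<(), ()> {
        if self.addrs.len() == 1 && &self.addrs[0] == addr {
            return Err(());
        }
        if let Some(i) = self.addrs.iter().position(|a| a == addr) {
            self.addrs.remove(i);
        }
        Ok(())
    }
}
```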

* Add some documentation to .
Commit cde93f5432 by Roman Borschel, 2019-07-17 14:40:48 +02:00 (committed by GitHub)
Parent: 01bce16d09
21 changed files with 2715 additions and 947 deletions


@@ -83,19 +83,19 @@ const NUM_BUCKETS: usize = 256;
/// A `KBucketsTable` represents a Kademlia routing table.
#[derive(Debug, Clone)]
-pub struct KBucketsTable<TPeerId, TVal> {
+pub struct KBucketsTable<TKey, TVal> {
/// The key identifying the local peer that owns the routing table.
-local_key: Key<TPeerId>,
+local_key: TKey,
/// The buckets comprising the routing table.
-buckets: Vec<KBucket<TPeerId, TVal>>,
+buckets: Vec<KBucket<TKey, TVal>>,
/// The list of evicted entries that have been replaced with pending
/// entries since the last call to [`KBucketsTable::take_applied_pending`].
-applied_pending: VecDeque<AppliedPending<TPeerId, TVal>>
+applied_pending: VecDeque<AppliedPending<TKey, TVal>>
}
/// A (type-safe) index into a `KBucketsTable`, i.e. a non-negative integer in the
/// interval `[0, NUM_BUCKETS)`.
-#[derive(Copy, Clone, PartialEq, Eq)]
+#[derive(Debug, Copy, Clone, PartialEq, Eq)]
struct BucketIndex(usize);
impl BucketIndex {
@@ -124,17 +124,18 @@ impl BucketIndex {
for i in 0 .. quot {
bytes[31 - i] = rng.gen();
}
-let rem = self.0 % 8;
-let lower = usize::pow(2, rem as u32);
-let upper = usize::pow(2, rem as u32 + 1);
+let rem = (self.0 % 8) as u32;
+let lower = usize::pow(2, rem);
+let upper = usize::pow(2, rem + 1);
bytes[31 - quot] = rng.gen_range(lower, upper) as u8;
Distance(bigint::U256::from(bytes))
}
}
-impl<TPeerId, TVal> KBucketsTable<TPeerId, TVal>
+impl<TKey, TVal> KBucketsTable<TKey, TVal>
where
-TPeerId: Clone,
+TKey: Clone + AsRef<KeyBytes>,
TVal: Clone
{
/// Creates a new, empty Kademlia routing table with entries partitioned
/// into buckets as per the Kademlia protocol.
@@ -142,7 +143,7 @@ where
/// The given `pending_timeout` specifies the duration after creation of
/// a [`PendingEntry`] after which it becomes eligible for insertion into
/// a full bucket, replacing the least-recently (dis)connected node.
-pub fn new(local_key: Key<TPeerId>, pending_timeout: Duration) -> Self {
+pub fn new(local_key: TKey, pending_timeout: Duration) -> Self {
KBucketsTable {
local_key,
buckets: (0 .. NUM_BUCKETS).map(|_| KBucket::new(pending_timeout)).collect(),
@@ -151,14 +152,14 @@ where
}
/// Returns the local key.
-pub fn local_key(&self) -> &Key<TPeerId> {
+pub fn local_key(&self) -> &TKey {
&self.local_key
}
/// Returns an `Entry` for the given key, representing the state of the entry
/// in the routing table.
-pub fn entry<'a>(&'a mut self, key: &'a Key<TPeerId>) -> Entry<'a, TPeerId, TVal> {
-let index = BucketIndex::new(&self.local_key.distance(key));
+pub fn entry<'a>(&'a mut self, key: &'a TKey) -> Entry<'a, TKey, TVal> {
+let index = BucketIndex::new(&self.local_key.as_ref().distance(key));
if let Some(i) = index {
let bucket = &mut self.buckets[i.get()];
if let Some(applied) = bucket.apply_pending() {
@@ -171,7 +172,7 @@ where
}
/// Returns an iterator over all the entries in the routing table.
-pub fn iter<'a>(&'a mut self) -> impl Iterator<Item = EntryRefView<'a, TPeerId, TVal>> {
+pub fn iter<'a>(&'a mut self) -> impl Iterator<Item = EntryRefView<'a, TKey, TVal>> {
let applied_pending = &mut self.applied_pending;
self.buckets.iter_mut().flat_map(move |table| {
if let Some(applied) = table.apply_pending() {
@@ -194,7 +195,7 @@ where
///
/// The buckets are ordered by proximity to the `local_key`, i.e. the first
/// bucket is the closest bucket (containing at most one key).
-pub fn buckets<'a>(&'a mut self) -> impl Iterator<Item = KBucketRef<'a, TPeerId, TVal>> + 'a {
+pub fn buckets<'a>(&'a mut self) -> impl Iterator<Item = KBucketRef<'a, TKey, TVal>> + 'a {
let applied_pending = &mut self.applied_pending;
self.buckets.iter_mut().enumerate().map(move |(i, b)| {
if let Some(applied) = b.apply_pending() {
@@ -219,24 +220,24 @@ where
/// buckets are updated accordingly. The fact that a pending entry was applied is
/// recorded in the `KBucketsTable` in the form of `AppliedPending` results, which must be
/// consumed by calling this function.
-pub fn take_applied_pending(&mut self) -> Option<AppliedPending<TPeerId, TVal>> {
+pub fn take_applied_pending(&mut self) -> Option<AppliedPending<TKey, TVal>> {
self.applied_pending.pop_front()
}
/// Returns an iterator over the keys closest to `target`, ordered by
/// increasing distance.
-pub fn closest_keys<'a, T>(&'a mut self, target: &'a Key<T>)
-    -> impl Iterator<Item = Key<TPeerId>> + 'a
+pub fn closest_keys<'a, T>(&'a mut self, target: &'a T)
+    -> impl Iterator<Item = TKey> + 'a
where
-T: Clone
+T: Clone + AsRef<KeyBytes>
{
-let distance = self.local_key.distance(target);
+let distance = self.local_key.as_ref().distance(target);
ClosestIter {
target,
iter: None,
table: self,
buckets_iter: ClosestBucketsIter::new(distance),
-fmap: |b: &KBucket<_, _>| -> ArrayVec<_> {
+fmap: |b: &KBucket<TKey, _>| -> ArrayVec<_> {
b.iter().map(|(n,_)| n.key.clone()).collect()
}
}
@@ -244,13 +245,13 @@ where
/// Returns an iterator over the nodes closest to the `target` key, ordered by
/// increasing distance.
-pub fn closest<'a, T>(&'a mut self, target: &'a Key<T>)
-    -> impl Iterator<Item = EntryView<TPeerId, TVal>> + 'a
+pub fn closest<'a, T>(&'a mut self, target: &'a T)
+    -> impl Iterator<Item = EntryView<TKey, TVal>> + 'a
where
-T: Clone,
+T: Clone + AsRef<KeyBytes>,
TVal: Clone
{
-let distance = self.local_key.distance(target);
+let distance = self.local_key.as_ref().distance(target);
ClosestIter {
target,
iter: None,
@@ -264,23 +265,46 @@ where
}
}
}
+/// Counts the number of nodes between the local node and the node
+/// closest to `target`.
+///
+/// The number of nodes between the local node and the target are
+/// calculated by backtracking from the target towards the local key.
+pub fn count_nodes_between<T>(&mut self, target: &T) -> usize
+where
+T: AsRef<KeyBytes>
+{
+let local_key = self.local_key.clone();
+let distance = target.as_ref().distance(&local_key);
+let mut iter = ClosestBucketsIter::new(distance).take_while(|i| i.get() != 0);
+if let Some(i) = iter.next() {
+let num_first = self.buckets[i.get()].iter()
+.filter(|(n,_)| n.key.as_ref().distance(&local_key) <= distance)
+.count();
+let num_rest: usize = iter.map(|i| self.buckets[i.get()].num_entries()).sum();
+num_first + num_rest
+} else {
+0
+}
+}
}
/// An iterator over (some projection of) the closest entries in a
/// `KBucketsTable` w.r.t. some target `Key`.
-struct ClosestIter<'a, TTarget, TPeerId, TVal, TMap, TOut> {
+struct ClosestIter<'a, TTarget, TKey, TVal, TMap, TOut> {
/// A reference to the target key whose distance to the local key determines
/// the order in which the buckets are traversed. The resulting
/// array from projecting the entries of each bucket using `fmap` is
/// sorted according to the distance to the target.
-target: &'a Key<TTarget>,
+target: &'a TTarget,
/// A reference to all buckets of the `KBucketsTable`.
-table: &'a mut KBucketsTable<TPeerId, TVal>,
+table: &'a mut KBucketsTable<TKey, TVal>,
/// The iterator over the bucket indices in the order determined by the
/// distance of the local key to the target.
buckets_iter: ClosestBucketsIter,
/// The iterator over the entries in the currently traversed bucket.
-iter: Option<arrayvec::IntoIter<[TOut; K_VALUE]>>,
+iter: Option<arrayvec::IntoIter<[TOut; K_VALUE.get()]>>,
/// The projection function / mapping applied on each bucket as
/// it is encountered, producing the next `iter`ator.
fmap: TMap
@@ -376,12 +400,14 @@ impl Iterator for ClosestBucketsIter {
}
}
-impl<TTarget, TPeerId, TVal, TMap, TOut> Iterator
-for ClosestIter<'_, TTarget, TPeerId, TVal, TMap, TOut>
+impl<TTarget, TKey, TVal, TMap, TOut> Iterator
+for ClosestIter<'_, TTarget, TKey, TVal, TMap, TOut>
where
-TPeerId: Clone,
-TMap: Fn(&KBucket<TPeerId, TVal>) -> ArrayVec<[TOut; K_VALUE]>,
-TOut: AsRef<Key<TPeerId>>
+TTarget: AsRef<KeyBytes>,
+TKey: Clone + AsRef<KeyBytes>,
+TVal: Clone,
+TMap: Fn(&KBucket<TKey, TVal>) -> ArrayVec<[TOut; K_VALUE.get()]>,
+TOut: AsRef<KeyBytes>
{
type Item = TOut;
@@ -400,8 +426,8 @@ where
}
let mut v = (self.fmap)(bucket);
v.sort_by(|a, b|
-self.target.distance(a.as_ref())
-    .cmp(&self.target.distance(b.as_ref())));
+self.target.as_ref().distance(a.as_ref())
+    .cmp(&self.target.as_ref().distance(b.as_ref())));
self.iter = Some(v.into_iter());
} else {
return None
@@ -418,9 +444,10 @@ pub struct KBucketRef<'a, TPeerId, TVal> {
bucket: &'a mut KBucket<TPeerId, TVal>
}
-impl<TPeerId, TVal> KBucketRef<'_, TPeerId, TVal>
+impl<TKey, TVal> KBucketRef<'_, TKey, TVal>
where
-TPeerId: Clone
+TKey: Clone + AsRef<KeyBytes>,
+TVal: Clone
{
/// Returns the number of entries in the bucket.
pub fn num_entries(&self) -> usize {
@@ -432,6 +459,7 @@ where
self.bucket.pending().map_or(false, |n| !n.is_ready())
}
+/// Tests whether the given distance falls into this bucket.
pub fn contains(&self, d: &Distance) -> bool {
BucketIndex::new(d).map_or(false, |i| i == self.index)
}
@@ -453,6 +481,34 @@ mod tests {
use super::*;
use libp2p_core::PeerId;
use quickcheck::*;
+use rand::Rng;
+type TestTable = KBucketsTable<KeyBytes, ()>;
+impl Arbitrary for TestTable {
+fn arbitrary<G: Gen>(g: &mut G) -> TestTable {
+let local_key = Key::from(PeerId::random());
+let timeout = Duration::from_secs(g.gen_range(1, 360));
+let mut table = TestTable::new(local_key.clone().into(), timeout);
+let mut num_total = g.gen_range(0, 100);
+for (i, b) in &mut table.buckets.iter_mut().enumerate().rev() {
+let ix = BucketIndex(i);
+let num = g.gen_range(0, usize::min(K_VALUE.get(), num_total) + 1);
+num_total -= num;
+for _ in 0 .. num {
+let distance = ix.rand_distance(g);
+let key = local_key.for_distance(distance);
+let node = Node { key: key.clone(), value: () };
+let status = NodeStatus::arbitrary(g);
+match b.insert(node, status) {
+InsertResult::Inserted => {}
+_ => panic!()
+}
+}
+}
+table
+}
+}
#[test]
fn rand_distance() {
@@ -469,7 +525,7 @@
}
#[test]
-fn basic_closest() {
+fn entry_inserted() {
let local_key = Key::from(PeerId::random());
let other_id = Key::from(PeerId::random());
@@ -489,7 +545,7 @@
}
#[test]
-fn update_local_id_fails() {
+fn entry_self() {
let local_key = Key::from(PeerId::random());
let mut table = KBucketsTable::<_, ()>::new(local_key.clone(), Duration::from_secs(5));
match table.entry(&local_key) {
@@ -545,7 +601,7 @@ match e.insert((), NodeStatus::Connected) {
match e.insert((), NodeStatus::Connected) {
InsertResult::Pending { disconnected } => {
expected_applied = AppliedPending {
-inserted: key.clone(),
+inserted: Node { key: key.clone(), value: () },
evicted: Some(Node { key: disconnected, value: () })
};
full_bucket_index = BucketIndex::new(&key.distance(&local_key));
@@ -569,7 +625,7 @@ let elapsed = Instant::now() - Duration::from_secs(1);
let elapsed = Instant::now() - Duration::from_secs(1);
full_bucket.pending_mut().unwrap().set_ready_at(elapsed);
-match table.entry(&expected_applied.inserted) {
+match table.entry(&expected_applied.inserted.key) {
Entry::Present(_, NodeStatus::Connected) => {}
x => panic!("Unexpected entry: {:?}", x)
}
@@ -582,4 +638,28 @@ assert_eq!(Some(expected_applied), table.take_applied_pending());
assert_eq!(Some(expected_applied), table.take_applied_pending());
assert_eq!(None, table.take_applied_pending());
}
+#[test]
+fn count_nodes_between() {
+fn prop(mut table: TestTable, target: Key<PeerId>) -> bool {
+let num_to_target = table.count_nodes_between(&target);
+let distance = table.local_key.distance(&target);
+let base2 = U256::from(2);
+let mut iter = ClosestBucketsIter::new(distance);
+iter.all(|i| {
+// Flip the distance bit related to the bucket.
+let d = Distance(distance.0 ^ (base2.pow(U256::from(i.get()))));
+let k = table.local_key.for_distance(d);
+if distance.0.bit(i.get()) {
+// Bit flip `1` -> `0`, the key must be closer than `target`.
+d < distance && table.count_nodes_between(&k) <= num_to_target
+} else {
+// Bit flip `0` -> `1`, the key must be farther than `target`.
+d > distance && table.count_nodes_between(&k) >= num_to_target
+}
+})
+}
+QuickCheck::new().tests(10).quickcheck(prop as fn(_,_) -> _)
+}
}