Evolution of Peer-to-Peer Networks and Distributed Hash Tables

Peer to Peer Networks
Distributed Hash Tables
Chord, Kelips, Dynamo
Galen Marchetti, Cornell University
1960s – 1999: Research Origins
ARPANET
every node requests and serves content
No self-organization
USENET
Decentralized messaging system
Self-organized
World Wide Web
Originally imagined as fundamentally P2P
Each node providing and receiving content
Sean Parker
1999-2001: Industry Popularity
Napster
Centralized index
P2P file transfer
FastTrack / Kazaa
“Supernodes” can act as proxy servers and routers
 P2P file transfer
Gnutella
Fully distributed P2P network
Robert Morris
  
Chord Protocol
 Handles one operation:
Map keys to nodes
 With system properties:
Completely decentralized
Dynamic membership
 Hits performance goals:
High availability
Scalable in number of nodes
Decentralization Requirements
User has a key k, wants a node n_k
Given k, every node must:
locate n_k
OR
Delegate the location of n_k to another node
System must eventually return n_k
Consistent Hashing System
Given k, every node can locate n_k
Hash every node’s IP address
Map these values onto a circle
Given a key k, hash k
k is assigned to the closest node on the circle, moving clockwise.
Consistent Hashing System
(figure: identifier circle with three nodes 0, 1, and 3; key 1 is located at node 1, key 2 at node 3, and key 6 at node 0)
Consistent Hashing System
 Pros:
Load Balanced
Dynamic membership
when the Nth node joins the network, only O(1/N) keys are moved to rebalance
 Con:
Every node must know about every
other node
O(N) memory, O(1) communication
Not scalable in number of nodes
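As an illustration of the scheme above (the full-membership variant whose O(N) memory cost motivates Chord), here is a minimal Python sketch; the ConsistentHashRing class and helper names are hypothetical, with SHA-1 as the base hash as in the Chord paper:

import hashlib
from bisect import bisect_left

M = 2 ** 160  # SHA-1 identifier space

def ident(value: str) -> int:
    # Hash a node address or a key onto the identifier circle.
    return int(hashlib.sha1(value.encode()).hexdigest(), 16) % M

class ConsistentHashRing:
    def __init__(self, node_addresses):
        # Every node knows every other node: the sorted identifiers form the circle.
        self.ring = sorted((ident(a), a) for a in node_addresses)

    def successor(self, key: str) -> str:
        # The key is assigned to the first node clockwise from hash(key).
        k = ident(key)
        idx = bisect_left(self.ring, (k,))
        return self.ring[idx % len(self.ring)][1]   # wrap around past the top of the circle

ring = ConsistentHashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(ring.successor("cnn.com"))   # O(1) lookup, but O(N) membership state per node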
Scaling Consistent Hashing
 Approach 0:
Each node keeps track of only
their successor
 Resolution of hash function done
through routing
 O(1) memory
 O(N) communication
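A rough sketch of Approach 0, assuming hypothetical node objects that carry only an id and a successor reference; the lookup walks the ring one hop at a time until the key falls between a node and its successor:

def in_range(x, a, b):
    # True if x lies in the ring interval (a, b], wrapping around zero.
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(start_node, key_id):
    # O(1) state per node, but up to O(N) hops per lookup.
    node, hops = start_node, 0
    while not in_range(key_id, node.id, node.successor.id):
        node = node.successor
        hops += 1
    return node.successor, hops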
Scaling Consistent Hashing
 Approach 1:
Each node keeps track of O(log N)
successors in a “finger table”
 O(log N) memory
 O(log N) communication
Finger Table Pointers
Routing with Finger Tables
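Below is a loose, single-process sketch of finger-table routing in the spirit of Chord's find_successor; the Node class, its fields, and the direct method calls standing in for RPCs are illustrative assumptions, not the paper's pseudocode:

M_BITS = 8
MOD = 2 ** M_BITS

def between(x, a, b):
    # True if x lies in the ring interval (a, b], wrapping modulo 2^m.
    return (a < x <= b) if a < b else (x > a or x <= b)

class Node:
    def __init__(self, ident, network):
        self.id = ident
        self.network = network          # id -> Node map, standing in for the overlay
        self.successor_id = None
        self.fingers = []               # fingers[i] ~ id of successor(self.id + 2^i)

    def build_fingers(self, sorted_ids):
        self.fingers = []
        for i in range(M_BITS):
            start = (self.id + 2 ** i) % MOD
            self.fingers.append(next((x for x in sorted_ids if x >= start), sorted_ids[0]))
        self.successor_id = self.fingers[0]

    def closest_preceding(self, key):
        # Walk the finger table from the farthest pointer back toward ourselves.
        for f in reversed(self.fingers):
            if between(f, self.id, key) and f != key:
                return f
        return self.id

    def find_successor(self, key):
        if between(key, self.id, self.successor_id):
            return self.successor_id
        nxt = self.closest_preceding(key)
        if nxt == self.id:
            return self.successor_id
        return self.network[nxt].find_successor(key)   # roughly O(log N) hops

Each hop at least halves the remaining identifier distance to the key, which is where the O(log N) bound on both memory and messages comes from.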
Node Joins
Learn finger table from predecessor
O(log n)
Update other nodes’ tables
O(log² n)
Notify application for state transfer
O(1)
Concurrent Joins
Maintain correctness by ensuring
successor is likely correct
“Stabilize” periodically
Verify successor
Update a random finger table entry
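A minimal sketch of the stabilize/notify idea, with assumed field names and direct object references in place of RPCs; a real deployment would run stabilize and a random finger refresh on a timer at every node:

import random

def between(x, a, b):
    # True if x lies strictly inside the ring interval (a, b).
    return (a < x < b) if a < b else (x > a or x < b)

class Node:
    def __init__(self, ident, m_bits=8):
        self.id = ident
        self.m_bits = m_bits
        self.successor = self
        self.predecessor = None
        self.fingers = [self] * m_bits

    def notify(self, candidate):
        # Adopt the candidate as predecessor if it falls between the old one and us.
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

    def stabilize(self):
        # Verify the successor: if its predecessor sits between us and it, adopt it,
        # then remind the successor that we might be its predecessor.
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def fix_random_finger(self, find_successor):
        # Refresh one random finger table entry per round.
        i = random.randrange(self.m_bits)
        self.fingers[i] = find_successor((self.id + 2 ** i) % (2 ** self.m_bits))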
Handling Failures
Maintain list of “r” immediate
successors
To higher level applications, this list
may be a list of replicas
Chord Shortcomings
High churn rate really hurts the ability
to find keys
Transient network partitions can
permanently disrupt network
Chord does not converge – nodes are
not eventually reachable
Researched with Alloy Modeling by
Pamela Zave at AT&T
Two Circle Failure
Zave, Pamela. "Using lightweight modeling to understand chord." ACM SIGCOMM Computer Communication Review 42.2 (2012): 49-57.
Cornell’s Response
Kelips: Building an Efficient and Stable
P2P DHT Through Increased Memory
and Background Overhead
Gossip!
Kelips
Take a collection of “nodes”
(figure: nodes 30, 110, 230, 202)
Taken from Gossip-Based Networking Workshop: Leiden ‘06
Kelips
Map nodes to affinity groups
Affinity Groups: peer membership thru consistent hash
√N members per affinity group
(figure: affinity groups 0, 1, 2; nodes 30, 110, 230, 202)
Taken from Gossip-Based Networking Workshop: Leiden ‘06
Kelips
Affinity Groups: peer membership thru consistent hash
Affinity group pointers
Affinity group view: 110 knows about other members – 230, 30…
√N members per affinity group
(figure: affinity groups 0, 1, 2; nodes 30, 110, 230, 202)
Taken from Gossip-Based Networking Workshop: Leiden ‘06
Kelips
Affinity Groups: peer membership thru consistent hash
Contact pointers
Contacts: 202 is a “contact” for 110 in group 2
Affinity group view
(figure: affinity groups 0, 1, 2; nodes 30, 110, 230, 202; √N members per affinity group)
Taken from Gossip-Based Networking Workshop: Leiden ‘06
Kelips
Affinity Groups: peer membership thru consistent hash
Resource Tuples
“cnn.com” maps to group 2. So 110 tells group 2 to “route” inquiries about cnn.com to it.
Gossip protocol replicates data cheaply
(figure: affinity group view, contacts, and resource tuples at node 110; groups 0, 1, 2; nodes 30, 110, 230, 202)
Taken from Gossip-Based Networking Workshop: Leiden ‘06
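Pulling the preceding Kelips slides together, here is a rough sketch of the per-node state and the one-hop lookup; the class, field names, and group-assignment details are assumptions for illustration, not the paper's implementation:

import hashlib
import math

def kelips_hash(s: str) -> int:
    return int(hashlib.sha1(s.encode()).hexdigest(), 16)

class KelipsNode:
    def __init__(self, address: str, n_estimate: int):
        self.address = address
        self.groups = max(1, math.isqrt(n_estimate))      # ~sqrt(N) affinity groups
        self.group = kelips_hash(address) % self.groups   # this node's own affinity group
        self.group_view = {}   # member address -> (heartbeat, rtt), maintained by gossip
        self.contacts = {}     # foreign group id -> a few contact addresses
        self.tuples = {}       # resource key -> home node, replicated within our group

    def group_of(self, key: str) -> int:
        return kelips_hash(key) % self.groups

    def lookup(self, key: str):
        gid = self.group_of(key)
        if gid == self.group:
            # Resource tuples for our own group are already replicated here by gossip.
            return self.tuples.get(key)
        # Otherwise ask a contact in the key's group: one hop, O(1) routing.
        return ("ask", self.contacts.get(gid))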
How it works
Kelips is entirely gossip based!
Gossip about membership
Gossip to replicate and repair data
Gossip about “last heard from” time used to discard failed nodes
Gossip “channel” uses fixed bandwidth
… fixed rate, packets of limited size
Taken from Gossip-Based Networking Workshop: Leiden ‘06
Connection to self-stabilization
Self-stabilization theory
Describe a system and a desired property
Assume a failure in which code remains correct but node states
are corrupted
Proof obligation: property reestablished within bounded time
Kelips is self-stabilizing.  Chord isn’t.
Taken from Gossip-Based Networking Workshop: Leiden ‘06
Amazon Dynamo
Highly available distributed hash table
Uses Chord-like ring structure
Two operations:
get()
put()
Following “CAP Theorem” lore…
Sacrifice consistency
Gain availability
No “ACID” transactions
Performance Requirements
Service Level Agreement (SLA)
Cloud providers must maintain certain
performance levels according to
contracts
Clients describe an expected request
rate distribution: SLA describes
expected latency
Amazon expresses SLAs at the 99.9th percentile of latency
High Availability for Writes
Clients write to first node they find
Vector clocks timestamp writes
Different versions of key’s value live
on different nodes
Conflicts are resolved during reads
Like git: “automerge conflict” is
handled by end application
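A minimal vector-clock sketch of this versioning scheme; the clock layout (coordinator id mapped to a counter) and helper names are assumptions:

def increment(clock: dict, coordinator: str) -> dict:
    # New version of a value: bump the coordinator's counter.
    updated = dict(clock)
    updated[coordinator] = updated.get(coordinator, 0) + 1
    return updated

def descends(a: dict, b: dict) -> bool:
    # True if version a includes everything version b has seen.
    return all(a.get(node, 0) >= count for node, count in b.items())

def reconcile(versions):
    # Drop versions that some other version strictly dominates; anything left over
    # is a set of siblings the reading application must merge itself.
    return [v for v in versions
            if not any(descends(o["clock"], v["clock"]) and o["clock"] != v["clock"]
                       for o in versions)]

v1 = {"value": "cart:[book]", "clock": {"A": 1}}
v2 = {"value": "cart:[book,pen]", "clock": {"A": 1, "B": 1}}   # extends v1
v3 = {"value": "cart:[book,mug]", "clock": {"A": 1, "C": 1}}   # concurrent with v2
print(reconcile([v1, v2, v3]))   # v1 is dropped; v2 and v3 come back as siblings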
Incremental Scalability
Consistent Hashing a la Chord
Utilize “virtual nodes” along ring
Many virtual nodes per physical
node
larger machines can hold more
virtual nodes
Heterogeneous hardware is properly
load balanced
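An illustrative sketch of virtual nodes, assuming a hypothetical weight (virtual-node count) per physical host; machines with more weight place more tokens on the ring and therefore own proportionally more keys:

import hashlib
from bisect import bisect_left

def token(s: str) -> int:
    return int(hashlib.sha1(s.encode()).hexdigest(), 16)

def build_ring(hosts: dict) -> list:
    # hosts maps host name -> number of virtual nodes to place on the ring.
    ring = []
    for host, weight in hosts.items():
        for i in range(weight):
            ring.append((token(f"{host}#vnode{i}"), host))
    return sorted(ring)

def owner(ring: list, key: str) -> str:
    idx = bisect_left(ring, (token(key),))
    return ring[idx % len(ring)][1]           # first virtual node clockwise from the key

ring = build_ring({"small-box": 8, "big-box": 32})   # big-box takes roughly 4x the keys
print(owner(ring, "cnn.com"))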
Membership
Background gossip
propagates membership knowledge
Gives O(1) hops for routing
Heartbeats and timeouts detect failures
Replication: Sloppy Quorum
Each node maintains a “preference
list” of replicas for its own data
Replicas are made on the first N healthy nodes from the preference list
require R nodes to respond for get()
require W nodes to respond for put()
Replication: Sloppy Quorum
Quorum System: R + W > N, W > N/2
Dynamo: W < N, R < N
R, W, N are tunable
Blessing: highly flexible
Curse: developers must know how to work with Dynamo correctly
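A sketch of how a coordinator might apply these R and W thresholds against the first N healthy nodes of the preference list; the per-node store/fetch interface is a hypothetical stand-in for Dynamo's internal messaging:

N, R, W = 3, 2, 2   # tunable, as the slide notes; Dynamo permits R < N and W < N

def put(healthy_replicas, key, value, clock):
    # Succeed once W of the first N healthy replicas acknowledge the write.
    acks = sum(1 for node in healthy_replicas[:N] if node.store(key, value, clock))
    return acks >= W

def get(healthy_replicas, key):
    # Succeed once R replicas respond; divergent versions go back to the caller
    # for vector-clock reconciliation.
    responses = [node.fetch(key) for node in healthy_replicas[:N]]
    responses = [r for r in responses if r is not None]
    return responses if len(responses) >= R else None

In a strict quorum (R + W > N) every read overlaps at least one replica that saw the latest write; the sloppy variant relaxes this for availability and pushes reconciliation to the reader.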
Replication: Hinted Handoff
If replica node is down
Use next node on preference list as
replica
Include “hint” declaring the original
replica
Periodically check if the original comes back up: update the replica
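A rough sketch of hinted handoff, assuming node objects with alive, store, and hints attributes; a stand-in node beyond the intended replica set stores the write together with a hint naming the replica it is covering for:

def write_with_handoff(preference_list, key, value, n):
    intended = preference_list[:n]                          # the "real" replica set
    down = [node for node in intended if not node.alive]
    targets = [node for node in preference_list if node.alive][:n]
    for node in targets:
        hint = down.pop(0) if (node not in intended and down) else None
        node.store(key, value, hint=hint)                   # hint = original replica to repay
    return len(targets) > 0                                 # writes do not block on failed replicas

def handoff_pass(node):
    # Background task: once a hinted owner is reachable again, hand its data back.
    for key, (value, owner) in list(node.hints.items()):
        if owner.alive:
            owner.store(key, value, hint=None)
            del node.hints[key]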
Permanent Failure Recovery
Anti-Entropy: Merkle Trees
Maintain a tree per virtual node
Every leaf is a hash of a block of
data (value with an individual key)
Every internal node is the hash of its children
Quick check for consistency
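A small Merkle-tree sketch for this anti-entropy check, with an assumed leaf encoding; two replicas compare root hashes, and only on a mismatch do they walk down the differing subtrees to find the exact keys to repair:

import hashlib

def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_merkle(leaf_blocks):
    # leaf_blocks: ordered (key, value) pairs held by one virtual node.
    level = [digest(f"{k}={v}".encode()) for k, v in leaf_blocks]
    tree = [level]
    while len(level) > 1:
        if len(level) % 2:                    # duplicate the last hash on odd-sized levels
            level = level + [level[-1]]
        level = [digest(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        tree.append(level)
    return tree                               # tree[-1][0] is the root hash

replica_a = build_merkle([("k1", "v1"), ("k2", "v2"), ("k3", "v3")])
replica_b = build_merkle([("k1", "v1"), ("k2", "v2-stale"), ("k3", "v3")])
print(replica_a[-1][0] == replica_b[-1][0])   # False: roots differ, so walk down to find k2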