P2P Systems and Gossip Protocols in Distributed Computing

P2P Systems: Gossip Protocols

CS 6410

By Alane Suhr & Danny Adams

Outline

❖ Timeline
❖ CAP Theorem
❖ Epidemic algorithms for replicated database maintenance
❖ Managing update conflicts in Bayou, a weakly connected replicated storage system
❖ Conclusion

Timeline

CAP

● Consistency -- all nodes contain the same state
● Availability -- requests are responded to promptly
● Partition
  ○ part of a system completely independent from the rest of the system
  ○ ideally should maintain itself autonomously
● Partition tolerance -- system can stay online and functional even when message passing fails

CAP Theorem

● Paxos: prioritizes consistency given a network partition
● Gossip: prioritizes availability given a network partition

Gossip

Gossip Overview

❏ Authors
❏ Motivations
❏ Epidemic Models
❏ Direct Mail
❏ Anti-Entropy
❏ Rumor Mongering
❏ Evaluation
❏ Death Certificates (DCs)
❏ Spatial Distribution

Authors

Alan Demers -- Cornell University
Dan Greene -- PARC Research
Scott Shenker -- EECS Berkeley
Doug Terry -- Amazon Web Services
Carl Hauser -- PhD Cornell; Washington State University

Motivations

● Unreliable network
● Unreliable nodes
● CAP: *AP*
  ○ always be able to respond to a (read/write) request
  ○ eventual consistency

Epidemic Models

Proposers and Acceptors

● Proposer
  ○ In Paxos: clients propose an update to the database
  ○ Epidemic model: a node infects its neighbors
● Acceptor
  ○ In Paxos: an acceptor accepts an update based on one or more proposals
  ○ Epidemic model: a node is infected by a neighbor

Types of Epidemics

❖ Direct Mail
❖ Anti-Entropy
❖ Rumor Mongering

Advantages

➢ Simple algorithms
➢ High availability
➢ Fault tolerant
➢ Tunable
➢ Scalable
➢ Works in partition

Direct Mail

● Notify all neighbors of an update
● Timely and reasonably efficient
● n messages per update

Direct Mail

Messages sent: O(n), where n is the number of neighbors
Not fault tolerant -- doesn’t guarantee eventual consistency
High volume of traffic at the site at the epicenter
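A minimal sketch of direct mail, assuming in-memory sites and a hypothetical `drop_rate` parameter that stands in for the unreliable network. The function name and return shape are illustrative, not from the paper.

```python
import random

def direct_mail(n_sites, drop_rate, seed=0):
    """Site 0 mails an update to every other site; each mail may be lost.

    Returns (messages_sent, sites_that_missed_the_update).
    drop_rate is a hypothetical model of message loss.
    """
    rng = random.Random(seed)
    has_update = [False] * n_sites
    has_update[0] = True                  # site where the update originated
    messages = 0
    for site in range(1, n_sites):
        messages += 1                     # one message per other site: O(n)
        if rng.random() >= drop_rate:     # message was delivered
            has_update[site] = True
    missed = [s for s, ok in enumerate(has_update) if not ok]
    return messages, missed
```

Nothing in this sketch retries a lost message, which is why direct mail alone cannot guarantee eventual consistency.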

Anti-Entropy

❏ Site chooses a random partner to share data with
❏ Number of rounds until consistency: O(log n)
❏ Sites use custom protocols to resolve conflicts
❏ Fault tolerant
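A small simulation can make the O(log n) round count concrete. This is a sketch under simplifying assumptions: a single update, synchronous rounds, uniformly random partners, and a two-way exchange in which both sites end up with the update if either had it. The name `anti_entropy_rounds` is illustrative.

```python
import random

def anti_entropy_rounds(n_sites, seed=0):
    """Count rounds until every site holds a single update that starts at site 0."""
    rng = random.Random(seed)
    infected = {0}
    rounds = 0
    while len(infected) < n_sites:
        rounds += 1
        snapshot = set(infected)          # who had the update when the round began
        for site in range(n_sites):
            partner = rng.randrange(n_sites - 1)
            if partner >= site:
                partner += 1              # uniform choice among the other sites
            if site in snapshot or partner in snapshot:
                infected.add(site)        # after the exchange both sides agree
                infected.add(partner)
    return rounds
```

Calling it for growing `n_sites` should show the round count growing slowly, roughly with the logarithm of the number of sites.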

Anti-Entropy

What happens next?

Mechanism: Push & Pull

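Push vs. Pull

Push: {A, B} and {A, C} become {A, B} and {A, B, C} (the first site pushes B to its partner)
Pull: {A, B} and {A, C} become {A, B, C} and {A, C} (the first site pulls C from its partner)

What is Push-Pull?

Push-pull: {A, B} and {A, C} become {A, B, C} and {A, B, C} (updates flow in both directions)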
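The three exchanges above can be sketched as set operations. This assumes each database is just a set of update identifiers and ignores timestamps and conflict resolution; the function names are illustrative.

```python
def push(sender, receiver):
    """Sender pushes what the receiver lacks; only the receiver changes."""
    return set(sender), set(sender) | set(receiver)

def pull(requester, responder):
    """Requester pulls what it lacks; only the requester changes."""
    return set(requester) | set(responder), set(responder)

def push_pull(a, b):
    """Both directions in one exchange; both sites end up with the union."""
    merged = set(a) | set(b)
    return merged, set(merged)
```

For example, `push({"A", "B"}, {"A", "C"})` leaves the first site unchanged and gives the second site all three items.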

Propagation times of Push vs. Pull

P_i = probability that a node hasn’t received the update after the i-th round

Pull is faster!!
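To see why pull wins, the sketch below iterates the recurrences from the Demers et al. analysis, which apply once only a few sites are still susceptible: pull squares the probability each round (p_{i+1} = p_i^2), while push only shrinks it by a constant factor (p_{i+1} = p_i e^{-1}). The starting value and threshold are arbitrary choices for illustration.

```python
import math

def push_step(p):
    """Push: p_{i+1} = p_i * e^{-1}."""
    return p * math.exp(-1)

def pull_step(p):
    """Pull: p_{i+1} = p_i ** 2."""
    return p * p

def rounds_until(p0, threshold, step):
    """Number of rounds until the probability drops to threshold or below."""
    p, rounds = p0, 0
    while p > threshold:
        p = step(p)
        rounds += 1
    return rounds
```

Starting from p = 0.1 with a threshold of 1e-6, `rounds_until(0.1, 1e-6, pull_step)` needs 3 rounds while `rounds_until(0.1, 1e-6, push_step)` needs 12.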

Rumor Mongering

● Sites choose a random neighbor to share information with
● Transmission rate is tunable
● How long new updates stay interesting is also tunable
● Can use push or pull mechanisms

Rumor Mongering Complexity

● O(ln n) rounds lead to consistency with high probability
● Push requires O(n ln n) transmissions until consistency
● Further proved lower bound for all push-pull transmissions: Ω(n ln ln n)

Karp et al. 2000. Randomized rumor spreading. In FOCS.

Analogy to epidemiology

● Susceptible: site does not know an update yet
● Infective: actively sharing an update
● Removed: updated and no longer sharing

Rumor mongering: nodes go from susceptible to infective and eventually (probabilistically) to removed
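The state transitions above can be simulated directly. This sketch assumes one particular variant: push only, with feedback, where an infective site loses interest with probability 1/k after contacting a site that already knows the rumor. The parameter `k` and the function name are illustrative choices, and `n_sites` must be at least 2.

```python
import random

def rumor_mongering(n_sites, k, seed=0):
    """Push rumor mongering with feedback.

    Returns (residue, messages, rounds), where residue is the fraction of
    sites still susceptible once no infective site remains.
    """
    rng = random.Random(seed)
    SUSCEPTIBLE, INFECTIVE, REMOVED = "s", "i", "r"
    state = [SUSCEPTIBLE] * n_sites
    state[0] = INFECTIVE                      # site where the rumor starts
    messages = 0
    rounds = 0
    while INFECTIVE in state:
        rounds += 1
        active = [s for s in range(n_sites) if state[s] == INFECTIVE]
        for site in active:
            partner = rng.randrange(n_sites - 1)
            if partner >= site:
                partner += 1                  # uniform choice among other sites
            messages += 1
            if state[partner] == SUSCEPTIBLE:
                state[partner] = INFECTIVE    # rumor spreads
            elif rng.random() < 1.0 / k:
                state[site] = REMOVED         # partner already knew: lose interest
    residue = state.count(SUSCEPTIBLE) / n_sites
    return residue, messages, rounds
```

Raising `k` keeps rumors interesting for longer, which lowers the residue at the cost of more messages.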

Rumor mongering

Pros:
● Fast
● Low demand on resources
● Fault-tolerant
● Less traffic

Cons:
● A site can potentially miss an update

Backups

● Anti-entropy can be used to “update” the network regularly after direct mail or rumor mongering
● If an inconsistency is found during anti-entropy, run the original algorithm again

Death Certificates

❖ How are items deleted using epidemic models?

[Figure: speech bubbles reading “I like Bread”, “I DON’T like Bread!”, and “I like orange juice”]

Death Certificates

❖ How to remove items from an epidemic model? Spread a death certificate (DC) for the deleted item like any other update
❖ Drawbacks
  ➢ Space
  ➢ Increases traffic
  ➢ DC can be lost
❖ Dormant death certificates & retention
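A rough sketch of why deletion needs death certificates, assuming each database maps a key to a `(value, timestamp)` pair and that merging keeps the newer entry. The `TOMBSTONE` marker, the `retention` parameter, and all function names are hypothetical.

```python
TOMBSTONE = object()  # marker value standing in for a death certificate

def merge(local, remote):
    """Anti-entropy style merge: for each key keep the newer entry."""
    merged = dict(local)
    for key, (value, ts) in remote.items():
        if key not in merged or ts > merged[key][1]:
            merged[key] = (value, ts)
    return merged

def delete(db, key, now):
    """Deleting writes a death certificate instead of dropping the key."""
    db[key] = (TOMBSTONE, now)

def visible_items(db):
    """Items a reader sees: everything not covered by a death certificate."""
    return {k: v for k, (v, ts) in db.items() if v is not TOMBSTONE}

def expire_certificates(db, now, retention):
    """Drop death certificates older than the retention period (saves space)."""
    return {k: (v, ts) for k, (v, ts) in db.items()
            if v is not TOMBSTONE or now - ts < retention}
```

While the certificate is retained, merging with a stale replica keeps the item deleted. Once it expires, a stale copy can bring the item back, which is the problem dormant death certificates are meant to address.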

Evaluating Epidemic Models

➢ Residue: remaining susceptibles when the epidemic finishes
➢ Traffic: m = (total update traffic) / (number of sites)
➢ Delay:
  ○ t_avg: average time between start of outbreak and arrival of the update at a given site
  ○ t_last: delay until the last site receives the update
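These metrics can be computed from a simulation trace. The sketch below assumes the trace is a list giving the round in which each site received the update (`None` if it never did) plus a total message count, and it takes traffic to mean total update messages divided by the number of sites, following Demers et al. The function name and input shape are illustrative, and at least one site (the origin) must have the update.

```python
def evaluate(arrival_round, total_messages):
    """Compute residue, traffic, t_avg and t_last for one epidemic.

    arrival_round[i] is the round in which site i got the update,
    or None if it never received it.
    """
    n = len(arrival_round)
    received = [t for t in arrival_round if t is not None]
    return {
        "residue": (n - len(received)) / n,    # fraction left susceptible
        "traffic": total_messages / n,         # messages per site
        "t_avg": sum(received) / len(received),
        "t_last": max(received),
    }
```

For instance, `evaluate([0, 1, 2, None], 8)` reports a residue of 0.25 and a traffic of 2.0 messages per site.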

Spatial Distribution

Helping or Hurting

Convergence Times and Traffic

● Linear network: anti-entropy
  ○ Nearest neighbors
    ■ O(n) convergence
    ■ O(1) traffic per link
  ○ Random connections
    ■ O(log(n)) convergence
    ■ O(n) traffic per link

Optimizations for realistic network distributions

● Select connections from list of neighbors sorted by distance
● Treat network as linear
● Compute probabilities based on position in list
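One way to picture the steps above is a sketch in which the chance of picking a neighbor falls off with its rank in the distance-sorted list. The weighting `rank ** -a` and the exponent `a` are assumptions chosen for illustration, not the exact distribution from the paper.

```python
import random

def partner_weights(n_neighbors, a):
    """Probability of choosing the neighbor at each rank (rank 1 is nearest)."""
    weights = [rank ** -a for rank in range(1, n_neighbors + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def choose_partner(neighbors_by_distance, a, rng=random):
    """Pick a partner, favoring nearby neighbors more strongly as a grows."""
    probs = partner_weights(len(neighbors_by_distance), a)
    return rng.choices(neighbors_by_distance, weights=probs, k=1)[0]
```

With `a = 0` every neighbor is equally likely (uniform selection); larger values of `a` shift the choice toward nearest neighbors, trading convergence time against traffic on long links.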

Rumor Mongering Non-Standard Distribution

● Increase the number of rounds a rumor is “interesting”
● Use push-pull

Takeaways

● Availability >> consistency
● Updates can be expensive
● Distribution protocols should be robust
● Network design can hurt overall performance
● Byzantine behavior not addressed

Questions?

Managing update conflicts in Bayou, a weakly connected replicated storage system

Additional Reading

● Weak consistency makes applications on unstable networks possible
● Good interfaces let the application supply complex functions such as merging, so they can be swapped per application

Timeline

What is Bayou?

● Storage system designed for mobile computing
  ○ Network is not stable
  ○ Parts of the network may not be connected all the time
  ○ Goal: high availability
  ○ Guarantees weak consistency

Bayou System Diagram

[Figure: clients send writes (each with a unique ID) and read requests to servers and receive data; servers exchange updates via anti-entropy]

Consistent Replicas

● Writes are first tentative
● Eventually they are committed, ordered by time
● Clients can tell whether writes are stable (committed)
● Primary servers deal with committing updates
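The tentative and committed states can be sketched with a replica that keeps a committed prefix followed by tentative writes ordered by timestamp. This is a simplified, hypothetical model: the class and method names are not Bayou's API, and the primary is represented only by calls to `commit`.

```python
class Replica:
    """Toy replica: a committed prefix followed by tentative writes."""

    def __init__(self):
        self.committed = []   # stable write ids, in commit order
        self.tentative = []   # (timestamp, write_id) pairs not yet committed

    def accept(self, timestamp, write_id):
        """Record a new write as tentative, kept in timestamp order."""
        self.tentative.append((timestamp, write_id))
        self.tentative.sort()

    def commit(self, write_id):
        """Apply a commit decision learned from the primary."""
        self.tentative = [(t, w) for t, w in self.tentative if w != write_id]
        self.committed.append(write_id)

    def is_stable(self, write_id):
        """Clients can ask whether a write is stable (committed)."""
        return write_id in self.committed

    def order(self):
        """Current write order: committed writes first, then tentative ones."""
        return self.committed + [w for _, w in self.tentative]
```

Note how a commit can reorder writes: a tentative write with a later timestamp moves ahead of earlier tentative writes once the primary commits it.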

Detecting and Resolving Conflicts

● Dependency checks
● Merge procedures
● Described by the clients, application-dependent
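As an illustration of client-supplied dependency checks and merge procedures, the sketch below models a room reservation in the spirit of the Bayou paper's meeting-room example. The write layout (a dict with `dependency_check`, `update`, and `merge_proc`) and the helper names are assumptions made for this example only.

```python
def apply_write(db, write):
    """Run the update if the dependency check passes, else the merge procedure."""
    if write["dependency_check"](db):
        write["update"](db)
        return "applied"
    return write["merge_proc"](db)

def reserve(room, slot, alternates, who):
    """Build a write that books a slot, falling back to alternate slots."""
    def free(db, s):
        return (room, s) not in db

    def update(db):
        db[(room, slot)] = who

    def merge(db):
        for alt in alternates:            # application-specific resolution
            if free(db, alt):
                db[(room, alt)] = who
                return "merged"
        return "conflict"                 # nothing worked; report to the user

    return {
        "dependency_check": lambda db: free(db, slot),
        "update": update,
        "merge_proc": merge,
    }
```

The storage layer only decides which procedure to run; what counts as a conflict and how to resolve it stays with the application.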

Conclusions

● Distributed systems need a form of consensus
● The choice of consensus model has to be weighed carefully against the attributes of the system

Acknowledgements

Content inspired by:
Ki Suh Lee: “Epidemic Techniques” [2009]
Eugene Bagdasaryan: “P2P Gossip Protocols” [2016]

Photos

www.pixabay.com

www.unsplash.com

www.1001freedownloads.com/free-cliparts

Presentation Transcript

Timeline

1978 -- Lamport: Time, Clocks, and the Ordering of Events in a Distributed System
1982 -- Lamport: The Byzantine Generals Problem
1985 -- FLP: Impossibility of Distributed Consensus with One Faulty Process
1987 -- Demers: Epidemic algorithms for replicated database maintenance
1990 -- Schneider: Implementing fault-tolerant services using the state machine approach: A tutorial
1995 -- Terry: Managing update conflicts in Bayou, a weakly connected replicated storage system
1998 -- Lamport: The part-time parliament

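Push vs. Pull

Push: {A, B} sends H(A), H(B); {A, C} replies H(B); {A, B} sends B. Result: {A, B} and {A, B, C}
Pull: {A, B} sends H(A), H(B); {A, C} replies C. Result: {A, B, C} and {A, C}

What is Push-Pull?

Push-pull: {A, B} sends H(A), H(B); {A, C} replies C, H(B); {A, B} sends B. Result: {A, B, C} and {A, B, C}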

Propagation times of Push vs. Pull

Push: $P_{i+1} = P_i e^{-1}$
Pull: $P_{i+1} = P_i^2$

$P_i$ = probability that a node hasn’t received the update after the $i$-th round

Pull is faster!!
