Strong Consistency & CAP Theorem in Computing Systems

 
Strong Consistency & CAP Theorem
 
CS 240: Computing Systems and Concurrency
Lecture 15
 
Marco Canini
 
Credits: Michael Freedman and Kyle Jamieson developed much of the original material.
 
1.
Network Partitions
2.
Linearizability
3.
CAP Theorem
4.
Consistency Hierarchy
 
2
 
O
u
t
l
i
n
e
 
3
 
N
e
t
w
o
r
k
 
p
a
r
t
i
t
i
o
n
s
 
d
i
v
i
d
e
 
s
y
s
t
e
m
s
 
4
 
N
e
t
w
o
r
k
 
p
a
r
t
i
t
i
o
n
s
 
d
i
v
i
d
e
 
s
y
s
t
e
m
s
 
Totally-ordered Multicast?
Bayou?
Viewstamped Replication?
Chord?
Paxos?
Dynamo?
RAFT?
 
5
 
H
o
w
 
c
a
n
 
w
e
 
h
a
n
d
l
e
 
p
a
r
t
i
t
i
o
n
s
?
 
6
 
H
o
w
 
a
b
o
u
t
 
t
h
i
s
 
s
e
t
 
o
f
 
p
a
r
t
i
t
i
o
n
s
?
 
Replicas appear to be a 
single machine
,
but 
lose
 
availability 
during a network partition
OR
All replicas 
remain available 
during a network
partition but 
do not appear to be a single machine
 
7
 
F
u
n
d
a
m
e
n
t
a
l
 
t
r
a
d
e
-
o
f
f
?
 
You cannot achieve all three of:
1.
Consistency
2.
Availability
3.
Partition-Tolerance
Partition Tolerance => Partitions Can Happen
Availability => All Sides of Partition Continue
Consistency => Replicas Act Like Single Machine
Specifically, 
Linearizability
 
8
 
C
A
P
 
t
h
e
o
r
e
m
 
p
r
e
v
i
e
w
 
1.
Network Partitions
2.
Linearizability
3.
CAP Theorem
4.
Consistency Hierarchy
 
9
 
O
u
t
l
i
n
e
 
All replicas execute operations in 
some
 total order
That total order preserves the 
real-time ordering
between operations
If operation A 
completes
 
before operation B
begins
, then A is ordered before B in real-time
If neither A nor B completes before the other
begins, then there is no real-time order
(But there must be 
some
 total order)
 
10
10
 
L
i
n
e
a
r
i
z
a
b
i
l
i
t
y
 
[
H
e
r
l
i
h
y
 
a
n
d
 
W
i
n
g
 
1
9
9
0
]
 
Single machine processes requests one by one in
the order it receives them
Will receive requests ordered by real-time in that
order
Will receive all requests in some order
Atomic Multicast, Viewstamped Replication,
Paxos, and RAFT provide Linearizability
 
11
11
 
L
i
n
e
a
r
i
z
a
b
i
l
i
t
y
 
=
=
A
p
p
e
a
r
s
 
t
o
 
b
e
 
a
 
S
i
n
g
l
e
 
M
a
c
h
i
n
e
 
Hides the complexity of the underlying distributed
system from applications!
Easier to write applications
Easier to write correct applications
But, performance trade-offs, e.g., CAP
12
12
L
i
n
e
a
r
i
z
a
b
i
l
i
t
y
 
i
s
 
i
d
e
a
l
?
 
1.
Network Partitions
2.
Linearizability
3.
CAP Theorem
4.
Consistency Hierarchy
 
13
13
 
O
u
t
l
i
n
e
 
From keynote lecture by Eric Brewer (2000)
History:  Eric started Inktomi, early Internet search site based
around “commodity” clusters of computers
Using CAP to justify “BASE” model:  Basically Available, Soft-
state services with Eventual consistency
Popular interpretation: 2-out-of-3
Consistency (Linearizability)
Availability
Partition Tolerance:  Arbitrary crash/network failures
14
14
C
A
P
 
c
o
n
j
e
c
t
u
r
e
 
[
B
r
e
w
e
r
 
0
0
]
 
C
A
P
 
t
h
e
o
r
e
m
 
[
G
i
l
b
e
r
t
 
L
y
n
c
h
 
0
2
]
 
Assume to contradict that Algorithm 
A
 provides all of CAP
C
l
i
e
n
t
 
1
C
l
i
e
n
t
 
1
 
C
A
P
 
t
h
e
o
r
e
m
 
[
G
i
l
b
e
r
t
 
L
y
n
c
h
 
0
2
]
 
Assume to contradict that Algorithm 
A
 provides all of CAP
 
P
a
r
t
i
t
i
o
n
 
P
o
s
s
i
b
l
e
 
(
f
r
o
m
 
P
)
C
l
i
e
n
t
 
1
C
l
i
e
n
t
 
1
C
A
P
 
t
h
e
o
r
e
m
 
[
G
i
l
b
e
r
t
 
L
y
n
c
h
 
0
2
]
Assume to contradict that Algorithm 
A
 provides all of CAP
P
a
r
t
i
t
i
o
n
 
P
o
s
s
i
b
l
e
 
(
f
r
o
m
 
P
)
 
W
r
i
t
e
 
e
v
e
n
t
u
a
l
l
y
 
r
e
t
u
r
n
s
(
f
r
o
m
 
A
)
C
l
i
e
n
t
 
1
 
w
(
x
=
1
)
 
o
k
C
l
i
e
n
t
 
1
C
A
P
 
t
h
e
o
r
e
m
 
[
G
i
l
b
e
r
t
 
L
y
n
c
h
 
0
2
]
Assume to contradict that Algorithm 
A
 provides all of CAP
P
a
r
t
i
t
i
o
n
 
P
o
s
s
i
b
l
e
 
(
f
r
o
m
 
P
)
W
r
i
t
e
 
e
v
e
n
t
u
a
l
l
y
 
r
e
t
u
r
n
s
(
f
r
o
m
 
A
)
C
l
i
e
n
t
 
1
w
(
x
=
1
)
o
k
C
l
i
e
n
t
 
1
 
r
(
x
)
 
x
=
0
 
R
e
a
d
 
b
e
g
i
n
s
 
a
f
t
e
r
 
w
r
i
t
e
c
o
m
p
l
e
t
e
s
R
e
a
d
 
e
v
e
n
t
u
a
l
l
y
 
r
e
t
u
r
n
s
 
(
f
r
o
m
A
)
C
A
P
 
t
h
e
o
r
e
m
 
[
G
i
l
b
e
r
t
 
L
y
n
c
h
 
0
2
]
Assume to contradict that Algorithm 
A
 provides all of CAP
P
a
r
t
i
t
i
o
n
 
P
o
s
s
i
b
l
e
 
(
f
r
o
m
 
P
)
W
r
i
t
e
 
e
v
e
n
t
u
a
l
l
y
 
r
e
t
u
r
n
s
(
f
r
o
m
 
A
)
C
l
i
e
n
t
 
1
w
(
x
=
1
)
o
k
C
l
i
e
n
t
 
1
r
(
x
)
x
=
0
R
e
a
d
 
b
e
g
i
n
s
 
a
f
t
e
r
 
w
r
i
t
e
c
o
m
p
l
e
t
e
s
R
e
a
d
 
e
v
e
n
t
u
a
l
l
y
 
r
e
t
u
r
n
s
 
(
f
r
o
m
A
)
 
N
o
t
 
c
o
n
s
i
s
t
e
n
t
 
(
C
)
 
=
>
 
c
o
n
t
r
a
d
i
c
t
i
o
n
!
C
A
P
 
i
n
t
e
r
p
r
e
t
a
t
i
o
n
 
1
/
2
 
Cannot “choose” no partitions
2-out-of-3 interpretation doesn’t make sense
Instead, availability OR consistency?
 
i.e., fundamental trade-off between availability and
consistency
When designing system must choose one or the
other, both are not possible
 
C
A
P
 
i
n
t
e
r
p
r
e
t
a
t
i
o
n
 
2
/
2
 
It is a theorem, with a proof, that you understand!
 
Cannot “beat” CAP theorem
 
Can engineer systems to make partitions
extremely rare, however, and then just take the
rare hit to availability (or consistency)
M
o
r
e
 
t
r
a
d
e
-
o
f
f
s
 
L
 
v
s
.
 
C
 
Low-latency:  Speak to fewer than quorum of nodes?
2PC: 
 
  
write N, read 1
RAFT:  
  
write ⌊N/2⌋ + 1,  read ⌊N/2⌋ + 1
General:  
 
|W| + |R| > N
 
L and C are fundamentally at odds
“C” = linearizability, sequential, serializability (more later)
22
22
P
A
C
E
L
C
 
If there is a partition  (P):
How does system tradeoff  A and C?
Else (no partition)
How does system tradeoff  L and C?
Is there a useful system that switches?
Dynamo:  PA/EL
“ACID” dbs:  PC/EC
http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-and-yahoos-little.html
23
23
 
1.
Network Partitions
2.
Linearizability
3.
CAP Theorem
4.
Consistency Hierarchy
 
24
24
 
O
u
t
l
i
n
e
 
C
o
n
s
i
s
t
e
n
c
y
 
m
o
d
e
l
s
 
Contract between a distributed system and the
applications that run on it
A consistency model is a set of 
guarantees
 made
by the distributed system
e.g., Linearizability
Guarantees a total order of operations
Guarantees the real-time ordering is respected
 
S
t
r
o
n
g
e
r
 
v
s
 
w
e
a
k
e
r
 
c
o
n
s
i
s
t
e
n
c
y
 
Stronger consistency models
+ Easier to write applications
-
 More guarantees for the system to ensure
Results in performance tradeoffs
Weaker consistency models
-  Harder to write applications
+ Fewer guarantees for the system to ensure
 
C
o
n
s
i
s
t
e
n
c
y
 
h
i
e
r
a
r
c
h
y
 
L
i
n
e
a
r
i
z
a
b
i
l
i
t
y
 
(
S
t
r
o
n
g
/
S
t
r
i
c
t
 
C
o
n
s
i
s
t
e
n
c
y
)
 
S
e
q
u
e
n
t
i
a
l
 
C
o
n
s
i
s
t
e
n
c
y
 
C
a
u
s
a
l
+
 
C
o
n
s
i
s
t
e
n
c
y
 
E
v
e
n
t
u
a
l
 
C
o
n
s
i
s
t
e
n
c
y
 
e
.
g
.
,
 
R
A
F
T
 
e
.
g
.
,
 
B
a
y
o
u
 
e
.
g
.
,
 
D
y
n
a
m
o
 
S
t
r
i
c
t
l
y
 
s
t
r
o
n
g
e
r
 
c
o
n
s
i
s
t
e
n
c
y
 
A consistency model 
A
 is strictly stronger than 
B
 if
it allows a strict subset of the behaviors of B
Guarantees are strictly stronger
 
Linearizability is strictly stronger than Sequential
Consistency
Linearizability: 
total order + real-time ordering
Sequential: 
total order + process ordering
Process ordering 
 Real-time ordering
 
Consistency model defines what values reads are
admissible
 
29
29
 
I
n
t
u
i
t
i
v
e
 
e
x
a
m
p
l
e
 
wall-clock time
 
P
1
:
 
P
2
:
 
P
3
:
 
P
4
:
 
Consistency model defines what values reads are
admissible
 
30
30
 
I
n
t
u
i
t
i
v
e
 
e
x
a
m
p
l
e
 
wall-clock time
 
P
1
:
 
P
2
:
 
P
3
:
 
P
4
:
Time when
process issues
operation
Time when
process receives
response
 
A
n
y
 
e
x
e
c
u
t
i
o
n
 
i
s
 
t
h
e
 
s
a
m
e
 
a
s
 
i
f
 
a
l
l
 
r
e
a
d
/
w
r
i
t
e
 
o
p
s
 
w
e
r
e
 
e
x
e
c
u
t
e
d
 
i
n
 
o
r
d
e
r
 
o
f
w
a
l
l
-
c
l
o
c
k
 
t
i
m
e
 
a
t
 
w
h
i
c
h
 
t
h
e
y
 
w
e
r
e
 
i
s
s
u
e
d
Therefore:
Reads are never stale
All replicas enforce wall-clock ordering for all writes
 
31
31
 
L
i
n
e
a
r
i
z
a
b
i
l
i
t
y
 
wall-clock time
 
P
1
:
 
P
2
:
 
P
3
:
 
P
4
:
 
A
n
y
 
e
x
e
c
u
t
i
o
n
 
i
s
 
t
h
e
 
s
a
m
e
 
a
s
 
i
f
 
a
l
l
 
r
e
a
d
/
w
r
i
t
e
 
o
p
s
 
w
e
r
e
 
e
x
e
c
u
t
e
d
 
i
n
 
o
r
d
e
r
 
o
f
w
a
l
l
-
c
l
o
c
k
 
t
i
m
e
 
a
t
 
w
h
i
c
h
 
t
h
e
y
 
w
e
r
e
 
i
s
s
u
e
d
Therefore:
Reads are never stale
All replicas enforce wall-clock ordering for all writes
 
32
32
 
L
i
n
e
a
r
i
z
a
b
i
l
i
t
y
:
 
Y
E
S
 
wall-clock time
 
P
1
:
 
P
2
:
 
P
3
:
 
P
4
:
 
A
n
y
 
e
x
e
c
u
t
i
o
n
 
i
s
 
t
h
e
 
s
a
m
e
 
a
s
 
i
f
 
a
l
l
 
r
e
a
d
/
w
r
i
t
e
 
o
p
s
 
w
e
r
e
 
e
x
e
c
u
t
e
d
 
i
n
 
o
r
d
e
r
 
o
f
w
a
l
l
-
c
l
o
c
k
 
t
i
m
e
 
a
t
 
w
h
i
c
h
 
t
h
e
y
 
w
e
r
e
 
i
s
s
u
e
d
Therefore:
Reads are never stale
All replicas enforce wall-clock ordering for all writes
 
33
33
 
L
i
n
e
a
r
i
z
a
b
i
l
i
t
y
:
 
N
O
 
wall-clock time
 
P
1
:
 
P
2
:
 
P
3
:
 
P
4
:
Sequential = Linearizability – real-time ordering
1.
All servers execute all ops in 
some
 identical sequential order
2.
Global ordering preserves each client’s own local ordering
S
e
q
u
e
n
t
i
a
l
 
c
o
n
s
i
s
t
e
n
c
y
 
With concurrent ops, “reordering” of ops (w.r.t. real-time ordering)
acceptable, but all servers must see same order
e.g.,
 
linearizability cares about 
time
  
sequential consistency cares about 
program order
 
A
n
y
 
e
x
e
c
u
t
i
o
n
 
i
s
 
t
h
e
 
s
a
m
e
 
a
s
 
i
f
 
a
l
l
 
r
e
a
d
/
w
r
i
t
e
 
o
p
s
 
w
e
r
e
 
e
x
e
c
u
t
e
d
 
i
n
 
s
o
m
e
 
g
l
o
b
a
l
o
r
d
e
r
i
n
g
,
 
a
n
d
 
t
h
e
 
o
p
s
 
o
f
 
e
a
c
h
 
c
l
i
e
n
t
 
p
r
o
c
e
s
s
 
a
p
p
e
a
r
 
i
n
 
t
h
e
 
p
r
o
g
r
a
m
 
o
r
d
e
r
Therefore:
Reads may be stale in terms of real time, but not in logical time
Writes are totally ordered according to logical time across all replicas
 
35
35
 
S
e
q
u
e
n
t
i
a
l
 
c
o
n
s
i
s
t
e
n
c
y
 
wall-clock time
 
P
1
:
 
P
2
:
 
P
3
:
 
P
4
:
 
A
n
y
 
e
x
e
c
u
t
i
o
n
 
i
s
 
t
h
e
 
s
a
m
e
 
a
s
 
i
f
 
a
l
l
 
r
e
a
d
/
w
r
i
t
e
 
o
p
s
 
w
e
r
e
 
e
x
e
c
u
t
e
d
 
i
n
 
s
o
m
e
 
g
l
o
b
a
l
o
r
d
e
r
i
n
g
,
 
a
n
d
 
t
h
e
 
o
p
s
 
o
f
 
e
a
c
h
 
c
l
i
e
n
t
 
p
r
o
c
e
s
s
 
a
p
p
e
a
r
 
i
n
 
t
h
e
 
p
r
o
g
r
a
m
 
o
r
d
e
r
Therefore:
Reads may be stale in terms of real time, but not in logical time
Writes are totally ordered according to logical time across all replicas
 
36
36
 
S
e
q
u
e
n
t
i
a
l
 
c
o
n
s
i
s
t
e
n
c
y
:
 
Y
E
S
 
wall-clock time
 
P
1
:
 
P
2
:
 
P
3
:
 
P
4
:
 
A
l
s
o
 
v
a
l
i
d
 
w
i
t
h
 
l
i
n
e
a
r
i
z
a
b
i
l
i
t
y
 
A
n
y
 
e
x
e
c
u
t
i
o
n
 
i
s
 
t
h
e
 
s
a
m
e
 
a
s
 
i
f
 
a
l
l
 
r
e
a
d
/
w
r
i
t
e
 
o
p
s
 
w
e
r
e
 
e
x
e
c
u
t
e
d
 
i
n
 
s
o
m
e
 
g
l
o
b
a
l
o
r
d
e
r
i
n
g
,
 
a
n
d
 
t
h
e
 
o
p
s
 
o
f
 
e
a
c
h
 
c
l
i
e
n
t
 
p
r
o
c
e
s
s
 
a
p
p
e
a
r
 
i
n
 
t
h
e
 
p
r
o
g
r
a
m
 
o
r
d
e
r
Therefore:
Reads may be stale in terms of real time, but not in logical time
Writes are totally ordered according to logical time across all replicas
 
37
37
 
S
e
q
u
e
n
t
i
a
l
 
c
o
n
s
i
s
t
e
n
c
y
:
 
Y
E
S
 
wall-clock time
 
P
1
:
 
P
2
:
 
P
3
:
 
P
4
:
 
N
o
t
 
v
a
l
i
d
 
w
i
t
h
 
l
i
n
e
a
r
i
z
a
b
i
l
i
t
y
 
A
n
y
 
e
x
e
c
u
t
i
o
n
 
i
s
 
t
h
e
 
s
a
m
e
 
a
s
 
i
f
 
a
l
l
 
r
e
a
d
/
w
r
i
t
e
 
o
p
s
 
w
e
r
e
 
e
x
e
c
u
t
e
d
 
i
n
 
s
o
m
e
 
g
l
o
b
a
l
o
r
d
e
r
i
n
g
,
 
a
n
d
 
t
h
e
 
o
p
s
 
o
f
 
e
a
c
h
 
c
l
i
e
n
t
 
p
r
o
c
e
s
s
 
a
p
p
e
a
r
 
i
n
 
t
h
e
 
p
r
o
g
r
a
m
 
o
r
d
e
r
Therefore:
Reads may be stale in terms of real time, but not in logical time
Writes are totally ordered according to logical time across all replicas
 
38
38
 
S
e
q
u
e
n
t
i
a
l
 
c
o
n
s
i
s
t
e
n
c
y
:
 
N
O
 
wall-clock time
 
P
1
:
 
P
2
:
 
P
3
:
 
P
4
:
 
N
o
 
g
l
o
b
a
l
 
o
r
d
e
r
i
n
g
 
c
a
n
 
e
x
p
l
a
i
n
 
t
h
e
s
e
 
r
e
s
u
l
t
s
 
A
n
y
 
e
x
e
c
u
t
i
o
n
 
i
s
 
t
h
e
 
s
a
m
e
 
a
s
 
i
f
 
a
l
l
 
r
e
a
d
/
w
r
i
t
e
 
o
p
s
 
w
e
r
e
 
e
x
e
c
u
t
e
d
 
i
n
 
s
o
m
e
 
g
l
o
b
a
l
o
r
d
e
r
i
n
g
,
 
a
n
d
 
t
h
e
 
o
p
s
 
o
f
 
e
a
c
h
 
c
l
i
e
n
t
 
p
r
o
c
e
s
s
 
a
p
p
e
a
r
 
i
n
 
t
h
e
 
p
r
o
g
r
a
m
 
o
r
d
e
r
Therefore:
Reads may be stale in terms of real time, but not in logical time
Writes are totally ordered according to logical time across all replicas
 
39
39
 
S
e
q
u
e
n
t
i
a
l
 
c
o
n
s
i
s
t
e
n
c
y
:
 
N
O
 
wall-clock time
 
P
1
:
 
P
2
:
 
P
3
:
 
P
4
:
w
(
x
=
a
)
w
(
x
=
b
)
 
N
o
 
s
e
q
u
e
n
t
i
a
l
 
g
l
o
b
a
l
 
o
r
d
e
r
i
n
g
 
c
a
n
 
e
x
p
l
a
i
n
 
t
h
e
s
e
 
r
e
s
u
l
t
s
E
.
g
.
:
 
w
(
x
=
c
)
,
 
r
(
x
)
=
c
,
 
r
(
x
)
=
a
,
 
w
(
x
=
b
)
 
d
o
e
s
n
t
 
p
r
e
s
e
r
v
e
 
P
1
s
 
o
r
d
e
r
i
n
g
w
(
x
=
c
)
 
C
a
u
s
a
l
+
 
C
o
n
s
i
s
t
e
n
c
y
 
Partially orders all operations, does not totally order them
Does not look like a single machine
 
Guarantees
For each process, 
 an order of all writes + that process’s reads
Order respects the happens-before (
) 
ordering of operations
+ replicas converge to the same state
Skip details, makes it stronger than eventual consistency
C
a
u
s
a
l
+
 
B
u
t
 
N
o
t
 
S
e
q
u
e
n
t
i
a
l
P
A
P
B
w
(
x
=
1
)
w
(
y
=
1
)
r
(
y
)
=
0
r
(
x
)
=
0
 
P
A
 
O
r
d
e
r
:
 
w
(
x
=
1
)
,
 
r
(
y
=
0
)
,
 
w
(
y
=
1
)
 
H
a
p
p
e
n
s
B
e
f
o
r
e
O
r
d
e
r
 
P
r
o
c
e
s
s
O
r
d
e
r
i
n
g
w
(
x
=
1
)
w
(
y
=
1
)
r
(
y
)
=
0
r
(
x
)
=
0
 
N
o
 
T
o
t
a
l
O
r
d
e
r
w
(
x
=
1
)
w
(
y
=
1
)
r
(
y
)
=
0
r
(
x
)
=
0
 
C
a
s
u
a
l
+
X
 
S
e
q
u
e
n
t
i
a
l
 
P
B
 
O
r
d
e
r
:
 
w
(
y
=
1
)
,
 
r
(
x
=
0
)
,
 
w
(
x
=
1
)
E
v
e
n
t
u
a
l
 
B
u
t
 
N
o
t
 
C
a
u
s
a
l
+
P
A
P
B
 
A
s
 
l
o
n
g
 
a
s
 
P
B
e
v
e
n
t
u
a
l
l
y
 
w
o
u
l
d
s
e
e
 
r
(
x
)
=
1
 
t
h
i
s
 
i
s
f
i
n
e
 
H
a
p
p
e
n
s
B
e
f
o
r
e
O
r
d
e
r
i
n
g
w
(
x
=
1
)
r
(
y
)
=
1
w
(
y
)
=
1
r
(
x
)
=
0
 
N
o
 
O
r
d
e
r
f
o
r
 
P
B
w
(
x
=
1
)
r
(
y
)
=
1
w
(
y
)
=
1
r
(
x
)
=
0
 
E
v
e
n
t
u
a
l
X
 
C
a
u
s
a
l
+
Slide Note
Embed
Share

Explore the concepts of strong consistency, CAP theorem, network partitions, linearizability, and how systems handle partitions. Delve into the trade-offs between consistency, availability, and partition-tolerance as outlined by the CAP theorem.

  • Consistency
  • CAP theorem
  • Network partitions
  • Linearizability
  • Computing systems

Uploaded on Oct 05, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Strong Consistency & CAP Theorem CS 240: Computing Systems and Concurrency Lecture 15 Marco Canini Credits: Michael Freedman and Kyle Jamieson developed much of the original material.

  2. Outline 1. Network Partitions 2. Linearizability 3. CAP Theorem 4. Consistency Hierarchy 2

  3. Network partitions divide systems 3

  4. Network partitions divide systems 4

  5. How can we handle partitions? Totally-ordered Multicast? Bayou? Viewstamped Replication? Chord? Paxos? Dynamo? RAFT? 5

  6. How about this set of partitions? 6

  7. Fundamental trade-off? Replicas appear to be a single machine, but lose availability during a network partition OR All replicas remain available during a network partition but do not appear to be a single machine 7

  8. CAP theorem preview You cannot achieve all three of: 1. Consistency 2. Availability 3. Partition-Tolerance Partition Tolerance => Partitions Can Happen Availability => All Sides of Partition Continue Consistency => Replicas Act Like Single Machine Specifically, Linearizability 8

  9. Outline 1. Network Partitions 2. Linearizability 3. CAP Theorem 4. Consistency Hierarchy 9

  10. Linearizability [Herlihy and Wing 1990] All replicas execute operations in some total order That total order preserves the real-time ordering between operations If operation A completesbefore operation B begins, then A is ordered before B in real-time If neither A nor B completes before the other begins, then there is no real-time order (But there must be some total order) 10

  11. Linearizability == Appears to be a Single Machine Single machine processes requests one by one in the order it receives them Will receive requests ordered by real-time in that order Will receive all requests in some order Atomic Multicast, Viewstamped Replication, Paxos, and RAFT provide Linearizability 11

  12. Linearizability is ideal? Hides the complexity of the underlying distributed system from applications! Easier to write applications Easier to write correct applications But, performance trade-offs, e.g., CAP 12

  13. Outline 1. Network Partitions 2. Linearizability 3. CAP Theorem 4. Consistency Hierarchy 13

  14. CAP conjecture[Brewer 00] From keynote lecture by Eric Brewer (2000) History: Eric started Inktomi, early Internet search site based around commodity clusters of computers Using CAP to justify BASE model: Basically Available, Soft- state services with Eventual consistency Popular interpretation: 2-out-of-3 Consistency (Linearizability) Availability Partition Tolerance: Arbitrary crash/network failures 14

  15. CAP theorem [Gilbert Lynch 02] Assume to contradict that Algorithm A provides all of CAP Client 1 Client 1

  16. CAP theorem [Gilbert Lynch 02] Assume to contradict that Algorithm A provides all of CAP Client 1 Client 1 Partition Possible (from P)

  17. CAP theorem [Gilbert Lynch 02] Assume to contradict that Algorithm A provides all of CAP w(x=1) Client 1 Client 1 ok Write eventually returns (from A) Partition Possible (from P)

  18. CAP theorem [Gilbert Lynch 02] Assume to contradict that Algorithm A provides all of CAP w(x=1) r(x) Client 1 Client 1 ok x=0 Read begins after write completes Read eventually returns (from A) Write eventually returns (from A) Partition Possible (from P)

  19. CAP theorem [Gilbert Lynch 02] Assume to contradict that Algorithm A provides all of CAP Not consistent (C) => contradiction! w(x=1) r(x) Client 1 Client 1 ok x=0 Read begins after write completes Read eventually returns (from A) Write eventually returns (from A) Partition Possible (from P)

  20. CAP interpretation 1/2 Cannot choose no partitions 2-out-of-3 interpretation doesn t make sense Instead, availability OR consistency? i.e., fundamental trade-off between availability and consistency When designing system must choose one or the other, both are not possible

  21. CAP interpretation 2/2 It is a theorem, with a proof, that you understand! Cannot beat CAP theorem Can engineer systems to make partitions extremely rare, however, and then just take the rare hit to availability (or consistency)

  22. More trade-offs L vs. C Low-latency: Speak to fewer than quorum of nodes? 2PC: write N, read 1 RAFT: write N/2 + 1, read N/2 + 1 General: |W| + |R| > N L and C are fundamentally at odds C = linearizability, sequential, serializability (more later) 22

  23. PACELC If there is a partition (P): How does system tradeoff A and C? Else (no partition) How does system tradeoff L and C? Is there a useful system that switches? Dynamo: PA/EL ACID dbs: PC/EC http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-and-yahoos-little.html 23

  24. Outline 1. Network Partitions 2. Linearizability 3. CAP Theorem 4. Consistency Hierarchy 24

  25. Consistency models Contract between a distributed system and the applications that run on it A consistency model is a set of guarantees made by the distributed system e.g., Linearizability Guarantees a total order of operations Guarantees the real-time ordering is respected

  26. Stronger vs weaker consistency Stronger consistency models + Easier to write applications - More guarantees for the system to ensure Results in performance tradeoffs Weaker consistency models - Harder to write applications + Fewer guarantees for the system to ensure

  27. Consistency hierarchy Linearizability (Strong/Strict Consistency) e.g., RAFT Sequential Consistency e.g., Bayou Causal+ Consistency Eventual Consistency e.g., Dynamo

  28. Strictly stronger consistency A consistency model A is strictly stronger than B if it allows a strict subset of the behaviors of B Guarantees are strictly stronger Linearizability is strictly stronger than Sequential Consistency Linearizability: total order + real-time ordering Sequential: total order + process ordering Process ordering Real-time ordering

  29. Intuitive example Consistency model defines what values reads are admissible wall-clock time P1: w(x=a) P2: w(x=b) P3: r(x)=? r(x)=? P4: r(x)=? r(x)=? 29

  30. Intuitive example Consistency model defines what values reads are admissible Time when process issues operation wall-clock time P1: w(x=a) P2: w(x=b) P3: r(x)=? r(x)=? Time when process receives response P4: r(x)=? r(x)=? 30

  31. Linearizability Any execution is the same as if all read/write ops were executed in order of wall-clock time at which they were issued Therefore: Reads are never stale All replicas enforce wall-clock ordering for all writes wall-clock time P1: w(x=a) P2: w(x=b) P3: r(x)=? r(x)=? P4: r(x)=? r(x)=? 31

  32. Linearizability: YES Any execution is the same as if all read/write ops were executed in order of wall-clock time at which they were issued Therefore: Reads are never stale All replicas enforce wall-clock ordering for all writes wall-clock time P1: w(x=a) P2: w(x=b) P3: r(x)=b r(x)=b P4: r(x)=b r(x)=b 32

  33. Linearizability: NO Any execution is the same as if all read/write ops were executed in order of wall-clock time at which they were issued Therefore: Reads are never stale All replicas enforce wall-clock ordering for all writes wall-clock time P1: w(x=a) P2: w(x=b) P3: r(x)=a r(x)=b P4: r(x)=b r(x)=b 33

  34. Sequential consistency Sequential = Linearizability real-time ordering 1. All servers execute all ops in some identical sequential order 2. Global ordering preserves each client s own local ordering With concurrent ops, reordering of ops (w.r.t. real-time ordering) acceptable, but all servers must see same order e.g., linearizability cares about time sequential consistency cares about program order

  35. Sequential consistency Any execution is the same as if all read/write ops were executed in some global ordering, and the ops of each client process appear in the program order Therefore: Reads may be stale in terms of real time, but not in logical time Writes are totally ordered according to logical time across all replicas wall-clock time P1: w(x=a) P2: w(x=b) P3: r(x)=? r(x)=? P4: r(x)=? r(x)=? 35

  36. Sequential consistency: YES Any execution is the same as if all read/write ops were executed in some global ordering, and the ops of each client process appear in the program order Therefore: Reads may be stale in terms of real time, but not in logical time Writes are totally ordered according to logical time across all replicas wall-clock time P1: w(x=a) P2: w(x=b) P3: r(x)=b r(x)=b P4: r(x)=b r(x)=b Also valid with linearizability 36

  37. Sequential consistency: YES Any execution is the same as if all read/write ops were executed in some global ordering, and the ops of each client process appear in the program order Therefore: Reads may be stale in terms of real time, but not in logical time Writes are totally ordered according to logical time across all replicas wall-clock time P1: w(x=a) P2: w(x=b) P3: r(x)=a r(x)=b P4: r(x)=b r(x)=b Not valid with linearizability 37

  38. Sequential consistency: NO Any execution is the same as if all read/write ops were executed in some global ordering, and the ops of each client process appear in the program order Therefore: Reads may be stale in terms of real time, but not in logical time Writes are totally ordered according to logical time across all replicas wall-clock time P1: w(x=a) P2: w(x=b) P3: r(x)=b r(x)=a P4: r(x)=b r(x)=a No global ordering can explain these results 38

  39. Sequential consistency: NO Any execution is the same as if all read/write ops were executed in some global ordering, and the ops of each client process appear in the program order Therefore: Reads may be stale in terms of real time, but not in logical time Writes are totally ordered according to logical time across all replicas wall-clock time P1: w(x=a) w(x=c) P2: w(x=b) P3: r(x)=c r(x)=a P4: r(x)=b r(x)=a No sequentialglobal ordering can explain these results E.g.: w(x=c), r(x)=c, r(x)=a, w(x=b) doesn t preserve P1 s ordering 39

  40. Causal+ Consistency Partially orders all operations, does not totally order them Does not look like a single machine Guarantees For each process, an order of all writes + that process s reads Order respects the happens-before ( ) ordering of operations + replicas converge to the same state Skip details, makes it stronger than eventual consistency

  41. Causal+ But Not Sequential PA r(y)=0 w(x=1) PB w(y=1) r(x)=0 Casual+ X Sequential w(x=1) r(y)=0 Happens Before Order w(x=1) r(y)=0 Process Ordering w(y=1) r(x)=0 w(y=1) r(x)=0 PA Order: w(x=1), r(y=0), w(y=1) w(x=1) r(y)=0 No Total Order PB Order: w(y=1), r(x=0), w(x=1) w(y=1) r(x)=0

  42. Eventual But Not Causal+ PA w(y=1) w(x=1) PB r(y)=1 r(x)=0 Eventual X Causal+ As long as PB eventually would see r(x)=1 this is fine w(x=1) w(y)=1 Happens Before Ordering r(y)=1 r(x)=0 w(x=1) w(y)=1 No Order for PB r(y)=1 r(x)=0

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#