Database System Concurrency Control and Transactions Overview

 
6.5830 Lecture 10
Transactions
 
10/16/2022
Project proposals are due today
Quiz 1 results (Wednesday)
 
T
h
e
 
L
e
c
t
u
r
e
 
A
r
t
 
C
o
l
l
e
c
t
i
o
n
 
S
o
 
f
a
r
 
T
o
d
a
y
s
 
A
r
t
 
W
h
e
r
e
 
A
r
e
 
W
e
?
 
So far:
Studied relational model & SQL
Learned basic architecture of a database system
Studied different operator implementations
Looked at several data layouts
Saw how query optimizer works with statistics to select plans
and operators
 
What next:
Concurrency Control and Recovery
: How to ensure correctness in
the presence of modifications and failures to the database
Distributed and parallel query processing
“Advanced Topics”
Next 4 lectures
 
C
o
n
c
u
r
r
e
n
c
y
 
C
o
n
t
r
o
l
 
K
e
y
I
d
e
a
:
 
T
r
a
n
s
a
c
t
i
o
n
s
 
Group related sequence of actions so they are
“all or nothing”
If the system crashes, partial effects are not seen
Other transactions do not see partial effects
A set of implementation techniques that
provides this abstraction with good
performance
 
A
C
I
D
 
P
r
o
p
e
r
t
i
e
s
 
o
f
 
T
r
a
n
s
a
c
t
i
o
n
s
 
A
 tomicity – many actions look like one; “all or
nothing”
C
 onsistency  – database preserves invariants
I
 solation – concurrent actions don’t see each
other’s results
D
 urability 
 completed actions in effect after
crash (“recoverable”)
C
o
n
c
u
r
r
e
n
t
 
P
r
o
g
r
a
m
m
i
n
g
 
I
s
 
H
a
r
d
 
Example:
T1
t = A
t = t + 1
A = t
Looks correct!
But maybe not if other updates to A are interleaved!
Suppose T1 increment runs just before T2 increment
T1 increment will be lost
 
Conventional approach: programmer adds locks
But must reason about other concurrent programs
 
T2
t = A
t = t + 1
A = t
A = 0
A = 
0
 1
A = 
0 1
 1
 
T
r
a
n
s
a
c
t
i
o
n
s
 
D
r
a
m
a
t
i
c
a
l
l
y
 
S
i
m
p
l
i
f
y
C
o
n
c
u
r
r
e
n
t
 
P
r
o
g
r
a
m
m
i
n
g
 
Concurrent actions are 
serially equivalent
I.e., appear to have run one after the other
 
Programmer does not have to think about
what is running at the same time!
One of the big ideas in computer science
 
S
Q
L
 
S
y
n
t
a
x
 
BEGIN TRANSACTION
Followed by SQL operations that modify database
COMMIT
: make the effects of the transaction
durable
After COMMIT returns database guarantees
results present even after crash
And results are visible to other transactions
ABORT
: undo all effects of the transaction
 
T
h
i
s
 
L
e
c
t
u
r
e
:
 
A
t
o
m
i
c
i
t
y
 
A
tomicity – many actions like one; “all or nothing”
In reality, actions take time!
To get atomicity, to prevent multiple actions from
interfering with each other
I.e., are 
I
solated
 
Will return to 
D
urability in 2 lectures
E.g., how to 
recover
 a database after a crash into a state
where no partial transactions are present
 
C
o
n
s
i
s
t
e
n
c
y
 
Preservation of invariants
Usually expressed in terms of constraints
E.g., primary keys / foreign keys
Triggers
Example: no employee makes more than their
manager
Requires ugly non-SQL syntax (e.g. PL/pgSQL)
Often done in the application
P
o
s
t
g
r
e
s
 
P
L
/
p
g
S
Q
L
 
T
r
i
g
g
e
r
 
E
x
a
m
p
l
e
 
CREATE FUNCTION sal_check() RETURNS trigger AS $sal_check$
 
DECLARE
  
mgr_sal integer;
    
 
BEGIN
                      IF NEW.salary IS NOT NULL THEN
   
SELECT INTO mgr_sal salary
    
FROM emp
    
JOIN manages
     
ON NEW.eid = manages.eid
     
AND emp.eid = manages.eid
    
LIMIT 1;
   
IF (mgr_sal < NEW.salary) THEN
            
  
                 RAISE EXCEPTION 'employee cannot make more than manager’;
 
                      END IF;
                       END IF;
                       RETURN NEW;
    END;
$sal_check$ LANGUAGE plpgsql;
CREATE TRIGGER eid_sal BEFORE INSERT OR UPDATE ON emp
FOR EACH ROW EXECUTE FUNCTION sal_check();
 
NEW is the record being added
 
mgr_sal is a local variable
Query finds the salary of one
manager
 
Check salary (if no manager, mgr_sal is NULL)
 
Declare that we want to call sal_check
every time a record changes or is added to emp
H
o
w
 
C
a
n
 
W
e
 
I
s
o
l
a
t
e
 
A
c
t
i
o
n
s
?
 
Serialize execution:  one transaction at a time
Problems with this?
No ability to use multiple processors
Long running transactions 
starve
 others
 
Goal:  allow 
concurrent
 execution while
preserving 
serial equivalence
 
Concurrency control 
algorithms do this
S
e
r
i
a
l
i
z
a
b
i
l
i
t
y
An ordering of actions in concurrent
transactions that is serially equivalent
T1
   
T2
RA
   
RA
RB
   
RB
WA
   
WA
WB
   
WB
    
   
T1
   
T2
RA
    
WA
   
RA
   
WA
RB
WB
   
RB
   
WB
   
    
 
RA
: Read A
WA
: Write A, may depend on anything read previously
 
A/B are “objects” – e.g., records, disk pages, etc
 
Assume arbitrary application logic between reads and
writes
 
Serially equivalent 
to T1 then T2
S
e
r
i
a
l
i
z
a
b
i
l
i
t
y
An ordering of actions in concurrent
transactions that is serially equivalent
T1
   
T2
RA
   
RA
RB
   
RB
WA
   
WA
WB
   
WB
    
   
T1
   
T2
RA
    
   
RA
   
WA
WA
RB
WB
   
RB
   
WB
   
    
RA
: Read A
WA
: Write A, may depend on anything read previously
A/B are “objects” – e.g., records, disk pages, etc
Assume arbitrary application logic between reads and
writes
 
Not serially equivalent 
T2’s write to A is lost, couldn’t
occur in a serial schedule
 
In T1-T2, T2 should see T1’s write to A
 
In T2-T1, T1 should see T2’s write to A
 
T
e
s
t
i
n
g
 
f
o
r
 
S
e
r
i
a
l
i
z
a
b
i
l
i
t
y
 
View
Serializability
 
Any schedule that is conflict serializable is view serializable, but not vice-versa
Conflict
Serializability
V
i
e
w
 
S
e
r
i
a
l
i
z
a
b
i
l
i
t
y
 
A particular ordering of instructions in a schedule S is 
view
equivalent 
to a serial ordering S' iff:
 
Every value read in S is the same value that was read by the
same read in S'.
 
The final write of every object is done by the same transaction
T in S and S’
 
Less formally, all transactions in S “view” the same
values they view in S', and the final state after the
transactions run is  the same.
V
i
e
w
 
S
e
r
i
a
l
i
z
a
b
i
l
i
t
y
 
E
x
a
m
p
l
e
Every value read in S is the same value that
was read by the same read in S'.
The final write of every object is done by the
same transaction T in S and S'
S
 
         
S’
T1
  
 
  
T2
 
     
T1 
    
T2
RA=A1 
        
RA= A1
WA
A2 
       
WA
A2
    
RA = A2 
    
RB = B1
    
WA

 
   
WB
B2
RB=B1 
            
RA = A2
WB
B2 
           
WA

    
RB=B2 
        
RB = B2
    
WB

 
       
WB

h
t
t
p
s
:
/
/
c
l
i
c
k
e
r
.
m
i
t
.
e
d
u
/
6
.
5
8
3
0
/
I
s
 
t
h
e
 
f
o
l
l
o
w
i
n
g
 
s
c
h
e
d
u
l
e
v
i
e
w
 
s
e
r
i
a
l
i
z
a
b
l
e
?
A)
Yes
B)
No
 
A particular ordering of instructions in a schedule S is 
view
equivalent 
to a serial ordering S' iff:
Every value read in S is the same value that was read by the
same read in S'.
The final write of every object is done by the same transaction
T in S and S’
V
i
e
w
 
S
e
r
i
a
l
i
z
a
b
i
l
i
t
y
 
L
i
m
i
t
a
t
i
o
n
s
Must test against each possible serial schedule
to determine serial equivalence
NP-Hard!
No protocol to ensure view serializability as
transactions run
Conflict serializability 
addresses both points
 
(For N concurrent transactions, there
are 2
N
 possible serial schedules)
C
o
n
f
l
i
c
t
i
n
g
 
O
p
e
r
a
t
i
o
n
s
 
Two operations are said to conflict if:
Both operations are on the same object
At least one operation is a write
E.g.,
T1
WA
 conflicts with T2
RA
, but
T1
RA
 does not conflict with T2
RA
 
T1
 
T2
C
o
n
f
l
i
c
t
 
S
e
r
i
a
l
i
z
a
b
i
l
i
t
y
 
A schedule is 
conflict serializable
 if it is possible
to swap non-conflicting operations to derive a
serial schedule.
   
Equivalently
For all pairs of conflicting operations {O1 in T1,
O2 in T2} either
O1 always precedes O2, or
O2 always precedes O1.
 
T1 
 T2 : “T1 precedes T2”
E
x
a
m
p
l
e
T1
   
T2
RA
    
   
RA
   
WA
WA
RB
WB
   
RB
   
WB
   
    
T1
   
T2
RA
WA
    
   
RA
   
WA
RB
WB
   
RB
   
WB
   
    
For all pairs of conflicting operations {O1 in T1, O2 in T2} either
O1 always precedes O2, or
O2 always precedes O1.
 
T1 
 T2
 
T2 
 T1
 
Not conflict serializable!
 
T1 
 T2
 
T1 
 T2
 
Conflict serializable!
T1
   
T2
RA
WA
RB
WB
    
   
RA
   
WA
   
RB
   
WB
   
    
In conflict serializable schedule,
can reorder non-conflicting ops to
get serial schedule
 
O1
 
O2
 
P
r
e
c
e
d
e
n
c
e
 
G
r
a
p
h
 
Given transactions Ti and Tj,
Create an edge from Ti
Tj if:
 
Ti reads/writes some A before Tj writes A
RA
Ti
 WA
Tj  
or
 
WA
Ti
 WA
Tj
      
or
Ti writes some A before Tj reads A
WA
Ti
 RA
Tj
 
If there are cycles in this graph, schedule is not conflict
serializable
N
o
n
-
S
e
r
i
a
l
i
z
a
b
l
e
 
E
x
a
m
p
l
e
T1
   
T2
RA
    
   
RA
   
WA
WA
RB
WB
   
RB
   
WB
   
    
T1
T2
 
Cycle!
Create an edge from Ti
Tj if:
Ti reads/writes some A before Tj writes A, or
RA
Ti
 WA
Tj  
or
 
WA
Ti
 WA
Tj 
Ti writes some A before Tj reads A
WA
Ti
 RA
Tj
Precedence Graph
S
e
r
i
a
l
i
z
a
b
l
e
 
E
x
a
m
p
l
e
T1
   
T2
RA
WA
    
   
RA
   
WA
RB
WB
   
RB
   
WB
   
    
T1
T2
 
No Cycles!
Create an edge from Ti
Tj if:
Ti reads/writes some A before Tj writes A, or
RA
Ti
 WA
Tj  
or
 
WA
Ti
 WA
Tj 
Ti writes some A before Tj reads A
WA
Ti
 RA
Tj
Precedence Graph
 
WA
T1
≺ RA
T2
 
WA
T1
≺ WA
T2
 
R
e
c
a
p
:
 
3
 
W
a
y
s
 
t
o
 
T
e
s
t
 
f
o
r
 
C
o
n
f
l
i
c
t
S
e
r
i
a
l
i
z
a
b
i
l
i
y
1.
Check: For all pairs of conflicting
operations {O1 in T1, O2 in T2}
either
a.
O1 always precedes O2, or
b.
O2 always precedes O1.
 
2.
Swap non-conflicting operations
to get serial schedule
3.
Build precedence graph, check
for cycles
 
C
l
i
c
k
e
r
:
h
t
t
p
s
:
/
/
c
l
i
c
k
e
r
.
m
i
t
.
e
d
u
/
6
.
5
8
3
0
/
 
Is this schedule conflict serializable?
 
A)
Yes
B)
No
S
t
u
d
y
 
B
r
e
a
k
Is this schedule conflict serializable?
T1
T2
T3
 
No!
 
h
t
t
p
s
:
/
/
c
l
i
c
k
e
r
.
m
i
t
.
e
d
u
/
6
.
5
8
3
0
/
 
Is this schedule
A)
neither view nor conflict serializable
B)
conflict serializable but 
not
 view serializable
C)
view serializable but 
not
 conflict serializable
D)
conflict and view serializable
V
i
e
w
 
v
s
 
C
o
n
f
l
i
c
t
 
S
e
r
i
a
l
i
z
a
b
l
e
 
Testing for view serializability is NP-Hard
Have to consider all possible orderings
Conflict serializability used in practice
Not because of NP-Hardness
Because we have a way to enforce it as transactions run
Example of schedule that is view serializable but not conflict serializable:
 
T1
 
  
T2
 
  
T3
RA
  
WA
WA
    
WA
RB
WB
 
Equivalent to T1, T2, T3
Conflict serializability does not permit this
Only happens with 
blind writes
 
Cycle!
 
V
i
e
w
 
v
s
 
C
o
n
f
l
i
c
t
 
S
e
r
i
a
l
i
z
a
b
l
e
 
View
Serializability
 
Any schedule that is conflict serializable is view serializable, but not vice-versa
Conflict
Serializability
I
m
p
l
e
m
e
n
t
i
n
g
 
C
o
n
f
l
i
c
t
 
S
e
r
i
a
l
i
z
a
b
i
l
i
t
y
Several different protocols
Today: Two Phase Locking (2PL)
Basic idea:
Acquire a shared (S) lock before each read of
an object
Acquire an exclusive (X) lock before each write
of an object
Several transactions can hold an S lock
Only one transaction can hold an X lock
If a transaction cannot acquire a lock it waits
(“blocks”)
 
Lock Compatibility Table
 
Conflicting operations (from def. of conflict serializability) are not
compatible with each other
 
T1
 
T2
 
T1
 
T2
W
h
e
n
 
t
o
 
R
e
l
e
a
s
e
 
L
o
c
k
s
 
After each op completes?
Or after xaction is done with
variable?
No! Example of problem 
T2 “sneaks in” and updates
A and B before T1 updates B
T1
    
T2
Xlock A
RA
WA
Rel A
    
Xlock A
    
RA
    
WA
    
Xlock B
    
RB
    
WB
    
Rel A,B
Xlock B
RB
WB
Rel B
 
This schedule is not serializable
 
S
o
l
u
t
i
o
n
:
 
T
w
o
 
P
h
a
s
e
 
L
o
c
k
i
n
g
 
A transaction cannot release any locks until it
has acquired all of its locks
 
E
x
a
m
p
l
e
,
 
R
e
v
i
s
i
t
e
d
 
Rule: A transaction
cannot release any
locks until it has
acquired all of its
locks
T1
    
T2
Xlock A
RA
WA
Rel A
    
Xlock A
    
RA
    
WA
    
Xlock B
    
RB
    
WB
    
Rel A,B
Xlock B
RB
WB
Rel B
 
This schedule is not serializable
 
Not allowed 
E
x
a
m
p
l
e
,
 
R
e
v
i
s
i
t
e
d
 
Rule: A transaction
cannot release any
locks until it has
acquired all of its
locks
Serial schedule
defined by 
lock
points
Where they acquire
last lock
 
T1
    
T2
Xlock A
RA
WA
Xlock B
Rel A
    
Xlock A
    
RA
  
    
WA
RB
WB
  
Rel B
     
    
Xlock B
    
RB
    
WB
    
Rel A,B
    
This schedule *is* serializable
Acquired all 
locks so
can release
 
 
Lock point
 
Lock point 
 
C
o
r
r
e
c
t
n
e
s
s
 
I
n
t
u
i
t
i
o
n
 
Once a transaction T reached its lock point:
T’s place in serial order is set
Any transactions that haven't acquired all their
locks can’t take any conflicting actions until after T
releases locks
Ordered later
Any transactions which already have all their locks
must have completed their conflicting actions
(released their locks) before T can proceed
Ordered earlier
T
w
o
 
P
h
a
s
e
 
L
o
c
k
i
n
g
 
(
2
P
L
)
 
P
r
o
t
o
c
o
l
 
 
Before every read, acquire a shared lock
 
Before every write, acquire an exclusive lock
(or "upgrade") a shared to an exclusive lock
 
Release locks only after last lock has been
acquired, and ops on that object are finished
 
 
C
a
n
 
y
o
u
 
t
h
i
n
k
 
o
f
 
a
n
y
 
p
o
t
e
n
t
i
a
l
p
r
o
b
l
e
m
s
 
w
i
t
h
 
2
P
L
?
 
R
e
f
i
n
i
n
g
 
2
P
L
 
Problems:
Deadlocks
Cascading Aborts
 
How do we know when we are done with all
operations on an object?
D
e
a
d
l
o
c
k
s
Possible for Ti to hold a lock Tj needs, and vice
versa
T1
    
T2
RA
WA
    
RB
  
    
WB
RB
WB
  
    
RA
    
WA
 
    
 
T1 waits for T2 
 
 T2 waits for T1
C
o
m
p
l
e
x
 
D
e
a
d
l
o
c
k
s
 
A
r
e
 
P
o
s
s
i
b
l
e
T1
    
T2
    
T3
RA
WA
        
RC
    
RB
      
    
WB
        
RA
        
WA
RB
WB
  
    
RC
    
WC
 
    
 
T1 waits for T2 
 
 T2 waits for T3
 
 T3 waits for T1
T1
T2
Waits-for 
graph
Cycle 
 Deadlock
T3
R
e
s
o
l
v
i
n
g
 
D
e
a
d
l
o
c
k
Solution: abort one of the transactions
Recall: users can abort too
T3
 
Equivalent to T2 - T1
C
a
s
c
a
d
i
n
g
 
A
b
o
r
t
s
Problem: if T1 aborts, and T2 read something
T1 wrote, T2 also needs to abort
T1
    
T2
Xlock A
RA
WA
Xlock B
Rel A
    
Xlock A
    
RA
  
    
WA
RB
WB
  
Rel B
     
    
Xlock B
    
RB
    
WB
    
Rel A,B
    
 
If T1 aborts here
 
T2 also needs to abort,
It reads T1’s write of A
 
C
a
n
 
y
o
u
 
t
h
i
n
k
 
o
f
 
a
 
2
P
L
 
v
a
r
i
a
n
t
 
w
h
i
c
h
n
e
i
t
h
e
r
 
r
e
q
u
i
r
e
s
 
d
e
a
d
l
o
c
k
 
d
e
t
e
c
t
i
o
n
 
n
o
r
h
a
s
 
c
a
s
c
a
d
i
n
g
 
a
b
o
r
t
s
?
 
S
t
r
i
c
t
 
T
w
o
-
P
h
a
s
e
 
L
o
c
k
i
n
g
 
Can avoid cascading aborts by holding
exclusive locks until end of transaction
 
Ensures that transactions never read other
transaction’s uncommitted data
S
t
r
i
c
t
 
T
w
o
-
P
h
a
s
e
 
L
o
c
k
i
n
g
 
P
r
o
t
o
c
o
l
 
Before every read, acquire a shared lock
 
Before every write, acquire an exclusive lock (or "upgrade") a
shared to an exclusive lock
 
Release locks only after last lock has been acquired, and ops on that
object are finished
Release 
shared
 locks only after last lock has been acquired, and ops
on that object are finished
 
Release 
exclusive
 locks only after the transaction commits
 
Ensures cascadeless-ness
 
P
r
o
b
l
e
m
:
 
W
h
e
n
 
i
s
 
i
t
 
O
K
 
t
o
 
r
e
l
e
a
s
e
?
 
How does DBMS know a transaction no longer
needs a lock?
Difficult, since transactions can be issued
interactively
In practice, this means that all locks held til
end of transaction
This is called 
rigorous two-phase locking
R
i
g
o
r
o
u
s
 
T
w
o
-
P
h
a
s
e
 
L
o
c
k
i
n
g
P
r
o
t
o
c
o
l
 
 
Before every read, acquire a shared lock
 
Before every write, acquire an exclusive lock (or
"upgrade") a shared to an exclusive lock
 
Release (all) locks only after the transaction commits
 
Ensures cascadeless-ness, and
Commit order = serialization order
 
 
C
a
n
 
y
o
u
 
a
v
o
i
d
 
d
e
a
d
l
o
c
k
 
d
e
t
e
c
t
i
o
n
?
 
 
C
l
i
c
k
e
r
:
h
t
t
p
s
:
/
/
c
l
i
c
k
e
r
.
m
i
t
.
e
d
u
/
6
.
5
8
3
0
/
 
UPDATE
 
 professors
SET
 
status = ‘teaching'
WHERE
 
name = 'Tim'
AND
 
NOT
 
EXISTS
 
SELECT
 
1
 
FROM
 
employees 
WHERE
 
status = 
‘teaching’ AND name= ' Sam'
 
Can you create a serializable interleaved
schedule?
 
UPDATE
 
 professors
SET
 
status = ‘teaching'
WHERE
 
name = ' Sam'
AND
 
NOT
 
EXISTS
 
SELECT
 
1
 
FROM
 
employees 
WHERE
 
status = 
‘teaching’ AND name= ‘Tim'
 
C
l
i
c
k
e
r
:
h
t
t
p
s
:
/
/
c
l
i
c
k
e
r
.
m
i
t
.
e
d
u
/
6
.
5
8
3
0
/
 
UPDATE
 
 professors
SET
 
status = ‘teaching'
WHERE
 
name = 'Tim'
AND
 
NOT
 
EXISTS
 
SELECT
 
1
 
FROM
 
employees 
WHERE
 
status = 
‘teaching’ AND name= ' Sam'
 
S
T1
  
 
  
T2
Rs=s1
    
Rt=t1
Wt

2
    
Ws

 
UPDATE
 
 professors
SET
 
status = ‘teaching'
WHERE
 
name = ' Sam'
AND
 
NOT
 
EXISTS
 
SELECT
 
1
 
FROM
 
employees 
WHERE
 
status = 
‘teaching’ AND name= ‘Tim'
 
C
l
i
c
k
e
r
:
h
t
t
p
s
:
/
/
c
l
i
c
k
e
r
.
m
i
t
.
e
d
u
/
6
.
5
8
3
0
/
 
UPDATE
 
 professors
SET
 
status = ‘teaching'
WHERE
 
name = 'Tim'
AND
 
NOT
 
EXISTS
 
SELECT
 
1
 
FROM
 
employees 
WHERE
 
status = 
‘teaching’ AND name= ' Sam'
 
S
T1
  
 
  
T2
Rs=s1
    
Rt=t1
Wt

2
    
Ws

 
UPDATE
 
 professors
SET
 
status = ‘teaching'
WHERE
 
name = ' Sam'
AND
 
NOT
 
EXISTS
 
SELECT
 
1
 
FROM
 
employees 
WHERE
 
status = 
‘teaching’ AND name= ‘Tim'
 
Is this schedule
A)
neither view nor conflict
serializable
B)
conflict serializable but 
not
 view
serializable
C)
view serializable but 
not
 conflict
serializable
D)
conflict and view serializable
 
N
e
x
t
 
1
.
5
 
L
e
c
t
u
r
e
s
 
Optimistic concurrency control: Another
protocol to achieve conflict serializability
 
Nuances that arise with locking granularity
Slide Note
Embed
Share

Studying relational models, SQL, database system architecture, operator implementations, data layouts, and query optimization laid the foundation for advanced topics like Concurrency Control and Recovery. Discover how transactions group related actions, ACID properties ensure data integrity, and the challenges of concurrent programming are simplified through transactions. Explore distributed and parallel query processing alongside future lecture topics in the exciting world of database systems.


Uploaded on May 16, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. 6.5830 Lecture 10 Transactions 10/16/2022 Project proposals are due today Quiz 1 results (Wednesday)

  2. The Lecture Art Collection So far

  3. Todays Art

  4. Where Are We? So far: Studied relational model & SQL Learned basic architecture of a database system Studied different operator implementations Looked at several data layouts Saw how query optimizer works with statistics to select plans and operators What next: Concurrency Control and Recovery: How to ensure correctness in the presence of modifications and failures to the database Distributed and parallel query processing Advanced Topics

  5. Next 4 lectures

  6. Concurrency Control Key Idea: Transactions Group related sequence of actions so they are all or nothing If the system crashes, partial effects are not seen Other transactions do not see partial effects A set of implementation techniques that provides this abstraction with good performance

  7. ACID Properties of Transactions A tomicity many actions look like one; all or nothing C onsistency database preserves invariants I solation concurrent actions don t see each other s results D urability completed actions in effect after crash ( recoverable )

  8. Concurrent Programming Is Hard A = 0 A = 0 1 A = 0 1 1 Example: T1 t = A t = t + 1 A = t Looks correct! But maybe not if other updates to A are interleaved! Suppose T1 increment runs just before T2 increment T1 increment will be lost T2 t = A t = t + 1 A = t Conventional approach: programmer adds locks But must reason about other concurrent programs

  9. Transactions Dramatically Simplify Concurrent Programming Concurrent actions are serially equivalent I.e., appear to have run one after the other Programmer does not have to think about what is running at the same time! One of the big ideas in computer science

  10. SQL Syntax BEGIN TRANSACTION Followed by SQL operations that modify database COMMIT: make the effects of the transaction durable After COMMIT returns database guarantees results present even after crash And results are visible to other transactions ABORT: undo all effects of the transaction

  11. This Lecture: Atomicity Atomicity many actions like one; all or nothing In reality, actions take time! To get atomicity, to prevent multiple actions from interfering with each other I.e., are Isolated Will return to Durability in 2 lectures E.g., how to recover a database after a crash into a state where no partial transactions are present

  12. Consistency Preservation of invariants Usually expressed in terms of constraints E.g., primary keys / foreign keys Triggers Example: no employee makes more than their manager Requires ugly non-SQL syntax (e.g. PL/pgSQL) Often done in the application

  13. Postgres PL/pgSQL Trigger Example CREATE FUNCTION sal_check() RETURNS trigger AS $sal_check$ DECLARE mgr_sal integer; BEGIN IF NEW.salary IS NOT NULL THEN SELECT INTO mgr_sal salary FROM emp JOIN manages ON NEW.eid = manages.eid AND emp.eid = manages.eid LIMIT 1; IF (mgr_sal < NEW.salary) THEN RAISE EXCEPTION 'employee cannot make more than manager ; END IF; END IF; RETURN NEW; END; $sal_check$ LANGUAGE plpgsql; CREATE TRIGGER eid_sal BEFORE INSERT OR UPDATE ON emp FOR EACH ROW EXECUTE FUNCTION sal_check(); NEW is the record being added mgr_sal is a local variable Query finds the salary of one manager Check salary (if no manager, mgr_sal is NULL) Declare that we want to call sal_check every time a record changes or is added to emp

  14. How Can We Isolate Actions? Serialize execution: one transaction at a time Problems with this? No ability to use multiple processors Long running transactions starve others Goal: allow concurrent execution while preserving serial equivalence Concurrency control algorithms do this

  15. Serializability An ordering of actions in concurrent transactions that is serially equivalent T1 RA RB WA WB WB Serially equivalent to T1 then T2 T1 RA WA RB T2 T2 RA RB WA WB RB WB RA: Read A WA: Write A, may depend on anything read previously RA WA A/B are objects e.g., records, disk pages, etc Assume arbitrary application logic between reads and writes

  16. Serializability An ordering of actions in concurrent transactions that is serially equivalent T1 RA RB WA WB WB Not serially equivalent T2 s write to A is lost, couldn t occur in a serial schedule In T1-T2, T2 should see T1 s write to A In T2-T1, T1 should see T2 s write to A T1 RA WA RB T2 RA RB WA WB RB WB T2 RA WA RA: Read A WA: Write A, may depend on anything read previously A/B are objects e.g., records, disk pages, etc Assume arbitrary application logic between reads and writes

  17. Testing for Serializability Conflict Serializability View Serializability Any schedule that is conflict serializable is view serializable, but not vice-versa

  18. View Serializability A particular ordering of instructions in a schedule S is view equivalent to a serial ordering S' iff: Every value read in S is the same value that was read by the same read in S'. The final write of every object is done by the same transaction T in S and S Less formally, all transactions in S view the same values they view in S', and the final state after the transactions run is the same.

  19. View Serializability Example S T1 RA=A1 WA A2 RB=B1 WB B2 T2 RA = A2 WA A3 RB=B2 WB B3 S T1 RA= A1 WA A2 RB = B1 WB B2 T2 RA = A2 WA A3 RB = B2 WB B3 Same values read in both schedules T2 does final write in both schedules Every value read in S is the same value that was read by the same read in S'. The final write of every object is done by the same transaction T in S and S'

  20. https://clicker.mit.edu/6.5830/ Is the following schedule view serializable? T1 T2 RA=A1 RA=A1 WA->A2 WB->B2 WB->B3 A particular ordering of instructions in a schedule S is view equivalent to a serial ordering S' iff: Every value read in S is the same value that was read by the same read in S'. The final write of every object is done by the same transaction T in S and S A)Yes B)No

  21. View Serializability Limitations Must test against each possible serial schedule to determine serial equivalence NP-Hard! (For N concurrent transactions, there are 2N possible serial schedules) No protocol to ensure view serializability as transactions run Conflict serializability addresses both points

  22. Conflicting Operations Two operations are said to conflict if: Both operations are on the same object At least one operation is a write E.g., T1WA conflicts with T2RA, but T1RA does not conflict with T2RA T1 R W T2 R W

  23. Conflict Serializability A schedule is conflict serializable if it is possible to swap non-conflicting operations to derive a serial schedule. Equivalently For all pairs of conflicting operations {O1 in T1, O2 in T2} either O1 always precedes O2, or O2 always precedes O1. T1 T2 : T1 precedes T2

  24. Example T1 RA WA RB WB Not conflict serializable! T2 RA WA T1 RA WA RB WB Conflict serializable! can reorder non-conflicting ops to get serial schedule T2 T2 T1 RA WA RB WB T1 T2 O1 T1 T2 RA WA RA WA RB WB O2 T2 T1 T1 T2 RB WB RB WB In conflict serializable schedule, For all pairs of conflicting operations {O1 in T1, O2 in T2} either O1 always precedes O2, or O2 always precedes O1.

  25. Precedence Graph Given transactions Ti and Tj, Create an edge from Ti Tj if: Ti reads/writes some A before Tj writes A RATi WATj orWATi WATj Ti writes some A before Tj reads A WATi RATj or If there are cycles in this graph, schedule is not conflict serializable

  26. Non-Serializable Example Precedence Graph T1 RA WA RB WB Create an edge from Ti Tj if: T2 RA WA T2 T1 RB WB Cycle! Ti reads/writes some A before Tj writes A, or RATi WATj orWATi WATj Ti writes some A before Tj reads A WATi RATj

  27. Serializable Example Precedence Graph T1 RA WA RB WB Create an edge from Ti Tj if: T2 RA WA T2 T1 RB WB No Cycles! Ti reads/writes some A before Tj writes A, or RATi WATj orWATi WATj Ti writes some A before Tj reads A WATi RATj

  28. Recap: 3 Ways to Test for Conflict Serializabiliy 1. Check: For all pairs of conflicting operations {O1 in T1, O2 in T2} either a. O1 always precedes O2, or b. O2 always precedes O1. 2. Swap non-conflicting operations to get serial schedule 3. Build precedence graph, check for cycles

  29. Clicker: https://clicker.mit.edu/6.5830/ Is this schedule conflict serializable? T1 T2 T3 A)Yes B)No RA RB WA RB WB WB RA WA COMMIT COMMIT COMMIT

  30. T3 Study Break T2 Is this schedule conflict serializable? T1 T1 T2 T3 RA RB No! WA RB WB WB RA WA COMMIT COMMIT COMMIT

  31. https://clicker.mit.edu/6.5830/ T1 T2 T3 RA WA WA WA RB WB Is this schedule A) neither view nor conflict serializable B) conflict serializable but not view serializable C) view serializable but not conflict serializable D) conflict and view serializable

  32. View vs Conflict Serializable Testing for view serializability is NP-Hard Have to consider all possible orderings Conflict serializability used in practice Not because of NP-Hardness Because we have a way to enforce it as transactions run Example of schedule that is view serializable but not conflict serializable: T1 RA WA RB WB T2 T3 Cycle! T2 Blind Writes WA T1 WA Equivalent to T1, T2, T3 Conflict serializability does not permit this Only happens with blind writes T3

  33. View vs Conflict Serializable Conflict Serializability View Serializability Any schedule that is conflict serializable is view serializable, but not vice-versa

  34. Implementing Conflict Serializability T1 R W Several different protocols Today: Two Phase Locking (2PL) Basic idea: Acquire a shared (S) lock before each read of an object Acquire an exclusive (X) lock before each write of an object Several transactions can hold an S lock Only one transaction can hold an X lock If a transaction cannot acquire a lock it waits ( blocks ) T2 R W T1 S X T2 S X Lock Compatibility Table Conflicting operations (from def. of conflict serializability) are not compatible with each other

  35. When to Release Locks T1 Xlock A RA WA Rel A Xlock B RB WB Rel B T2 After each op completes? Or after xaction is done with variable? No! Example of problem T2 sneaks in and updates A and B before T1 updates B Xlock A RA WA Xlock B RB WB Rel A,B This schedule is not serializable

  36. Solution: Two Phase Locking A transaction cannot release any locks until it has acquired all of its locks

  37. Example, Revisited T1 Xlock A RA WA Rel A Xlock B RB WB Rel B T2 Rule: A transaction cannot release any locks until it has acquired all of its locks Not allowed Xlock A RA WA Xlock B RB WB Rel A,B This schedule is not serializable

  38. Example, Revisited Rule: A transaction cannot release any locks until it has acquired all of its locks Serial schedule defined by lock points Where they acquire last lock T1 Xlock A RA WA Xlock B Rel A RB WB Rel B T2 Lock point Acquired all locks so can release Xlock A RA WA Xlock B RB WB Rel A,B Lock point This schedule *is* serializable

  39. Correctness Intuition Once a transaction T reached its lock point: T s place in serial order is set Any transactions that haven't acquired all their locks can t take any conflicting actions until after T releases locks Ordered later Any transactions which already have all their locks must have completed their conflicting actions (released their locks) before T can proceed Ordered earlier

  40. Two Phase Locking (2PL) Protocol Before every read, acquire a shared lock Before every write, acquire an exclusive lock (or "upgrade") a shared to an exclusive lock Release locks only after last lock has been acquired, and ops on that object are finished

  41. Can you think of any potential problems with 2PL?

  42. Refining 2PL Problems: Deadlocks Cascading Aborts How do we know when we are done with all operations on an object?

  43. Deadlocks Possible for Ti to hold a lock Tj needs, and vice versa T1 RA WA RB WB T2 Waits-for graph Cycle Deadlock T1 RB WB T2 T1 waits for T2 T2 waits for T1 RA WA

  44. Complex Deadlocks Are Possible T1 RA WA RB WB T2 T3 RB WB RC RA WA T3 waits for T1 T1 waits for T2 T2 waits for T3 RC WC Waits-for graph Cycle Deadlock T1 T2 T3

  45. Resolving Deadlock Solution: abort one of the transactions Recall: users can abort too T1 RA WA RB WB T2 T3 RB WB RC RA WA T3 waits for T1 T1 waits for T2 Waits-for graph Cycle Deadlock T1 RC WC T2 waits for T3 T2 Equivalent to T2 - T1 T3

  46. Cascading Aborts Problem: if T1 aborts, and T2 read something T1 wrote, T2 also needs to abort T1 Xlock A RA WA Xlock B Rel A RB WB Rel B T2 Xlock A RA WA If T1 aborts here T2 also needs to abort, It reads T1 s write of A Xlock B RB WB Rel A,B

  47. Can you think of a 2PL variant which neither requires deadlock detection nor has cascading aborts?

  48. Strict Two-Phase Locking Can avoid cascading aborts by holding exclusive locks until end of transaction Ensures that transactions never read other transaction s uncommitted data

  49. Strict Two-Phase Locking Protocol Before every read, acquire a shared lock Before every write, acquire an exclusive lock (or "upgrade") a shared to an exclusive lock Release locks only after last lock has been acquired, and ops on that object are finished Release shared locks only after last lock has been acquired, and ops on that object are finished Release exclusive locks only after the transaction commits Ensures cascadeless-ness

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#