Lock-Free and Wait-Free Algorithms in Concurrent Data Structures

1. Expander: Lock-free Cache for a Concurrent Data Structure
Pooja Aggarwal (IBM Research, Bangalore)
Smruti R. Sarangi (IIT Delhi)

2. Concurrent Object
Several threads operate on a shared concurrent object: each thread executes a method on the object, and each method invocation consists of a request followed by a response on a timeline.
[Figure: Threads 1-3 invoking methods on a concurrent object along a timeline; the correctness criterion is linearizability.]

3. Blocking vs Non-blocking Algorithms

Blocking algorithm with a lock:
    lock
    val = acc.balance
    acc.balance += 100
    unlock
    return val

Non-blocking algorithm:
    val = acc.balance.fetchAndAdd(100)
    return val

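A minimal Java sketch of the two styles; the account classes and their names are illustrative assumptions, not code from the paper:

    import java.util.concurrent.atomic.AtomicLong;
    import java.util.concurrent.locks.ReentrantLock;

    class LockedAccount {
        private long balance = 0;
        private final ReentrantLock lock = new ReentrantLock();

        // Blocking: only the lock holder can touch balance; everyone else waits.
        long deposit(long amount) {
            lock.lock();
            try {
                long val = balance;
                balance += amount;
                return val;
            } finally {
                lock.unlock();
            }
        }
    }

    class NonBlockingAccount {
        private final AtomicLong balance = new AtomicLong(0);

        // Non-blocking: a single atomic fetch-and-add; no thread ever blocks.
        long deposit(long amount) {
            return balance.getAndAdd(amount);
        }
    }
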
4. Few Definitions

Blocking algorithms with locks
    Only one thread is allowed to hold a lock and enter the critical section.
Lock-free algorithms
    There are no locks.
    At any point of time at least one thread makes progress and completes its operation.
Wait-free algorithms
    Every request completes in a finite amount of time.

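The difference between the last two guarantees is easiest to see on a shared counter. A hedged Java sketch (the Counter class is illustrative): the CAS retry loop is only lock-free, because some thread always wins but a particular thread can lose every race; getAndIncrement is wait-free on hardware that provides an atomic fetch-and-add (on a pure CAS-loop implementation it would only be lock-free):

    import java.util.concurrent.atomic.AtomicLong;

    class Counter {
        private final AtomicLong value = new AtomicLong();

        // Lock-free: the loop only fails when another thread succeeded,
        // but an unlucky thread may retry indefinitely.
        long lockFreeIncrement() {
            while (true) {
                long old = value.get();
                if (value.compareAndSet(old, old + 1)) {
                    return old;
                }
            }
        }

        // Wait-free (given a hardware fetch-and-add): completes in a
        // bounded number of steps regardless of other threads.
        long waitFreeIncrement() {
            return value.getAndIncrement();
        }
    }
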
5. Comparison Between Locks, Lock-free, and Wait-free

                                   With Locks    Lock-free    Wait-free
    Disjoint Access Parallelism    No            Yes          Yes
    Starvation Freedom             No            No           Yes
    Finite Execution Time          No            No           Yes

6. Why not make all algorithms wait-free?

The three approaches form a spectrum from locks through lock-free to wait-free:
    Locks are easy to program, but suffer from starvation, blocking, and no disjoint access parallelism.
    Wait-free algorithms avoid these problems, but writing them needs a black belt in programming.

7. Example of Temporary Fields

CAS = CompareAndSet:
    CAS(val, old, new):
        if (val == old)
            val = new;

Without a temporary field:
    while (true) {
        val = acc.balance;
        newval = val + 100;
        if (CAS(acc.balance, val, newval)) break;
    }

With a temporary field, the balance word holds a value and a timestamp that must be read and updated together:
    while (true) {
        <val, ts> = acc.balance;
        newval = val + 100;
        newts  = ts + 1;
        if (CAS(acc.balance, <val,ts>, <newval,newts>)) break;
    }

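In Java, one way to CAS a value together with such a timestamp is AtomicStampedReference, which pairs a reference with an integer stamp. A minimal sketch; the StampedAccount class is illustrative, not the paper's code:

    import java.util.concurrent.atomic.AtomicStampedReference;

    class StampedAccount {
        // The reference holds the balance; the stamp plays the role of the
        // temporary timestamp field and is compared and updated atomically with it.
        private final AtomicStampedReference<Long> balance =
                new AtomicStampedReference<Long>(0L, 0);

        void deposit(long amount) {
            int[] tsHolder = new int[1];
            while (true) {
                Long val = balance.get(tsHolder);   // read <val, ts> together
                long newVal = val + amount;
                int newTs = tsHolder[0] + 1;
                // Succeeds only if both the balance and the timestamp are unchanged.
                if (balance.compareAndSet(val, newVal, tsHolder[0], newTs)) {
                    break;
                }
            }
        }
    }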
 
8. Packing

The temporary fields are packed together with the value into a single memory word:
    | Value | Field1 | Field2 | Field3 |    <- one memory word

The size of a field is restricted by the size of the memory word.

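A hedged Java sketch of packing: the value and two small fields share one 64-bit word that can be CAS'd with AtomicLong. The 48/10/6-bit layout and the field names are illustrative assumptions; the point is that every field must fit inside the word.

    import java.util.concurrent.atomic.AtomicLong;

    class PackedWord {
        // Illustrative layout: | 48-bit value | 10-bit timestamp | 6-bit thread id |
        private static final int  TS_SHIFT  = 6;
        private static final int  VAL_SHIFT = 16;
        private static final long TID_MASK  = (1L << 6) - 1;
        private static final long TS_MASK   = (1L << 10) - 1;

        private final AtomicLong word = new AtomicLong();

        static long pack(long value, int ts, int tid) {
            return (value << VAL_SHIFT) | ((ts & TS_MASK) << TS_SHIFT) | (tid & TID_MASK);
        }

        static long value(long packed) { return packed >>> VAL_SHIFT; }
        static int  ts(long packed)    { return (int) ((packed >>> TS_SHIFT) & TS_MASK); }
        static int  tid(long packed)   { return (int) (packed & TID_MASK); }

        long read() { return word.get(); }

        // One CAS atomically checks and updates the value and both fields.
        boolean casAll(long expectedPacked, long newValue, int newTs, int newTid) {
            return word.compareAndSet(expectedPacked, pack(newValue, newTs, newTid));
        }
    }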
 
9. Redirection

Instead of packing, the memory word stores a pointer to a separate object that holds the value and the temporary fields:
    Pointer -> Object { Value | Field1 | Field2 | Field3 }

This removes the size restriction, but causes a lot of memory wastage.

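A hedged Java sketch of redirection: an AtomicReference points to an immutable snapshot object carrying the value and its fields, so a single CAS on the pointer updates everything at once. The class and field names are illustrative assumptions.

    import java.util.concurrent.atomic.AtomicReference;

    class RedirectedWord {
        // Immutable snapshot of the value plus its temporary fields.
        static final class State {
            final long value;
            final long timestamp;
            final int  threadId;
            State(long value, long timestamp, int threadId) {
                this.value = value;
                this.timestamp = timestamp;
                this.threadId = threadId;
            }
        }

        // The "memory word" is now just a pointer to the current State object.
        private final AtomicReference<State> word =
                new AtomicReference<State>(new State(0, 0, -1));

        State read() { return word.get(); }

        // Every update allocates a fresh State and swings the pointer;
        // this extra per-word object is where the memory wastage comes from.
        boolean casAll(State expected, long newValue, long newTs, int newTid) {
            return word.compareAndSet(expected, new State(newValue, newTs, newTid));
        }
    }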
 
10. Examples of Packing and Redirection

Packing
    Paper                       Authors            Temporary Fields
    Wait-free multiword CAS     Sundell            index, thread id, descriptor
    Universal construction      Anderson et al.    thread id, valid bit, count
    Wait-free slot scheduler    Aggarwal et al.    request id, thread id, round, timestamp, slot number, state

Redirection
    Paper                       Authors            Temporary Fields
    Wait-free queue             Kogan et al.       enqueue id, dequeue id
    Wait-free priority queue    Israeli et al.     value, type, freeze bit
    Wait-free linked list       Timnat et al.      mark bit and success bit
 
11. Idea of the Expander

The Expander sits between the program and the memory space and works as a cache:
    It caches the value of a memory word along with its temporary fields.
    When the word is no longer required, its value is flushed to memory and the temporary fields are removed.

Advantages: any number of temporary fields with arbitrary sizes. This makes packing feasible and eliminates the memory overhead of redirection.

12. Design of the Expander

A memory word is hashed to a bucket of a hash table (listHead). Each bucket holds a list of nodes of type MemCell with the following fields:
    memIndex   - the memory word that the node caches
    value      - the cached value of the word
    tmpFields  - the temporary fields attached to the word
    dataState  - the node's version and state
    next       - link to the next node in the list (a reference carrying a mark bit and a timestamp)

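A rough Java sketch of what such a node might look like. The grouping below (value and temporary fields folded into an atomically replaceable dataState snapshot, a markable next pointer) is inferred from the slide and the kCAS flow, not taken from the paper's code:

    import java.util.concurrent.atomic.AtomicMarkableReference;
    import java.util.concurrent.atomic.AtomicReference;

    class MemCell {
        enum State { CLEAN, DIRTY, WRITEBACK, FLUSH }

        // Immutable <version, state, value, tmpFields> snapshot, replaced in one CAS.
        static final class DataState {
            final long version;
            final State state;
            final long value;
            final long[] tmpFields;
            DataState(long version, State state, long value, long[] tmpFields) {
                this.version = version;
                this.state = state;
                this.value = value;
                this.tmpFields = tmpFields;
            }
        }

        final long memIndex;                              // which memory word is cached
        final AtomicReference<DataState> dataState;       // current version/state/value/fields
        final AtomicMarkableReference<MemCell> next;      // list link with a logical-delete mark

        MemCell(long memIndex, DataState initial) {
            this.memIndex = memIndex;
            this.dataState = new AtomicReference<DataState>(initial);
            this.next = new AtomicMarkableReference<MemCell>(null, false);
        }
    }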
 
13. Basic Operations

kGet  - Get the value of a memory word along with its temporary fields.
kSet  - Set the value of a memory word and its fields.
kCAS  - Compare the value and all the fields, and if all of them match, do a kSet.
free  - Remove the word's entry from the Expander.

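In Java terms the interface might look roughly as follows; the signatures and the ExpandedValue holder are assumptions for illustration, not the paper's API:

    // Hypothetical Java-style interface for the four operations.
    interface ExpanderApi {
        // Value of word memIndex together with its temporary fields.
        ExpandedValue kGet(long memIndex);

        // Atomically set the value and all the temporary fields of the word.
        void kSet(long memIndex, long value, long[] tmpFields);

        // Compare the current value and every field with the expected ones;
        // if they all match, perform the kSet and return true.
        boolean kCAS(long memIndex, ExpandedValue expected, long newValue, long[] newTmpFields);

        // Flush the value back to memory and remove the word's entry.
        boolean free(long memIndex);
    }

    final class ExpandedValue {
        final long value;
        final long[] tmpFields;
        ExpandedValue(long value, long[] tmpFields) {
            this.value = value;
            this.tmpFields = tmpFields;
        }
    }
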
14. FSM of an Expander's Node

A node moves between four states: CLEAN, DIRTY, WRITEBACK, and FLUSH.
    kSet and kCAS take a node from CLEAN to DIRTY; kGet does not change the state.
    free moves the node to WRITEBACK, and after the write-back the node enters FLUSH and is deleted.

15. kGet

Is the word present in the Expander?
    Yes: return the cached value and the temporary fields.
    No:  read the value from memory and use default values for the temporary fields.

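A minimal sketch of this flow, with a ConcurrentHashMap standing in for the Expander's lock-free hash table; the entry layout {value, field1, field2, ...} and the names used here are assumptions:

    import java.util.concurrent.ConcurrentHashMap;

    class KGetSketch {
        // Each entry is stored as {value, field1, field2, ...}.
        private final ConcurrentHashMap<Long, long[]> table = new ConcurrentHashMap<Long, long[]>();
        private final long[] memory;                  // stand-in for the backing memory space
        private static final int NUM_FIELDS = 2;

        KGetSketch(long[] memory) { this.memory = memory; }

        long[] kGet(long memIndex) {
            long[] cached = table.get(memIndex);
            if (cached != null) {
                return cached;                        // the word is present in the Expander
            }
            // Not cached: read the plain value from memory; default fields are zero.
            long[] result = new long[1 + NUM_FIELDS];
            result[0] = memory[(int) memIndex];
            return result;
        }
    }
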
16. kCAS

    node <- lookUpOrAllocate()
    If node.state == FLUSH: help delete the node.
    Otherwise, check whether the value and all the fields match the expected ones:
        If they do not match, return false.
        If they match, create a new version with the state set to DIRTY and try to install it as the new dataState.
        If the installation is not successful return false, else return true.

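A hedged Java sketch of this control flow on a single node, reusing the DataState idea from the MemCell sketch above; helpDelete() is left as a placeholder, and returning false so that the caller retries after helping is an assumption:

    import java.util.Arrays;
    import java.util.concurrent.atomic.AtomicReference;

    class KCasSketch {
        enum State { CLEAN, DIRTY, WRITEBACK, FLUSH }

        static final class DataState {
            final long version; final State state; final long value; final long[] fields;
            DataState(long version, State state, long value, long[] fields) {
                this.version = version; this.state = state; this.value = value; this.fields = fields;
            }
        }

        static final class Node {
            final AtomicReference<DataState> dataState;
            Node(DataState d) { dataState = new AtomicReference<DataState>(d); }
        }

        boolean kCAS(Node node, long expValue, long[] expFields, long newValue, long[] newFields) {
            DataState ds = node.dataState.get();
            if (ds.state == State.FLUSH) {        // the node is being removed
                helpDelete(node);                 // help finish the deletion; caller retries
                return false;
            }
            // Compare the value and every temporary field against the expected ones.
            if (ds.value != expValue || !Arrays.equals(ds.fields, expFields)) {
                return false;
            }
            // Build a new version with state DIRTY and try to install it atomically.
            DataState updated = new DataState(ds.version + 1, State.DIRTY, newValue, newFields);
            return node.dataState.compareAndSet(ds, updated);
        }

        private void helpDelete(Node node) { /* omitted in this sketch */ }
    }
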
17. free

    If the state is already WRITEBACK or FLUSH: return false.
    Otherwise:
        set state = WRITEBACK
        write the value back to memory
        set state = FLUSH
        help delete the node

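A sketch of the free flow in the same style, reusing the Node and DataState classes from the kCAS sketch above; how the real algorithm publishes the state changes and what it returns on success are assumptions here:

    class FreeSketch {
        boolean free(KCasSketch.Node node, long[] memory, long memIndex) {
            KCasSketch.DataState ds = node.dataState.get();
            if (ds.state == KCasSketch.State.WRITEBACK || ds.state == KCasSketch.State.FLUSH) {
                return false;                       // another thread is already freeing this node
            }
            // Move to WRITEBACK so that later kSet/kCAS operations stop modifying the node.
            KCasSketch.DataState wb = new KCasSketch.DataState(
                    ds.version + 1, KCasSketch.State.WRITEBACK, ds.value, ds.fields);
            if (!node.dataState.compareAndSet(ds, wb)) {
                return false;
            }
            memory[(int) memIndex] = wb.value;      // flush the cached value back to memory
            // Mark the node FLUSH, then help remove it from the hash table.
            KCasSketch.DataState fl = new KCasSketch.DataState(
                    wb.version + 1, KCasSketch.State.FLUSH, wb.value, wb.fields);
            node.dataState.compareAndSet(wb, fl);
            helpDelete(node);
            return true;
        }

        private void helpDelete(KCasSketch.Node node) { /* omitted in this sketch */ }
    }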
 
18. Proofs and Example

All methods of the Expander are linearizable and lock-free.
For a special class of wait-free algorithms, in which a request is guaranteed to complete once at least one thread handling it has completed N steps, the algorithm remains wait-free with the Expander.
The paper shows a wait-free queue built with the Expander; the code changes amount to only 8 lines.

19. Evaluation Setup

Machine: Dell PowerEdge R820 server with four sockets; each socket has a 16-thread 2.2 GHz Xeon chip and a 16 MB L2 cache; 64 GB of main memory.
Software: Ubuntu 12.10 and Java 1.7; memory usage is measured with Java's built-in totalMemory and freeMemory calls.

Packing benchmarks:
    Wait-free multi-word compare-and-set
    Wait-free slot scheduler
    Slot scheduler for SSD devices (RADIR)

Redirection benchmarks:
    Wait-free queue
    Lock-free linked list and binary search tree
    Lock-free skiplist

Configurations use 1-6 temporary fields (an additional 1-62 bits).

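For reference, the totalMemory and freeMemory calls mentioned above live on java.lang.Runtime. A small sketch of one way to measure a structure's footprint with them; the measurement loop shown here is an assumption, not the paper's harness:

    public class MemoryFootprint {
        static long usedHeapBytes() {
            Runtime rt = Runtime.getRuntime();
            rt.gc();                                    // best effort: reduce garbage noise
            return rt.totalMemory() - rt.freeMemory();  // currently used heap
        }

        public static void main(String[] args) {
            long before = usedHeapBytes();
            long[][] structure = new long[100000][4];   // stand-in for the structure under test
            long after = usedHeapBytes();
            System.out.println("Approx. footprint of " + structure.length
                    + " nodes: " + (after - before) + " bytes");
        }
    }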
 
20. Slowdown (with 64 threads)
[Figure: slowdown of the Expander-based implementations; with 32 threads the slowdown is 2-20%.]
 
21. Reduction in Memory Usage (Redirection)
[Figure: the Expander gives a 5-35% reduction in the memory footprint.]
 
22. Conclusions

The Expander has many advantages:
    It makes algorithms that rely on packing feasible.
    It reduces the memory footprint of algorithms that rely on redirection by up to 35%.
    It requires only minimal modifications to the code.
    Most wait-free algorithms remain wait-free with the Expander.
