Scalable Task Pool with Adjustable Fairness

CAFÉ: Scalable Task Pool with Adjustable
Fairness and Contention
 
 Dmitry Basin, Rui Fan, Idit Keidar, Ofer Kiselov, Dmitri Perelman
Technion,  Israel Institute of Technology
Task Pools
Exist in most server applications:
Web Servers, e.g., building block of SEDA Architecture
Handling asynchronous requests
Ubiquitous programming pattern for parallel programs
Scalability is essential!
Task Pool
Producers
Consumers
Shared Memory
Typical Implementation: FIFO Queue
Has inherent scalability problem
 
Do we really need FIFO ?
In many cases no!
We would like to:
Relax the  requirement
Control the degree of relaxation
CAFÉ: Contention and Fairness Explorer
 
Ordered list of scalable bounded non-FIFO pools
 
TreeContainer size controls contention-fairness trade-off
 
garbage
 collected
 
TreeContainer
 
TreeContainer
 
TreeContainer
 
Java VM
TreeContainer (TC) Specification
 
Bounded container
A put operation can fail if no free space found
A get operation returns a task or null if  TC is empty
Randomized algorithms for put and get operations
TreeContainer Data Structure
Complete binary tree
Free node
Used node
without a task
Occupied node
containing  a task
Right sub-tree
doesn’t  contain
tasks
Left sub-tree has
tasks
Get/Put Operations
 
Step 1: find target node
By navigating the tree
Step 2: perform put/get on that node
Need to handle races
Step 3: update routes
Update meta-data on path to node – tricky!
 
 
 
Task
Get()  
Operation
Get():
 
Start from the root
 
CAS
Close to the root
updates are rare
Step3: Update routes
up to the root
Step1: Navigate to a
task by random walk
on arrows graph
Step 2: Extract the task.
Put()  
Operation
 
Level  0
 
Level  1
 
Level  2
 
Level  3
 
√ free
Task
put()
:
Every level of  the tree
implemented by array
of nodes
Put()  
Operation
Task
put()
:
Go to the highest
free predecessor
 
CAS 
operation
Level  0
Level  1
Level  2
Level  3
√ free
Finished Step 1:
found free node
Step 2: occupy
the free node
Put()  
Operation
put()
:
Close to the root, updates are rare
 
true
 
Update routes
When 
Put()  
Operation Fails
put()
:
 
false
 
k  trials fail
Level  0
Level  1
Level  2
Last Level
Races
 
Concurrency issues are not trivial 
:)
Challenge:
guarantee linearizability
avoid updating all the metadata up to the root upon each
operation
See the paper
TreeContainer properties
 
Put/Get operations are
linearizable
wait-free
 
Under the worst-case thread scheduling:
Good step complexity of puts
When N nodes occupied  - O(log
2
N) whp
Does not depend on TC size
Good tree density (arbitrarily close to 2
h
 whp)
 
CAFÉ Data Structures
 
TC
TC
GT
TC
PT
CAFÉ Data Structures
 
TC
TC
TC
PT
TC
 
TC.Put(task)
 false
Allocate and connect
new TC
 
TC.Put(task)
 true
GT
TC
CAFÉ:
get() 
from Empty TC
TC
TC
TC
PT
TC
 
TC.Get
null
 
TC.Get
task
Garbage collected
CAFÉ: Races
TC
TC
TC
PT
TC
Suspended producer
thread
The task is lost for
consumers
CAFÉ: Handling Races – Try 1
TC
TC
TC
PT
TC
Move GT
back
 
Check if GT bypassed TC
CAFÉ: Races (2)
 
TC
TC
TC
PT
TC
 
TC.Get 
 null
Consumer thread
 
Producer thread
Going  to
move GT
forward
 
TC.Put(task) 
 true
Consumers can access the
task 
 can terminate
Task is lost
CAFÉ: Handling Races – Try 2
TC
TC
TC
PT
TC
Read prev,
If empty read curr
GT
cur>
<prev
Lock-Free. To make it wait-free we do additional tricks.
CAFÉ: Properties
 
Safety:
Put()/Get() 
operations are linearizable
Wait-freedom:
Get() 
operations are deterministically wait-free
Put() 
operations are wait-free with probability 1
Fairness:
Preserves order among trees
Evaluation Setup
 
Compared pools:
LBQ: Java 6  FIFO blocking queue
CLQ: Java 6 FIFO non-blocking queue (M&S)
EDQ: non-FIFO Elimination-Diffraction Tree Queue
 
Evaluation server:
8 AMD Opteron quad-cores 
 total 32 cores
CAFÉ evaluation
Throughput
CAFÉ-13: CAFÉ with tree height 13
LBQ: Java 6  FIFO blocking queue
CLQ: Java 6 FIFO non-blocking queue (M&S)
EDQ: non-FIFO Elimination-Diffraction Tree
Queue
 Throughput as a function of thread number
 
factor  of 30
over lock-free
implementations
CAFÉ evaluation
Throughput
CAFÉ: CAFÉ queue
LBQ: Java 6  FIFO blocking queue
CLQ: Java 6 FIFO non-blocking queue (M&S)
EDQ: non-FIFO Elimination-Diffraction Tree
Queue
.
 CAFÉ throughput as a function of TreeContainer height
CAFÉ evaluation
CAS-failures
 CAS failures per operation as a function of TreeContainer height
Summary
CAFÉ:
Efficient
Wait-Free
With adjustable fairness and contention
Thank you
Slide Note
Embed
Share

Explore CAF, a scalable task pool with adjustable fairness and contention, offering a solution to the inherent scalability problems of FIFO queues. The system allows for control over the level of relaxation, providing more fairness or less contention as needed. With a focus on bounded non-FIFO pools and randomized algorithms for put and get operations, CAF presents a unique approach to managing tasks in shared memory environments.

  • Scalable Task Pool
  • Adjustable Fairness
  • Contentious Explorer
  • Bounded Containers
  • Shared Memory

Uploaded on Aug 22, 2024 | 5 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. CAF: Scalable Task Pool with Adjustable Fairness and Contention Dmitry Basin, Rui Fan, Idit Keidar, Ofer Kiselov, Dmitri Perelman Technion, Israel Institute of Technology

  2. Task Pools Shared Memory Task Pool Producers Consumers Exist in most server applications: Web Servers, e.g., building block of SEDA Architecture Handling asynchronous requests Ubiquitous programming pattern for parallel programs Scalability is essential!

  3. Typical Implementation: FIFO Queue Has inherent scalability problem contention points Shared Memory Do we really need FIFO ? In many cases no! We would like to: Relax the requirement Control the degree of relaxation

  4. CAF: Contention and Fairness Explorer Ordered list of scalable bounded non-FIFO pools Java VM TreeContainer TreeContainer TreeContainer garbage collected TreeContainer size controls contention-fairness trade-off More fairness More contention Less fairness Less contention Tree height 0 Pure FIFO

  5. TreeContainer (TC) Specification Bounded container A put operation can fail if no free space found A get operation returns a task or null if TC is empty Randomized algorithms for put and get operations

  6. TreeContainer Data Structure Complete binary tree Right sub-tree doesn t contain tasks Left sub-tree has tasks Used node without a task Occupied node containing a task Free node

  7. Get/Put Operations Step 1: find target node By navigating the tree Step 2: perform put/get on that node Need to handle races Step 3: update routes Update meta-data on path to node tricky!

  8. TreeContainer Get() Operation Get(): Close to the root updates are rare task by random walk on arrows graph Start from the root Task Step1: Navigate to a Step3: Update routes up to the root Step 2: Extract the task.

  9. TreeContainer Put() Operation Every level of the tree implemented by array of nodes put(): Task Random node Level 0 occupied Random node Level 1 occupied Random node Level 2 occupied Random node Level 3 free

  10. TreeContainer Put() Operation put(): Task Level 0 Step 2: occupy the free node Level 1 Finished Step 1: found free node Level 2 Go to the highest free predecessor Random node Level 3 free

  11. TreeContainer Put() Operation put(): true Close to the root, updates are rare

  12. Races Concurrency issues are not trivial :) Challenge: guarantee linearizability avoid updating all the metadata up to the root upon each operation See the paper

  13. TreeContainer TreeContainer properties Put/Get operations are linearizable wait-free Under the worst-case thread scheduling: Good step complexity of puts When N nodes occupied - O(log2N) whp Does not depend on TC size Good tree density (arbitrarily close to 2h whp)

  14. CAF Data Structures PT GT TC TC TC

  15. CAF Data Structures PT GT TC TC TC TC TC.Put(task) false TC.Put(task) true Allocate and connect new TC

  16. CAF:get get() () from Empty TC PT GT TC TC TC TC TC TC.Get null TC.Get task Garbage collected

  17. CAF: Races PT GT The task is lost for consumers TC TC TC TC Suspended producer thread

  18. CAF: Handling Races Try 1 PT GT Check if GT bypassed TC Move GT back TC TC TC TC

  19. CAF: Races (2) PT GT Consumers can access the task can terminate Task is lost Going to move GT forward TC TC TC TC TC.Get null TC.Put(task) true Consumer thread Producer thread

  20. CAF: Handling Races Try 2 GT PT Read prev, If empty read curr <prev cur> TC TC TC TC Lock-Free. To make it wait-free we do additional tricks.

  21. CAF: Properties Safety: Put()/Get() operations are linearizable Wait-freedom: Get() operations are deterministically wait-free Put() operations are wait-free with probability 1 Fairness: Preserves order among trees

  22. Evaluation Setup Compared pools: LBQ: Java 6 FIFO blocking queue CLQ: Java 6 FIFO non-blocking queue (M&S) EDQ: non-FIFO Elimination-Diffraction Tree Queue Evaluation server: 8 AMD Opteron quad-cores total 32 cores

  23. CAF-13: CAF with tree height 13 CAF evaluation Throughput LBQ: Java 6 FIFO blocking queue CLQ: Java 6 FIFO non-blocking queue (M&S) EDQ: non-FIFO Elimination-Diffraction Tree Queue Throughput as a function of thread number factor of 30 over lock-free implementations

  24. CAF: CAF queue CAF evaluation Throughput LBQ: Java 6 FIFO blocking queue CLQ: Java 6 FIFO non-blocking queue (M&S) EDQ: non-FIFO Elimination-Diffraction Tree Queue . CAF throughput as a function of TreeContainer height

  25. CAF evaluation CAS-failures CAS failures per operation as a function of TreeContainer height

  26. Summary CAF : Efficient Wait-Free With adjustable fairness and contention Thank you

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#