Tackling Store Buffers in TSO Analysis

 
Getting Rid of Store-Buffers in
Getting Rid of Store-Buffers in
TSO Analysis
TSO Analysis
 
 
 
Mohamed Faouzi Atig
Mohamed Faouzi Atig
Uppsala University, Sweden
Uppsala University, Sweden
 
 
Ahmed Bouajjani
Ahmed Bouajjani
 LIAFA, University of Paris 7, France
 LIAFA, University of Paris 7, France
 
                       Gennaro Parlato                    
                       Gennaro Parlato                    
University of Southampton, UK
University of Southampton, UK
 
Sequential consistency memory model 
(SC)
 
T
1
Shared
Memory
 
T
n
 
 
Total Store Ordering 
(TSO)
(x
4
) (z
7
) (y
3
)
 
T
1
 
M
1
Shared
Memory
(z
4
) (y
4
)
 
T
n
 
M
n
 
 
 
Correct under SC -- Wrong  under TSO
Dekker’s mutual exclusion protocol
Thread 1
a: y:=1
b: r
1
:=x
c: if (r
1
==0) then
d: critical section
Thread 2
1: x:=1
2: r
2
:=y
4: if (r
2
==0) then
4: critical section
 
 
 
Verification for TSO?
 
For finite state  programs
For finite state  programs
reachability is non-primitive recursive
reachability is non-primitive recursive
[Atig, Bouajjani, Burckhardt, Masuvathi – POPL’10]
[Atig, Bouajjani, Burckhardt, Masuvathi – POPL’10]
What shall we do?
What shall we do?
Symbolic representation of the store buffers?
Symbolic representation of the store buffers?
  
  
[Linden, Wolper—SPIN’10]
[Linden, Wolper—SPIN’10]
: Regular model-checking
: Regular model-checking
Our approach reduce the analysis from TSO to SC
Our approach reduce the analysis from TSO to SC
can be done only with approximations …
can be done only with approximations …
 
 
 
What is this talk about
 
If we restrict to only executions where each thread is executed at most
k
 times with no interruption (for a fixed 
k
)
 
we can translate any concurrent program P
TSO
 (recursion, thread
creation, heap, …) into another program P
SC  
s.t.
 
P
SC
 (under SC) simulates all possible executions of P
TSO
  (under TSO)
where each thread is executed at most 
k
 times
P
SC
 has no buffer at all!  Simulation of the store-buffers using 
2k
 copies of
the shared variables as locals
P
SC
 has linear size in the size of P
TSO
Advantage: use off-the-shelf SC tools for the analysis of TSO programs
 
 
 
Code-to-code translation
from TSO to SC
 
 
k-round (for each thread) reachability
 
Run    =      (T
Run    =      (T
i1+
i1+
+M
+M
i1
i1
)
)
+      
+      
(T
(T
i2+
i2+
+M
+M
i2
i2
)
)
+ 
+ 
   ...
   ...
 
 
                      round P
                      round P
i1
i1
      round P
      round P
i2
i2
 
A k-round run :  Ɐi   # round P
A k-round run :  Ɐi   # round P
i
i
 ≤ k
 ≤ k
 
P
i
 
P
1
Compositional reasoning
[(T
i 
+M
i
)*]
k
 
Getting rid of store-buffers
 
is a copy of the shared
vars  (as locals)
 
is a copy of the shared
vars as Boolean
(as  locals)
 
Mask
i
 
Buff
i
 
Invariant
:
 
Mask
0
 
Buff
0
 
Buff
1
 
Buff
2
 
Mask
1
 
Mask
2
 
store-buffer
Simulation
Buff
i
Skeleton of the translation
Shared sh_vars;
Shared sh_vars;
Thread_i()
Thread_i()
Begin
Begin
   locals l_vars;
   locals l_vars;
   stmt_1;
   stmt_1;
   stmt_2;
   stmt_2;
   stmt_n;
   stmt_n;
end
end
 
 
r_TSO, r_SC, sim, Mask
0 , 
Buff
0, …,
Mask
k , 
Buff
k
;
 
Init();  // initialize Masks to False, r_SC=0, r_TSO, sim=0;
 
Characteristics of the translation
 
For fixed k, P
SC
 is linear in the size of P
TSO
2k copies of the shared variable as locals (no store-buffer)
P
SC
 and P
TSO 
are in the same class
no restriction on the programs is imposed
The reachable shared states are the same in P
SC
 and P
TSO
Bounding Store Ages
Observation:
When 
r_SC =1
 
(Mask
0
,  Buff
0
)
are not used any longer
Reuse the Mask and Queue variables:
Translation:
 
(Mask
j
 , Buff
j
)
 
are
used circularly (modulo k+1).
k store-ages
:
Unbounded rounds!
Constraint
: each write pair remains
in the store-buffer for at most 
k
rounds
 
How can we use this
code-to-code
translation?
 
Corollaries
Decidability results for TSO reachability
Decidability results for TSO reachability
Our code-to-code translation is a linear reduction TSO -> SC.    Inherit decidability from SC
Our code-to-code translation is a linear reduction TSO -> SC.    Inherit decidability from SC
 
Tools for SC 
 Tools for TSO
(our code-to-code translation as a plug-in)
 
A convenient way to get new tools for TSO …
 
SC tools:
SC tools:
Bounded model checking:
Bounded model checking:
ESBMC (FSE’11)
ESBMC (FSE’11)
Poirot (by MSR)
Poirot (by MSR)
Storm (CAV’09)
Storm (CAV’09)
Boolean Programs:
Boolean Programs:
Boom, Boppo
Boom, Boppo
G
G
ETAFIX
ETAFIX
 (PLDI’09)
 (PLDI’09)
jMoped [SPIN’08]
jMoped [SPIN’08]
CHESS (MSR)
CHESS (MSR)
Sequentialization + sequ. tools
Sequentialization + sequ. tools
 
Experiments
 
P
OIROT
: SMT-based bounded model-checkers for SC  programs
 
Errors due to TSO discovered in few seconds!
P
OIROT
 can also be a model-checker for TSO!
 
Conclusions
 
Conclusions
 
We have proposed a code-to-code translation from TSO to SC
We have proposed a code-to-code translation from TSO to SC
allows to use existing and future tools designed for SC
allows to use existing and future tools designed for SC
to analyze programs running under TSO
to analyze programs running under TSO
under-approximation (error finding)
under-approximation (error finding)
restrictions imposed on the analyzed runs is useful to
restrictions imposed on the analyzed runs is useful to
find errors in programs
find errors in programs
 
 
Beyond TSO ? Generic approach ?
Beyond TSO ? Generic approach ?
Thanks!
Thanks!
Slide Note
Embed
Share

The discussion revolves around the management of store buffers in Total Store Ordering (TSO) analysis, contrasting Sequential Consistency (SC) memory model. Different protocols and approaches like Dekker's mutual exclusion and verification techniques for TSO are explored. An innovative method of reducing TSO analysis to SC with approximations is addressed, focusing on symbolic representation of store buffers and finite state program reachability. Moreover, a simulation technique is proposed, translating concurrent programs in TSO to SC for effective analysis leveraging off-the-shelf tools.

  • Store Buffers
  • TSO Analysis
  • SC Memory Model
  • Verification Techniques
  • Dekkers Mutual Exclusion

Uploaded on Sep 21, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Getting Rid of Store-Buffers in TSO Analysis Mohamed Faouzi Atig Uppsala University, Sweden Ahmed Bouajjani LIAFA, University of Paris 7, France Gennaro Parlato University of Southampton, UK

  2. Sequential consistency memory model (SC) T1 Shared Memory Tn Write(var,val): sh_mem[var] val; (immidialy visible to all threads Read(var): returns sh_mem[val]; SC= actions of different threads interleaved in any order action of the same thread maintain the execution order WMM= For performance reason modern multi-processors reorder memory operations of the same thread

  3. Total Store Ordering (TSO) M1 T1 (x 4) (z 7) (y 3) Shared Memory Mn Tn (z 4) (y 4) Each thread has its store-buffer (FIFO) Write(var,val): the pair (var val) is sent to the buffer Memory update = execution of a Write taken from some buffer Read(var) returns val - If (var val) the last value written into var still in the store-buffer - the buffer does not contain any Write to var, and sh_mem(var) = val fence requires that the store-buffer is empty

  4. Correct under SC -- Wrong under TSO Dekker s mutual exclusion protocol Thread 1 a: y:=1 b: r1:=x c: if (r1==0) then d: critical section y 1 x y 0 0 Thread 2 1: x:=1 2: r2:=y 4: if (r2==0) then 4: critical section x 1 Bad Schedule for TSO: a b c d 1 2 3 4 both threads in the critical section!!!

  5. Verification for TSO? For finite state programs reachability is non-primitive recursive [Atig, Bouajjani, Burckhardt, Masuvathi POPL 10] What shall we do? Symbolic representation of the store buffers? [Linden, Wolper SPIN 10]: Regular model-checking Our approach reduce the analysis from TSO to SC can be done only with approximations

  6. What is this talk about If we restrict to only executions where each thread is executed at most k times with no interruption (for a fixed k) we can translate any concurrent program PTSO (recursion, thread creation, heap, ) into another program PSC s.t. PSC (under SC) simulates all possible executions of PTSO (under TSO) where each thread is executed at most k times PSC has no buffer at all! Simulation of the store-buffers using 2k copies of the shared variables as locals PSC has linear size in the size of PTSO Advantage: use off-the-shelf SC tools for the analysis of TSO programs

  7. Code-to-code translation from TSO to SC

  8. k-round (for each thread) reachability M1 T1 P1 Shared Memory Mi Ti Pi Run = (Ti1++Mi1)+ (Ti2++Mi2)+ ... round Pi1 round Pi2 A k-round run : i # round Pi k

  9. Compositional reasoning [(Ti +Mi)*]k round0 (Mask0 Buff0) round1 (Mask1 Buff1) round2 (Mask2 Buff2)

  10. Getting rid of store-buffers Maski x y z is a copy of the shared vars as Boolean (as locals) (Mask0 Buff0) (Mask1 Buff1) Buffi x - y 6 z - (Mask2 Buff2) is a copy of the shared vars (as locals)

  11. Invariant: x y z x 3 0 0 y 5 - 1 z - - 4 Buff0 Buff1 Buff2 Mask0 Mask1 Mask2 store-buffer round 0 round 1 round 2 4) (y 7) (x 0) (x 4) (x 7) (x 3) (x 7) (y 5) (x 0) (y 1) (z at each time in the simulation Maski [var]=1 iff there is a store in the store-buffer for var that update the Shared memory at round i Buffi[var] containts the last value sent for var

  12. Simulation Before simulation: Masks set to False r_SC 0; r_TSO 0; (Mask0 Buff0) Simulation: All statements not involving shared vars are executed (Mask1 Buff1) Write(var,val) Maskr_TSO[var] T; Queuer_TSO[var] val; (Mask2 Buff2) Read(var) Let i be the greatest index s.t. i>=r_SC & Maski(var) =1 End of round : (Update shared vars): if i>=0 return Queuei[var] else return var ; For all var if Maskr_SC (var) ==1 var Buffr_SC [var]; round round Buffi 1 round 0 2

  13. Skeleton of the translation before(){ // start round if (!sim){ lock; sim=1; r_SC++; if (r_TSO< r_SC) r_TSO=r_SC; } while(*) r_TSO++; } Shared sh_vars; Thread_i() Begin r_TSO, r_SC, sim, Mask0 , Buff0, ,Maskk , Buffk; locals l_vars; Init(); // initialize Masks to False, r_SC=0, r_TSO, sim=0; stmt_1; stmt_j before(); stmt_j; after(); stmt_2; after() { if(*) //end round Update_shared(r_SC, Mask, Queue) sim=0; unlock; } stmt_n; end

  14. Characteristics of the translation For fixed k, PSC is linear in the size of PTSO 2k copies of the shared variable as locals (no store-buffer) PSC and PTSO are in the same class no restriction on the programs is imposed The reachable shared states are the same in PSC and PTSO A state S is reachable in PTSO with at most k rounds per thread iff S is reachable in PSC

  15. Bounding Store Ages Observation: When r_SC =1 (Mask0, Buff0) are not used any longer Reuse the Mask and Queue variables: (Mask0 Buff0) Translation: (Maskj , Buffj) are used circularly (modulo k+1). (Mask1 Buff1) k store-ages: (Mask2 Buff2) Unbounded rounds! Constraint: each write pair remains in the store-buffer for at most k rounds (Mask0 Buff0)

  16. How can we use this code-to-code translation?

  17. Corollaries Decidability results for TSO reachability Our code-to-code translation is a linear reduction TSO -> SC. Inherit decidability from SC schedules (k fixed) Concurrent Boolean Prog. no recursion Complexity References k-store-ages Pspace [Qadeer, Rehof TACAS 05] k context- switches k round-robin Recursion Exptime [Lal, Reps CAV 08] [La Torre, P., Madhusudan CAV 09] [La Torre, P., Madhusudan CAV 10] Recursion Finite # threads |parameterized recursion thread-creation Exptime [Atig, Bouajjani, Qadeer TACAS 09] k-rounds per thread 2-Expspace [Emmi, Qadeer, Rakamaric POPL 11] k-delay bound recursion thread- creation recursion thread-creation Exptime [Bouajjani, Emmi, P. SAS 11] k-compositional Exptime

  18. Tools for SC Tools for TSO (our code-to-code translation as a plug-in) A convenient way to get new tools for TSO Concurrent Program

  19. Experiments Mutual exclusion Protocols POIROT (by MSR) Loop unrolling: 2 D stands for Delay bound No fences (buggy for TSO) D=1 7 s 26 s 5 s 8 s With fences (correct for TSO) D=1 D=2 6 s 110 s 6 s 6 s Dekker Lamport Peterson Szymanski 72 s 1608 s 47 s 978 s POIROT: SMT-based bounded model-checkers for SC programs Errors due to TSO discovered in few seconds! POIROT can also be a model-checker for TSO!

  20. Conclusions

  21. Conclusions We have proposed a code-to-code translation from TSO to SC allows to use existing and future tools designed for SC to analyze programs running under TSO under-approximation (error finding) restrictions imposed on the analyzed runs is useful to find errors in programs Beyond TSO ? Generic approach ? Thanks!

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#