Effective Program Verification for Relaxed Memory Models
Memory model vulnerabilities arise when programmers do not strictly adhere to locking disciplines in performance-critical code, leading to data races that can break on relaxed memory models. This can be challenging to detect and analyze due to the complexities introduced by both compilers and hardware. Examples and explanations provided in the content illustrate these vulnerabilities and the importance of understanding memory models for effective program verification techniques.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Effective Program Verification for Relaxed Memory Models Sebastian Burckhardt Madanlal Musuvathi Microsoft Research CAV, July 10, 2008
Motivation: Memory Model Vulnerabilities Programmers do not always follow strict locking discipline in performance-critical code Ad-hoc synchronization with normal loads and stores or interlocked operations is faster Result: benign or intentional data races Such code can break on relaxed memory models Most multicore machines are not sequentially consistent Both compilers and actual hardware can contribute to effect Vulnerabilities are hard to find, reproduce, and analyze May require specific hardware configuration and schedule 2
C# Example volatile bool isIdling; volatile bool hasWork; //Consumer thread void BlockOnIdle(){ lock (condVariable){ isIdling = true; if (!hasWork) Monitor.Wait(condVariable); isIdling = false; } } //Producer thread void NotifyPotentialWork(){ hasWork = true; if (isIdling) lock (condVariable) { Monitor.Pulse(condVariable); } } 3
Example: Store Buffer Vulnerability Key pieces of code on previous slide: volatile int ii = 0; volatile int hw = 0; Consumer Producer Store ii, 1 Store ii, 1 Load hw, 0 Store hw, 1 Load ii, 1 0 On x86, hardware may perform store late Bug: Producer thread does not notice waiting Consumer, does not send signal 4
Abstract View of Memory Models Given a program P, a memory model Y defines the subset TP,Y T of traces corresponding to some (partial or complete) execution of P on Y. T TP, SC TP, Y SC (sequential consistency) Is strongest memory model More executions may be possible on a relaxed memory model Y 5 5
Example: TSO Under TSO, processors can buffer stores in FIFO queue. T TP, SC TP, TSO 1.1 Store ii, 1 2.1 Store hw, 1 Trace corresponding to code on slide 4 1.2 Load hw, 0 2.2 Load ii, 0 6 6
Why TSO? Memory models are platform dependent & ridden with details RMO We focus on TSO because it models store buffers, the most common relaxation PSO TSO z6 SC Alpha IA-32 IA-64 In practice, TSO is almost the same as the x86 hardware model 7
Model Checking Programs on Relaxed Memory Models Covering all relaxed executions is challenging Highly nondeterministic (exposed to low-level hardware concurrency) Memory models are usually not finite-state Memory models are often a matter of negotiation (formal descriptions are the exception) State of the art has limited scalability Model checking using simplified operational models Bounded model checking using axiomatic models (CheckFence) 8
Memory Model Safety Observation: Programmer writes code for SC Resorts to {locks, fences, volatiles, interlocked operations} to maintain SC behavior where needed If program P exhibits non-SC behavior, it is most likely a bug Definition: A program P is Y-safe if TP,SC = TP,Y 9
Decomposed Program Verification on Relaxed Memory Models TP, SC T TP, Y 1. Verify sequentially consistent executions (show that all executions in TP,SC are correct) 2. Verify memory model safety (show that TP,SC = TP,Y ) Can we do 1 and 2 at the same time? Yes. 10
Borderline Executions Def.: A borderline execution for P is an execution with a successor in TP,TSO - TP,SC TP,SC TP,TSO Thm.: A program P is TSO-safe if and only if it has no borderline executions. 11
Borderline Executions Def.: A borderline execution for P is an execution with a successor in TP,TSO - TP,SC We can verify / falsify this as a safety property of sequentially consistent executions! TP,SC TP,TSO Thm.: A program P is TSO-safe if and only if it has no borderline executions. 12
Example: TSO Borderline Execution TP, SC 1.1 Store ii, 1 2.1 Store hw, 1 TP, TSO 1.2 Load hw, 0 1.1 Store ii, 1 2.1 Store hw, 1 1.2 Load hw, 0 2.2 Load ii, 1 1.1 Store ii, 1 2.1 Store hw, 1 1.2 Load hw, 0 2.2 Load ii, 0 Successor traces are traces with one more instruction. 13
Sober Tool Structure Event Stream (shared memory accesses, sync ops) Instrumented Program Borderline Monitor Scheduler Enumerates Traces Stateless Model Checker (CHESS) Outputs: (1) P correct (2) P not TSO-safe (+cex) (3) P has SC-bug (+cex) Program output is always sound. Tool may not terminate exploration if # of executions is too large. 14
Define SC using hb relation Trace = Set of Instructions (Vertices) with attributes [processor]. [issue index] [operation] [address], [coherence index] coh.index is the position of the value within the sequence of values written to the same location (i.e., we replace each value with its sequence number ) Add edges: program order p /conflict order c Define happens-before order hb = ( p c) Trace is sequentially consistent if and only if hb is acyclic. This trace is SC: This trace is not SC: 1.1 Store ii, 1 1.1 Store ii, 1 1.2 Load hw, 0 1.2 Load hw, 0 2.1 Store hw, 1 2.1 Store hw, 1 2.2 Load ii, 1 2.2 Load ii, 0 15
Define TSO by Relaxing hb Define relaxed happens-before order rhb = ( p c) \ { (s,l) | s is store, l is load, and s p l } Trace is possible on TSO if and only if (1) rhb is acyclic (2) there do not exist s, l such that s p l and l c s This trace is TSO, but not SC: Thm.: Def. Is equivalent to operational TSO model (see Tech Report) 1.1 Store ii, 1 1.2 Load hw, 0 2.1 Store hw, 1 2.2 Load ii, 0 1.1 Store ii, 1 1.1 Store ii, 1 1.2 Load hw, 0 1.2 Load hw, 0 2.1 Store hw, 1 2.1 Store hw, 1 hb rhb 2.2 Load ii, 0 2.2 Load ii, 0 16
Borderline Monitor Implementation Receiving a stream of memory accesses: Record all stores to all locations. For each load L, check if there exists a reordering of L with prior stores to the same location such that (1) hb has a cycle (2) rhb is acyclic (3) there do not exist s, l such that s p l and l c s Implementation: use standard vector clock to compute hb , and custom vector clock (twice the width) to compute rhb 17
Equivalent Interleavings Typically, many different interleavings map to the same (Mazurkiewic) trace. By construction, our monitor is insensitive to the choice of interleaving Checks all hb -equivalent ones simultaneously Makes it compatible with partial order reduction Improves probability of finding bugs 18
Results Good at finding bugs even if only a small number of schedules is explored Monitor checks all hb-equivalent interleavings Chess heuristic (iterative context bounding) seems to mix well Found expected store buffer vulnerabilities in standard examples (Dekker, Bakery) Detected 2 store buffer vulnerabilities in a production-level concurrency library. Overall code size ~ 33 kloc Used existing test harness written by product team (slightly adapted for use with CHESS) Bugs not previously known 19
Some Numbers program name Fig. 1(b) dekker (2 threads, 2 crit-sec) (loc 82) context bound 1 2 3 4 5 0 1 2 3 # interleavings total borderline 10 5 36 183 1,219 8,472 1 25 742 12,436 time [s] < 0.1 < 0.1 < 0.1 < 0.1 < 0.1 < 0.1 < 0.1 < 0.1 < 0.1 < 0.1 ver. time [s] SoBeR < 0.2 < 0.2 0.39 1.9 13.2 106.0 < 0.2 0.47 10.3 189.0 CHESS < 0.2 < 0.2 0.37 4 4 23 50 124 349 1.8 13.0 100.6 < 0.2 0.43 9.8 181.0 bakery (2 threads, 3 crit-sec) (loc 122) 1 20 533 8,599 takequeue (2 threads, 6 ops) (loc 374) 0 1 2 3 4 5 3 0 n.a. 0.34 0.43 0.74 0.84 0.86 < 0.3 0.72 5.2 28.9 125.5 481.5 < 0.3 0.69 4.9 27.8 118.9 461.6 47 14 189 402 2,318 9,147 29,821 1,197 5,321 17,922 20
Conclusion With increasing use of multicores, more and more programs are likely to exhibit failures caused by the memory model. Such failures are hard to find by conventional means (code inspection, testing). Our combination of borderline monitor & stateless model checking makes it practical to detect memory model safety violations in a unit test environment. 21
Future Work Run on larger programs (runtime verification) Handle more memory models Which memory models guarantee borderline executions? Prove memory model safety of concurrent data type implementations Develop borderline monitors for other relaxed concurrent APIs Transactional memory Concurrency Libraries 22