Understanding Probabilistic Concurrency Testing for Bug Detection

Slide Note
Embed
Share

Explore the concept of probabilistic concurrency testing and how randomized scheduling algorithms can help detect bugs efficiently. Learn about bug depth, randomized algorithms, and the development of PCT to improve the effectiveness of stress testing tools like Cuzz.


Uploaded on Sep 24, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs Sebastian Burckhardt Microsoft Research Madanlal Musuvathi Microsoft Research Pravesh Kothari Indian Institute of Technology, Kanpur Santosh Nagarakatte University of Pennsylvania

  2. What is Concurrency Testing? Whether a test finds a bug depends on the configuration the inputs the schedule Concurrency bugs are bugs that surface only for some schedules The Concurrency Testing Problem How to cover buggy schedules as best we can? Testing all schedules is infeasible!

  3. Idea: Randomize the Schedule Child Parent void* p = 0; void* p = 0; void* p = 0; void* p = 0; 1. Instrument code with calls to insert random delays 2. If we are lucky, delay exposes bugs 3. But: how long to delay? where not to delay? CreateThd(child); RandDelay(); RandDelay(); RandDelay(); Init(); Init(); Init(); Init(); p = malloc( ); CreateThd(child); CreateThd(child); Start(child); DoMoreWork(); RandDelay(); RandDelay(); RandDelay(); RandDelay(); RandDelay(); RandDelay(); p->f ++; DoMoreWork(); DoMoreWork(); DoMoreWork(); p = malloc( ); p = malloc( ); p = malloc( ); RandDelay(); RandDelay(); RandDelay(); p->f ++; p->f ++; p->f ++;

  4. What is a Randomized Algorithm? A randomized algorithm: An algorithm that makes nondeterministic choices An algorithm using a random source with a precisely defined distribution A probabilistic guarantee: A guarantee that doesn t always hold A lower bound on the probability of success

  5. What we did / Talk Outline 1. Define bug depth in such a way that common bugs have low depth 2. Develop PCT algorithm (probabilistic concurrency testing), a randomized scheduling algorithm with a good probabilistic guarantee to find bugs of low depth 3. Build it into Cuzz, a concurrency fuzzing tool that improves the efficiency of stress testing

  6. Part I BUG DEPTH

  7. Bug Depth Bug Depth = the number of ordering constraints a schedule has to satisfy to find the bug. More constraints means more things have to go just right to find the bug. Conjecture: many typical bugs have low depth. Let s look at 3 examples.

  8. Ordering Violation Example: A Bug of Depth 1 Bug depth = the number of ordering constraints sufficient to find the bug. Parent Thread Child Thread start(child); p = malloc(); do_init(); p->f ++; All schedules that satisfy the find the bug.

  9. Atomicity Violation Example: A Bug of Depth 2 Bug depth = the number of ordering constraints sufficient to find the bug. Parent Thread Child Thread p = null; p = malloc(); start(child); If (p != null) p->f++ All schedules that satisfy both find the bug.

  10. Deadlock Example: A Bug of Depth 2 Bug depth = the number of ordering constraints sufficient to find the bug. Parent Thread Child Thread Lock(A); Lock(B); Lock(B); Lock(A); All schedules that satisfy both find the bug.

  11. Part II THE PCT ALGORITHM

  12. PCT Algorithm: Randomly Assign & Change Thread Priorities Input: int k; // no. of steps - guessed from previous runs int d; // target bug depth - randomly chosen State: int pri[]; // thread priorities int change[]; // when to change priorities int stepCnt; // current step count PCT::Init() { PCT::RandDelay( tid ) { stepCnt = 0; stepCnt ++; if stepCnt == change[i] for some i pri[tid] = i; if (tid is not highest pri enabled thread) spin; foreach tid pri[tid] = rand() + d; for( i=0; i<d-1; i++ ) change[i] = rand() % k; } }

  13. The PCT Guarantee Given a program with n threads k steps a bug of depth d (~tens) (~millions) (1,2) Each run PCT finds the bug with a probability of at least 1 k p 1 d n (this is a worst-case guarantee)

  14. Part III THE CUZZ TOOL & RESULTS

  15. How it Works binary instrumentation for data accesses (optional) Program Cuzz Randomized Algorithm Win32 API Kernel Scheduler Intercept at synchronization points Detour win32 synchronization calls Optionally instrument data accesses No manual instrumentation required

  16. Some Results

  17. Practice Beats Worst-Case Measured Probability often significantly better than worst-case guaranteed probability

  18. Why Does Practice Beat Worst-Case? Worst-case guarantee applies to hardest-to-find bug of given depth If bugs can be found in multiple ways, probabilities add up! Example: Increasing the number of threads helps: 0.02 0.018 Measured Probability 0.016 0.014 0.012 0.01 0.008 0.006 0.004 0.002 0 2 3 5 9 17 33 65 Number of Threads

  19. Internal Tool Status TheCuzz tool is available internally at Microsoft We are working with several product groups that actively use Cuzz to improve their stress testing

  20. DEMO

  21. Demo Conclusion Measure probabilities on cluster Without Cuzz: With Cuzz: 1 Fail in 238 820 runs ratio = 0.000004817 12 Fails in 320 runs ratio = 0.0375 Resource Savings: factor 7,800 1 day of stress testing = 11 seconds of Cuzz testing

  22. Conclusions Bug depth is a useful metric to focus testing efforts Systematic randomization improves concurrency testing No reason not to use Cuzz Thank You For Your Attention.

Related


More Related Content