Detecting Data Races in Production
A presentation on detecting data races in production by sampling. It compares the overhead of FastTrack with that of Pacer, whose overhead and detection probability both scale with the sampling rate, and walks through how sampling and non-sampling periods interact with vector clocks in multithreaded applications.
Presentation Transcript
Michael Bond, Katherine Coons, Kathryn McKinley
University of Texas at Austin
Overhead (chart labels: 80x, 8x)
FastTrack [Flanagan & Freund 09]
Overhead
FastTrack [Flanagan & Freund 09]: $c_{\text{reads\&writes}} + c_{\text{sync}} \, n$, where $n$ is the number of threads
Problem today: $c_{\text{reads\&writes}}$. Problem in future: $c_{\text{sync}} \, n$
Pacer: $(c_{\text{reads\&writes}} + c_{\text{sync}} \, n) \, r + c_{\text{non-sampling}} \, (1 - r)$, where $r$ is the sampling rate
Sampling periods contribute the first term; non-sampling periods contribute the second
Probability (detecting any race): FastTrack 1, Pacer $r$
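The overhead model on this slide can be tried out numerically. The sketch below only evaluates the two formulas; the cost constants and thread count are hypothetical values chosen for illustration, not measurements from the talk.

```python
def fasttrack_overhead(c_rw: float, c_sync: float, n: int) -> float:
    """Slide model for FastTrack: c_reads&writes + c_sync * n."""
    return c_rw + c_sync * n


def pacer_overhead(c_rw: float, c_sync: float, c_non_sampling: float,
                   n: int, r: float) -> float:
    """Slide model for Pacer: full cost weighted by r, plus the
    non-sampling cost weighted by (1 - r)."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("sampling rate r must be in [0, 1]")
    return fasttrack_overhead(c_rw, c_sync, n) * r + c_non_sampling * (1.0 - r)


# Hypothetical constants (assumptions, not values from the presentation).
C_RW, C_SYNC, C_NON_SAMPLING, N = 6.0, 0.25, 0.5, 8

for rate in (0.0, 0.01, 0.03, 1.0):
    cost = pacer_overhead(C_RW, C_SYNC, C_NON_SAMPLING, N, rate)
    print(f"r = {rate:.2f}: modeled overhead = {cost:.3f}")
```

At r = 1 the model reduces to the FastTrack expression, and at r = 0 only the non-sampling cost remains.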
[Figure: timeline for Thread A and Thread B, alternating between non-sampling periods and sampling periods.]
[Figure: Thread A and Thread B access shared variables: write x, read y, write y, read x.]
Insight #1: Stop tracking variable after non-sampled access
Example: Thread A and Thread B with vector clocks
Operations, in order: write x, unlock m, lock m, write x, read x
Initial vector clocks (A, B): Thread A = (5, 2), Thread B = (3, 4)
Thread A, write x: the last write to x is recorded as 5@A
Thread A, unlock m: m gets A's clock (5, 2); increment clock, so A becomes (6, 2)
Thread B, lock m: join clocks, so B becomes (5, 4)
Thread B, write x: happens before? The earlier write 5@A is compared against B's clock (5, 4)
If no work is performed at Thread B's write x, x still records 5@A and the race with the later read x goes uncaught
If the write is tracked, x records 4@B, and read x asks "happens before?" against 4@B
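The clock values in this example can be replayed in a few lines. This is a minimal sketch of standard vector-clock bookkeeping, not the presenters' implementation; the helper names `join` and `happens_before` are made up for illustration, and the final read x is assumed to be performed by Thread A, which the transcript does not state explicitly.

```python
THREADS = ("A", "B")


def join(c1: dict, c2: dict) -> dict:
    """Element-wise maximum of two vector clocks."""
    return {t: max(c1[t], c2[t]) for t in THREADS}


def happens_before(epoch: tuple, clock: dict) -> bool:
    """True if the access stamped (time, thread) is ordered before
    the current point of the thread that owns `clock`."""
    time, thread = epoch
    return time <= clock[thread]


clock_a = {"A": 5, "B": 2}
clock_b = {"A": 3, "B": 4}

# Thread A: write x. Record the write as 5@A.
last_write_x = (clock_a["A"], "A")

# Thread A: unlock m. m receives A's clock, then A increments its own entry.
lock_m = dict(clock_a)          # (5, 2)
clock_a["A"] += 1               # (6, 2)

# Thread B: lock m. Join m's clock into B's clock.
clock_b = join(clock_b, lock_m)  # (5, 4)

# Thread B: write x. Is the earlier write 5@A ordered before this one?
ordered = happens_before(last_write_x, clock_b)

# Tracked case: x now records 4@B.
last_write_x = (clock_b["B"], "B")

# Assumed: Thread A performs the later read x with clock (6, 2).
race_on_read = not happens_before(last_write_x, clock_a)

print("B's write ordered after A's write:", ordered)
print("race reported at read x:", race_on_read)
```

If the second write were skipped instead, `last_write_x` would stay at 5@A and the final check would compare the read against Thread A's own write, which is the "race uncaught" case on the slide.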
Insight #2: We only care whether A happens before B if A is sampled
[Figure: timeline for Thread A and Thread B.] Events in non-sampling periods: do these events happen before other events? We don't care!
Non-sampling periods: don't increment clocks. Sampling periods: increment clocks.
Example: synchronization during a non-sampling period
Operations, in order: unlock m1, lock m1, unlock m2, lock m2
Initial vector clocks (A, B): Thread A = (5, 2), Thread B = (3, 4)
Thread A, unlock m1: m1 gets (5, 2); no clock increment
Thread B, lock m1: join clocks, so B becomes (5, 4)
Thread A, unlock m2: m2 gets (5, 2)
Thread B, lock m2: B is already (5, 4), so this is an unnecessary join
Avoiding the unnecessary join turns an O(n) operation into O(1)
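One way to see why the second join is redundant is to attach a version number to each clock and skip the join when nothing has changed since the last one. The sketch below is a hypothetical illustration of that idea, not Pacer's actual mechanism; the `VectorClock` class, its `version` and `seen` fields, and the `unlock` helper are all assumptions introduced here.

```python
class VectorClock:
    def __init__(self, owner: str, times: dict):
        self.owner = owner
        self.times = dict(times)
        self.version = 0      # bumped whenever this clock's contents change
        self.seen = {}        # releasing owner -> newest version already joined
        self.full_joins = 0   # counts O(n) joins actually performed

    def increment(self) -> None:
        self.times[self.owner] += 1
        self.version += 1

    def acquire(self, snapshot: tuple) -> None:
        owner, version, times = snapshot
        if self.seen.get(owner, -1) >= version:
            return            # O(1): this snapshot was already joined
        changed = False
        for thread, value in times.items():   # O(n) join
            if value > self.times.get(thread, 0):
                self.times[thread] = value
                changed = True
        if changed:
            self.version += 1
        self.seen[owner] = version
        self.full_joins += 1


def unlock(clock: VectorClock, sampling: bool) -> tuple:
    """Publish a snapshot for the lock; increment only in sampling periods.
    The copy here is itself O(n); it is kept for simplicity."""
    snapshot = (clock.owner, clock.version, dict(clock.times))
    if sampling:
        clock.increment()
    return snapshot


a = VectorClock("A", {"A": 5, "B": 2})
b = VectorClock("B", {"A": 3, "B": 4})

m1 = unlock(a, sampling=False)   # m1 holds (5, 2); A stays (5, 2)
b.acquire(m1)                    # full join: B becomes (5, 4)
m2 = unlock(a, sampling=False)   # m2 holds the same (5, 2)
b.acquire(m2)                    # skipped: same version from A

print(b.times, "full joins:", b.full_joins)
```

Because clocks are not incremented in non-sampling periods, Thread A publishes the same version twice, and the version check lets Thread B skip the second join.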
[Plot: slowdown (y-axis, 0 to 20) versus sampling rate (x-axis, 0% to 100%) for eclipse, hsqldb, xalan, and pseudojbb.]
Qualitative improvement in time & space
Probability (detecting any race) = r?
[Plot: detection rate (y-axis, with ticks at 0%, 1%, and 10%) for distinct races, ordered by detection rate.]
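The claim that the detection probability equals r can be illustrated with a toy simulation. This sketch assumes a deliberately simple model that is not taken from the talk's evaluation: each occurrence of a race is independent, its first access falls in a sampling period with probability r, and the race is reported exactly when that first access is sampled. The function name and trial counts are hypothetical.

```python
import random


def detection_rate(r: float, trials: int, seed: int = 0) -> float:
    """Fraction of simulated race occurrences that are reported,
    under the simplified model described above."""
    rng = random.Random(seed)   # fixed seed so runs are repeatable
    detected = 0
    for _ in range(trials):
        first_access_sampled = rng.random() < r
        if first_access_sampled:
            detected += 1
    return detected / trials


if __name__ == "__main__":
    for rate in (0.01, 0.03, 0.10):
        estimate = detection_rate(rate, trials=100_000)
        print(f"r = {rate:.2f}: simulated detection rate = {estimate:.4f}")
```

Under this model the estimate converges to r as the number of trials grows; it says nothing about how real races are distributed across sampling periods.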
LiteRace [Marino et al. 09]: cold-region hypothesis [Chilimbi & Hauswirth 04]; full analysis at synchronization operations
Accuracy, time, space ∝ sampling rate
Detect race if first access sampled
Qualitative improvement
Help developers fix difficult-to-reproduce bugs