Understanding Userspace Synchronization Techniques in Operating Systems
Exploring various hardware-supported atomic operations like __atomic_compare_exchange(), __atomic_add_fetch(), and __atomic_sub_fetch() for userspace synchronization. Differentiating between spinlocks and futexes, focusing on fast userspace mutexes like futexes to enhance lock performance. Detailing the components and usage of futexes in synchronizing processes or threads. Illustrating Futexes optimization using compare and exchange operations.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Userspace Synchronization Chris Gill, Brian Kocoloski, Marion Sudvarg, James Orr CSE 422S - Operating Systems Organization Washington University in St. Louis St. Louis, MO 63143 1
Atomic Operations Today, we ll explore various hardware-supported atomic operations and their use in userspace synchronization in more detail Some important new atomic functions we ll examine __atomic_compare_exchange() set a variable to a new value if its old value is something we expect it to be, otherwise update expected to be old value __atomic_add_fetch() add to a variable and return its new value __atomic_sub_fetch() subtract from a variable and return its new value __atomic_store_n() set a variable s value (unconditionally) https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic- Builtins.html 2 CSE 422S Operating Systems Organization
Spinlocks vs Futexes Spinlocks (implemented via compare_exchange or test_and_set, for example) operate correctly, but may waste CPU cycles via spinning In cases of longer critical sections, it is desirable to be able to put processes/threads to sleep Similar to kernel-level mutexes/semaphores Fast userspace mutexes (futexes) are a feature to improve userspace lock performance 3 CSE 422S Operating Systems Organization
Futexes Three components: (1) an atomic integer variable in userspace (2) a system call to sleep/wake up contending processes (state of the atomic variable guards the sleep call) (3) a kernel-level wait queue for sleeping threads/processes Can use to synchronize multiple processes or threads In multiprocess settings, processes must explicitly share memory (e.g., lock variable location) via mmap(), etc. In multithread settings, threads implicitly share memory In their basic form, futexes offer primitive operations on which other (sleep) lock mechanisms can be built Semaphores, condition variables, readers/writer locks, etc. 4 CSE 422S Operating Systems Organization
Illustrating Futexes Optimization (using a compare and exchange operation that returns the old state) Lock acquisition (on some architectures): cmpxchg(val, UNLOCKED, LOCKED) 3 states: UNLOCKED LOCKED WAITERS WAITERS UNLOCKED LOCKED cmpxchg(val, LOCKED, WAITERS) UNLOCKED cmpxchg(val, UNLOCKED, LOCKED) LOCKED || WAITERS UNLOCKED (syscall to sleep) (success) 1. Common case code path (no lock contention) is optimized by not requiring any system calls Uncommon case - (lock contention) does not waste CPU cycles by spinning, but rather sleeps/wakes up via system calls 2. 5 CSE 422S Operating Systems Organization
Illustrating Futexes Optimization (using the atomic operations that we will use in studio today) Lock acquisition: ret = __atomic_sub_fetch(&var, 1, __ATOMIC_ACQ_REL) 3 states: UNLOCKED == 1 LOCKED == 0 WAITERS < 0 (initialize variable var to UNLOCKED) WAITERS LOCKED (success) (syscall: sleep if ret == var) Lock release: __atomic_add_fetch(&var, 1, __ATOMIC_ACQ_REL) LOCKED || WAITERS UNLOCKED __atomic_store_n(&var, UNLOCKED, __ATOMIC_RELEASE) (syscall: wake up waiters) (success) 1. 2. Common case code path again optimized: no system calls Uncommon case again (conditionally) sleeps to avoid spinning 6 CSE 422S Operating Systems Organization
Todays Studio Implement userspace spinlocks and sleep locks Carefully follow the locking instructions in the studio Futex semantics in man 7 futex introduce race condition Some Programming Considerations Volatile variables are often important Inform compiler (and someone reading the code) that a variable s value may change unexpectedly) Read type declarations right-to-left in C and C++ E.g., volatile int * is a pointer to an int that s volatile (the int value, not the pointer, may change) OpenMP is for parallel programming Pragmas built into gcc, to support concurrent execution of functions, barrier synchronization before and after parallel-for loops, etc. 7 CSE 422S Operating Systems Organization