Userspace Synchronization Techniques in Operating Systems

 
Userspace Synchronization
 
1
 
Chris Gill, Brian Kocoloski, Marion Sudvarg, James Orr
CSE 422S - Operating Systems Organization
Washington University in St. Louis
St. Louis, MO 63143
 
Atomic Operations
 
2
 
Today, we’ll explore various hardware-supported atomic operations
and their use in 
userspace
 synchronization in more detail
 
Some important new atomic functions we’ll examine
__atomic_compare_exchange()
 – set a variable to a new value if its old
value is something we expect it to be, otherwise update 
expected 
to be old
value
__atomic_add_fetch()
 – add to a variable and return its new value
__atomic_sub_fetch()
 – subtract from a variable and return its new
value
__atomic_store_n()
 – set a variable’s value (unconditionally)
 
 
CSE 422S – Operating Systems Organization
 
Spinlocks vs Futexes
 
Spinlocks (implemented via compare_exchange
or test_and_set, for example) operate correctly,
but may waste CPU cycles via spinning
 
In cases of longer critical sections, it is desirable
to be able to put processes/threads to sleep
Similar to kernel-level mutexes/semaphores
 
F
ast 
u
serspace mu
texes
 (futexes) are a feature
to improve userspace lock performance
 
 
3
 
CSE 422S – Operating Systems Organization
 
Futexes
 
4
 
Three components:
(1) an atomic integer variable in userspace
(2) a system call to sleep/wake up contending processes
(
state of the atomic variable 
guards the sleep call)
(3) a kernel-level wait queue for sleeping threads/processes
 
Can use to synchronize multiple processes or threads
In multiprocess settings, processes must explicitly share
memory (e.g., lock variable location) via mmap(), etc.
In multithread settings, threads implicitly share memory
 
In their basic form, futexes offer primitive operations
on which other (sleep) lock mechanisms can be built
Semaphores, condition variables, readers/writer locks, etc.
 
 
CSE 422S – Operating Systems Organization
 
Illustrating Futexes’ Optimization
 
5
 
 
1.
Common case code path – (no lock contention) – is optimized by
not requiring any system calls
2.
Uncommon case - (lock contention) – does not waste CPU cycles
by spinning, but rather sleeps/wakes up via system calls
 
 
3 states:
UNLOCKED
LOCKED
WAITERS
 
Lock acquisition (on some architectures):
cmpxchg(val, UNLOCKED, LOCKED)
 
UNLOCKED
 
 
 
 
 
 
(success)
 
 
LOCKED
 
WAITERS
 
 
 
 
 
 
(syscall to sleep)
 
cmpxchg(val, LOCKED, WAITERS)
 
UNLOCKED
 
UNLOCKED
 
LOCKED ||
WAITERS
 
(using a compare and exchange operation that returns the old state)
 
cmpxchg(val, UNLOCKED, LOCKED)
 
CSE 422S – Operating Systems Organization
 
Illustrating Futexes’ Optimization
 
6
 
 
1.
Common case code path – again optimized: no system calls
2.
Uncommon case – again (conditionally) sleeps to avoid spinning
 
 
3 states:
UNLOCKED == 1
LOCKED == 0
WAITERS < 0
(initialize
variable var to
UNLOCKED
)
 
Lock acquisition:
ret = __atomic_sub_fetch(&var, 1,
                         __ATOMIC_ACQ_REL)
 
(success)
 
 
LOCKED
 
WAITERS
 
       (syscall: sleep if ret == var)
 
(using the atomic operations that we will use in studio today)
 
CSE 422S – Operating Systems Organization
 
Lock release:
__atomic_add_fetch(&var, 1, __ATOMIC_ACQ_REL)
 
(success)
 
 
LOCKED || WAITERS
 
UNLOCKED
 
__atomic_store_n(&var, UNLOCKED, __ATOMIC_RELEASE)
                 (syscall: wake up waiters)
 
 
 
Today’s Studio
 
7
 
Implement userspace spinlocks and sleep locks
Carefully follow the locking instructions in the studio
Futex semantics in 
man 7 futex
 introduce race condition
 
Some Programming Considerations
Volatile variables are often important
Inform compiler (and someone reading the code) that a variable’s
value may change unexpectedly)
Read type declarations right-to-left in C and C++
E.g., 
volatile int *
 is a pointer to an int that’s volatile (the int
value, not the pointer, may change)
 
OpenMP is for parallel programming
Pragmas built into gcc, to support concurrent execution of
functions, barrier synchronization before and after parallel-for
loops, etc.
 
CSE 422S – Operating Systems Organization
Slide Note
Embed
Share

Exploring various hardware-supported atomic operations like __atomic_compare_exchange(), __atomic_add_fetch(), and __atomic_sub_fetch() for userspace synchronization. Differentiating between spinlocks and futexes, focusing on fast userspace mutexes like futexes to enhance lock performance. Detailing the components and usage of futexes in synchronizing processes or threads. Illustrating Futexes optimization using compare and exchange operations.

  • Synchronization
  • Atomic Operations
  • Userspace
  • Operating Systems
  • Futexes

Uploaded on Jul 17, 2024 | 1 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Userspace Synchronization Chris Gill, Brian Kocoloski, Marion Sudvarg, James Orr CSE 422S - Operating Systems Organization Washington University in St. Louis St. Louis, MO 63143 1

  2. Atomic Operations Today, we ll explore various hardware-supported atomic operations and their use in userspace synchronization in more detail Some important new atomic functions we ll examine __atomic_compare_exchange() set a variable to a new value if its old value is something we expect it to be, otherwise update expected to be old value __atomic_add_fetch() add to a variable and return its new value __atomic_sub_fetch() subtract from a variable and return its new value __atomic_store_n() set a variable s value (unconditionally) https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic- Builtins.html 2 CSE 422S Operating Systems Organization

  3. Spinlocks vs Futexes Spinlocks (implemented via compare_exchange or test_and_set, for example) operate correctly, but may waste CPU cycles via spinning In cases of longer critical sections, it is desirable to be able to put processes/threads to sleep Similar to kernel-level mutexes/semaphores Fast userspace mutexes (futexes) are a feature to improve userspace lock performance 3 CSE 422S Operating Systems Organization

  4. Futexes Three components: (1) an atomic integer variable in userspace (2) a system call to sleep/wake up contending processes (state of the atomic variable guards the sleep call) (3) a kernel-level wait queue for sleeping threads/processes Can use to synchronize multiple processes or threads In multiprocess settings, processes must explicitly share memory (e.g., lock variable location) via mmap(), etc. In multithread settings, threads implicitly share memory In their basic form, futexes offer primitive operations on which other (sleep) lock mechanisms can be built Semaphores, condition variables, readers/writer locks, etc. 4 CSE 422S Operating Systems Organization

  5. Illustrating Futexes Optimization (using a compare and exchange operation that returns the old state) Lock acquisition (on some architectures): cmpxchg(val, UNLOCKED, LOCKED) 3 states: UNLOCKED LOCKED WAITERS WAITERS UNLOCKED LOCKED cmpxchg(val, LOCKED, WAITERS) UNLOCKED cmpxchg(val, UNLOCKED, LOCKED) LOCKED || WAITERS UNLOCKED (syscall to sleep) (success) 1. Common case code path (no lock contention) is optimized by not requiring any system calls Uncommon case - (lock contention) does not waste CPU cycles by spinning, but rather sleeps/wakes up via system calls 2. 5 CSE 422S Operating Systems Organization

  6. Illustrating Futexes Optimization (using the atomic operations that we will use in studio today) Lock acquisition: ret = __atomic_sub_fetch(&var, 1, __ATOMIC_ACQ_REL) 3 states: UNLOCKED == 1 LOCKED == 0 WAITERS < 0 (initialize variable var to UNLOCKED) WAITERS LOCKED (success) (syscall: sleep if ret == var) Lock release: __atomic_add_fetch(&var, 1, __ATOMIC_ACQ_REL) LOCKED || WAITERS UNLOCKED __atomic_store_n(&var, UNLOCKED, __ATOMIC_RELEASE) (syscall: wake up waiters) (success) 1. 2. Common case code path again optimized: no system calls Uncommon case again (conditionally) sleeps to avoid spinning 6 CSE 422S Operating Systems Organization

  7. Todays Studio Implement userspace spinlocks and sleep locks Carefully follow the locking instructions in the studio Futex semantics in man 7 futex introduce race condition Some Programming Considerations Volatile variables are often important Inform compiler (and someone reading the code) that a variable s value may change unexpectedly) Read type declarations right-to-left in C and C++ E.g., volatile int * is a pointer to an int that s volatile (the int value, not the pointer, may change) OpenMP is for parallel programming Pragmas built into gcc, to support concurrent execution of functions, barrier synchronization before and after parallel-for loops, etc. 7 CSE 422S Operating Systems Organization

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#