C++ Parallelization and Synchronization Techniques

C++ - parallelization and synchronization
Jakub Yaghob
 
The problem
Race conditions
Separate threads with shared state
Result of computation depends on OS scheduling
 
Race conditions - simple demo
Linked list
Shared state
List lst;
Thread A
lst.push_front(A);
Thread B
lst.push_front(B);
[Figure: list states]
Initial state: lst -> X -> Y
Correct state: lst -> B -> A -> X -> Y
Another correct state: lst -> A -> B -> X -> Y
Incorrect state: A and B both link to X, but the list does not contain both of them
 
Race conditions - advanced demo
struct Counter {
  Counter() : value(0) { }
  int value;
  void increment() { ++value; }
  void decrement() { --value; }
  int get() { return value; }
};
Shared state
Counter c;
Thread A
c.increment();
cout << c.get();
Thread B
c.increment();
cout << c.get();
Possible outputs
12, 21, 11
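One possible way to remove the lost update from the Counter demo is to make the value atomic. The sketch below is an illustration, not code from the slides: AtomicCounter and run_two_threads are hypothetical names, and the thread count and loop length are arbitrary.

```cpp
#include <atomic>
#include <thread>

// Hypothetical atomic variant of the slide's Counter.
struct AtomicCounter {
  std::atomic<int> value{0};
  void increment() { ++value; }  // atomic read-modify-write
  void decrement() { --value; }
  int get() const { return value.load(); }
};

// Runs two threads that each increment the same counter n times
// and returns the final value.
int run_two_threads(int n) {
  AtomicCounter c;
  auto work = [&c, n] {
    for (int i = 0; i < n; ++i) c.increment();
  };
  std::thread a(work);
  std::thread b(work);
  a.join();
  b.join();
  return c.get();
}
```

With the atomic counter no increment is lost, so run_two_threads(n) returns 2 * n. The order of increment() and get() across threads is still unspecified: in the slide's scenario both threads may print 2 if both increments happen before either get().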
 
C++ features
C++11 features
Atomic operations
Low-level threads
High-level futures
Synchronization primitives
Thread-local storage
C++14 features
Shared timed mutex
C++17 features
Parallel algorithms
Shared mutex
C++20 features
Stop tokens
Semaphore
Coordination types
 
Threads
Low-level threads
Header <thread>
thread class
Fork-join paradigm
Namespace this_thread
 
Threads
Class thread
Constructor
template <class F, class ...Args>
explicit thread(F&& f, Args&&... args);
Destructor
If joinable() then terminate()
bool joinable() const noexcept;
void join();
Blocks, until the thread *this has completed
void detach();
id get_id() const noexcept;
static unsigned hardware_concurrency();
 
Threads
Class jthread
Like thread, autostops+autojoins on destruction
Provides stop token
Internal member of std::stop_source type
Constructor accepts function with std::stop_token as first argument
Destructor calls request_stop
Interface functions get_stop_source, get_stop_token, and request_stop
 
Threads
Namespace this_thread
thread::id get_id() noexcept;
Unique ID of the current thread
void yield() noexcept;
Opportunity to reschedule
sleep_for, sleep_until
Blocks the thread for relative/absolute timeout
 
Threads
Demo
#include <iostream>
#include <thread>
void thread_fn() { std::cout << "Hello from thread" << std::endl; }
int main(int argc, char **argv) {
  std::thread thr(&thread_fn);
  std::cout << "Hello from main" << std::endl;
  thr.join();
  return 0;
}
 
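Threads
[Figure: fork-join timeline. Labels: fork, "Hello from main", "Hello from thread", join]
[Figure: fork-join timeline with thread creation overhead. Labels: fork, thread creation overhead, "Hello from main", blocked on join, "Hello from thread"]
[Figure: fork followed by a barrier]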
 
Threads
Demo
#include <iostream>
#include <thread>
#include <vector>
int main(int argc, char **argv) {
  std::vector<std::thread> workers;
  for (int i = 0; i < 10; ++i)
    workers.push_back(std::thread([i]() {
      std::cout << "Hello from thread " << i << std::endl;
    }));
  std::cout << "Hello from main" << std::endl;
  for (auto &t : workers)
    t.join();
  return 0;
}
 
Threads
Passing arguments to threads
By value
Safe, but you MUST make a deep copy
By move (rvalue reference)
Safe, as long as there is strict (deep) adherence to move semantics
By const reference
Safe, as long as the object is guaranteed deep-immutable
By non-const reference
Safe, as long as the object is a monitor (it synchronizes its own access)
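The passing modes listed above could look as follows with std::thread. This is a minimal sketch under assumptions: pass_arguments_demo is a hypothetical function, and the reference case is only safe here because the caller does not touch the variable until after join().

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// By value: the thread works on its own copy of the string.
// By move: ownership of the unique_ptr is transferred into the thread.
// By reference: std::ref is required; the caller keeps `out` alive and
// does not read it until the thread has been joined.
std::size_t pass_arguments_demo() {
  std::string text = "hello";
  auto number = std::make_unique<int>(37);
  std::size_t out = 0;

  std::thread t(
      [](std::string s, std::unique_ptr<int> p, std::size_t& result) {
        result = s.size() + static_cast<std::size_t>(*p);
      },
      text, std::move(number), std::ref(out));
  t.join();
  return out;  // 5 + 37
}
```

Without std::ref the thread constructor would copy the argument, and binding that copy to a non-const reference parameter would not compile, which is a useful safeguard against accidental sharing.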
 
Futures
Header <future>
High-level asynchronous execution
Future
Promise
Async
Error handling
 
Futures
Shared state
Consists of
Some state information and some (possibly not yet evaluated) result, which can be a (possibly void) value or an exception
Asynchronous return object
Object that reads results from a shared state
Waiting function
Potentially blocks to wait for the shared state to be made ready
Asynchronous provider
Object that provides a result to a shared state
 
Futures
Future
std::future<T>
Future value of type T
Retrieve value via get()
Waits until the shared state is ready
wait(), wait_for(), wait_until()
valid()
std::shared_future<T>
Value can be read by more than one thread
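To illustrate the difference between std::future and std::shared_future, the following sketch lets several threads read one result. The function name sum_of_reads and the value 7 are assumptions made for the example.

```cpp
#include <future>
#include <thread>
#include <vector>

// Several threads read the same result through copies of a shared_future.
int sum_of_reads(int readers) {
  std::promise<int> prm;
  std::shared_future<int> sf = prm.get_future().share();
  std::vector<int> seen(readers, 0);
  std::vector<std::thread> threads;
  for (int i = 0; i < readers; ++i) {
    // Each thread holds its own copy of the shared_future and writes
    // to its own slot, so no extra locking is needed.
    threads.emplace_back([sf, &seen, i] { seen[i] = sf.get(); });
  }
  prm.set_value(7);  // makes the shared state ready
  for (auto& t : threads) t.join();
  int sum = 0;
  for (int v : seen) sum += v;
  return sum;
}
```

A plain std::future can be read only once with get(); share() converts it into a copyable shared_future whose get() may be called repeatedly.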
 
Futures
Async
std::async
Higher-level convenience utility
Launches a function potentially in a new thread
Async usage
int foo(double, char, bool);
auto fut = std::async(foo, 1.5, 'x', false);
auto res = fut.get();
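"Potentially in a new thread" depends on the launch policy passed to std::async. The sketch below, with the hypothetical helper ran_on_other_thread, compares std::launch::async and std::launch::deferred by looking at the thread id the task observes.

```cpp
#include <future>
#include <thread>

// Returns true when the task ran on a thread other than the caller's.
bool ran_on_other_thread(std::launch policy) {
  const std::thread::id caller = std::this_thread::get_id();
  std::future<std::thread::id> fut =
      std::async(policy, [] { return std::this_thread::get_id(); });
  // With launch::deferred the task runs lazily inside get(),
  // on the thread that calls get().
  return fut.get() != caller;
}
```

When no policy is given, the implementation is free to choose between the two behaviors.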
 
Futures
Packaged task
std::packaged_task
How to implement async with more control
Wraps a function and provides a future for the function result value, but the object itself is callable
 
Futures
Packaged task usage
std::packaged_task<int(double, char, bool)> tsk(foo);
auto fut = tsk.get_future();
std::thread thr(std::move(tsk), 1.5, 'x', false);
auto res = fut.get();
thr.join(); // a joinable thread must be joined (or detached) before destruction
 
Futures
Promise
std::promise<T>
Lowest-level
Steps
Calling thread makes a promise
Calling thread obtains a future from the promise
The promise, along with function arguments, is moved into a separate thread
The new thread executes the function and fulfills the promise
The original thread retrieves the result
 
Futures
Promise usage
Thread A
std::promise<int> prm;
auto fut = prm.get_future();
std::thread thr(thr_fnc, std::move(prm));
auto res = fut.get();
Thread B
void thr_fnc(std::promise<int> &&prm) {
  prm.set_value(123);
}
 
Futures
Constraints
A default-constructed promise is inactive
Can die without consequence
A promise becomes active when a future is obtained via get_future()
Only one future may be obtained
A promise must either be satisfied via set_value(), or have an exception set via set_exception()
A satisfied promise can die without consequence
get() becomes available on the future
A promise with an exception will raise the stored exception upon call of get() on the future
An active promise destroyed with neither value nor exception makes get() on the future raise a "broken promise" exception
 
Futures
Exceptions
All exceptions of type std::future_error
Has error code with enum type std::future_errc
Inactive promise
std::promise<int> pr;
// fine, no problem
Active promise, unused
std::promise<int> pr;
auto fut = pr.get_future();
// fine, no problem
// fut.get() blocks indefinitely
Too many futures
std::promise<int> pr;
auto fut1 = pr.get_future();
auto fut2 = pr.get_future();
// error "Future already retrieved"
 
Futures
Satisfied promise
std::promise<int> pr;
auto fut = pr.get_future();
{
  std::promise<int> pr2(std::move(pr));
  pr2.set_value(10);
}
auto r = fut.get(); // fine, returns 10
Too much satisfaction
std::promise<int> pr;
auto fut = pr.get_future();
{
  std::promise<int> pr2(std::move(pr));
  pr2.set_value(10);
  pr2.set_value(11); // error "Promise already satisfied"
}
auto r = fut.get();
 
Futures
Exception
std::promise<int> pr;
auto fut = pr.get_future();
{
  std::promise<int> pr2(std::move(pr));
  pr2.set_exception(
    std::make_exception_ptr(
      std::runtime_error("bububu")));
}
auto r = fut.get(); // throws the runtime_error
 
Futures
Broken promise
std::promise<int> pr;
auto fut = pr.get_future();
{
  std::promise<int> pr2(std::move(pr));
  // pr2 is destroyed without a value or exception
}
auto r = fut.get(); // error "Broken promise"
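The broken promise case can be observed by catching std::future_error and comparing its error code. This is a small sketch; get_reports_broken_promise is a hypothetical helper name.

```cpp
#include <future>
#include <utility>

// Returns true when get() reports std::future_errc::broken_promise
// after the promise was destroyed without a value or an exception.
bool get_reports_broken_promise() {
  std::promise<int> pr;
  std::future<int> fut = pr.get_future();
  {
    std::promise<int> pr2(std::move(pr));
  }  // pr2 dies unsatisfied here
  try {
    fut.get();
  } catch (const std::future_error& e) {
    return e.code() == std::future_errc::broken_promise;
  }
  return false;  // get() did not throw
}
```

The same pattern applies to the other codes in std::future_errc, for example future_already_retrieved and promise_already_satisfied.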
 
Synchronization primitives
Mutual exclusion
Headers <mutex> and <shared_mutex>
Condition variables
Header <condition_variable>
Semaphore
Header <semaphore>
 
Mutex
A synchronization primitive that can be used to protect shared data from being simultaneously accessed by multiple threads
mutex offers exclusive, non-recursive ownership semantics
A calling thread owns a mutex from the time that it successfully calls either lock or try_lock until it calls unlock
When a thread owns a mutex, all other threads will block (for calls to lock) or receive a false return value (for try_lock) if they attempt to claim ownership of the mutex
A calling thread must not own the mutex prior to calling lock or try_lock
The behavior of a program is undefined if a mutex is destroyed while still owned by some thread
 
Mutex example
Shared state
List lst;
std::mutex mtx;
Thread A
mtx.lock();
lst.push_front(A);
mtx.unlock();
Thread B
mtx.lock();
lst.push_front(B);
mtx.unlock();
 
Mutex variants
Other mutex variants
timed_mutex
In addition, timed_mutex provides the ability to attempt to claim ownership of a timed_mutex with a timeout via try_lock_for and try_lock_until
recursive_mutex
exclusive, recursive ownership semantics
A calling thread owns a recursive_mutex for a period of time that starts when it successfully calls either lock or try_lock. During this period, the thread may make additional calls to lock or try_lock. The period of ownership ends when the thread makes a matching number of calls to unlock
When a thread owns a recursive_mutex, all other threads will block (for calls to lock) or receive a false return value (for try_lock) if they attempt to claim ownership of the recursive_mutex
The maximum number of times that a recursive_mutex may be locked is unspecified, but after that number is reached, calls to lock will throw std::system_error and calls to try_lock will return false
recursive_timed_mutex
Combination
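The timeout behavior of timed_mutex can be sketched as follows. The function try_lock_while_held and the 50 ms timeout are assumptions chosen for the example.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// The calling thread holds a timed_mutex while a second thread tries
// to lock it with a timeout. Returns what try_lock_for reported.
bool try_lock_while_held() {
  std::timed_mutex mtx;
  bool acquired = true;
  mtx.lock();
  std::thread other([&] {
    // Gives up after roughly 50 ms instead of blocking forever.
    acquired = mtx.try_lock_for(std::chrono::milliseconds(50));
    if (acquired) mtx.unlock();
  });
  other.join();  // the mutex stays locked until after the join
  mtx.unlock();
  return acquired;
}
```

Because the owner releases the mutex only after the join, the attempt in the second thread times out and the function returns false.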
 
Mutex variants
Other mutex variants
std::shared_mutex
Additionally multiple threads can make shared lock using lock_shared()
Either exclusive lock or shared lock
std::shared_timed_mutex
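A typical use of std::shared_mutex is a value that is read often and written rarely. The sketch below assumes C++17; SharedValue and shared_value_demo are hypothetical names, and the thread and iteration counts are arbitrary.

```cpp
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical read-mostly value guarded by a shared_mutex.
class SharedValue {
 public:
  int read() const {
    std::shared_lock<std::shared_mutex> lk(mtx_);  // shared (reader) lock
    return value_;
  }
  void add(int d) {
    std::unique_lock<std::shared_mutex> lk(mtx_);  // exclusive (writer) lock
    value_ += d;
  }

 private:
  mutable std::shared_mutex mtx_;
  int value_ = 0;
};

// Four writers add 1000 each while four readers poll the value.
int shared_value_demo() {
  SharedValue v;
  std::vector<std::thread> threads;
  for (int i = 0; i < 4; ++i)
    threads.emplace_back([&v] {
      for (int k = 0; k < 1000; ++k) v.add(1);
    });
  for (int i = 0; i < 4; ++i)
    threads.emplace_back([&v] {
      for (int k = 0; k < 1000; ++k) (void)v.read();
    });
  for (auto& t : threads) t.join();
  return v.read();
}
```

Readers can hold the lock concurrently with each other, while a writer excludes both readers and other writers.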
 
Mutex wrappers
std::unique_lock
Lock class with more features
Timed wait, deferred lock
std::lock_guard
Scope based lock (RAII)
Linked list demo, code for one thread
{
  std::lock_guard<std::mutex> lk(mtx);
  lst.push_front(X);
}
 
Mutex wrappers and others
Shared lock wrapper
std::shared_lock
Calls lock_shared for the given shared mutex
Variadic wrapper
template <typename... MutexTypes> class scoped_lock;
Multiple locks at once, RAII, deadlock avoidance
Interference size
std::size_t hardware_destructive_interference_size;
Minimum offset between two objects that avoids false sharing (roughly the size of a cache line)
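The variadic wrapper is convenient when two mutexes must be held together. The following sketch assumes C++17; Account, transfer, and cross_transfer_total are hypothetical names, and `from` and `to` must be distinct objects.

```cpp
#include <mutex>
#include <thread>

// Hypothetical account with its own mutex.
struct Account {
  std::mutex mtx;
  int balance = 0;
};

// Locks both accounts at once. scoped_lock applies a deadlock-avoidance
// algorithm, so concurrent transfer(a, b) and transfer(b, a) calls
// cannot deadlock on lock ordering.
void transfer(Account& from, Account& to, int amount) {
  std::scoped_lock lk(from.mtx, to.mtx);
  from.balance -= amount;
  to.balance += amount;
}

// Two threads move money in opposite directions; the total is preserved.
int cross_transfer_total() {
  Account a, b;
  a.balance = 100;
  b.balance = 100;
  std::thread t1([&] { for (int i = 0; i < 1000; ++i) transfer(a, b, 1); });
  std::thread t2([&] { for (int i = 0; i < 1000; ++i) transfer(b, a, 1); });
  t1.join();
  t2.join();
  return a.balance + b.balance;
}
```

Locking the two mutexes one after another with separate lock_guard objects would risk a deadlock when the two threads acquire them in opposite order.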
 
Locking algorithms
std::lock
locks specified mutexes, blocks if any are unavailable, deadlock avoidance
std::try_lock
attempts to obtain ownership of mutexes via repeated calls to try_lock
// don't actually take the locks yet
std::unique_lock<std::mutex> lock1(mtx1, std::defer_lock);
std::unique_lock<std::mutex> lock2(mtx2, std::defer_lock);
// lock both unique_locks without deadlock
std::lock(lock1, lock2);
 
Call once
std::once_flag
Helper object for std::call_once
std::call_once
invokes a function only once even if called from multiple threads
std::once_flag flag;
void do_once() {
  std::call_once(flag, [](){ /* do something only once */ });
}
std::thread t1(do_once);
std::thread t2(do_once);
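The "only once" guarantee can be checked by counting how often the callable runs. This is a sketch with a hypothetical helper, count_initializations.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Starts nthreads threads that all call std::call_once on the same flag
// and returns how many times the callable actually ran.
int count_initializations(int nthreads) {
  std::once_flag flag;
  std::atomic<int> calls{0};
  std::vector<std::thread> threads;
  for (int i = 0; i < nthreads; ++i)
    threads.emplace_back([&] {
      std::call_once(flag, [&] { ++calls; });
    });
  for (auto& t : threads) t.join();
  return calls.load();
}
```

Threads that arrive while the callable is still running wait until it has finished, so they never observe a half-initialized state.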
 
Condition variable
std::condition_variable
Can be used to block a thread, or multiple threads at the same time, until
a notification is received from another thread,
a timeout expires, or
a spurious wakeup occurs
Appears to be signaled, although the condition is not valid
Verify the condition after the thread has finished waiting
Works with std::unique_lock
wait atomically releases the mutex and blocks the thread; notify does not touch the mutex
 
Condition variable example
std::mutex m;
std::condition_variable cond_var;
bool done = false; bool notified = false;
Producer
for (/* ... */) {
  // produce something
  {
    std::lock_guard<std::mutex> lock(m);
    queue.push(item);
    notified = true;
  }
  cond_var.notify_one();
}
{
  std::lock_guard<std::mutex> lock(m);
  notified = true;
  done = true;
}
cond_var.notify_one();
Consumer
std::unique_lock<std::mutex> lock(m);
while (!done) {
  while (!notified) {
    // loop to avoid spurious wakeups
    cond_var.wait(lock);
  }
  while (!queue.empty()) {
    queue.pop();
    // consume
  }
  notified = false;
}
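The producer/consumer pattern can also be written with the predicate overload of wait, which re-checks the condition after every wakeup. The sketch below is one possible complete version under assumptions: produce_and_consume is a hypothetical function, and the produced items are simply the integers 1..n.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// The producer pushes 1..n and then sets done; the consumer sums
// everything it pops. Returns the consumer's sum.
int produce_and_consume(int n) {
  std::mutex m;
  std::condition_variable cv;
  std::queue<int> q;
  bool done = false;
  int sum = 0;

  std::thread consumer([&] {
    std::unique_lock<std::mutex> lock(m);
    for (;;) {
      // The predicate is evaluated with the mutex held and is checked
      // again after every wakeup, spurious or not.
      cv.wait(lock, [&] { return !q.empty() || done; });
      while (!q.empty()) {
        sum += q.front();
        q.pop();
      }
      if (done) break;
    }
  });

  for (int i = 1; i <= n; ++i) {
    {
      std::lock_guard<std::mutex> lock(m);
      q.push(i);
    }
    cv.notify_one();
  }
  {
    std::lock_guard<std::mutex> lock(m);
    done = true;
  }
  cv.notify_one();
  consumer.join();
  return sum;
}
```

Using the queue state and the done flag as the predicate makes a separate notified flag unnecessary.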
 
Semaphore
Counting semaphore
std::counting_semaphore
Constructor sets the count
Manipulation
acquire(), release(count=1)
Binary semaphore
std::binary_semaphore
 
Coordination types
Latches
Header <latch>
Thread coordination mechanism
Block threads until an expected number of threads arrive
Single use
Barriers
Header <barrier>
Sequence of phases
Each call to arrive() decrements the expected count; the thread can then wait()
When count==0, the completion function is called and all blocked threads are unblocked
Expected count is reset to the previous value
Constructor
barrier(ptrdiff_t expected, CompletionFunction f)
 
Stop tokens
Asynchronously request to stop execution of an operation
Shared state among associated stop_source, stop_token, and stop_callback
stop_token
Interface for querying whether a stop request has been made or can ever be made
bool stop_requested(), bool stop_possible()
stop_source
Implements the semantics of making a stop request
Creates stop_tokens
stop_token get_token()
Makes stop request
request_stop()
stop_callback
Invokes callback function when stop request has been made
 
Thread-local storage
Added a new storage class
Use keyword thread_local
Must be present in all declarations of a variable
Applies only to namespace-scope or block-scope variables and to static data members
For block scope variables static is implied
Storage of a variable lasts for the duration of the thread in which it is created
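A short sketch of the per-thread behavior follows. The names tls_counter, bump, and bumps_in_new_thread are hypothetical.

```cpp
#include <thread>

thread_local int tls_counter = 0;  // one independent instance per thread

// Increments the calling thread's own copy and returns its new value.
int bump() { return ++tls_counter; }

// Returns the value a fresh thread sees after three bumps. The copy
// belonging to the calling thread is not affected.
int bumps_in_new_thread() {
  int seen = 0;
  std::thread t([&seen] {
    bump();
    bump();
    seen = bump();
  });
  t.join();
  return seen;
}
```

Every new thread starts with its own zero-initialized tls_counter, so bumps_in_new_thread() returns 3 regardless of how often the caller has already called bump().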
 
Parallel algorithms
Parallelism
In headers <algorithm>, <numeric>
Parallel algorithms
Execution policy in <execution>
seq – execute sequentially
par – execute in parallel on multiple threads
par_unseq – execute in parallel on multiple threads, interleave individual iterations within a single thread, no locks
unseq – execute in single thread+vectorized
for_each
reduce, scan, transform_reduce, transform_scan
Inclusive scan – like partial_sum, includes i-th input element in the i-th sum
Exclusive scan – like partial_sum, excludes i-th input element from the i-th sum
No exceptions should be thrown
An exception that escapes leads to std::terminate
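The difference between the two scans is easiest to see on a small input. The sketch below uses the sequential overloads from <numeric> (C++17); the wrapper names inclusive and exclusive are hypothetical. The overloads taking an execution policy such as std::execution::par as first argument have the same meaning, but some standard library implementations need an additional backend library for them, so they are left out here.

```cpp
#include <numeric>
#include <vector>

// i-th output = in[0] + ... + in[i]
std::vector<int> inclusive(const std::vector<int>& in) {
  std::vector<int> out(in.size());
  std::inclusive_scan(in.begin(), in.end(), out.begin());
  return out;
}

// i-th output = init + in[0] + ... + in[i-1]
std::vector<int> exclusive(const std::vector<int>& in, int init) {
  std::vector<int> out(in.size());
  std::exclusive_scan(in.begin(), in.end(), out.begin(), init);
  return out;
}
```

For the input {1, 2, 3, 4}, inclusive yields {1, 3, 6, 10} and exclusive with init 0 yields {0, 1, 3, 6}.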
 
Parallel algorithms
Not all algorithms have a parallel version; those that do:
adjacent_difference, adjacent_find, all_of, any_of, copy, copy_if,
copy_n, count, count_if, equal, exclusive_scan, fill, fill_n, find,
find_end, find_first_of, find_if, find_if_not, for_each,
for_each_n, generate, generate_n, includes, inclusive_scan,
inner_product, inplace_merge, is_heap, is_heap_until,
is_partitioned, is_sorted, is_sorted_until,
lexicographical_compare, max_element, merge, min_element,
minmax_element, mismatch, move, none_of, nth_element,
partial_sort, partial_sort_copy, partition, partition_copy,
reduce, remove, remove_copy, remove_copy_if, remove_if, replace,
replace_copy, replace_copy_if, replace_if, reverse,
reverse_copy, rotate, rotate_copy, search, search_n,
set_difference, set_intersection, set_symmetric_difference,
set_union, sort, stable_partition, stable_sort, swap_ranges,
transform, transform_exclusive_scan, transform_inclusive_scan,
transform_reduce, uninitialized_copy, uninitialized_copy_n,
uninitialized_fill, uninitialized_fill_n, unique, unique_copy
 
C++ extension - executors
Executors
Now separate TS, not finished in C++23 timeframe, maybe in C++26?
Executor
Controls how a task (=function) is executed
Direction
One-way execution
Does not return a result
Two-way execution
Returns future
Then
Execution agent begins execution after a given future becomes ready, returns future
Cardinality
Single
One execution agent
Bulk executions
Group of execution agents
Agents return a factory
Thread pool
Controls where the task is executed
 
C++ extensions - concurrency
Concurrency
TS published, depends on executors TS
Improvements to future
future<T2> then(F &&f)
Execute asynchronously a function f when the future is ready
 
C++ extension - transactional memory
TS v1 finished, never used
TS v2 in progress
Transactional memory
Atomic blocks
Transactional behavior
Exception inside the block leads to undefined behavior
unsigned int f()
{
  static unsigned int i = 0;
  atomic do {
    ++i;
    return i;
  }
}

Explore the challenges of race conditions in parallel programming, learn how to handle shared states in separate threads, and discover advanced synchronization methods in C++. Delve into features from C++11 to C++20, including atomic operations, synchronization primitives, and coordination types. Understand the intricacies of low-level threads and the functionalities provided by the thread class. Uncover the nuances of jthread, its autostopping capabilities, and handling stop tokens.

  • C++
  • Parallelization
  • Synchronization
  • Threads
  • Race Conditions

Uploaded on Sep 20, 2024

