Reactor: A Case for Predictable, Virtualized Actor Database Systems

 
Reactors: A Case for Predictable, Virtualized Actor Database Systems

Vivek Shah, Marcos Antonio Vaz Salles
University of Copenhagen (DMS@DIKU)
 
Increasing new OLTP application diversity
 
NEW OLTP APPLICATION TRENDS
Increasing complexity of application logic
Latency is critical
Scalability matters

Optimizing for latency in new OLTP applications
[Diagram: Client, Application Server, Database server]

NEW OLTP DATABASES TO THE RESCUE
Modern hardware is fast evolving
New OLTP databases are being re-designed to harness its performance
Is the programming interface of stored procedures good enough?

When are stored procedures not good enough?
Software engineering challenges and performance challenges:
  Data partitioning not enough
  Leverage locality & intra-procedure parallelism
  Need low-level control over deployment
  Require modularity and isolation
  Need abstractions to reason about performance
 
Actor programming models are desirable
  Modularity
  Asynchronicity
  Concurrency

Actor relational databases
[Diagram: Relational Databases, Actor Runtimes, "??"]

How can we integrate actor programming models in modern relational databases with high performance?
 
Talk outline
1. Motivation / Problem Statement
2. Relational Actor Programming Model (Reactor)
3. Relational Actor Database System (ReactDB)
4. Evaluation

2. Relational actor programming model (Reactor)

Simplified bitcoin exchange application
[Diagram labels: settlement_risk, providers, orders, Providers, Wallets]

What is a Reactor?
A reactor is an application-defined actor encapsulating state as relations
  Concurrent (single-threaded)
  Isolated
Reactor Type: Schema + Methods
Reactor Instance: Type + unique name assigned by the programmer

Examples of reactor types and instances
reactor Provider {
  Relation orders { wallet int, value float, settled char(1) };
  Relation provider_info { risk float, time timestamp, window interval };

  float sim_risk(exposure) { … }
  float calc_risk(p_exposure) { … }
  void add_entry(wallet, value) { … }
}
Reactor instance names shown: OneEx, XBIT_US, CBIT_DK

Declarative queries within a reactor
Declarative queries to access isolated relational state within a reactor
  Relational
  Key/Value
  Hybrid
Addresses: require modularity and isolation

Declarative queries within a reactor
reactor Provider {
  Relation orders { wallet int, value float, settled char(1) };
  Relation provider_info { risk float, time timestamp, window interval };

  float calc_risk(p_exposure) {
    SELECT SUM(value) INTO exposure FROM orders WHERE settled = 'N';
    if exposure > p_exposure then abort;
    SELECT risk, time, window INTO p_risk, p_time, p_window FROM provider_info;
    if p_time < now() - p_window then
      p_risk := sim_risk(exposure);
      UPDATE provider_info SET risk = p_risk, time = now();
    end if;
    return p_risk;
  }

  void add_entry(wallet, value) {
    INSERT INTO orders VALUES (wallet, value, 'N');
  }
}
Annotation on calc_risk: exposure of a provider within threshold, risk simulated
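
As a rough illustration of how a reactor couples private relational state with methods, the following Python sketch mimics calc_risk and add_entry of the Provider reactor, using one in-memory SQLite database per instance. The Python class, the Abort exception, the placeholder body of sim_risk, the renamed columns (updated_at, window_secs) and the choice of SQLite are all assumptions made for illustration; none of them is part of the Reactor model or of ReactDB.

```python
import sqlite3
import time


class Abort(Exception):
    """Assumed stand-in for the `abort` statement in the slide pseudocode."""


class Provider:
    """Hypothetical Python analogue of a Provider reactor instance."""

    def __init__(self, name, window_secs=10.0):
        self.name = name
        # Each instance owns its own database, so state is isolated per reactor.
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE orders (wallet INTEGER, value REAL, settled TEXT)")
        # Columns renamed from (time, window) to avoid SQL keyword clashes.
        self.db.execute(
            "CREATE TABLE provider_info (risk REAL, updated_at REAL, window_secs REAL)")
        self.db.execute("INSERT INTO provider_info VALUES (0.0, 0.0, ?)", (window_secs,))

    def sim_risk(self, exposure):
        # Placeholder only: the slides elide the body of sim_risk.
        return 0.1 * exposure

    def add_entry(self, wallet, value):
        self.db.execute("INSERT INTO orders VALUES (?, ?, 'N')", (wallet, value))

    def calc_risk(self, p_exposure):
        (exposure,) = self.db.execute(
            "SELECT COALESCE(SUM(value), 0) FROM orders WHERE settled = 'N'").fetchone()
        if exposure > p_exposure:
            raise Abort(f"exposure {exposure} exceeds {p_exposure}")
        p_risk, p_time, p_window = self.db.execute(
            "SELECT risk, updated_at, window_secs FROM provider_info").fetchone()
        now = time.time()
        if p_time < now - p_window:
            # Cached risk is stale: re-simulate and store the new value.
            p_risk = self.sim_risk(exposure)
            self.db.execute(
                "UPDATE provider_info SET risk = ?, updated_at = ?", (p_risk, now))
        return p_risk


provider = Provider("XBIT_US")
provider.add_entry(42, 1000.0)
provider.add_entry(85, 356.23)
first_risk = provider.calc_risk(50000000)
```

A second call to calc_risk within the window returns the stored risk without re-running the simulation, and a call with a threshold below the current exposure raises Abort.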
Communication with reactors
Communication across reactors using asynchronous function calls by specifying reactor names
Async method invocation returns a future result
Invocation: fut_res := fn(params) on reactor_name;
Synchronization: fut_res.get();
Addresses:
  Need abstractions to reason about performance
  Leverage locality & intra-procedure parallelism
  Require modularity and isolation
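
The invocation and synchronization pattern can be approximated in Python with futures. This is only a sketch under assumptions: the Reactor class, the invoke_on helper, the toy exposure method and the use of one single-threaded executor per reactor are hypothetical stand-ins, not ReactDB's API. The order values are taken from the example orders relation of the bitcoin exchange application.

```python
from concurrent.futures import ThreadPoolExecutor


class Reactor:
    """Hypothetical stand-in for a reactor: a name plus a single-threaded mailbox."""

    def __init__(self, name, orders):
        self.name = name
        self.orders = orders  # toy private state: values of unsettled orders
        # One worker thread, so methods invoked on this reactor run one at a time.
        self._mailbox = ThreadPoolExecutor(max_workers=1)

    def submit(self, fn, *params):
        return self._mailbox.submit(fn, self, *params)

    def close(self):
        self._mailbox.shutdown()


def exposure(reactor):
    """Toy method: sum of the reactor's unsettled order values."""
    return sum(reactor.orders)


def invoke_on(registry, reactor_name, fn, *params):
    """Rough analogue of `fut_res := fn(params) on reactor_name;`."""
    return registry[reactor_name].submit(fn, *params)


registry = {
    "XBIT_US": Reactor("XBIT_US", [356.23]),
    "CBIT_DK": Reactor("CBIT_DK", [450.0]),
}
# Fan out first, synchronize afterwards, so the two calls may overlap.
futures = [invoke_on(registry, name, exposure) for name in registry]
total = sum(f.result() for f in futures)  # .result() plays the role of fut_res.get()
for reactor in registry.values():
    reactor.close()
```

Issuing all invocations before calling result() is what exposes the parallelism; calling result() right after each invocation would serialize the work.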
Communication with reactors
Annotation: total simulated risk across providers within threshold
Automatic synchronization with reactors
[Diagram: Client and nodes labeled A, B, C, D, E, F, G]
Method invocations on the same reactor are synchronous despite the asynchronous communication model.
Parent method completes only when its children methods finish
Reactors provide transactional guarantees
  All-or-nothing atomicity
  Durability
  Serializability guarantees
Unsafe program executions are aborted
Formalized equivalence to classical transactional model
[Diagram: nodes labeled A, B, C, D]

3. In-memory actor relational database system (ReactDB)

In-memory database architecture design problem
[Diagram: Reactors; multi-core machine with large main memory]
How do we map a set of reactors to the compute and memory resources of a large multi-core machine?

Delegate to operating system abstractions
Each reactor is a process vs each reactor is a thread
  Overheads
  Lack of scheduling flexibility and control

ReactDB architecture building blocks
Transaction Executor: abstracts a physical core
Container: abstracts a set of physical cores and the memory shared by them

How do we map a set of reactors to the compute and memory resources of a large multi-core machine?
Map transaction executors to containers (many to one)
Map reactors to transaction executors (many to many) such that a reactor belongs to one container only
Specified manually at deployment time
Addresses: need low-level control over deployment
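
The two mapping rules can be checked mechanically. The sketch below is an assumption-laden illustration: the function name, the dictionary-based configuration format and the identifiers te0, te1, c0, c1 are invented here and do not reflect how ReactDB deployments are actually specified.

```python
def containers_of_reactors(executor_container, reactor_executors):
    """Return reactor -> container, enforcing that a reactor belongs to one container.

    executor_container: dict executor -> container (a dict is many-to-one by construction).
    reactor_executors: dict reactor -> set of executors (many-to-many).
    """
    placement = {}
    for reactor, executors in reactor_executors.items():
        containers = {executor_container[e] for e in executors}
        if len(containers) != 1:
            raise ValueError(
                f"reactor {reactor!r} must map to exactly one container, "
                f"got {sorted(containers)}")
        placement[reactor] = containers.pop()
    return placement


# Two containers with one executor each; executors serve (A,B) and (C,D).
shared_nothing = containers_of_reactors(
    {"te0": "c0", "te1": "c1"},
    {"A": {"te0"}, "B": {"te0"}, "C": {"te1"}, "D": {"te1"}},
)

# One container; every executor may run every reactor.
shared_everything = containers_of_reactors(
    {"te0": "c0", "te1": "c0"},
    {reactor: {"te0", "te1"} for reactor in "ABCD"},
)
```

A configuration that spreads one reactor over executors in different containers is rejected with a ValueError.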
ReactDB Architecture
[Diagram labels: Multi-core machine, Container, Transaction Executor, Request Queue, Transaction Coordinator, Transport Driver, Transaction Router]

ReactDB deployments
shared-nothing: two containers, each with a Simple Router and one Transaction Executor; the executors serve reactors (A,B) and (C,D)
shared-everything-without-affinity: one container with a Load Balancing Router and two Transaction Executors, each serving (A,B,C,D)
shared-everything-with-affinity: one container with an Affinity based Router and two Transaction Executors serving (A,B) and (C,D)
Hybrid deployments possible, not explored in this work
 
ReactDB implementation overview
Thread Management in Transaction Executors
  Threadpool uses cooperative multitasking
    Minimize context switches
  Configurable multi-programming level
    Maximize resource utilization
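
To make the idea of cooperative multitasking with a bounded multi-programming level concrete, here is a toy scheduler in Python. It is not ReactDB's thread pool: generators stand in for transactions that voluntarily yield control, and run_cooperatively, txn and the mpl parameter are names assumed for this sketch.

```python
from collections import deque


def run_cooperatively(tasks, mpl):
    """Run generator-based tasks round-robin, admitting at most `mpl` at a time."""
    pending = deque(tasks)
    active = deque()
    trace = []
    while pending or active:
        # Admit new tasks only while below the multi-programming level.
        while pending and len(active) < mpl:
            active.append(pending.popleft())
        task = active.popleft()
        try:
            trace.append(next(task))  # run until the task yields control
            active.append(task)       # not finished: back of the queue
        except StopIteration:
            pass                      # finished: frees a slot for a pending task
    return trace


def txn(name, steps):
    """Toy transaction that yields after each step instead of being preempted."""
    for i in range(steps):
        yield f"{name}:{i}"


trace = run_cooperatively([txn("A", 2), txn("B", 2), txn("C", 1)], mpl=2)
```

With mpl=2, transaction C is admitted only after A finishes, which is visible in the order of entries in trace.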
 
ReactDB implementation overview
Concurrency Control
  Single Container
    OCC protocol (Silo, [Tu et al. 2013])
    Synchronous execution to leverage shared memory
  Multi-Container
    OCC + 2PC protocol
    Asynchronous execution across containers
Storage Layer
  Relations of all reactors in a container stored as primary indices (Masstree [Mao et al. 2012]) in memory
  Durability not implemented yet
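
For readers unfamiliar with optimistic concurrency control, the following Python sketch shows the general read-validate-write idea on a versioned key-value store. It is a generic, single-threaded textbook-style illustration and deliberately not the Silo protocol or ReactDB's implementation; the Store and Txn classes are assumptions made for this example.

```python
class Store:
    """Toy versioned store: key -> (value, version)."""

    def __init__(self):
        self.data = {}


class Txn:
    """Generic OCC transaction: buffer writes, validate reads at commit."""

    def __init__(self, store):
        self.store = store
        self.reads = {}   # key -> version observed on first read
        self.writes = {}  # key -> buffered new value

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        value, version = self.store.data.get(key, (None, 0))
        self.reads.setdefault(key, version)
        return value

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validation: abort if any key read has changed since it was read.
        for key, seen in self.reads.items():
            if self.store.data.get(key, (None, 0))[1] != seen:
                return False
        # Write phase: install buffered writes and bump versions.
        for key, value in self.writes.items():
            _, version = self.store.data.get(key, (None, 0))
            self.store.data[key] = (value, version + 1)
        return True


store = Store()
t1, t2 = Txn(store), Txn(store)
t1.read("risk")
t2.read("risk")
t1.write("risk", 1.0)
t2.write("risk", 2.0)
t1_ok = t1.commit()  # commits: nothing it read has changed
t2_ok = t2.commit()  # fails validation: "risk" changed after t2 read it
```

In a multi-container setting the slides pair OCC with two-phase commit (2PC); that coordination step is outside the scope of this sketch.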
 
 
 
4. Evaluation of Reactors and ReactDB

Can we really leverage asynchronicity for performance gains in new OLTP using reactors and ReactDB?

 
Experimental setup
Workload: Bitcoin exchange auth_pay program
Deployments
  shared-everything-with-affinity container -> sequential
  shared-nothing containers
    Parallel exposure calculation followed by sequential risk calculation -> query-parallelism
    Fully parallel exposure and risk calculation (as shown in example code before) -> procedure-parallelism

Experimental setup
Machine with 2x AMD Opteron 6274 with 8 cores @ 2.1 GHz, L1i 64 KB, L2 2 MB, L3 shared 6 MB, 125 GB RAM, 64-bit Linux 4.1.15
1x Exchange reactor, 15x Provider reactors (30,000 orders per provider, of which the 800 latest orders are used for exposure calculation)
Varying complexity of sim_risk by random number generation
Single worker to generate the workload

 
Asynchronicity gains manifest with increasing parallelizable application logic

What about the effect of load on gains from asynchronicity?
Can DB architecture virtualization help?

Experimental setup
Machine Setup
  Machine with 2x AMD Opteron 6274 with 8 cores @ 2.1 GHz, L1i 64 KB, L2 2 MB, L3 shared 6 MB, 125 GB RAM, 64-bit Linux 4.1.15
Workload
  100% TPC-C new-order augmented with artificial delay of 300 - 400 μsec to simulate stock replenishment analytics per warehouse
  Implemented by modeling a warehouse as a reactor
Deployment
  shared-nothing vs shared-everything-with-affinity
  Scale factor of 8, varying workers to simulate load on the database

 
Asynchronicity gains diminish with increasing load
  Asynchronicity gains offset by queuing
  shared-nothing deployment best at low load
  Cross-over happens much earlier without delay, at 2 workers

More in the paper

Conclusion
NEW OLTP + modern databases -> Needs rethink of the existing stored procedure programming model
Reactor programming model marries actor programming constructs with relational data model and querying
ReactDB leverages the Reactor programming model and a containerization mechanism to allow control of database architecture at deployment time

Poster session tomorrow at 16:00

Thanks! Any questions?
You can find me at
http://www.diku.dk/~bonii
@bonivivek
bonii@di.ku.dk

Exploring the integration of actor programming models in modern relational databases to achieve high performance. The focus is on addressing challenges related to stored procedures, data partitioning, modularity, isolation, software engineering, and performance. The talk outlines motivation, the relational actor programming model (Reactor), relational actor database system (ReactDB), and the evaluation process.

  • Actor programming models
  • Relational databases
  • High performance
  • Database systems

Uploaded on Sep 06, 2024 | 0 Views



Presentation Transcript


  1. Reactors: A Case for Predictable, Virtualized Actor Database Systems Vivek Shah , Marcos Antonio Vaz Salles University of Copenhagen (DMS@DIKU)

  2. Increasing NEW oltp application diversity 2

  3. NEW OLTP APPLICATION TRENDS Increasing complexity of application logic Latency is critical Scalability matters 3

  4. Optimizing for latency in new OLTP applications [Diagram: Client, Application Server, Database server]

  5. NEW OLTP DATABASE TO THE RESCUE Modern hardware is fast evolving New OLTP databases are being re-designed to harness their performance Is the programming interface of stored procedures good enough ? 5

  6. When are stored procedures not good enough? Software engineering challenges and performance challenges: data partitioning not enough; require modularity and isolation; leverage locality & intra-procedure parallelism; need abstractions to reason about performance; need low-level control over deployment

  7. Actor programming models are desirable Modularity Asynchronicity Concurrency 7

  8. Actor relational databases [Diagram: Actor Runtimes, Relational Databases, "??"]

  9. How can we integrate actor programming models in modern relational databases with high performance? 9

  10. Talk outline 1. Motivation / Problem Statement 2. Relational Actor Programming Model (Reactor) 3. Relational Actor Database System (ReactDB) 4. Evaluation

  11. 2. Relational actor programming model (Reactor)

  12. Simplified bitcoin exchange application
      Diagram labels: Providers, Wallets

      settlement_risk
      P_EXPOSURE   G_RISK
      50000000     23400000

      providers
      NAME      RISK      TIME                WINDOW
      CBIT_DK   2341569   18-11-17 11:45:67   10
      XBIT_US   5909863   18-11-17 11:43:34   30

      orders
      PROVIDER   WALLET   VALUE    SETTLED
      CBIT_DK    43       450      N
      XBIT_US    42       1000     Y
      XBIT_US    85       356.23   N

  13. What is a Reactor? A reactor is an application-defined actor encapsulating state as relations Concurrent (Single-threaded) Isolated Reactor Type Schema + Methods Reactor Instance Type + Unique name assigned by the programmer 13

  14. Examples of reactor types and instances
      reactor Provider {
        Relation orders { wallet int, value float, settled char(1) };
        Relation provider_info { risk float, time timestamp, window interval };
        float sim_risk(exposure) { }
        float calc_risk(p_exposure) { }
        void add_entry(wallet, value) { }
      }
      reactor Exchange {
        Relation settlement_risk { p_exposure float, g_risk float };
        Relation provider_names { value varchar(32) };
        void auth_pay(pprovider, pwallet, pvalue) { ... }
      }
      Reactor instance names shown: XBIT_US, CBIT_DK, OneEx

  15. Declarative queries within a reactor: declarative queries to access isolated relational state within a reactor (Relational, Key/Value, Hybrid). Addresses: require modularity and isolation

  16. Declarative queries within a reactor
      reactor Provider {
        Relation orders { wallet int, value float, settled char(1) };
        Relation provider_info { risk float, time timestamp, window interval };

        float calc_risk(p_exposure) {
          SELECT SUM(value) INTO exposure FROM orders WHERE settled = 'N';
          if exposure > p_exposure then abort;
          SELECT risk, time, window INTO p_risk, p_time, p_window FROM provider_info;
          if p_time < now() - p_window then
            p_risk := sim_risk(exposure);
            UPDATE provider_info SET risk = p_risk, time = now();
          end if;
          return p_risk;
        }

        void add_entry(wallet, value) {
          INSERT INTO orders VALUES (wallet, value, 'N');
        }
      }
      Annotation on calc_risk: exposure of a provider within threshold, risk simulated

  17. Communication with reactors: communication across reactors using asynchronous function calls by specifying reactor names. Async method invocation returns a future result. Invocation: fut_res := fn(params) on reactor_name; Synchronization: fut_res.get(); Addresses: leverage locality & intra-procedure parallelism; need abstractions to reason about performance; require modularity and isolation

  18. Communication with reactors
      reactor Exchange {
        Relation settlement_risk { p_exposure float, g_risk float };
        Relation provider_names { value varchar(32) };

        void auth_pay(pprovider, pwallet, pvalue) {
          SELECT g_risk, p_exposure INTO risk, exposure FROM settlement_risk;
          results := [];
          foreach p_provider in (SELECT value FROM provider_names) {
            res := calc_risk(exposure) on reactor p_provider;
            results.add(res);
          }
          total_risk := 0;
          foreach res in results
            total_risk := total_risk + res.get();
          if total_risk + pvalue < risk then
            add_entry(pwallet, pvalue) on reactor pprovider;
          else
            abort;
          end if;
        }
      }
      Annotation: total simulated risk across providers within threshold

  19. Automatic synchronization with reactors: method invocations on the same reactor are synchronous despite the asynchronous communication model. Parent method completes only when its children methods finish. [Diagram: Client and nodes labeled A, B, C, D, E, F, G]

  20. Reactors provide transactional guarantees: all-or-nothing atomicity; durability; serializability guarantees. Unsafe program executions are aborted. Formalized equivalence to classical transactional model. [Diagram: nodes labeled A, B, C, D]

  21. 3. In-memory actor relational database system (ReactDB) 21

  22. In-memory database architecture design problem .. Reactors Multi-core machine with large main memory How do we map a set of reactors to the compute and memory resources of a large multi-core machine? 22

  23. Delegate to operating system abstractions Each reactor is a process vs each reactor is a thread Lack of scheduling flexibility and control Overheads 23

  24. ReactDB architecture building blocks Transaction Executors Abstracts a physical core Containers Abstracts a set of physical cores and the memory shared by them 24

  25. How do we map a set of reactors to the compute and memory resources of a large multi-core machine? Map reactors to transaction executors (many to many) such that a reactor belongs to one container only Map transaction executors to containers (many to one) Need low-level control over deployment Specified manually at deployment time 25

  26. ReactDB Architecture [Diagram labels: Multi-core machine, Container, Transaction Executor, Request Queue, Transaction Coordinator, Transport Driver, Transaction Router]

  27. ReactDB deployments: shared-nothing. Two containers, each with a Simple Router and one Transaction Executor; the executors serve reactors (A,B) and (C,D)

  28. ReactDB deployments: shared-everything-with-affinity (one container, Affinity based Router, Transaction Executors serving (A,B) and (C,D)) and shared-everything-without-affinity (one container, Load Balancing Router, Transaction Executors each serving (A,B,C,D)). Hybrid deployments possible, not explored in this work

  29. ReactDB implementation overview Thread Management in Transaction Executors Threadpool uses cooperative multitasking Minimize context switches Configurable multi-programming level Maximize resource utilization 29

  30. ReactDB implementation overview Concurrency Control Single Container OCC protocol (Silo, [Tu et al. 2013]) Synchronous execution to leverage shared memory Multi-Container OCC + 2PC protocol Asynchronous execution across containers Storage Layer Relations of all reactors in a container stored as primary indices (Masstree [Mao et al. 2012]) in memory Durability not implemented yet 30

  31. 4. Evaluation of Reactors and ReactDB 31

  32. Can we really leverage asynchronicity for performance gains in new oltp using reactors and ReactDB ? 32

  33. Experimental setup Workload : Bitcoin exchange auth_pay program Deployments shared-everything-with-affinity container -> sequential shared-nothing containers Parallel exposure calculation followed by sequential risk calculation -> query-parallelism Fully parallel exposure and risk calculation (as shown in example code before) -> procedure-parallelism 33

  34. Experimental setup: machine with 2x AMD Opteron 6274 with 8 cores @ 2.1 GHz, L1i 64 KB, L2 2 MB, L3 shared 6 MB, 125 GB RAM, 64-bit Linux 4.1.15. 1x Exchange reactor, 15x Provider reactors (30,000 orders per provider, of which the 800 latest orders are used for exposure calculation). Varying complexity of sim_risk by random number generation. Single worker to generate the workload

  35. Asynchronicity gains manifest with increasing parallelizable application logic 35

  36. What about the effect of load on gains from asynchronicity ? Can db architecture virtualization help ? 36

  37. Experimental setup. Machine Setup: machine with 2x AMD Opteron 6274 with 8 cores @ 2.1 GHz, L1i 64 KB, L2 2 MB, L3 shared 6 MB, 125 GB RAM, 64-bit Linux 4.1.15. Workload: 100% TPC-C new-order augmented with artificial delay of 300 - 400 μsec to simulate stock replenishment analytics per warehouse; implemented by modeling a warehouse as a reactor. Deployment: shared-nothing vs shared-everything-with-affinity; scale factor of 8, varying workers to simulate load on the database

  38. asynchronicity gains offset by queuing shared-nothing deployment best at low load cross over happens much earlier without delay @2 workers Asynchronicity gains diminish with increasing load 38

  39. More in the paper 39

  40. Conclusion NEW OLTP + modern databases -> Needs rethink of the existing stored procedure programming model Reactor programming model marries actor programming construct with relational data model and querying ReactDB leverages the Reactor programming model and a containerization mechanism to allow control of database architecture at deployment time thanks! Poster session tomorrow at 16:00 40

  41. thanks! Any questions? You can find me at http://www.diku.dk/~bonii @bonivivek bonii@di.ku.dk 41
