Janus: Consolidating Concurrency Control and Consensus for Commits
Janus is a protocol for distributed transactions that consolidates concurrency control and consensus into a single ordering mechanism. It targets two limitations of layered designs: multiple wide-area round trips in the best case, and aborts when transactions conflict. Janus establishes an order before execution so that conflicting transactions commit instead of aborting, and it uses the same ordering for both transactions and replication while remaining fault tolerant.
Presentation Transcript
Janus: Consolidating Concurrency Control and Consensus for Commits under Conflicts. Shuai Mu, Lamont Nelson, Wyatt Lloyd, Jinyang Li (New York University, University of Southern California)
State of the Art for Distributed Transactions: Layer Concurrency Control on top of Consensus. Shard for scalability; geo-replicate for fault tolerance. A transaction protocol (e.g., 2PC) is layered on top of Paxos. [Diagram: datacenters in Texas, California, and New York.]
Latency Limitation: Multiple Wide-Area Round Trips from Layering. [Diagram: Texas, California, New York; a transaction performing a++ and b++.]
Throughput Limitation: Conflicts Cause Aborts. [Diagram: Texas, California, New York; one transaction performing a++ and b++ conflicts with another performing a*=2 and b*=2.]
Goals: Fewer Wide-Area Round Trips and Commits Under Conflicts. [Chart: best-case wide-area RTTs vs. behavior under conflicts. 1 RTT: Tapir [SOSP 15] ... (aborts), Janus (commits). 2 RTTs: Spanner [OSDI 12] ... (aborts), Calvin [SIGMOD 12] ... (commits).]
Establish Order Before Execution to Avoid Aborts. Designed for transactions with static read & write-sets. Structure a transaction as a set of stored procedure pieces. Servers establish a consistent ordering for pieces before execution. Challenge: distributed ordering to avoid a bottleneck. [Diagram: shards a and b each apply the pieces a++, b++ and a*=2, b*=2 in the same order.]
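The need for a consistent piece ordering can be illustrated with a small sketch. It is not taken from the Janus code base: the names `PIECES`, `run_shard`, and `run` are hypothetical, and the two transactions are the a++/b++ and a*=2/b*=2 examples from the slides. It shows that when both shards apply the pieces in the same order the result matches a serial execution, and when they disagree it does not.

```python
# Hypothetical model: each transaction has one piece per shard (key).
PIECES = {
    "T1": {"a": lambda v: v + 1, "b": lambda v: v + 1},  # a++, b++
    "T2": {"a": lambda v: v * 2, "b": lambda v: v * 2},  # a*=2, b*=2
}

def run_shard(key, initial, order):
    """Apply each transaction's piece for `key` in the given order."""
    value = initial
    for txn in order:
        value = PIECES[txn][key](value)
    return value

def run(order_a, order_b, initial=1):
    """Execute both shards, each with its own piece order."""
    return {
        "a": run_shard("a", initial, order_a),
        "b": run_shard("b", initial, order_b),
    }

# The two possible serial executions of T1 and T2.
serial_outcomes = [
    run(["T1", "T2"], ["T1", "T2"]),
    run(["T2", "T1"], ["T2", "T1"]),
]

# Same order on both shards: equivalent to a serial execution.
consistent = run(["T1", "T2"], ["T1", "T2"])

# Shards disagree on the order: no serial execution produces this state.
inconsistent = run(["T1", "T2"], ["T2", "T1"])

print(consistent, consistent in serial_outcomes)
print(inconsistent, inconsistent in serial_outcomes)
```

Because the order is fixed before any piece runs, neither transaction has to abort; both simply execute in the agreed order.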
Establish Order for Transactions and Replication Together to Commit in 1 Wide-Area Round Trip. Consistent ordering for transactions and replication is the same! Layering establishes the same order twice, while Janus orders once. Challenge: fault tolerance for ordering. [Diagram: a, its replica a', and b apply a++, b++ and a*=2, b*=2 in one consistent order along both the transaction and replication dimensions.]
Overview of the Janus Protocol. Pre-accept: send pieces to servers; establish initial order using dependencies. Detect conflicts: no conflicts? If yes, go straight to Commit; if no, run Accept first. Accept: replicate dependencies. Commit: establish final ordering; execute pieces in order.
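The phase selection in this overview can be sketched as follows. This is a simplified illustration, not the paper's algorithm: the function name `coordinate` and the reply format are hypothetical, "no conflicts" is assumed to mean that every replica of a shard reported the same dependency set in its pre-accept reply, and quorum sizes and failure handling (covered in the paper) are not modeled.

```python
def coordinate(pre_accept_replies):
    """Decide which phases a transaction goes through.

    pre_accept_replies maps a shard name to a list of dependency sets,
    one per replica of that shard (hypothetical format).
    Returns (phases, merged_dependencies).
    """
    phases = ["pre-accept"]
    merged = set()
    conflict = False
    for replies in pre_accept_replies.values():
        # Assumption: replicas that disagree on dependencies signal a conflict.
        if any(reply != replies[0] for reply in replies):
            conflict = True
        for reply in replies:
            merged |= reply
    if conflict:
        # Slow path: replicate the merged dependencies before committing.
        phases.append("accept")
    phases.append("commit")
    return phases, merged

# Replicas of each shard agree: commit right after pre-accept.
fast = coordinate({"A": [{"T1"}, {"T1"}], "B": [set(), set()]})

# Replicas of shard A disagree: an accept phase is needed.
slow = coordinate({"A": [{"T1"}, set()], "B": [set(), set()]})

print(fast)
print(slow)
```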
No Conflicts: Commit in 1 Wide-Area Round Trip. [Diagram: Pre-accept and Commit phases across replicas of A and B in California and New York, each of which executes; labels: 1 wide-area RTT, 1 local RTT.]
Conflicts: Commit in 2 Wide-Area Round Trips. [Diagram: Pre-accept, Accept, and Commit phases across replicas of A and B in California and New York.]
Conflicts: Commit in 2 Wide-Area Round Trips. Accept: merge dependencies. Commit: deterministically order cycles, then execute. [Diagram: replicas of A and B in California and New York each execute both transactions.]
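One way to order dependency cycles deterministically is sketched below. It assumes, as an illustration rather than a statement of the exact Janus rule, that each server holds the same merged dependency graph, finds strongly connected components with Tarjan's algorithm, runs dependencies before dependents, and breaks each cycle by sorting transaction ids. The function name `execution_order` and the graph format are hypothetical.

```python
def execution_order(deps):
    """Return one deterministic execution order for a dependency graph.

    deps maps a transaction id to the set of ids it depends on.
    Cycles are allowed; each cycle is ordered by transaction id.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    order = []
    counter = 0

    def visit(txn):
        nonlocal counter
        index[txn] = low[txn] = counter
        counter += 1
        stack.append(txn)
        on_stack.add(txn)
        for dep in sorted(deps.get(txn, ())):
            if dep not in index:
                visit(dep)
                low[txn] = min(low[txn], low[dep])
            elif dep in on_stack:
                low[txn] = min(low[txn], index[dep])
        if low[txn] == index[txn]:
            # txn is the root of a strongly connected component (a cycle
            # or a single transaction). Tarjan emits dependencies first.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == txn:
                    break
            order.extend(sorted(component))  # deterministic tie-break

    for txn in sorted(deps):
        if txn not in index:
            visit(txn)
    return order

# T1 and T2 depend on each other (a cycle); T3 depends on T1.
graph = {"T1": {"T2"}, "T2": {"T1"}, "T3": {"T1"}}
print(execution_order(graph))
```

Since the result depends only on the graph and the ids, every server that holds the same merged dependencies derives the same order without further communication.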
Janus Achieves Fewer Wide-Area Round Trips and Commits Under Conflicts. No conflicts: commit in 1 wide-area round trip; pre-accept is sufficient to ensure the same order under failures. Conflicts: commit in 2 wide-area round trips; the accept phase replicates dependencies to ensure the same order under failures.
Janus Paper Includes Many More Details: full details of execution; quorum sizes; behavior under server failure; behavior under coordinator (client) failure; design extensions to handle dynamic read & write sets.
Evaluation (https://github.com/NYU-NEWS/janus). Questions: throughput under conflicts; latency under conflicts; overhead when there are no conflicts? Baselines: 2PL (2PC) layered on top of MultiPaxos; TAPIR [SOSP 15]. Testbed: EC2 (Oregon, Ireland, Seoul).
Janus Commits under Conflicts for High Throughput. [Plot: throughput (new-order/s, log scale, 1 to 10000) vs. # clients (log scale, 1 to 1000). Janus: no aborts. 2PL: aborts due to conflicts at shards. Tapir: aborts due to conflicts at shards & replicas.] TPC-C with 6 shards, 3-way geo-replicated (9 total servers), 1 warehouse per shard.
Janus Commits under Conflicts for Low Latency. [Plot: 90-percentile latency (ms, 0 to 1000) vs. # clients (log scale, 1 to 1000) for 2PL, Tapir, and Janus. Annotations: high latency due to retries after aborts; 2 wide-area roundtrips; 2 wide-area roundtrips plus execution time; 1 wide-area roundtrip.] TPC-C with 6 shards, 3-way geo-replicated (9 total servers), 1 warehouse per shard.
Small Throughput Overhead under Few Conflicts. [Plot: throughput (txn/s, 0 to 120000) vs. Zipf coefficient (0.4 to 1) for Janus and Tapir. Annotations: 13% overhead from tracking dependencies; overhead from accept phase + increased dependency tracking; overhead from retries after aborts.] Microbenchmark with 3 shards, 3-way replicated in a single data center (9 total servers).
Related Work. [Table columns: Isolation Level, 1 RTT, Commit under Conflicts.] Isolation levels as listed: Janus [OSDI 16]: Strict-Serial; Tapir [SOSP 15]: Strict-Serial; Rep.Commit [VLDB 13]: Strict-Serial; Calvin [SIGMOD 12]: Strict-Serial; EPaxos [SOSP 13]; Rococo [OSDI 14]; Spanner [OSDI 12]: Strict-Serial; MDCC [EuroSys 13]: ReadCommit*; COPS [SOSP 11]: Causal+; Eiger [NSDI 13]: Causal+.
Conclusion. Two limitations for layered transaction protocols: multiple wide-area round trips in the best case; conflicts cause aborts. Janus consolidates concurrency control and consensus: ordering requirements are similar and can be combined! Establishing a single ordering with dependency tracking enables committing in 1 wide-area round trip in the best case and committing in 2 wide-area round trips under conflicts. Evaluation: small throughput overhead when there are no conflicts; low latency and good throughput even with many conflicts.