Concurrency Control in Distributed Systems: Challenges and Solutions
Delve into the realm of concurrency control in distributed systems, exploring the dual challenges of handling failures and ensuring proper synchronization between multiple transactions. Learn about the complexities of managing shared data, addressing consistency issues, and implementing strategies for conflict resolution. Discover the significance of serializability in orchestrating efficient transaction processing amidst concurrent operations.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Distributed Systems Lecture 6 Concurrency control Cheng Li
Two goals: handle failure (A, D): the focus of previous lecture - A machine crashes and later re-starts. handle concurrency (C, I): today s focus - Concurrency control protocols 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 2
Concurrency Multiple transactions run simultaneously to make better use of resources like many cores for better performance. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 3
Problems brought by concurrency concurrency in the face of data sharing creates consistency problems - need to use some form of synchronization or conflict detection to avoid races and resulting inconsistency (the concurrency control problem) failures happen! - to software, hardware, and storage media (corruption or crash)! - and, as a result, we have to worry about partial computations and the correctness of computations after restart (the recovery problem) 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 4
Distributed DBMS source: Concurrency control in Distributed Database Systems. Computer Surveys, June 1981. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 5
Lost update anomaly 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 6
Inconsistent retrieval anomaly 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 7
Serializability We need to interleave the operations of multiple transactions to get the efficiencies of concurrency. Let s call a specific interleaving a schedule (actually, it s a partial ordering more soon). We have to decide which schedules are correct, and which are incorrect. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 8
Serializability a schedule (an ordering) is partial, since it only needs to specify two kinds of dependencies in the schedule: - intra-transaction: all operations of a given transaction, for which an order is specified by the transaction, must appear in that order in the schedule. - inter-transaction: the ordering of conflicting operations from different transactions must be specified. two operations conflict if they both operate on the same data and at least one is a write() two schedules are equivalent if (a) they contain the same transactions and operations, and (b) they order all conflicting operations of non-aborting transactions in the same way 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 9
Distributed Transaction Serializability Two histories have to be considered: - local histories - global history For global transactions (i.e., global history) to be serializable, two conditions are necessary: - Each local history should be serializable. - Two conflicting operations should be in the same relative order in all of the local histories where they appear together. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 10
Concurrency control A number of concurrency control techniques are used to ensure noninterference or isolation of concurrently executing transactions. - Pessimistic concurrency control Lock-based concurrency control protocol Timestamp-based concurrency control protocol - Optimistic concurrency control - Multi-version concurrency control protocol can be either pessimistic or optimistic 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 11
Locking-Based Protocols A lock is a mechanism to control concurrent access to a data item Data items can be locked in two modes : 1. exclusive (X) mode. Data item can be both read as well as written. X-lock is requested using lock-X instruction. 2. shared (S) mode. Data item can only be read. S-lock is requested using lock-S instruction. Lock requests are made to the concurrency-control manager by the programmer. Transaction can proceed only after request is granted. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 12
Lock-Based Protocols (Cont.) Lock-compatibility matrix A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the item by other transactions Any number of transactions can hold shared locks on an item, - But if any transaction holds an exclusive on the item no other transaction may hold any lock on the item. If a lock cannot be granted, the requesting transaction is made to wait till all incompatible locks held by other transactions have been released. The lock is then granted. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 13
Two-Phase Locking (2PL) source: Granularity of locks and degrees of consistency in a shared data base. Technical report, IBM, 1976. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 14
Deadlocks Consider the partial schedule Neither T3nor T4can make progress executing lock-S(B) causes T4to wait for T3to release its lock on B, while executing lock-X(A) causes T3to wait for T4to release its lock on A. Such a situation is called a deadlock. - To handle a deadlock one of T3or T4must be rolled back and its locks released. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 15
Deadlock Management Prevention - Guaranteeing that deadlocks can never occur in the first place. Check transaction when it is initiated. Requires no run time support. - All resources which may be needed by a transaction must be predeclared Avoidance - Detecting potential deadlocks in advance and taking action to insure that deadlock will not occur. Requires run time support. - Order either the objects or the sites and always request locks in that order. Detection and Recovery - Allowing deadlocks to form and then finding and breaking them. As in the avoidance scheme, this requires run time support. - Centralized - Hierarchical - Distributed 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 16
Cascading roll-backs When a deadlock occurs there is a possibility of cascading roll-backs. Cascading roll-back is possible under two-phase locking. To avoid this, follow a modified protocol called strict two- phase locking -- a transaction must hold all its exclusive locks till it commits/aborts. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 17
Strict 2PL 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 18
Implementation of Locking A lock manager can be implemented as a separate process to which transactions send lock and unlock requests The lock manager replies to a lock request by sending a lock grant messages (or a message asking the transaction to roll back, in case of a deadlock) The requesting transaction waits until its request is answered The lock manager maintains a data-structure called a lock table to record granted locks and pending requests The lock table is usually implemented as an in-memory hash table indexed on the name of the data item being locked 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 19
Lock Table Dark blue rectangles indicate granted locks; light blue indicate waiting requests Lock table also records the type of lock granted or requested New request is added to the end of the queue of requests for the data item, and granted if it is compatible with all earlier locks Unlock requests result in the request being deleted, and later requests are checked to see if they can now be granted If transaction aborts, all waiting or granted requests of the transaction are deleted - lock manager may keep a list of locks held by each transaction, to implement this efficiently 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 20
Example of Granularity Hierarchy The levels, starting from the coarsest (top) level are - database - area - file - record 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 21
Distributed lock managers Google Research Publication: Chubby Distributed Lock Service. Zookeeper.apache.org. [Please find the paper as well] etcd: Distributed reliable key-value store for the most critical data of a distributed system, CoreOS, 2018-01-16 https://redis.io/topics/distlock https://www.consul.io/ 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 22
Concurrency control A number of concurrency control techniques are used to ensure noninterference or isolation of concurrently executing transactions. - Pessimistic concurrency control Lock-based concurrency control protocol Timestamp-based concurrency control protocol - Optimistic concurrency control - Multi-version concurrency control protocol can be either pessimistic or optimistic 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 23
Timestamp-based protocols Monotonic timestamp assignment - Each transaction is issued a timestamp when it enters the system. If an old transaction Tihas time-stamp TS(Ti), a new transaction Tjis assigned time-stamp TS(Tj) such that TS(Ti) <TS(Tj). In order to assure such behavior, the protocol maintains for each data Q two timestamp values: - W-timestamp(Q) is the largest time-stamp of any transaction that executed write(Q) successfully. - R-timestamp(Q) is the largest time-stamp of any transaction that executed read(Q) successfully. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 24
Timestamp-Based Protocols (Cont.) The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order. Suppose a transaction Tiissues a read(Q) 1. If TS(Ti) < W-timestamp(Q), then Tineeds to read a value of Q that was already overwritten. Hence, the read operation is rejected, and Tiis rolled back. 2. If TS(Ti) W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to max(R-timestamp(Q), TS(Ti)). 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 25
Timestamp-Based Protocols (Cont.) Suppose that transaction Tiissues write(Q). 1. If TS(Ti) < R-timestamp(Q), then the value of Q that Tiis producing was needed previously, and the system assumed that that value would never be produced. Hence, the write operation is rejected, and Tiis rolled back. 2. If TS(Ti) < W-timestamp(Q), then Tiis attempting to write an obsolete value of Q. Hence, this write operation is rejected, and Tiis rolled back. 3. Otherwise, the write operation is executed, and W- timestamp(Q) is set to TS(Ti). 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 26
Example Use of the Protocol A partial schedule for several data items for transactions with timestamps 1, 2, 3, 4, 5 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 27
Correctness of TO Protocol The timestamp-ordering protocol guarantees serializability since all the arcs in the precedence graph are of the form: Thus, there will be no cycles in the precedence graph Timestamp protocol ensures freedom from deadlock as no transaction ever waits. But the schedule may not be cascade-free, and may not even be recoverable. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 28
Concurrency control A number of concurrency control techniques are used to ensure noninterference or isolation of concurrently executing transactions. - Pessimistic concurrency control Lock-based concurrency control protocol Timestamp-based concurrency control protocol - Optimistic concurrency control - Multi-version concurrency control protocol can be either pessimistic or optimistic 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 29
Optimistic concurrency control Locking is a conservative approach in which conflicts are prevented. Disadvantages: - Lock management overhead. - Deadlock detection/resolution. - Lock contention for heavily used objects. Locking is pessimistic because it assumes that conflicts will happen. If conflicts are rare, we might get better performance by not locking, and instead checking for conflicts at commit. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 30
Optimistic concurrency control Execution of transaction Tiis done in three phases. 1. Read and execution phase: Transaction Tiwrites only to temporary local variables 2. Validation phase: Transaction Tiperforms a ''validation test'' to determine if local variables can be written without violating serializability. 3. Write phase: If Tiis validated, the updates are applied to the database; otherwise, Tiis rolled back. The three phases of concurrently executing transactions can be interleaved, but each transaction must go through the three phases in that order. - Assume for simplicity that the validation and write phase occur together, atomically and serially I.e., only one transaction executes validation/write at a time. Source: H. T. Kung and J. T. Robinson. On Optimistic Methods for Concurrency Control. ACM Trans. Database Syst., 6(2):213 226, June 1981. USTC-ADSL-Dist-Sys-Lecture-Note 10/5/2024 31
Optimistic concurrency control Each transaction Tihas 3 timestamps - Start(Ti) : the time when Tistarted its execution - Validation(Ti): the time when Tientered its validation phase - Finish(Ti) : the time when Tifinished its write phase Serializability order is determined by timestamp given at validation time; this is done to increase concurrency. - Thus, TS(Ti) is given the value of Validation(Ti). This protocol is useful and gives greater degree of concurrency if probability of conflicts is low. - because the serializability order is not pre-decided, and - relatively few transactions will have to be rolled back. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 32
Validation Test for Transaction Tj If for all Tiwith TS (Ti) < TS (Tj) either one of the following condition holds: - finish(Ti) < start(Tj) - start(Tj) < finish(Ti) < validation(Tj) and the set of data items written by Tidoes not intersect with the set of data items read by Tj. then validation succeeds and Tjcan be committed. Otherwise, validation fails and Tjis aborted. Justification: Either the first condition is satisfied, and there is no overlapped execution, or the second condition is satisfied and the writes of Tjdo not affect reads of Tisince they occur after Tihas finished its reads. the writes of Tido not affect reads of Tjsince Tjdoes not read any item written by Ti. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 33
Overheads in Optimistic CC Must record read/write activity in ReadSet and WriteSet per transaction. - Must create and destroy these sets as needed. Must check for conflicts during validation, and must make validated writes ``global . - Critical section can reduce concurrency. - Scheme for making writes global can reduce clustering of objects. Optimistic CC restarts transactions that fail validation. - Work done so far is wasted; requires clean-up. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 34
Concurrency control A number of concurrency control techniques are used to ensure noninterference or isolation of concurrently executing transactions. - Pessimistic concurrency control Lock-based concurrency control protocol Timestamp-based concurrency control protocol - Optimistic concurrency control - Multi-version concurrency control protocol can be either pessimistic or optimistic 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 35
MULTI-VERSION CONCURRENCY CONTROL The DBMS maintains multiple physical versions of a single logical object in the database: - When a txn writes to an object, the DBMS creates a new version of that object. - When a txn reads an object, it reads the newest version that existed when the txn started. First proposed in 1978 MIT PhD dissertation. Used in almost every new DBMS in last 10 years. Source: D. P. Reed. Naming and Synchronization in a Decentralized Computer System. Ph.D. dissertation, 1978. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 36
MULTI-VERSION CONCURRENCY CONTROL Main benefits: Writers don t block readers. Read-only txns can read a consistent snapshot without acquiring locks. Easily support time-travel queries. MVCC is more than just a concurrency control protocol . It completely affects how the DBMS manages transactions and the database. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 37
MVCC DESIGN DECISIONS Concurrency Control Protocol Version Storage Garbage Collection Index Management Source: An Empirical Evaluation of In-Memory Multi-Version Concurrency Control 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 38
CONCURRENCY CONTROL PROTOCOL Approach #1: Timestamp Ordering - Assign txns timestamps that determine serial order. - Considered to be original MVCC protocol. Approach #2: Optimistic Concurrency Control - Three-phase protocol from last class. - Use private workspace for new versions. Approach #3: Two-Phase Locking - Txns acquire appropriate lock on physical version before they can read/write a logical tuple. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 39
MVCC implementations 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 40
MVCC details Multiversion schemes keep old versions of data item to increase concurrency. Each successful write results in the creation of a new version of the data item written. Use timestamps to label versions. When a read(Q) operation is issued, select an appropriate version of Q based on the timestamp of the transaction, and return the value of the selected version. reads never have to wait as an appropriate version is returned immediately. 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 41
Multiversion + Timestamp Ordering Each data item Q has a sequence of versions <Q1, Q2,...., Qm>. Each version Qkcontains three data fields: - Content -- the value of version Qk. - W-timestamp(Qk) -- timestamp of the transaction that created (wrote) version Qk - R-timestamp(Qk) -- largest timestamp of a transaction that successfully read version Qk When a transaction Ticreates a new version Qkof Q, Qk's W- timestamp and R-timestamp are initialized to TS(Ti). R-timestamp of Qkis updated whenever a transaction Tj reads Qk, and TS(Tj) > R-timestamp(Qk). 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 42
MultiversionTimestamp Ordering (Cont) Suppose that transaction Tiissues a read(Q) or write(Q) operation. Let Qkdenote the version of Q whose write timestamp is the largest write timestamp less than or equal to TS(Ti). 1. If transaction Tiissues a read(Q), then the value returned is the content of version Qk. 2. If transaction Tiissues a write(Q) 1. if TS(Ti) < R-timestamp(Qk), then transaction Tiis rolled back. 2. if TS(Ti) =W-timestamp(Qk), the contents of Qkare overwritten 3. else a new version of Q is created. Observe that - Reads always succeed - A write by Tiis rejected if some other transaction Tjthat (in the serialization order defined by the timestamp values) should read Ti's write, has already read a version created by a transaction older than Ti. Protocol guarantees serializability 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 43
MVCC: Implementation Issues Creation of multiple versions increases storage overhead - Extra tuples - Extra space in each tuple for storing version information Versions can, however, be garbage collected - E.g. if Q has two versions Q5 and Q9, and the oldest active transaction has timestamp > 9, than Q5 will never be required again 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 44
More references MaaT: Effective and scalable coordination of distributed transactions in the cloud Calvin: fast distributed transactions for partitioned database systems An Evaluation of Distributed Concurrency Control Spanner: Google s Globally-Distributed Database 10/5/2024 USTC-ADSL-Dist-Sys-Lecture-Note 45
Distributed Systems Lecture 6 Concurrency control Q & A!