[Share] A brief discussion on database concurrency control – locks and MVCC

After programming for a few years, you learn that few problems have simple, quick solutions; most require trade-offs and compromises. This article introduces one such trade-off a database makes between concurrency performance and serializability: the concurrency control mechanism.


If all transactions in a database were executed serially, the database would easily become the performance bottleneck of the entire application. A node that cannot scale horizontally eventually becomes a bottleneck anyway, but serial transaction execution only hastens that; concurrency is what makes high throughput possible. It can solve certain performance problems, but it also brings a host of strange errors.

Once concurrent transactions are introduced, failing to control their execution leads to all kinds of problems; you may find yourself tortured by strange bugs without ever enjoying the performance gains concurrency brings.

Overview

How to control concurrency is one of the most important problems in the database field, but by now there are many mature solutions for transaction concurrency control, and their principles are what this article wants to introduce. We will cover the three most common concurrency control mechanisms:


They are pessimistic concurrency control, optimistic concurrency control, and multi-version concurrency control. Pessimistic concurrency control is in fact the most common mechanism of all: the lock. Optimistic concurrency control goes by another name, the optimistic lock, although it is not actually a lock at all; we will look at it in detail later in the article. Finally there is multi-version concurrency control (MVCC). Unlike the first two, whose names stand in opposition, MVCC can be combined with either of them to improve the database's read performance.

Since this article introduces different concurrency control mechanisms, it inevitably involves transactions executing concurrently; we will use schematic diagrams to analyze how each mechanism works.

Pessimistic concurrency control

Controlling how different transactions acquire the same data is the most fundamental way to ensure database consistency. If transactions can be given exclusive access to a resource, then different transactions operating on the same resource cannot affect each other.


The simplest and most widely used approach is to solve the problem with locks. Before a transaction operates on a resource, it must first obtain the resource's lock, ensuring that no other transaction accesses the resource while it performs its operations. In pessimistic concurrency control, the database program takes a pessimistic attitude toward data modification and keeps data locked while processing it in order to resolve contention.

Read-write lock

To maximize transaction concurrency, database locks are designed with two modes: shared locks and exclusive (mutex) locks. A transaction that holds a shared lock on a row may only read it, so the shared lock is also called a read lock; a transaction that holds an exclusive lock on a row may both read and write it, so the exclusive lock is also called a write lock.
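These compatibility rules can be sketched in a few lines. The `RowLock` class and its method names below are invented for illustration, not any real database's locking API:

```python
class RowLock:
    """Tracks which transactions hold shared (S) or exclusive (X) locks on one row."""

    def __init__(self):
        self.holders = {}  # transaction id -> "S" or "X"

    def can_grant(self, txn, mode):
        others = {m for t, m in self.holders.items() if t != txn}
        if mode == "S":
            return "X" not in others  # shared locks coexist with other shared locks
        return not others             # exclusive requires no other holder at all

    def acquire(self, txn, mode):
        if self.can_grant(txn, mode):
            self.holders[txn] = mode
            return True
        return False                  # in a real engine the caller would block here
```

For example, two transactions can both acquire a shared lock on the same row, but a third asking for an exclusive lock is refused until they release.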


Besides limiting which operations a transaction may perform, shared and exclusive locks also have "shared" and "mutually exclusive" relationships between themselves: multiple transactions may hold a shared lock on the same row at the same time, but an exclusive lock is incompatible with shared locks and with other exclusive locks. The reason for this design is intuitive: multiple transactions writing the same data at the same time would inevitably cause all kinds of strange problems.


If a transaction cannot obtain the lock on a row, it enters a waiting state; it cannot obtain the lock and perform the corresponding operation until other transactions release the lock on that data.

Two-phase lock protocol

The two-phase locking protocol (2PL) is a protocol that guarantees transaction serializability. It divides a transaction's lock acquisition and release into two distinct phases: the growing phase and the shrinking phase.


In the growing phase, a transaction may acquire locks but not release them; in the shrinking phase, it may only release locks and cannot acquire new ones. If we only consider the definition of 2PL, that is all there is to it, but the protocol has two variants:

  1. Strict 2PL: the exclusive locks held by a transaction must not be released until the transaction commits;
  2. Rigorous 2PL: all locks held by a transaction must not be released until the transaction commits.
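The two phases can be enforced mechanically: the moment a transaction releases its first lock it enters the shrinking phase, after which any further acquisition is an error. The `TwoPhaseLocker` class below is an invented sketch of that rule, not real database code:

```python
class TwoPhaseLocker:
    """Enforces 2PL: once a transaction releases any lock, it may not acquire new ones."""

    def __init__(self):
        self.held = set()
        self.shrinking = False  # flips to True on the first release

    def acquire(self, resource):
        if self.shrinking:
            raise RuntimeError("2PL violation: acquiring a lock after the first release")
        self.held.add(resource)

    def release(self, resource):
        self.shrinking = True   # the transaction has entered its shrinking phase
        self.held.discard(resource)
```

A transaction may acquire `a` and `b`, but after releasing `a` an attempt to acquire `c` is rejected.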


Although locks help us solve the problems caused by concurrent transaction execution, two-phase locking introduces another serious problem: deadlock. Deadlocks arise when different transactions wait for resources each other has locked. Here is a simple example:


The two transactions first acquire locks on the draven and beacon resources respectively, then each requests the lock the other already holds; at that point a deadlock occurs. Neither side can ever see the lock released: without a deadlock-handling mechanism they would wait indefinitely, and neither transaction could complete.

Deadlock handling

Deadlock is also a common occurrence in multi-threaded programming; whenever multiple threads compete for resources, you must consider whether the threads or transactions involved can deadlock. Broadly speaking, there are two ways to deal with deadlock: eliminate it at the source so it can never occur, or allow the system to enter a deadlock state but detect it and recover promptly when it happens.


Deadlock prevention

There are two ways to prevent deadlock. The first is to ensure that waits between transactions never form a cycle, i.e. the waits-for graph remains a directed acyclic graph with no circular waits. The second is to lock all the resources a transaction wants atomically at the start of the transaction: either every resource is locked, or none is.

The second approach has two problems, however: it is hard to determine at the start of a transaction which resources need to be locked, and because data that will only be used much later is locked in advance, both data utilization and transaction concurrency suffer. One remedy is to lock all data rows in a fixed order, combined with the 2PL protocol, so that during the growing phase rows are always locked in ascending order; but this still requires the transaction to know its lock set in advance.
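The ordered-locking remedy is easy to illustrate with plain threads: if every transaction acquires its locks in the same global order (here, ascending by name), no circular wait can form. This is a minimal sketch using Python's `threading` locks with invented resource names, not database code:

```python
import threading

# One lock per "row"; the example reuses the draven/beacon resources from the text.
locks = {name: threading.Lock() for name in ("beacon", "draven")}

def run_transaction(wanted):
    # Sorting gives every transaction the same global acquisition order,
    # so a circular wait -- and hence a deadlock -- cannot arise.
    ordered = sorted(wanted)
    for name in ordered:
        locks[name].acquire()
    try:
        pass  # ... do work on the locked rows ...
    finally:
        for name in reversed(ordered):
            locks[name].release()

# Both transactions want the same two rows, requested in opposite orders;
# with unsorted acquisition this is exactly the deadlock scenario above.
t1 = threading.Thread(target=run_transaction, args=({"draven", "beacon"},))
t2 = threading.Thread(target=run_transaction, args=({"beacon", "draven"},))
t1.start(); t2.start()
t1.join(); t2.join()
```

Both threads finish and every lock ends up released, which is precisely what the unordered version cannot guarantee.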

Another way to prevent deadlock is preemption plus transaction rollback. When a transaction starts, it is assigned a timestamp, and the database program decides, based on the timestamps, whether a transaction should wait or roll back. Two mechanisms are available; the first is wait-die:


If the requesting transaction's timestamp is smaller than the other transaction's, i.e. transaction A started before B, then A waits for the other transaction to release the lock on the resource; otherwise A rolls back while keeping its current timestamp, and retries later.

The other mechanism, called wound-wait, is a preemption scheme whose outcome is exactly the opposite of wait-die: if the current transaction started before another transaction and requests that transaction's resources, the other transaction rolls back immediately and yields the resource to the older transaction; otherwise the requester waits for the resource to be released:
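The two decision rules are tiny. The functions below are a hypothetical sketch (names invented) in which a smaller timestamp means an older transaction:

```python
def wait_die(requester_ts, holder_ts):
    """Wait-die: an older requester waits; a younger requester dies (rolls back)."""
    return "wait" if requester_ts < holder_ts else "rollback"

def wound_wait(requester_ts, holder_ts):
    """Wound-wait: an older requester wounds (preempts) the younger holder;
    a younger requester simply waits."""
    return "preempt-holder" if requester_ts < holder_ts else "wait"
```

Note the symmetry: in both schemes the older transaction eventually gets the resource, which is what rules out circular waits.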


Both mechanisms cause unnecessary rollbacks, which carries some performance cost. A simpler way to resolve deadlock is to use a timeout, but the timeout value must be chosen carefully: too long, and deadlocks that need resolving are not discovered in time; too short, and long-running transactions can never execute normally. Its usefulness is therefore somewhat limited.

Deadlock detection and recovery

If the database program cannot guarantee by protocol that deadlocks never occur, it must detect them promptly and recover from the deadlock state to a normal one so that it can keep working. To resolve deadlocks by detection and recovery, the database program needs to maintain reference information between data and transactions, provide an algorithm for deciding whether the database has entered a deadlock state, and supply an appropriate strategy for timely recovery when a deadlock occurs.

As mentioned in the previous section, deadlock can be detected with a directed waits-for graph: if one transaction depends on data being processed by another, the first waits for the second to finish, and that wait is an edge in the graph:


As shown in the figure above, if a cycle appears in this directed graph, e.g. TransB -> TransE -> TransF -> TransD -> TransB, the database has entered a deadlock state, and the deadlock recovery mechanism must step in.
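Detecting deadlock on the waits-for graph is ordinary cycle detection; a depth-first search like this sketch is enough (the graph is represented as a dict of adjacency sets, and the representation is an assumption of the example):

```python
def has_deadlock(waits_for):
    """Detect a cycle in a waits-for graph {txn: set of txns it waits on}."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current DFS path / finished
    color = {t: WHITE for t in waits_for}

    def visit(t):
        color[t] = GRAY
        for u in waits_for.get(t, ()):
            if color.get(u, WHITE) == GRAY:   # back edge: cycle found
                return True
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in waits_for)
```

Feeding it the cycle from the figure (TransB -> TransE -> TransF -> TransD -> TransB) reports a deadlock, while an acyclic graph does not.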

Recovering from a deadlock is actually quite simple: the most common solution is to pick one transaction in the cycle and roll it back, breaking the cycle in the waits-for graph. Three things need to be considered during recovery:


Each deadlock actually affects several transactions, and we must choose which one to roll back. The golden rule when choosing a victim is to minimize cost, so we weigh factors such as how long the transaction has already run, which data rows it has used, and how many other transactions are involved. Once the victim is chosen, the rollback can begin, and there are two options: total rollback and partial rollback. Partial rollback rolls the transaction back to an earlier checkpoint; if there is no checkpoint, partial rollback is naturally impossible.

During deadlock recovery, some transaction may be chosen as the victim in deadlock after deadlock and never execute successfully, causing starvation. We need to guarantee that every transaction executes within a bounded time, so timestamps must also be taken into account when selecting victims.

Lock granularity

So far we have not discussed locks of different granularities; everything has been about row locks. But sometimes we want to treat multiple records as one data unit and lock that unit, a table, or even the whole database directly. Achieving this requires defining locks of different granularities in the database:


With locks of different granularities, a transaction that wants to lock the whole database or a whole table only needs to lock the corresponding node, placing an explicit lock on that node and implicit locks on all of its children. Although such multigranularity locks let us determine that child nodes cannot be locked while the parent node is locked, we still have no way to tell immediately that a parent node must not be locked when one of its children is already locked.

This is where intention locks come in. Before locking a child node, a transaction first places the corresponding intention lock on all of its parent nodes. Intention locks are not mutually exclusive with one another at all; they exist only to help a parent node quickly determine whether it can be locked:


The figure above shows the compatibility relationships after introducing the two intention locks, the intention shared lock (IS) and the intention exclusive lock (IX). With locks and intention locks at different granularities, we improve the throughput of the database.
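The standard multigranularity compatibility matrix can be encoded as a table; this sketch (function names invented) shows how a table-level request is checked against locks already held on the node:

```python
# Standard multigranularity lock compatibility:
# IS = intention shared, IX = intention exclusive, S = shared, X = exclusive.
COMPATIBLE = {
    ("IS", "IS"): True,  ("IS", "IX"): True,  ("IS", "S"): True,  ("IS", "X"): False,
    ("IX", "IS"): True,  ("IX", "IX"): True,  ("IX", "S"): False, ("IX", "X"): False,
    ("S",  "IS"): True,  ("S",  "IX"): False, ("S",  "S"): True,  ("S",  "X"): False,
    ("X",  "IS"): False, ("X",  "IX"): False, ("X",  "S"): False, ("X",  "X"): False,
}

def can_lock(requested, held_on_node):
    """A request on a node is grantable only if it is compatible with
    every lock already held on that node."""
    return all(COMPATIBLE[(requested, h)] for h in held_on_node)
```

For example, a reader's table-level IS lock coexists with another transaction's IX, but a table-level X is refused as soon as any intention lock is present, which is exactly the quick parent-node check intention locks provide.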

Optimistic concurrency control

Besides locks, the pessimistic concurrency control mechanism, there is another family of mechanisms: optimistic concurrency control (Optimistic Concurrency Control). Optimistic concurrency control is also called optimistic locking, but it is not a real lock; many people mistake it for one, when it is really just an approach to concurrency control.


In this section we first introduce the timestamp-based concurrency control protocol, then extend it to arrive at an optimistic concurrency control mechanism.

Timestamp-based protocol

Lock-based protocols execute transactions in the order in which they request the same data items: because the data a later transaction wants is already locked by an earlier one, it can only wait for the lock to be released, so the execution order under a lock-based protocol is tied to the order of lock acquisition. The timestamp-based protocol introduced here instead determines the execution order of transactions before they run.

Every transaction carries a globally unique timestamp, which may come from the system clock or from a counter, as long as all timestamps are unique and increase monotonically over time.


The timestamp-based protocol guarantees that the effect of executing transactions in parallel is exactly the same as executing them serially in timestamp order. Each data item carries two timestamps, a read timestamp and a write timestamp, recording the timestamp of the latest transaction that successfully performed the corresponding operation.

The protocol ensures that all conflicting read and write operations execute serially in timestamp order; when performing an operation, a transaction does not need to pay attention to other transactions, only to the timestamps on the data item:


For both reads and writes, the transaction's timestamp is compared against the item's read and write timestamps; if the check fails because the transaction's timestamp is too small, the operation is rejected outright and the transaction rolled back. The database system then assigns the rolled-back transaction a new timestamp and re-executes it.
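The read and write rules can be written down directly. This is a sketch of the textbook timestamp-ordering checks with invented names, not any real engine's code:

```python
class DataItem:
    """One data item with the two timestamps the protocol maintains."""
    def __init__(self):
        self.read_ts = 0    # largest timestamp of any transaction that read the item
        self.write_ts = 0   # timestamp of the transaction that wrote the current value
        self.value = None

def ts_read(item, txn_ts):
    # A transaction may not read a value written by a younger transaction.
    if txn_ts < item.write_ts:
        return "rollback"
    item.read_ts = max(item.read_ts, txn_ts)
    return item.value

def ts_write(item, txn_ts, value):
    # A write is rejected if it would invalidate a younger transaction's
    # read, or be overwritten-out-of-order by a younger write.
    if txn_ts < item.read_ts or txn_ts < item.write_ts:
        return "rollback"
    item.value, item.write_ts = value, txn_ts
    return "ok"
```

For example, after a transaction with timestamp 5 writes an item, a reader with timestamp 3 is rolled back, while a reader with timestamp 7 succeeds and raises the read timestamp, which in turn forces a writer with timestamp 6 to roll back.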

Validation-based protocol

Optimistic concurrency control is, at its core, a validation-based protocol. In most applications, read-only transactions make up the vast majority, and the chance that transactions conflict through write operations is very small; this means most transactions run perfectly well without any concurrency control mechanism while still preserving database consistency. Since concurrency control adds considerable overhead to the entire database system, we can reduce that overhead through a different strategy.

The validation protocol is that strategy. It divides the execution of every transaction into two or three phases, depending on whether the transaction is read-only or performs updates:


During the read phase, the database executes all of the transaction's read and write operations, storing every written value in temporary variables without actually updating the database. The transaction then enters the validation phase, where the database program checks whether the pending changes are legal, i.e. whether other transactions updated the data during the read phase. If validation passes, the transaction proceeds to the write phase, and all changes held in temporary variables are written to the database; transactions that fail validation are aborted.

For optimistic concurrency control to work correctly, we need to know when the different phases of each transaction occur: the transaction's start time, the start of its validation phase, and the end of its write phase. With these three timestamps we can guarantee that conflicting transactions are never written to the database concurrently: once a transaction completes its validation phase it is written immediately, and any other transaction that read the same data is rolled back and re-executed.
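A minimal form of the validation check compares the validating transaction's read set against the write sets of transactions that committed after it started. The function below is a simplified sketch of that idea (the `committed` bookkeeping format is an assumption of the example):

```python
def validate(txn_start_ts, read_set, committed):
    """Validation-phase check (sketch).

    committed: list of (commit_ts, write_set) pairs for finished transactions.
    The transaction passes only if no transaction that committed after it
    started wrote any item the transaction read.
    """
    for commit_ts, write_set in committed:
        if commit_ts > txn_start_ts and write_set & read_set:
            return False  # a concurrent committer overwrote something we read
    return True
```

A transaction that started at timestamp 10 and read `x` fails validation if another transaction committed a write to `x` at timestamp 12, but passes if the concurrent commit touched only `y` or happened before it started.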

As an optimistic mechanism, it assumes that transactions will ultimately pass validation and execute successfully, whereas locking and timestamp-ordering protocols are pessimistic: they force transactions to wait or roll back whenever a conflict might occur, even in cases where no conflict between transactions is actually possible and no lock is needed.

Multi-version concurrency control

The concurrency control mechanisms introduced so far all resolve race conditions between transactions by delaying or aborting the transactions involved in order to guarantee serializability. Although both of the previous mechanisms do solve the serializability problem fundamentally, in real environments most database transactions are read-only, and read requests outnumber write requests many times over. If there were no concurrency control between writes and reads, the worst case would be that a read returns data that has already been overwritten, which is completely acceptable for many applications.


Under this premise, database systems introduce another concurrency control mechanism: multi-version concurrency control (Multiversion Concurrency Control, MVCC). Every write operation creates a new version of the data, and every read selects the most appropriate result among the existing versions and returns it directly. Conflicts between reads and writes no longer need attention; managing versions and selecting the right one quickly become the main problems MVCC has to solve.

MVCC does not stand in opposition to optimistic or pessimistic concurrency control; it combines well with either of them to increase transaction concurrency. The most popular SQL databases, MySQL and PostgreSQL, both implement MVCC, but because they build on pessimistic locking and optimistic concurrency control respectively, their MVCC implementations also differ.

MySQL and MVCC

MySQL implements the multiversion two-phase locking protocol (Multiversion 2PL), which combines the advantages of MVCC and 2PL. Each version of a data row carries a unique timestamp; when a read transaction arrives, the database program simply returns the version of the data with the largest timestamp.


Updates are slightly more complicated: the transaction first reads the latest version of the data to compute the updated value, then creates a new version whose timestamp is the largest version timestamp of the row plus 1.


Old versions are also removed based on timestamps: MySQL periodically purges the lowest-version data from the database to ensure that stale content does not pile up.
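The version chain described above can be modeled as an append-only list per row. The `VersionedRow` class below is an illustrative sketch of the read-latest / write-new-version / purge-oldest behavior, not InnoDB's actual representation:

```python
class VersionedRow:
    """Each update appends a new version; reads return the newest one."""

    def __init__(self, value, ts=1):
        self.versions = [(ts, value)]  # (write timestamp, value), ascending

    def read_latest(self):
        # Read transactions get the version with the largest timestamp.
        return self.versions[-1][1]

    def update(self, new_value):
        # A new version gets the row's largest version timestamp plus 1.
        latest_ts = self.versions[-1][0]
        self.versions.append((latest_ts + 1, new_value))

    def purge_oldest(self):
        # Periodic cleanup of obsolete versions, always keeping the newest.
        if len(self.versions) > 1:
            self.versions.pop(0)
```

After two updates a row holds three versions; reads see only the newest, and purging trims from the oldest end.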

PostgreSQL and MVCC

Unlike MySQL's pessimistic concurrency control, PostgreSQL uses optimistic concurrency control, which makes its combination of optimistic techniques with MVCC somewhat different; the resulting implementation is called multiversion timestamp ordering. Under this protocol, every transaction is assigned a unique timestamp before it executes, and every data item carries both a read and a write timestamp:


When a PostgreSQL transaction issues a read request, the database directly returns the latest version of the data, and the read is never blocked by any other operation. When a write executes, the transaction's timestamp must be greater than or equal to the row's read timestamp; otherwise the transaction is rolled back.
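The write rule reduces to a single comparison; this sketch (the function name is invented) captures it:

```python
def mvto_write(txn_ts, row_read_ts):
    """Multiversion timestamp ordering write rule (sketch): a transaction may
    create a new version only if no younger transaction has already read
    the row; otherwise it must roll back."""
    return "new-version" if txn_ts >= row_read_ts else "rollback"
```

A transaction with timestamp 10 can write a row last read at timestamp 8, but a transaction with timestamp 5 cannot: a younger reader has already observed the row.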

This MVCC implementation guarantees that read transactions never fail and never wait for locks to be released; for applications where read requests far outnumber writes, optimistic concurrency control plus MVCC greatly improves database performance. Although the protocol delivers a clear performance win in such situations, it also brings two problems: every read operation updates the read timestamp, causing two disk writes; and conflicts between transactions are resolved by rollback, so if conflicts are likely or rollbacks are expensive, read-write performance can be worse than the traditional lock-and-wait approach.

Summary

Database concurrency control has very mature and complete solutions today; we do not need to design new protocols ourselves to handle conflicts between transactions. Moreover, the knowledge behind these mechanisms, whether locks or optimistic concurrency control, is widely applied in other fields and applications, so it is well worth understanding how the different concurrency control mechanisms work.

Reference

This article is reproduced from https://draveness.me/database-concurrency-control

All original articles on this site follow the Attribution-NonCommercial-ShareAlike 4.0 License (CC BY-NC-SA 4.0). Please keep the following attribution when sharing or adapting:

Original author: Jake Tao. Source: "[Share] A brief discussion on database concurrency control – locks and MVCC"
