Multiversion Concurrency Control
From CS Wiki
Multiversion Concurrency Control (MVCC) is a concurrency control method used in database systems that allows multiple versions of a data item to exist simultaneously. It ensures consistent reads without locking and provides high concurrency by maintaining transaction isolation.
Key Concepts[edit | edit source]
- Data Versioning: MVCC creates a new version of a data item for every write operation, enabling transactions to access consistent snapshots.
- Timestamps: Each transaction and data version is associated with timestamps, which determine visibility and consistency.
- Read and Write Operations:
- Reads access the latest version of data visible to the transaction, avoiding conflicts.
- Writes create a new version of the data item without blocking other transactions.
How MVCC Works[edit | edit source]
MVCC operates based on the following principles:
- Each transaction is assigned a unique timestamp at the start.
- Data items maintain multiple versions, each with associated timestamps for creation and expiration.
- Read and write operations follow these rules:
- A transaction reads the most recent version of a data item whose timestamp is less than or equal to the transaction's start timestamp.
- A transaction creates a new version of a data item during writes, ensuring isolation.
Advantages[edit | edit source]
- Non-Blocking Reads: Readers do not block writers, and vice versa, improving concurrency.
- Consistent Snapshots: Transactions operate on consistent snapshots of data without interfering with other transactions.
- Deadlock-Free: MVCC avoids deadlocks by not requiring locks for reads.
Limitations[edit | edit source]
- Storage Overhead: Maintaining multiple versions of data increases storage requirements.
- Garbage Collection: Older versions of data must be periodically removed to free space.
- Write Amplification: Frequent writes can generate a large number of versions, impacting performance.
Example of MVCC[edit | edit source]
Consider a scenario with two transactions accessing a shared data item A:
Step | Transaction T1 | Transaction T2 | Explanation |
---|---|---|---|
1 | BEGIN TRANSACTION (TS=1) | BEGIN TRANSACTION (TS=2) | T1 and T2 start with timestamps 1 and 2, respectively. |
2 | READ(A) | - | T1 reads the current version of A (value=100, TS=0). |
3 | - | WRITE(A, value=200) | T2 writes a new version of A with TS=2. |
4 | READ(A) | - | T1 still sees the version of A with TS=0 (value=100), ensuring a consistent snapshot. |
5 | COMMIT | COMMIT | Both transactions commit successfully. |
In this example, T1 reads the old version of A while T2 writes a new version, demonstrating how MVCC prevents conflicts.
MVCC in Databases[edit | edit source]
MVCC is widely implemented in modern database systems:
- PostgreSQL: Provides MVCC for high-performance and consistent transactions.
- MySQL (InnoDB): Uses MVCC to support isolation levels like Repeatable Read and Read Committed.
- Oracle Database: Implements MVCC to manage consistent reads and undo data.
Variants of MVCC[edit | edit source]
- Snapshot Isolation:
- Transactions work with a snapshot of the database at their start time.
- Prevents dirty reads and non-repeatable reads but allows write skew.
- Serializable Snapshot Isolation (SSI):
- Extends snapshot isolation to detect and prevent write conflicts, ensuring serializability.