Multiversion Concurrency Control

From CS Wiki

Multiversion Concurrency Control (MVCC) is a concurrency control method used in database systems that allows multiple versions of a data item to exist simultaneously. It ensures consistent reads without locking and provides high concurrency by maintaining transaction isolation.

Key Concepts[edit | edit source]

  • Data Versioning: MVCC creates a new version of a data item for every write operation, enabling transactions to access consistent snapshots.
  • Timestamps: Each transaction and data version is associated with timestamps, which determine visibility and consistency.
  • Read and Write Operations:
    • Reads access the latest version of data visible to the transaction, avoiding conflicts.
    • Writes create a new version of the data item without blocking other transactions.

How MVCC Works[edit | edit source]

MVCC operates based on the following principles:

  1. Each transaction is assigned a unique timestamp at the start.
  2. Data items maintain multiple versions, each with associated timestamps for creation and expiration.
  3. Read and write operations follow these rules:
    • A transaction reads the most recent version of a data item whose timestamp is less than or equal to the transaction's start timestamp.
    • A transaction creates a new version of a data item during writes, ensuring isolation.

Advantages[edit | edit source]

  • Non-Blocking Reads: Readers do not block writers, and vice versa, improving concurrency.
  • Consistent Snapshots: Transactions operate on consistent snapshots of data without interfering with other transactions.
  • Deadlock-Free: MVCC avoids deadlocks by not requiring locks for reads.

Limitations[edit | edit source]

  • Storage Overhead: Maintaining multiple versions of data increases storage requirements.
  • Garbage Collection: Older versions of data must be periodically removed to free space.
  • Write Amplification: Frequent writes can generate a large number of versions, impacting performance.

Example of MVCC[edit | edit source]

Consider a scenario with two transactions accessing a shared data item A:

Step Transaction T1 Transaction T2 Explanation
1 BEGIN TRANSACTION (TS=1) BEGIN TRANSACTION (TS=2) T1 and T2 start with timestamps 1 and 2, respectively.
2 READ(A) - T1 reads the current version of A (value=100, TS=0).
3 - WRITE(A, value=200) T2 writes a new version of A with TS=2.
4 READ(A) - T1 still sees the version of A with TS=0 (value=100), ensuring a consistent snapshot.
5 COMMIT COMMIT Both transactions commit successfully.

In this example, T1 reads the old version of A while T2 writes a new version, demonstrating how MVCC prevents conflicts.

MVCC in Databases[edit | edit source]

MVCC is widely implemented in modern database systems:

  • PostgreSQL: Provides MVCC for high-performance and consistent transactions.
  • MySQL (InnoDB): Uses MVCC to support isolation levels like Repeatable Read and Read Committed.
  • Oracle Database: Implements MVCC to manage consistent reads and undo data.

Variants of MVCC[edit | edit source]

  • Snapshot Isolation:
    • Transactions work with a snapshot of the database at their start time.
    • Prevents dirty reads and non-repeatable reads but allows write skew.
  • Serializable Snapshot Isolation (SSI):
    • Extends snapshot isolation to detect and prevent write conflicts, ensuring serializability.

See Also[edit | edit source]