Transactions & Isolation Levels

Most database operations do not happen in isolation — multiple clients read and write concurrently. Transactions are the mechanism that keeps concurrent operations from corrupting each other, and isolation levels are the knob that controls exactly how much protection you get versus how much performance you sacrifice.

What is a Transaction

A transaction is a sequence of one or more SQL statements that the database treats as a single unit of work. It either commits in full or rolls back entirely — there is no in-between state visible to other connections.

BEGIN;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

COMMIT;   -- both updates land, or neither does

If the database crashes after the first UPDATE but before COMMIT, the transaction is rolled back automatically on recovery. The balance is never debited without being credited.

ACID

ACID is the set of four guarantees that define what it means for a transaction to be safe.

  1. Atomicity — a transaction either fully completes or fully rolls back. No partial writes. If the bank transfer debits one account but crashes before crediting the other, the debit is automatically undone.
  2. Consistency — a transaction takes the database from one valid state to another, never violating defined constraints (foreign keys, unique constraints, check constraints). An inconsistent write is rejected and the transaction is aborted.
  3. Isolation — concurrent transactions execute as if they were sequential. One transaction cannot see another's uncommitted changes. The exact behaviour is configurable via isolation levels — covered below.
  4. Durability — once committed, a transaction survives crashes. Data is written to persistent storage via a Write-Ahead Log (WAL) before the commit is acknowledged to the client.

Write-Ahead Log (WAL)

Durability is implemented by writing every change to an append-only log on disk beforeapplying it to the data pages. On crash recovery, the database replays the log to restore committed transactions and discard uncommitted ones. WAL also powers replication — replicas tail the primary's log and apply the same changes.

BASE

Most NoSQL databases trade ACID for BASE semantics in exchange for horizontal scalability and lower write latency:

  1. Basically Available — the system responds to every request, even if some nodes are down or returning stale data.
  2. Soft state — the state of the system may change over time without new input, as replicas converge toward consistency.
  3. Eventually consistent — given no new updates, all replicas will eventually reach the same state.

BASE is a deliberate trade-off, not a failure mode. Accept stale reads in exchange for lower write latency and higher availability under network partition. The right choice depends on whether your use case can tolerate temporary inconsistency.

Concurrency Anomalies

When multiple transactions run simultaneously, they can interfere with each other in well-defined ways. These are the anomalies that isolation levels are designed to prevent:

Dirty read

Transaction A reads data that Transaction B has written but not yet committed. If B rolls back, A has read a value that never existed.

T1: UPDATE accounts SET balance = 0 WHERE id = 1; T2: SELECT balance FROM accounts WHERE id = 1; -- reads 0 (uncommitted) T1: ROLLBACK; -- balance was never actually 0

Non-repeatable read

Transaction A reads the same row twice and gets different values because Transaction B committed a change between the two reads.

T1: SELECT price FROM products WHERE id = 5; -- returns 100 T2: UPDATE products SET price = 120 WHERE id = 5; COMMIT; T1: SELECT price FROM products WHERE id = 5; -- returns 120 ← changed

Phantom read

Transaction A runs a range query twice and gets a different set of rows because Transaction B inserted or deleted rows that fall within that range between the two reads.

T1: SELECT * FROM orders WHERE total > 100; -- returns 5 rows T2: INSERT INTO orders (total) VALUES (150); COMMIT; T1: SELECT * FROM orders WHERE total > 100; -- returns 6 rows ← phantom

Lost update

Two transactions read a value, compute a new value based on it, and both write back — the second write silently overwrites the first.

T1: SELECT stock FROM products WHERE id = 1; -- reads 10 T2: SELECT stock FROM products WHERE id = 1; -- reads 10 T1: UPDATE products SET stock = 9 WHERE id = 1; COMMIT; T2: UPDATE products SET stock = 9 WHERE id = 1; COMMIT; -- T1's update lost

Isolation Levels

SQL defines four standard isolation levels. Higher isolation prevents more anomalies but increases lock contention and reduces concurrency. Most applications run at Read Committed (PostgreSQL default) or Repeatable Read (MySQL InnoDB default).

LevelDirty ReadNon-repeatable ReadPhantom Read
READ UNCOMMITTEDPossiblePossiblePossible
READ COMMITTEDPreventedPossiblePossible
REPEATABLE READPreventedPreventedPossible*
SERIALIZABLEPreventedPreventedPrevented

* MySQL InnoDB prevents phantom reads at REPEATABLE READ using gap locks. PostgreSQL does not at this level.

READ UNCOMMITTED

No isolation — transactions can read each other's uncommitted changes. Dirty reads are possible. Almost never used in practice. Only useful for approximate analytics on large tables where a slight inconsistency is acceptable and locking overhead must be avoided.

READ COMMITTED

The PostgreSQL default. A query only sees data committed before the query began. Dirty reads are impossible. Non-repeatable reads are possible — the same row read twice in the same transaction may return different values if another transaction committed between the two reads. Suitable for most OLTP workloads.

REPEATABLE READ

The MySQL InnoDB default. A transaction operates on a consistent snapshot taken at the start of the transaction. Reading the same row twice always returns the same value — even if another transaction commits a change in between. Phantom reads may still occur (new rows appearing in a range scan). Use when you need to read a row multiple times and act on a consistent value.

SERIALIZABLE

The strongest level. Transactions behave as if they executed one at a time, in serial order. All three anomalies are prevented. Implemented via range locks or optimistic concurrency (SSI in PostgreSQL). Significantly reduces throughput under high concurrency. Reserve for transactions where correctness is critical and contention is acceptable — financial transfers, inventory deductions, ticket booking.

Deadlocks

A deadlock occurs when two or more transactions each hold a lock that the other needs, creating a circular wait that neither can break on its own.

T1: locks row A, waits for row B T2: locks row B, waits for row A → neither can proceed

Detection and resolution

Databases detect deadlocks automatically by checking for cycles in the wait-for graph. When a deadlock is detected, the database picks one transaction as the victim, rolls it back, and returns an error to the application. The surviving transaction can then acquire the released lock and proceed. The application should catch the deadlock error and retry the transaction.

Prevention strategies

  1. Consistent lock ordering — always acquire locks in the same order across all transactions. If every transaction locks row A before row B, the circular wait cannot form.
  2. Keep transactions short — the longer a transaction holds locks, the higher the chance of collision. Do the minimum inside a transaction; move logic that does not need the lock outside it.
  3. SELECT FOR UPDATE — acquire a write lock explicitly at read time when you know you will update the row. This prevents the read-modify-write race that often causes lost updates and deadlocks.
  4. Optimistic concurrency — instead of locking, include a version column in the WHERE clause of the UPDATE. If another transaction modified the row first, the UPDATE affects 0 rows and the application retries.