A few days ago I read Joe Nelson’s post about DB’s transactions anomalies and Postgres’s isolation levels to prevent them.
I found it very useful and descriptive, so I decided to make a short summary of the information he showed there for those who tl;dr.
|Anomaly||Description||Serializable Level||Snapshot Isolation Level||Read Committed Level|
|Dirty Writes||Transaction T1 modifies an item, T2 further modifies it before T1 commits or rolls back.||Not possible||Not possible||Not possible|
|Dirty Reads||A transaction reads data written by a concurrent uncommitted transaction.||Not possible||Not possible||Not possible|
|Read Skew||A transaction re-reads data it has previously read and finds that data has been modified by another transaction. Requires two reads.||Not possible||Not possible||Possible|
|Phantom Reads||The same as Read Skew but reads a collection instead of just one row.||Not possible||Not possible||Possible|
|Lost Update||T1 reads an item. T2 updates it. T1 updates it based on what it read than commits. T2’s update has been lost.||Not possible||Not possible||Possible|
|Write Skew||Two concurrent transactions each determine what they are writing based on reading a data set which overlaps what the other is writing.||Not possible||Possible||Possible|
|Read-Only Serialization||T2 reads a collection. T1 reads it and then updates it. T2 reads it based on previously read version. T3 reads it after T1 committed.||Almost impossible (less than 0.01%)||Possible||Possible|
So you can use these three isolation levels in Postgres: Serializable, Snapshot isolation, Read Commited. Here are reasons why you can choose one of those.
- Serializable. You can be sure that no anomaly will affect your transactions because it’s the most restrictive level. Basically no anomaly except Read-Only Serialization one, but even this one can occur very rarely. Keep in mind that it slows down your transactions because you need to put and release collection locks. It’s the default one and definitely should be used when you work with the money.
- Snapshot isolation level. Aka Repeatable Read level. It will work faster because it will not check changes of concurrent transactions before it commits. It’s still proper isolation level, but your app should retry write transaction in case if write skew appeared. But it’s still safe for read transactions.
- Read committed level. Default isolation level for PostgreSQL. Suitable for the most usual use cases, but you should consider another isolation level if you work with complex transactions. Here is the example from the PostgreSQL wiki:
BEGIN; UPDATE website SET hits = hits + 1; -- run from another session: DELETE FROM website WHERE hits = 10; COMMIT;
In this example delete transaction will not see the websites which had 9 before the update transaction started to perform.
So if you see that in your app you have some concurrency phenomena, consider looking deeper in DB’s isolation level docs and choosing another level.
Here are posts I used and suggest you to read to understand this topic deeper:
- Original post about concurrency phenomena and DB isolation levels.
- Official docs of PostgreSQL.
- Analyze of Read-Only Serialization anomaly.
- Repeatable-Read isolation level in MySQL.