Recently, while searching for persistence solutions, I came across the issue of
concurrency control. To quote Wikipedia:
In computer science, especially in the fields of computer programming ... and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.
The notion of "correct results" can be defined in many different ways; the strictest of these is
Serializability. To quote Wikipedia again:
In concurrency control of databases, ..., a transaction schedule (history) is serializable, has the Serializability property, if its outcome (the resulting database state, the values of the database's data) is equal to the outcome of its transactions executed serially, i.e., sequentially without overlapping in time.
Well, I had used RDBMS transactions before and vaguely remembered the concept of
transaction isolation levels. The SQL standard specifies four different isolation levels: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ and SERIALIZABLE.
Ah, I connected the dots: the SERIALIZABLE isolation level must guarantee Serializability as defined above! Then, to my utter surprise, I came across the following:
However, many databases (e.g. Oracle, PostgreSQL) do not guarantee that there is some serial ordering of the transactions that will result in the same outcome, instead implementing snapshot isolation. This explicitly contradicts the ANSI/ISO SQL 99 standard (see ISO/IEC 9075-2:1999(E), page 83). Other databases (MS SQL Server) support both serializable and snapshot isolation modes.
WTF?! I had to dig further! So I started to read about
Snapshot isolation, curious about which elements of Serializability it actually gives up. The Wikipedia article linked above has a very nice example which lets us appreciate the difference between the two:
A transaction executing under snapshot isolation appears to operate on a personal snapshot of the database, taken at the start of the transaction. When the transaction concludes, it will successfully commit only if the values updated by the transaction have not been changed externally since the snapshot was taken. Such a write-write conflict will cause the transaction to abort.
In a write skew anomaly, two transactions (T1 and T2) concurrently read an overlapping data set (e.g. values V1 and V2), concurrently make disjoint updates (e.g. T1 updates V1, T2 updates V2), and finally concurrently commit, neither having seen the update performed by the other. Were the system serializable, such an anomaly would be impossible, as either T1 or T2 would have to occur "first", and be visible to the other. In contrast, snapshot isolation permits write skew anomalies.
As a concrete example, imagine V1 and V2 are two balances held by a single person, Phil. The bank will allow either V1 or V2 to run a deficit, provided the total held in both is never negative (i.e. V1 + V2 ≥ 0). Both balances are currently $100. Phil initiates two transactions concurrently, T1 withdrawing $200 from V1, and T2 withdrawing $200 from V2.
If the database guaranteed serializable transactions, the simplest way of coding T1 is to deduct $200 from V1, and then verify that V1 + V2 ≥ 0 still holds, aborting if not. T2 similarly deducts $200 from V2 and then verifies V1 + V2 ≥ 0. Since the transactions must serialize, either T1 happens first, leaving V1 = -$100, V2 = $100, and preventing T2 from succeeding (since V1 + (V2 - $200) is now -$200), or T2 happens first and similarly prevents T1 from committing.
Under snapshot isolation, however, T1 and T2 operate on private snapshots of the database: each deducts $200 from an account, and then verifies that the new total is zero, using the other account value that held when the snapshot was taken. Since neither update conflicts, both commit successfully, leaving V1 = V2 = -$100, and V1 + V2 = -$200.
If built on multiversion concurrency control, snapshot isolation allows transactions to proceed without worrying about concurrent operations, and more importantly without needing to re-verify all read operations when the transaction finally commits. The only information that must be stored during the transaction is a list of updates made, which can be scanned for conflicts fairly easily before being committed.
Well, I had to reproduce that, didn't I? Of course, you say? I hastily downloaded and installed PostgreSQL. I had used MSSQL and Sybase before, but not PostgreSQL; still, they are mostly the same, right? I created a table accounts with columns id and amount, and inserted the rows [1,100] and [2,100].
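Something along these lines (the exact column types don't matter much; plain integers will do):

-- two accounts, both starting at 100
CREATE TABLE accounts (
    id     integer PRIMARY KEY,
    amount integer NOT NULL
);

INSERT INTO accounts (id, amount) VALUES (1, 100), (2, 100);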
To my surprise and annoyance, PostgreSQL does not allow IF in plain SQL statements (can someone explain to me why?), so I had to create a separate function for it. It took me some time to come up with the following SQL:
CREATE OR REPLACE FUNCTION test (id1 integer, id2 integer)
RETURNS boolean AS $$
BEGIN
    -- withdraw 200 from account id1, but only if the combined balance
    -- of the two accounts stays non-negative afterwards
    IF (SELECT SUM(amount) FROM accounts WHERE id IN (id1,id2)) - 200 >= 0 THEN
        UPDATE accounts SET amount=amount-200 WHERE id=id1;
        RETURN true;
    END IF;
    RETURN false;
END;
$$ LANGUAGE plpgsql;
Great, first step done. The next step was a trivial matter of executing these two queries -
QUERY 1:
START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM test(1,2);
END TRANSACTION;
QUERY 2:
START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM test(2,1);
END TRANSACTION;
concurrently, so that the accounts ended up in the inconsistent state [-100,-100]. Since I'm not Chuck Norris and couldn't come up with a direct way of accomplishing this, I had to write some code. I hadn't used raw JDBC before, only an ORM (iBATIS), but the JDBC interfaces are simple enough and I came up with the following:
http://gist.github.com/556599
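The gist is the real thing; boiled down, the idea is something like this (class and variable names here are mine, and the dummy "SELECT 1" plus the barrier are only there to make sure both transactions take their snapshot before either withdrawal runs):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.concurrent.CyclicBarrier;

// Two threads, each running one of the queries above in its own SERIALIZABLE
// transaction. The first statement fixes each transaction's snapshot, and the
// barrier guarantees both snapshots are taken before either withdrawal runs.
public class WriteSkew {
    static final String URL = "jdbc:postgresql://localhost/test";   // adjust to your setup
    static final CyclicBarrier barrier = new CyclicBarrier(2);

    static Thread worker(final String call) {
        return new Thread(new Runnable() {
            public void run() {
                try {
                    Connection con = DriverManager.getConnection(URL, "postgres", "postgres");
                    con.setAutoCommit(false);
                    con.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
                    Statement st = con.createStatement();
                    st.execute("SELECT 1");    // snapshot is taken at the first statement
                    barrier.await();           // both transactions now see [100,100]
                    st.execute(call);
                    con.commit();
                    con.close();
                } catch (Exception e) {
                    System.out.println(call + " gave " + e.getMessage());
                }
            }
        });
    }

    public static void main(String[] args) throws Exception {
        Class.forName("org.postgresql.Driver");
        Thread t1 = worker("SELECT * FROM test(1,2)");
        Thread t2 = worker("SELECT * FROM test(2,1)");
        t1.start(); t2.start();
        t1.join(); t2.join();

        // read back the balances on a fresh connection
        Connection con = DriverManager.getConnection(URL, "postgres", "postgres");
        ResultSet rs = con.createStatement().executeQuery("SELECT amount FROM accounts ORDER BY id");
        StringBuilder amounts = new StringBuilder();
        while (rs.next()) {
            if (amounts.length() > 0) amounts.append(",");
            amounts.append(rs.getInt(1));
        }
        System.out.println("Amounts=[" + amounts + "]");
        con.close();
    }
}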
Executing this I got Amounts=[-100,-100]: I had successfully corrupted my data at isolation level SERIALIZABLE! I wasn't too happy with PostgreSQL for "eating away" my $200, but to be fair to them, they mention this in
their documentation. From my little reading, I found that the reasons most RDBMSes implement snapshot isolation instead of true serializability are threefold:
- The ANSI SQL-92 standard defines the isolation levels in terms of prohibited phenomena, and arguably does not strictly require TRANSACTION ISOLATION LEVEL SERIALIZABLE to provide true serializability (whether that was on purpose or an oversight, I don't know).
- Performance: snapshot isolation is easier to implement and performs better.
- Relatively few applications actually require truly serializable schedules.
Even so, one has to ask: why can't they (PostgreSQL, Oracle) use the term
TRANSACTION ISOLATION LEVEL SNAPSHOT (or something similar) instead of confusing people?
So OK, how do we actually solve this problem (without resorting to manual locks and the like)? One solution is to give the database the impression that we are modifying both rows instead of only one, so that the two transactions' write sets overlap and the conflict gets detected. So I added the following to the function test, right at the top of the body:
UPDATE accounts SET amount=amount WHERE id=id2;
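With that in place, the function looks like this:

CREATE OR REPLACE FUNCTION test (id1 integer, id2 integer)
RETURNS boolean AS $$
BEGIN
    -- dummy self-assignment: touch the row we otherwise only read, so the two
    -- concurrent transactions' write sets overlap
    UPDATE accounts SET amount=amount WHERE id=id2;
    IF (SELECT SUM(amount) FROM accounts WHERE id IN (id1,id2)) - 200 >= 0 THEN
        UPDATE accounts SET amount=amount-200 WHERE id=id1;
        RETURN true;
    END IF;
    RETURN false;
END;
$$ LANGUAGE plpgsql;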
Running with this gave me the fatal (but expected) error:
Deducting from account=1 gave ERROR: deadlock detected
Detail: Process 4292 waits for ShareLock on transaction 2882; blocked by process 7992.
Process 7992 waits for ShareLock on transaction 2883; blocked by process 4292.
Hint: See server log for query details.
Where: SQL statement "UPDATE accounts SET amount=amount-200 WHERE id=id1"
PL/pgSQL function "test" line 4 at SQL statement
Amounts=[100,-100]
OK, let's try:
UPDATE accounts SET amount=amount WHERE id IN (id1,id2)
And this time I got the (again expected) error:
Deducting from account=2 gave ERROR: could not serialize access due to concurrent update
Where: SQL statement "UPDATE accounts SET amount=amount WHERE id IN (id1,id2)"
PL/pgSQL function "test" line 2 at SQL statement
Amounts=[-100,100]
After such an error, PostgreSQL recommends that the application simply retry the transaction, so that will take care of it. Not the best solution, but it works and at least doesn't corrupt the data!
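On the JDBC side, a minimal retry wrapper could look something like this (the class and helper names are mine; 40001 is PostgreSQL's SQLSTATE for serialization failures and 40P01 the one for deadlocks):

import java.sql.Connection;
import java.sql.SQLException;

// Retry a SERIALIZABLE transaction until it commits, as PostgreSQL suggests.
// Assumes `con` already has autocommit off and the isolation level set.
class Retry {
    static void withRetry(Connection con, String call) throws SQLException {
        while (true) {
            try {
                con.createStatement().execute(call);
                con.commit();
                return;
            } catch (SQLException e) {
                con.rollback();                       // clean up the failed attempt
                String state = e.getSQLState();
                if (!"40001".equals(state) && !"40P01".equals(state)) {
                    throw e;                          // not a transient conflict
                }
                // serialization failure or deadlock: just try again
            }
        }
    }
}

On the retry, the function sees the other withdrawal's committed result, the balance check fails, and it simply returns false, so the invariant holds.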
Let me know in the comments if you have a better or easier solution!
Since MVCC and snapshot isolation are closely related, and Clojure exposes an MVCC-based STM, I wondered whether Clojure transactions have similar semantics. Again, I simply had to quench my thirst, didn't I?
So here
http://gist.github.com/556679 is a prototype of the above problem in Clojure.
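The gist wraps this up as tt/transaction-simul with :ensure1/:ensure2 options; stripped of the plumbing, the idea is roughly the following (names and the artificial delay are mine):

(def acc1 (ref 100))
(def acc2 (ref 100))

(defn withdraw!
  "Withdraw 200 from the ref `from` if the combined balance of `from` and
   `other` stays non-negative. `other` is read with a plain deref, so nothing
   stops it from changing under us before we commit."
  [from other]
  (dosync
    (let [total (+ @from @other)]
      (Thread/sleep 100)                 ; widen the window so the two transactions overlap
      (when (>= (- total 200) 0)
        (alter from - 200)))))

(defn run-simulation
  "Reset the balances, run the two withdrawals concurrently and print the result."
  [withdraw-fn]
  (dosync (ref-set acc1 100) (ref-set acc2 100))
  (time
    (let [t1 (future (withdraw-fn acc1 acc2))
          t2 (future (withdraw-fn acc2 acc1))]
      @t1 @t2
      (println "Amounts=" [@acc1 @acc2]))))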
Running it as below produced the same result: data corruption.
$ java -cp lib/clojure-1.2.0.jar\;src clojure.main -e "(require 'tt) (tt/transaction-simul {:ensure2 false :ensure1 false}) (System/exit 0)"
Amounts=[-100,-100]
"Elapsed time: 0.248209 msecs"
But Clojure has a wonderful function,
ensure, which does exactly what we need in this situation: it protects a ref from being modified behind our back while we are inside a transaction. We ensure the second account's ref while we are in the transaction that modifies the first account and, voila, we get the correct behavior we want: serialization!
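In terms of the sketch above, the only change is to read the other balance with ensure instead of a plain deref:

(defn safe-withdraw!
  "Like withdraw!, but reads `other` with ensure, so no other transaction can
   commit a change to it before this one commits."
  [from other]
  (dosync
    (let [total (+ @from (ensure other))]
      (Thread/sleep 100)
      (when (>= (- total 200) 0)
        (alter from - 200)))))

;; (run-simulation withdraw!)      ; both commit: Amounts= [-100 -100]
;; (run-simulation safe-withdraw!) ; one backs off, e.g.: Amounts= [-100 100]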
$ java -cp lib/clojure-1.2.0.jar\;src clojure.main -e "(require 'tt) (tt/transaction-simul {:ensure2 true :ensure1 false}) (System/exit 0)"
Amounts=[100,-100]
"Elapsed time: 7545.406736 msecs"
$ java -cp lib/clojure-1.2.0.jar\;src clojure.main -e "(require 'tt) (tt/transaction-simul {:ensure2 true :ensure1 true}) (System/exit 0)"
Amounts=[-100,100]
"Elapsed time: 10705.949317 msecs"
But not without a cost: look at the runtimes. I'm not yet sure why ensure bloats them so much, but at least it produces correct (serializable) results, without us having to worry about deadlocks or about re-running the transaction ourselves.
Comments welcome!