Deadlock bug

Поиск
Список
Период
Сортировка
От Joel Jacobson
Тема Deadlock bug
Дата
Msg-id AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
обсуждение исходный текст
Ответы Re: Deadlock bug  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
(Magnus and pghackers, I've included you in this email, since it appears to be PostgreSQL bug. The example below is general, and not specific to Glue Finance database model. Feel free to share it with anyone.)

I've just tried to replicate the deadlock in 8.4.4 and 9.0b4.
Same problem in both versions.

---- start of comments, specific to Glue Finance database ----

(1) Orders.SessionID is not really necessary, we only store it to log what session created which order. We never use this information, it is merely saved for logging purposes.
Dropping the foreign key...
    "orders_sessionid_fkey" FOREIGN KEY (sessionid) REFERENCES sessions(sessionid)
...would mean we risk data integrity problems if the session would be deleted (which it never is), even if it would be deleted, we wouldn't really care since it just for logging purposes.

(2) Setting Orders.Heartbeat to now() on each intended-to-be-most-of-the-times-read-only-until-something-happends-request (aka Get_Server_Request) is of course a huge performance hit, as it require a row exclusive lock, meaning such requests cannot be performed in parallell.
We will therefore remove the Orders.Heartbeat column entirely.

(3) Making sure Orders is always locked first, before obtaining the Sessions lock, would like you suggest also solve the problem, but requires a larger rewrite of probably a lot of functions.
Removing the foreign key means we don't have to rewrite the functions.

(4) Fix the PostgreSQL bug.

(1) would effectively solve the deadlock issue, but not the performance issue, we should therefore do (2) as well.
---- end of comments, specific to Glue Finance database ----

I think this clearly looks like a bug in PostgreSQL because of the following observations:

Below are comments to the screencast at http://screencast.com/t/NTk2Y2VhMW

The following example is not specific for Glue Finance database.
Attached, please find the text file with the queries and simple example schema.

1. Process 1 executes "UPDATE A SET Col1 = 1 WHERE AID = 1;".
We can see it obtains two RowExclusiveLocks on relations "a_pkey" and "a".
This is the expected result.

2. Process 2 then executes "UPDATE B SET Col2 = 1 WHERE BID = 2;".
We can see it obtains two RowExclusiveLocks on relations "b_pkey" and "b".
I don't know if this is expected, since the row in B references the row in A being updated by process 1.
Because of the foreign key, shouldn't some kind of share lock on A be obtained by process 2, or some other kind of lock?

3. Process 1 tries to execute "UPDATE B SET Col2 = 1 WHERE BID = 2;" and will of course have to wait, because process 2 already has a RowExclusiveLock on the same row in table B.

Process 1 is now waiting...

4. Now, in the other SQL prompt (process 2), we take a look at the vLocks view.
Unexpected observations:
a) both processes have been granted a RowExclusiveLock on table B. How can both be granted a RowExclusiveLock on the same table? Since the table only contains one row, it must be a lock on the same row, which should be impossible, right?
b) process 1 (which is currently waiting) has been granted a lock of type "tuple", page 0, tuple 1, mode "ExclusiveLock" on table B. I don't know what a "tuple" lock is, but what surprises me is process 1 being granted the lock, and not process 2 (since process 2 updated B before 1).

Now, while process 1 is waiting, let's execute the same query in process 2:

5. Process 2 tries to execute "UPDATE B SET Col2 = 1 WHERE BID = 2;" which is exactly the same query as in step 2 above.
Since process 2 already hold a granted RowExclusiveLock on the row in table B it tries to update, I think this query should be executed instantly without any problem. Instead, it causes a deadlock in process 2, allowing process 1 to commit. Very strange.

Could this have any other explanation than a bug (or perhaps feature) in postgres?

-- 
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: git: uh-oh
Следующее
От: Dave Page
Дата:
Сообщение: Re: git: uh-oh