Re: LISTEN/NOTIFY race condition?
От | Laurent Birtz |
---|---|
Тема | Re: LISTEN/NOTIFY race condition? |
Дата | |
Msg-id | 47D5ACE7.5030601@kryptiva.com обсуждение исходный текст |
Ответ на | Re: LISTEN/NOTIFY race condition? (Laurent Birtz <laurent.birtz@kryptiva.com>) |
Ответы |
Re: LISTEN/NOTIFY race condition?
(Tom Lane <tgl@sss.pgh.pa.us>)
|
Список | pgsql-bugs |
Hello, I've figured out the LISTEN / NOTIFY race condition I had previously encountered. It is not trivial, so I'll try to give as much details as possible to describe what is going on. Here is a description of the two Postgres functions involved in the race condition: void CommitTransaction() { ... /* NOTIFY commit must come before lower-level cleanup */ AtCommit_Notify(); ... /* * Here is where we really truly commit. */ latestXid = RecordTransactionCommit(); ... /* * This is all post-commit cleanup. Note that if an error is raised here, * it's too late to abort the transaction. This should be just * noncritical resource releasing. * * The ordering of operations is not entirely random. The idea is: * release resources visible to other backends (eg, files, buffer pins); * then release locks; then release backend-local resources. We want to * release locks at the point where any backend waiting for us will see * our transaction as being fully cleaned up. * * Resources that can be associated with individual queries are handled by * the ResourceOwner mechanism. The other calls here are for backend-wide * state. */ ResourceOwnerRelease(TopTransactionResourceOwner, RESOURCE_RELEASE_LOCKS, true, true); ... } void Async_Listen(const char *relname) { ... lRel = heap_open(ListenerRelationId, ExclusiveLock); ... /* * OK to insert a new tuple */ simple_heap_insert(lRel, tuple); ... heap_close(lRel, ExclusiveLock); ... } In summary, CommitTransaction() notifies FIRST and commits AFTER. Async_Listen() releases its lock BEFORE the end of its transaction. I've written a program that triggers the bug quickly. Here is what this program do: 1) The program deletes every row of the table 'the_log'. This table contains an AFTER-INSERT trigger that notifies processes listening on it. 2) The program starts a command thread and an event thread concurrently. 3) The command thread executes the following statements: a) BEGIN TRANSACTION b) INSERT INTO the_log ... c) COMMIT 4) The event thread executes 'LISTEN the_log'. 5) The event thread executes 'SELECT * FROM the_log'. 6) If the event thread found the event, we clean up and go back to 1). 7) The event thread waits for a notification. If the notification is received, we clean up and go back to 1). 8) We reproduced the bug. I've instrumented the Postgres source to show what is going on in the functions of interest. I print both the executing process PID and the current time in microseconds. My machine has only one CPU. Hence, I'm pretty sure the following trace accurately reflects what is happening when my program is executing. Note: PID 29295 is the backend for the command thread. PID 29296 is the backend for the event thread. Blank lines correspond to context switches. Command thread: executing |start transaction|. Event thread: executing |LISTEN the_log|. // The command backend is handling 'start transaction'. DEBUG: [PID 29295] [Time 497310]: BeginTransactionBlock() called for BEGIN Command thread: executing |insert into the_log...|. // The event backend is doing the listen. DEBUG: [PID 29296] [Time 499744]: StartTransaction() called DEBUG: Async_Listen(the_log,29296) DEBUG: [PID 29296] [Time 501046]: CommitTransaction() called DEBUG: [PID 29296] [Time 501163]: RecordTransactionCommit() called // The command backend is handling 'insert'. DEBUG: Async_Notify(the_log) DEBUG: [PID 29295] [Time 502624]: CommitTransactionCommand: incrementing command counter Command thread: executing |commit|. // The command backend is handling 'commit'. DEBUG: [PID 29295] [Time 503192]: CommitTransaction() called DEBUG: [PID 29295] AtCommit_Notify DEBUG: [PID 29295] AtCommit_Notify: done DEBUG: [PID 29295] [Time 507981]: RecordTransactionCommit() called // The event backend is still handling 'listen'. DEBUG: [PID 29296] [Time 508319]: CommitTransaction: about to release with RESOURCE_RELEASE_LOCKS. Event thread: executing |select * from the_log|. // The event backend is handling 'select'. DEBUG: [PID 29296] [Time 508754]: StartTransaction() called DEBUG: [PID 29296] [Time 510082]: CommitTransaction() called DEBUG: [PID 29296] [Time 510207]: RecordTransactionCommit() called DEBUG: [PID 29296] [Time 510326]: CommitTransaction: about to release with RESOURCE_RELEASE_LOCKS. // The command thread is still handling 'commit'. DEBUG: [PID 29295] [Time 510690]: CommitTransaction: about to release with RESOURCE_RELEASE_LOCKS. Analysis: The command backend begins a transaction, then the event backend begins a transaction for LISTEN. The event backend updates the pg_listener table in mutual exclusion, but it releases the mutex before the transaction is commited. The command thread does its INSERT then starts its COMMIT. The command backend scans the pg_listener table and doesn't find anyone to notify. Before the command backend has the time to actually commit, the event backend commits and the listen thread has the time to execute the SELECT statement. Since the command thread didn't commit yet, the SELECT returns nothing. Finally, the command thread commits, without notifying anyone. Essentially, the event thread won't receive a notification for the events commited when it executes the LISTEN statement, and it won't necessarily see them either if it executes a SELECT statement subsequently. I believe this greatly limits the usefulness of the LISTEN / NOTIFY mechanism. Solutions: There are two solutions I developed: 1) Don't use an AFTER-INSERT trigger. Execute NOTIFY in its own top-level transaction. Cons: - Bad for performance. - Annoying kludge. - It's flawed. If the command process crashes before it commits the NOTIFY, the event won't be noticed. 2) Fix the race condition in Postgres. In Async_Listen(): change 'heap_close(lRel, ExclusiveLock);' for 'heap_close(lRel, NoLock);'. I've tested this solution. It seems to work and I believe this is the best solution. I would appreciate receiving feedback on this issue. Thanks for your time, Laurent Birtz
В списке pgsql-bugs по дате отправления:
Предыдущее
От: Tom LaneДата:
Сообщение: Re: BUG #4023: The transaction control does not work for sequences