Re: LISTEN/NOTIFY race condition?

Поиск
Список
Период
Сортировка
От Laurent Birtz
Тема Re: LISTEN/NOTIFY race condition?
Дата
Msg-id 47D5ACE7.5030601@kryptiva.com
обсуждение исходный текст
Ответ на Re: LISTEN/NOTIFY race condition?  (Laurent Birtz <laurent.birtz@kryptiva.com>)
Ответы Re: LISTEN/NOTIFY race condition?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
Hello,

I've figured out the LISTEN / NOTIFY race condition I had previously
encountered. It is not trivial, so I'll try to give as much details as
possible to describe what is going on.


Here is a description of the two Postgres functions involved in the race
condition:

void CommitTransaction()
{
  ...

  /* NOTIFY commit must come before lower-level cleanup */
  AtCommit_Notify();

  ...

  /*
   * Here is where we really truly commit.
   */
  latestXid = RecordTransactionCommit();

  ...

  /*
   * This is all post-commit cleanup.  Note that if an error is raised here,
   * it's too late to abort the transaction.  This should be just
   * noncritical resource releasing.
   *
   * The ordering of operations is not entirely random.  The idea is:
   * release resources visible to other backends (eg, files, buffer pins);
   * then release locks; then release backend-local resources. We want to
   * release locks at the point where any backend waiting for us will see
   * our transaction as being fully cleaned up.
   *
   * Resources that can be associated with individual queries are handled by
   * the ResourceOwner mechanism.  The other calls here are for backend-wide
   * state.
   */
  ResourceOwnerRelease(TopTransactionResourceOwner,
                       RESOURCE_RELEASE_LOCKS,
                       true, true);
  ...
}

void Async_Listen(const char *relname)
{
  ...

  lRel = heap_open(ListenerRelationId, ExclusiveLock);

  ...

  /*
   * OK to insert a new tuple
   */

  simple_heap_insert(lRel, tuple);

  ...

  heap_close(lRel, ExclusiveLock);

  ...
}

In summary, CommitTransaction() notifies FIRST and commits AFTER.
Async_Listen() releases its lock BEFORE the end of its transaction.


I've written a program that triggers the bug quickly. Here is what this
program do:

1) The program deletes every row of the table 'the_log'. This table
   contains an AFTER-INSERT trigger that notifies processes listening on
   it.

2) The program starts a command thread and an event thread concurrently.

3) The command thread executes the following statements:
   a) BEGIN TRANSACTION
   b) INSERT INTO the_log ...
   c) COMMIT

4) The event thread executes 'LISTEN the_log'.

5) The event thread executes 'SELECT * FROM the_log'.

6) If the event thread found the event, we clean up and go back to 1).

7) The event thread waits for a notification. If the notification is
   received, we clean up and go back to 1).

8) We reproduced the bug.


I've instrumented the Postgres source to show what is going on in the
functions of interest. I print both the executing process PID and the
current time in microseconds. My machine has only one CPU. Hence, I'm
pretty sure the following trace accurately reflects what is happening
when my program is executing.

Note:
      PID 29295 is the backend for the command thread.
      PID 29296 is the backend for the event thread.
      Blank lines correspond to context switches.

Command thread: executing |start transaction|.

Event thread: executing |LISTEN the_log|.

// The command backend is handling 'start transaction'.
DEBUG:  [PID 29295] [Time 497310]:
    BeginTransactionBlock() called for BEGIN

Command thread: executing |insert into the_log...|.

// The event backend is doing the listen.
DEBUG:  [PID 29296] [Time 499744]: StartTransaction() called
DEBUG:  Async_Listen(the_log,29296)
DEBUG:  [PID 29296] [Time 501046]: CommitTransaction() called
DEBUG:  [PID 29296] [Time 501163]: RecordTransactionCommit() called

// The command backend is handling 'insert'.
DEBUG:  Async_Notify(the_log)
DEBUG:  [PID 29295] [Time 502624]:
    CommitTransactionCommand: incrementing command counter

Command thread: executing |commit|.

// The command backend is handling 'commit'.
DEBUG:  [PID 29295] [Time 503192]: CommitTransaction() called
DEBUG:  [PID 29295] AtCommit_Notify
DEBUG:  [PID 29295] AtCommit_Notify: done
DEBUG:  [PID 29295] [Time 507981]: RecordTransactionCommit() called

// The event backend is still handling 'listen'.
DEBUG:  [PID 29296] [Time 508319]:
    CommitTransaction: about to release with RESOURCE_RELEASE_LOCKS.

Event thread: executing |select * from the_log|.

// The event backend is handling 'select'.
DEBUG:  [PID 29296] [Time 508754]: StartTransaction() called
DEBUG:  [PID 29296] [Time 510082]: CommitTransaction() called
DEBUG:  [PID 29296] [Time 510207]: RecordTransactionCommit() called
DEBUG:  [PID 29296] [Time 510326]:
    CommitTransaction: about to release with RESOURCE_RELEASE_LOCKS.

// The command thread is still handling 'commit'.
DEBUG:  [PID 29295] [Time 510690]:
    CommitTransaction: about to release with RESOURCE_RELEASE_LOCKS.


Analysis:

The command backend begins a transaction, then the event backend begins a
transaction for LISTEN. The event backend updates the pg_listener table
in mutual exclusion, but it releases the mutex before the transaction is
commited. The command thread does its INSERT then starts its COMMIT. The
command backend scans the pg_listener table and doesn't find anyone to
notify. Before the command backend has the time to actually commit, the
event backend commits and the listen thread has the time to execute the
SELECT statement. Since the command thread didn't commit yet, the SELECT
returns nothing. Finally, the command thread commits, without notifying
anyone.

Essentially, the event thread won't receive a notification for the events
commited when it executes the LISTEN statement, and it won't necessarily
see them either if it executes a SELECT statement subsequently. I believe
this greatly limits the usefulness of the LISTEN / NOTIFY mechanism.


Solutions:

There are two solutions I developed:

1) Don't use an AFTER-INSERT trigger. Execute NOTIFY in its own
   top-level transaction.

  Cons:
  - Bad for performance.
  - Annoying kludge.
  - It's flawed. If the command process crashes before it
    commits the NOTIFY, the event won't be noticed.

2) Fix the race condition in Postgres.

   In Async_Listen(): change
   'heap_close(lRel, ExclusiveLock);' for 'heap_close(lRel, NoLock);'.

   I've tested this solution. It seems to work and I believe this is the
   best solution.


I would appreciate receiving feedback on this issue.

Thanks for your time,
Laurent Birtz

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: BUG #4023: The transaction control does not work for sequences
Следующее
От: Tom Lane
Дата:
Сообщение: Re: LISTEN/NOTIFY race condition?