Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)
Дата
Msg-id 534B96F9.6040602@vmware.com
обсуждение исходный текст
Ответ на Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Ответы Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: Problem with txid_snapshot_in/out() functionality)  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On 04/13/2014 11:39 PM, Heikki Linnakangas wrote:
> However, I just noticed that there's a race condition between PREPARE
> TRANSACTION and COMMIT/ROLLBACK PREPARED. PostPrepare_Locks runs after
> the prepared transaction is already marked as fully prepared. That means
> that by the time we get to PostPrepare_Locks, another backend might
> already have finished and removed the prepared transaction. That leads
> to a PANIC (put a breakpoint just before PostPrepare_Locks):
>
> postgres=# commit prepared 'foo';
> PANIC:  failed to re-find shared proclock object
> PANIC:  failed to re-find shared proclock object
> The connection to the server was lost. Attempting reset: Failed.
>
> FinishPrepareTransaction reads the list of locks from the two-phase
> state file, but PANICs when it doesn't find the corresponding locks in
> the lock manager (because PostPrepare_Locks hasn't transfered them to
> the dummy PGPROC yet).
>
> I think we'll need to transfer of the locks earlier, before the
> transaction is marked as fully prepared. I'll take a closer look at this
> tomorrow.

Here's a patch to do that. It's very straightforward, I just moved the
calls to transfer locks earlier, before ProcArrayClearTransaction.
PostPrepare_MultiXact had a similar race -  it also transfer state from
the old PGPROC entry to the new, and needs to be done before allowing
another backend to remove the new PGPROC entry. I changed the names of
the functions to distinguish them from the other PostPrepare_* functions
that now happen at a different time.

The patch is simple, but it's a bit scary to change the order of things
like this. Looking at all the calls that now happen after transferring
the locks, I believe this is OK. The change also applies to the
callbacks called by the RegisterXactCallback mechanism, which means that
in theory there might be a 3rd party extension out there that's
affected. All the callbacks in contrib and plpgsql are OK, and it's
questionable to do anything complicated that would depend on
heavy-weight locks to be held in those callbacks, so I think this is OK.
Warrants a note in the release notes, though.

- Heikki


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Christian Ullrich
Дата:
Сообщение: Re: PostgreSQL in Windows console and Ctrl-C
Следующее
От: Marko Kreen
Дата:
Сообщение: Re: Problem with txid_snapshot_in/out() functionality