Discussion: ERROR after writing PREPARE WAL record

ERROR after writing PREPARE WAL record

From
Asim R P
Date:
Hello

Cancel/terminate requests are held off during "PREPARE TRANSACTION" processing in function PrepareTransaction().  However, a subroutine invoked by PrepareTransaction() may perform elog(ERROR) or elog(FATAL).

If that happens after the PREPARE WAL record is written but before the transaction state is cleaned up, normal abort processing, i.e. AbortTransaction(), is triggered.  It is not correct to run the abort workflow against a transaction that is already marked as prepared.  A prepared transaction should only be finished with "COMMIT PREPARED" or "ROLLBACK PREPARED".

I tried injecting an elog(ERROR) at the end of EndPrepare() and that resulted in a PANIC at some point.

Before delving into more details, I want to ascertain that this is a valid problem to solve.  Is the above problem worth worrying about?

Asim
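
The failure mode described in the message above can be modeled with a small standalone C sketch. It is a toy, not PostgreSQL code: the toy_* functions, the FailPoint and Outcome enums, and the use of setjmp/longjmp to stand in for elog(ERROR) are all assumptions made for illustration. Mapping the inconsistent abort to a PANIC outcome is a simplification of the behavior reported above, not a description of the actual code path.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Toy model only: none of these names are real PostgreSQL functions. */
typedef enum { OUTCOME_PREPARED, OUTCOME_ABORTED, OUTCOME_PANIC } Outcome;
typedef enum { FAIL_NONE, FAIL_BEFORE_RECORD, FAIL_AFTER_RECORD } FailPoint;

static jmp_buf error_handler;        /* stand-in for the error recovery point */
static bool prepare_record_written;  /* stand-in for the PREPARE WAL record */

/* Stand-in for elog(ERROR): jump to the recovery point, never return. */
static void toy_error(void)
{
    longjmp(error_handler, 1);
}

/* Stand-in for abort processing. Aborting is only consistent while the
 * transaction has not yet been recorded as prepared. */
static Outcome toy_abort_transaction(void)
{
    if (prepare_record_written)
        return OUTCOME_PANIC;   /* assumed outcome: no clean recovery left */
    return OUTCOME_ABORTED;
}

/* Stand-in for the prepare path, with an optional injected error. */
static Outcome toy_prepare_transaction(FailPoint fail)
{
    prepare_record_written = false;

    if (setjmp(error_handler) != 0)
        return toy_abort_transaction();  /* reached only through toy_error() */

    if (fail == FAIL_BEFORE_RECORD)
        toy_error();                     /* still safe to abort */

    prepare_record_written = true;       /* the point of no return */

    if (fail == FAIL_AFTER_RECORD)
        toy_error();                     /* the window discussed above */

    assert(prepare_record_written);
    return OUTCOME_PREPARED;
}
```

Calling toy_prepare_transaction(FAIL_BEFORE_RECORD) ends in an ordinary abort, while toy_prepare_transaction(FAIL_AFTER_RECORD) reaches abort processing with the record already written, which is the inconsistent state the message asks about.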

Re: ERROR after writing PREPARE WAL record

From
Tom Lane
Date:
Asim R P <apraveen@pivotal.io> writes:
> Cancel/terminate requests are held off during "PREPARE TRANSACTION"
> processing in function PrepareTransaction().  However, a subroutine invoked
> by PrepareTransaction() may perform elog(ERROR) or elog(FATAL).

Doing anything that's likely to fail in the post-commit code path is
a Bad Idea (TM).  There's no good recovery avenue, so the fact that
you generally end up at a PANIC is expected/intentional.

The correct response, if you notice code doing that, is to fix it so
it doesn't do that.  Typically the right answer is to move the
failure-prone operation to pre-commit processing.

            regards, tom lane
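
The advice to move failure-prone work into pre-commit processing can be illustrated with a minimal sketch. This is a generic C example under stated assumptions, not PostgreSQL code: ToyXact, TOY_NOTE_CAPACITY and toy_commit_with_note() are hypothetical names. The only step that can fail (a size check) runs before the commit point, so the code after the commit point has nothing left that can raise an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NOTE_CAPACITY 16   /* hypothetical fixed buffer size */

/* Hypothetical transaction object used only for this illustration. */
typedef struct {
    bool committed;
    char note[TOY_NOTE_CAPACITY];
} ToyXact;

/* Returns false, leaving the transaction uncommitted, if the note does
 * not fit. Returns true once the transaction is committed and the note
 * has been stored. */
static bool toy_commit_with_note(ToyXact *xact, const char *note)
{
    assert(xact != NULL && note != NULL);

    size_t len = strlen(note) + 1;   /* include the terminating NUL */

    /* Pre-commit: every check that can fail happens here. */
    if (len > TOY_NOTE_CAPACITY)
        return false;                /* clean failure, nothing committed */

    /* Commit point. */
    xact->committed = true;

    /* Post-commit: the size was validated above, so this cannot fail. */
    memcpy(xact->note, note, len);
    return true;
}
```

Had the size check been placed after the commit point, a too-long note would leave a committed transaction with a failed follow-up step and no clean way to undo either one.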



Re: ERROR after writing PREPARE WAL record

From
Asim R P
Date:
On Wed, Jul 17, 2019 at 7:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Asim R P <apraveen@pivotal.io> writes:
> > Cancel/terminate requests are held off during "PREPARE TRANSACTION"
> > processing in function PrepareTransaction().  However, a subroutine invoked
> > by PrepareTransaction() may perform elog(ERROR) or elog(FATAL).
>
> The correct response, if you notice code doing that, is to fix it so
> it doesn't do that.  Typically the right answer is to move the
> failure-prone operation to pre-commit processing.

Thank you for the response.  Looking through the code, I found nothing particularly alarming.  One case is LWLockAcquire(), which may error out if (num_held_lwlocks >= MAX_SIMUL_LWLOCKS).  This problem also exists in the CommitTransaction() and AbortTransaction() code paths.  Then there is arbitrary add-on code registered as Xact_callbacks.

SyncRepWaitForLSN() directly checks ProcDiePending and QueryCancelPending without going through CHECK_FOR_INTERRUPTS and that is for good reason.  Moreover, it only emits a WARNING, so no problem there.

Asim
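
The LWLockAcquire case mentioned above comes down to a bounded count of simultaneously held locks. The following is a minimal standalone sketch of that kind of limit, not the PostgreSQL implementation: the toy_* names and the small TOY_MAX_SIMUL_LOCKS value are assumptions chosen for the demo, and where the real code is described as erroring out, this sketch simply returns false. The toy_has_lock_headroom() helper is a hypothetical addition showing how a caller could check the limit before a commit point instead of discovering it afterwards.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_SIMUL_LOCKS 4   /* hypothetical small limit for the demo */

static int toy_held_locks[TOY_MAX_SIMUL_LOCKS];  /* ids of locks held */
static int toy_num_held = 0;

/* Record lock_id as held. Returns false when the table is full; this is
 * the condition under which the real code is described as erroring out. */
static bool toy_lock_acquire(int lock_id)
{
    if (toy_num_held >= TOY_MAX_SIMUL_LOCKS)
        return false;
    toy_held_locks[toy_num_held++] = lock_id;
    return true;
}

/* Report whether lock_id is currently recorded as held. */
static bool toy_lock_is_held(int lock_id)
{
    for (int i = 0; i < toy_num_held; i++)
        if (toy_held_locks[i] == lock_id)
            return true;
    return false;
}

/* Drop every held lock. */
static void toy_lock_release_all(void)
{
    toy_num_held = 0;
}

/* Hypothetical pre-commit check: is there room for `needed` more locks? */
static bool toy_has_lock_headroom(int needed)
{
    assert(needed >= 0);
    return TOY_MAX_SIMUL_LOCKS - toy_num_held >= needed;
}
```

A caller that already holds TOY_MAX_SIMUL_LOCKS locks gets false from both toy_has_lock_headroom(1) and toy_lock_acquire(), so the check can be made while a clean abort is still possible.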