Re: Design of pg_stat_subscription_workers vs pgstats

From: David G. Johnston
Subject: Re: Design of pg_stat_subscription_workers vs pgstats
Msg-id: CAKFQuwYS_EUe+sR6MS3aiR9UXtUJfDcmHoDjrXAeDnY5w_9bnw@mail.gmail.com
In reply to: Re: Design of pg_stat_subscription_workers vs pgstats (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers

On Thu, Jan 27, 2022 at 2:15 PM Andres Freund <andres@anarazel.de> wrote:
> Another related thing is that using a 32bit xid for allowing skipping is a bad
> idea anyway - we shouldn't be adding new interfaces with xid wraparound dangers -
> it's getting more and more common to have multiple wraparounds a day.  An
> easily better alternative would be the LSN at which a transaction starts.


Interesting idea.  I do not think a well-designed skipping feature need worry about wrap-around, though.  The XID to be skipped was just seen by a worker, and because it failed it will continue to be the same XID encountered by that worker until it is resolved.  There is no effective progression in time while the subscriber is stuck, so there is no opportunity for wrap-around to happen.  Since we want to skip the transaction as a whole, adding a layer of hidden indirection to the process seems undesirable.  I'm not against the idea though - to the user it is basically "copy this value from the error message in order to skip the transaction that caused the error".  The system then verifies the value and ensures it skips one, and only one, transaction.
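
The "verify the value, then skip one, and only one, transaction" behaviour can be sketched in standalone C. This is only an illustration under stated assumptions, not PostgreSQL code: `SkipRequest`, `skip_request_set` and `should_skip_xact` are hypothetical names, and treating "verifies the value" as "the requested XID must equal the XID that last failed" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;                 /* stand-in for a 32-bit transaction id */
#define INVALID_XID ((Xid) 0)

/* Hypothetical per-subscription state; not a PostgreSQL struct. */
typedef struct SkipRequest
{
    Xid last_failed_xid;              /* XID reported in the most recent error */
    Xid skip_xid;                     /* XID the user asked to skip, if any */
} SkipRequest;

/* Accept a skip request only if it names the transaction that just failed. */
static bool
skip_request_set(SkipRequest *req, Xid xid)
{
    assert(req != NULL);
    if (xid == INVALID_XID || xid != req->last_failed_xid)
        return false;
    req->skip_xid = xid;
    return true;
}

/* Called when the worker begins applying a remote transaction. */
static bool
should_skip_xact(SkipRequest *req, Xid incoming_xid)
{
    assert(req != NULL);
    if (req->skip_xid == INVALID_XID || req->skip_xid != incoming_xid)
        return false;

    /* One-shot: clear the request so no later transaction is skipped. */
    req->skip_xid = INVALID_XID;
    req->last_failed_xid = INVALID_XID;
    return true;
}
```

Because the request is cleared the moment it matches, a later transaction that happened to reuse the same 32-bit value would not be skipped in this sketch.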


> It's pretty easy from the POV of getting into a new transaction.
>
> PG_CATCH():
>
>     /* get us out of the failed transaction */
>     AbortOutOfAnyTransaction();
>
>     StartTransactionCommand();
>     /* do something to remember the error we just got */
>     CommitTransactionCommand();

Thank you.
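
For readers without the backend source at hand, the ordering in that snippet can be imitated in standalone C. The sketch below is not PostgreSQL code: `setjmp`/`longjmp` stand in for `PG_TRY`/`PG_CATCH`, and every function and variable in it (`begin_xact`, `abort_xact`, `commit_xact`, `raise_error`, `apply_change`, `apply_remote_xact`, `last_error`) is a hypothetical stand-in. It only shows the sequence: leave the failed transaction, then open a fresh one to record the error.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static jmp_buf error_jump;            /* stand-in for the exception stack */
static bool in_xact = false;          /* is a transaction currently open? */
static char pending_error[128];       /* message of the error being thrown */
static char last_error[128];          /* error remembered for later inspection */

/* Starting a transaction inside another one would be a bug in this model. */
static void begin_xact(void)  { assert(!in_xact); in_xact = true; }
static void commit_xact(void) { assert(in_xact);  in_xact = false; }
static void abort_xact(void)  { in_xact = false; }

/* Stand-in for raising an error: save the message and unwind. */
static void
raise_error(const char *msg)
{
    snprintf(pending_error, sizeof(pending_error), "%s", msg);
    longjmp(error_jump, 1);
}

/* Hypothetical apply step that fails for one particular change. */
static void
apply_change(int change_id)
{
    if (change_id == 3)
        raise_error("duplicate key");
}

/* Returns true if all changes applied, false if an error was recorded. */
static bool
apply_remote_xact(int nchanges)
{
    if (setjmp(error_jump) == 0)
    {
        begin_xact();
        for (int i = 1; i <= nchanges; i++)
            apply_change(i);
        commit_xact();
        return true;
    }

    /* "catch" branch: get us out of the failed transaction first */
    abort_xact();

    /* then use a new transaction to remember the error we just got */
    begin_xact();
    memcpy(last_error, pending_error, sizeof(last_error));
    commit_xact();
    return false;
}
```

The sketch stops where the quoted text does: it records the error and returns, leaving open what the worker should do next.
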
> It may be a bit harder afterwards to not just error out the whole
> worker, because we'd need to know what to do instead.


I imagine the launcher and worker startup code can be made to deal with the restart adequately.  Just wait if the last thing seen was an error.  Require the user to manually resume the worker - unless we really think a try-until-you-succeed protocol with backoff is superior.  Upon system restart all error information is cleared, and we start from scratch and let the errors happen (or not, depending) as they will.
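
One way to picture that restart policy is as a small piece of in-memory state consulted by the launcher. The sketch is standalone C with hypothetical names throughout (`WorkerStatus`, `launcher_should_start`, `worker_failed`, `user_resume`, `system_restart`); nothing here is taken from the PostgreSQL source, and it models only the manual-resume variant, not a backoff protocol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory status kept per subscription worker. */
typedef struct WorkerStatus
{
    bool     stopped_on_error;   /* the last thing seen was an error */
    uint32_t failed_xid;         /* XID of the failing transaction, 0 if none */
} WorkerStatus;

/* Launcher-side check: wait if the last thing seen was an error. */
static bool
launcher_should_start(const WorkerStatus *st)
{
    assert(st != NULL);
    return !st->stopped_on_error;
}

/* Worker exit path: remember that it stopped because of an error. */
static void
worker_failed(WorkerStatus *st, uint32_t xid)
{
    assert(st != NULL);
    st->stopped_on_error = true;
    st->failed_xid = xid;
}

/* The user manually resumes the worker after dealing with the problem. */
static void
user_resume(WorkerStatus *st)
{
    assert(st != NULL);
    st->stopped_on_error = false;
}

/* On system restart all error information is cleared. */
static void
system_restart(WorkerStatus *st)
{
    assert(st != NULL);
    st->stopped_on_error = false;
    st->failed_xid = 0;
}
```

Since nothing in this state survives `system_restart`, the worker simply starts again after a restart and either hits the same error or does not.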

David J.
