Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed

From Alexander Lakhin
Subject Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed
Msg-id cc7925b8-30cc-c76d-b1b6-c9ec6bd36a03@gmail.com
In reply to Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Responses Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
List pgsql-hackers
Hi,

12.01.2024 10:15, Bertrand Drouvot wrote:
>
> For this one, the "good" news is that it looks like we don’t see the
> "terminating" message not followed by an "obsolete" message (so the engine
> behaves correctly) anymore.
>
> There is simply nothing related to the row_removal_activeslot at all (the catalog_xmin
> advanced and there is no conflict).

Yes, judging from all the failures that we see now, it looks like the
0001-Fix-race-condition...patch works as expected.

> And I agree that this is due to the Standby/RUNNING_XACTS that is "advancing" the
> catalog_xmin of the active slot.
>
>> Standby/RUNNING_XACTS is exactly why 039_end_of_wal.pl uses wal_level
>> = minimal, because these lead to unpredictable records inserted,
>> impacting the reliability of the tests.  We cannot do that here,
>> obviously.  That may be a long shot, but could it be possible to tweak
>> the test with a retry logic, retrying things if such a standby
>> snapshot is found because we know that the invalidation is not going
>> to work anyway?
> I think it all depends on what the xl_running_xacts record contains (that is, whether it
> "advances" the catalog_xmin in our case).
>
> In our case it does advance it (should it occur) due to the "select txid_current()"
> that is done in wait_until_vacuum_can_remove() in 035_standby_logical_decoding.pl.
>
> I suggest to make use of txid_current_snapshot() instead (that does not produce
> a Transaction/COMMIT wal record, as opposed to txid_current()).
>
> I think that it could be "enough" for our case here, and it's what v5 attached is
> now doing.
>
> Let's give v5 a try? (please apply v1-0001-Fix-race-condition-in-InvalidatePossiblyObsoleteS.patch
> too).

Unfortunately, I've got the failure again (please see logs attached).
(_primary.log can confirm that I have used exactly v5 — I see no
txid_current() calls there...)

Best regards,
Alexander