Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin
Дата
Msg-id 881304a6-6b8b-42a7-ad7c-261c3eace4a4@enterprisedb.com
обсуждение исходный текст
Ответ на Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin  (Melanie Plageman <melanieplageman@gmail.com>)
Список pgsql-hackers


On 6/24/24 16:53, Melanie Plageman wrote:
> On Mon, Jun 24, 2024 at 4:27 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>
>> On 21/06/2024 03:02, Peter Geoghegan wrote:
>>> On Thu, Jun 20, 2024 at 7:42 PM Melanie Plageman
>>> <melanieplageman@gmail.com> wrote:
>>>
>>>> The repro forces a round of index vacuuming after the standby
>>>> reconnects and before pruning a dead tuple whose xmax is older than
>>>> OldestXmin.
>>>>
>>>> At the end of the round of index vacuuming, _bt_pendingfsm_finalize()
>>>> calls GetOldestNonRemovableTransactionId(), thereby updating the
>>>> backend's GlobalVisState and moving maybe_needed backwards.
>>>
>>> Right. I saw details exactly consistent with this when I used GDB
>>> against a production instance.
>>>
>>> I'm glad that you were able to come up with a repro that involves
>>> exactly the same basic elements, including index page deletion.
>>
>> Would it be possible to make it robust so that we could always run it
>> with "make check"? This seems like an important corner case to
>> regression test.
> 
> I'd have to look into how to ensure I can stabilize some of the parts
> that seem prone to flaking. I can probably stabilize the vacuum bit
> with a query of pg_stat_activity making sure it is waiting to acquire
> the cleanup lock.
> 
> I don't, however, see a good way around the large amount of data
> required to trigger more than one round of index vacuuming. I could
> generate the data more efficiently than I am doing here
> (generate_series() in the from clause). Perhaps with a copy? I know it
> is too slow now to go in an ongoing test, but I don't have an
> intuition around how fast it would have to be to be acceptable. Is
> there a set of additional tests that are slower that we don't always
> run? I didn't follow how the wraparound test ended up, but that seems
> like one that would have been slow.
> 

I think it depends on what is the impact on the 'make check' duration.
If it could be added to one of the existing test groups, then it depends
on how long the slowest test in that group is. If the new test needs to
be in a separate group, it probably needs to be very fast.

But I was wondering how much time are we talking about, so I tried

creating a table, filling it with 300k rows => 250ms
creating an index => 180ms
delete 90% data => 200ms
vacuum t => 130ms

which with m_w_m=1MB does two rounds of index cleanup. That's ~760ms.
I'm not sure how much more stuff does the test need to do, but this
would be pretty reasonable, if we could add it to an existing group.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Nathan Bossart
Дата:
Сообщение: Re: basebackups seem to have serious issues with FILE_COPY in CREATE DATABASE
Следующее
От: Robert Haas
Дата:
Сообщение: Re: POC, WIP: OR-clause support for indexes