Re: GiST VACUUM

Поиск
Список
Период
Сортировка
От Andrey Borodin
Тема Re: GiST VACUUM
Дата
Msg-id 5F278E38-FD10-45EB-88D0-5B1D7108F984@yandex-team.ru
обсуждение исходный текст
Ответ на Re: GiST VACUUM  (Heikki Linnakangas <hlinnaka@iki.fi>)
Ответы Re: GiST VACUUM  (Heikki Linnakangas <hlinnaka@iki.fi>)
Список pgsql-hackers

> 13 июля 2018 г., в 18:10, Heikki Linnakangas <hlinnaka@iki.fi> написал(а):
>
>>>> But the situation in gistdoinsert(), where you encounter a deleted leaf page, could happen during normal
operation,if vacuum runs concurrently with an insert. Insertion locks only one page at a time, as it descends the tree,
soafter it has released the lock on the parent, but before it has locked the child, vacuum might have deleted the page.
Inthe latest patch, you're checking for that just before swapping the shared lock for an exclusive one, but I think
that'swrong; you need to check for that after swapping the lock, because otherwise vacuum might delete the page while
you'renot holding the lock. 
>>> Looks like a valid concern, I'll move that code again.
>> Done.
>
> Ok, the comment now says:
>
>> +            /*
>> +             * Leaf pages can be left deleted but still referenced in case of
>> +             * crash during VACUUM's gistbulkdelete()
>> +             */
>
> But that's not accurate, right? You should never see deleted pages after a crash, because the parent is updated in
thesame WAL record as the child page, right? 
Fixed the comment.
>
> I'm still a bit scared about using pd_prune_xid to store the XID that prevents recycling the page too early. Can we
usesome field in GISTPageOpaqueData for that, similar to how the B-tree stores it in BTPageOpaqueData? 
There is no room in opaque data, but, technically all page is just a tombstone until reused, so we can pick arbitrary
place.PFA v7 where xid after delete is stored in opaque data, but we can use other places, like line pointer array or
opaque-1

> 13 июля 2018 г., в 18:25, Heikki Linnakangas <hlinnaka@iki.fi> написал(а):
>
> Looking at the second patch, to scan the GiST index in physical order, that seems totally unsafe, if there are any
concurrentpage splits. In the logical scan, pushStackIfSplited() deals with that, by comparing the page's NSN with the
parent'sLSN. But I don't see anything like that in the physical scan code. 

Leaf page can be pointed by internal page and rightlink simultaneously. The purpose of NSN is to visit this page
exactlyonce by following only on of two links in a scan. This is achieved naturally if we read everything from the
beginningto the end. (That is how I understand, I can be wrong) 

> I think we can do something similar in the physical scan: remember the current LSN position at the beginning of the
vacuum,and compare with that. The B-tree code uses the "cycle ID" for similar purposes. 
>
> Do we still need the separate gistvacuumcleanup() pass, if we scan the index in physical order in the bulkdelete pass
already?

We do not need to gather stats there, but we are doing RecordFreeIndexPage() and IndexFreeSpaceMapVacuum(). Is it
correctto move this to first scan? 

Best regards, Andrey Borodin.


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Don Seiler
Дата:
Сообщение: Re: [PATCH] Include application_name in "connection authorized" log message
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: pgsql: Fix parallel index and index-only scans to fall back toserial.