Re: decoupling table and index vacuum

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: decoupling table and index vacuum
Дата
Msg-id CA+Tgmob6CnSyyB+tU52727nMYnmecq00g-c0jV-y5GhqVh83NQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: decoupling table and index vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: decoupling table and index vacuum
Re: decoupling table and index vacuum
Список pgsql-hackers
On Thu, Apr 22, 2021 at 10:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> The dead TID fork needs to also be efficiently searched. If the heap
> scan runs twice, the collected dead TIDs on each heap pass could be
> overlapped. But we would not be able to merge them if we did index
> vacuuming on one of indexes at between those two heap scans. The
> second time heap scan would need to record only TIDs that are not
> collected by the first time heap scan.

I agree that there's a problem here. It seems to me that it's probably
possible to have a dead TID fork that implements "throw away the
oldest stuff" efficiently, and it's probably also possible to have a
TID fork that can be searched efficiently. However, I am not sure that
it's possible to have a dead TID fork that does both of those things
efficiently. Maybe you have an idea. My intuition is that if we have
to pick one, it's MUCH more important to be able to throw away the
oldest stuff efficiently. I think we can work around the lack of
efficient lookup, but I don't see a way to work around the lack of an
efficient operation to discard the oldest stuff.

> Right. Given decoupling index vacuuming, I think the index’s garbage
> statistics are important which preferably need to be fetchable without
> accessing indexes. It would be not hard to estimate how many index
> tuples might be able to be deleted by looking at the dead TID fork but
> it doesn’t necessarily match the actual number.

Right, and to appeal (I think) to Peter's quantitative vs. qualitative
principle, it could be way off. Like, we could have a billion dead
TIDs and in one index the number of index entries that need to be
cleaned out could be 1 billion and in another index it could be zero
(0). We know how much data we will need to scan because we can fstat()
the index, but there seems to be no easy way to estimate how many of
those pages we'll need to dirty, because we don't know how successful
previous opportunistic cleanup has been. It is not impossible, as
Peter has pointed out a few times now, that it has worked perfectly
and there will be no modifications required, but it is also possible
that it has done nothing.

--
Robert Haas
EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: decoupling table and index vacuum
Следующее
От: Andres Freund
Дата:
Сообщение: Re: decoupling table and index vacuum