Re: VACUUM/ANALYZE counting of in-doubt tuples

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: VACUUM/ANALYZE counting of in-doubt tuples
Дата
Msg-id 127.1195486723@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: VACUUM/ANALYZE counting of in-doubt tuples  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Ответы Re: VACUUM/ANALYZE counting of in-doubt tuples  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-hackers
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
>> On further thought though, that's not the whole story, and in fact
>> VACUUM itself isn't doing very well at accounting for in-doubt tuples.

> How about this: let's have VACUUM send a message at the start of
> processing the table.  pgstats saves the current counters for the table
> somewhere and resets them to zero; and any transaction that sends
> messages after that is counted to the new counter.

> When vacuum finishes and commits, it sends another message and pgstats
> forgets the counters it saved.  At this point, the count of dead tuples
> will be correct.  (If during vacuum anyone retrieves the number of dead
> tuples, the logical thing would be to report the saved counter).

No, that doesn't work (not to mention that adding two more counters
per-table will be a lot of bloat for the stats tables).

The race conditions are a lot more subtle than that.  The stats
collector cannot know when it receives a tabstat message after VACUUM
starts whether VACUUM has/will see the tuples involved, or whether it
will see them as committed or not.  That would depend on whether VACUUM
has yet reached the page(s) the tuples are in.  (Conversely tabstats
arriving shortly after the VACUUM completion report might or might not
correspond to tuples seen by VACUUM, though neither your proposal nor
mine tries to address that.)

AFAICS the only way to be 100% correct would be to track live/dead
counts on a page-by-page basis, which is obviously impractical.
(And I'm not sure even that works, given the possibility of stats
messages arriving at the collector out-of-order compared to the
actual page-changing operations.)

So we have to settle for an approximation, and that being the case
I'd rather not have an expensive approximation.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Следующее
От: Simon Riggs
Дата:
Сообщение: Re: VACUUM/ANALYZE counting of in-doubt tuples