Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?
Дата
Msg-id CA+TgmobFv_X485LjFxz4OtkvKHXBAngGVvkGDtPnFt1tbnF2tg@mail.gmail.com
обсуждение исходный текст
Ответ на Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
On Sun, Dec 5, 2021 at 12:28 AM Peter Geoghegan <pg@bowt.ie> wrote:
> I wonder why we're counting the number of dead tuples (or LP_DEAD stub
> items) in the relation as a whole in ANALYZE's acquire_sample_rows()
> function. Wouldn't it make more sense to focus on the "live vs dead
> tuple properties" of heap pages that are not known to be all-visible
> when we generate statistics for our pgstat_report_analyze() report?
> These statistic collector stats are only for the benefit of autovacuum
> scheduling -- and so they're *consumed* in a way that is totally
> different to the nearby pg_statistic stats.

I think this could be the right idea. I'm not certain that it is, but
it does sound believable.

> This new approach also buys us the ability to extrapolate a new
> estimated number of dead tuples using old, stale stats. The stats can
> be combined with the authoritative/known number of not-all-visible
> pages right this second, since it's cheap enough to *accurately*
> determine the total number of not-all-visible pages for a heap
> relation by calling visibilitymap_count(). My guess is that this would
> be much more accurate in practice: provided the original average
> number of dead/live tuples (tuples per not-all-visible block) was
> still reasonably accurate, the extrapolated "total dead tuples right
> now" values would also be accurate.

So does this. If some of the table is now all-visible when it wasn't
before, it's certainly a good guess that the portions that still
aren't have about the same distribution of dead tuples that they did
before ... although the other direction is less clear: it seems
possible that newly not-all-visible pages have fewer dead tuples than
ones which have been not-all-visible for a while. But you have to make
some guess.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Bossart, Nathan"
Дата:
Сообщение: Re: Do we need pre-allocate WAL files during end-of-recovery checkpoint?
Следующее
От: Robert Haas
Дата:
Сообщение: Re: ExecTypeSetColNames is fundamentally broken