Re: [Proposal] More Vacuum Statistics

Поиск
Список
Период
Сортировка
От Jim Nasby
Тема Re: [Proposal] More Vacuum Statistics
Дата
Msg-id 556920E1.2050508@BlueTreble.com
обсуждение исходный текст
Ответ на Re: [Proposal] More Vacuum Statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: [Proposal] More Vacuum Statistics  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On 5/28/15 9:14 AM, Tom Lane wrote:
> Naoya Anzai <nao-anzai@xc.jp.nec.com> writes:
>> In my much experience up until now,I have an idea that we can add
>> 2 new vacuum statistics into pg_stat_xxx_tables.
>
> Adding new stats in that way requires adding per-table counters, which
> bloat the statistics files (especially in database with very many tables).
> I do not think you've made a case for these stats being valuable enough
> to justify such overhead for everybody.

It occurs to me that there's no good reason for vacuum-derived stats to 
be in the stats file; it's not like users run vacuum anywhere near as 
often as other commands. It's stats could be kept in pg_class; we're 
already keeping things like relallvisible there.

> As far as the first one goes, I don't even think it's especially useful.
> There might be value in tracking the times of the last few vacuums on a
> table, but knowing the time for only the latest one doesn't sound like it
> would prove much.  So I'd be inclined to think more along the lines of
> scanning the postmaster log for autovacuum runtimes, instead of squeezing
> it into the pg_stats views.

You'd also want to know how many pages were scanned, since any decent 
estimation would need to take table size into account.

As for history, that's a problem that exists for *all* our statistics, 
so anyone that cares about that is going to setup some system to 
periodically capture the contents of pg_stat_*.

> A possible alternative so far as the second one goes is to add a function
> (perhaps in contrib/pg_freespacemap) that simply runs through a table's
> VM and counts the number of set bits.  This would be more accurate (no
> risk of lost counter updates) and very possibly cheaper overall: it would
> take longer to find out the number when you wanted it, but you wouldn't be
> paying the distributed overhead of tracking it when you didn't want it.

Seems like a reasonable addition to that contrib module regardless. As 
Jeff Janes mentioned this info is available in pg_class, but it requires 
an ANALYZE to update it.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: nested loop semijoin estimates
Следующее
От: Andres Freund
Дата:
Сообщение: Re: [Proposal] More Vacuum Statistics