Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Дата
Msg-id 22497.1306341699@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Список pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, May 25, 2011 at 11:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 2. Revise the vacuum code so that it doesn't skip updating the pg_class
>> entries. �We could have it count the number of pages it skipped, rather
>> than just keeping a bool, and then scale up the rel_tuples count to be
>> approximately right by assuming the skipped pages have tuple density
>> similar to the scanned ones.

> This approach doesn't seem like a good idea to me.  The skipped
> portions of the table are the ones that haven't been modified in a
> while, so this is or embeds an assumption that the tuples in the hot
> and cold portions of the table are the same size.  That might be true,
> but it isn't hard to think of cases where it might not be.  There
> could also very easily be sampling error, because it's easy to imagine
> a situation where 99% of the table is getting skipped.

Yeah, I had been thinking about the latter point.  We could be
conservative and just keep the reported tuple density the same (ie,
update relpages to the new correct value, while setting reltuples so
that the density ratio doesn't change).  But that has its own problems
when the table contents *do* change.  What I'm currently imagining is
to do a smoothed moving average, where we factor in the new density
estimate with a weight dependent on the percentage of the table we did
scan.  That is, the calculation goes something like

old_density = old_reltuples / old_relpages
new_density = counted_tuples / scanned_pages
reliability = scanned_pages / new_relpages
updated_density = old_density + (new_density - old_density) * reliability
new_reltuples = updated_density * new_relpages

We could slow the moving-average convergence even further when
reliability is small; for instance if you were really paranoid you might
want to use the square of reliability in line 4.  That might be
overdoing it, though.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: [BUGS] BUG #6034: pg_upgrade fails when it should not.
Следующее
От: Vaibhav Kaushal
Дата:
Сообщение: Re: Expression Evaluator used for creating the plan tree / stmt ?