Обсуждение: ANALYZE counts LP_DEAD line pointers as n_dead_tup

Поиск
Список
Период
Сортировка

ANALYZE counts LP_DEAD line pointers as n_dead_tup

От
Masahiko Sawada
Дата:
Hi all,

If we create a table with vacuum_index_cleanup = off or execute VACUUM
with INDEX_CLEANUP = off, vacuum updates pg_stat_all_tables.n_dead_tup
to the number of HEAPTUPLE_RECENTLY_DEAD tuples. Whereas analyze
updates it to the sum of the number of HEAPTUPLE_DEAD/RECENTLY_DEAD
tuples and LP_DEAD line pointers. So if the table has many LP_DEAD
line pointers due to skipping index cleanup, autovacuum is triggered
every time after analyze/autoanalyze. This issue seems to happen also
on back branches, probably from 12 where INDEX_CLEANUP option was
introduced.

I think we can have heapam_scan_analyze_next_tuple() not count LP_DEAD
line pointer as lazy_scan_prune() does. Attached the patch for that.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Вложения

Re: ANALYZE counts LP_DEAD line pointers as n_dead_tup

От
Peter Geoghegan
Дата:
On Wed, Apr 14, 2021 at 7:11 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> If we create a table with vacuum_index_cleanup = off or execute VACUUM
> with INDEX_CLEANUP = off, vacuum updates pg_stat_all_tables.n_dead_tup
> to the number of HEAPTUPLE_RECENTLY_DEAD tuples. Whereas analyze
> updates it to the sum of the number of HEAPTUPLE_DEAD/RECENTLY_DEAD
> tuples and LP_DEAD line pointers. So if the table has many LP_DEAD
> line pointers due to skipping index cleanup, autovacuum is triggered
> every time after analyze/autoanalyze. This issue seems to happen also
> on back branches, probably from 12 where INDEX_CLEANUP option was
> introduced.

Hmm.

> I think we can have heapam_scan_analyze_next_tuple() not count LP_DEAD
> line pointer as lazy_scan_prune() does. Attached the patch for that.

lazy_scan_prune() is concerned about what the state of the table *will
be* when VACUUM finishes, based on its working assumption that index
vacuuming and heap vacuuming always go ahead. This is exactly the same
reason why lazy_scan_prune() will set LVPagePruneState.hastup to
'false' in the presence of an LP_DEAD item -- this is not how
count_nondeletable_pages() considers if the same page 'hastup' much
later on, right at the end of the VACUUM (it will only consider the
page safe to truncate away if it now only contains LP_UNUSED items --
LP_DEAD items make heap/table truncation unsafe).

In general accounting rules like this may need to work slightly
differently across near-identical functions because of "being versus
becoming" issues. It is necessary to distinguish between "being" code
(e.g., this ANALYZE code, count_nondeletable_pages() and its hastup
issue) and "becoming" code (e.g., lazy_scan_prune() ands its approach
to counting "remaining" dead tuples as well as hastup-ness). I tend to
doubt that your patch is the right approach because the two code paths
already "agree" once you assume that the LP_DEAD items that
lazy_scan_prune() sees will be gone at the end of the VACUUM. I do
agree that this is a problem, though.

Generally speaking, the "becoming" code from lazy_scan_prune() is not
100% sure that it will be correct in each case, for a large variety of
reasons. But I think that we should expect it to be mostly correct. We
definitely cannot allow it to be quite wrong all the time with some
workloads. And so I agree that this is a problem for the INDEX_CLEANUP
= off case, though it's equally an issue for the recently added
failsafe mechanism. I do not believe that it is a problem for the
bypass-indexes optimization, though, because that is designed to only
be applied when there are practically zero LP_DEAD items. The
optimization can make VACUUM record that there are zero dead tuples
after the VACUUM finishes, even though there were in fact a very small
non-zero number of dead tuples -- but that's not appreciably different
from any of the other ways that the count of dead tuples could be
inaccurate (e.g. concurrent opportunistic pruning). The specific tests
that we apply inside lazy_vacuum() should make sure that autovacuum
scheduling is never affected. The autovacuum scheduling code can
safely "believe" that the indexes were vacuumed, because it really is
the same as if there were precisely zero LP_DEAD items (or the same
for all practical purposes).

I'm not sure what to do, though. Both the INDEX_CLEANUP = off case and
the failsafe case are only intended for emergencies. And it's hard to
know what to do in a code path that is designed to rarely or never be
used.

-- 
Peter Geoghegan



Re: ANALYZE counts LP_DEAD line pointers as n_dead_tup

От
Peter Geoghegan
Дата:
On Fri, Apr 16, 2021 at 1:16 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I'm not sure what to do, though. Both the INDEX_CLEANUP = off case and
> the failsafe case are only intended for emergencies. And it's hard to
> know what to do in a code path that is designed to rarely or never be
> used.

How about just documenting it in comments, as in the attached patch? I
tried to address all of the issues with LP_DEAD accounting together.
Both the issue raised by Masahiko, and one or two others that were
also discussed recently on other threads. They all seem kind of
related to me.

I didn't address the INDEX_CLEANUP = off case in the comments directly
(I just addressed the failsafe case). There is no good reason to think
that the situation will resolve with INDEX_CLEANUP = off, so it didn't
seem wise to mention it too. But that should still be okay --
INDEX_CLEANUP = off has practically been superseded by the failsafe,
since it is much more flexible. And, anybody that uses INDEX_CLEANUP =
off cannot expect to never do index cleanup without seriously bad
consequences all over the place.


--
Peter Geoghegan

Вложения

Re: ANALYZE counts LP_DEAD line pointers as n_dead_tup

От
Peter Geoghegan
Дата:
On Fri, Apr 16, 2021 at 6:54 PM Peter Geoghegan <pg@bowt.ie> wrote:
> How about just documenting it in comments, as in the attached patch? I
> tried to address all of the issues with LP_DEAD accounting together.
> Both the issue raised by Masahiko, and one or two others that were
> also discussed recently on other threads. They all seem kind of
> related to me.

I pushed a version of this just now.

Thanks
-- 
Peter Geoghegan