Обсуждение: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

Поиск
Список
Период
Сортировка

[HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tomas Vondra
Дата:
Hi,

It seems to me that VACUUM and ANALYZE somewhat disagree on what exactly 
reltuples means. VACUUM seems to be thinking that
    reltuples = live + dead

while ANALYZE apparently believes that
    reltuples = live

This causes somewhat bizarre changes in the value, depending on which of 
those commands was executed last.

To demonstrate the issue, let's create a simple table with 1M rows, 
delete 10% rows and then we'll do a bunch of VACUUM / ANALYZE and check 
reltuples, n_live_tup and n_dead_tup in the catalogs.

I've disabled autovacuum so that it won't interfere with this, and 
there's another transaction blocking VACUUM from actually cleaning any 
dead tuples.

    test=# create table t as           select i from generate_series(1,1000000) s(i);
    test=# select reltuples, n_live_tup, n_dead_tup             from pg_stat_user_tables join pg_class using (relname)
         where relname = 't';
 
     reltuples | n_live_tup | n_dead_tup    -----------+------------+------------         1e+06 |    1000000 |
0

So, that's nice. Now let's delete 10% of rows, and run VACUUM and 
ANALYZE a few times.
    test=# delete from t where random() < 0.1;
    test=# vacuum t;
    test=# select reltuples, n_live_tup, n_dead_tup             from pg_stat_user_tables join pg_class using (relname)
         where relname = 't';
 
     reltuples | n_live_tup | n_dead_tup    -----------+------------+------------         1e+06 |     900413 |
99587

    test=# analyze t;
     reltuples | n_live_tup | n_dead_tup    -----------+------------+------------        900413 |     900413 |
99587
    test=# vacuum t;
     reltuples | n_live_tup | n_dead_tup    -----------+------------+------------         1e+06 |     900413 |
99587


So, analyze and vacuum disagree.

To further confuse the poor DBA, VACUUM always simply ignores the old 
values while ANALYZE combines the old and new values on large tables 
(and converges to the "correct" value after a few steps). This table is 
small (less than 30k pages), so ANALYZE does not do that.

This is quite annoying, because people tend to look at reltuples while 
investigating bloat (e.g. because the check_postgres query mentioned on 
our wiki [1] uses reltuples in the formula).

[1] https://wiki.postgresql.org/wiki/Show_database_bloat

And when the cleanup is blocked for some reason (as in the example 
above), VACUUM tends to be running much more often (because it can't 
cleanup anything). So reltuples tend to be set to the higher value, 
which I'd argue is the wrong value for estimating bloat.

I haven't looked at the code yet, but I've confirmed this happens both 
on 9.6 and 10. I haven't checked older versions, but I guess those are 
affected too.

The question is - which of the reltuples definitions is the right one? 
I've always assumed that "reltuples = live + dead" but perhaps not?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> It seems to me that VACUUM and ANALYZE somewhat disagree on what exactly 
> reltuples means. VACUUM seems to be thinking that
>      reltuples = live + dead
> while ANALYZE apparently believes that
>      reltuples = live

> The question is - which of the reltuples definitions is the right one? 
> I've always assumed that "reltuples = live + dead" but perhaps not?

I think the planner basically assumes that reltuples is the live tuple
count, so maybe we'd better change VACUUM to get in step.
        regards, tom lane



Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tomas Vondra
Дата:
On 7/25/17 12:55 AM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> It seems to me that VACUUM and ANALYZE somewhat disagree on what
>> exactly reltuples means. VACUUM seems to be thinking that reltuples
>> = live + dead while ANALYZE apparently believes that reltuples =
>> live
> 
>> The question is - which of the reltuples definitions is the right
>> one? I've always assumed that "reltuples = live + dead" but perhaps
>> not?
> 
> I think the planner basically assumes that reltuples is the live
> tuple count, so maybe we'd better change VACUUM to get in step.
> 

Attached is a patch that (I think) does just that. The disagreement was 
caused by VACUUM treating recently dead tuples as live, while ANALYZE 
treats both of those as dead.

At first I was worried that this will negatively affect plans in the 
long-running transaction, as it will get underestimates (due to 
reltuples not including rows it can see). But that's a problem we 
already have anyway, you just need to run ANALYZE in the other session.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 7/25/17 12:55 AM, Tom Lane wrote:
>> I think the planner basically assumes that reltuples is the live
>> tuple count, so maybe we'd better change VACUUM to get in step.

> Attached is a patch that (I think) does just that. The disagreement was 
> caused by VACUUM treating recently dead tuples as live, while ANALYZE 
> treats both of those as dead.

> At first I was worried that this will negatively affect plans in the 
> long-running transaction, as it will get underestimates (due to 
> reltuples not including rows it can see). But that's a problem we 
> already have anyway, you just need to run ANALYZE in the other session.

This definitely will have some impact on plans, at least in cases where
there's a significant number of unvacuumable dead tuples.  So I think
it's a bit late for v10, and I wouldn't want to back-patch at all.
Please add to the next commitfest.
        regards, tom lane



Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tomas Vondra
Дата:


On 7/25/17 5:04 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On 7/25/17 12:55 AM, Tom Lane wrote:
>>> I think the planner basically assumes that reltuples is the live 
>>> tuple count, so maybe we'd better change VACUUM to get in step.
> 
>> Attached is a patch that (I think) does just that. The disagreement
>> was caused by VACUUM treating recently dead tuples as live, while
>> ANALYZE treats both of those as dead.
> 
>> At first I was worried that this will negatively affect plans in
>> the long-running transaction, as it will get underestimates (due
>> to reltuples not including rows it can see). But that's a problem
>> we already have anyway, you just need to run ANALYZE in the other
>> session.
> 
> This definitely will have some impact on plans, at least in cases
> where there's a significant number of unvacuumable dead tuples. So I
> think it's a bit late for v10, and I wouldn't want to back-patch at
> all. Please add to the next commitfest.
> 

I dare to disagree here, for two reasons.

Firstly, the impact *is* already there, it only takes running ANALYZE. 
Or VACUUM ANALYZE. In both those cases we already end up with 
reltuples=n_live_tup.

Secondly, I personally strongly prefer stable predictable behavior over 
intermittent oscillations between two values. That's a major PITA on 
production, both to investigate and fix.

So people already have this issue, although it only strikes randomly. 
And no way to fix it (well, except for fixing the cleanup, but that may 
not be possible).

It is true we tend to run VACUUM more often than ANALYZE, particularly 
in situations where the cleanup can't proceed - ANALYZE will do it's 
work and VACUUM will be triggered over and over again, so it "wins" this 
way. But I'm not sure that's something we should rely on.


FWIW I personally see this as a fairly annoying bug, and would vote to 
backpatch it, although I understand people might object. But I don't 
quite see a reason not to fix this in v10.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Noah Misch
Дата:
On Tue, Jul 25, 2017 at 07:02:28PM +0200, Tomas Vondra wrote:
> On 7/25/17 5:04 PM, Tom Lane wrote:
> >Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> >>Attached is a patch that (I think) does just that. The disagreement
> >>was caused by VACUUM treating recently dead tuples as live, while
> >>ANALYZE treats both of those as dead.
> >
> >>At first I was worried that this will negatively affect plans in
> >>the long-running transaction, as it will get underestimates (due
> >>to reltuples not including rows it can see). But that's a problem
> >>we already have anyway, you just need to run ANALYZE in the other
> >>session.
> >
> >This definitely will have some impact on plans, at least in cases
> >where there's a significant number of unvacuumable dead tuples. So I
> >think it's a bit late for v10, and I wouldn't want to back-patch at
> >all. Please add to the next commitfest.
> >
> 
> I dare to disagree here, for two reasons.
> 
> Firstly, the impact *is* already there, it only takes running ANALYZE. Or
> VACUUM ANALYZE. In both those cases we already end up with
> reltuples=n_live_tup.
> 
> Secondly, I personally strongly prefer stable predictable behavior over
> intermittent oscillations between two values. That's a major PITA on
> production, both to investigate and fix.

> FWIW I personally see this as a fairly annoying bug, and would vote to
> backpatch it, although I understand people might object.

I tend to agree.  If you have a setup that somehow never uses ANALYZE or never
uses VACUUM on a particular table, you might prefer today's (accidental)
behavior.  However, the discrepancy arises only on a table that gets dead
tuples, and a table that gets dead tuples will see both VACUUM and ANALYZE
soon enough.  It does seem like quite a stretch to imagine someone wanting
plans to depend on which of VACUUM or ANALYZE ran most recently.



Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Haribabu Kommi
Дата:


On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On 7/25/17 12:55 AM, Tom Lane wrote:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
It seems to me that VACUUM and ANALYZE somewhat disagree on what
exactly reltuples means. VACUUM seems to be thinking that reltuples
= live + dead while ANALYZE apparently believes that reltuples =
live

The question is - which of the reltuples definitions is the right
one? I've always assumed that "reltuples = live + dead" but perhaps
not?

I think the planner basically assumes that reltuples is the live
tuple count, so maybe we'd better change VACUUM to get in step.


Attached is a patch that (I think) does just that. The disagreement was caused by VACUUM treating recently dead tuples as live, while ANALYZE treats both of those as dead.

At first I was worried that this will negatively affect plans in the long-running transaction, as it will get underestimates (due to reltuples not including rows it can see). But that's a problem we already have anyway, you just need to run ANALYZE in the other session.

Thanks for the patch.
From the mail, I understand that this patch tries to improve the
reltuples value update in the catalog table by the vacuum command
to consider the proper visible tuples similar like analyze command.

- num_tuples);
+ num_tuples - nkeep);

With the above correction, there is a problem in reporting the number
of live tuples to the stats.

postgres=# select reltuples, n_live_tup, n_dead_tup
              from pg_stat_user_tables join pg_class using (relname)
             where relname = 't';
 reltuples | n_live_tup | n_dead_tup 
-----------+------------+------------
    899818 |     799636 |     100182
(1 row)


The live tuples data value is again decremented with dead tuples
value before sending them to stats in function lazy_vacuum_rel(),

/* report results to the stats collector, too */
new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;

The fix needs a correction here also. Or change the correction in 
lazy_vacuum_rel() function itself before updating catalog table similar
like stats.


While testing this patch, I found another problem that is not related to
this patch. When the vacuum command is executed mutiple times on
a table with no dead rows, the number of reltuples value is slowly
reducing.

postgres=# select reltuples, n_live_tup, n_dead_tup
              from pg_stat_user_tables join pg_class using (relname)
             where relname = 't';
 reltuples | n_live_tup | n_dead_tup 
-----------+------------+------------
    899674 |     899674 |          0
(1 row)

postgres=# vacuum t;
VACUUM
postgres=# select reltuples, n_live_tup, n_dead_tup
              from pg_stat_user_tables join pg_class using (relname)
             where relname = 't';
 reltuples | n_live_tup | n_dead_tup 
-----------+------------+------------
    899622 |     899622 |          0
(1 row)

postgres=# vacuum t;
VACUUM
postgres=# select reltuples, n_live_tup, n_dead_tup
              from pg_stat_user_tables join pg_class using (relname)
             where relname = 't';
 reltuples | n_live_tup | n_dead_tup 
-----------+------------+------------
    899570 |     899570 |          0
(1 row)


In lazy_scan_heap() function, we force to scan the last page of the
relation to avoid the access exclusive lock in lazy_truncate_heap
if there are tuples in the last page. Because of this reason, the
scanned_pages value will never be 0, so the vac_estimate_reltuples
function will estimate the tuples based on the number of tuples
from the last page of the relation. This estimation is leading to
reduce the number of retuples.

I am thinking whether this problem really happen in real world scenarios
to produce a fix?
 
Regards,
Hari Babu
Fujitsu Australia

Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Daniel Gustafsson
Дата:
> On 06 Sep 2017, at 09:45, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>>
wrote:
> On 7/25/17 12:55 AM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> writes:
> It seems to me that VACUUM and ANALYZE somewhat disagree on what
> exactly reltuples means. VACUUM seems to be thinking that reltuples
> = live + dead while ANALYZE apparently believes that reltuples =
> live
>
> The question is - which of the reltuples definitions is the right
> one? I've always assumed that "reltuples = live + dead" but perhaps
> not?
>
> I think the planner basically assumes that reltuples is the live
> tuple count, so maybe we'd better change VACUUM to get in step.
>
> Attached is a patch that (I think) does just that. The disagreement was caused by VACUUM treating recently dead
tuplesas live, while ANALYZE treats both of those as dead. 
>
> At first I was worried that this will negatively affect plans in the long-running transaction, as it will get
underestimates(due to reltuples not including rows it can see). But that's a problem we already have anyway, you just
needto run ANALYZE in the other session. 
>
> Thanks for the patch.
> From the mail, I understand that this patch tries to improve the
> reltuples value update in the catalog table by the vacuum command
> to consider the proper visible tuples similar like analyze command.
>
> -                                                         num_tuples);
> +                                                         num_tuples - nkeep);
>
> With the above correction, there is a problem in reporting the number
> of live tuples to the stats.
>
> postgres=# select reltuples, n_live_tup, n_dead_tup
>               from pg_stat_user_tables join pg_class using (relname)
>              where relname = 't';
>  reltuples | n_live_tup | n_dead_tup
> -----------+------------+------------
>     899818 |     799636 |     100182
> (1 row)
>
>
> The live tuples data value is again decremented with dead tuples
> value before sending them to stats in function lazy_vacuum_rel(),
>
>     /* report results to the stats collector, too */
>     new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;
>
> The fix needs a correction here also. Or change the correction in
> lazy_vacuum_rel() function itself before updating catalog table similar
> like stats.

This patch is marked Waiting for Author, have you had a chance to look at this
to address the comments in the above review?

cheers ./daniel

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tomas Vondra
Дата:

On 09/06/2017 09:45 AM, Haribabu Kommi wrote:
> 
> 
> On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
> 
>     On 7/25/17 12:55 AM, Tom Lane wrote:
> 
>         Tomas Vondra <tomas.vondra@2ndquadrant.com
>         <mailto:tomas.vondra@2ndquadrant.com>> writes:
> 
>             It seems to me that VACUUM and ANALYZE somewhat disagree on what
>             exactly reltuples means. VACUUM seems to be thinking that
>             reltuples
>             = live + dead while ANALYZE apparently believes that reltuples =
>             live
> 
> 
>             The question is - which of the reltuples definitions is the
>             right
>             one? I've always assumed that "reltuples = live + dead" but
>             perhaps
>             not?
> 
> 
>         I think the planner basically assumes that reltuples is the live
>         tuple count, so maybe we'd better change VACUUM to get in step.
> 
> 
>     Attached is a patch that (I think) does just that. The disagreement
>     was caused by VACUUM treating recently dead tuples as live, while
>     ANALYZE treats both of those as dead.
> 
>     At first I was worried that this will negatively affect plans in the
>     long-running transaction, as it will get underestimates (due to
>     reltuples not including rows it can see). But that's a problem we
>     already have anyway, you just need to run ANALYZE in the other session.
> 
> 
> Thanks for the patch.
> From the mail, I understand that this patch tries to improve the
> reltuples value update in the catalog table by the vacuum command
> to consider the proper visible tuples similar like analyze command.
> 
> -num_tuples);
> +num_tuples - nkeep);
> 
> With the above correction, there is a problem in reporting the number
> of live tuples to the stats.
> 
> postgres=# select reltuples, n_live_tup, n_dead_tup
>               from pg_stat_user_tables join pg_class using (relname)
>              where relname = 't';
>  reltuples | n_live_tup | n_dead_tup 
> -----------+------------+------------
>     899818 |     799636 |     100182
> (1 row)
> 
> 
> The live tuples data value is again decremented with dead tuples
> value before sending them to stats in function lazy_vacuum_rel(),
> 
> /* report results to the stats collector, too */
> new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;
> 
> The fix needs a correction here also. Or change the correction in 
> lazy_vacuum_rel() function itself before updating catalog table similar
> like stats.
> 

Ah, haven't noticed that for some reason - you're right, we estimate the
reltuples based on (num_tuples - nkeep), so it doesn't make much sense
to subtract nkeep again. Attached is v2 of the fix.

I've removed the subtraction from lazy_vacuum_rel(), leaving just

    new_live_tuples = new_rel_tuples;

and now it behaves as expected (no second subtraction). That means we
can get rid of new_live_tuples altogether (and the protection against
negative values), and use new_rel_tuples directly.

Which pretty much means that in this case

    (pg_class.reltuples == pg_stat_user_tables.n_live_tup)

but I guess that's fine, based on the initial discussion in this thread.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tomas Vondra
Дата:
Hi,

Apologies, I forgot to respond to the second part of your message.

On 09/06/2017 09:45 AM, Haribabu Kommi wrote:
>
> While testing this patch, I found another problem that is not related to
> this patch. When the vacuum command is executed mutiple times on
> a table with no dead rows, the number of reltuples value is slowly
> reducing.
> 
> postgres=# select reltuples, n_live_tup, n_dead_tup
>               from pg_stat_user_tables join pg_class using (relname)
>              where relname = 't';
>  reltuples | n_live_tup | n_dead_tup 
> -----------+------------+------------
>     899674 |     899674 |          0
> (1 row)
> 
> postgres=# vacuum t;
> VACUUM
> postgres=# select reltuples, n_live_tup, n_dead_tup
>               from pg_stat_user_tables join pg_class using (relname)
>              where relname = 't';
>  reltuples | n_live_tup | n_dead_tup 
> -----------+------------+------------
>     899622 |     899622 |          0
> (1 row)
> 
> postgres=# vacuum t;
> VACUUM
> postgres=# select reltuples, n_live_tup, n_dead_tup
>               from pg_stat_user_tables join pg_class using (relname)
>              where relname = 't';
>  reltuples | n_live_tup | n_dead_tup 
> -----------+------------+------------
>     899570 |     899570 |          0
> (1 row)
> 
> 
> In lazy_scan_heap() function, we force to scan the last page of the 
> relation to avoid the access exclusive lock in lazy_truncate_heap if
> there are tuples in the last page. Because of this reason, the 
> scanned_pages value will never be 0, so the vac_estimate_reltuples 
> function will estimate the tuples based on the number of tuples from
> the last page of the relation. This estimation is leading to reduce
> the number of retuples.
> 

Hmmm, that's annoying. Perhaps if we should not update the values in
this case, then? I mean, if we only scan the last page, how reliable the
derived values are?

For the record - AFAICS this issue is unrelated to do with the patch
(i.e. it's not introduced by it).

> I am thinking whether this problem really happen in real world 
> scenarios to produce a fix?
> 

Not sure.

As vacuum run decrements the query only a little bit, so you'd have to
run the vacuum many times to be actually bitten by it. For people
relying on autovacuum that won't happen, as it only runs on tables with
certain number of dead tuples.

So you'd have to be running VACUUM in a loop or something (but not
VACUUM ANALYZE, because that works fine) from a script, or something
like that.

That being said, fixing a bug is always a good thing I guess.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Haribabu Kommi
Дата:


On Mon, Sep 25, 2017 at 4:39 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:


On 09/06/2017 09:45 AM, Haribabu Kommi wrote:
>
>
> On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>
>     On 7/25/17 12:55 AM, Tom Lane wrote:
>
>         Tomas Vondra <tomas.vondra@2ndquadrant.com
>         <mailto:tomas.vondra@2ndquadrant.com>> writes:
>
>             It seems to me that VACUUM and ANALYZE somewhat disagree on what
>             exactly reltuples means. VACUUM seems to be thinking that
>             reltuples
>             = live + dead while ANALYZE apparently believes that reltuples =
>             live
>
>
>             The question is - which of the reltuples definitions is the
>             right
>             one? I've always assumed that "reltuples = live + dead" but
>             perhaps
>             not?
>
>
>         I think the planner basically assumes that reltuples is the live
>         tuple count, so maybe we'd better change VACUUM to get in step.
>
>
>     Attached is a patch that (I think) does just that. The disagreement
>     was caused by VACUUM treating recently dead tuples as live, while
>     ANALYZE treats both of those as dead.
>
>     At first I was worried that this will negatively affect plans in the
>     long-running transaction, as it will get underestimates (due to
>     reltuples not including rows it can see). But that's a problem we
>     already have anyway, you just need to run ANALYZE in the other session.
>
>
> Thanks for the patch.
> From the mail, I understand that this patch tries to improve the
> reltuples value update in the catalog table by the vacuum command
> to consider the proper visible tuples similar like analyze command.
>
> -num_tuples);
> +num_tuples - nkeep);
>
> With the above correction, there is a problem in reporting the number
> of live tuples to the stats.
>
> postgres=# select reltuples, n_live_tup, n_dead_tup
>               from pg_stat_user_tables join pg_class using (relname)
>              where relname = 't';
>  reltuples | n_live_tup | n_dead_tup 
> -----------+------------+------------
>     899818 |     799636 |     100182
> (1 row)
>
>
> The live tuples data value is again decremented with dead tuples
> value before sending them to stats in function lazy_vacuum_rel(),
>
> /* report results to the stats collector, too */
> new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;
>
> The fix needs a correction here also. Or change the correction in 
> lazy_vacuum_rel() function itself before updating catalog table similar
> like stats.
>

Ah, haven't noticed that for some reason - you're right, we estimate the
reltuples based on (num_tuples - nkeep), so it doesn't make much sense
to subtract nkeep again. Attached is v2 of the fix.

I've removed the subtraction from lazy_vacuum_rel(), leaving just

    new_live_tuples = new_rel_tuples;

and now it behaves as expected (no second subtraction). That means we
can get rid of new_live_tuples altogether (and the protection against
negative values), and use new_rel_tuples directly.

Which pretty much means that in this case

    (pg_class.reltuples == pg_stat_user_tables.n_live_tup)

but I guess that's fine, based on the initial discussion in this thread.

The changes are fine and now it reports proper live tuples in both
pg_class and stats. The other issue of continuous vacuum operation
leading to decrease of number of live tuples is not related to this
patch and that can be handled separately. 

I changed the patch status as ready for committer.

Regards,
Hari Babu
Fujitsu Australia

Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tom Lane
Дата:
Haribabu Kommi <kommi.haribabu@gmail.com> writes:
> The changes are fine and now it reports proper live tuples in both
> pg_class and stats. The other issue of continuous vacuum operation
> leading to decrease of number of live tuples is not related to this
> patch and that can be handled separately.

I did not like this patch too much, because it did nothing to fix
the underlying problem of confusion between "live tuples" and "total
tuples"; in fact, it made that worse, because now the comment on
LVRelStats.new_rel_tuples is a lie.  And there's at least one usage
of that field value where I think we do want total tuples not live
tuples: where we pass it to index AM cleanup functions.  Indexes
don't really care whether heap entries are live or not, only that
they're there.  Plus the VACUUM VERBOSE log message that uses the
field is supposed to be reporting total tuples not live tuples.

I hacked up the patch trying to make that better, finding along the
way that contrib/pgstattuple shared in the confusion about what
it was trying to count.  Results attached.

However, I'm not sure we're there yet, because there remains a fairly
nasty discrepancy even once we've gotten everyone onto the same page
about reltuples counting just live tuples: VACUUM and ANALYZE have
different definitions of what's "live".  In particular they do not treat
INSERT_IN_PROGRESS and DELETE_IN_PROGRESS tuples the same.  Should we
try to do something about that?  If so, what?  It looks like ANALYZE's
behavior is pretty tightly constrained, judging by the comments in
acquire_sample_rows.

Another problem is that it looks like CREATE INDEX will set reltuples
to the total number of heap entries it chose to index, because that
is what IndexBuildHeapScan counts.  Maybe we should adjust that?

            regards, tom lane

diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c
index 5bf0613..68211c6 100644
*** a/contrib/pgstattuple/pgstatapprox.c
--- b/contrib/pgstattuple/pgstatapprox.c
*************** statapprox_heap(Relation rel, output_typ
*** 68,74 ****
      Buffer        vmbuffer = InvalidBuffer;
      BufferAccessStrategy bstrategy;
      TransactionId OldestXmin;
-     uint64        misc_count = 0;

      OldestXmin = GetOldestXmin(rel, PROCARRAY_FLAGS_VACUUM);
      bstrategy = GetAccessStrategy(BAS_BULKREAD);
--- 68,73 ----
*************** statapprox_heap(Relation rel, output_typ
*** 153,177 ****
              tuple.t_tableOid = RelationGetRelid(rel);

              /*
!              * We count live and dead tuples, but we also need to add up
!              * others in order to feed vac_estimate_reltuples.
               */
              switch (HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buf))
              {
-                 case HEAPTUPLE_RECENTLY_DEAD:
-                     misc_count++;
-                     /* Fall through */
-                 case HEAPTUPLE_DEAD:
-                     stat->dead_tuple_len += tuple.t_len;
-                     stat->dead_tuple_count++;
-                     break;
                  case HEAPTUPLE_LIVE:
                      stat->tuple_len += tuple.t_len;
                      stat->tuple_count++;
                      break;
!                 case HEAPTUPLE_INSERT_IN_PROGRESS:
!                 case HEAPTUPLE_DELETE_IN_PROGRESS:
!                     misc_count++;
                      break;
                  default:
                      elog(ERROR, "unexpected HeapTupleSatisfiesVacuum result");
--- 152,172 ----
              tuple.t_tableOid = RelationGetRelid(rel);

              /*
!              * We follow VACUUM's lead in counting INSERT_IN_PROGRESS and
!              * DELETE_IN_PROGRESS tuples as "live".
               */
              switch (HeapTupleSatisfiesVacuum(&tuple, OldestXmin, buf))
              {
                  case HEAPTUPLE_LIVE:
+                 case HEAPTUPLE_INSERT_IN_PROGRESS:
+                 case HEAPTUPLE_DELETE_IN_PROGRESS:
                      stat->tuple_len += tuple.t_len;
                      stat->tuple_count++;
                      break;
!                 case HEAPTUPLE_DEAD:
!                 case HEAPTUPLE_RECENTLY_DEAD:
!                     stat->dead_tuple_len += tuple.t_len;
!                     stat->dead_tuple_count++;
                      break;
                  default:
                      elog(ERROR, "unexpected HeapTupleSatisfiesVacuum result");
*************** statapprox_heap(Relation rel, output_typ
*** 184,191 ****

      stat->table_len = (uint64) nblocks * BLCKSZ;

      stat->tuple_count = vac_estimate_reltuples(rel, false, nblocks, scanned,
!                                                stat->tuple_count + misc_count);

      /*
       * Calculate percentages if the relation has one or more pages.
--- 179,190 ----

      stat->table_len = (uint64) nblocks * BLCKSZ;

+     /*
+      * Extrapolate the live-tuple count to the whole table in the same way
+      * that VACUUM does.  (XXX shouldn't we also extrapolate tuple_len?)
+      */
      stat->tuple_count = vac_estimate_reltuples(rel, false, nblocks, scanned,
!                                                stat->tuple_count);

      /*
       * Calculate percentages if the relation has one or more pages.
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index ef60a58..87bba31 100644
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
*************** SCRAM-SHA-256$<replaceable><iteration
*** 1739,1746 ****
        <entry><type>float4</type></entry>
        <entry></entry>
        <entry>
!        Number of rows in the table.  This is only an estimate used by the
!        planner.  It is updated by <command>VACUUM</command>,
         <command>ANALYZE</command>, and a few DDL commands such as
         <command>CREATE INDEX</command>.
        </entry>
--- 1739,1746 ----
        <entry><type>float4</type></entry>
        <entry></entry>
        <entry>
!        Number of live rows in the table.  This is only an estimate used by
!        the planner.  It is updated by <command>VACUUM</command>,
         <command>ANALYZE</command>, and a few DDL commands such as
         <command>CREATE INDEX</command>.
        </entry>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index cbd6e9b..067034d 100644
*** a/src/backend/commands/vacuum.c
--- b/src/backend/commands/vacuum.c
*************** vacuum_set_xid_limits(Relation rel,
*** 771,776 ****
--- 771,779 ----
   *        we take the old value of pg_class.reltuples as a measurement of the
   *        tuple density in the unscanned pages.
   *
+  *        Note: scanned_tuples should count only *live* tuples, since
+  *        pg_class.reltuples is defined that way.
+  *
   *        This routine is shared by VACUUM and ANALYZE.
   */
  double
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 6587db7..ed14db2 100644
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
*************** typedef struct LVRelStats
*** 115,122 ****
      BlockNumber frozenskipped_pages;    /* # of frozen pages we skipped */
      BlockNumber tupcount_pages; /* pages whose tuples we counted */
      double        scanned_tuples; /* counts only tuples on tupcount_pages */
!     double        old_rel_tuples; /* previous value of pg_class.reltuples */
      double        new_rel_tuples; /* new estimated total # of tuples */
      double        new_dead_tuples;    /* new estimated total # of dead tuples */
      BlockNumber pages_removed;
      double        tuples_deleted;
--- 115,123 ----
      BlockNumber frozenskipped_pages;    /* # of frozen pages we skipped */
      BlockNumber tupcount_pages; /* pages whose tuples we counted */
      double        scanned_tuples; /* counts only tuples on tupcount_pages */
!     double        old_live_tuples;    /* previous value of pg_class.reltuples */
      double        new_rel_tuples; /* new estimated total # of tuples */
+     double        new_live_tuples;    /* new estimated total # of live tuples */
      double        new_dead_tuples;    /* new estimated total # of dead tuples */
      BlockNumber pages_removed;
      double        tuples_deleted;
*************** lazy_vacuum_rel(Relation onerel, int opt
*** 196,202 ****
      TransactionId xidFullScanLimit;
      MultiXactId mxactFullScanLimit;
      BlockNumber new_rel_pages;
-     double        new_rel_tuples;
      BlockNumber new_rel_allvisible;
      double        new_live_tuples;
      TransactionId new_frozen_xid;
--- 197,202 ----
*************** lazy_vacuum_rel(Relation onerel, int opt
*** 245,251 ****
      vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));

      vacrelstats->old_rel_pages = onerel->rd_rel->relpages;
!     vacrelstats->old_rel_tuples = onerel->rd_rel->reltuples;
      vacrelstats->num_index_scans = 0;
      vacrelstats->pages_removed = 0;
      vacrelstats->lock_waiter_detected = false;
--- 245,251 ----
      vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));

      vacrelstats->old_rel_pages = onerel->rd_rel->relpages;
!     vacrelstats->old_live_tuples = onerel->rd_rel->reltuples;
      vacrelstats->num_index_scans = 0;
      vacrelstats->pages_removed = 0;
      vacrelstats->lock_waiter_detected = false;
*************** lazy_vacuum_rel(Relation onerel, int opt
*** 311,321 ****
       * since then we don't know for certain that all tuples have a newer xmin.
       */
      new_rel_pages = vacrelstats->rel_pages;
!     new_rel_tuples = vacrelstats->new_rel_tuples;
      if (vacrelstats->tupcount_pages == 0 && new_rel_pages > 0)
      {
          new_rel_pages = vacrelstats->old_rel_pages;
!         new_rel_tuples = vacrelstats->old_rel_tuples;
      }

      visibilitymap_count(onerel, &new_rel_allvisible, NULL);
--- 311,321 ----
       * since then we don't know for certain that all tuples have a newer xmin.
       */
      new_rel_pages = vacrelstats->rel_pages;
!     new_live_tuples = vacrelstats->new_live_tuples;
      if (vacrelstats->tupcount_pages == 0 && new_rel_pages > 0)
      {
          new_rel_pages = vacrelstats->old_rel_pages;
!         new_live_tuples = vacrelstats->old_live_tuples;
      }

      visibilitymap_count(onerel, &new_rel_allvisible, NULL);
*************** lazy_vacuum_rel(Relation onerel, int opt
*** 327,333 ****

      vac_update_relstats(onerel,
                          new_rel_pages,
!                         new_rel_tuples,
                          new_rel_allvisible,
                          vacrelstats->hasindex,
                          new_frozen_xid,
--- 327,333 ----

      vac_update_relstats(onerel,
                          new_rel_pages,
!                         new_live_tuples,
                          new_rel_allvisible,
                          vacrelstats->hasindex,
                          new_frozen_xid,
*************** lazy_vacuum_rel(Relation onerel, int opt
*** 335,344 ****
                          false);

      /* report results to the stats collector, too */
-     new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;
-     if (new_live_tuples < 0)
-         new_live_tuples = 0;    /* just in case */
-
      pgstat_report_vacuum(RelationGetRelid(onerel),
                           onerel->rd_rel->relisshared,
                           new_live_tuples,
--- 335,340 ----
*************** lazy_scan_heap(Relation onerel, int opti
*** 1275,1284 ****
      vacrelstats->new_dead_tuples = nkeep;

      /* now we can compute the new value for pg_class.reltuples */
!     vacrelstats->new_rel_tuples = vac_estimate_reltuples(onerel, false,
!                                                          nblocks,
!                                                          vacrelstats->tupcount_pages,
!                                                          num_tuples);

      /*
       * Release any remaining pin on visibility map page.
--- 1271,1284 ----
      vacrelstats->new_dead_tuples = nkeep;

      /* now we can compute the new value for pg_class.reltuples */
!     vacrelstats->new_live_tuples = vac_estimate_reltuples(onerel, false,
!                                                           nblocks,
!                                                           vacrelstats->tupcount_pages,
!                                                           num_tuples - nkeep);
!
!     /* indexes probably care more about the total number of heap entries */
!     vacrelstats->new_rel_tuples =
!         vacrelstats->new_live_tuples + vacrelstats->new_dead_tuples;

      /*
       * Release any remaining pin on visibility map page.
*************** lazy_vacuum_index(Relation indrel,
*** 1614,1620 ****
      ivinfo.analyze_only = false;
      ivinfo.estimated_count = true;
      ivinfo.message_level = elevel;
!     ivinfo.num_heap_tuples = vacrelstats->old_rel_tuples;
      ivinfo.strategy = vac_strategy;

      /* Do bulk deletion */
--- 1614,1621 ----
      ivinfo.analyze_only = false;
      ivinfo.estimated_count = true;
      ivinfo.message_level = elevel;
!     /* We can only provide an approximate value of num_heap_tuples here */
!     ivinfo.num_heap_tuples = vacrelstats->old_live_tuples;
      ivinfo.strategy = vac_strategy;

      /* Do bulk deletion */
*************** lazy_cleanup_index(Relation indrel,
*** 1645,1650 ****
--- 1646,1652 ----
      ivinfo.analyze_only = false;
      ivinfo.estimated_count = (vacrelstats->tupcount_pages < vacrelstats->rel_pages);
      ivinfo.message_level = elevel;
+     /* Now we can provide a better estimate of total # of surviving tuples */
      ivinfo.num_heap_tuples = vacrelstats->new_rel_tuples;
      ivinfo.strategy = vac_strategy;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tomas Vondra
Дата:
Hi,

On 11/02/2017 08:15 PM, Tom Lane wrote:
> Haribabu Kommi <kommi.haribabu@gmail.com> writes:
>> The changes are fine and now it reports proper live tuples in both
>> pg_class and stats. The other issue of continuous vacuum operation
>> leading to decrease of number of live tuples is not related to this
>> patch and that can be handled separately.
> 
> I did not like this patch too much, because it did nothing to fix
> the underlying problem of confusion between "live tuples" and "total
> tuples"; in fact, it made that worse, because now the comment on
> LVRelStats.new_rel_tuples is a lie.  And there's at least one usage
> of that field value where I think we do want total tuples not live
> tuples: where we pass it to index AM cleanup functions.  Indexes
> don't really care whether heap entries are live or not, only that
> they're there.  Plus the VACUUM VERBOSE log message that uses the
> field is supposed to be reporting total tuples not live tuples.
> 
> I hacked up the patch trying to make that better, finding along the
> way that contrib/pgstattuple shared in the confusion about what
> it was trying to count.  Results attached.
> 

Thanks for looking into this. I agree your patch is more consistent and
generally cleaner.

> However, I'm not sure we're there yet, because there remains a fairly
> nasty discrepancy even once we've gotten everyone onto the same page
> about reltuples counting just live tuples: VACUUM and ANALYZE have
> different definitions of what's "live".  In particular they do not treat
> INSERT_IN_PROGRESS and DELETE_IN_PROGRESS tuples the same.  Should we
> try to do something about that?  If so, what?  It looks like ANALYZE's
> behavior is pretty tightly constrained, judging by the comments in
> acquire_sample_rows.
> 

Yeah :-(

For the record (and people following this thread who are too lazy to
open the analyze.c and search for the comments), the problem is that
acquire_sample_rows does not count HEAPTUPLE_INSERT_IN_PROGRESS as live
(and HEAPTUPLE_DELETE_IN_PROGRESS as dead) as it assumes the transaction
will send the stats at the end.

Which is a correct assumption, but it means that when there is a
long-running transaction that inserted/deleted many rows, the reltuples
value will oscillate during VACUUM / ANALYZE runs.

ISTM we need to unify those definitions, probably so that VACUUM adopts
what acquire_sample_rows does. I mean, if ANALYZE assumes that the stats
will be updated at the end of transaction, why shouldn't VACUUM do the
same thing?

The one reason I can think of is that VACUUM is generally expected to
run longer than ANALYZE, so the assumption that large transactions
complete later is not quite reliable.

Or is there some other reason why VACUUM can't treat the in-progress
tuples the same way?

> Another problem is that it looks like CREATE INDEX will set reltuples
> to the total number of heap entries it chose to index, because that
> is what IndexBuildHeapScan counts.  Maybe we should adjust that?
> 

You mean by only counting live tuples in IndexBuildHeapRangeScan,
following whatever definition we end up using in VACUUM/ANALYZE? Seems
like a good idea I guess, although CREATE INDEX not as frequent as the
other commands.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 11/02/2017 08:15 PM, Tom Lane wrote:
>> However, I'm not sure we're there yet, because there remains a fairly
>> nasty discrepancy even once we've gotten everyone onto the same page
>> about reltuples counting just live tuples: VACUUM and ANALYZE have
>> different definitions of what's "live".  In particular they do not treat
>> INSERT_IN_PROGRESS and DELETE_IN_PROGRESS tuples the same.  Should we
>> try to do something about that?  If so, what?  It looks like ANALYZE's
>> behavior is pretty tightly constrained, judging by the comments in
>> acquire_sample_rows.

> ISTM we need to unify those definitions, probably so that VACUUM adopts
> what acquire_sample_rows does. I mean, if ANALYZE assumes that the stats
> will be updated at the end of transaction, why shouldn't VACUUM do the
> same thing?

That was the way I was leaning.  I haven't thought very hard about the
implications, but as long as the change in VACUUM's behavior extends
only to the live-tuple count it reports, it seems like adjusting it
couldn't break anything too badly.

>> Another problem is that it looks like CREATE INDEX will set reltuples
>> to the total number of heap entries it chose to index, because that
>> is what IndexBuildHeapScan counts.  Maybe we should adjust that?

> You mean by only counting live tuples in IndexBuildHeapRangeScan,
> following whatever definition we end up using in VACUUM/ANALYZE?

Right.  One issue is that, as I mentioned, the index AMs probably want to
think about total-tuples-indexed not live-tuples; so for their purposes,
what IndexBuildHeapScan currently counts is the right thing.  We need to
look and see if any AMs are actually using that value rather than just
silently passing it back.  If they are, we might need to go to the trouble
of computing/returning two values.
        regards, tom lane


Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Alvaro Herrera
Дата:
BTW see bug #14863 which is related to this:
https://postgr.es/m/CAEBTBzu5j_E1K1jb9OKwTZj98MPeM7V81-vadp5adRm=NhJnwA@mail.gmail.com

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Michael Paquier
Дата:
On Sun, Nov 19, 2017 at 6:40 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> Thanks for looking into this. I agree your patch is more consistent and
> generally cleaner.

This is classified as a bug fix, and is marked as waiting on author. I
am moving it to next CF as work continues.
-- 
Michael


Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Stephen Frost
Дата:
Tomas,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Sun, Nov 19, 2017 at 6:40 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
> > Thanks for looking into this. I agree your patch is more consistent and
> > generally cleaner.
>
> This is classified as a bug fix, and is marked as waiting on author. I
> am moving it to next CF as work continues.

This is still in waiting-on-author state; it'd be great to get your
thoughts on where this patch is and what the next steps are to move it
forward.  If you need additional feedback or there needs to be more
discussion (perhaps with Tom) then maybe this should be in needs-review
instead, but it'd be great if you could review the discussion and patch
again and at least provide an update on it (last update from you was in
November).

Thanks!

Stephen

Вложения

Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tomas Vondra
Дата:
On 01/05/2018 05:25 AM, Stephen Frost wrote:
> Tomas,
> 
> * Michael Paquier (michael.paquier@gmail.com) wrote:
>> On Sun, Nov 19, 2017 at 6:40 AM, Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>> Thanks for looking into this. I agree your patch is more consistent and
>>> generally cleaner.
>>
>> This is classified as a bug fix, and is marked as waiting on author. I
>> am moving it to next CF as work continues.
> 
> This is still in waiting-on-author state; it'd be great to get your 
> thoughts on where this patch is and what the next steps are to move
> it forward. If you need additional feedback or there needs to be
> more discussion (perhaps with Tom) then maybe this should be in
> needs-review instead, but it'd be great if you could review the
> discussion and patch again and at least provide an update on it (last
> update from you was in November).
> 

Hmmm, I'm not sure, TBH.

As I already mentioned, Tom's updated patch is better than what I posted
initially, and I agree with his approach to the remaining issues he
pointed out. But I somehow assumed that he's already looking into that.

Tom, do you plan to look into this patch soon, or should I?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> As I already mentioned, Tom's updated patch is better than what I posted
> initially, and I agree with his approach to the remaining issues he
> pointed out. But I somehow assumed that he's already looking into that.
> Tom, do you plan to look into this patch soon, or should I?

No, I thought you were going to run with those ideas.  I have a lot of
other stuff on my plate ...

            regards, tom lane


Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tomas Vondra
Дата:

On 01/08/2018 08:39 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> As I already mentioned, Tom's updated patch is better than what I
>> posted initially, and I agree with his approach to the remaining
>> issues he pointed out. But I somehow assumed that he's already
>> looking into that. Tom, do you plan to look into this patch soon,
>> or should I?
> 
> No, I thought you were going to run with those ideas. I have a lot
> of other stuff on my plate ...
> 

OK, will do.

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuplesmeans

От
David Steele
Дата:
Hi Tomas,

On 1/8/18 3:28 PM, Tomas Vondra wrote:
> 
> 
> On 01/08/2018 08:39 PM, Tom Lane wrote:
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>> As I already mentioned, Tom's updated patch is better than what I
>>> posted initially, and I agree with his approach to the remaining
>>> issues he pointed out. But I somehow assumed that he's already
>>> looking into that. Tom, do you plan to look into this patch soon,
>>> or should I?
>>
>> No, I thought you were going to run with those ideas. I have a lot
>> of other stuff on my plate ...
>>
> 
> OK, will do.

What the status of this patch?  It's been waiting on author since last
November, though I can see there was some confusion over who was working
on it.

Given that it's a bug fix it would be good to see a patch for this CF,
or very soon after.

Thanks,
-- 
-David
david@pgmasters.net


Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tomas Vondra
Дата:

On 03/05/2018 04:12 PM, David Steele wrote:
> Hi Tomas,
> 
> On 1/8/18 3:28 PM, Tomas Vondra wrote:
>>
>>
>> On 01/08/2018 08:39 PM, Tom Lane wrote:
>>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>>> As I already mentioned, Tom's updated patch is better than what I
>>>> posted initially, and I agree with his approach to the remaining
>>>> issues he pointed out. But I somehow assumed that he's already
>>>> looking into that. Tom, do you plan to look into this patch soon,
>>>> or should I?
>>>
>>> No, I thought you were going to run with those ideas. I have a lot
>>> of other stuff on my plate ...
>>>
>>
>> OK, will do.
> 
> What the status of this patch?  It's been waiting on author since last
> November, though I can see there was some confusion over who was working
> on it.
> 
> Given that it's a bug fix it would be good to see a patch for this CF,
> or very soon after.
> 

Yeah, it's the next thing on my TODO.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tomas Vondra
Дата:
Hi,

So here is an updated version of the patch/fix, addressing the remaining
issues in v3 posted by Tom in November.


The 0001 part is actually a bugfix in bloom and spgist index AM, which
did something like this:

    reltuples = IndexBuildHeapScan(...)

    result->heap_tuples = result->index_tuples = reltuples;

That is, these two AMs simply used the number of heap rows for the
index. That does not work for partial indexes, of course, where the
correct index reltuples value is likely much lower.

0001 fixes this by tracking the number of actually indexed rows in the
build states, just like in the other index AMs.

A VACUUM or ANALYZE will fix the estimate, of course, but for tables
that are not changing very much it may take quite a while. So I think
this is something we definitely need to back-patch.


The 0002 part is the main part, unifying the definition of reltuples on
three main places:

 a) acquire_sample_rows (ANALYZE)
 b) lazy_scan_heap (VACUUM)
 c) IndexBuildHeapRangeScan (CREATE INDEX)

As the ANALYZE case seems the most constrained, the other two places
were updated to use the same criteria for which rows to include in the
reltuples estimate:

  * HEAPTUPLE_LIVE
  * HEAPTUPLE_INSERT_IN_PROGRESS (same transaction)
  * HEAPTUPLE_DELETE_IN_PROGRESS (not the same trasaction)

This resolves the issue with oscillating reltuples estimates, produced
by VACUUM and ANALYZE (with many non-removable dead tuples).

I've checked all IndexBuildHeapRangeScan callers, and none of them is
using the reltuples estimate for anything except for passing it to
index_update_stats. Aside from the bug fixed in 0001, of course.



regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> 0001 fixes this by tracking the number of actually indexed rows in the
> build states, just like in the other index AMs.
> A VACUUM or ANALYZE will fix the estimate, of course, but for tables
> that are not changing very much it may take quite a while. So I think
> this is something we definitely need to back-patch.

Agreed, and done.  I noticed a second bug in contrib/bloom, too: it'd
forget to write out the last index page if that contained only one
tuple, because the new-page code path in bloomBuildCallback failed to
increment the "count" field after clearing it.

> The 0002 part is the main part, unifying the definition of reltuples on
> three main places:

On to this part ...

            regards, tom lane


Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tom Lane
Дата:
I wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> The 0002 part is the main part, unifying the definition of reltuples on
>> three main places:

> On to this part ...

I've pushed 0002 with several corrections: it did not seem to me that
you'd correctly propagated what ANALYZE is doing into CREATE INDEX or
pgstattuple.  Also, since one of the things VACUUM does with num_tuples
is to report it as "the total number of non-removable tuples", I thought
we'd better leave that calculation alone.  I made the added code count
live tuples in a new variable live_tuples, instead.

            regards, tom lane


Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

От
Tomas Vondra
Дата:
On 03/22/2018 08:51 PM, Tom Lane wrote:
> I wrote:
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>> The 0002 part is the main part, unifying the definition of reltuples on
>>> three main places:
> 
>> On to this part ...
> 
> I've pushed 0002 with several corrections: it did not seem to me that
> you'd correctly propagated what ANALYZE is doing into CREATE INDEX or
> pgstattuple.  Also, since one of the things VACUUM does with num_tuples
> is to report it as "the total number of non-removable tuples", I thought
> we'd better leave that calculation alone.  I made the added code count
> live tuples in a new variable live_tuples, instead.
> 

OK, makes sense. Thanks.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuplesinaccurate.

От
Justin Pryzby
Дата:
Just want to add for the archive that I happened to run across what appears to
be a 7-year old report of (I think) both of these vacuum/analyze bugs:

Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
                                                                                          
 
Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
                                                                                      
 


https://www.postgresql.org/message-id/flat/6373488.29.1323717222303.JavaMail.root%40store1.zcs.ext.wpsrv.net#6373488.29.1323717222303.JavaMail.root@store1.zcs.ext.wpsrv.net


Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuplesinaccurate.

От
David Gould
Дата:
On Mon, 23 Apr 2018 20:09:21 -0500
Justin Pryzby <pryzby@telsasoft.com> wrote:

> Just want to add for the archive that I happened to run across what appears to
> be a 7-year old report of (I think) both of these vacuum/analyze bugs:
> 
> Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
                                                                                            
 
> Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
                                                                                        
 
> 
>
https://www.postgresql.org/message-id/flat/6373488.29.1323717222303.JavaMail.root%40store1.zcs.ext.wpsrv.net#6373488.29.1323717222303.JavaMail.root@store1.zcs.ext.wpsrv.net

Nice find. It does look like both, although, since the info is not in the
thread it is not certain that relpages was large enough to hit the analyze
issue. Unless that was with the old default statistics target in which case
it would be pretty easy to hit.

-dg


-- 
David Gould                                   daveg@sonic.net
If simplicity worked, the world would be overrun with insects.