Обсуждение: Why not represent "never vacuumed" accurately wrt pg_class.relpages?

Поиск
Список
Период
Сортировка

Why not represent "never vacuumed" accurately wrt pg_class.relpages?

От
Andres Freund
Дата:
Hi,

estimate_rel_size() explains:

            /*
             * HACK: if the relation has never yet been vacuumed, use a
             * minimum size estimate of 10 pages.  The idea here is to avoid
             * assuming a newly-created table is really small, even if it
             * currently is, because that may not be true once some data gets
             * loaded into it.  Once a vacuum or analyze cycle has been done
             * on it, it's more reasonable to believe the size is somewhat
             * stable.
             *
             * (Note that this is only an issue if the plan gets cached and
             * used again after the table has been filled.  What we're trying
             * to avoid is using a nestloop-type plan on a table that has
             * grown substantially since the plan was made.  Normally,
             * autovacuum/autoanalyze will occur once enough inserts have
             * happened and cause cached-plan invalidation; but that doesn't
             * happen instantaneously, and it won't happen at all for cases
             * such as temporary tables.)
             *
             * We approximate "never vacuumed" by "has relpages = 0", which
             * means this will also fire on genuinely empty relations.  Not
             * great, but fortunately that's a seldom-seen case in the real
             * world, and it shouldn't degrade the quality of the plan too
             * much anyway to err in this direction.
             *
             * There are two exceptions wherein we don't apply this heuristic.
             * One is if the table has inheritance children.  Totally empty
             * parent tables are quite common, so we should be willing to
             * believe that they are empty.  Also, we don't apply the 10-page
             * minimum to indexes.
             */

I don't quite get why we don't instead just represent "never vacuumed"
by storing a more meaningful value in relpages?  We could go for
InvalidBlockNumber, or even NULL (although the latter would be a bit
annoying due to not being mappable to a struct anymore).

I've seen numerous cases where relpages = 0 -> never vacuumed has caused
worse plans, and it just doesn't seem necessary?

Greetings,

Andres Freund


Re: Why not represent "never vacuumed" accurately wrt pg_class.relpages?

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> I don't quite get why we don't instead just represent "never vacuumed"
> by storing a more meaningful value in relpages?

Mostly, not wanting to break clients that look at these fields.
If catalog compatibility weren't a concern, I'd seriously consider
replacing both of them with a float "average tuples per page" ratio.

> We could go for
> InvalidBlockNumber, or even NULL (although the latter would be a bit
> annoying due to not being mappable to a struct anymore).

NULL seems right out on every ground.  I don't much care for
InvalidBlockNumber either.

> I've seen numerous cases where relpages = 0 -> never vacuumed has caused
> worse plans, and it just doesn't seem necessary?

Worse plans than what?  And why do you blame it on this representation?
We don't believe that relpages is the actual size of the table.

            regards, tom lane


Re: Why not represent "never vacuumed" accurately wrtpg_class.relpages?

От
Andres Freund
Дата:
Hi,

On 2018-12-11 09:47:38 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I don't quite get why we don't instead just represent "never vacuumed"
> > by storing a more meaningful value in relpages?
> 
> Mostly, not wanting to break clients that look at these fields.
> If catalog compatibility weren't a concern, I'd seriously consider
> replacing both of them with a float "average tuples per page" ratio.

Yea, that'd be better.  I'm not sure I believe this is a grave concern,
but if we really are concerned about not breaking clients, we could just
add a separate bool for the fact the table has been vacuumed.


> > I've seen numerous cases where relpages = 0 -> never vacuumed has caused
> > worse plans, and it just doesn't seem necessary?
> 
> Worse plans than what?

The plans if the stats were believed.


> And why do you blame it on this representation?  We don't believe that
> relpages is the actual size of the table.

No, but we assume that there's 10 pages. Even if both relpages and the
actual relation stats say there's not. And we assume there's as many
tuples on the page as can fit on it, using get_rel_data_width().  So if
you have a small table with a handful of entries at most, you suddenly
get estimates of a few hundred to ~a thousand rows.

I'd one client once insert a row into a lock table to make sense, just
to make sure it never gets vacuumed while there were no locks.


CREATE TABLE locks(lockid int);
VACUUM locks;
EXPLAIN SELECT * FROM locks;
┌─────────────────────────────────────────────────────────┐
│                       QUERY PLAN                        │
├─────────────────────────────────────────────────────────┤
│ Seq Scan on locks  (cost=0.00..35.50 rows=2550 width=4) │
└─────────────────────────────────────────────────────────┘
(1 row)
INSERT INTO locks VALUES (1);
VACUUM locks;
EXPLAIN SELECT * FROM locks;
┌─────────────────────────────────────────────────────┐
│                     QUERY PLAN                      │
├─────────────────────────────────────────────────────┤
│ Seq Scan on locks  (cost=0.00..1.01 rows=1 width=4) │
└─────────────────────────────────────────────────────┘
(1 row)
DELETE FROM locks;
VACUUM locks ;
EXPLAIN SELECT * FROM locks;
┌─────────────────────────────────────────────────────────┐
│                       QUERY PLAN                        │
├─────────────────────────────────────────────────────────┤
│ Seq Scan on locks  (cost=0.00..35.50 rows=2550 width=4) │
└─────────────────────────────────────────────────────────┘
(1 row)

Estimates on a table constantly varying by three orders of magnitude -
not hard to imagine large plan changes due to that.


That's not what I was complaining about, but isn't it fairly broken that
the "actually empty relation" case here doesn't get hit if relpages ==
0?  We clamp curpages to 10 before returning for the the empty size:

            if (curpages < 10 &&
                rel->rd_rel->relpages == 0 &&
                !rel->rd_rel->relhassubclass &&
                rel->rd_rel->relkind != RELKIND_INDEX)
                curpages = 10;

            /* report estimated # pages */
            *pages = curpages;
            /* quick exit if rel is clearly empty */
            if (curpages == 0)
            {
                *tuples = 0;
                *allvisfrac = 0;
                break;
            }

I guess you could argue that the relation would potentially not be be
empty anymore by the time the plan is executed, but if that were part of
the logic it a) wouldn't just be relevant if relpages = 0, and b) should
be documented.

Greetings,

Andres Freund


Re: Why not represent "never vacuumed" accurately wrt pg_class.relpages?

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2018-12-11 09:47:38 -0500, Tom Lane wrote:
>> And why do you blame it on this representation?  We don't believe that
>> relpages is the actual size of the table.

> No, but we assume that there's 10 pages. Even if both relpages and the
> actual relation stats say there's not. And we assume there's as many
> tuples on the page as can fit on it, using get_rel_data_width().  So if
> you have a small table with a handful of entries at most, you suddenly
> get estimates of a few hundred to ~a thousand rows.

That's intentional, and not particularly constrained by the representation
used in pg_class.  The downsides of incorrectly assuming a table is tiny
are a lot worse than those of assuming the opposite.

> I guess you could argue that the relation would potentially not be be
> empty anymore by the time the plan is executed, but if that were part of
> the logic it a) wouldn't just be relevant if relpages = 0, and b) should
> be documented.

Yeah, that's the case where you can get hurt the worst.  As for "not
documented", the first para in the comment is trying to say that.

            regards, tom lane


Re: Why not represent "never vacuumed" accurately wrtpg_class.relpages?

От
Andres Freund
Дата:
Hi,

On 2018-12-11 13:43:47 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2018-12-11 09:47:38 -0500, Tom Lane wrote:
> >> And why do you blame it on this representation?  We don't believe that
> >> relpages is the actual size of the table.
> 
> > No, but we assume that there's 10 pages. Even if both relpages and the
> > actual relation stats say there's not. And we assume there's as many
> > tuples on the page as can fit on it, using get_rel_data_width().  So if
> > you have a small table with a handful of entries at most, you suddenly
> > get estimates of a few hundred to ~a thousand rows.
> 
> That's intentional, and not particularly constrained by the representation
> used in pg_class.  The downsides of incorrectly assuming a table is tiny
> are a lot worse than those of assuming the opposite.

How's being unable to distinguish "never vacuumed" from "table is
knowingly empty right now" not constrained by the representation?  I
mean:
             * We approximate "never vacuumed" by "has relpages = 0", which
             * means this will also fire on genuinely empty relations.  Not
             * great, but fortunately that's a seldom-seen case in the real
             * world, and it shouldn't degrade the quality of the plan too
             * much anyway to err in this direction.
             *
             * There are two exceptions wherein we don't apply this heuristic.
             * One is if the table has inheritance children.  Totally empty
             * parent tables are quite common, so we should be willing to
             * believe that they are empty.  Also, we don't apply the 10-page
             * minimum to indexes.

My case of small tables of ephemeral data aside, with hash & range
partitioning it's becoming more common to have individual tables be
empty too, by virtue of nothing falling into the range/being hashed of
the partitions.

Without having a separate "no page, but really" value, how can we fix
this?

Greetings,

Andres Freund


Re: Why not represent "never vacuumed" accurately wrt pg_class.relpages?

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> How's being unable to distinguish "never vacuumed" from "table is
> knowingly empty right now" not constrained by the representation?

Well, OK, it is constrained, but even if it weren't I'd be afraid to
assume a table is totally empty, even if it was so the last time
ANALYZE looked at it.  The plans you get from that are spectacularly
fragile.

            regards, tom lane


Re: Why not represent "never vacuumed" accurately wrtpg_class.relpages?

От
Andres Freund
Дата:
Hi,

On 2018-12-11 13:59:55 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > How's being unable to distinguish "never vacuumed" from "table is
> > knowingly empty right now" not constrained by the representation?
> 
> Well, OK, it is constrained, but even if it weren't I'd be afraid to
> assume a table is totally empty, even if it was so the last time
> ANALYZE looked at it.  The plans you get from that are spectacularly
> fragile.

I think some clamping would be acceptable, but three orders of magnitude
off seems excessive. By clamping both the tuple density and the number
of pages, we're essentially multiplying the error.

But if this is your concern: Why are we believing that a previously
vacuumed table is actually empty if it was previously one block long
(relpages == 1), but not if it previously zero blocks long (relpages ==
0). In the former case we're entering the
            /* quick exit if rel is clearly empty */
            if (curpages == 0)
            {
                *tuples = 0;
                *allvisfrac = 0;
                break;
            }
bit, but not in the latter case.  Your fragility argument seems to hold
about as much sway there.

To me the whole logic around this largely seems to be cargo culting
things forward.

Greetings,

Andres Freund


Re: Why not represent "never vacuumed" accurately wrt pg_class.relpages?

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> To me the whole logic around this largely seems to be cargo culting
> things forward.

Well, feel free to propose some other behavior --- but I'm more
worried about how badly it will fail when wrong than whether the
plans are better when right.

            regards, tom lane


Re: Why not represent "never vacuumed" accurately wrt pg_class.relpages?

От
Robert Haas
Дата:
On Tue, Dec 11, 2018 at 11:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I don't quite get why we don't instead just represent "never vacuumed"
> > by storing a more meaningful value in relpages?
>
> Mostly, not wanting to break clients that look at these fields.
> If catalog compatibility weren't a concern, I'd seriously consider
> replacing both of them with a float "average tuples per page" ratio.

I think we should do exactly that thing.

And I also agree that assuming 10 pages when pg_class says 0 and 1
page when pg_class says 1 is not particularly bright.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Why not represent "never vacuumed" accurately wrtpg_class.relpages?

От
Stephen Frost
Дата:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, Dec 11, 2018 at 11:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > I don't quite get why we don't instead just represent "never vacuumed"
> > > by storing a more meaningful value in relpages?
> >
> > Mostly, not wanting to break clients that look at these fields.
> > If catalog compatibility weren't a concern, I'd seriously consider
> > replacing both of them with a float "average tuples per page" ratio.
>
> I think we should do exactly that thing.

On first blush, I tend to agree, but what I really wanted to remark here
is that I don't think we should be hand-tying ourselves as to what we
can do because we don't want to upset people who are reading the
catalogs.  We make changes to the catalogs pretty regularly and while we
get complaints here and there, by and large, users are accustomed to
them and appreciate that we don't have a lot of baggage from trying to
support old interfaces and that we're able to make as much progress year
over year as we are.

Further, we support major releases for 5 years for a reason- users have
quite a bit of time to adjust to the changes.

> And I also agree that assuming 10 pages when pg_class says 0 and 1
> page when pg_class says 1 is not particularly bright.

Agreed.

Thanks!

Stephen

Вложения