Re: Why not represent "never vacuumed" accurately wrtpg_class.relpages?

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Why not represent "never vacuumed" accurately wrtpg_class.relpages?
Дата
Msg-id 20181211182331.uosjnfvncq3o7rxx@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: Why not represent "never vacuumed" accurately wrt pg_class.relpages?  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Why not represent "never vacuumed" accurately wrt pg_class.relpages?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Hi,

On 2018-12-11 09:47:38 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I don't quite get why we don't instead just represent "never vacuumed"
> > by storing a more meaningful value in relpages?
> 
> Mostly, not wanting to break clients that look at these fields.
> If catalog compatibility weren't a concern, I'd seriously consider
> replacing both of them with a float "average tuples per page" ratio.

Yea, that'd be better.  I'm not sure I believe this is a grave concern,
but if we really are concerned about not breaking clients, we could just
add a separate bool for the fact the table has been vacuumed.


> > I've seen numerous cases where relpages = 0 -> never vacuumed has caused
> > worse plans, and it just doesn't seem necessary?
> 
> Worse plans than what?

The plans if the stats were believed.


> And why do you blame it on this representation?  We don't believe that
> relpages is the actual size of the table.

No, but we assume that there's 10 pages. Even if both relpages and the
actual relation stats say there's not. And we assume there's as many
tuples on the page as can fit on it, using get_rel_data_width().  So if
you have a small table with a handful of entries at most, you suddenly
get estimates of a few hundred to ~a thousand rows.

I'd one client once insert a row into a lock table to make sense, just
to make sure it never gets vacuumed while there were no locks.


CREATE TABLE locks(lockid int);
VACUUM locks;
EXPLAIN SELECT * FROM locks;
┌─────────────────────────────────────────────────────────┐
│                       QUERY PLAN                        │
├─────────────────────────────────────────────────────────┤
│ Seq Scan on locks  (cost=0.00..35.50 rows=2550 width=4) │
└─────────────────────────────────────────────────────────┘
(1 row)
INSERT INTO locks VALUES (1);
VACUUM locks;
EXPLAIN SELECT * FROM locks;
┌─────────────────────────────────────────────────────┐
│                     QUERY PLAN                      │
├─────────────────────────────────────────────────────┤
│ Seq Scan on locks  (cost=0.00..1.01 rows=1 width=4) │
└─────────────────────────────────────────────────────┘
(1 row)
DELETE FROM locks;
VACUUM locks ;
EXPLAIN SELECT * FROM locks;
┌─────────────────────────────────────────────────────────┐
│                       QUERY PLAN                        │
├─────────────────────────────────────────────────────────┤
│ Seq Scan on locks  (cost=0.00..35.50 rows=2550 width=4) │
└─────────────────────────────────────────────────────────┘
(1 row)

Estimates on a table constantly varying by three orders of magnitude -
not hard to imagine large plan changes due to that.


That's not what I was complaining about, but isn't it fairly broken that
the "actually empty relation" case here doesn't get hit if relpages ==
0?  We clamp curpages to 10 before returning for the the empty size:

            if (curpages < 10 &&
                rel->rd_rel->relpages == 0 &&
                !rel->rd_rel->relhassubclass &&
                rel->rd_rel->relkind != RELKIND_INDEX)
                curpages = 10;

            /* report estimated # pages */
            *pages = curpages;
            /* quick exit if rel is clearly empty */
            if (curpages == 0)
            {
                *tuples = 0;
                *allvisfrac = 0;
                break;
            }

I guess you could argue that the relation would potentially not be be
empty anymore by the time the plan is executed, but if that were part of
the logic it a) wouldn't just be relevant if relpages = 0, and b) should
be documented.

Greetings,

Andres Freund


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andreas Karlsson
Дата:
Сообщение: Re: Introducing SNI in TLS handshake for SSL connections
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Why not represent "never vacuumed" accurately wrt pg_class.relpages?