Re: Analyze on table creation?

Поиск
Список
Период
Сортировка
От James Coleman
Тема Re: Analyze on table creation?
Дата
Msg-id CAAaqYe-QcKnV9TjAr-2AHqoh8fnpBcYjXfcEe7PjeDoU2zaR5Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Analyze on table creation?  (James Coleman <jtc331@gmail.com>)
Список pgsql-hackers
On Mon, Jun 26, 2023 at 4:16 PM James Coleman <jtc331@gmail.com> wrote:
>
> On Mon, Jun 26, 2023 at 4:00 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2023-06-26 13:40:49 -0400, James Coleman wrote:
> > > Have we ever discussed running an analyze immediately after creating a table?
> >
> > That doesn't make a whole lot of sense to me - we could just insert the
> > constants stats we wanted in that case.
> >
>
> I thought that was implicit in that, but fair enough :)
>
> > > Consider the following:
> > >
> > > create table stats(i int, t text not null);
> > > explain select * from stats;
> > >    Seq Scan on stats  (cost=0.00..22.70 rows=1270 width=36
> > > analyze stats;
> > > explain select * from stats;
> > >    Seq Scan on stats  (cost=0.00..0.00 rows=1 width=36)
> > >
> > > Combined with rapidly increasing error margin on row estimates when
> > > adding joins means that a query joining to a bunch of empty tables
> > > when a database first starts up can result in some pretty wild plan
> > > costs.
> >
> > The issue is that the table stats are likely going to quickly out of date in
> > that case, even a hand full of inserts (which wouldn't trigger
> > autovacuum analyzing) would lead to the "0 rows" stats causing very bad plans.
> >
>
> It's not obvious to me (as noted elsewhere in the thread) which is
> worse: a bunch of JOINs on empty tables can result in (specific
> example) plans with cost=15353020, and then trigger JIT, and...here we
> collide with my other thread about JIT [1].
>
> Regards,
> James Coleman
>
> 1: https://www.postgresql.org/message-id/CAAaqYe-g-Q0Mm5H9QLcu8cHeMwok%2BHaxS4-UC9Oj3bK3a5jPvg%40mail.gmail.com

Thinking about this a bit more: it seems like what we're missing is either:

1. A heuristic for "this table will probably remain empty", or
2. A way to invalidate "0 rows" stats more quickly on even a handful of inserts.

I think one of those (ignoring questions about "how" for now) would
solve both cases?

Regards,
James Coleman



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Assert !bms_overlap(joinrel->relids, required_outer)
Следующее
От: "Jonathan S. Katz"
Дата:
Сообщение: PostgreSQL 16 Beta 2 release announcement draft