Re: planner support functions: handle GROUP BY estimates ?

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: planner support functions: handle GROUP BY estimates ?
Дата
Msg-id 20191226213250.GB12890@telsasoft.com
обсуждение исходный текст
Ответ на Re: planner support functions: handle GROUP BY estimates ?  (Justin Pryzby <pryzby@telsasoft.com>)
Ответы Re: planner support functions: handle GROUP BY estimates ?  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Sun, Dec 22, 2019 at 06:16:48PM -0600, Justin Pryzby wrote:
> On Tue, Nov 19, 2019 at 01:34:21PM -0600, Justin Pryzby wrote:
> > Tom implemented "Planner support functions":
> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a391ff3c3d418e404a2c6e4ff0865a107752827b
> > https://www.postgresql.org/docs/12/xfunc-optimization.html
> > 
> > I wondered whether there was any consideration to extend that to allow
> > providing improved estimates of "group by".  That currently requires manually
> > by creating an expression index, if the function is IMMUTABLE (which is not
> > true for eg.  date_trunc of timestamptz).
> 
> I didn't hear back so tried implementing this for date_trunc().  Currently, the

> I currently assume that the input data has 1 second granularity:
...
> If the input timestamps have (say) hourly granularity, rowcount will be
> *underestimated* by 3600x, which is worse than the behavior in master of
> overestimating by (for "day") 24x.
> 
> I'm trying to think of ways to address that:

In the attached, I handled that by using histogram and variable's initial
ndistinct estimate, giving good estimates even for intermediate granularities
of input timestamps.

|postgres=# DROP TABLE IF EXISTS t; CREATE TABLE t(i) AS SELECT a FROM generate_series(now(), now()+'11 day'::interval,
'15minutes')a,generate_series(1,9)b; ANALYZE t;
 
|
|postgres=# explain analyze SELECT date_trunc('hour',i) i FROM t GROUP BY 1;
| HashAggregate  (cost=185.69..188.99 rows=264 width=8) (actual time=42.110..42.317 rows=265 loops=1)
|
|postgres=# explain analyze SELECT date_trunc('minute',i) i FROM t GROUP BY 1;
| HashAggregate  (cost=185.69..198.91 rows=1057 width=8) (actual time=41.685..42.264 rows=1057 loops=1)
|
|postgres=# explain analyze SELECT date_trunc('day',i) i FROM t GROUP BY 1;
| HashAggregate  (cost=185.69..185.83 rows=11 width=8) (actual time=46.672..46.681 rows=12 loops=1)
|
|postgres=# explain analyze SELECT date_trunc('second',i) i FROM t GROUP BY 1;
| HashAggregate  (cost=185.69..198.91 rows=1057 width=8) (actual time=41.816..42.435 rows=1057 loops=1)

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Calling PLpgSQL function with composite type fails with an error: "ERROR: could not open relation with OID ..."
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: Libpq support to connect to standby server as priority