Re: hashagg slowdown due to spill changes

Поиск

Список

Период

Сортировка

От	Andres Freund
Тема	Re: hashagg slowdown due to spill changes
Дата	23 июня 2020 г. 07:01:57
Msg-id	20200623040157.mye76koajqhsjbht@alap3.anarazel.de обсуждение исходный текст
Ответ на	Re: hashagg slowdown due to spill changes (Jeff Davis <pgsql@j-davis.com>)
Ответы	Re: hashagg slowdown due to spill changes (Melanie Plageman <melanieplageman@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

Hi,

I think it'd be good to get the changes that aren't related to
projection merged. As far as I can tell there's performance regressions
both because of the things I'd listed upthread, and due to the
projection issue.  That's not obvious because because we first won
performance and then lost it again in several incremental steps.

COPY (SELECT (random() * 4)::int cat, (random()*10000)::int val FROM generate_series(1, 100000000)) TO '/tmp/data' WITH
BINARY;
BEGIN;
DROP TABLE IF EXISTS fewgroups_many_rows;
CREATE TABLE fewgroups_many_rows(cat int4 not null, val int4 not null);
COPY fewgroups_many_rows FROM '/tmp/data' WITH (FORMAT BINARY, FREEZE);
COMMIT;
VACUUM FREEZE fewgroups_many_rows;

Test prep:

Test query:
SET seed=0;SELECT cat, count(*) FROM fewgroups_many_rows GROUP BY 1;
(the seed seems to reduce noise due to hashtable iv being the same)

best of six:
9e1c9f959422192bbe1b842a2a1ffaf76b080196        12031.906 ms
d52eaa094847d395f942827a6f413904e516994c        12045.487 ms
ac88807f9b227ddcd92b8be9a053094837c1b99a        11950.006 ms
36d22dd95bc87ca68e742da91f47f8826f8758c9        11769.991 ms
5ac4e9a12c6543414891cd8972b2cd36a08e40cc    11551.932 ms
1fdb7f9789c4550204cd62d1746a7deed1dc4c29        11706.948 ms
4eaea3db150af56aa2e40efe91997fd25f3b6d73        11999.908 ms
11de6c903da99a4b2220acfa776fc26c7f384ccc        11999.054 ms
b7fabe80df9a65010bfe5e5d0a979bacebfec382        12165.463 ms
2742c45080077ed3b08b810bb96341499b86d530        12137.505 ms
1f39bce021540fde00990af55b4432c55ef4b3c7        12501.764 ms
9b60c4b979bce060495e2b05ba01d1cc6bffdd2d        12389.047 ms
4cad2534da6d17067d98cf04be2dfc1bda8f2cd0        13319.786 ms
1b2c29469a58cd9086bd86e20c708eb437564a80        13330.616 ms

There's certainly some noise in here, but I think the trends are valid.


>  /*
> - * find_unaggregated_cols
> - *      Construct a bitmapset of the column numbers of un-aggregated Vars
> - *      appearing in our targetlist and qual (HAVING clause)
> + * Walk tlist and qual to find referenced colnos, dividing them into
> + * aggregated and unaggregated sets.
>   */
> -static Bitmapset *
> -find_unaggregated_cols(AggState *aggstate)
> +static void
> +find_cols(AggState *aggstate, Bitmapset **aggregated, Bitmapset **unaggregated)
>  {

It's not this patch's fault, but none, really none, of this stuff should
be in the executor.

Greetings,

Andres Freund

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Masahiko Sawada
Дата: 23 июня 2020 г., 07:01:45
Сообщение: Re: Resetting spilled txn statistics in pg_stat_replication

Следующее

От: Thomas Munro
Дата: 23 июня 2020 г., 07:04:36
Сообщение: Re: Testing big endian code with Travis CI's new s390x support

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: hashagg slowdown due to spill changes

Предыдущее

Следующее