Re: Default setting for enable_hashagg_disk

From	Alvaro Herrera
Subject	Re: Default setting for enable_hashagg_disk
Date	June 25, 2020, 22:44:22
Msg-id	20200625224422.GA9653@alvherre.pgsql
In reply to	Re: Default setting for enable_hashagg_disk (Andres Freund <andres@anarazel.de>)
Responses	Re: Default setting for enable_hashagg_disk
List	pgsql-hackers

On 2020-Jun-25, Andres Freund wrote:

> > What are people doing for those cases already?  Do we have any
> > real-world queries that are a problem in PG 13 for this?
> 
> I don't know about real world, but it's pretty easy to come up with
> examples.
> 
> query:
> SELECT a, array_agg(b)
> FROM (SELECT generate_series(1, 10000)) a(a),
>      (SELECT generate_series(1, 10000)) b(b)
> GROUP BY a
> HAVING array_length(array_agg(b), 1) = 0;
> 
> work_mem = 4MB
> 
> 12      18470.012 ms
> HEAD    44635.210 ms
> 
> HEAD causes ~2.8GB of file IO, 12 doesn't cause any. If you're IO
> bandwidth constrained, this could be quite bad.

... however, you can pretty much get the previous performance back by
increasing work_mem.  I just tried your example here, and I get 32
seconds of runtime for work_mem 4MB, and 13.5 seconds for work_mem 1GB
(this one spills about 800 MB); if I increase that again to 1.7GB I get
no spilling and 9 seconds of runtime.  (For comparison, 12 takes 15.7
seconds regardless of work_mem).
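
For anyone who wants to repeat the comparison, here is a minimal sketch,
assuming a psql session against a 13 build.  The '1700MB' figure is only
an approximation of the 1.7GB mentioned above, not a recommendation, and
the use of SET LOCAL inside a transaction is my assumption about how one
might scope the change.  EXPLAIN ANALYZE really executes the query, so
expect it to take as long as the timings quoted here; the HashAggregate
node in its output should indicate whether any disk was used.

```sql
-- Sketch only: raise work_mem for a single transaction and rerun
-- Andres's example query to see whether hash aggregation still spills.
BEGIN;

-- SET LOCAL reverts at COMMIT/ROLLBACK, so other queries in the
-- session keep the default (4MB in the numbers quoted above).
SET LOCAL work_mem = '1700MB';

-- ANALYZE executes the query for real; COSTS OFF just trims the output.
EXPLAIN (ANALYZE, COSTS OFF)
SELECT a, array_agg(b)
FROM (SELECT generate_series(1, 10000)) a(a),
     (SELECT generate_series(1, 10000)) b(b)
GROUP BY a
HAVING array_length(array_agg(b), 1) = 0;

COMMIT;
```

Running the same block with a smaller value (for instance '4MB' or
'1GB') should show the spilling cases for comparison.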

My point here is that maybe we don't need to offer a GUC to explicitly
turn spilling off; it seems sufficient to let users change work_mem so
that spilling will naturally not occur.  Why do we need more?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
