Re: POC: GROUP BY optimization

Поиск
Список
Период
Сортировка
От Teodor Sigaev
Тема Re: POC: GROUP BY optimization
Дата
Msg-id 78d94983-0fcd-78ea-f4c0-efd5022e1386@sigaev.ru
обсуждение исходный текст
Ответ на Re: POC: GROUP BY optimization  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers

> OK, I haven't looked at v7 yet, but if I understand correctly it tries
> to maintain the ordering as much as possible? Does that actually help? I
> mean, the incremental sort patch allows the sorting to happen by pieces,
> but here we still need to sort all the data, right?
> 
> Can you give an example demonstrating the benefit?


> 
> SELECT a, SUM(x) FROM
> (
>      SELECT a, b, COUNT(c) AS x FROM t1 GROUP BY a, b
>      UNION ALL
>      SELECT a, b, COUNT(c) AS x FROM t2 GROUP BY a, b
> ) foo GROUP BY a;
> 
> and indexes on (a,b) and (b,a) for both relations. The "deduplication"
> by pathkeys I suggested would mean we might keep only index (a,b) on t1
> and (b,a) on t2, which means the grouping by "a" can't leverage index
> scans on both relations. But if we keep paths for both indexes on each
> relation, we can.
yes, one of option

> Isn't "estimation of cost of comparing function/number of unique values
> in column could be not very accurate and so planner could make a wrong
> choice" is more an argument against relying on it when doing these
> optimizations?
> 
> FWIW it's one of the arguments Tom made in the incremental sort patch,
> which relies on it too when computing cost of the incremental sort. I'm
> sure it's going to be an obstacle there too. >
>> I saw 2 times difference in real-world application. Again, improving
>> sort cost estimation is a separate task.
> Sure. But we also need to ask the other question, i.e. how many people
> would be negatively affected by the optimization. And I admit I don't
> know the answer to that, the next example is entirely made up.
Hm, seems, the best way here is a improving cost_sort estimation. Will try, but 
I think that is separated patch


> FWIW I think it would be useful to have "development GUC" that would
> allow us to enable/disable these options during development, because
> that makes experiments much easier. But then remove them before commit.

Will do

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Bapat
Дата:
Сообщение: Re: commitfest 2018-07
Следующее
От: Ashutosh Bapat
Дата:
Сообщение: Re: Remove mention in docs that foreign keys on partitioned tablesare not supported