Discussion: POC: GROUP BY optimization

POC: GROUP BY optimization

From
Teodor Sigaev
Date:
Hi!

As far as I know, the columns in a GROUP BY can be reordered without loss of
correctness, and reordering them can give a performance benefit. The suggested
patch implements that in several ways:

1) Reorder GROUP BY columns by the number of unique values, in descending
order. A simple example shows a twofold performance difference (see
opt_group_by_demostrator.sql; on my notebook the first two queries demonstrate
that, and a sketch follows this list). The idea is: if the first column is
unique, the sort comparator never has to compare the following columns.

2) Reorder GROUP BY columns to match existing pathkeys that come from an index
scan or an ORDER BY clause. This avoids adding a Sort node - I suppose that's a
clear win.

3) The patch suggests using an index for the GROUP BY clause, but unlike the
ORDER BY or merge-join case it doesn't pay attention to the actual order of the
columns, because of 2).
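
To make 1) concrete, here is a minimal sketch in the spirit of
opt_group_by_demostrator.sql (not the attached script itself - the table and
column names here are invented for illustration):

    -- force a sort-based GroupAggregate so the column order matters
    SET enable_hashagg = off;

    CREATE TABLE demo AS
      SELECT i AS uniq, i % 10 AS dup
        FROM generate_series(1, 1000000) s(i);
    ANALYZE demo;

    -- unique column first: the comparator never has to look at "dup"
    EXPLAIN ANALYZE SELECT count(*) FROM demo GROUP BY uniq, dup;

    -- low-cardinality column first: almost every comparison falls through
    -- to the second column
    EXPLAIN ANALYZE SELECT count(*) FROM demo GROUP BY dup, uniq;

With the patch, both queries should end up sorting by (uniq, dup).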

The patch doesn't change any cost estimation computation, although 1) could
take advantage of that.

Some issues/problems/notes:

1) I didn't look at GROUPING SETS at all. As far as I can see, the current
coding prevents any optimization around them, and I suppose that should be
addressed in a separate patch.

2) I'm not completely satisfied with how the number of groups per column is
counted; it looks ugly without refactoring estimate_num_groups(). At the
moment it may be called multiple times for each column. Maybe this number
should be added to the PathKey structure?

3) EquivalenceClass->ec_sortref. If the planner first encounters a column in
the WHERE clause, it doesn't fill the target reference field because it is not
known yet. The patch matches PathKeys (group_pathkeys) against
SortGroupClauses (groupClause) and fails in this case. So
get_eclass_for_sort_expr() now updates the ec_sortref field if it's not
initialized yet.

4) The algorithm that reorders the columns is O(N^2), where N is the number of
columns, but I hope that isn't a problem because the number of GROUP BY columns
is usually not very large.




-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachments

Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:
Cosmetic change: remove the find_sort_group_clause_by_sortref() function added
in the v5 patch, because it duplicates the existing get_sortgroupref_clause().

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachments

Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:
Hi Teodor,

Thanks for the patch. This (missing) optimization has popped up repeatedly
recently, and I was planning to look into it for PG12. So now I don't have to,
because you've done all the hard work ;-)

I've done only a very basic review so far, but ISTM the general approach of
considering additional indexes in create_index_paths is correct. That being
said, I do have a couple of comments:


1) add_path() ensures that we only keep the cheapest sorted path for each set
of pathkeys. This patch somewhat defeats that because it considers additional
pathkeys (essentially any permutation of the group keys) as interesting. So we
may end up with more paths in the list.

I wonder if we should make the add_path() logic smarter to recognize 
when two paths have different pathkeys but were generated to match the 
same grouping, to reduce the number of paths added by this optimization. 
Currently we do that for each pathkeys list independently, but we're 
considering many more pathkeys orderings ...

That being said, maybe this concern is moot - to generate a large number
of new paths, there would have to be a lot of indexes matching the group 
by clause (as a prefix), and the group by clause would need to be fairly 
large (to have many permutations). That seems like a fairly rare case. I 
do know a couple of people who do create large numbers of indexes on 
tables, but those are usually single-column indexes.


2) sort reordering based on ndistinct estimates

This part of the patch (reordering the columns in an explicit sort node) 
seems to be largely independent of the "consider other indexes" part. So 
I do suggest separating those two, and discussing them independently.

But thinking about this optimization, I'm worried it relies on a couple of
important assumptions. Until now those decisions could be made by the person
writing the SQL query, but this optimization makes that impossible. So we
really need to get this right.

For example, it seems to disregard that different data types have 
different comparison costs. For example comparing bytea will be far more 
expensive compared to int4, so it may be much more efficient to compare 
int4 columns first, even if there are far fewer distinct values in them.

This costing is probably related to abbreviated keys too, which is a 
huge improvement for text and other varlena-based types.

Also, simply sorting the columns by their ndistinct estimate is somewhat 
naive, because it assumes the columns are independent. Imagine for 
example a table with three columns:

    CREATE TABLE t (a INT, b INT, c INT);

    INSERT INTO t SELECT mod(i,100), mod(i,100)/2, 49*random()
      FROM generate_series(1,1000000) s(i);

Clearly, "a" has the most distinct values (100), while both "b" and "c" 
have 50 distinct values. But "b" does not actually split any of the "a" 
groups, while "c" does:

     select count(*) from (select 1 from t group by a,b) foo;
      count
     -------
        100
     (1 row)

     select count(*) from (select 1 from t group by a,c) foo;
      count
     -------
       5000
     (1 row)

So clearly, when evaluating GROUP BY a,b,c it'd be more efficient to use 
"(a,c,b)" than "(a,b,c)" but there is no way to decide this merely using 
per-column ndistinct values. Luckily, we now have ndistinct multi-column 
coefficients which could be used to decide this I believe (but I haven't 
tried).
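
For the record, defining the extended statistics for the example above would
look something like this (I haven't checked what the patch does with it; the
statistics object name is arbitrary):

    CREATE STATISTICS t_ndist (ndistinct) ON a, b, c FROM t;
    ANALYZE t;

    -- the multi-column estimates are then visible in pg_statistic_ext
    SELECT stxname, stxndistinct FROM pg_statistic_ext;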

The real issue however is that this decision can't be made entirely 
locally. Consider for example this:

     explain select a,b,c, count(*) from t group by a,b,c order by c,b,a;
                                       QUERY PLAN
     ------------------------------------------------------------------------------
      Sort  (cost=156010.16..156260.16 rows=100000 width=20)
        Sort Key: c, b, a
        ->  GroupAggregate  (cost=132154.34..145654.34 rows=100000 width=20)
              Group Key: a, c, b
              ->  Sort  (cost=132154.34..134654.34 rows=1000000 width=12)
                    Sort Key: a, c, b
                    ->  Seq Scan on t  (cost=0.00..15406.00 rows=1000000 width=12)
     (7 rows)

while without the patch, this would happen:

     explain select a,b,c, count(*) from t group by a,b,c order by c,b,a;
                                    QUERY PLAN
     ------------------------------------------------------------------------
      GroupAggregate  (cost=132154.34..145654.34 rows=100000 width=20)
        Group Key: c, b, a
        ->  Sort  (cost=132154.34..134654.34 rows=1000000 width=12)
              Sort Key: c, b, a
              ->  Seq Scan on t  (cost=0.00..15406.00 rows=1000000 width=12)
     (5 rows)

Which is clearly cheaper (at least according to the optimizer) than 
doing two separate sorts. So the optimization actually does the wrong 
thing here, and it needs to somehow consider the other ordering 
requirements (in this case ORDER BY) - either by generating multiple 
paths with different orderings or some heuristics.

I'm also wondering how freely we can change the group by result 
ordering. Assume you disable hash aggregate and parallel query - 
currently we are guaranteed to use group aggregate that produces exactly 
the ordering as specified in GROUP BY, but this patch removes that 
"guarantee" and we can produce arbitrary permutation of the ordering. 
But as I argued in other threads, such implicit guarantees are really 
fragile, can silently break for arbitrary reasons (say, parallel query 
will do just that) and can be easily fixed by adding a proper ORDER BY. 
So I don't think this is an issue.

The other random thought is how this will/could interact with the incremental
sort patch. I know Alexander wanted to significantly limit where we actually
consider the incremental sort, but I'm not sure if this was one of those places
or not (it sure looks like a place where we would greatly benefit from it).

So to summarize this initial review - I do suggest splitting the patch into
two parts. One that does the index magic, and one for this reordering
optimization. The first part (additional indexes) seems fairly safe and likely
to become committable soon. The other part (ndistinct reordering) IMHO requires
more thought regarding costing and interaction with other query parts.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:
> Thanks for the patch. This (missing) optimization popped-up repeatedly recently, 
> and I was planning to look into it for PG12. So now I don't have to, because 
> you've done all the hard work ;-)
You are welcome. Actually one of our customers faced the problem with GROUP BY
column order, and exactly with reordering without any indexes - what you call
part 2). I noticed the index optimization later. But following the order of
your suggested split, I divided the patch into an index part and a non-index
part, and the second part depends on the first. They touch the same part of the
code, so they cannot be fully independent.

> 1) add_path() ensures that we only keep the one cheapest path sorted path for 
> each pathkeys. This patch somewhat defeats that because it considers additional 
> pathkeys (essentially any permutation of group keys) as interesting. So we may 
> end up with more paths in the list.
It seems an index scan could give a benefit here only if:
   1) it's an index-only scan
   2) it's a full index scan (as opposed to index-only) but the physical order
      of the heap is close to the logical index order (i.e. the table is clustered)

In other cases the cost of disk seeks will be very high. But at this stage of
planning we don't know these facts yet, so we can't make a good decision here
and should trust the add_path() logic.

 > I wonder if we should make the add_path() logic smarter to recognize when two
 > paths have different pathkeys but were generated to match the same grouping,
 > to reduce the number of paths added by this optimization. Currently we do that
 > for each pathkeys list independently, but we're considering many more pathkeys
 > orderings ...

Hm. I tend to say no.
select .. from t1 group by a, b
union
select .. from t2 group by a, b

t1 and t2 could have different sets of indexes and different distributions, so
locally it could be cheaper to use one index (for example, one defined as (b, a)
while the other is (a,b,c,d) - the second is larger), but globally another
choice wins (for example, when the second table doesn't have a (b,a) index).

> 2) sort reordering based on ndistinct estimates

> But thinking about this optimization, I'm worried it relies on a couple of 
> important assumptions. For now those decisions could have be made by the person 
> writing the SQL query, but this optimization makes that impossible. So we really 
> need to get this right.
Hm, SQL by design should not be used that way, but of course it is :(

> For example, it seems to disregard that different data types have different 
> comparison costs. For example comparing bytea will be far more expensive 
> compared to int4, so it may be much more efficient to compare int4 columns 
> first, even if there are far fewer distinct values in them.
As far as I can see, cost_sort() doesn't pay attention to these details, and
improving that should be a separate patch.

> Also, simply sorting the columns by their ndistinct estimate is somewhat naive, 
> because it assumes the columns are independent. Imagine for example a table with 
> three columns:
> So clearly, when evaluating GROUP BY a,b,c it'd be more efficient to use 
> "(a,c,b)" than "(a,b,c)" but there is no way to decide this merely using 
> per-column ndistinct values. Luckily, we now have ndistinct multi-column 
> coefficients which could be used to decide this I believe (but I haven't tried).
Agree, but I don't know how to use it here. Except, maybe:
1) first column - the column with the largest estimated number of groups
2) second column - the one for which the pair (first, second) has the largest
estimated number of groups
3) third column - the one for which the triple (first, second, third) has the
largest ... (illustrated below)

But it seems, ISTM, that even this algorithm could be implemented in a cheaper
manner.
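
To illustrate with your table t from above, the greedy choice could be computed
by hand like this (the planner would of course have to rely on estimates rather
than actual counts):

     -- step 1: the single column producing the most groups
     select count(*) from (select 1 from t group by a) s;   -- 100
     select count(*) from (select 1 from t group by b) s;   --  50
     select count(*) from (select 1 from t group by c) s;   --  50

     -- step 2: extend the chosen prefix (a) with each remaining column
     select count(*) from (select 1 from t group by a, b) s;  --  100
     select count(*) from (select 1 from t group by a, c) s;  -- 5000

     -- (a, c) wins, so the greedy order is (a, c, b)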


> The real issue however is that this decision can't be made entirely locally. 
> Consider for example this:
> 
>      explain select a,b,c, count(*) from t group by a,b,c order by c,b,a;
> 
> Which is clearly cheaper (at least according to the optimizer) than doing two 
> separate sorts. So the optimization actually does the wrong thing here, and it 
> needs to somehow consider the other ordering requirements (in this case ORDER 
> BY) - either by generating multiple paths with different orderings or some 
> heuristics.
Hm, thank you. I consider this a bug in my implementation - the basic idea was
that we try to match an already existing or needed order, and only if we fail,
or have an unmatched tail of the pathkey list, do we try to find the cheapest
column order. Fixed in v7 (0002-opt_group_by_index_and_order-v7.patch), though
maybe in a naive way: if we don't have path pathkeys, first try to reorder the
columns according to the ORDER BY clause. A test for your case is also added.


> I'm also wondering how freely we can change the group by result ordering. Assume 
> you disable hash aggregate and parallel query - currently we are guaranteed to 
> use group aggregate that produces exactly the ordering as specified in GROUP BY, 
> but this patch removes that "guarantee" and we can produce arbitrary permutation 
> of the ordering. But as I argued in other threads, such implicit guarantees are 
> really fragile, can silently break for arbitrary reasons (say, parallel query 
> will do just that) and can be easily fixed by adding a proper ORDER BY. So I 
> don't think this is an issue.
Agree. SQL by design doesn't guarantee any particular order without an
explicit ORDER BY clause.

> The other random thought is how will/could this interact with the incremental 
> sorts patch. I know Alexander wanted to significantly limit where we actually 
> consider the incremental sort, but I'm not sure if this was one of those places 
> or not (is sure looks like a place where we would greatly benefit from it).
It seems they should not interact directly. My patch tries to find the
cheapest column order, Alexander's patch tries to make the sort itself cheaper
- they are different tasks. But I will check.

> So to summarize this initial review - I do suggest splitting the patch into two 
> parts. One that does the index magic, and one for this reordering optimization. 
> The first part (additional indexes) seems quite fairly safe, likely to get 
> committable soon. The other part (ndistinct reordering) IMHO requires more 
> thought regarding costing and interaction with other query parts.
Thank you for review!

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachments

Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:
On 06/05/2018 07:56 PM, Teodor Sigaev wrote:
>> Thanks for the patch. This (missing) optimization popped-up repeatedly 
>> recently, and I was planning to look into it for PG12. So now I don't 
>> have to, because you've done all the hard work ;-)
> You are welcome. Actually one of out customers faced the problem with  
> GROUP BY column order and exactly with reordering without any indexes, 
> you mean it as problem 2). Index optimization was noticed by me later. 
> But based on your suggested patch's order I split the patch to index  
> and non-index part and second part depends of first one. They touch the 
> same part of code and they could not be independent
> 

The way I see it the patch does two different things:

a) consider additional indexes by relaxing the pathkeys check

b) if there are no indexes, i.e. when an explicit Sort is needed, 
consider reordering the columns by ndistinct

Not sure why those two parts couldn't be separated. I haven't tried 
splitting the patch, of course, so I may be missing something.

In the worst case, one part will depend on the other, which is OK. It 
still allows us to commit the first part and continue working on the 
other one, for example.

>> 1) add_path() ensures that we only keep the one cheapest path sorted 
>> path for each pathkeys. This patch somewhat defeats that because it 
>> considers additional pathkeys (essentially any permutation of group 
>> keys) as interesting. So we may end up with more paths in the list.
> Seems, index scan here could give benefits here only if:
>    1) it's a index only scan
>    2) it's a index full (opposite to only) scan but physical order of 
> heap is
>       close to logical index order (ie table is clustered)
> 
> In other cases costs of disk seeking will be very high. But on this 
> stage of planing we don't know that facts yet. So we couldn't make a 
> good decision here and should believe in add_path() logic.
>
Not sure what you mean? Surely we do costing of the paths at this stage, 
so we can decide which one is cheaper etc. The decision which paths to 
keep is done by add_path(), and it should stay like this, of course. I 
wasn't suggesting to move the logic elsewhere.

>  > I wonder if we should make the add_path() logic smarter to recognize 
> when two
>  > paths have different pathkeys but were generated to match the same 
> grouping,
>  > to reduce the number of paths added by this optimization. Currently 
> we do that > for each pathkeys list independently, but we're considering 
> many more pathkeys > orderings ...
> 
> Hm. I tend to say no.
> select .. from t1 group by a, b
> union
> select .. from t2 group by a, b
> 
> t1 and t2 could have different set of indexes and different 
> distribution, so locally it could be cheaper to use one index (for 
> example, one defined as (b, a) and second as (a,b,c,d) - second is 
> larger) but totally - another (example: second table doesn't have (b,a) 
> index)
> 

But add_path() treats each of the relations independently, why couldn't 
we pick a different index for each of the two relations?

>> 2) sort reordering based on ndistinct estimates
> 
>> But thinking about this optimization, I'm worried it relies on a 
>> couple of important assumptions. For now those decisions could have be 
>> made by the person writing the SQL query, but this optimization makes 
>> that impossible. So we really need to get this right.
> Hm, sql by design should not be used that way, but, of course, it's used :(
> 

Well, yes and no. I'm not worried about people relying on us to give 
them some ordering - they can (and should) add an ORDER BY clause to fix 
that. I'm more worried about the other stuff.

>> For example, it seems to disregard that different data types have 
>> different comparison costs. For example comparing bytea will be far 
>> more expensive compared to int4, so it may be much more efficient to 
>> compare int4 columns first, even if there are far fewer distinct 
>> values in them.
> as I can see cost_sort() doesn't pay attention to this details. And it 
> should be a separated patch to improve that.
> 

But sort also does not reorder columns.

Imagine you have a custom data type that is expensive for comparisons. 
You know that, so you place it at the end of GROUP BY clauses, to reduce 
the number of comparisons on that field. And then we come along and just 
reorder the columns, placing it first, because it happens to have a high 
ndistinct statistic. And there's no way to get the original behavior :-(

>> Also, simply sorting the columns by their ndistinct estimate is 
>> somewhat naive, because it assumes the columns are independent. 
>> Imagine for example a table with three columns:
>> So clearly, when evaluating GROUP BY a,b,c it'd be more efficient to 
>> use "(a,c,b)" than "(a,b,c)" but there is no way to decide this merely 
>> using per-column ndistinct values. Luckily, we now have ndistinct 
>> multi-column coefficients which could be used to decide this I believe 
>> (but I haven't tried).
> Agree, but I don't know how to use it here. Except, may be:
> 1) first column - the column with bigger estimated number of groups
> 2) second column - the pair of (first, second) with bigger estimated 
> number of groups
> 3) third column - the triple of (first, second, third) with bigger ...
> 
> But seems even with that algorithm, ISTM, it could be implemented in 
> cheaper manner.
> 

Maybe. I do have some ideas, although I haven't tried implementing it.

If there's no extended statistic on the columns, you can do the current 
thing (assuming independence etc.). There's not much we can do here.

If there's an extended statistic, you can do either a greedy search (get 
the next column with the highest ndistinct coefficient) or exhaustive 
search (computing the estimated number of comparisons).

Another challenge is that using only the ndistinct coefficient assumes 
uniform distribution of the values. But you can have a column with 1M 
distinct values, where a single value represents 99% of the rows. And 
another column with 100k distinct values, with actual uniform 
distribution. I'm pretty sure it'd be more efficient to place the 100k 
column first.
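
A made-up example of that kind of data, scaled down to 1M rows (so the distinct
counts are smaller than in the description above, and the table name is
arbitrary):

    CREATE TABLE t2 (a INT, b INT);

    -- "a" has ~10k distinct values, but 99% of the rows share a single one;
    -- "b" has only 1000 distinct values, uniformly distributed
    INSERT INTO t2
      SELECT CASE WHEN i % 100 = 0 THEN i ELSE 0 END, i % 1000
        FROM generate_series(1,1000000) s(i);

Despite the much higher ndistinct, sorting by "a" first does almost nothing to
split the input, so I'd expect "b" first to win here.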

> 
>> The real issue however is that this decision can't be made entirely 
>> locally. Consider for example this:
>>
>>      explain select a,b,c, count(*) from t group by a,b,c order by c,b,a;
>>
>> Which is clearly cheaper (at least according to the optimizer) than 
>> doing two separate sorts. So the optimization actually does the wrong 
>> thing here, and it needs to somehow consider the other ordering 
>> requirements (in this case ORDER BY) - either by generating multiple 
>> paths with different orderings or some heuristics.
> Hm, thank you. I consider it is a bug of my implementation - basic idea 
> was that we try to match already existing or needed order and only if we 
> fail or have unmatched tail of pathkey list than we will try to find 
> cheapest column order.
> Fixed in v7 (0002-opt_group_by_index_and_order-v7.patch), but may be by 
> naive way: if we don't have a path pathkey first try to reorder columns 
> accordingly to order by clause. Test for your is also added.
> 

OK. I'll give it a try.

> 
>> I'm also wondering how freely we can change the group by result 
>> ordering. Assume you disable hash aggregate and parallel query - 
>> currently we are guaranteed to use group aggregate that produces 
>> exactly the ordering as specified in GROUP BY, but this patch removes 
>> that "guarantee" and we can produce arbitrary permutation of the 
>> ordering. But as I argued in other threads, such implicit guarantees 
>> are really fragile, can silently break for arbitrary reasons (say, 
>> parallel query will do just that) and can be easily fixed by adding a 
>> proper ORDER BY. So I don't think this is an issue.
> Agree. SQL by design doesn't give a warranty of particular order without 
> explicit ORDER BY clause.
> 
>> The other random thought is how will/could this interact with the
>> incremental sorts patch. I know Alexander wanted to significantly 
>> limit where we actually consider the incremental sort, but I'm not 
>> sure if this was one of those places or not (is sure looks like a 
>> place where we would greatly benefit from it).
> Seems, they should not directly interact. Patch tries to find cheapest 
> column order, Alexander's patch tries to make sort cheaper - they are a 
> different tasks. But I will try.
> 

Thanks.

>> So to summarize this initial review - I do suggest splitting the patch 
>> into two parts. One that does the index magic, and one for this 
>> reordering optimization. The first part (additional indexes) seems 
>> quite fairly safe, likely to get committable soon. The other part 
>> (ndistinct reordering) IMHO requires more thought regarding costing 
>> and interaction with other query parts.
> Thank you for review!
> 

;-)


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:
>> problem 2). Index optimization was noticed by me later. But based on your 
>> suggested patch's order I split the patch to index and non-index part and 
>> second part depends of first one. They touch the same part of code and they 
>> could not be independent
> The way I see it the patch does two different things:
Yes

> a) consider additional indexes by relaxing the pathkeys check
Yes, but not only that. The patch can also reorder the GROUP BY clause to match
existing pathkeys, which could come from an index scan (suggested by the patch
or not) or from some other way of producing ordered output.

> b) if there are no indexes, i.e. when an explicit Sort is needed, consider 
> reordering the columns by ndistinct
Yes. But again, this description is a bit short. First, it runs after the
first patch and might get some preordered leading pathkeys. Second, it tries to
match the ORDER BY clause order if there are no preordered leading pathkeys
from the first patch (this was introduced in v7). And third, if a tail of
pathkeys is left unmatched by the previous stages, it reorders that tail.

> In the worst case, one part will depend on the other, which is OK. It still 
> allows us to commit the first part and continue working on the other one, for 
> example.
Exactly, that's our case. Of course, it's possible to split the first patch in
two again: just the index suggestion (1.1) and the reordering by existing
pathkeys (1.2). Then 1.1 would be independent, but 2 would still have to be
applied after 1.2. And patch 1.1 on its own is rather useless.

> can decide which one is cheaper etc. The decision which paths to keep is done by 
> add_path(), and it should stay like this, of course. I wasn't suggesting to move 
> the logic elsewhere.
Cool, I have no intention of modifying it either.

> 
>>  > I wonder if we should make the add_path() logic smarter to recognize when two
>>  > paths have different pathkeys but were generated to match the same grouping,
>>  > to reduce the number of paths added by this optimization. Currently we do that
>>  > for each pathkeys list independently, but we're considering many more pathkeys
>>  > orderings ...
>>
>> Hm. I tend to say no.
>> select .. from t1 group by a, b
>> union
>> select .. from t2 group by a, b
>>
>> t1 and t2 could have different set of indexes and different distribution, so 
>> locally it could be cheaper to use one index (for example, one defined as (b, 
>> a) and second as (a,b,c,d) - second is larger) but totally - another (example: 
>> second table doesn't have (b,a) index)
>>
> 
> But add_path() treats each of the relations independently, why couldn't we pick 
> a different index for each of the two relations?

Having sorted output from both subselects could be cheaper than sorting one of
them, even if an index scan was cheaper for that one. But it seems to me that
is not the business of the optimization suggested here.


>>> For example, it seems to disregard that different data types have different 
>>> comparison costs. For example comparing bytea will be far more expensive 
>>> compared to int4, so it may be much more efficient to compare int4 columns 
>>> first, even if there are far fewer distinct values in them.
>> as I can see cost_sort() doesn't pay attention to this details. And it should 
>> be a separated patch to improve that.
>>
> 
> But sort also does not reorder columns.
Yes. But the estimates of the comparison function cost and of the number of
unique values in a column may not be very accurate, so the planner could make a
wrong choice. I have seen a 2x difference in a real-world application. Again,
improving the sort cost estimation is a separate task.

> 
> Imagine you have a custom data type that is expensive for comparisons. You know 
> that, so you place it at the end of GROUP BY clauses, to reduce the number of 
> comparisons on that field. And then we come along and just reorder the columns, 
> placing it first, because it happens to have a high ndistinct statistic. And 
> there's no way to get the original behavior :-(
Hm, that could happen, but I don't know how to improve things here. The
current cost_sort() returns the same cost for any column order.

Okay, here we know an estimate of nrows; we could somehow find the comparison
function's OID and look up its procost field. And then? We would also need to
know the tuple width here, and mostly repeat what cost_sort() does.

Another option is introducing an enable_groupby_reorder GUC variable, although
personally I don't like the idea of adding a new GUC.

>> Agree, but I don't know how to use it here. Except, may be:
>> 1) first column - the column with bigger estimated number of groups
>> 2) second column - the pair of (first, second) with bigger estimated number of 
>> groups
>> 3) third column - the triple of (first, second, third) with bigger ...
>>
>> But seems even with that algorithm, ISTM, it could be implemented in cheaper 
>> manner.
>>
> 
> Maybe. I do have some ideas, although I haven't tried implementing it.
Implemented - please have a look.

> 
> If there's no extended statistic on the columns, you can do the current thing 
> (assuming independence etc.). There's not much we can do here.
Agree

> 
> If there's an extended statistic, you can do either a greedy search (get the 
> next column with the highest ndistinct coefficient) or exhaustive search 
> (computing the estimated number of comparisons).
> 
> Another challenge is that using only the ndistinct coefficient assumes uniform 
> distribution of the values. But you can have a column with 1M distinct values, 
> where a single value represents 99% of the rows. And another column with 100k 
> distinct values, with actual uniform distribution. I'm pretty sure it'd be more 
> efficient to place the 100k column first.

Interesting. I will think about it, thank you.


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachments

Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:
On 06/06/2018 08:04 PM, Teodor Sigaev wrote:
> 
>>> problem 2). Index optimization was noticed by me later. But based on
>>> your suggested patch's order I split the patch to index and non-index
>>> part and second part depends of first one. They touch the same part
>>> of code and they could not be independent
>> The way I see it the patch does two different things:
> Yes
> 
>> a) consider additional indexes by relaxing the pathkeys check
> Yes, but not only. Patch could reorder GROUP BY clause to match existing
> pathkeys which could come from index scan (suggested by patch or not) or
> some another way to get ordered output.
> 
>> b) if there are no indexes, i.e. when an explicit Sort is needed,
>> consider reordering the columns by ndistinct
> Yes. But again, this description is a bit short. First it works after
> first patch and might get some preordered leading pathkeys. Second, it
> tries to match ORDER BY clause order if there is no preordered leading
> pathkeys from first patch (it was introduced in v7). And third, if there
> is a tail of unmatched pathkeys on previous stages then it will reorder
> that tail.
> 

OK, I haven't looked at v7 yet, but if I understand correctly it tries
to maintain the ordering as much as possible? Does that actually help? I
mean, the incremental sort patch allows the sorting to happen by pieces,
but here we still need to sort all the data, right?

Can you give an example demonstrating the benefit?

>> In the worst case, one part will depend on the other, which is OK. It
>> still allows us to commit the first part and continue working on the
>> other one, for example.
> Exactly it's our case. Of course, it's possible to split first patch for
> two again: just suggestion of index (1.1) and reordering by existing
> pathkeys (1.2). Then 1.1 will be independent but 2 still should be
> applied after 1.2. But patch 1.1 is rather useless.
> 
>> can decide which one is cheaper etc. The decision which paths to keep
>> is done by add_path(), and it should stay like this, of course. I
>> wasn't suggesting to move the logic elsewhere.
> Cool, I haven't intention to modify it too.
> 
>>
>>>  > I wonder if we should make the add_path() logic smarter to recognize when two
>>>  > paths have different pathkeys but were generated to match the same grouping,
>>>  > to reduce the number of paths added by this optimization. Currently we do that
>>>  > for each pathkeys list independently, but we're considering many more pathkeys
>>>  > orderings ...
>>>
>>> Hm. I tend to say no.
>>> select .. from t1 group by a, b
>>> union
>>> select .. from t2 group by a, b
>>>
>>> t1 and t2 could have different set of indexes and different
>>> distribution, so locally it could be cheaper to use one index (for
>>> example, one defined as (b, a) and second as (a,b,c,d) - second is
>>> larger) but totally - another (example: second table doesn't have
>>> (b,a) index)
>>>
>>
>> But add_path() treats each of the relations independently, why
>> couldn't we pick a different index for each of the two relations?
> 
> Having of sorted output of both subselect could be cheaper that sorting
> one of them even if index scan was cheaper. But it seems to me that is
> not deal of suggested here optimization.
> 

OK, I think I understand what you're trying to say. We may have a query
like this:

SELECT a, SUM(x) FROM
(
    SELECT a, b, COUNT(c) AS x FROM t1 GROUP BY a, b
    UNION ALL
    SELECT a, b, COUNT(c) AS x FROM t2 GROUP BY a, b
) foo GROUP BY a;

and indexes on (a,b) and (b,a) for both relations. The "deduplication"
by pathkeys I suggested would mean we might keep only index (a,b) on t1
and (b,a) on t2, which means the grouping by "a" can't leverage index
scans on both relations. But if we keep paths for both indexes on each
relation, we can.

That's a good point, yes.

> 
>>>> For example, it seems to disregard that different data types have
>>>> different comparison costs. For example comparing bytea will be far
>>>> more expensive compared to int4, so it may be much more efficient to
>>>> compare int4 columns first, even if there are far fewer distinct
>>>> values in them.
>>> as I can see cost_sort() doesn't pay attention to this details. And
>>> it should be a separated patch to improve that.
>>>
>>
>> But sort also does not reorder columns.
> Yes. But estimation of cost of comparing function/number of unique 
> values in column could be not very accurate and so planner could make
> a wrong choice.

Isn't "estimation of cost of comparing function/number of unique values
in column could be not very accurate and so planner could make a wrong
choice" is more an argument against relying on it when doing these
optimizations?

FWIW it's one of the arguments Tom made in the incremental sort patch,
which relies on it too when computing cost of the incremental sort. I'm
sure it's going to be an obstacle there too.

> I saw 2 times difference in real-world application. Again, improving
> sort cost estimation is a separate task.
Sure. But we also need to ask the other question, i.e. how many people
would be negatively affected by the optimization. And I admit I don't
know the answer to that, the next example is entirely made up.

>>
>> Imagine you have a custom data type that is expensive for comparisons.
>> You know that, so you place it at the end of GROUP BY clauses, to
>> reduce the number of comparisons on that field. And then we come along
>> and just reorder the columns, placing it first, because it happens to
>> have a high ndistinct statistic. And there's no way to get the
>> original behavior :-(
> Hm. that it could be, but I don't know how to improve here.  Current
> cost_sort() will return the same cost for any columns order.
> 
> Okay, here we know an estimation of nrow, we could somehow find a
> comparing function oid and find a its procost field. And then? we also
> need to know width of tuple here and mostly repeat the cost_sort.
> 
> Another option is an introducing enable_groupby_reorder GUC variable
> although personally I don't like an idea to add new GUC variable.
> 

I don't like the idea of yet another GUC either, as it's a rather blunt
instrument. Which is why I'm suggesting to split the patch into two
parts. Then we can apply the index stuff (which seems relatively
straightforward) and keep working on this second part.

FWIW I think it would be useful to have "development GUC" that would
allow us to enable/disable these options during development, because
that makes experiments much easier. But then remove them before commit.

>>> Agree, but I don't know how to use it here. Except, may be:
>>> 1) first column - the column with bigger estimated number of groups
>>> 2) second column - the pair of (first, second) with bigger estimated
>>> number of groups
>>> 3) third column - the triple of (first, second, third) with bigger ...
>>>
>>> But seems even with that algorithm, ISTM, it could be implemented in
>>> cheaper manner.
>>>
>>
>> Maybe. I do have some ideas, although I haven't tried implementing it.
> Implemented, pls, have a look.
> 
>>
>> If there's no extended statistic on the columns, you can do the
>> current thing (assuming independence etc.). There's not much we can do
>> here.
> Agree
> 
>>
>> If there's an extended statistic, you can do either a greedy search
>> (get the next column with the highest ndistinct coefficient) or
>> exhaustive search (computing the estimated number of comparisons).
>>
>> Another challenge is that using only the ndistinct coefficient assumes
>> uniform distribution of the values. But you can have a column with 1M
>> distinct values, where a single value represents 99% of the rows. And
>> another column with 100k distinct values, with actual uniform
>> distribution. I'm pretty sure it'd be more efficient to place the 100k
>> column first.
> 
> Interesting.  Will think, thank you
> 

You're welcome.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: POC: GROUP BY optimization

From
Claudio Freire
Date:
On Wed, Jun 6, 2018 at 5:43 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> >>>> For example, it seems to disregard that different data types have
> >>>> different comparison costs. For example comparing bytea will be far
> >>>> more expensive compared to int4, so it may be much more efficient to
> >>>> compare int4 columns first, even if there are far fewer distinct
> >>>> values in them.
> >>> as I can see cost_sort() doesn't pay attention to this details. And
> >>> it should be a separated patch to improve that.
> >>>
> >>
> >> But sort also does not reorder columns.
> > Yes. But estimation of cost of comparing function/number of unique
> > values in column could be not very accurate and so planner could make
> > a wrong choice.

...
>
> >> Imagine you have a custom data type that is expensive for comparisons.
> >> You know that, so you place it at the end of GROUP BY clauses, to
> >> reduce the number of comparisons on that field. And then we come along
> >> and just reorder the columns, placing it first, because it happens to
> >> have a high ndistinct statistic. And there's no way to get the
> >> original behavior :-(
> > Hm. that it could be, but I don't know how to improve here.  Current
> > cost_sort() will return the same cost for any columns order.
> >
> > Okay, here we know an estimation of nrow, we could somehow find a
> > comparing function oid and find a its procost field. And then? we also
> > need to know width of tuple here and mostly repeat the cost_sort.
> >
> > Another option is an introducing enable_groupby_reorder GUC variable
> > although personally I don't like an idea to add new GUC variable.
> >
>
> I don't like the idea of yet another GUC either, as it's rather blunt
> instrument. Which is why I'm suggesting to split the patch into two
> parts. Then we can apply the index stuff (which seems relatively
> straightforward) and keep working on this second part.
>
> FWIW I think it would be useful to have "development GUC" that would
> allow us to enable/disable these options during development, because
> that makes experiments much easier. But then remove them before commit.

Correct me if I'm wrong, but doesn't this patch concern itself with
precisely sort performance?

As such, estimating sort performance correctly in the various plan
variants being considered seems to be a very central aspect of it.

This means it's pretty much time to consider the effect of operator
cost *and* distinct values in the cost computation.

Assuming cost_sort does its thing, it's just a matter of building the
desired variants and picking the one with the smallest cost_sort. One
can use heuristics to build a subset of all possible column orderings
with a guiding principle that tries to maximize the likelihood of
finding a better order than the one in the query (like sorting by
ndistinct), but I'd suggest:

- Always considering the sort order verbatim as given in the SQL (ie:
what the user requests)
- Relying on cost_sort to distinguish among them, and pick a winner, and
- When in doubt (if cost_sort doesn't produce a clear winner), use the
order given by the user

Comparison cost can be approximated probabilistically as:

cost_comp = sum(cost_op_n * (1.0 / ndistinct(col_1_to_n)))

Where cost_op_n is the cost of the comparison function for column N,
and ndistinct(col_1_to_n) is an estimation of the number of distinct
values for columns prior to N in the sort order.

You can approximate ndistinct as the product of ndistinct of previous
columns, or use extended statistics when available.
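
For instance, on the table "t" from upthread, assuming a single constant
comparison cost C for all three columns, the two orderings discussed earlier
come out roughly as:

    -- ndistinct(a) = 100, ndistinct(a,b) = 100, ndistinct(a,c) = 5000
    -- (a,b,c): C * (1 + 1/100 + 1/100)  = 1.0200 * C
    -- (a,c,b): C * (1 + 1/100 + 1/5000) = 1.0102 * C
    SELECT 1 + 1/100.0 + 1/100.0  AS rel_cost_abc,
           1 + 1/100.0 + 1/5000.0 AS rel_cost_acb;

so (a,c,b) is slightly cheaper per tuple comparison, consistent with the
earlier observation about that GROUP BY.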

I think that should give a good enough approximation of
ndistinct-enriched sort costs that considers operator cost
appropriately. That operator cost is basically an average and can be
used as a constant, so it's just a matter of passing the right
comparison_cost to cost_sort.

Even if ndistinct estimates are off, the fact that operator cost is
involved should give the user a good enough tool if the planner picks the
wrong path - it's only a matter of boosting the operator cost to let the
planner know that sorting that way is expensive.

Prioritization of the user-provided order can be as simple as giving
that comparison_cost a small handicap.


Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:
On 06/06/2018 11:22 PM, Claudio Freire wrote:
> On Wed, Jun 6, 2018 at 5:43 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>>
>>>>>> For example, it seems to disregard that different data types have
>>>>>> different comparison costs. For example comparing bytea will be far
>>>>>> more expensive compared to int4, so it may be much more efficient to
>>>>>> compare int4 columns first, even if there are far fewer distinct
>>>>>> values in them.
>>>>> as I can see cost_sort() doesn't pay attention to this details. And
>>>>> it should be a separated patch to improve that.
>>>>>
>>>>
>>>> But sort also does not reorder columns.
>>> Yes. But estimation of cost of comparing function/number of unique
>>> values in column could be not very accurate and so planner could make
>>> a wrong choice.
> 
> ...
>>
>>>> Imagine you have a custom data type that is expensive for comparisons.
>>>> You know that, so you place it at the end of GROUP BY clauses, to
>>>> reduce the number of comparisons on that field. And then we come along
>>>> and just reorder the columns, placing it first, because it happens to
>>>> have a high ndistinct statistic. And there's no way to get the
>>>> original behavior :-(
>>> Hm. that it could be, but I don't know how to improve here.  Current
>>> cost_sort() will return the same cost for any columns order.
>>>
>>> Okay, here we know an estimation of nrow, we could somehow find a
>>> comparing function oid and find a its procost field. And then? we also
>>> need to know width of tuple here and mostly repeat the cost_sort.
>>>
>>> Another option is an introducing enable_groupby_reorder GUC variable
>>> although personally I don't like an idea to add new GUC variable.
>>>
>>
>> I don't like the idea of yet another GUC either, as it's rather blunt
>> instrument. Which is why I'm suggesting to split the patch into two
>> parts. Then we can apply the index stuff (which seems relatively
>> straightforward) and keep working on this second part.
>>
>> FWIW I think it would be useful to have "development GUC" that would
>> allow us to enable/disable these options during development, because
>> that makes experiments much easier. But then remove them before commit.
> 
> Correct me if I'm wrong, but doesn't this patch concern itself with 
> precisely sort performance?
> 

Yes, the second part (reordering pathkeys by ndistinct) does.

> As such, estimating sort performance correctly in the various plan
> variants being considered seems to be a very central aspect of it.
> 
> This means it's pretty much time to consider the effect of operator
> cost *and* distinct values in the cost computation.
> 

Yes, until now that was not really needed because the optimizer does not
consider reordering of the pathkeys - it simply grabs either GROUP BY or
ORDER BY pathkeys and runs with that.

So the costing was fairly trivial, we simply do something like

    comparison_cost = 2.0 * cpu_operator_cost;

    sort_cost = comparison_cost * tuples * LOG2(tuples);

which essentially ignores that there might be multiple columns, or that
the columns may have sort operator with different costs.
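
For example, for the 1M-row table from upthread, with the default
cpu_operator_cost = 0.0025, that formula gives the same number no matter how
the columns are ordered or what their types are:

    -- 2.0 * cpu_operator_cost * tuples * log2(tuples)
    SELECT 2.0 * 0.0025 * 1000000 * log(2, 1000000) AS sort_comparison_cost;
    -- ~99658 for (a,b,c), (a,c,b), or any other permutation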

The question is how reliable the heuristics can be. The current patch uses
just plain ndistinct, which seems rather unreliable, and I don't have a clear
idea how to improve that - we may have MCVs for the columns and perhaps some
extended statistics, but I'm not sure how far we can run with that.

Essentially, what we need is to estimate the number of comparisons for each
column, to compute a better comparison_cost.

>
> Assuming cost_sort does its thing, it's just a matter of building the
> desired variants and picking the one with the smallest cost_sort. One
> can use heuristics to build a subset of all possible column orderings
> with a guiding principle that tries to maximize the likelihook of
> finding a better order than the one in the query (like sorting by
> ndistinct), but I'd suggest:
> 
> - Always considering the sort order verbatim as given in the SQL (ie:
> what the user requests)
> - Relying on cost_sort to distinguish among them, and pick a winner, and
> - When in doubt (if cost_sort doesn't produce a clear winner), use the
> order given by the user
> 

I don't see why not to generate all possible orderings (keeping only the
cheapest path for each pathkey variant) and let the optimizer sort it out. If
the user specified an explicit ORDER BY, we'll slap an extra Sort on top at
the end, increasing the cost.

> Comparison cost can be approximated probabilistically as:
> 
> cost_comp = sum(cost_op_n * (1.0 / ndistinct(col_1_to_n)))
> 
> Where cost_op_n is the cost of the comparison function for column N,
> and ndistinct(col_1_to_n) is an estimation of the number of distinct
> values for columns prior to N in the sort order.
> 
> You can approximate ndistinct as the product of ndistinct of previous
> columns, or use extended statistics when available.
> 

Sure. The trouble is this also assumes uniform distribution, which is
not always the case.

> I think that should give a good enough approximation of 
> ndistinct-enriched sort costs that considers operator cost 
> appropriately. That operator cost is basically an average and can be 
> used as a constant, so it's just a matter of passing the right 
> comparison_cost to cost_sort.
> 
> Even if ndistinct estimates are off, the fact that operator cost is 
> involved should be a good enough tool for the user should the
> planner pick the wrong path - it's only a matter of boosting operator
> cost to let the planner know that sorting that way is expensive.
> 

There are far too many "should"s in these two paragraphs.

> Priorization of the user-provided order can be as simple as giving
> that comparison_cost a small handicap.

I see no point in doing that, and I don't recall a single place in the
planner where we do that. If the user specified ORDER BY, we'll slap an
explicit Sort on top when needed (which acts as the handicap, but in a
clear way). Otherwise we don't do such things - it'd be just plain
confusing (consider "ORDER BY a,b" vs. "ORDER BY b,c" with same data
types, ndistinct etc. but unexpectedly different costs). Also, what
would be a good value for the handicap?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: POC: GROUP BY optimization

From
Claudio Freire
Date:
On Wed, Jun 6, 2018 at 6:57 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On 06/06/2018 11:22 PM, Claudio Freire wrote:
> > On Wed, Jun 6, 2018 at 5:43 PM Tomas Vondra
> > As such, estimating sort performance correctly in the various plan
> > variants being considered seems to be a very central aspect of it.
> >
> > This means it's pretty much time to consider the effect of operator
> > cost *and* distinct values in the cost computation.
> >
>
> Yes, until now that was not really needed because the optimizer does not
> consider reordering of the pathkeys - it simply grabs either GROUP BY or
> ORDER BY pathkeys and runs with that.
>
> So the costing was fairly trivial, we simply do something like
>
>     comparison_cost = 2.0 * cpu_operator_cost;
>
>     sort_cost = comparison_cost * tuples * LOG2(tuples);
>
> which essentially ignores that there might be multiple columns, or that
> the columns may have sort operator with different costs.
>
> The question is how reliable the heuristics can be. The current patch
> uses just plain ndistinct, but that seems rather unreliable but I don't
> have a clear idea how to improve that - we may have MCV for the columns
> and perhaps some extended statistics, but I'm not sure how far we can
> run with that.
>
> Essentially what we need to estimate the number of comparisons for each
> column, to compute better comparison_cost.

This could be approached by adjusting statistics by any relevant
restrictions applicable to the columns being grouped, as is done for
selectivity estimations.

I'm not sure how far that would get us, though. It would be rather
easy to lose sight of those restrictions if there are complex
operations involved.

> > Assuming cost_sort does its thing, it's just a matter of building the
> > desired variants and picking the one with the smallest cost_sort. One
> > can use heuristics to build a subset of all possible column orderings
> > with a guiding principle that tries to maximize the likelihook of
> > finding a better order than the one in the query (like sorting by
> > ndistinct), but I'd suggest:
> >
> > - Always considering the sort order verbatim as given in the SQL (ie:
> > what the user requests)
> > - Relying on cost_sort to distinguish among them, and pick a winner, and
> > - When in doubt (if cost_sort doesn't produce a clear winner), use the
> > order given by the user
> >
>
> I don't see why not to generate all possible orderings (keeping only the
> cheapest path for each pathkey variant) and let the optimizer to sort it
> out.

I'm assuming costing the full N! possible orderings would be
prohibitively expensive.

> If the user specified an explicit ORDER BY, we'll slap an extra
> Sort by at the end, increasing the cost.

I think you misunderstood me. I'm saying the exact opposite.

When I mention handicap, I mean to give the explicitly requested group
by order a *reduced* cost, to give it an advantage over the
heuristics.

This is to try to attack the problem you mentioned, where users already
accounting for operator costs in their queries would get overridden by
the planner, perhaps to the detriment of overall execution time.

In essence:

- If the user requested that order, we assume it is "somewhat
subjectively better" (by giving it a slightly reduced cost).
- If there is an explicit order by clause that differs from a
considered path, the required sort node will already penalize it
appropriately, nothing needs to be done in relation to sort costs.
- All other things being equal, cost_sort will make the decision. If a
plan beats the user-provided order in spite of the handicap, it will
be used. So it's still optimizing clear cases.

>
> > Comparison cost can be approximated probabilistically as:
> >
> > cost_comp = sum(cost_op_n * (1.0 / ndistinct(col_1_to_n)))
> >
> > Where cost_op_n is the cost of the comparison function for column N,
> > and ndistinct(col_1_to_n) is an estimation of the number of distinct
> > values for columns prior to N in the sort order.
> >
> > You can approximate ndistinct as the product of ndistinct of previous
> > columns, or use extended statistics when available.
> >
>
> Sure. The trouble is this also assumes uniform distribution, which is
> not always the case.

Well, (1.0 / ndistinct) = p(left == right).

If you can compute a better p(left == right) with an MCV, you can get
a better estimate. If you have an MCV. But that'd be a bit expensive,
and I'm not sure it's all that relevant.

To what degree does improving this produce better plans? As long as
average expected cost is relatively well estimated (as in, one sort
order relative to another sort order), it won't affect the decision.

But if needed, the MCV can be used for this.

So, in essence, you need to account for:

- restrictions on that column that constrain the domain
- distribution skew (MCV, nulls)
- ndistinct

To compute p(left == right).

Maybe something as simple as the following?

p_special = sum(mcv_i * mcv_i) + null_frac * null_frac
p_other = (1 - p_special) * (1 - p_special) / ndistinct(restr)

> > I think that should give a good enough approximation of
> > ndistinct-enriched sort costs that considers operator cost
> > appropriately. That operator cost is basically an average and can be
> > used as a constant, so it's just a matter of passing the right
> > comparison_cost to cost_sort.
> >
> > Even if ndistinct estimates are off, the fact that operator cost is
> > involved should be a good enough tool for the user should the
> > planner pick the wrong path - it's only a matter of boosting operator
> > cost to let the planner know that sorting that way is expensive.
> >
>
> There are far to many "should" in these two paragraphs.

Aren't all planner decisions riddled with "should"s?

Anyway, my point is that, since the user can control operator cost,
and operator cost is involved in the computation, and the ordering
given by the user is explicitly tested for and given a slight
advantage, it should (will?) be easy for the user to overcome any poor
planner decisions by giving the planner the relevant information.

> > Priorization of the user-provided order can be as simple as giving
> > that comparison_cost a small handicap.
>
> I see no point in doing that, and I don't recall a single place in the
> planner where we do that. If the user specified ORDER BY, we'll slap an
> explicit Sort on top when needed (which acts as the handicap, but in a
> clear way). Otherwise we don't do such things - it'd be just plain
> confusing (consider "ORDER BY a,b" vs. "ORDER BY b,c" with same data
> types, ndistinct etc. but unexpectedly different costs). Also, what
> would be a good value for the handicap?

As I said, I'm not talking about explicit order by, but about the case
where the user knows which group by order is the most efficient
somehow.

I'm proposing that the planner shouldn't override the user without clear
evidence that the other plan is actually better - better above the noise
level. That's what I mean by handicap.


Re: POC: GROUP BY optimization

From
Claudio Freire
Date:
On Wed, Jun 6, 2018 at 7:18 PM Claudio Freire <klaussfreire@gmail.com> wrote:
> > > Comparison cost can be approximated probabilistically as:
> > >
> > > cost_comp = sum(cost_op_n * (1.0 / ndistinct(col_1_to_n)))
> > >
> > > Where cost_op_n is the cost of the comparison function for column N,
> > > and ndistinct(col_1_to_n) is an estimation of the number of distinct
> > > values for columns prior to N in the sort order.
> > >
> > > You can approximate ndistinct as the product of ndistinct of previous
> > > columns, or use extended statistics when available.
> > >
> >
> > Sure. The trouble is this also assumes uniform distribution, which is
> > not always the case.
>
> Well, (1.0 / ndistinct) = p(left == right).
>
> If you can compute a better p(left == right) with an MCV, you can get
> a better estimate. If you have an MCV. But that'd be a bit expensive,
> and I'm not sure it's all that relevant.
>
> To what degree does improving this produce better plans? As long as
> average expected cost is relatively well estimated (as in, one sort
> order relative to another sort order), it won't affect the decision.
>
> But if needed, the MCV can be used for this.
>
> So, in essence, you need to account for:
>
> - restrictions on that column that constrain the domain
> - distribution skew (MCV, nulls)
> - ndistinct
>
> To compute p(left == right).
>
> Maybe something as simple as the following?
>
> p_special = sum(mcv_i * mcv_i) + null_frac * null_frac
> p_other = (1 - p_special) * (1 - p_special) / ndistinct(restr)

Well, that came out with a slew of math errors, but the idea is clear:
compute p(left == right) given the available statistics and
constrained by any applicable restrictions.
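
To make the proposal concrete, here is a minimal standalone sketch of the idea
(plain Python, not planner code; the statistics, function names and numbers are
invented, and the formula is a slight cleanup of the one quoted above rather
than anything from the patch):

def p_equal(mcv_freqs, null_frac, ndistinct):
    # Probability that two randomly picked values of the column compare equal:
    # either both fall into the same MCV entry (or are both NULL), or both land
    # on the same one of the remaining non-MCV values, assumed uniform.
    p_special = sum(f * f for f in mcv_freqs) + null_frac * null_frac
    p_rest = 1.0 - sum(mcv_freqs) - null_frac
    n_rest = max(ndistinct - len(mcv_freqs), 1)
    return p_special + (p_rest * p_rest) / n_rest

def comparison_cost(columns):
    # Expected per-pair comparison cost for a given column order: column N is
    # only compared when all previous columns compared equal, so its operator
    # cost is weighted by the product of the preceding p_equal values.
    cost, p_prefix_equal = 0.0, 1.0
    for col in columns:
        cost += col["op_cost"] * p_prefix_equal
        p_prefix_equal *= p_equal(col["mcv"], col["null_frac"], col["ndistinct"])
    return cost

# Hypothetical per-column statistics.
a = {"op_cost": 1.0, "mcv": [0.3, 0.2], "null_frac": 0.0, "ndistinct": 1000}
b = {"op_cost": 1.0, "mcv": [0.5, 0.4], "null_frac": 0.1, "ndistinct": 3}

print(comparison_cost([a, b]))   # sorting by (a, b) -- cheaper
print(comparison_cost([b, a]))   # sorting by (b, a) -- more comparisons of "a"

The only point of the sketch is that skew (MCV), NULLs and restrictions all
enter through a single per-column p(left == right), which then feeds the same
cost_comp sum discussed earlier.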


Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:
On 06/07/2018 12:18 AM, Claudio Freire wrote:
> On Wed, Jun 6, 2018 at 6:57 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>>
>> On 06/06/2018 11:22 PM, Claudio Freire wrote:
>>> On Wed, Jun 6, 2018 at 5:43 PM Tomas Vondra
>>> As such, estimating sort performance correctly in the various plan
>>> variants being considered seems to be a very central aspect of it.
>>>
>>> This means it's pretty much time to consider the effect of operator
>>> cost *and* distinct values in the cost computation.
>>>
>>
>> Yes, until now that was not really needed because the optimizer does not
>> consider reordering of the pathkeys - it simply grabs either GROUP BY or
>> ORDER BY pathkeys and runs with that.
>>
>> So the costing was fairly trivial, we simply do something like
>>
>>     comparison_cost = 2.0 * cpu_operator_cost;
>>
>>     sort_cost = comparison_cost * tuples * LOG2(tuples);
>>
>> which essentially ignores that there might be multiple columns, or that
>> the columns may have sort operator with different costs.
>>
>> The question is how reliable the heuristics can be. The current patch
>> uses just plain ndistinct, but that seems rather unreliable but I don't
>> have a clear idea how to improve that - we may have MCV for the columns
>> and perhaps some extended statistics, but I'm not sure how far we can
>> run with that.
>>
>> Essentially what we need is to estimate the number of comparisons for
>> each column, to compute a better comparison_cost.
> 
> This could be approached by adjusting statistics by any relevant 
> restrictions applicable to the columns being grouped, as is done for 
> selectivity estimations.
> 

I don't follow how this is related to restrictions. Can you elaborate?

> I'm not sure how far that would get us, though. It would be rather
> easy to lose sight of those restrictions if there are complex
> operations involved.
> 
>>> Assuming cost_sort does its thing, it's just a matter of building the
>>> desired variants and picking the one with the smallest cost_sort. One
>>> can use heuristics to build a subset of all possible column orderings
>>> with a guiding principle that tries to maximize the likelihood of
>>> finding a better order than the one in the query (like sorting by
>>> ndistinct), but I'd suggest:
>>>
>>> - Always considering the sort order verbatim as given in the SQL (ie:
>>> what the user requests)
>>> - Relying on cost_sort to distinguish among them, and pick a winner, and
>>> - When in doubt (if cost_sort doesn't produce a clear winner), use the
>>> order given by the user
>>>
>>
>> I don't see why not to generate all possible orderings (keeping only the
>> cheapest path for each pathkey variant) and let the optimizer sort it
>> out.
> 
> I'm assuming costing the full N! possible orderings would be
> prohibitively expensive.
> 

That's possible, yes - particularly for large N values. Perhaps there's
a simpler algorithm to compute a sufficient approximation. In a greedy
way, starting from some likely-good combination of columns, or so. I
haven't thought about this part very much ...

>> If the user specified an explicit ORDER BY, we'll slap an extra
>> Sort at the end, increasing the cost.
> 
> I think you misunderstood me. I'm saying the exact opposite.
> 
> When I mention handicap, I mean to give the explicitly requested group
> by order a *reduced* cost, to give it an advantage over the
> heuristics.
> 

I did understand you. I'm saying that if the ordering does not match the
one requested by the user (in ORDER BY), we end up adding an explicit
Sort node on top of it, which increases the cost of all the other paths,
acting as a disadvantage.

But now I realize this only works for when there is an ORDER BY clause,
not for GROUP BY (in which case we don't add any Sort node).

> This is to try to attack the problem you mentioned where users already
> accounting for operator costs in their queries would get overridden by
> the planner, perhaps in detriment of overall execution time.
> 
> In essence:
> 
> - If the user requested that order, we assume it "somewhat
> subjectively better" (by giving it a slightly reduced cost).
> - If there is an explicit order by clause that differs from a
> considered path, the required sort node will already penalize it
> appropriately, nothing needs to be done in relation to sort costs.
> - All other things being equal, cost_sort will make the decision. If a
> plan beats the user-provided order in spite of the handicap, it will
> be used. So it's still optimizing clear cases.
> 

OK. I still don't believe this is a good approach, because we don't know
how good our estimate of the costs is, so I have no idea how large the
handicap needs to be.

>>
>>> Comparison cost can be approximated probabilistically as:
>>>
>>> cost_comp = sum(cost_op_n * (1.0 / ndistinct(col_1_to_n)))
>>>
>>> Where cost_op_n is the cost of the comparison function for column N,
>>> and ndistinct(col_1_to_n) is an estimation of the number of distinct
>>> values for columns prior to N in the sort order.
>>>
>>> You can approximate ndistinct as the product of ndistinct of previous
>>> columns, or use extended statistics when available.
>>>
>>
>> Sure. The trouble is this also assumes uniform distribution, which is
>> not always the case.
> 
> Well, (1.0 / ndistinct) = p(left == right).
> 
> If you can compute a better p(left == right) with an MCV, you can get
> a better estimate. If you have an MCV. But that'd be a bit expensive,
> and I'm not sure it's all that relevant.
> 
> To what degree does improving this produce better plans? As long as
> average expected cost is relatively well estimated (as in, one sort
> order relative to another sort order), it won't affect the decision.
> 

I'd bet I can construct corner-case examples with vastly different
behavior depending on column data distributions, so it's not entirely
insignificant. How important is it in practice I don't know.

> But if needed, the MCV can be used for this.
> 
> So, in essence, you need to account for:
> 
> - restrictions on that column that constrain the domain
> - distribution skew (MCV, nulls)
> - ndistinct
> 
> To compute p(left == right).
> 
> Maybe something as simple as the following?
> 
> p_special = sum(mcv_i * mcv_i) + null_frac * null_frac
> p_other = (1 - p_special) * (1 - p_special) / ndistinct(restr)
> 
>>> I think that should give a good enough approximation of
>>> ndistinct-enriched sort costs that considers operator cost
>>> appropriately. That operator cost is basically an average and can be
>>> used as a constant, so it's just a matter of passing the right
>>> comparison_cost to cost_sort.
>>>
>>> Even if ndistinct estimates are off, the fact that operator cost is
>>> involved should be a good enough tool for the user should the
>>> planner pick the wrong path - it's only a matter of boosting operator
>>> cost to let the planner know that sorting that way is expensive.
>>>
>>
>> There are far too many "should" in these two paragraphs.
> 
> Aren't all planner decisions riddled with "should"s?
> 
> Anyway, my point is that, since the user can control operator cost,
> and operator cost is involved in the computation, and the ordering
> given by the user is explicitly tested for and given a slight
> advantage, it should (will?) be easy for the user to overcome any poor
> planner decisions by giving the planner the relevant information.
> 

I really don't want to force users to tweak operator cost in order to
tune queries. It's about an order of magnitude less convenient than a
GUC, for various reasons.

>>> Priorization of the user-provided order can be as simple as giving
>>> that comparison_cost a small handicap.
>>
>> I see no point in doing that, and I don't recall a single place in the
>> planner where we do that. If the user specified ORDER BY, we'll slap an
>> explicit Sort on top when needed (which acts as the handicap, but in a
>> clear way). Otherwise we don't do such things - it'd be just plain
>> confusing (consider "ORDER BY a,b" vs. "ORDER BY b,c" with same data
>> types, ndistinct etc. but unexpectedly different costs). Also, what
>> would be a good value for the handicap?
> 
> As I said, I'm not talking about explicit order by, but about the case
> where the user knows which group by order is the most efficient
> somehow.
> 
> I'm proposing the planner shouldn't override the user without clear
> evidence that the other plan is actually better, above the noise
> better. That's what I mean by handicap.
> 

OK, gotcha. But I'm really unsure how large the handicap should be.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: POC: GROUP BY optimization

From
Claudio Freire
Date:
On Wed, Jun 6, 2018 at 8:06 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> >>> Comparison cost can be approximated probabilistically as:
> >>>
> >>> cost_comp = sum(cost_op_n * (1.0 / ndistinct(col_1_to_n)))
> >>>
> >>> Where cost_op_n is the cost of the comparison function for column N,
> >>> and ndistinct(col_1_to_n) is an estimation of the number of distinct
> >>> values for columns prior to N in the sort order.
> >>>
> >>> You can approximate ndistinct as the product of ndistinct of previous
> >>> columns, or use extended statistics when available.
> >>>
> >>
> >> Sure. The trouble is this also assumes uniform distribution, which is
> >> not always the case.
> >
> > Well, (1.0 / ndistinct) = p(left == right).
> >
> > If you can compute a better p(left == right) with an MCV, you can get
> > a better estimate. If you have an MCV. But that'd be a bit expensive,
> > and I'm not sure it's all that relevant.
> >
> > To what degree does improving this produce better plans? As long as
> > average expected cost is relatively well estimated (as in, one sort
> > order relative to another sort order), it won't affect the decision.
> >
>
> I'd bet I can construct corner-case examples with vastly different
> behavior depending on column data distributions, so it's not entirely
> insignificant. How important is it in practice I don't know.

I guess that can only be answered by building that solution and
testing it against the corner cases.


Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:

> OK, I haven't looked at v7 yet, but if I understand correctly it tries
> to maintain the ordering as much as possible? Does that actually help? I
> mean, the incremental sort patch allows the sorting to happen by pieces,
> but here we still need to sort all the data, right?
> 
> Can you give an example demonstrating the benefit?


> 
> SELECT a, SUM(x) FROM
> (
>      SELECT a, b, COUNT(c) AS x FROM t1 GROUP BY a, b
>      UNION ALL
>      SELECT a, b, COUNT(c) AS x FROM t2 GROUP BY a, b
> ) foo GROUP BY a;
> 
> and indexes on (a,b) and (b,a) for both relations. The "deduplication"
> by pathkeys I suggested would mean we might keep only index (a,b) on t1
> and (b,a) on t2, which means the grouping by "a" can't leverage index
> scans on both relations. But if we keep paths for both indexes on each
> relation, we can.
yes, that's one option

> Isn't "estimation of cost of comparing function/number of unique values
> in column could be not very accurate and so planner could make a wrong
> choice" is more an argument against relying on it when doing these
> optimizations?
> 
> FWIW it's one of the arguments Tom made in the incremental sort patch,
> which relies on it too when computing cost of the incremental sort. I'm
> sure it's going to be an obstacle there too. >
>> I saw 2 times difference in real-world application. Again, improving
>> sort cost estimation is a separate task.
> Sure. But we also need to ask the other question, i.e. how many people
> would be negatively affected by the optimization. And I admit I don't
> know the answer to that, the next example is entirely made up.
Hm, it seems the best way here is improving the cost_sort estimation. Will try, but 
I think that is a separate patch


> FWIW I think it would be useful to have "development GUC" that would
> allow us to enable/disable these options during development, because
> that makes experiments much easier. But then remove them before commit.

Will do

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:
> So the costing was fairly trivial, we simply do something like
> 
>      comparison_cost = 2.0 * cpu_operator_cost;
> 
>      sort_cost = comparison_cost * tuples * LOG2(tuples);
> 
> which essentially ignores that there might be multiple columns, or that
> the columns may have sort operator with different costs.
Agree. And distribution of keys.
> 
> The question is how reliable the heuristics can be. The current patch
> uses just plain ndistinct, but that seems rather unreliable but I don't
> have a clear idea how to improve that - we may have MCV for the columns
> and perhaps some extended statistics, but I'm not sure how far we can
> run with that.
v8 already uses another algorithm.

> 
> Essentially what we need is to estimate the number of comparisons for each
> column, to compute a better comparison_cost.
Exactly

>> Priorization of the user-provided order can be as simple as giving
>> that comparison_cost a small handicap.
> 
> I see no point in doing that, and I don't recall a single place in the
> planner where we do that. If the user specified ORDER BY, we'll slap an
> explicit Sort on top when needed (which acts as the handicap, but in a
> clear way). Otherwise we don't do such things - it'd be just plain
> confusing (consider "ORDER BY a,b" vs. "ORDER BY b,c" with same data
> types, ndistinct etc. but unexpectedly different costs). Also, what
> would be a good value for the handicap?

Again agree. If we have a fixed order of columns (ORDER BY) then we should not try 
to reorder it. The current patch follows that, if I didn't make a mistake.

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:
>> I don't see why not to generate all possible orderings (keeping only the
>> cheapest path for each pathkey variant) and let the optimizer sort it
>> out.
> 
> I'm assuming costing the full N! possible orderings would be
> prohibitively expensive.

That's true, but as a first step we need to improve cost_sort, and only then 
try to find the best pathkeys order in an optimal way.

> - If the user requested that order, we assume it "somewhat
> subjectively better" (by giving it a slightly reduced cost).
I don't think so. It's not the SQL way - the DB should determine the optimal way to 
execute a query.


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:
On 06/07/2018 03:41 PM, Teodor Sigaev wrote:
 >>
>> ... snip ...
 >>
>>> Priorization of the user-provided order can be as simple as giving
>>> that comparison_cost a small handicap.
>>
>> I see no point in doing that, and I don't recall a single place in the
>> planner where we do that. If the user specified ORDER BY, we'll slap an
>> explicit Sort on top when needed (which acts as the handicap, but in a
>> clear way). Otherwise we don't do such things - it'd be just plain
>> confusing (consider "ORDER BY a,b" vs. "ORDER BY b,c" with same data
>> types, ndistinct etc. but unexpectedly different costs). Also, what
>> would be a good value for the handicap?
> 
> Again agree. If we have a fixed order of columns (ORDER BY) then we should 
> not try to reorder it. The current patch follows that, if I didn't make a mistake.
> 

This part seems to be more a misunderstanding between me and Claudio. I 
believe Claudio was referring to the column order in a GROUP BY, not 
ORDER BY. In which case we don't add any Sort, of course.

I'm still opposed to adding arbitrary handicap to prioritize the order 
specified by user, for the reasons I explained before. We should aim to 
make the heuristics/costing reliable enough to make this unnecessary.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:
>> Again agree. If we have a fixed order of columns (ORDER BY) then we should not 
>> try to reorder it. The current patch follows that, if I didn't make a mistake.
>>
> 
> This part seems to be more a misunderstanding between me and Claudio. I believe 
> Claudio was referring to the column order in a GROUP BY, not ORDER BY. In which 
> case we don't add any Sort, of course.
I hope so

> 
> I'm still opposed to adding arbitrary handicap to prioritize the order specified 
> by user, for the reasons I explained before. We should aim to make the 
> heuristics/costing reliable enough to make this unnecessary.
+1

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:
>> Yes. But again, this description is a bit short. First it works after
>> first patch and might get some preordered leading pathkeys. Second, it
>> tries to match ORDER BY clause order if there is no preordered leading
>> pathkeys from first patch (it was introduced in v7). And third, if there
>> is a tail of unmatched pathkeys on previous stages then it will reorder
>> that tail.
>>
> 
> OK, I haven't looked at v7 yet, but if I understand correctly it tries
> to maintain the ordering as much as possible? Does that actually help? I
> mean, the incremental sort patch allows the sorting to happen by pieces,
> but here we still need to sort all the data, right?
> 
> Can you give an example demonstrating the benefit?

See tst.sql. Queries are marked with opt (optimization is on) and noopt.

Query 1: select count(*) from btg group by v, r;
Query 2: select count(*) from btg group by n, v, r order by n;

For both queries it's possible to reorder the v and r columns; the n column has a 
single distinct value.

On my laptop:
Query 1opt vs 1noopt: 3177.500 ms vs 6604.493 ms
       2opt vs 2noopt: 5800.307 ms vs 7486.967 ms

So, what we see:
1) For query 1 the optimization gives 2 times better performance, for query 2 only 
30%. If column 'n' were unique then the times for queries 1 and 2 should be the 
same. We could add a check for preordered pathkeys in 
get_cheapest_group_keys_order(), and if estimate_num_groups(reordered pathkeys) 
is close to 1 then skip reordering the tail of the pathkeys.

2) The planning cost is the same for all queries. So cost_sort() doesn't even take 
the number of columns into account.

> FWIW I think it would be useful to have "development GUC" that would
> allow us to enable/disable these options during development, because
> that makes experiments much easier. But then remove them before commit.
Added in v9: debug_enable_group_by_match_order_by and 
debug_enable_cheapest_group_by. I also checked compatibility with the incremental 
sort patch, and everything works except a small merge conflict which could be 
resolved right before committing.

Next, I had a look at cost_incremental_sort() provided by the incremental sort patch 
and, it's a pity, it doesn't solve our problem with the impact of the per-column 
comparison function cost and the number of times it is called.


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachments

Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:

On 06/07/2018 06:22 PM, Teodor Sigaev wrote:
>>> Yes. But again, this description is a bit short. First it works after
>>> first patch and might get some preordered leading pathkeys. Second, it
>>> tries to match ORDER BY clause order if there is no preordered leading
>>> pathkeys from first patch (it was introduced in v7). And third, if there
>>> is a tail of unmatched pathkeys on previous stages then it will reorder
>>> that tail.
>>>
>>
>> OK, I haven't looked at v7 yet, but if I understand correctly it tries
>> to maintain the ordering as much as possible? Does that actually help? I
>> mean, the incremental sort patch allows the sorting to happen by pieces,
>> but here we still need to sort all the data, right?
>>
>> Can you give an example demonstrating the benefit?
> 
> See tst.sql. queries are marked with opt (optimization is on) and noopt.
> 
> Query 1: select count(*) from btg group by v, r;
> Query 2: select count(*) from btg group by n, v, r order by n;
> 
> For both queries it's possible to reorder v and r column, n column has
> the single distinct value.
> 
> On my laptop:
> Query 1opt vs 1noopt: 3177.500 ms vs 6604.493 ms
>       2opt vs 2noopt: 5800.307 ms vs 7486.967 ms
> 
> So, what we see:
> 1) for query 1 optimization gives 2 times better performance, for query
> 2 only 30%. if column 'n' will be unique then time for query 1 and 2
> should be the same. We could add check for preordered pathkeys in
> get_cheapest_group_keys_order() and if estimate_num_groups(reordered
> pathkeys) is close to 1 then we could do not reordering of tail of
> pathkeys.
> 

OK, those are nice improvements, although the examples are somewhat
extreme (only 1 or 2 distinct values). So yes, in some cases this
reordering clearly makes a big difference, but I still think we need to
improve the heuristics to minimize regressions.

I see v9 is now calling estimate_num_groups(), so it already benefits
from extended statistics. Nice! I think the algorithm is more or less
the greedy option I proposed in one of the earlier messages - I don't
know if it's sufficient or not, but the only other idea I have is
essentially an exhaustive search through all possible permutations.

So that's a nice improvement, although I think we should also consider
non-uniform distributions (using the per-column MCV lists).

> 2) Planing cost is the same for all queries. So, cost_sort() doesn't
> take into account even number of columns.
> 

Yeah. As I wrote before, I think we'll need to come up with a model for
comparison_cost depending on the column order, which should make costs
for those paths different.

>> FWIW I think it would be useful to have "development GUC" that would
>> allow us to enable/disable these options during development, because
>> that makes experiments much easier. But then remove them before commit.
> Added, v9, debug_enable_group_by_match_order_by and
> debug_enable_cheapest_group_by. I also checked compatibility with
> incremental sort patch, and all works except small merge conflict which
> could be resolved right before committing.
> 

OK. I think we should also have a debug GUC for the first patch (i.e.
one that would allow reordering group_pathkeys to match the input path).

Regarding the two GUCs introduced in v9, I'm not sure I 100% understand
what they are doing. Let me try explaining what I think they do:

1) debug_cheapest_group_by - Reorder GROUP BY clauses using ndistinct
stats for columns, placing columns with more distinct values first (so
that we don't need to compare the following columns as often).

2) debug_group_by_match_order_by - Try to reorder the GROUP BY clauses
to match the ORDER BY, so that the group aggregate produces results in
the desired output (and we don't need an explicit Sort).

FWIW I doubt about the debug_group_by_match_order_by optimization. The
trouble is that it's only based on comparing the pathkeys lists, and
entirely ignores that the two result sets may operate on very different
numbers of rows.

Consider for example "SELECT * FROM t GROUP BY a,b,c,d ORDER BY c,d"
where table "t" has 1B rows, but there are only ~10k rows in the result
(which is fairly common in fact tables in star-schema DBs). IIUC the
optimization will ensure "c,d" is placed first in the GROUP BY, and then
we reorder "a,b" using ndistinct. But this may be much less efficient
than simply reordering (a,b,c,d) per ndistinct and then adding explicit
Sort on top of that (because on 10k rows that's going to be cheap).

So I think this somewhat contradicts the order-by-ndistinct optimization
and may easily undo its benefits.

The other "issue" I have with get_cheapest_group_keys_order() is how it
interacts with group_keys_reorder_by_pathkeys(). I mean, we first call

  n_preordered = group_keys_reorder_by_pathkeys(path->pathkeys, ...);

so the reordering tries to maintain ordering of the input path. Then if
(n_preordered == 0) we do this:

  group_keys_reorder_by_pathkeys(root->sort_pathkeys, ...);

Doesn't that mean that if we happen to have input path with partially
matching ordering (so still requiring explicit Sort before grouping) we
may end up ignoring the ORDER BY entirely? Instead of using Sort that
would match the ORDER BY? I might have misunderstood this, though.

I'm not sure why the first group_keys_reorder_by_pathkeys() call does
not simply return 0 when the input path ordering is not sufficient for
the grouping? Is there some reasoning behind that (e.g. that sorting
partially sorted is faster)?

Overall I think this patch introduces four different optimizations that
are indeed very interesting:

1) consider additional sorted paths for grouping

2) reorder GROUP BY clauses per ndistinct to minimize comparisons

3) when adding Sort for grouping, maintain partial input ordering

4) when adding Sort for grouping, try producing the right output order
   (if the ORDER BY was specified)

But at this point it's kinda mashed together in ad-hoc manner, using
very simple heuristics / assumptions. We'll need to figure out how to
combine those optimizations.

BTW I get compiler warnings that n_preordered_groups may be used
uninitialized in add_paths_to_grouping_rel, because it's only set in an
if/else branch, and then used further down.

> Next, I had a look on cost_incremental_sort() provided by
> incremental sort patch and, it's a pity, it doesn't solve our problem
> with the impact of the cost of per-column comparison function and
> number of its calls.

It currently doesn't, but that might be more an issue in the incremental
sort patch and we may need to fix that. Clearly the costing in general
in that patch needs more work, and I do recall Tom pointing out that the
current heuristics (estimating number of sort groups using ndistincts)
seems rather weak.

It may not be exactly the same problem, but it seems kinda similar to
what this patch does. I have a hunch that those patches will end up
inventing something fairly similar.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:
On 06/09/2018 08:09 PM, Tomas Vondra wrote:
> 
> /snip/
> 
> 4) when adding Sort for grouping, try producing the right output order
>    (if the ORDER BY was specified)
> 

BTW I've just realized we already do something similar in master. If you
run a query like this:

  SELECT a, b, count(*) FROM t GROUP BY b, a ORDER BY a;

we will actually plan it like this:

          QUERY PLAN
  ---------------------------
   GroupAggregate
     Group Key: a, b
     ->  Sort
           Sort Key: a, b
           ->  Seq Scan on t
  (5 rows)

I.e. we already do reorder the group clauses to match ORDER BY, to only
require a single sort. This happens in preprocess_groupclause(), which
also explains the reasoning behind that.

I wonder if some of the new code reordering group pathkeys could/should
be moved here (not sure, maybe it's too early for those decisions). In
any case, it might be appropriate to update some of the comments before
preprocess_groupclause() which claim we don't do certain things added by
the proposed patches.

This probably also somewhat refutes my claim that the order of grouping
keys is currently fully determined by users (and so they may pick the
most efficient order), while the reorder-by-ndistinct patch would make
that impossible. Apparently when there's ORDER BY, we already mess with
the order of group clauses - there are ways to get around it (subquery
with OFFSET 0) but it's much less clear.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:
> I see v9 is now calling estimate_num_groups(), so it already benefits
> from extended statistics. Nice! I think the algorithm is more or less
> the greedy option I proposed in one of the earlier messages - I don't
> know if it's sufficient or not, but the only other idea I have is
> essentially an exhaustive search through all possible permutations.

Trying all possible permutations could be slow with an extremely large GROUP BY 
clause, so we would need some stop limit - like join_collapse_limit. I don't like 
this idea.

> So that's a nice improvement, although I think we should also consider
> non-uniform distributions (using the per-column MCV lists).
Could you clarify how to do that?

> 
>> 2) Planing cost is the same for all queries. So, cost_sort() doesn't
>> take into account even number of columns.
>>
> 
> Yeah. As I wrote before, I think we'll need to come up with a model for
> comparison_cost depending on the column order, which should make costs
> for those paths different.
Yeah. But I still think it should be a separate patch, which would improve cost 
estimation and plan choice in other parts of the optimizer too.

> OK. I think we should also have a debug GUC for the first patch (i.e.
> one that would allow reordering group_pathkeys to match the input path).
Added to 0002-opt_group_by_index_and_order-v10.patch: 
debug_group_by_reorder_by_pathkeys.

> Regarding the two GUCs introduced in v9, I'm not sure I 100% understand
> what they are doing. Let me try explaining what I think they do:
> 
> 1) debug_cheapest_group_by - Reorder GROUP BY clauses to using ndistinct
> stats for columns, placing columns with more distinct values first (so
> that we don't need to compare the following columns as often).
yes

> 
> 2) debug_group_by_match_order_by - Try to reorder the GROUP BY clauses
> to match the ORDER BY, so that the group aggregate produces results in
> the desired output (and we don't need an explicit Sort).
yes

> FWIW I doubt about the debug_group_by_match_order_by optimization. The
> trouble is that it's only based on comparing the pathkeys lists, and
> entirely ignores that the two result sets may operate on very different
> numbers of rows.
> 
> Consider for example "SELECT * FROM t GROUP BY a,b,c,d ORDER BY c,d"
> where table "t" has 1B rows, but there are only ~10k rows in the result
> (which is fairly common in fact tables in star-schema DBs). IIUC the
> optimization will ensure "c,d" is placed first in the GROUP BY, and then
> we reorder "a,b" using ndistinct. But this may be much less efficient
> than simply reordering (a,b,c,d) per ndistinct and then adding explicit
> Sort on top of that (because on 10k rows that's going to be cheap).
> 
As you pointed out in the next email, this optimization is already implemented in 
preprocess_groupclause(). And this issue can only be resolved correctly after 
implementing a correct cost_sort() - one which pays attention to comparison cost, 
the number of distinct values in columns, and how often the second and subsequent 
columns actually need to be compared.

> 
> The other "issue" I have with get_cheapest_group_keys_order() is how it
> interacts with group_keys_reorder_by_pathkeys(). I mean, we first call
> 
>    n_preordered = group_keys_reorder_by_pathkeys(path->pathkeys, ...);
> 
> so the reordering tries to maintain ordering of the input path. Then if
> (n_preordered == 0) we do this:
> 
>    group_keys_reorder_by_pathkeys(root->sort_pathkeys, ...);
> 
> Doesn't that mean that if we happen to have input path with partially
> matching ordering (so still requiring explicit Sort before grouping) we
> may end up ignoring the ORDER BY entirely? Instead of using Sort that
> would match the ORDER BY? I might have misunderstood this, though.
Hm. I believe sorting the input of the GROUP BY clause is always more expensive - 
just because the output of the GROUP BY clause should have fewer rows than its 
input, which means a cheaper sort. So here it uses ORDER BY only if we don't 
have group pathkey(s) matched by the path pathkeys.

> 
> I'm not sure why the first group_keys_reorder_by_pathkeys() call does
> not simply return 0 when the input path ordering is not sufficient for
> the grouping? Is there some reasoning behind that (e.g. that sorting
> partially sorted is faster)?
Hm, what do you mean by sufficient? group_keys_reorder_by_pathkeys() always tries 
to order GROUP BY columns by preexisting pathkeys. But of course, incremental 
sort will add more benefits here. I'd like to hope that the incremental sort patch 
is close to being committed.

> 
> Overall I think this patch introduces four different optimizations that
> are indeed very interesting:
> 
> 1) consider additional sorted paths for grouping
Agree
> 
> 2) reorder GROUP BY clauses per ndistinct to minimize comparisons
Agree (with the help of estimate_num_groups(); the patch tries to maximize the 
number of groups at each step)

> 3) when adding Sort for grouping, maintain partial input ordering
Agree - and the incremental sort patch will be helpful here.

> 4) when adding Sort for grouping, try producing the right output order
>     (if the ORDER BY was specified)
Yes.

> 
> But at this point it's kinda mashed together in ad-hoc manner, using
> very simple heuristics / assumptions. We'll need to figure out how to
> combine those optimizations.
The heuristics used here are:
1) if there is some existing order - let's use it;
2) if there is a required order - let's match that order to prevent an extra sort node.

The incremental sort patch will improve cases where there is only a partial match of the order.

> 
> BTW I get compiler warnings that n_preordered_groups may be used
> uninitialized in add_paths_to_grouping_rel, because it's only set in an
> if/else branch, and then used further down.
Fixed, but I believe the compiler is not smart enough here.



-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachments

Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:
> I.e. we already do reorder the group clauses to match ORDER BY, to only
> require a single sort. This happens in preprocess_groupclause(), which
> also explains the reasoning behind that.
Huh. I missed that. That means the group_keys_reorder_by_pathkeys() call inside 
get_cheapest_group_keys_order() duplicates the work of preprocess_groupclause().
So it's possible to replace it with simply counting the number of matching 
pathkeys at the beginning of the lists. Or to remove most of preprocess_groupclause() 
- but it seems to me we should use the first option: preprocess_groupclause() is 
simpler, since it operates only on SortGroupClause rather than on pathkeys.

BTW, the incremental sort patch provides pathkeys_common(), exactly what we need.

> I wonder if some of the new code reordering group pathkeys could/should
> be moved here (not sure, maybe it's too early for those decisions). In
> any case, it might be appropriate to update some of the comments before
> preprocess_groupclause() which claim we don't do certain things added by
> the proposed patches.

preprocess_groupclause() is too early a step to use something like 
group_keys_reorder_by_pathkeys(), because the pathkeys are not known at that point yet.


> This probably also somewhat refutes my claim that the order of grouping
> keys is currently fully determined by users (and so they may pick the
> most efficient order), while the reorder-by-ndistinct patch would make
> that impossible. Apparently when there's ORDER BY, we already mess with
> the order of group clauses - there are ways to get around it (subquery
> with OFFSET 0) but it's much less clear.

I like a moment when objections go away :)

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:

On 06/13/2018 06:56 PM, Teodor Sigaev wrote:
>> I.e. we already do reorder the group clauses to match ORDER BY, to only
>> require a single sort. This happens in preprocess_groupclause(), which
>> also explains the reasoning behind that.
> Huh. I missed that. That means group_keys_reorder_by_pathkeys() call
> inside get_cheapest_group_keys_order() duplicates work of
> preprocess_groupclause().
>  And so it's possible to replace it to simple counting number of the
> same pathkeys in beginning of lists. Or remove most part of
> preprocess_groupclause() - but it seems to me we should use first
> option, preprocess_groupclause() is simpler, it doesn't operate with
> pathkeys only with  SortGroupClause which is simpler.
> 
> BTW, incremental sort path provides pathkeys_common(), exactly what we
> need.
> 
>> I wonder if some of the new code reordering group pathkeys could/should
>> be moved here (not sure, maybe it's too early for those decisions). In
>> any case, it might be appropriate to update some of the comments before
>> preprocess_groupclause() which claim we don't do certain things added by
>> the proposed patches.
> 
> preprocess_groupclause() is too early step to use something like
> group_keys_reorder_by_pathkeys() because pathkeys is not known here yet.
> 
> 
>> This probably also somewhat refutes my claim that the order of
>> grouping keys is currently fully determined by users (and so they
>> may pick the most efficient order), while the reorder-by-ndistinct
>> patch would make that impossible. Apparently when there's ORDER BY,
>> we already mess with the order of group clauses - there are ways to
>> get around it (subquery with OFFSET 0) but it's much less clear.
> 
> I like a moment when objections go away :)
> 

I'm afraid that's a misunderstanding - my objections did not really go
away. I'm just saying my claim that we're not messing with order of
grouping keys is not entirely accurate, because we do that in one
particular case.

I still think we need to be careful when introducing new optimizations
in this area - reordering the grouping keys by ndistinct, ORDER BY or
whatever. In particular I don't think we should commit these patches
that may quite easily cause regressions, and then hope some hypothetical
future patch fixes the costing. Those objections still stand.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:
> I'm afraid that's a misunderstanding - my objections did not really go
> away. I'm just saying my claim that we're not messing with order of
> grouping keys is not entirely accurate, because we do that in one
> particular case.
> 
> I still think we need to be careful when introducing new optimizations
> in this area - reordering the grouping keys by ndistinct, ORDER BY or
> whatever. In particular I don't think we should commit these patches
> that may quite easily cause regressions, and then hope some hypothetical
> future patch fixes the costing. Those objections still stand.

Please have a look at https://commitfest.postgresql.org/18/1706/
I tried to attack the cost_sort() issues there, and I hope that on that basis we can 
solve the problems with the 0002 patch and improve the incremental sort patch.

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:

On 06/28/2018 06:52 PM, Teodor Sigaev wrote:
>> I'm afraid that's a misunderstanding - my objections did not really go
>> away. I'm just saying my claim that we're not messing with order of
>> grouping keys is not entirely accurate, because we do that in one
>> particular case.
>>
>> I still think we need to be careful when introducing new optimizations
>> in this area - reordering the grouping keys by ndistinct, ORDER BY or
>> whatever. In particular I don't think we should commit these patches
>> that may quite easily cause regressions, and then hope some hypothetical
>> future patch fixes the costing. Those objections still stand.
> 
> Pls, have a look at https://commitfest.postgresql.org/18/1706/
> I tried to attack the cost_sort() issues and hope on that basis we can 
> solve problems with 0002 patch and improve incremental sort patch.
> 

OK, will do. Thanks for working on this!

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: POC: GROUP BY optimization

From
Teodor Sigaev
Date:
>> I tried to attack the cost_sort() issues and hope on that basis we can solve 
>> problems with 0002 patch and improve incremental sort patch.
>>
> 
> OK, will do. Thanks for working on this!

I hope we now have a better cost_sort(). The obvious way is to try all 
combinations of pathkeys in get_cheapest_group_keys_order() and choose the cheapest 
one by cost_sort(). But that requires N! operations, could potentially be very 
expensive with a large number of pathkeys, and doesn't solve the issue with 
user-knows-what-he-does pathkeys. We could suggest an order of pathkeys as the 
patch does now, and if the cost_sort() estimate is less than 80% (arbitrarily 
chosen) of the cost of the user-suggested pathkeys then use ours, else use the user's pathkeys.
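
A trivial sketch of that rule, just to make the proposal explicit (illustrative
Python; choose_order and the 0.8 threshold are placeholders, not patch code):

def choose_order(user_cost, suggested_cost, threshold=0.80):
    # Keep the user's GROUP BY order unless our suggested order is clearly
    # (more than 20%) cheaper according to cost_sort()-style estimates.
    return "suggested" if suggested_cost < threshold * user_cost else "user"

print(choose_order(user_cost=1000.0, suggested_cost=850.0))   # -> user
print(choose_order(user_cost=1000.0, suggested_cost=700.0))   # -> suggested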


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:
On 06/29/2018 04:51 PM, Teodor Sigaev wrote:
> 
>>> I tried to attack the cost_sort() issues and hope on that basis we 
>>> can solve problems with 0002 patch and improve incremental sort patch.
>>>
>>
>> OK, will do. Thanks for working on this!
> 
> I hope, now we have a better cost_sort(). The obvious way is a try all 
> combination of pathkeys in get_cheapest_group_keys_order() and choose 
> cheapest one by cost_sort().

> But it requires N! operations and potentially could be very
> expensive in case of large number of pathkeys and doesn't solve the
> issue with user-knows-what-he-does pathkeys.

Not sure. There are N! combinations, but this seems like a good 
candidate for backtracking [1]. You don't have to enumerate and evaluate 
all N! combinations, just construct one and then abandon whole classes 
of combinations as soon as they get more expensive than the currently 
best one. That's thanks to the additive nature of the comparison costing, 
because appending a column to the sort key can only make it more 
expensive. My guess is this will make this a non-issue.

[1] https://en.wikipedia.org/wiki/Backtracking
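
For illustration, here is a minimal sketch of that backtracking idea (standalone
Python with a made-up additive cost function standing in for cost_sort(); none of
the names come from the patch):

def prefix_cost(order, ndistinct, op_cost, tuples):
    # Toy additive cost: each appended column adds comparisons weighted by how
    # often the previous columns tie (roughly 1 / ndistinct of the prefix).
    cost, groups = 0.0, 1.0
    for col in order:
        cost += tuples * op_cost[col] / groups
        groups = min(groups * ndistinct[col], tuples)
    return cost

def best_order(columns, ndistinct, op_cost, tuples):
    best_cost, best_cols = float("inf"), None

    def recurse(order, remaining, cost_so_far):
        nonlocal best_cost, best_cols
        if cost_so_far >= best_cost:      # prune: appending columns only adds cost
            return
        if not remaining:
            best_cost, best_cols = cost_so_far, order
            return
        for col in sorted(remaining):
            new_order = order + [col]
            recurse(new_order, remaining - {col},
                    prefix_cost(new_order, ndistinct, op_cost, tuples))

    recurse([], frozenset(columns), 0.0)
    return best_cost, best_cols

nd = {"a": 2, "b": 1000, "c": 10}
oc = {"a": 1.0, "b": 1.0, "c": 1.0}
print(best_order(["a", "b", "c"], nd, oc, tuples=1_000_000))

How much of the N! space the pruning actually cuts off in practice of course
depends on how quickly a good ordering is found first.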

>
> We could suggest an order of pathkeys as patch suggests now and if 
> cost_sort() estimates cost is less than 80% (arbitrary chosen) cost
> of user-suggested pathkeys then it use our else user pathkeys.
> 

I really despise such arbitrary thresholds. I'd much rather use a more 
reliable heuristic by default, even if it gets it wrong in some cases 
(which it will, but that's natural).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: POC: GROUP BY optimization

From
Gavin Flower
Date:
On 30/06/18 03:03, Tomas Vondra wrote:
> On 06/29/2018 04:51 PM, Teodor Sigaev wrote:
>>
>>>> I tried to attack the cost_sort() issues and hope on that basis we 
>>>> can solve problems with 0002 patch and improve incremental sort patch.
>>>>
>>>
>>> OK, will do. Thanks for working on this!
>>
>> I hope, now we have a better cost_sort(). The obvious way is a try 
>> all combination of pathkeys in get_cheapest_group_keys_order() and 
>> choose cheapest one by cost_sort().
>
>> But it requires N! operations and potentially could be very
>> expensive in case of large number of pathkeys and doesn't solve the
>> issue with user-knows-what-he-does pathkeys.
>
> Not sure. There are N! combinations, but this seems like a good 
> candidate for backtracking [1]. You don't have to enumerate and 
> evaluate all N! combinations, just construct one and then abandon 
> whole classes of combinations as soon as they get more expensive than 
> the currently best one. That's thanks to additive nature of the 
> comparison costing, because appending a column to the sort key can 
> only make it more expensive. My guess is this will make this a non-issue.
>
> [1] https://en.wikipedia.org/wiki/Backtracking
>
>>
>> We could suggest an order of pathkeys as patch suggests now and if 
>> cost_sort() estimates cost is less than 80% (arbitrary chosen) cost
>> of user-suggested pathkeys then it use our else user pathkeys.
>>
>
> I really despise such arbitrary thresholds. I'd much rather use a more 
> reliable heuristics by default, even if it gets it wrong in some cases 
> (which it will, but that's natural).
>
> regards
>
Additionally put an upper limit threshold on the number of combinations 
to check, fairly large by default?

If the first threshold is exceeded, we could consider checking a few more 
orderings selected at random from those not yet checked, to avoid any bias caused 
by stopping a systematic search. This might prove important when N! is 
fairly large.
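
A rough sketch of that combination (illustrative Python; the limits are arbitrary
and the generator name is made up):

import random
from itertools import islice, permutations

def candidate_orders(columns, systematic_limit=1000, random_samples=20):
    # Enumerate the first systematic_limit orderings, then sample a few random
    # permutations from the rest so very large N! spaces still get some coverage.
    systematic = list(islice(permutations(columns), systematic_limit))
    yield from systematic
    seen = set(systematic)
    for _ in range(random_samples):
        perm = tuple(random.sample(columns, len(columns)))
        if perm not in seen:
            seen.add(perm)
            yield perm

cols = list("abcdefgh")                       # 8! = 40320 possible orderings
print(sum(1 for _ in candidate_orders(cols))) # roughly 1000 + 20 candidates costed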


Cheers,
Gavin



Re: POC: GROUP BY optimization

From
Michael Paquier
Date:
On Sat, Jun 30, 2018 at 09:19:13AM +1200, Gavin Flower wrote:
> Additionally put an upper limit threshold on the number of combinations to
> check, fairly large by default?
>
> If first threshold is exceeded, could consider checking out a few more
> selected at random from paths not yet checked, to avoid any bias caused by
> stopping a systematic search.  This might prove important when N! is fairly
> large.

Please note that the latest patch available does not apply, so this has
been moved to next CF 2018-11, waiting for input from its author.
--
Michael

Attachments

Re: POC: GROUP BY optimization

From
Dmitry Dolgov
Date:
> On Tue, Oct 2, 2018 at 4:16 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Jun 30, 2018 at 09:19:13AM +1200, Gavin Flower wrote:
> > Additionally put an upper limit threshold on the number of combinations to
> > check, fairly large by default?
> >
> > If first threshold is exceeded, could consider checking out a few more
> > selected at random from paths not yet checked, to avoid any bias caused by
> > stopping a systematic search.  This might prove important when N! is fairly
> > large.
>
> Please note that the latest patch available does not apply, so this has
> been moved to next CF 2018-11, waiting for input from its author.

Unfortunately, the patch still has some conflicts; could you please rebase it?


Re: POC: GROUP BY optimization

From
Andres Freund
Date:
On 2018-11-29 17:56:53 +0100, Dmitry Dolgov wrote:
> > On Tue, Oct 2, 2018 at 4:16 AM Michael Paquier <michael@paquier.xyz> wrote:
> >
> > On Sat, Jun 30, 2018 at 09:19:13AM +1200, Gavin Flower wrote:
> > > Additionally put an upper limit threshold on the number of combinations to
> > > check, fairly large by default?
> > >
> > > If first threshold is exceeded, could consider checking out a few more
> > > selected at random from paths not yet checked, to avoid any bias caused by
> > > stopping a systematic search.  This might prove important when N! is fairly
> > > large.
> >
> > Please note that the latest patch available does not apply, so this has
> > been moved to next CF 2018-11, waiting for input from its author.
> 
> Unfortunately, patch still has some conflicts, could you please rebase it?

As nothing has happened since, I'm marking this as returned with
feedback.

Greetings,

Andres Freund


Re: POC: GROUP BY optimization

From
Dmitry Dolgov
Date:
> On Thu, Jan 31, 2019 at 12:24 PM Andres Freund <andres@anarazel.de> wrote:
>
> As nothing has happened since, I'm marking this as returned with
> feedback.

This patch was on my radar for some time in the past and we've seen use cases
where it could be pretty useful (probably even without the incremental sort
patch). I would like to make some progress here and see if it's possible to
continue its development. I've attached the rebased version with some small
changes, e.g. I've created a separate patch with the group by reordering tests to
make it easy to see what changes were introduced, and after some experiments
removed the part that seems to duplicate "group by" reordering to follow "order
by". It also looks like it's possible to make these patches independent by having
a base patch with the isolated group_keys_reorder_by_pathkeys (they're
connected via n_preordered), but I haven't done this yet.

I went through the thread to summarize the objections that were mentioned so
far. Most of them are related to the third patch in the series, where
reordering based on "ndistinct" is implemented, and are about cost_sort (all
the possible problems that could happen without proper cost estimation due to
non-uniform distribution, different comparison costs and so on) and figuring
out how to limit the number of possible combinations of pathkeys to compare. I
haven't looked at the proposed backtracking approach, but taking into account
that the suggested patch for cost_sort [1] is RWF, I wonder what would be the
best strategy to proceed?

[1]: https://commitfest.postgresql.org/21/1706/

Attachments

Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:
On Thu, Apr 04, 2019 at 05:11:09PM +0200, Dmitry Dolgov wrote:
>> On Thu, Jan 31, 2019 at 12:24 PM Andres Freund <andres@anarazel.de> wrote:
>>
>> As nothing has happened since, I'm marking this as returned with
>> feedback.
>
>This patch was on my radar for some time in the past and we've seen use cases
>where it could be pretty useful (probably even without the incremental sort
>patch). I would like to make some progress here and see if it's possible to
>continue it's development. I've attached the rebased version with a small
>changes, e.g. I've created a separate patch with group by reordering tests to
>make it easy to see what changes were introduced, and after some experiments
>removed part that seems to duplicate "group by" reordering to follow "order
>by". Also looks like it's possible to make these patches independent by having
>a base patch with the isolated group_keys_reorder_by_pathkeys (they're
>connected via n_preordered), but I haven't done this yet.
>
>I went through the thread to summarize the objections, that were mentioned so
>far. Most of them are related to the third patch in the series, where
>reordering based on "ndistincs" is implemented, and are about cost_sort (all
>the possible problems that could happen without proper cost estimation due to
>non uniform distribution, different comparison costs and so on) and figuring
>out how to limit number of possible combinations of pathkeys to compare. I
>haven't looked at the proposed backtracking approach, but taking into account
>that suggested patch for cost_sort [1] is RWF, I wonder what would be the best
>strategy to proceed?
>
>[1]: https://commitfest.postgresql.org/21/1706/

Dunno. It seems the progress on the sort-related patches was rather limited
in the PG12 cycle in general :-( There's the Incremental Sort patch, GROUP
BY optimization and then the cost_sort patch.

Not sure about the best strategy, though. One obvious option is to rely on
cost_sort patch to do all the improvements needed for the other patches,
but that assumes that patch moves reasonably fast.

So I personally would suggest to treat those patches as independent until
the very last moment, develop the costing improvements needed by each
of them, and then decide which of them are committable / in what order.

At the end of the PG11 cycle I offered my help with testing / reviewing
those patches, if there is progress. That still holds - if there are new
patch versions I'll look at them.

cheers

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: POC: GROUP BY optimization

From
Dmitry Dolgov
Date:
> On Tue, Apr 9, 2019 at 5:21 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> So I personally would suggest to treat those patches as independent until
> the very last moment, develop the costing improvements needed by each
> of them, and then decide which of them are committable / in what order.

I had the same idea, but judging from the questions raised in this thread,
it's quite hard to go with reordering based only on the frequency of values. I
hoped that the cost_sort improvement patch would be simple enough to
incorporate here, but of course it wasn't. Under the assumption that the
amount of work required for sorting depends only on the number of distinct
groups and how costly it is to compare values of the data type,
I've ended up extracting the get_width_multiplier and get_func_cost parts from
the cost_sort patch and including them in 0003-Reorder-by-values-distribution.
This allows taking into account situations where we compare e.g. long strings
or a custom data type with a high procost for comparison (but I've used these
values directly, without any adjusting coefficients yet).

> On Wed, Jun 13, 2018 at 6:41 PM Teodor Sigaev <teodor@sigaev.ru> wrote:
>
> > So that's a nice improvement, although I think we should also consider
> > non-uniform distributions (using the per-column MCV lists).
>
> Could you clarify how to do that?

Since I'm not familiar with this topic, I would like to ask the same question:
how to do that, and what are the advantages?

> On Sat, Jun 16, 2018 at 5:59 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> I still think we need to be careful when introducing new optimizations
> in this area - reordering the grouping keys by ndistinct, ORDER BY or
> whatever. In particular I don't think we should commit these patches
> that may quite easily cause regressions, and then hope some hypothetical
> future patch fixes the costing.

I'm a bit concerned about this part of the discussion. There is an idea running
through the whole thread about avoiding the situation where a user knows which
order is better and we generate a different one by mistake. From what I see right
now, even if all the objections were addressed, there is a chance that some
coefficients will not be good enough (e.g. the width multiplier is based on an
average width, or it can happen that all the compared strings differ right at the
beginning) and the chosen order will not be optimal. Does that mean that in any
case the implementation of such an optimization should provide a way to
override it?

Attachments

Re: POC: GROUP BY optimization

From
Tomas Vondra
Date:
On Fri, May 03, 2019 at 10:28:21PM +0200, Dmitry Dolgov wrote:
>> On Tue, Apr 9, 2019 at 5:21 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> So I personally would suggest to treat those patches as independent until
>> the very last moment, develop the costing improvements needed by each
>> of them, and then decide which of them are committable / in what order.
>
>I had the same idea, but judging from the questions, raised in this thread,
>it's quite hard to go with reordering based only on frequency of values. I
>hoped that the cost_sort improvement patch would be simple enough to
>incorporate it here, but of course it wasn't. Having an assumption, that the
>amount of work, required for performing sorting, depends only on the number of
>distinct groups and how costly it is to compare a values of this data type,
>I've ended up extracting get_width_multiplier and get_func_cost parts from
>cost_sort patch and including them into 0003-Reorder-by-values-distribution.
>This allows to take into account situations when we compare e.g. long strings
>or a custom data type with high procost for comparison (but I've used this
>values directly without any adjusting coefficients yet).
>
>> On Wed, Jun 13, 2018 at 6:41 PM Teodor Sigaev <teodor@sigaev.ru> wrote:
>>
>> > So that's a nice improvement, although I think we should also consider
>> > non-uniform distributions (using the per-column MCV lists).
>>
>> Could you clarify how to do that?
>
>Since I'm not familiar with this topic, I would like to ask the same question,
>how to do that and what are the advantages?
>

I don't recall the exact details of the discussion, since most of it
happened almost a year ago, but the main concern about the original
costing approach is that it very much assumes uniform distribution.

For example if you have N tuples with M groups (which are not computed
using estimate_num_groups IIRC, and we can hardly do much better), then
the patch assumed each group is ~N/M rows and used that for computing
the number of comparisons for a particular sequence of ORDER BY clauses.

But that may result in pretty significant errors in cases with a couple
of very large groups. But hey - we can get some of that information from
MCV lists, so maybe we can compute a safer, less-optimistic estimate.

I mean, we can't compute a "perfect" estimate of how many comparisons
we'll have to do, because we only have per-column MCV lists and maybe
some multi-column MCV lists (but definitely not on all combinations of
columns in the ORDER BY clause).

But what I think we could do is using largest possible group instead of
the average one. So for example when estimating number of comparisons
for columns (c1,..,cN), you could look at MCV lists for these columns
and compute

    m(i) = Max(largest group in MCV list for i-th column)

and then use Min(m(1), ..., m(k)) when estimating the number of
comparisons.

Of course, this is likely to be a worst-case estimate, and it's not
quite obvious that comparing worst-case estimates is necessarily safer
than comparing the regular ones. So I'm not sure about it.

It might be better to just compute the 'average group' in a different
way, not by arithmetic mean, but say as geometric mean from each MCV
list. Not sure.
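
To make the arithmetic concrete, here is a standalone sketch of the
"worst-case group" idea (this is not planner code; the frequencies in
main() are made up):

    /*
     * For each column, take the largest MCV frequency and turn it into a
     * group size; the smallest of those per-column maxima is a conservative
     * estimate of the largest group for the whole column combination.
     */
    #include <stdio.h>

    static double
    worst_case_group_size(const double *max_mcv_freq, int ncolumns, double ntuples)
    {
        double      result = ntuples;   /* no MCV info => assume one huge group */
        int         i;

        for (i = 0; i < ncolumns; i++)
        {
            double      group = max_mcv_freq[i] * ntuples;  /* m(i) */

            if (group < result)
                result = group;         /* Min(m(1), ..., m(k)) */
        }

        return result;
    }

    int
    main(void)
    {
        /* made-up largest MCV frequencies for three columns */
        double      freqs[3] = {0.30, 0.04, 0.15};

        printf("largest group: ~%.0f rows\n",
               worst_case_group_size(freqs, 3, 10000000.0));   /* ~400000 */
        return 0;
    }

Taking the minimum across columns is what makes this an upper bound for the
combination - a group defined by all k columns can't be larger than the
largest group of its most selective column (assuming the MCV lists actually
capture the most common values).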

I guess this needs some research and experimentation - constructing
cases that are likely to cause problems (non-uniform distributions,
columns with a small number of exceptionally large groups, ...) and then
showing how the costing deals with those.

FWIW I've mentioned MCV lists in the incrementalsort thread too, in
which case it was much clearer how to use them to improve the startup
cost estimate - it's enough to look at the first group in per-column MCV
lists (first in the ORDER BY sense, not by frequency).

>> On Sat, Jun 16, 2018 at 5:59 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> I still think we need to be careful when introducing new optimizations
>> in this area - reordering the grouping keys by ndistinct, ORDER BY or
>> whatever. In particular I don't think we should commit these patches
>> that may quite easily cause regressions, and then hope some hypothetical
>> future patch fixes the costing.
>
>I'm a bit concerned about this part of the discussion. There is an idea through
>the whole thread about avoiding the situation, when a user knows which order is
>better and we generate different one by mistake. From what I see right now even
>if all the objections would be addressed, there is a chance that some
>coefficients will be not good enough (e.g. width multiplier is based on an
>average width, or it can suddenly happen that all the compared string have some
>difference at the very beginning) and the chosen order will be not optimal.
>Does it mean that in any case the implementation of such optimization should
>provide a way to override it?

I don't think we have to provide an override no matter what. Otherwise
we'd have to have an override for everything, because no optimizer
decision is perfect - it's all built on estimates, after all.

But I assume we agree that optimizations based on estimates that are
known to be unreliable are bound to be unreliable too. And optimizations
that regularly misfire for common data sets may easily do more harm than
good (and maybe should not be called optimizations at all).

For example, the first patch relied on ndistinct estimates quite a bit
and used them to compute multi-column estimates, which we already know
is rather poor for GROUP BY estimates. The v9 uses estimate_num_groups()
which works better because it leverages extended stats, when available.
That's good, but we need to see how the rest of the costing works.

So I think it very much depends on how reliable the costing can be made,
and how bad consequences a poor choice can have.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: POC: GROUP BY optimization

От
Dmitry Dolgov
Дата:
> On Fri, May 3, 2019 at 11:55 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> I don't recall the exact details of the discussion, since most of it
> happened almost a year ago, but the main concern about the original
> costing approach is that it very much assumes uniform distribution.
>
> For example if you have N tuples with M groups (which are not computed
> using estimate_num_groups IIRC, and we can hardly do much better), then
> the patch assumed each groups is ~N/M rows and used that for computing
> the number of comparisons for a particular sequence of ORDER BY clauses.
>
> But that may result in pretty significant errors in causes with a couple
> of very large groups.

Yes, you're right, the current version of the patch assumes a uniform
distribution of values between these M groups. After some thinking I've got an
idea that maybe it's better not to try to find out what would be acceptable for
both uniform and non-uniform distributions, but simply not to reorder at all if
there are any significant deviations from what seems to be the "best case",
namely when values are distributed among groups relatively uniformly and there
are no doubts about how complicated it is to compare them.

Saying that, it's possible on top of the current implementation to check:

* dependency coefficient between columns (if I understand correctly, non-uniform
  distributions of values between the groups, a.k.a. a few large groups, should
  be visible from extended statistics as a significant dependency)

* the largest/smallest group in the MCV list doesn't differ too much from the
  expected "average" group

* average width and comparison procost are similar

If everything fits (which I hope would be most of the time) - apply the
reordering, otherwise use whatever order was specified (which leaves room for
manual reordering in relatively rare situations). Does it make sense?
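
If it helps, a minimal standalone sketch of such a gate; the function name and
all the threshold values are made up purely for illustration, they are not part
of the patch:

    #include <stdbool.h>

    /* All thresholds below are placeholders, chosen only for illustration. */
    static bool
    safe_to_reorder_group_keys(double dependency_degree,    /* from extended stats */
                               double largest_group_ratio,  /* largest MCV group / average group */
                               double width_ratio,          /* widest avg width / narrowest avg width */
                               double procost_ratio)        /* most / least expensive comparison */
    {
        if (dependency_degree > 0.5)        /* columns look strongly correlated */
            return false;
        if (largest_group_ratio > 10.0)     /* distribution is far from uniform */
            return false;
        if (width_ratio > 4.0 || procost_ratio > 4.0)
            return false;                   /* comparison costs differ too much */

        return true;                        /* close enough to the "best case" */
    }

If any of the checks fails, we would simply keep the order written in the query.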

> But what I think we could do is using largest possible group instead of
> the average one. So for example when estimating number of comparisons
> for columns (c1,..,cN), you could look at MCV lists for these columns
> and compute
>
>     m(i) = Max(largest group in MCV list for i-th column)
>
> and then use Min(m(1), ..., m(k)) when estimating the number of
> comparisons.

I see the idea, but I'm a bit confused about how to get the largest group from
an MCV list. I mean, there is a list of the most common values per column with
frequencies, and similar stuff for multi-column statistics, but how do I get
the size of those groups?



Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On Fri, May 24, 2019 at 02:10:48PM +0200, Dmitry Dolgov wrote:
>> On Fri, May 3, 2019 at 11:55 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> I don't recall the exact details of the discussion, since most of it
>> happened almost a year ago, but the main concern about the original
>> costing approach is that it very much assumes uniform distribution.
>>
>> For example if you have N tuples with M groups (which are not computed
>> using estimate_num_groups IIRC, and we can hardly do much better), then
>> the patch assumed each groups is ~N/M rows and used that for computing
>> the number of comparisons for a particular sequence of ORDER BY clauses.
>>
>> But that may result in pretty significant errors in causes with a couple
>> of very large groups.
>
>Yes, you're right, the current version of the patch assumes uniform
>distribution of values between these M groups. After some thinking I've got an
>idea that maybe it's better to not try to find out what would be acceptable for
>both uniform and non uniform distributions, but just do not reorder at all if
>there are any significant deviations from what seems to be a "best case",
>namely when values distributed among groups relatively uniformly and there are
>no doubts about how complicated is to compare them.
>
>Saying that, it's possible on top of the current implementation to check:
>
>* dependency coefficient between columns (if I understand correctly, non
>  uniform distributions of values between the groups a.k.a few large groups
>  should be visible from an extended statistics as a significant dependency)
>
>* largest/smallest group in MCV doesn't differ too much from an expected
>  "average" group
>
>* average width and comparison procost are similar
>
>If everything fits (which I hope would be most of the time) - apply reorder,
>otherwise use whatever order was specified (which leaves the room for manual
>reordering for relatively rare situations). Does it makes sense?
>

I think those are interesting sources of information. I don't know if they
will be sufficient, but I think it's worth exploring.

One of the issues with this approach, however, will be selecting the
threshold values. That is, how do you decide whether a given
functional dependency is "too strong" or whether the largest/smallest MCV item
differs too much from the average one?

If you pick a particular threshold value, there'll be a sudden behavior
change at some point, and that's very annoying. For example, you might
pick 0.5 as a threshold for "strong" functional dependencies. There's
little reason why 0.4999 should be weak and 0.5001 should be "strong".

This may lead to sudden behavior changes after ANALYZE, for example.

So I think we need to find a way to make this part of the costing model,
which is how we deal with such cases elsewhere.

>> But what I think we could do is using largest possible group instead of
>> the average one. So for example when estimating number of comparisons
>> for columns (c1,..,cN), you could look at MCV lists for these columns
>> and compute
>>
>>     m(i) = Max(largest group in MCV list for i-th column)
>>
>> and then use Min(m(1), ..., m(k)) when estimating the number of
>> comparisons.
>
>I see the idea, but I'm a bit confused about how to get a largest group for a
>MCV list? I mean there is a list of most common values per column with
>frequencies, and similar stuff for multi columns statistics, but how to get a
>size of those groups?

You can just multiply the frequency by the number of rows in the table;
that gives you the size of the group. Or an estimate of it, because
there might be a filter condition involved.
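
For example (just putting numbers on it): with 10M rows in the table and a
largest MCV frequency of 0.04 for some column, the largest group for that
column would be estimated at roughly 400,000 rows, before accounting for any
filter.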


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
Hi,

I wonder if anyone has plans to try again with this optimization in v14
cycle? The patches no longer apply thanks to the incremental sort patch,
but I suppose fixing that should not be extremely hard.

The 2020-07 CF is still a couple weeks away, but it'd be good to know if
there are any plans to revive this. I'm willing to spend some time on
reviewing / testing this, etc.


I've only quickly skimmed the old thread, but IIRC there were two main
challenges in getting the optimization right:


1) deciding which orderings are interesting / worth additional work

I think we need to consider these orderings, in addition to the one
specified in GROUP BY:

1) as specified in ORDER BY (if different from 1)

2) one with cheapest sort on unsorted input (depending on number of
distinct values, cost of comparisons, etc.)

3) one with cheapest sort on partially sorted input (essentially what we
do with the incremental sort paths, but matching the pathkeys in a more
elaborate way)
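
For example (an illustration, not taken from the patch): with an index on (a)
and a query doing GROUP BY b, a, option 3 would prefer the (a, b) ordering,
because then only an incremental sort on b is needed on top of the index scan,
instead of a full sort of the whole input.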


2) costing the alternative orderings

I think we've already discussed various ways to leverage as much
available info as possible (extended stats, MCVs, ...) but I think the
patch only does some of it.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: POC: GROUP BY optimization

От
Dmitry Dolgov
Дата:
> On Fri, May 15, 2020 at 01:52:20AM +0200, Tomas Vondra wrote:
>
> I wonder if anyone has plans to try again with this optimization in v14
> cycle? The patches no longer apply thanks to the incremental sort patch,
> but I suppose fixing that should not be extremely hard.
>
> The 2020-07 CF is still a couple weeks away, but it'd be good to know if
> there are any plans to revive this. I'm willing to spend some time on
> reviewing / testing this, etc.

Yes, if you believe that this patch has potential, I would love to pick
it up again.

> I've only quickly skimmed the old thread, but IIRC there were two main
> challenges in getting the optimization right:
>
>
> 1) deciding which orderings are interesting / worth additional work
>
> I think we need to consider these orderings, in addition to the one
> specified in GROUP BY:
>
> 1) as specified in ORDER BY (if different from 1)

What is the idea behind considering this ordering?



Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On Sat, May 16, 2020 at 02:24:31PM +0200, Dmitry Dolgov wrote:
>> On Fri, May 15, 2020 at 01:52:20AM +0200, Tomas Vondra wrote:
>>
>> I wonder if anyone has plans to try again with this optimization in v14
>> cycle? The patches no longer apply thanks to the incremental sort patch,
>> but I suppose fixing that should not be extremely hard.
>>
>> The 2020-07 CF is still a couple weeks away, but it'd be good to know if
>> there are any plans to revive this. I'm willing to spend some time on
>> reviewing / testing this, etc.
>
>Yes, if you believe that this patch has potential, I would love to pick
>it up again.
>

I think it's worth another try. I've been reminded of this limitation
when working on the incremental sort, which significantly increases the
number of orderings that we can sort efficiently. But we can't possibly
leverage that unless it matches the GROUP BY ordering.

The attached WIP patch somewhat addresses this by trying to reorder the
group_pathkeys to allow leveraging an existing ordering (even just a
partial one, with incremental sort).

>> I've only quickly skimmed the old thread, but IIRC there were two main
>> challenges in getting the optimization right:
>>
>>
>> 1) deciding which orderings are interesting / worth additional work
>>
>> I think we need to consider these orderings, in addition to the one
>> specified in GROUP BY:
>>
>> 1) as specified in ORDER BY (if different from 1)
>
>What is the idea behind considering this ordering?

I'll try to answer this in general, i.e. what I think this patch needs
to consider. Hopefully that'll answer why we should look at ORDER BY
pathkeys too ...

Reordering the pathkeys has both local and global effects, and we need
to consider all of that to make the optimization reliable, otherwise
we'll inevitably end up with trivial regressions.


The local effects are trivial - it's for example reordering the pathkeys
to make the explicit sort as cheap as possible. This thread already
discussed a number of things to base this on - ndistinct for columns,
cost of comparison function, ... In any case, this is something we can
decide locally, when building the grouping paths.


The global effects are much harder to tackle, because the decision can't
be made locally when building the grouping paths. It requires changes
both below and above the point where we build grouping paths.

An example of a decision we need to make before we even get to building
a grouping path is which index paths to build. Currently we only build
index paths with "useful" pathkeys, and without tweaking that we'll
never even see the index in add_paths_to_grouping_rel().

But there are also decisions that can be made only after we build the
grouping paths. For example, we may have both GROUP BY and ORDER BY, and
there is no "always correct" way to combine those. In some cases it may
be correct to use the same pathkeys, in other cases it's better to use
different ones (which will require an extra Sort, with additional cost).


So I don't think there will be a single "interesting" grouping pathkeys
(i.e. root->group_pathkeys), but a collection of pathkeys. And we'll
need to build grouping paths for all of those, and leave the planner to
eventually pick the one giving us the cheapest plan ...

A "brute-force" approach would be to generate all possible orderings of
group_pathkeys, but that's probably not going to fly - it might easily
cause an explosion in number of paths we track, etc. So we'll need to
pick/prioritize orderings that are more likely to give us efficient
plans, and ORDER BY seems like a good example because it means we won't
need an explicit sort.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: POC: GROUP BY optimization

От
Dmitry Dolgov
Дата:
> On Sat, May 16, 2020 at 04:56:09PM +0200, Tomas Vondra wrote:
>
> So I don't think there will be a single "interesting" grouping pathkeys
> (i.e. root->group_pathkeys), but a collection of pathkeys. And we'll
> need to build grouping paths for all of those, and leave the planner to
> eventually pick the one giving us the cheapest plan ...
>
> A "brute-force" approach would be to generate all possible orderings of
> group_pathkeys, but that's probably not going to fly - it might easily
> cause an explosion in number of paths we track, etc. So we'll need to
> pick/prioritize orderings that are more likely to give us efficient
> plans, and ORDER BY seems like a good example because it means we won't
> need an explicit sort.

Yes, I see. I've already rebased the original version of the patch and am now
working on your suggestions. In the meantime, one question:

> But there are also decisions that can be made only after we build the
> grouping paths. For example, we may have both GROUP BY and ORDER BY, and
> there is no "always correct" way to combine those. In some cases it may
> be correct to use the same pathkeys, in other cases it's better to use
> different ones (which will require an extra Sort, with additional cost).

Do I understand correctly that your idea is that in some cases it's cheaper
to use a different order for GROUP BY than for ORDER BY, even with an extra
sort? The current patch implementation assumes that it's always better to
match the order of pathkeys for both GROUP BY and ORDER BY (which means that
the only degree of freedom there is to reorder the tail, which in turn makes
both the "unsorted input" and "partially sorted input" cases from your
original email essentially the same). Can you show an example where this
assumption is invalid?

> I've only quickly skimmed the old thread, but IIRC there were two main
> challenges in getting the optimization right:
>
> 1) deciding which orderings are interesting / worth additional work
>
> I think we need to consider these orderings, in addition to the one
> specified in GROUP BY:

Yes, it looks like the current patch implementation, together with
preprocess_groupclause, already implements this, although maybe not that
flexibly.

> 2) costing the alternative orderings
>
> I think we've already discussed various ways to leverage as much
> available info as possible (extended stats, MCVs, ...) but I think the
> patch only does some of it.

Right, there were a couple of ideas about what to do in case of a few groups
that are so big they can invalidate the current assumptions about costs.
E.g. do not apply the reordering if we detect such a situation, or use a
"conservative" approach and take the biggest group for the estimates.



Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On Sat, Jun 20, 2020 at 04:30:10PM +0200, Dmitry Dolgov wrote:
>> On Sat, May 16, 2020 at 04:56:09PM +0200, Tomas Vondra wrote:
>>
>> So I don't think there will be a single "interesting" grouping pathkeys
>> (i.e. root->group_pathkeys), but a collection of pathkeys. And we'll
>> need to build grouping paths for all of those, and leave the planner to
>> eventually pick the one giving us the cheapest plan ...
>>
>> A "brute-force" approach would be to generate all possible orderings of
>> group_pathkeys, but that's probably not going to fly - it might easily
>> cause an explosion in number of paths we track, etc. So we'll need to
>> pick/prioritize orderings that are more likely to give us efficient
>> plans, and ORDER BY seems like a good example because it means we won't
>> need an explicit sort.
>
>Yes, I see. I've already rebased the original version of patch and now
>working on your suggestions. In the meantime one question:
>
>> But there are also decisions that can be made only after we build the
>> grouping paths. For example, we may have both GROUP BY and ORDER BY, and
>> there is no "always correct" way to combine those. In some cases it may
>> be correct to use the same pathkeys, in other cases it's better to use
>> different ones (which will require an extra Sort, with additional cost).
>
>Do I understand correctly, your idea is that in some cases it's cheaper
>to use different order for GROUP BY than in ORDER BY even with an extra
>sorting? In the current patch implementation there is an assumption that
>says it's always better to match the order of pathkeys for both GROUP
>BY/ORDER BY (which means that the only degree of freedom there is to
>reorder the tail, which in turn makes both "unsorted input" and
>"partially sorted input" cases from your original email essentially the
>same). Can you show such an example when this assumption is invalid?
>

Yes, that is mostly the point - I don't think we can assume simply using
the ORDER BY pathkeys (if specified) for grouping is optimal. As a
trivial counter-example, consider this

CREATE TABLE t (
   a int,
   b int,
   c int);

INSERT INTO t SELECT 1000 * random(), 1000 * random(), i
   FROM generate_series(1,10000000) s(i);

CREATE INDEX ON t (a, b, c);

VACUUM ANALYZE t;


And now some simple queries (you may need to tweak the planning options
a bit, to get these plans - disable hashagg etc.).


test=# explain analyze select a, b, count(c) from t group by a, b;
                                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
  GroupAggregate  (cost=0.43..389008.55 rows=1000018 width=16) (actual time=0.037..10509.505 rows=1001923 loops=1)
    Group Key: a, b
    ->  Index Only Scan using t_a_b_c_idx on t  (cost=0.43..304007.06 rows=10000175 width=12) (actual time=0.024..5189.908 rows=10000000 loops=1)
          Heap Fetches: 0
  Planning Time: 0.113 ms
  Execution Time: 10941.677 ms
(6 rows)


test=# explain analyze select a,b, count(c) from t group by a, b order by a, b;
                                                                    QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------
  GroupAggregate  (cost=0.43..389008.55 rows=1000018 width=16) (actual time=0.033..10606.925 rows=1001923 loops=1)
    Group Key: a, b
    ->  Index Only Scan using t_a_b_c_idx on t  (cost=0.43..304007.06 rows=10000175 width=12) (actual time=0.020..5240.332 rows=10000000 loops=1)
          Heap Fetches: 0
  Planning Time: 0.100 ms
  Execution Time: 11043.358 ms
(6 rows)


test=# explain analyze select a,b, count(c) from t group by a, b order by b, a;
                                                           QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------
  GroupAggregate  (cost=1658556.19..1768558.12 rows=1000018 width=16) (actual time=14707.676..25533.053 rows=1001923 loops=1)
    Group Key: b, a
    ->  Sort  (cost=1658556.19..1683556.63 rows=10000175 width=12) (actual time=14707.659..20011.024 rows=10000000 loops=1)
          Sort Key: b, a
          Sort Method: external merge  Disk: 219200kB
          ->  Seq Scan on t  (cost=0.00..154056.75 rows=10000175 width=12) (actual time=0.028..4751.244 rows=10000000 loops=1)
  Planning Time: 0.103 ms
  Execution Time: 25989.412 ms
(8 rows)


test=# explain analyze select * from (select a,b, count(c) from t group by a, b offset 0) foo order by b,a;
                                                                       QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------
  Sort  (cost=515759.00..518259.04 rows=1000018 width=16) (actual time=11281.094..11783.920 rows=1001923 loops=1)
    Sort Key: t.b, t.a
    Sort Method: external merge  Disk: 34880kB
    ->  GroupAggregate  (cost=0.43..389008.55 rows=1000018 width=16) (actual time=0.064..10507.256 rows=1001923 loops=1)
          Group Key: t.a, t.b
          ->  Index Only Scan using t_a_b_c_idx on t  (cost=0.43..304007.06 rows=10000175 width=12) (actual time=0.052..5209.939 rows=10000000 loops=1)
                Heap Fetches: 0
  Planning Time: 0.111 ms
  Execution Time: 12218.370 ms
(9 rows)


To sum this up:

   grouping (a,b): 10941 ms
   grouping (a,b) + ordering (a,b): 11043 ms
   grouping (b,a) + ordering (b,a): 25989 ms
   grouping (a,b) + ordering (b,a): 12218 ms

So, it's fast as long as we can use the IOS, even if we have to do an
extra Sort at the end. This obviously relies on the grouping to reduce
the number of rows (in this case from 10M to 1M), which makes the extra
cost much cheaper.

I'm pretty sure I could construct similar examples e.g. with incremental
sort, and so on.

Does this explain why I think we can't make the assumption that it's
always better to match the pathkeys?

FWIW, I think the change of plan for the third query (where we specify
"GROUP BY a,b" but the planner decides to just change that) is rather
annoying, and it clearly makes things worse :-(


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: POC: GROUP BY optimization

От
Robert Haas
Дата:
On Sat, May 16, 2020 at 10:56 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> The local effects are trivial - it's for example reordering the pathkeys
> to make the explicit sort as cheap as possible. This thread already
> discussed a number of things to base this on - ndistinct for columns,
> cost of comparison function, ... In any case, this is something we can
> decide locally, when building the grouping paths.
>
> The global effects are much harder to tackle, because the decision can't
> be made locally when building the grouping paths. It requires changes
> both below and above the point where we build grouping paths.
>
> An example of a decision we need to make before we even get to building
> a grouping path is which index paths to build. Currently we only build
> index paths with "useful" pathkeys, and without tweaking that we'll
> never even see the index in add_paths_to_grouping_rel().
>
> But there are also decisions that can be made only after we build the
> grouping paths. For example, we may have both GROUP BY and ORDER BY, and
> there is no "always correct" way to combine those. In some cases it may
> be correct to use the same pathkeys, in other cases it's better to use
> different ones (which will require an extra Sort, with additional cost).
>
> So I don't think there will be a single "interesting" grouping pathkeys
> (i.e. root->group_pathkeys), but a collection of pathkeys. And we'll
> need to build grouping paths for all of those, and leave the planner to
> eventually pick the one giving us the cheapest plan ...

I agree with all of this and I think it's really good analysis. Part
of the reason why the planner isn't that sophisticated in this area is
that, for a long time, we didn't use paths at this level, and so it
was much harder to write any kind of viable patch to consider
alternatives. With Tom's planner path-ification word there should be a
lot more potential for optimization here, but, as you say, we need to
do that by leveraging the existing costing machinery, not just via
simple heuristics. It also strikes me that one of the problems in this
area is that the statistics we currently gather don't seem to be
entirely useful or reliable for aggregate planning. I wonder if there
are extensions to the extended statistics mechanism, or even just more
things we should gather during a routine ANALYZE, that would enable us
to estimate things better here. The most obvious thing is that
n_distinct is often wildly inaccurate, but it's deeper than that. For
instance, we need some way to estimate how many groups you're going to
get when you filter on a and then group by b that doesn't assume
uniform distribution. And we also need something that doesn't
just output a number of groups, but gives you some inkling of the
sizes of those groups: 1 giant group and a bunch of little ones isn't
the same as a bunch of equally sized groups. I don't know that these
are things we really have much chance of figuring out with the
currently-available information, though.

Sorry if this is hijacking the thread a bit; I don't mean to
discourage work on this specific patch. I'm just wondering if we need
to think a little bigger to see our way to a good solution.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On Mon, Jun 22, 2020 at 11:50:49AM -0400, Robert Haas wrote:
>On Sat, May 16, 2020 at 10:56 AM Tomas Vondra
><tomas.vondra@2ndquadrant.com> wrote:
>> The local effects are trivial - it's for example reordering the
>> pathkeys to make the explicit sort as cheap as possible. This thread
>> already discussed a number of things to base this on - ndistinct for
>> columns, cost of comparison function, ... In any case, this is
>> something we can decide locally, when building the grouping paths.
>>
>> The global effects are much harder to tackle, because the decision
>> can't be made locally when building the grouping paths. It requires
>> changes both below and above the point where we build grouping paths.
>>
>> An example of a decision we need to make before we even get to
>> building a grouping path is which index paths to build. Currently we
>> only build index paths with "useful" pathkeys, and without tweaking
>> that we'll never even see the index in add_paths_to_grouping_rel().
>>
>> But there are also decisions that can be made only after we build the
>> grouping paths. For example, we may have both GROUP BY and ORDER BY,
>> and there is no "always correct" way to combine those. In some cases
>> it may be correct to use the same pathkeys, in other cases it's
>> better to use different ones (which will require an extra Sort, with
>> additional cost).
>>
>> So I don't think there will be a single "interesting" grouping
>> pathkeys (i.e. root->group_pathkeys), but a collection of pathkeys.
>> And we'll need to build grouping paths for all of those, and leave
>> the planner to eventually pick the one giving us the cheapest plan
>> ...
>
> I agree with all of this and I think it's really good analysis. Part
> of the reason why the planner isn't that sophisticated in this area is
> that, for a long time, we didn't use paths at this level, and so it
> was much harder to write any kind of viable patch to consider
> alternatives.  With Tom's planner path-ification word there should be
> a lot more potential for optimization here, but, as you say, we need
> to do that by leveraging the existing costing machinery, not just via
> simple heuristics.

Agreed.

> It also strikes me that one of the problems in this area is that the
> statistics we currently gather don't seem to be entirely useful or
> reliable for aggregate planning.

I don't think that's an immediate issue. I don't think anyone started
implementing these optimizations and decided not to do that because of
lack of statistics.

The reason why we don't have those optimizations is more that we decided
not to even consider those optimizations, possibly because of planner
limitations. Of course, the optimizations may require additional stats,
but that's only step #2.

> I wonder if there are extensions to the extended statistics mechanism,
> or even just more things we should gather during a routine ANALYZE,
> that would enable us to estimate things better here. The most obvious
> thing is that n_distinct is often wildly inaccurate, but it's deeper
> than that. For instance, we need some way to estimate how many groups
> you're going to get when you filter on a and then group by b that
> doesn't assume uniform distribution. And we need also need something
> that doesn't just output a number of groups, but gives you some inkling
> of the sizes of those groups: 1 giant group and a bunch of little ones
> isn't the same as a bunch of equally sized groups. I don't know that
> these are things we really have much chance of figuring out with the
> currently-available information, though.
>

Not sure. I think the extended stats we have now (ndistinct coefficients,
multi-column MCV lists) should be a good basis for these decisions, even
with skewed data. Of course, we may need to add something else, but I'm
not sure what that would be.

The main limitation of extended stats is that they work at the per-table level.
Once you do a join, it's all over - we don't have that capability :-(

> Sorry if this is hijacking the thread a bit; I don't mean to
> discourage work on this specific patch. I'm just wondering if we need
> to think a little bigger to see our way to a good solution.
>

I think it's fine to continue with this optimization. Either we'll be able
to get it done with the existing stats, or we'll find out what other stats we
need for sufficiently good heuristics.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: POC: GROUP BY optimization

От
Pavel Borisov
Дата:

I wonder if anyone has plans to try again with this optimization in v14
cycle? The patches no longer apply thanks to the incremental sort patch,
but I suppose fixing that should not be extremely hard.

I find this feature interesting and I'd like to join. 
PFA the updated version of the patch, made fully compatible with the changes in v13; I suppose it is ready to be reviewed for v14.

Main things changed:
1. Patch is made compatible with new lists representation ( 1cff1b95ab6d )
2. Patch is made compatible with the incremental sort ( d2d8a229bc58 and ba3e76cc571e )
3. Tests are changed to represent changes in keys numbering and naming convention in the plan ( 55a1954da16 and 6ef77cf46e8 )

As always your ideas are very much welcome!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
Вложения

Re: POC: GROUP BY optimization

От
Dmitry Dolgov
Дата:
> On Mon, Oct 26, 2020 at 12:44:58PM +0400, Pavel Borisov wrote:
> >
> >
> > I wonder if anyone has plans to try again with this optimization in v14
> > cycle? The patches no longer apply thanks to the incremental sort patch,
> > but I suppose fixing that should not be extremely hard.
> >
> > I found this feature interesting and I'd like to join.
> PFA the updated version of the patch made fully compatible with changes in
> v13 and I suppose it is ready to be reviewed for v14.
>
> Main things changed:
> 1. Patch is made compatible with new lists representation ( 1cff1b95ab6d )
> 2. Patch is made compatible with the incremental sort ( d2d8a229bc58 and
> ba3e76cc571e )
> 3. Tests are changed to represent changes in keys numbering and
> naming convention in the plan ( 55a1954da16 and 6ef77cf46e8 )
>
> As always your ideas are very much welcome!

Hi,

Thanks for your interest! FYI there is a new thread about this topic [1]
with the next version of the patch and more commentaries (I've created
it for visibility purposes, but probably it also created some confusion,
sorry for that).

[1]: https://www.postgresql.org/message-id/flat/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com



Re: POC: GROUP BY optimization

От
Pavel Borisov
Дата:
Thanks for your interest! FYI there is a new thread about this topic [1]
with the next version of the patch and more commentaries (I've created
it for visibility purposes, but probably it also created some confusion,
sorry for that).

Thanks!

I took a very quick look at your updates and noticed that they are intended to be simple, and that some parts of the code were removed because they have little test coverage. I'd propose the opposite: increase the test coverage, so we can enjoy more precise cost calculation and probably partial grouping.

Or maybe it's worth benchmarking both patches and then re-deciding which we want more: the more complicated or the simpler version.

Good to know that this feature is not stuck anymore and we have more than one proposal.
Thanks!

-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: POC: GROUP BY optimization

От
Dmitry Dolgov
Дата:
> On Mon, Oct 26, 2020 at 01:28:59PM +0400, Pavel Borisov wrote:
> > Thanks for your interest! FYI there is a new thread about this topic [1]
> > with the next version of the patch and more commentaries (I've created
> > it for visibility purposes, but probably it also created some confusion,
> > sorry for that).
> >
> > Thanks!
>
> I made a very quick look at your updates and noticed that it is intended to
> be simple and some parts of the code are removed as they have little test
> coverage. I'd propose vice versa to increase test coverage to enjoy more
> precise cost calculation and probably partial grouping.
>
> Or maybe it's worth to benchmark both patches and then re-decide what we
> want more to have a more complicated or a simpler version.
>
> Good to know that this feature is not stuck anymore and we have more than
> one proposal.
> Thanks!

Just to clarify, the patch that I've posted in the other thread mentioned
above is not an alternative proposal, but a development of the same
patch I had posted in this thread. As mentioned in [1], the reduced
functionality is an attempt to reduce the scope, and as soon as the base
functionality looks good enough it will be added back.

[1]: https://www.postgresql.org/message-id/flat/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com



Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On Mon, Oct 26, 2020 at 11:40:40AM +0100, Dmitry Dolgov wrote:
>> On Mon, Oct 26, 2020 at 01:28:59PM +0400, Pavel Borisov wrote:
>> > Thanks for your interest! FYI there is a new thread about this topic [1]
>> > with the next version of the patch and more commentaries (I've created
>> > it for visibility purposes, but probably it also created some confusion,
>> > sorry for that).
>> >
>> > Thanks!
>>
>> I made a very quick look at your updates and noticed that it is intended to
>> be simple and some parts of the code are removed as they have little test
>> coverage. I'd propose vice versa to increase test coverage to enjoy more
>> precise cost calculation and probably partial grouping.
>>
>> Or maybe it's worth to benchmark both patches and then re-decide what we
>> want more to have a more complicated or a simpler version.
>>
>> Good to know that this feature is not stuck anymore and we have more than
>> one proposal.
>> Thanks!
>
>Just to clarify, the patch that I've posted in another thread mentioned
>above is not an alternative proposal, but a development of the same
>patch I had posted in this thread. As mentioned in [1], reduce of
>functionality is an attempt to reduce the scope, and as soon as the base
>functionality looks good enough it will be returned back.
>

I find it hard to follow two similar threads trying to do the same (or
very similar) things in different ways. Is there any chance to join
forces and produce a single patch series merging the changes? With the
"basic" functionality at the beginning, then patches with the more
complex stuff. That's the usual way, I think.

As I said in my response on the other thread [1], I think constructing
additional paths with alternative orderings of pathkeys is the right
approach. Otherwise we can't really deal with optimizations above the
place where we consider this optimization.

That's essentially what I was trying to explain in my May 16 response [2],
where I actually said this:

    So I don't think there will be a single "interesting" grouping
    pathkeys (i.e. root->group_pathkeys), but a collection of pathkeys.
    And we'll need to build grouping paths for all of those, and leave
    the planner to eventually pick the one giving us the cheapest plan.

I wouldn't go as far as saying the approach in this patch (i.e. picking
one particular ordering) is doomed, but it's going to be very hard to
make it work reliably. Even if we get the costing *at this node* right,
who knows how it'll affect costing of the nodes above us?

So if I can suggest something, I'd merge the two patches, adopting the
path-based approach. With the very basic functionality/costing in the
first patch, and the more advanced stuff in additional patches.

Does that make sense?


regards


[1] https://www.postgresql.org/message-id/20200901210743.lutgvnfzntvhuban%40development

[2] https://www.postgresql.org/message-id/20200516145609.vm7nrqy7frj4ha6r%40development

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: POC: GROUP BY optimization

От
Dmitry Dolgov
Дата:
> On Tue, Oct 27, 2020 at 09:19:51PM +0100, Tomas Vondra wrote:
> On Mon, Oct 26, 2020 at 11:40:40AM +0100, Dmitry Dolgov wrote:
> > > On Mon, Oct 26, 2020 at 01:28:59PM +0400, Pavel Borisov wrote:
> > > > Thanks for your interest! FYI there is a new thread about this topic [1]
> > > > with the next version of the patch and more commentaries (I've created
> > > > it for visibility purposes, but probably it also created some confusion,
> > > > sorry for that).
> > > >
> > > > Thanks!
> > >
> > > I made a very quick look at your updates and noticed that it is intended to
> > > be simple and some parts of the code are removed as they have little test
> > > coverage. I'd propose vice versa to increase test coverage to enjoy more
> > > precise cost calculation and probably partial grouping.
> > >
> > > Or maybe it's worth to benchmark both patches and then re-decide what we
> > > want more to have a more complicated or a simpler version.
> > >
> > > Good to know that this feature is not stuck anymore and we have more than
> > > one proposal.
> > > Thanks!
> >
> > Just to clarify, the patch that I've posted in another thread mentioned
> > above is not an alternative proposal, but a development of the same
> > patch I had posted in this thread. As mentioned in [1], reduce of
> > functionality is an attempt to reduce the scope, and as soon as the base
> > functionality looks good enough it will be returned back.
> >
>
> I find it hard to follow two similar threads trying to do the same (or
> very similar) things in different ways. Is there any chance to join
> forces and produce a single patch series merging the changes? With the
> "basic" functionality at the beginning, then patches with the more
> complex stuff. That's the usual way, I think.
>
> As I said in my response on the other thread [1], I think constructing
> additional paths with alternative orderings of pathkeys is the right
> approach. Otherwise we can't really deal with optimizations above the
> place where we consider this optimization.
>
> That's essentially what I was trying in explain May 16 response [2]
> when I actually said this:
>
>    So I don't think there will be a single "interesting" grouping
>    pathkeys (i.e. root->group_pathkeys), but a collection of pathkeys.
>    And we'll need to build grouping paths for all of those, and leave
>    the planner to eventually pick the one giving us the cheapest plan.
>
> I wouldn't go as far as saying the approach in this patch (i.e. picking
> one particular ordering) is doomed, but it's going to be very hard to
> make it work reliably. Even if we get the costing *at this node* right,
> who knows how it'll affect costing of the nodes above us?
>
> So if I can suggest something, I'd merge the two patches, adopting the
> path-based approach. With the very basic functionality/costing in the
> first patch, and the more advanced stuff in additional patches.
>
> Does that make sense?

Yes, and from what I understand that's already what has happened in the
newer thread [1]. To avoid any confusion, there are no "two patches", at
least from my side; what I've posted in [1] is the continuation of
this work, but with the path-based approach adopted and a bit less
functionality (essentially I've dropped everything that was not covered
by tests in the original patch).

In case I'm missing something and Pavel's proposal is significantly
different from the original patch (if I understand correctly, at the
moment the latest patch posted here is a rebase adjusting the old
patch to work with the latest changes in master, right?), then indeed
they could be merged, but please in the newer thread [1].

[1]: https://www.postgresql.org/message-id/flat/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com



Re: POC: GROUP BY optimization

От
Pavel Borisov
Дата:
In case if I'm missing something and Pavel's proposal is significantly
different from the original patch (if I understand correctly, at the
moment the latest patch posted here is a rebase and adjusting the old
patch to work with the latest changes in master, right?), then indeed
they could be merged, but please in the newer thread [1].

Sure, the only difference between my patch and Teodor's previous code is compatibility with v13, though it is not a small one, and I appreciate the changes in path processing. The only thing that caught my attention is that some useful changes which I've mentioned before are discarded now. But as long as they are planned to be put back in later, it is completely fine. I agree to discuss the thing in either thread, though I don't quite understand the reason for the switch.

Still, I don't see a problem.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: POC: GROUP BY optimization

От
Ibrar Ahmed
Дата:


On Wed, Mar 10, 2021 at 4:05 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
Hi,

I take a look at the patch today. Attached is the v12 and a separate
patch with some comment tweaks and review comments.


1) I see the patch results in some plan changes in postgres_fdw. I
assume it's somehow related to the sort costing changes, but I find it a
bit suspicious. Why should the plan change from this:

  Foreign Scan
    Remote SQL: ... ORDER BY ... OFFSET 100 LIMIT 10;

to

  Limit
    Foreign Scan
      Remote SQL: ... ORDER BY ...

But even if this is due to changes in ORDER BY costing, postponing the
limit seems like a net loss - the sort happens before the limit, and by
postponing the limit until after the foreign scan we need to transfer 100 more
rows. So what's causing this?

The difference in plan cost seems fairly substantial (old: 196.52, new:
241.94). But maybe that's OK and expected.


2) I wonder how much more expensive (in terms of CPU) the new sort
costing code is. It's iterative, and it calls functions to calculate the
number of groups etc. I wonder what the impact is on simple queries?


3) It's not clear to me why we need "fake var" protection? Why would the
estimate_num_groups() fail for that?


4) In one of the places a comment says DEFAULT_NUM_DISTINCT is not used
because we're dealing with multiple columns. That makes sense, but maybe
we should use DEFAULT_NUM_DISTINCT at least to limit the range. That is,
with N columns we should restrict the nGroups estimate by:

    min = DEFAULT_NUM_DISTINCT
    max = Min(pow(DEFAULT_NUM_DISTINCT, ncolumns), tuples)

Also, it's not clear what's the logic behind the formula:

    nGroups = ceil(2.0 + sqrt(tuples) *
       list_length(pathkeyExprs) / list_length(pathkeys));

A comment explaining that would be helpful.
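
For illustration, a standalone sketch of the suggested clamping (this is not
the patch's code; DEFAULT_NUM_DISTINCT is the planner's usual fallback of 200):

    #include <math.h>

    #define DEFAULT_NUM_DISTINCT 200

    /* Clamp a multi-column group estimate to the range suggested above. */
    static double
    clamp_group_estimate(double nGroups, int ncolumns, double tuples)
    {
        double      min_groups = DEFAULT_NUM_DISTINCT;
        double      max_groups = fmin(pow(DEFAULT_NUM_DISTINCT, ncolumns), tuples);

        if (max_groups < min_groups)    /* tiny tables: keep the range sane */
            max_groups = min_groups;

        if (nGroups < min_groups)
            nGroups = min_groups;
        if (nGroups > max_groups)
            nGroups = max_groups;

        return nGroups;
    }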


5) The compute_cpu_sort_cost comment includes two references (in a quite
mixed-up way), and then there's another reference in the code. I suggest
we list all of them in the function comment.


6) There's a bunch of magic values (various multipliers etc.). It's not
clear which values are meant to be equal, etc. Let's introduce nicer
constants with reasonable names.


7) The comment at get_cheapest_group_keys_order is a bit misleading,
because it talks about "uniqueness" - but that's not what we're looking
at here (I mean, is the column unique or not). We're looking at number
of distinct values in the column, which is a different thing. Also, it'd
be good to roughly explain what get_cheapest_group_keys_order does - how
it calculates the order (permutations up to 4 values, etc.).


8) The comment at compute_cpu_sort_cost should also explain basics of
the algorithm. I tried doing something like that, but I'm not sure I got
all the details right and it probably needs further improvements.


9) The main concern I have is still about the changes in planner.c, and
I think I've already shared it before. The code takes the grouping
pathkeys, and just reshuffles them to what it believes is a better order
(cheaper, ...). That is, there's one input pathkey list, and it gets
transformed into another pathkey list. The problem is that even if this
optimizes the sort, it may work against some optimization in the upper
part of the plan.

Imagine a query that does something like this:

   SELECT a, b, count(*) FROM (
      SELECT DISTINCT a, b, c, d FROM x
   ) GROUP BY a, b;

Now, from the costing perspective, it may be cheaper to do the inner
grouping by "c, d, a, b" for example. But that works directly against
the second grouping, which would benefit from having the input sorted by
"a, b". How exactly this behaves depends on the number of distinct
values in the columns, how expensive the comparisons are for each
column, and so on. But you get the idea.

I admit I haven't managed to construct a reasonable query that'd be
significantly harmed by this, but I haven't spent a lot of time on it.

I'm pretty sure I used this trick in the past, when doing aggregations
on large data sets, where it was much better to "pipe" the data through
multiple GroupAggregate nodes.

For this reason I think the code generating paths should work a bit like
get_useful_pathkeys_for_relation and generate_useful_gather_paths.

* group_keys_reorder_by_pathkeys should not produce just one list of
  pathkeys, but multiple interesting options. At the moment it should
  produce at least the input grouping pathkeys (as specified in the
  query), and the "optimal" pathkeys (by lowest cost).

* the places calling group_keys_reorder_by_pathkeys should loop on the
  result, and generate separate path for each option.

I'd guess in the future we'll "peek forward" in the plan and see if
there are other interesting pathkeys (that's the expectation for
get_useful_pathkeys_for_relation).


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

The patch does not apply successfully; can you please take a look at that?

http://cfbot.cputube.org/patch_33_1651.log

=== applying patch ./0001-v12-20210309.patch

patching file src/include/utils/selfuncs.h
Hunk #1 FAILED at 198.
1 out of 1 hunk FAILED -- saving rejects to file src/include/utils/selfuncs.h.rej


Based on Tomas Vondra's comments, and since the patch does not apply, I am changing the status to "Waiting On Author".

--

Ibrar Ahmed

Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
Hi,

here is an updated version of this patch, with some significant changes.

The main change is that instead of calling get_cheapest_group_keys_order
directly, the planner now calls get_useful_group_keys_orderings and gets
multiple "interesting" pathkey orderings instead of just a single one.
The callers then loop over these orderings and construct paths for all
of them. This idea is similar to get_useful_pathkeys_for_relation()
added by incremental sort.

FWIW this addresses point (9) from my last review - I started with it,
because it was the main thing affecting the overall architecture. The
remaining bits are more "local".

I haven't investigated how expensive those changes are (in terms of
planning overhead), but the number of extra orderings is fairly low, and
I'd expect most of the paths to be eliminated fairly quickly.

I've also added / improved a number of comments etc. but I'm sure more
cleanup is needed.


The other comments from the review still apply - I'm particularly
concerned about the (1) point, i.e. plan changes in postgres_fdw. Those
seem to be rather strange (LIMIT not being pushed down in queries
without any grouping). I'd bet this is due to changes in sort costing
and does not seem very desirable.



regards


[1]
https://www.postgresql.org/message-id/22c44f98-bfa8-8630-62b5-5155e11eb284%40enterprisedb.com

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вложения

Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On 7/20/21 3:12 PM, Tomas Vondra wrote:
> ...
>
> The other comments from the review still apply - I'm particularly
> concerned about the (1) point, i.e. plan changes in postgres_fdw. Those
> seem to be rather strange (LIMIT not being pushed down in queries
> without any grouping). I'd bet this is due to changes in sort costing
> and does not seem very desirable.
> 

I did look at this a bit closer, and yes - this very much seems like a
costing issue in the patch. The first query in postgres_fdw that changes
switches from a query with LIMIT pushed to remote

                              QUERY PLAN
------------------------------------------------------------------------
 Foreign Scan  (cost=196.29..196.52 rows=10 width=14)
   Output: t1.c1, t2.c1, t1.c3
   Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
   Remote SQL: SELECT r1."C 1", r2."C 1", r1.c3 FROM ("S 1"."T 1" r1
               INNER JOIN "S 1"."T 1" r2 ON ... LIMIT ... OFFSET ...
(4 rows)

to the LIMIT executed locally

                              QUERY PLAN
-------------------------------------------------------------------------
 Limit  (cost=241.72..241.94 rows=10 width=14)
   Output: t1.c1, t2.c1, t1.c3
   ->  Foreign Scan  (cost=239.47..261.97 rows=1000 width=14)
         Output: t1.c1, t2.c1, t1.c3
         Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
         Remote SQL: SELECT r1."C 1", r2."C 1", r1.c3 FROM ("S 1"."T 1"
         r1 INNER JOIN ...
(6 rows)

The FDW code runs explain on both queries - with and without LIMIT
pushed to remote side, to get estimates, and without the patch it gets this:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Sort  (cost=106.97..109.47 rows=1000 width=14)
   Sort Key: r1.c3, r1."C 1"
   ->  Hash Join  (cost=33.50..57.14 rows=1000 width=14)
         Hash Cond: (r1."C 1" = r2."C 1")
         ->  Seq Scan on "T 1" r1  (cost=0.00..21.00 rows=1000 width=10)
         ->  Hash  (cost=21.00..21.00 rows=1000 width=4)
               ->  Seq Scan on "T 1" r2  (cost=0.00..21.00 rows=1000
width=4)
(7 rows)

                                     QUERY PLAN
------------------------------------------------------------------------------------
 Limit  (cost=96.29..96.32 rows=10 width=14)
   ->  Sort  (cost=96.04..98.54 rows=1000 width=14)
         Sort Key: r1.c3, r1."C 1"
         ->  Hash Join  (cost=33.50..57.14 rows=1000 width=14)
               Hash Cond: (r1."C 1" = r2."C 1")
               ->  Seq Scan on "T 1" r1  (cost=0.00..21.00 rows=1000
width=10)
               ->  Hash  (cost=21.00..21.00 rows=1000 width=4)
                     ->  Seq Scan on "T 1" r2  (cost=0.00..21.00
rows=1000 width=4)
(8 rows)


while with the patch it gets this:

                                  QUERY PLAN
------------------------------------------------------------------------------
 Sort  (cost=139.47..141.97 rows=1000 width=14)
   Sort Key: r1.c3, r1."C 1"
   ->  Hash Join  (cost=33.50..57.14 rows=1000 width=14)
         Hash Cond: (r1."C 1" = r2."C 1")
         ->  Seq Scan on "T 1" r1  (cost=0.00..21.00 rows=1000 width=10)
         ->  Hash  (cost=21.00..21.00 rows=1000 width=4)
               ->  Seq Scan on "T 1" r2  (cost=0.00..21.00 rows=1000
width=4)
(7 rows)

                                     QUERY PLAN
------------------------------------------------------------------------------------
 Limit  (cost=145.75..145.77 rows=10 width=14)
   ->  Sort  (cost=145.50..148.00 rows=1000 width=14)
         Sort Key: r1.c3, r1."C 1"
         ->  Hash Join  (cost=33.50..57.14 rows=1000 width=14)
               Hash Cond: (r1."C 1" = r2."C 1")
               ->  Seq Scan on "T 1" r1  (cost=0.00..21.00 rows=1000
width=10)
               ->  Hash  (cost=21.00..21.00 rows=1000 width=4)
                     ->  Seq Scan on "T 1" r2  (cost=0.00..21.00
rows=1000 width=4)
(8 rows)


Notice that the costs get "inverted"

                       master            patched
   ----------------------------------------------
    no limit   106.97..109.47     139.47..141.97
    limit       96.29.. 96.32     145.75..145.77

so the plan with the remote LIMIT looks a bit more expensive - just enough
that it seems cheaper to transfer more rows and execute the LIMIT locally.

IMO this looks rather suspicious (or wrong), and compute_cpu_sort_cost
may need some fixes. I wonder why it differs from the old code so much ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
Hi,

after a bit more time spent on this, I found that the issue is due to
this chunk of code in compute_cpu_sort_cost:

    if (heapSort)
    {
        if (tuplesPerPrevGroup < output_tuples)
            /* comparing only inside output_tuples */
            correctedNGroups =
                ceil(2.0 * output_tuples /
                     (tuplesPerPrevGroup / nGroups));
        else
            /* two groups - in output and out */
            correctedNGroups = 2.0;
    }
    else
        correctedNGroups = nGroups;

    if (correctedNGroups <= 1.0)
        correctedNGroups = 2.0;
    else
        correctedNGroups = ceil(correctedNGroups);
    per_tuple_cost += totalFuncCost * LOG2(correctedNGroups);


There are a couple of issues, mostly due to differences in handling of cases
with a different heapSort flag. For a full sort (no LIMIT clause) we have
heapSort=false, and hence the execution simply jumps to

    correctedNGroups = nGroups;

while for LIMIT we do heapSort=true, in which case we also start with

    tuplesPerPrevGroup = ntuples;

That is almost certainly greater than output_tuples (=limit+offset), so
the first if condition in calculating correctedNGroups can't possibly be
true, and we simply end with

    correctedNGroups = 2.0;

in the first loop iteration. Which seems pretty bogus - why would there be
just two groups? When processing the first expression, it's as if there was
one big "prev group" with all the tuples, so why not just use nGroups
as it is?

This seems to confuse the costing quite a bit - enough to produce the
"inversed" costs with/without LIMIT, and picking the other plan.

I've simplified the costing a bit, and the attached version actually
undoes all the "suspicious" plan changes in postgres_fdw. It changes one
new plan, but that seems somewhat reasonable, as it pushes the sort to the
remote side.

But after looking at the costing function, I have a bunch of additional
comments and questions:


1) I looked at the resources mentioned as the sources of the formulas, but
I've been unable to really match the algorithm to them. The
quicksort paper is particularly "dense", the notation seems to be very
different, and none of the theorems seem like an obvious fit. Would be
good to make the relationship clearer in comments etc.

For the Sedgewick course it's even worse - it's way too extensive to
just point at it and say "We're using ideas from this!" because no one
is going to know which chapter/section to look at. We need to be a bit
more specific about the reference.


2) I'm a bit puzzled by the "heapsort" chunk, actually. How come we need
it now, when we didn't need that before? In a way, the difference in
behavior between heapsort and non-heapsort is what triggered the plan
changes ...

FWIW It's quite possible I tweaked the costing incorrectly, but it ends
up choosing the right plans purely by accident.


3) I'm getting a bit skeptical about the various magic coefficients that
are meant to model higher costs with non-uniform distribution. But
consider that we do this, for example:

   tuplesPerPrevGroup = ceil(1.5 * tuplesPerPrevGroup / nGroups);

but then in the next loop we call estimate_num_groups_incremental and
pass this "tweaked" tuplesPerPrevGroup value to it. I'm pretty sure this
may have various strange consequences - we'll calculate the nGroups
based on the inflated value, and we'll calculate tuplesPerPrevGroup from
that again - that seems susceptible to "amplification".

We inflate tuplesPerPrevGroup by 50%, which means we'll get a higher
nGroups estimate in the next loop - but not linearly. And then we'll
calculate the inflated tuplesPerPrevGroup and estimated nGroups ...
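
To make the compounding concrete, here is a tiny standalone toy program
(not PostgreSQL code; the fixed per-key split factor of 10 is entirely
made up and merely stands in for whatever estimate_num_groups_incremental
would return) that just iterates the inflation step quoted above, with
and without the 1.5 factor:

    #include <stdio.h>
    #include <math.h>

    int
    main(void)
    {
        double  plain = 1000000.0;      /* group size without the 1.5 factor */
        double  inflated = 1000000.0;   /* group size with the 1.5 factor */
        int     key;

        for (key = 1; key <= 4; key++)
        {
            /* made-up stand-in for the incremental ndistinct estimate */
            double  nGroups = 10.0;

            plain = ceil(plain / nGroups);
            inflated = ceil(1.5 * inflated / nGroups);

            printf("key %d: plain = %.0f, inflated = %.0f (x%.2f)\n",
                   key, plain, inflated, inflated / plain);
        }

        return 0;
    }

After k keys the inflated trajectory carries an extra factor of roughly
1.5^k, and that is before the non-linear feedback through
estimate_num_groups_incremental even enters the picture.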

That seems pretty dubious, with hard to explain behavior, IMO.

If we want to keep applying these coefficients, we need to do that in a
way that does not affect the subsequent loop like this - we might tweak
the per_tuple_cost formula, for example, not tuplesPerPrevGroup.


4) I'm not sure it's actually a good idea to pass tuplesPerPrevGroup to
estimate_num_groups_incremental. In principle yes, if we use "group
size" from the previous step, then the returned value is the number of
new groups after adding the "new" pathkey.

But even if we ignore the issues with amplification mentioned in (3),
there's an issue with non-linear behavior in estimate_num_groups,
because at some point it's calculating

    D(N,n,p) = n * (1 - ((N-p)/N)^(N/n))

where N - total rows, p - sample size, n - number of distinct values.
And if we have (N1,n1) and (N2,n2) then the ratio of the calculated
estimates (which is pretty much what calculating the group size does)

    D(N2,n2,p2) / D(N1,n1,p1)

which will differ depending on p1 and p2. And if we're tweaking the
tuplesPerPrevGroup all the time, that's really annoying, as it may make
the groups smaller or larger in ways that are hard to predict, and I
wonder if it might go against the idea of penalizing tuplesPerPrevGroup
to some extent.
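
To illustrate how non-linear this is, here is a tiny standalone program
(all numbers made up) that evaluates the formula above for a fixed (N, n)
and several values of p:

    #include <stdio.h>
    #include <math.h>

    /* D(N, n, p) = n * (1 - ((N - p) / N)^(N / n)), as given above */
    static double
    D(double N, double n, double p)
    {
        return n * (1.0 - pow((N - p) / N, N / n));
    }

    int
    main(void)
    {
        double  N = 100000.0;   /* total rows (made up) */
        double  n = 50000.0;    /* distinct values (made up) */
        double  p;

        for (p = 100000.0; p >= 12500.0; p /= 2.0)
            printf("p = %6.0f   D = %8.1f   D/p = %.4f\n",
                   p, D(N, n, p), D(N, n, p) / p);

        return 0;
    }

Halving p clearly does not halve D, so the ratio of two such estimates
depends quite a bit on which "sample sizes" get plugged in.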

We could simply use the input "tuples" value here, and then divide the
current and previous estimate to calculate the number of new groups.



regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вложения

Re: POC: GROUP BY optimization

От
"Andrey V. Lepikhov"
Дата:
I keep working on this patch. Here are intermediate results.

On 7/22/21 3:58 AM, Tomas Vondra wrote:
> in the first loop. Which seems pretty bogus - why would there be just
> two groups? When processing the first expression, it's as if there was
> one big "prev group" with all the tuples, so why not to just use nGroups
> as it is?

I think the heapsort code seems very strange. Look at the fallback case:
it is based on the output_tuples value. Maybe we should use the nGroups
value here, but derived from the number of output_tuples?

 > 1) I looked at the resources mentioned as sources the formulas came
 > from, but I've been unable to really match the algorithm to them. The
 > quicksort paper is particularly "dense", the notation seems to be very
 > different, and none of the theorems seem like an obvious fit. Would be
 > good to make the relationship clearer in comments etc.

Fixed (See attachment).

> 3) I'm getting a bit skeptical about the various magic coefficients that
> are meant to model higher costs with non-uniform distribution. But
> consider that we do this, for example:
> 
>     tuplesPerPrevGroup = ceil(1.5 * tuplesPerPrevGroup / nGroups);
> 
> but then in the next loop we call estimate_num_groups_incremental and
> pass this "tweaked" tuplesPerPrevGroup value to it. I'm pretty sure this
> may have various strange consequences - we'll calculate the nGroups
> based on the inflated value, and we'll calculate tuplesPerPrevGroup from
> that again - that seems susceptible to "amplification".
> 
> We inflate tuplesPerPrevGroup by 50%, which means we'll get a higher
> nGroups estimate in the next loop - but not linearly. An then we'll
> calculate the inflated tuplesPerPrevGroup and estimated nGroup ...

The weighting coefficient '1.5' expresses our desire to minimize the number
of comparison operations on each subsequent attribute of the pathkeys list.
By increasing this coefficient we increase the chance that the planner will
order the pathkeys by decreasing uniqueness.
I think it's OK.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
Andrey Lepikhov
Дата:
On 7/22/21 3:58 AM, Tomas Vondra wrote:
> 4) I'm not sure it's actually a good idea to pass tuplesPerPrevGroup to
> estimate_num_groups_incremental. In principle yes, if we use "group
> size" from the previous step, then the returned value is the number of
> new groups after adding the "new" pathkey.
> But even if we ignore the issues with amplification mentioned in (3),
> there's an issue with non-linear behavior in estimate_num_groups,
> because at some point it's calculating
> 
>     D(N,n,p) = n * (1 - ((N-p)/N)^(N/n))
> 
> where N - total rows, p - sample size, n - number of distinct values.
> And if we have (N1,n1) and (N2,n2) then the ratio of calculated
> estimated (which is pretty much what calculating group size does)
> 
>     D(N2,n2,p2) / D(N1,n1,p1)
> 
> which will differ depending on p1 and p2. And if we're tweaking the
> tuplesPerPrevGroup all the time, that's really annoying, as it may make
> the groups smaller or larger, which is unpredictable and annoying, and I
> wonder if it might go against the idea of penalizing tuplesPerPrevGroup
> to some extent.
> We could simply use the input "tuples" value here, and then divide the
> current and previous estimate to calculate the number of new groups.
tuplesPerPrevGroup is only an upper bound for the estimation. I think
we should use the result of the previous estimation as a limit for the
next one during incremental estimation. Maybe we should simply clamp the
tuplesPerPrevGroup value to ensure it never grows from one key to the next?
See the attached patch on top of the previous fixes, and the toy sketch below.
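
Just to illustrate the intent, here is a toy standalone sketch (not the
actual patch code; the nGroups values and tuple counts are made up) of
what such a clamp would do:

    #include <stdio.h>
    #include <math.h>

    int
    main(void)
    {
        /* made-up per-key nGroups estimates; the second one is very low */
        double  nGroups[] = {500.0, 1.2, 40.0};
        double  unclamped = 100000.0;
        double  clamped = 100000.0;
        int     i;

        for (i = 0; i < 3; i++)
        {
            double  next = ceil(1.5 * clamped / nGroups[i]);

            unclamped = ceil(1.5 * unclamped / nGroups[i]);
            /* proposed guard: group size must never grow */
            clamped = fmin(clamped, next);

            printf("key %d: unclamped = %.0f, clamped = %.0f\n",
                   i + 1, unclamped, clamped);
        }

        return 0;
    }

With the low nGroups estimate for the second key, the unclamped value
grows from 300 to 375 (a group cannot really get larger by adding another
grouping key), while the clamped one stays at 300.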

-- 
regards,
Andrey Lepikhov
Postgres Professional
Вложения

Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:

On 1/21/22 12:09, Andrey Lepikhov wrote:
> On 7/22/21 3:58 AM, Tomas Vondra wrote:
>> 4) I'm not sure it's actually a good idea to pass tuplesPerPrevGroup to
>> estimate_num_groups_incremental. In principle yes, if we use "group
>> size" from the previous step, then the returned value is the number of
>> new groups after adding the "new" pathkey.
>> But even if we ignore the issues with amplification mentioned in (3),
>> there's an issue with non-linear behavior in estimate_num_groups,
>> because at some point it's calculating
>>
>>     D(N,n,p) = n * (1 - ((N-p)/N)^(N/n))
>>
>> where N - total rows, p - sample size, n - number of distinct values.
>> And if we have (N1,n1) and (N2,n2) then the ratio of calculated
>> estimated (which is pretty much what calculating group size does)
>>
>>     D(N2,n2,p2) / D(N1,n1,p1)
>>
>> which will differ depending on p1 and p2. And if we're tweaking the
>> tuplesPerPrevGroup all the time, that's really annoying, as it may make
>> the groups smaller or larger, which is unpredictable and annoying, and I
>> wonder if it might go against the idea of penalizing tuplesPerPrevGroup
>> to some extent.
>> We could simply use the input "tuples" value here, and then divide the
>> current and previous estimate to calculate the number of new groups.
 >
> tuplesPerPrevGroup is only a top boundary for the estimation. I think, 
> we should use result of previous estimation as a limit for the next 
> during incremental estimation. Maybe we simply limit the 
> tuplesPerPrevGroup value to ensure of monotonic non-growth of this 
> value? - see in attachment patch to previous fixes.
> 

Yes, I think that's a reasonable defense - it makes no sense to exceed 
the group size in the preceding step, which could happen with a small 
number of groups or some unexpected ndistinct estimate.

The other thing we could do is reduce the coefficient gradually - so 
it'd be 1.5 for the first pathkey, then 1.25 for the next one, and so 
on. But it seems somewhat arbitrary (I certainly don't have some sound 
theoretical justification ...).

I've merged most of the fixes you reported. I've skipped the path_save 
removal in planner.c, because that seems incorrect - if there are 
multiple pathkeys, we must start with the original path, not the 
modified one we built in the last iteration. Or am I missing something?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

Re: POC: GROUP BY optimization

От
Andrey Lepikhov
Дата:
On 22/1/2022 01:34, Tomas Vondra wrote:
> The other thing we could do is reduce the coefficient gradually - so 
> it'd be 1.5 for the first pathkey, then 1.25 for the next one, and so 
> on. But it seems somewhat arbitrary (I certainly don't have some sound 
> theoretical justification ...).

I don't think there is a reason to increase complexity without any theory
behind it. A simple solution lets us notice problems much faster, if
they arise.

> ... I've skipped the path_save 
> removal in planner.c, because that seems incorrect - if there are 
> multiple pathkeys, we must start with the original path, not the 
> modified one we built in the last iteration. Or am I missing something
You are right, I misunderstood the idea of the path_save variable.

-- 
regards,
Andrey Lepikhov
Postgres Professional



Re: POC: GROUP BY optimization

От
"Andrey V. Lepikhov"
Дата:
> On 7/22/21 3:58 AM, Tomas Vondra wrote:
> I've simplified the costing a bit, and the attached version actually
> undoes all the "suspicious" plan changes in postgres_fdw. It changes one
> new plan, but that seems somewhat reasonable, as it pushes sort to the
> remote side.

I tried to justify the heap-sort part of the compute_cpu_sort_cost() routine 
and realized that we may have a mistake here.
After a week of effort, I haven't found any research papers on the dependence 
of bounded heap-sort time complexity on the number of duplicates.

So, I propose a self-made formula, based on simple logical constructions:

1. We should start from the initial formula: cost ~ N*LOG2(M), where M is 
output_tuples.
2. Note that the full form of this formula is:

cost ~ N*LOG2(min{N,M})

3. In the multi-column case, the number of comparisons for each subsequent 
column can be estimated by the same formula, but applied to the number of 
tuples per group:

comparisons ~ input * LOG2(min{input,M})

4. Note that for the case when M > input, we should adjust this formula 
a bit:

comparisons ~ max{input,M} * LOG2(min{input,M})

Remember that in our case M << tuples.
So, the general formula for a bounded heap sort can be written as:

cost ~ N * sum(max{n_i,M}/N * LOG2(min{n_i,M})), i=1,ncols

where n_1 == N and n_i is the number of tuples per group, estimated from 
the previous iteration.

The attachment contains an implementation of this approach.
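
To sanity-check the formula, here is a tiny standalone toy program (not
PostgreSQL code; the tuple counts and per-column split factors are made
up and merely stand in for estimates that would come from
estimate_num_groups) that simply evaluates the sum:

    #include <stdio.h>
    #include <math.h>

    #define LOG2(x) (log(x) / log(2.0))

    int
    main(void)
    {
        double  N = 1000000.0;  /* input tuples (made up) */
        double  M = 100.0;      /* output_tuples: LIMIT + OFFSET (made up) */
        /* made-up per-column split factors */
        double  groups[] = {100.0, 50.0, 10.0};
        int     ncols = sizeof(groups) / sizeof(groups[0]);
        double  n_i = N;        /* n_1 == N */
        double  per_tuple = 0.0;
        int     col;

        for (col = 0; col < ncols; col++)
        {
            per_tuple += fmax(n_i, M) / N * LOG2(fmin(n_i, M));
            n_i = ceil(n_i / groups[col]);  /* group size for the next column */
        }

        printf("comparisons per input tuple: %.4f (total ~ %.0f)\n",
               per_tuple, per_tuple * N);
        return 0;
    }

With these numbers the first column dominates the sum, which matches the
intent of the initial cost ~ N*LOG2(M) formula.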

-- 
regards,
Andrey Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
"Andrey V. Lepikhov"
Дата:
On 1/22/22 01:34, Tomas Vondra wrote:
> 

I rebased this patch (with minor fixes) onto the current master.

Also, the second patch is dedicated to the problem of "varno 0" (fake_var).
I think this case should produce the same estimates as the case of 
varno != 0 where no stats were found. So I propose to restrict the number 
of groups to the minimum of the number of incoming tuples and the 
DEFAULT_NUM_DISTINCT value.

-- 
regards,
Andrey Lepikhov
Postgres Professional
Вложения

Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On 2/10/22 10:00, Andrey V. Lepikhov wrote:
> On 1/22/22 01:34, Tomas Vondra wrote:
>>
> 
> I rebased (with minor fixes) this patch onto current master.
> 
> Also, second patch dedicated to a problem of "varno 0" (fake_var).
> I think, this case should make the same estimations as in the case of
> varno != 0, but no any stats found. So I suppose to restrict number of
> groups with min of a number of incoming tuples and DEFAULT_NUM_DISTINCT
> value.
> 

Thanks for the rebase. The two proposed changes (tweaked costing and
simplified fake_var handling) seem fine to me. I think the last thing
that needs to be done is cleanup of the debug GUCs, which I added to
allow easier experimentation with the patch.

I probably won't remove the GUCs entirely, though. I plan to add a
single GUC that would enable/disable this optimization. I'm not a huge
fan of adding more and more GUCs, but in this case it's probably the
right thing to do given the complexity of estimating cost with
correlated columns etc.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

Re: POC: GROUP BY optimization

От
"Andrey V. Lepikhov"
Дата:
On 3/15/22 13:26, Tomas Vondra wrote:
> Thanks for the rebase. The two proposed changes (tweaked costing and
> simplified fake_var handling) seem fine to me. I think the last thing
> that needs to be done is cleanup of the debug GUCs, which I added to
> allow easier experimentation with the patch.
Thanks, I'm waiting for the last step.
> 
> I probably won't remove the GUCs entirely, though. I plan to add a
> single GUC that would enable/disable this optimization. I'm not a huge
> fan of adding more and more GUCs, but in this case it's probably the
> right thing to do given the complexity of estimating cost with
> correlated columns etc.
Agreed. Because it is a kind of automation, we should allow the user to 
switch it off in case of problems or for manual tuning.

Also, I looked through this patch. It has some minor problems:
1. Multiple typos in the patch comment.
2. The term 'cardinality of a key' - maybe replace it with 'number of 
duplicates'?

-- 
regards,
Andrey Lepikhov
Postgres Professional



Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
Hi,

Here's a rebased/improved version of the patch, with smaller parts
addressing various issues. There are seven parts:

0001 - main part, just rebased

0002 - replace the debug GUC options with a single GUC to disable the
       optimization if needed

0003 - minor code cleanup, removal of unnecessary variable

0004 - various comment fixes (rewordings, typos, ...)

0005 - a minor code simplification, addressing FIXMEs from 0004

0006 - adds the new GUC to the docs

0007 - demonstrates plan changes with a disabled optimization

The first 6 parts should be squashed and committed at once; I only kept
them separate for clarity. The 0007 is merely a demonstration of the new
GUC and that it disables the optimization.

> Agree. Because it is a kind of automation we should allow user to switch
> it off in the case of problems or manual tuning.
> > Also, I looked through this patch. It has some minor problems:
> 1. Multiple typos in the patch comment.

I went through the comments and checked all of them for grammar mistakes
and typos using a word processor, so hopefully that should be OK. But
maybe there's still something wrong.

> 2. The term 'cardinality of a key' - may be replace with 'number of
> duplicates'?

No, cardinality means "number of distinct values", so "duplicates" would
be wrong. And I think "cardinality" is a well-established term, so I think
it's fine.

BTW I named the GUC enable_group_by_reordering, I wonder if it should be
named differently, e.g. enable_groupby_reordering? Opinions?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

Re: POC: GROUP BY optimization

От
Zhihong Yu
Дата:


On Mon, Mar 28, 2022 at 5:49 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
Hi,

Here's a rebased/improved version of the patch, with smaller parts
addressing various issues. There are seven parts:

0001 - main part, just rebased

0002 - replace the debug GUC options with a single GUC to disable the
       optimization if needed

0003 - minor code cleanup, removal of unnecessary variable

0004 - various comment fixes (rewordings, typos, ...)

0005 - a minor code simplification, addressing FIXMEs from 0004

0006 - adds the new GUC to the docs

0007 - demonstrates plan changes with a disabled optimization

The first 6 parts should be squashed and committed at one, I only kept
them separate for clarity. The 0007 is merely a demonstration of the new
GUC and that it disables the optimization.

> Agree. Because it is a kind of automation we should allow user to switch
> it off in the case of problems or manual tuning.
> > Also, I looked through this patch. It has some minor problems:
> 1. Multiple typos in the patch comment.

I went through the comments and checked all of them for grammar mistakes
and typos using a word processor, so hopefully that should be OK. But
maybe there's still something wrong.

> 2. The term 'cardinality of a key' - may be replace with 'number of
> duplicates'?

No, cardinality means "number of distinct values", so "duplicates" would
be wrong. And I think "cardinality" is well established term, so I think
it's fine.

BTW I named the GUC enable_group_by_reordering, I wonder if it should be
named differently, e.g. enable_groupby_reordering? Opinions?


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,

For 0001-Optimize-order-of-GROUP-BY-keys-20220328.patch:

multiple parametes need to be

  parametes -> parameters

leave more expensive comparions

  comparions -> comparisons

+       if (has_fake_var == false)

The above can be written as:

       if (!has_fake_var)

+           nGroups = ceil(2.0 + sqrt(tuples) * (i + 1) / list_length(pathkeys));

Looks like the value of tuples doesn't change inside the loop.
You can precompute sqrt(tuples) outside the loop and store the value in a variable.

+       return -1;
+   else if (a->cost == b->cost)
+       return 0;
+   return 1;

the keyword 'else' is not needed.

+ * Returns newly allocated lists. If no reordering is possible (or needed),
+ * the lists are set to NIL.
+ */
+static bool
+get_cheapest_group_keys_order(PlannerInfo *root, double nrows,

It seems the comment for return value doesn't match the bool return type.

+   /* If this optimization is disabled, we're done. */
+   if (!debug_cheapest_group_by)

It seems enable_cheapest_group_by would be better name for the flag.

Cheers

Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On 3/29/22 04:02, Zhihong Yu wrote:
> 
> 
> On Mon, Mar 28, 2022 at 5:49 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
> wrote:
> 
>     Hi,
> 
>     Here's a rebased/improved version of the patch, with smaller parts
>     addressing various issues. There are seven parts:
> 
>     0001 - main part, just rebased
> 
>     0002 - replace the debug GUC options with a single GUC to disable the
>            optimization if needed
> 
>     0003 - minor code cleanup, removal of unnecessary variable
> 
>     0004 - various comment fixes (rewordings, typos, ...)
> 
>     0005 - a minor code simplification, addressing FIXMEs from 0004
> 
>     0006 - adds the new GUC to the docs
> 
>     0007 - demonstrates plan changes with a disabled optimization
> 
>     The first 6 parts should be squashed and committed at one, I only kept
>     them separate for clarity. The 0007 is merely a demonstration of the new
>     GUC and that it disables the optimization.
> 
>     > Agree. Because it is a kind of automation we should allow user to
>     switch
>     > it off in the case of problems or manual tuning.
>     > > Also, I looked through this patch. It has some minor problems:
>     > 1. Multiple typos in the patch comment.
> 
>     I went through the comments and checked all of them for grammar mistakes
>     and typos using a word processor, so hopefully that should be OK. But
>     maybe there's still something wrong.
> 
>     > 2. The term 'cardinality of a key' - may be replace with 'number of
>     > duplicates'?
> 
>     No, cardinality means "number of distinct values", so "duplicates" would
>     be wrong. And I think "cardinality" is well established term, so I think
>     it's fine.
> 
>     BTW I named the GUC enable_group_by_reordering, I wonder if it should be
>     named differently, e.g. enable_groupby_reordering? Opinions?
> 
> 
>     regards
> 
>     -- 
>     Tomas Vondra
>     EnterpriseDB: http://www.enterprisedb.com <http://www.enterprisedb.com>
>     The Enterprise PostgreSQL Company
> 
> Hi,
> 
> For 0001-Optimize-order-of-GROUP-BY-keys-20220328.patch:
> 
> multiple parametes need to be
> 
>   parametes -> parameters
> 
> leave more expensive comparions
> 
>   comparions -> comparisons
> 
> +       if (has_fake_var == false)
> 
> The above can be written as:
> 
>        if (!has_fake_var)
> 

All of this was already fixed in one of the subsequent "fixup" patches.
Attached is a patch merging 0001-0006, which is what I proposed to get
committed.

> +           nGroups = ceil(2.0 + sqrt(tuples) * (i + 1) /
> list_length(pathkeys));
> 
> Looks like the value of tuples doesn't change inside the loop.
> You can precompute sqrt(tuples) outside the loop and store the value in
> a variable.
> 

IMHO it makes the formula harder to read, and the effect is not going to
be measurable - we're processing only a couple of elements. If the loop was
~100 iterations or more, maybe it'd have an impact.

> +       return -1;
> +   else if (a->cost == b->cost)
> +       return 0;
> +   return 1;
> 
> the keyword 'else' is not needed.
> 

True, but this is how comparators are usually implemented.

> + * Returns newly allocated lists. If no reordering is possible (or needed),
> + * the lists are set to NIL.
> + */
> +static bool
> +get_cheapest_group_keys_order(PlannerInfo *root, double nrows,
> 
> It seems the comment for return value doesn't match the bool return type.
> 

Yup, this is a valid point. I've fixed/reworded the comment a bit.

> +   /* If this optimization is disabled, we're done. */
> +   if (!debug_cheapest_group_by)
> 
> It seems enable_cheapest_group_by would be better name for the flag.
> 

This was renamed to enable_group_by_reordering in one of the fixup patches.

Attached is a patch merging 0001 and all the fixup patches, i.e. the
patch I plan to commit.

There was a minor test failure - the new GUC was not added to the sample
config file, so 003_check_guc.pl was failing.

I'm not including the 0007 part, because that's meant to demonstrate
plan changes with disabled optimization, which would confuse cfbot.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
Pushed, after going through the patch once more, running check-world
under valgrind, and updating the commit message.

Thanks to everyone who reviewed/tested this patch!

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: POC: GROUP BY optimization

От
David Rowley
Дата:
On Thu, 31 Mar 2022 at 12:19, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> Pushed, after going through the patch once more, running check-world
> under valgrind, and updating the commit message.

I'm just in this general area of the code again today and wondered
about the header comment for the preprocess_groupclause() function.

It says:

 * In principle it might be interesting to consider other orderings of the
 * GROUP BY elements, which could match the sort ordering of other
 * possible plans (eg an indexscan) and thereby reduce cost.  We don't
 * bother with that, though.  Hashed grouping will frequently win anyway.

I'd say this commit makes that paragraph mostly obsolete.  It's only
true now in the sense that we don't try orders that suit some index
that would provide pre-sorted results for a GroupAggregate path.  The
comment leads me to believe that we don't do anything at all to find a
better order, and that's not true now.

David



Re: POC: GROUP BY optimization

От
David Rowley
Дата:
On Thu, 31 Mar 2022 at 12:19, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> Pushed, after going through the patch once more, running check-world
> under valgrind, and updating the commit message.

I'm still working in this area and I noticed that db0d67db2 updated
some regression tests in partition_aggregate.out without any care as
to what the test was testing.

The comment above the test reads:

-- Without ORDER BY clause, to test Gather at top-most path

and you've changed the expected plan from being a parallel plan with a
Gather to being a serial plan. So it looks like the test might have
become useless.

I see that the original plan appears to come back with some
adjustments to parallel_setup_cost and parallel_tuple_cost. It seems a
bit strange to me that the changes with this patch would cause a
change of plan for this. There is only 1 GROUP BY column in the query
in question. There's no rearrangement to do with a single column GROUP
BY.

David



Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:

On 7/15/22 07:18, David Rowley wrote:
> On Thu, 31 Mar 2022 at 12:19, Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> Pushed, after going through the patch once more, running check-world
>> under valgrind, and updating the commit message.
> 
> I'm still working in this area and I noticed that db0d67db2 updated
> some regression tests in partition_aggregate.out without any care as
> to what the test was testing.
> 
> The comment above the test reads:
> 
> -- Without ORDER BY clause, to test Gather at top-most path
> 
> and you've changed the expected plan from being a parallel plan with a
> Gather to being a serial plan. So it looks like the test might have
> become useless.
> 

I agree this is a mistake in db0d67db2 that makes the test useless.

> I see that the original plan appears to come back with some
> adjustments to parallel_setup_cost and parallel_tuple_cost. It seems a
> bit strange to me that the changes with this patch would cause a
> change of plan for this. There is only 1 GROUP BY column in the query
> in question. There's no rearrangement to do with a single column GROUP
> BY.
> 

It might seem a bit strange, but the patch tweaked the costing a bit, so
it's not entirely unexpected. I'd bet the plan cost changed just a teeny
bit, but enough to change the cheapest plan. The costing changed for all
group counts, including a single group.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: POC: GROUP BY optimization

От
Michael Paquier
Дата:
On Fri, Jul 15, 2022 at 09:46:51PM +0200, Tomas Vondra wrote:
> I agree this is a mistake in db0d67db2 that makes the test useless.

Tomas, please note that this is an open item assigned to you.  Are you
planning to do something with these regression tests by beta3?
--
Michael

Вложения

Re: POC: GROUP BY optimization

От
David Rowley
Дата:
On Tue, 2 Aug 2022 at 22:21, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Jul 15, 2022 at 09:46:51PM +0200, Tomas Vondra wrote:
> > I agree this is a mistake in db0d67db2 that makes the test useless.
>
> Tomas, please note that this is an open item assigned to you.  Are you
> planning to do something with these regression tests by beta3?

There's still something to do there for PG15, but 1349d2790 (which just
went into master) made those two tests run as a parallel query again.

David



Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On 8/2/22 13:14, David Rowley wrote:
> On Tue, 2 Aug 2022 at 22:21, Michael Paquier <michael@paquier.xyz> wrote:
>>
>> On Fri, Jul 15, 2022 at 09:46:51PM +0200, Tomas Vondra wrote:
>>> I agree this is a mistake in db0d67db2 that makes the test useless.
>>
>> Tomas, please note that this is an open item assigned to you.  Are you
>> planning to do something with these regression tests by beta3?
> 
> There's still something to do there for PG15, but 1349d2790 (just gone
> in to master) made those two tests run as a parallel query again.
> 

Hi,

I've been looking at this, and as I speculated before it's due to the
sort costing changing a little bit. On PG14, the costing of the plans
looks like this:

 Gather  (cost=1869.39..2823.15 rows=8 width=52)

and with disabled parallelism, it's like this:

 Append  (cost=998.04..3000.64 rows=10 width=52)

so there's a (fairly small) difference in favor of the parallel plan. But
on PG15 it's the other way around - the selected non-parallel one is
costed like this:

 Append  (cost=779.41..2490.61 rows=10 width=52)

and by setting parallel_setup_cost=0 you get this:

 Gather  (cost=700.34..1531.76 rows=8 width=52)

So with the setup cost included it's ~2531, and it loses to the simple
plan. This happens because the patch changed sort costing - the same
sort on PG14 and PG15 looks like this:

 PG14: ->  Sort  (cost=998.04..1028.04 rows=12000 width=13)

 PG15: ->  Sort  (cost=779.41..809.41 rows=12000 width=13)

As mentioned, the commit tweaked sort costing - before it was pretty
much just

  comparison_cost * tuples * LOG2(tuples)

but the patch needs to cost different pathkey orderings, and consider
that maybe we don't need to compare all the keys (which the original
costing kind of assumes). That's the whole point of this optimization.

The costing (compute_cpu_sort_cost) considers a couple other things,
like cost of the comparator function for the data type, width of the
values, groupings determined by preceding keys, and so on.

It might seem strange that a query with a single pathkey changes, but
that single pathkey is costed the same way, of course. In principle we
might have a special costing for this case, but I'd bet that would
result in pretty surprising inconsistencies when adding a sort key
(going from 1 to 2 keys).

So I don't think the current costing is wrong, but it certainly is more
complex. But the test does not test what it intended - I have a few ideas
how to make it work:

1) increase the number of rows in the table

2) increase cpu_operator_cost (for that one test?)

3) tweak the costing somehow, to increase the cost a bit

Both (1) and (2) work - I've tried doubling the number of rows or
setting the operator cost to 0.005, and that does the trick, but maybe a
smaller change would be enough.

I don't like (3) very much - changing the costing just to get the same
test behavior as on an older release does not seem very principled. Yes,
maybe it should be tweaked, but not because of a regression test.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

Re: POC: GROUP BY optimization

От
David Rowley
Дата:
On Thu, 18 Aug 2022 at 02:46, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> So I don't think the current costing is wrong, but it certainly is more
> complex. But the test does not test what it intended - I have two ideas
> how to make it work:
>
> 1) increase the number of rows in the table
>
> 2) increase cpu_operator_cost (for that one test?)
>
> 3) tweak the costing somehow, to increase the cost a bit

Why not, 4) SET parallel_setup_cost = 0;  there are plenty of other
places we do just that so we get a parallel plan without having to
generate enough cost to drown out the parallel worker startup cost.

Here are a couple of patches to demo the idea.

David

Вложения

Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:

On 8/18/22 03:32, David Rowley wrote:
> On Thu, 18 Aug 2022 at 02:46, Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> So I don't think the current costing is wrong, but it certainly is more
>> complex. But the test does not test what it intended - I have two ideas
>> how to make it work:
>>
>> 1) increase the number of rows in the table
>>
>> 2) increase cpu_operator_cost (for that one test?)
>>
>> 3) tweak the costing somehow, to increase the cost a bit
> 
> Why not, 4) SET parallel_setup_cost = 0;  there are plenty of other
> places we do just that so we get a parallel plan without having to
> generate enough cost to drown out the parallel worker startup cost.
> 
> Here are a couple of patches to demo the idea.
> 

Yeah, that's an option too. I should have mentioned it along with the
cpu_operator_cost.

BTW would you mind taking a look at the costing? I think it's fine, but
it would be good if someone not involved in the patch takes a look.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: POC: GROUP BY optimization

От
"Jonathan S. Katz"
Дата:
On 8/18/22 3:29 AM, Tomas Vondra wrote:
> 
> 
> On 8/18/22 03:32, David Rowley wrote:

>> Here are a couple of patches to demo the idea.
>>
> 
> Yeah, that's an option too. I should have mentioned it along with the
> cpu_operator_cost.
> 
> BTW would you mind taking a look at the costing? I think it's fine, but
> it would be good if someone not involved in the patch takes a look.

With RMT hat on, just a friendly reminder that this is still on the open 
items list[1] -- we have the Beta 4 release next week and it would be 
great if we can get this resolved prior to it.

Thanks!

Jonathan

[1] https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items
[2] 
https://www.postgresql.org/message-id/9d251aec-cea2-bc1a-5ed8-46ef0bcf6c69@postgresql.org

Вложения

Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:

On 8/30/22 15:15, Jonathan S. Katz wrote:
> On 8/18/22 3:29 AM, Tomas Vondra wrote:
>>
>>
>> On 8/18/22 03:32, David Rowley wrote:
> 
>>> Here are a couple of patches to demo the idea.
>>>
>>
>> Yeah, that's an option too. I should have mentioned it along with the
>> cpu_operator_cost.
>>
>> BTW would you mind taking a look at the costing? I think it's fine, but
>> it would be good if someone not involved in the patch takes a look.
> 
> With RMT hat on, just a friendly reminder that this is still on the open
> items list[1] -- we have the Beta 4 release next week and it would be
> great if we can get this resolved prior to it.
> 

I know. I'll fix this by tweaking one of the cost GUCs mentioned by David.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: POC: GROUP BY optimization

От
"Jonathan S. Katz"
Дата:
On 9/1/22 9:06 AM, Tomas Vondra wrote:
> 
> 
> On 8/30/22 15:15, Jonathan S. Katz wrote:
>> On 8/18/22 3:29 AM, Tomas Vondra wrote:
>>>
>>>
>>> On 8/18/22 03:32, David Rowley wrote:
>>
>>>> Here are a couple of patches to demo the idea.
>>>>
>>>
>>> Yeah, that's an option too. I should have mentioned it along with the
>>> cpu_operator_cost.
>>>
>>> BTW would you mind taking a look at the costing? I think it's fine, but
>>> it would be good if someone not involved in the patch takes a look.
>>
>> With RMT hat on, just a friendly reminder that this is still on the open
>> items list[1] -- we have the Beta 4 release next week and it would be
>> great if we can get this resolved prior to it.
>>
> 
> I know. I'll fix this by tweaking one of the cost gucs, mentioned by David.

Great -- thanks for the update!

Jonathan

Вложения

Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On 9/1/22 16:05, Jonathan S. Katz wrote:
> On 9/1/22 9:06 AM, Tomas Vondra wrote:
>>
>>
>> On 8/30/22 15:15, Jonathan S. Katz wrote:
>>> On 8/18/22 3:29 AM, Tomas Vondra wrote:
>>>>
>>>>
>>>> On 8/18/22 03:32, David Rowley wrote:
>>>
>>>>> Here are a couple of patches to demo the idea.
>>>>>
>>>>
>>>> Yeah, that's an option too. I should have mentioned it along with the
>>>> cpu_operator_cost.
>>>>
>>>> BTW would you mind taking a look at the costing? I think it's fine, but
>>>> it would be good if someone not involved in the patch takes a look.
>>>
>>> With RMT hat on, just a friendly reminder that this is still on the open
>>> items list[1] -- we have the Beta 4 release next week and it would be
>>> great if we can get this resolved prior to it.
>>>
>>
>> I know. I'll fix this by tweaking one of the cost gucs, mentioned by
>> David.
> 
> Great -- thanks for the update!
> 

I've pushed the fix to 15+master. In the end I just used David's patches
that set parallel_setup_cost to 0.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: POC: GROUP BY optimization

От
"Jonathan S. Katz"
Дата:
On 9/4/22 6:14 PM, Tomas Vondra wrote:

> I've pushed the fix to 15+master. In the end I just used David's patches
> that set parallel_setup_cost to 0.

Thanks! I have closed the open item.

Jonathan

Вложения

Re: POC: GROUP BY optimization

От
Michael Paquier
Дата:
On Mon, Sep 05, 2022 at 12:14:48AM +0200, Tomas Vondra wrote:
> I've pushed the fix to 15+master. In the end I just used David's patches
> that set parallel_setup_cost to 0.

Thanks, Tomas!
--
Michael

Вложения

Re: POC: GROUP BY optimization

От
David Rowley
Дата:
On Wed, 13 Jul 2022 at 15:37, David Rowley <dgrowleyml@gmail.com> wrote:
> I'm just in this general area of the code again today and wondered
> about the header comment for the preprocess_groupclause() function.
>
> It says:
>
>  * In principle it might be interesting to consider other orderings of the
>  * GROUP BY elements, which could match the sort ordering of other
>  * possible plans (eg an indexscan) and thereby reduce cost.  We don't
>  * bother with that, though.  Hashed grouping will frequently win anyway.
>
> I'd say this commit makes that paragraph mostly obsolete.  It's only
> true now in the sense that we don't try orders that suit some index
> that would provide pre-sorted results for a GroupAggregate path.  The
> comment leads me to believe that we don't do anything at all to find a
> better order, and that's not true now.

I've just pushed a fix for this.

David



Re: POC: GROUP BY optimization

От
Andrey Lepikhov
Дата:
On 3/10/2022 21:56, Tom Lane wrote:
 > Revert "Optimize order of GROUP BY keys".
 >
 > This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and
 > several follow-on fixes.
 > ...
 > Since we're hard up against the release deadline for v15, let's
 > revert these changes for now.  We can always try again later.

It may be time to restart the project. As a first step, I rebased the 
patch on the current master. It wasn't trivial because of some recent 
optimizations (a29eab, 1349d27 and 8d83a5d).
Now, let's repeat the review and rewrite the current patch according to 
the reasons given in the revert commit.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On 7/20/23 08:37, Andrey Lepikhov wrote:
> On 3/10/2022 21:56, Tom Lane wrote:
>> Revert "Optimize order of GROUP BY keys".
>>
>> This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and
>> several follow-on fixes.
>> ...
>> Since we're hard up against the release deadline for v15, let's
>> revert these changes for now.  We can always try again later.
> 
> It may be time to restart the project. As a first step, I rebased the
> patch on the current master. It wasn't trivial because of some latest
> optimizations (a29eab, 1349d27 and 8d83a5d).
> Now, Let's repeat the review and rewrite the current path according to
> the reasons uttered in the revert commit.

I think the fundamental task is to make the costing more reliable, and
the commit message 443df6e2db points out a couple challenges in this
area. Not sure how feasible it is to address enough of them ...

1) procost = 1.0 - I guess we could make this more realistic by doing
some microbenchmarks and tuning the costs for the most expensive cases.

2) estimating quicksort comparisons - This relies on ndistinct
estimates, and I'm not sure how much more reliable we can make those.
Probably not much :-( Not sure what to do about this, the only thing I
can think of is to track "reliability" of the estimates and only do the
reordering if we have high confidence in the estimates. That means we'll
miss some optimization opportunities, but it should limit the risk.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: POC: GROUP BY optimization

От
Andrey Lepikhov
Дата:
On 20/7/2023 18:46, Tomas Vondra wrote:
> On 7/20/23 08:37, Andrey Lepikhov wrote:
>> On 3/10/2022 21:56, Tom Lane wrote:
>>> Revert "Optimize order of GROUP BY keys".
>>>
>>> This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and
>>> several follow-on fixes.
>>> ...
>>> Since we're hard up against the release deadline for v15, let's
>>> revert these changes for now.  We can always try again later.
>>
>> It may be time to restart the project. As a first step, I rebased the
>> patch on the current master. It wasn't trivial because of some latest
>> optimizations (a29eab, 1349d27 and 8d83a5d).
>> Now, Let's repeat the review and rewrite the current path according to
>> the reasons uttered in the revert commit.
> 
> I think the fundamental task is to make the costing more reliable, and
> the commit message 443df6e2db points out a couple challenges in this
> area. Not sure how feasible it is to address enough of them ...
> 
> 1) procost = 1.0 - I guess we could make this more realistic by doing
> some microbenchmarks and tuning the costs for the most expensive cases.
> 
> 2) estimating quicksort comparisons - This relies on ndistinct
> estimates, and I'm not sure how much more reliable we can make those.
> Probably not much :-( Not sure what to do about this, the only thing I
> can think of is to track "reliability" of the estimates and only do the
> reordering if we have high confidence in the estimates. That means we'll
> miss some optimization opportunities, but it should limit the risk.
For me personally, the most challenging issue is:
3. Imbalance, induced by the cost_sort() changes. It may increase or 
decrease the contribution of sorting to the total cost compared to other 
factors and change choice of sorted paths.

-- 
regards,
Andrey Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

От
Andrey Lepikhov
Дата:
On 20/7/2023 18:46, Tomas Vondra wrote:
> On 7/20/23 08:37, Andrey Lepikhov wrote:
>> On 3/10/2022 21:56, Tom Lane wrote:
>>> Revert "Optimize order of GROUP BY keys".
>>>
>>> This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and
>>> several follow-on fixes.
>>> ...
>>> Since we're hard up against the release deadline for v15, let's
>>> revert these changes for now.  We can always try again later.
>>
>> It may be time to restart the project. As a first step, I rebased the
>> patch on the current master. It wasn't trivial because of some latest
>> optimizations (a29eab, 1349d27 and 8d83a5d).
>> Now, Let's repeat the review and rewrite the current path according to
>> the reasons uttered in the revert commit.
> 
> I think the fundamental task is to make the costing more reliable, and
> the commit message 443df6e2db points out a couple challenges in this
> area. Not sure how feasible it is to address enough of them ...
> 
> 1) procost = 1.0 - I guess we could make this more realistic by doing
> some microbenchmarks and tuning the costs for the most expensive cases.
> 
> 2) estimating quicksort comparisons - This relies on ndistinct
> estimates, and I'm not sure how much more reliable we can make those.
> Probably not much :-( Not sure what to do about this, the only thing I
> can think of is to track "reliability" of the estimates and only do the
> reordering if we have high confidence in the estimates. That means we'll
> miss some optimization opportunities, but it should limit the risk.
I read up on the history of this thread.
As I see it, all the problems mentioned above can be avoided by dropping 
the new cost model entirely. We can sort GROUP BY columns according to 
their 'ndistinct' values.
I see the reason for introducing the cost model in [1]. The main 
supporting point there is that with this patch, people could no longer 
optimize the query by themselves by arranging the order of the columns in 
a more optimal way. But now we have at least a GUC to switch off the 
behaviour introduced here. Also, some extensions, like the well-known 
pg_hint_plan, can help with automation.
So, how about committing the undoubted part of the feature and working 
on the cost model in a new thread?

[1] 
https://www.postgresql.org/message-id/6d1e0cdb-dde3-f62a-43e2-e90bbd9b0f42%402ndquadrant.com

-- 
regards,
Andrey Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:
On 7/24/23 04:10, Andrey Lepikhov wrote:
> On 20/7/2023 18:46, Tomas Vondra wrote:
>> On 7/20/23 08:37, Andrey Lepikhov wrote:
>>> On 3/10/2022 21:56, Tom Lane wrote:
>>>> Revert "Optimize order of GROUP BY keys".
>>>>
>>>> This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and
>>>> several follow-on fixes.
>>>> ...
>>>> Since we're hard up against the release deadline for v15, let's
>>>> revert these changes for now.  We can always try again later.
>>>
>>> It may be time to restart the project. As a first step, I rebased the
>>> patch on the current master. It wasn't trivial because of some latest
>>> optimizations (a29eab, 1349d27 and 8d83a5d).
>>> Now, Let's repeat the review and rewrite the current path according to
>>> the reasons uttered in the revert commit.
>>
>> I think the fundamental task is to make the costing more reliable, and
>> the commit message 443df6e2db points out a couple challenges in this
>> area. Not sure how feasible it is to address enough of them ...
>>
>> 1) procost = 1.0 - I guess we could make this more realistic by doing
>> some microbenchmarks and tuning the costs for the most expensive cases.
>>
>> 2) estimating quicksort comparisons - This relies on ndistinct
>> estimates, and I'm not sure how much more reliable we can make those.
>> Probably not much :-( Not sure what to do about this, the only thing I
>> can think of is to track "reliability" of the estimates and only do the
>> reordering if we have high confidence in the estimates. That means we'll
>> miss some optimization opportunities, but it should limit the risk.
> I read up on the history of this thread.
> As I see, all the problems mentioned above can be beaten by excluding
> the new cost model at all. We can sort GROUP BY columns according to the
> 'ndistinct' value.
> I see the reason for introducing the cost model in [1]. The main
> supporting point here is that with this patch, people couldn't optimize
> the query by themselves, organizing the order of the columns in a more
> optimal way. But now we have at least the GUC to switch off the
> behaviour introduced here. Also, some extensions, like the well-known
> pg_hint_plan, can help with automation.

I think the main concern is that if we reorder the group keys and get it
wrong, it's a regression. If that happens often (due to costing based on
poor stats), it's a problem. Yes, there's a GUC, but that's a rather
blunt instrument, unfortunately.

> So, how about committing of the undoubted part of the feature and
> working on the cost model in a new thread?
> 

But Tom's commit message says this:

    Worse, to arrive at estimates of the number of calls made to the
    lower-order-column comparison functions, the code needs to make
    estimates of the numbers of distinct values of multiple columns,
    which are necessarily even less trustworthy than per-column stats.

so I'm not sure this really counts as "undoubted".


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: POC: GROUP BY optimization

От
Andrey Lepikhov
Дата:
On 24/7/2023 16:56, Tomas Vondra wrote:
> On 7/24/23 04:10, Andrey Lepikhov wrote:
>> On 20/7/2023 18:46, Tomas Vondra wrote:
>>> On 7/20/23 08:37, Andrey Lepikhov wrote:
>>>> On 3/10/2022 21:56, Tom Lane wrote:
>>>>> Revert "Optimize order of GROUP BY keys".
>>>>>
>>>>> This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and
>>>>> several follow-on fixes.
>>>>> ...
>>>>> Since we're hard up against the release deadline for v15, let's
>>>>> revert these changes for now.  We can always try again later.
>>>>
>>>> It may be time to restart the project. As a first step, I rebased the
>>>> patch on the current master. It wasn't trivial because of some latest
>>>> optimizations (a29eab, 1349d27 and 8d83a5d).
>>>> Now, Let's repeat the review and rewrite the current path according to
>>>> the reasons uttered in the revert commit.
>>>
>>> I think the fundamental task is to make the costing more reliable, and
>>> the commit message 443df6e2db points out a couple challenges in this
>>> area. Not sure how feasible it is to address enough of them ...
>>>
>>> 1) procost = 1.0 - I guess we could make this more realistic by doing
>>> some microbenchmarks and tuning the costs for the most expensive cases.
>>>
>>> 2) estimating quicksort comparisons - This relies on ndistinct
>>> estimates, and I'm not sure how much more reliable we can make those.
>>> Probably not much :-( Not sure what to do about this, the only thing I
>>> can think of is to track "reliability" of the estimates and only do the
>>> reordering if we have high confidence in the estimates. That means we'll
>>> miss some optimization opportunities, but it should limit the risk.
>> I read up on the history of this thread.
>> As I see, all the problems mentioned above can be beaten by excluding
>> the new cost model at all. We can sort GROUP BY columns according to the
>> 'ndistinct' value.
>> I see the reason for introducing the cost model in [1]. The main
>> supporting point here is that with this patch, people couldn't optimize
>> the query by themselves, organizing the order of the columns in a more
>> optimal way. But now we have at least the GUC to switch off the
>> behaviour introduced here. Also, some extensions, like the well-known
>> pg_hint_plan, can help with automation.
> 
> I think the main concern is that if we reorder the group keys and get it
> wrong, it's a regression. If that happens often (due to costing based on
> poor stats), it's a problem. Yes, there's a GUC, but that's a rather
> blunt instrument, unfortunately.
I see. My point here is that if the ndistinct of one column is much higher 
than the ndistinct of another, it is more probable that this relationship 
will hold in the grouping phase. We can get some regressions here, but 
they can be outweighed by the possibility below.
> 
>> So, how about committing of the undoubted part of the feature and
>> working on the cost model in a new thread?
>>
> 
> But Tom's commit message says this:
> 
>      Worse, to arrive at estimates of the number of calls made to the
>      lower-order-column comparison functions, the code needs to make
>      estimates of the numbers of distinct values of multiple columns,
>      which are necessarily even less trustworthy than per-column stats.
> 
> so I'm not sure this really counts as "undoubted".
Don't try to estimate multiple columns - just sort the columns according to 
their ndistinct values as a heuristic.
I think we should estimate the number of distinct values of multiple 
columns only if we have extended statistics on those columns. This could 
also extend the applicability of extended statistics.

The suggestion above is just an attempt to gather the low-hanging fruit 
right now. If it is not applicable, we will take the long road.

-- 
regards,
Andrey Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

От
Tomas Vondra
Дата:

On 7/24/23 14:04, Andrey Lepikhov wrote:
> On 24/7/2023 16:56, Tomas Vondra wrote:
>> On 7/24/23 04:10, Andrey Lepikhov wrote:
>>> On 20/7/2023 18:46, Tomas Vondra wrote:
>>>> On 7/20/23 08:37, Andrey Lepikhov wrote:
>>>>> On 3/10/2022 21:56, Tom Lane wrote:
>>>>>> Revert "Optimize order of GROUP BY keys".
>>>>>>
>>>>>> This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and
>>>>>> several follow-on fixes.
>>>>>> ...
>>>>>> Since we're hard up against the release deadline for v15, let's
>>>>>> revert these changes for now.  We can always try again later.
>>>>>
>>>>> It may be time to restart the project. As a first step, I rebased the
>>>>> patch on the current master. It wasn't trivial because of some latest
>>>>> optimizations (a29eab, 1349d27 and 8d83a5d).
>>>>> Now, Let's repeat the review and rewrite the current path according to
>>>>> the reasons uttered in the revert commit.
>>>>
>>>> I think the fundamental task is to make the costing more reliable, and
>>>> the commit message 443df6e2db points out a couple challenges in this
>>>> area. Not sure how feasible it is to address enough of them ...
>>>>
>>>> 1) procost = 1.0 - I guess we could make this more realistic by doing
>>>> some microbenchmarks and tuning the costs for the most expensive cases.
>>>>
>>>> 2) estimating quicksort comparisons - This relies on ndistinct
>>>> estimates, and I'm not sure how much more reliable we can make those.
>>>> Probably not much :-( Not sure what to do about this, the only thing I
>>>> can think of is to track "reliability" of the estimates and only do the
>>>> reordering if we have high confidence in the estimates. That means
>>>> we'll
>>>> miss some optimization opportunities, but it should limit the risk.
>>> I read up on the history of this thread.
>>> As I see, all the problems mentioned above can be beaten by excluding
>>> the new cost model at all. We can sort GROUP BY columns according to the
>>> 'ndistinct' value.
>>> I see the reason for introducing the cost model in [1]. The main
>>> supporting point here is that with this patch, people couldn't optimize
>>> the query by themselves, organizing the order of the columns in a more
>>> optimal way. But now we have at least the GUC to switch off the
>>> behaviour introduced here. Also, some extensions, like the well-known
>>> pg_hint_plan, can help with automation.
>>
>> I think the main concern is that if we reorder the group keys and get it
>> wrong, it's a regression. If that happens often (due to costing based on
>> poor stats), it's a problem. Yes, there's a GUC, but that's a rather
>> blunt instrument, unfortunately.
> I see. My point here is if the ndistinct of one column is much more than
> the ndistinct of another, it is more probable that this correlation will
> be kept in the grouping phase. Here we can get some regression, which
> can be overweighed by the possibility below.

I think the word "probable" is exactly what the problem is. Because what 
if it isn't? Also, we don't actually need ndistinct for individual 
attributes, but for the groups of attributes. Imagine you do

GROUP BY a, b

so you need to estimate whether to do (a,b) or (b,a). You need to calculate

  ndistinct(a,b) / ndistinct(a)

and

  ndistinct(b,a) / ndistinct(b)

And similarly for more grouping keys. This is why we have ndistinct
extended statistics, mostly.
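
For example (a minimal sketch, assuming a hypothetical table t with 
grouping columns a and b), that is exactly what ndistinct extended 
statistics provide:

CREATE STATISTICS t_a_b_ndist (ndistinct) ON a, b FROM t;
ANALYZE t;

With these in place, estimate_num_groups() can use the ndistinct(a,b) 
estimate directly instead of multiplying the per-column estimates.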

>>
>>> So, how about committing of the undoubted part of the feature and
>>> working on the cost model in a new thread?
>>>
>>
>> But Tom's commit message says this:
>>
>>      Worse, to arrive at estimates of the number of calls made to the
>>      lower-order-column comparison functions, the code needs to make
>>      estimates of the numbers of distinct values of multiple columns,
>>      which are necessarily even less trustworthy than per-column stats.
>>
>> so I'm not sure this really counts as "undoubted".
> Don't try to estimate multiple  columns - just sort columns according to
> the value of ndistinct as a heuristic.

Surely you realize how easily such simple heuristics can fail, right?

> I think we should estimate the number of values of multiple columns only
> if we have extended statistics on these columns. And this can extend the
> applicability of extended statistics.
> 

I don't see why this should estimate ndistinct differently than every
other place. That'd certainly be surprising. I can be convinced, but
there needs to be sound justification why that's the right way to do
(i.e. why it's less likely to cause poor planning choices).

> The suggestion above is just an attempt to gather low-hanging fruit
> right now. If it is not applicable, we will go a long way.
> 
I'm not sure this qualifies as "low hanging fruit" - that generally
means simple changes with little risk. But you're using it for a rather
simplistic heuristic that can easily misfire (I think). At least that's
my impression, I can be convinced otherwise.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: POC: GROUP BY optimization

От
Andrey Lepikhov
Дата:
Hi,
Here is the patch rebased on the current master. Also, I fixed some 
minor slips and one static analyzer warning.
This is just for adding it to the next commitfest and keeping the work 
on this patch moving.

One extra difference in the newly added postgres_fdw tests is caused by 
this patch - see the changes in the query plan in the attachment.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
Andrey Lepikhov
Дата:
On 20/7/2023 18:46, Tomas Vondra wrote:
> On 7/20/23 08:37, Andrey Lepikhov wrote:
>> On 3/10/2022 21:56, Tom Lane wrote:
>>> Revert "Optimize order of GROUP BY keys".
>>>
>>> This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and
>>> several follow-on fixes.
>>> ...
>>> Since we're hard up against the release deadline for v15, let's
>>> revert these changes for now.  We can always try again later.
>>
>> It may be time to restart the project. As a first step, I rebased the
>> patch on the current master. It wasn't trivial because of some latest
>> optimizations (a29eab, 1349d27 and 8d83a5d).
>> Now, Let's repeat the review and rewrite the current path according to
>> the reasons uttered in the revert commit.
> 1) procost = 1.0 - I guess we could make this more realistic by doing
> some microbenchmarks and tuning the costs for the most expensive cases.
Ok, some thoughts on this part of the task. As I see it, we don't have 
that many different operators: 26 with fixed width and 16 with variable 
width:

SELECT o.oid,oprcode,typname,typlen FROM pg_operator o
   JOIN pg_type t ON (oprleft = t.oid)
WHERE (oprname='<') AND oprleft=oprright AND typlen>0
ORDER BY o.oid;

SELECT o.oid,oprcode,typname,typlen FROM pg_operator o
   JOIN pg_type t ON (oprleft = t.oid)
WHERE (oprname='<') AND oprleft=oprright AND typlen<0
ORDER BY o.oid;

A benchmarking procedure for fixed-length types could look something 
like the one below:

CREATE OR REPLACE FUNCTION pass_sort(typ regtype) RETURNS TABLE (
     nrows integer,
     exec_time float
)  AS $$
DECLARE
   data json;
   step integer;
BEGIN
   SET work_mem='1GB';

   FOR step IN 0..3 LOOP
     SELECT pow(100, step)::integer INTO nrows;
     DROP TABLE IF EXISTS test CASCADE;
     EXECUTE format('CREATE TABLE test AS SELECT gs::%s AS x
                     FROM generate_series(1,%s) AS gs;', typ, nrows);

     EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF, FORMAT JSON)
       SELECT * FROM test ORDER BY (x) DESC INTO data;
     SELECT data->0->'Execution Time' INTO exec_time;
     RETURN NEXT;
   END LOOP;
END;
$$ LANGUAGE plpgsql VOLATILE;

Executing SELECT * FROM pass_sort('integer'); shows quite linear growth 
of the execution time. So, using '2.0 * cpu_operator_cost' as the cost 
for the integer type (as a baseline), we can calculate costs for other 
operators. Variable-width types, I think, could require a more complex 
technique to account for the dependency on value length.

Does this way look worthwhile?
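
As a usage sketch (assuming the pass_sort() function above; this query 
is only an illustration, not part of any patch), relative cost factors 
could be derived by comparing each type against the integer baseline:

SELECT t.nrows, t.exec_time / i.exec_time AS relative_cost
FROM pass_sort('numeric') AS t
JOIN pass_sort('integer') AS i USING (nrows);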

-- 
regards,
Andrey Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

От
Andrey Lepikhov
Дата:
On 20/7/2023 18:46, Tomas Vondra wrote:
> 2) estimating quicksort comparisons - This relies on ndistinct
> estimates, and I'm not sure how much more reliable we can make those.
> Probably not much :-( Not sure what to do about this, the only thing I
> can think of is to track "reliability" of the estimates and only do the
> reordering if we have high confidence in the estimates. That means we'll
> miss some optimization opportunities, but it should limit the risk.
Regarding this issue, I see two options:
1. Go through the grouping column list and find the most reliable one. 
If we have a unique column, or a column whose estimated number of 
distinct values is significantly higher than the ndistinct of the other 
grouping columns, we can place this column first in the grouping. 
That should guarantee the reliability of such a decision, shouldn't it?
2. If we have extended statistics on distinct values and these 
statistics cover some leading set of columns in the grouping list, we 
can optimize the order of those positions. That also looks reliable.

Any thoughts?

-- 
regards,
Andrey Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
New version of the patch. Fixed minor inconsistencies and rebased onto 
current master.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
Alexander Korotkov
Дата:
Hi!

On Sun, Oct 1, 2023 at 11:45 AM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
>
> New version of the patch. Fixed minor inconsistencies and rebased onto
> current master.

Thank you (and the other authors) for working on this subject.  Indeed,
GROUP BY clauses are order-agnostic.  Reordering them into the most
suitable order could give significant query planning benefits.  I
went through the thread: I see significant work has already been done
on this patch, and the code is quite polished.

I'd like to make some notes.

1) As already mentioned, there is clearly a repetitive pattern for the
code following after get_useful_group_keys_orderings() calls.  I think
it would be good to extract it into a separate function.  Please, do
this as a separate patch coming before the group-by patch. That would
simplify the review.

2) I wonder what planning overhead this patch could introduce?  Could
you try to measure the worst case?  What if we have a table with a lot
of indexes and a long list of group-by clauses partially matching
every index?  This should give us an understanding of whether we need
a separate GUC to control this feature.

3) I see that get_useful_group_keys_orderings() makes 3 calls to
get_cheapest_group_keys_order() function.  Each time
get_cheapest_group_keys_order() performs the cost estimate and
reorders the free keys.  However, cost estimation implies the system
catalog lookups (that is quite expensive).  I wonder if we could
change the algorithm.  Could we just sort the group-by keys by cost
once, save this ordering and then just re-use it.  So, every time we
need to reorder a group by, we can just pull the required keys to the
top and use saved ordering for the rest.  I also wonder if we could do
this once for add_paths_to_grouping_rel() and
create_partial_grouping_paths() calls.  So, it probably should be
somewhere in create_ordinary_grouping_paths().

4) I think we can do some optimizations when enable_incremental_sort
== off.  Then in get_useful_group_keys_orderings() we should only deal
with input_path fully matching the group-by clause, and try only full
match of group-by output to the required order.

------
Regards,
Alexander Korotkov



Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
On 21/12/2023 17:53, Alexander Korotkov wrote:
> On Sun, Oct 1, 2023 at 11:45 AM Andrei Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> New version of the patch. Fixed minor inconsistencies and rebased onto
>> current master.
> Thank you (and other authors) for working on this subject.  Indeed to
> GROUP BY clauses are order-agnostic.  Reordering them in the most
> suitable order could give up significant query planning benefits.  I
> went through the thread: I see significant work has been already made
> on this patch, the code is quite polished.
Maybe, but the issues mentioned in [1] are still not resolved. That is 
the only reason why this thread hasn't been active.
> I'd like to make some notes.
> 1) As already mentioned, there is clearly a repetitive pattern for the
> code following after get_useful_group_keys_orderings() calls.  I think
> it would be good to extract it into a separate function.  Please, do
> this as a separate patch coming before the group-by patch. That would
> simplify the review.
Yeah, these parts of the code are a bit different. I will try to factor 
out a common routine.
> 2) I wonder what planning overhead this patch could introduce?  Could
> you try to measure the worst case?  What if we have a table with a lot
> of indexes and a long list of group-by clauses partially patching
> every index.  This should give us an understanding on whether we need
> a separate GUC to control this feature.
Ok
> 3) I see that get_useful_group_keys_orderings() makes 3 calls to
> get_cheapest_group_keys_order() function.  Each time
> get_cheapest_group_keys_order() performs the cost estimate and
> reorders the free keys.  However, cost estimation implies the system
> catalog lookups (that is quite expensive).  I wonder if we could
> change the algorithm.  Could we just sort the group-by keys by cost
> once, save this ordering and then just re-use it.  So, every time we
> need to reorder a group by, we can just pull the required keys to the
> top and use saved ordering for the rest.  I also wonder if we could do
> this once for add_paths_to_grouping_rel() and
> create_partial_grouping_paths() calls.  So, it probably should be
> somewhere in create_ordinary_grouping_paths().
Thanks for the idea!
> 4) I think we can do some optimizations when enable_incremental_sort
> == off.  Then in get_useful_group_keys_orderings() we should only deal
> with input_path fully matching the group-by clause, and try only full
> match of group-by output to the required order.
Oh, this was designed before incremental sort was invented. We will see 
what we can do here.

[1] 
https://www.postgresql.org/message-id/60610df1-c32f-ebdf-e58c-7a664431f452%40enterprisedb.com

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

От
Alexander Korotkov
Дата:
On Tue, Dec 26, 2023 at 1:37 PM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 21/12/2023 17:53, Alexander Korotkov wrote:
> > On Sun, Oct 1, 2023 at 11:45 AM Andrei Lepikhov
> > <a.lepikhov@postgrespro.ru> wrote:
> >> New version of the patch. Fixed minor inconsistencies and rebased onto
> >> current master.
> > Thank you (and other authors) for working on this subject.  Indeed to
> > GROUP BY clauses are order-agnostic.  Reordering them in the most
> > suitable order could give up significant query planning benefits.  I
> > went through the thread: I see significant work has been already made
> > on this patch, the code is quite polished.
> Maybe, but issues, mentioned in [1], still not resolved. It is the only
> reason, why this thread hasn't been active.

Yes, this makes sense.  I have a couple of points from me on this subject.
1) The patch reorders GROUP BY items not only to make comparison
cheaper but also to match the ordering of input paths and to match the
ORDER BY clause.  Thus, even if we leave aside for now sorting GROUP
BY items by their cost, the patch will remain valuable.
2) An accurate estimate of the sorting cost is quite a difficult task.
What if we make a simple rule of thumb that sorting integers and
floats is cheaper than sorting numerics and strings with collation C,
in turn, that is cheaper than sorting collation-aware strings
(probably more groups)?  Within the group, we could keep the original
order of items.
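
For illustration, such a rule of thumb could look roughly like this (a 
hypothetical sketch, not code from any patch; the tier numbers are 
arbitrary):

/* bucket sort operators into rough cost tiers, cheapest first */
static int
sort_cost_tier(Oid typid, Oid collid)
{
    switch (typid)
    {
        case INT2OID:
        case INT4OID:
        case INT8OID:
        case FLOAT4OID:
        case FLOAT8OID:
            return 0;       /* plain binary comparison */
        case NUMERICOID:
            return 1;       /* more expensive, but collation-independent */
        case TEXTOID:
        case VARCHAROID:
            /* strings with the "C" collation sort like tier 1 */
            return (collid == C_COLLATION_OID) ? 1 : 2;
        default:
            return 1;
    }
}

Within the same tier, the original order of the GROUP BY items would be 
kept.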

------
Regards,
Alexander Korotkov



Re: POC: GROUP BY optimization

От
Tom Lane
Дата:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> 2) An accurate estimate of the sorting cost is quite a difficult task.

Indeed.

> What if we make a simple rule of thumb that sorting integers and
> floats is cheaper than sorting numerics and strings with collation C,
> in turn, that is cheaper than sorting collation-aware strings
> (probably more groups)?  Within the group, we could keep the original
> order of items.

I think it's a fool's errand to even try to separate different sort
column orderings by cost.  We simply do not have sufficiently accurate
cost information.  The previous patch in this thread got reverted because
of that (well, also some implementation issues, but mostly that), and
nothing has happened to make me think that another try will fare any
better.

            regards, tom lane



Re: POC: GROUP BY optimization

От
Alexander Korotkov
Дата:
On Wed, Dec 27, 2023 at 5:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
> > 2) An accurate estimate of the sorting cost is quite a difficult task.
>
> Indeed.
>
> > What if we make a simple rule of thumb that sorting integers and
> > floats is cheaper than sorting numerics and strings with collation C,
> > in turn, that is cheaper than sorting collation-aware strings
> > (probably more groups)?  Within the group, we could keep the original
> > order of items.
>
> I think it's a fool's errand to even try to separate different sort
> column orderings by cost.  We simply do not have sufficiently accurate
> cost information.  The previous patch in this thread got reverted because
> of that (well, also some implementation issues, but mostly that), and
> nothing has happened to make me think that another try will fare any
> better.

If there is a choice of what to compare first: 8-byte integers or
collation-aware texts, possibly toasted, then the answer is pretty
evident to me.  For sure, there are cases when this choice is wrong.
But even if all the integers turn out to be the same, the penalty isn't
that big.

Besides sorting column orderings by cost, this patch also tries to
match GROUP BY pathkeys to input pathkeys and ORDER BY pathkeys.  Do
you think there is a chance for the second part if we leave the cost
part aside?

------
Regards,
Alexander Korotkov



Re: POC: GROUP BY optimization

От
Tom Lane
Дата:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> On Wed, Dec 27, 2023 at 5:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think it's a fool's errand to even try to separate different sort
>> column orderings by cost.

> Besides sorting column orderings by cost, this patch also tries to
> match GROUP BY pathkeys to input pathkeys and ORDER BY pathkeys.  Do
> you think there is a chance for the second part if we leave the cost
> part aside?

I think it's definitely reasonable to try to match up available
orderings, because that doesn't really require fine distinctions
of cost: either it matches or it doesn't.  Eliminating a sort step
entirely is clearly a win.  (Incremental sort complicates this though.
I doubt our cost model for incremental sorts is any good either, so
I am not eager to rely on that more heavily.)

            regards, tom lane



Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
On 27/12/2023 11:15, Alexander Korotkov wrote:
> On Wed, Dec 27, 2023 at 5:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Alexander Korotkov <aekorotkov@gmail.com> writes:
>>> 2) An accurate estimate of the sorting cost is quite a difficult task.
>>
>> Indeed.
>>
>>> What if we make a simple rule of thumb that sorting integers and
>>> floats is cheaper than sorting numerics and strings with collation C,
>>> in turn, that is cheaper than sorting collation-aware strings
>>> (probably more groups)?  Within the group, we could keep the original
>>> order of items.
>>
>> I think it's a fool's errand to even try to separate different sort
>> column orderings by cost.  We simply do not have sufficiently accurate
>> cost information.  The previous patch in this thread got reverted because
>> of that (well, also some implementation issues, but mostly that), and
>> nothing has happened to make me think that another try will fare any
>> better.
To be clear: in [1], I mentioned that we can perform micro-benchmarks 
and structure the costs of operators. At least for fixed-length 
operators, it is relatively easy. So, the main blocker here is an 
accurate prediction of ndistinct for different combinations of columns. 
Does it make sense to keep designing the feature in the direction of 
choosing between different sort column orderings only when we have 
extended statistics on the columns?

[1] 
https://www.postgresql.org/message-id/e3602ccb-e643-2e79-ed2c-1175a80533a1@postgrespro.ru

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

От
Tom Lane
Дата:
Andrei Lepikhov <a.lepikhov@postgrespro.ru> writes:
> To be clear. In [1], I mentioned we can perform micro-benchmarks and 
> structure costs of operators. At least for fixed-length operators, it is 
> relatively easy.

I repeat what I said: this is a fool's errand.  You will not get
trustworthy results even for the cases you measured, let alone
all the rest.  I'd go as far as to say I would not believe your
microbenchmarks, because they would only apply for one platform,
compiler, backend build, phase of the moon, etc.

            regards, tom lane



Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
On 27/12/2023 12:07, Tom Lane wrote:
> Andrei Lepikhov <a.lepikhov@postgrespro.ru> writes:
>> To be clear. In [1], I mentioned we can perform micro-benchmarks and
>> structure costs of operators. At least for fixed-length operators, it is
>> relatively easy.
> 
> I repeat what I said: this is a fool's errand.  You will not get
> trustworthy results even for the cases you measured, let alone
> all the rest.  I'd go as far as to say I would not believe your
> microbenchmarks, because they would only apply for one platform,
> compiler, backend build, phase of the moon, etc.

Thanks for the explanation.
I removed all the cost-related code. It still needs to be finished; I 
will smooth the code further and rewrite the regression tests - many of 
them look silly without cost-dependent reorderings. There are also 
Alexander's remarks, which must be implemented, too.
But even as it is, it works well. Look:

Preliminaries:
CREATE TABLE t(x int, y int, z text, w int);
INSERT INTO t SELECT gs%100,gs%100, 'abc' || gs%10, gs
   FROM generate_series(1,10000) AS gs;
CREATE INDEX abc ON t(x,y);
ANALYZE t;
SET enable_hashagg = 'off';

This patch eliminates an unneeded Sort operation:
explain SELECT x,y FROM t GROUP BY (x,y);
explain SELECT x,y FROM t GROUP BY (y,x);

Engages incremental sort:
explain SELECT x,y FROM t GROUP BY (x,y,z,w);
explain SELECT x,y FROM t GROUP BY (z,y,w,x);
explain SELECT x,y FROM t GROUP BY (w,z,x,y);
explain SELECT x,y FROM t GROUP BY (w,x,z,y);

Works with subqueries:
explain SELECT x,y
FROM (SELECT * FROM t ORDER BY x,y,w,z) AS q1
GROUP BY (w,x,z,y);
explain SELECT x,y
FROM (SELECT * FROM t ORDER BY x,y,w,z LIMIT 100) AS q1
GROUP BY (w,x,z,y);

But arrangement with an ORDER BY clause doesn't work:

DROP INDEX abc;
explain SELECT x,w,z FROM t GROUP BY (w,x,z) ORDER BY (x,z,w);

I think the reason is that the sort_pathkeys and group_pathkeys are 
physically different structures, and we can't just compare pointers here.

-- 
regards,
Andrei Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
Alexander Korotkov
Дата:
On Thu, Dec 28, 2023 at 10:22 AM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> But arrangement with an ORDER BY clause doesn't work:
>
> DROP INDEX abc;
> explain SELECT x,w,z FROM t GROUP BY (w,x,z) ORDER BY (x,z,w);
>
> I think the reason is that the sort_pathkeys and group_pathkeys are
> physically different structures, and we can't just compare pointers here.

I haven't yet looked into the code.  But this looks strange to me.
Somehow, optimizer currently matches index pathkeys to ORDER BY
pathkeys.  If GROUP BY pathkeys could be matched to index pathkeys,
then it should be possible to match them to ORDER BY pathkeys too.


------
Regards,
Alexander Korotkov



Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
On 28/12/2023 18:29, Alexander Korotkov wrote:
> On Thu, Dec 28, 2023 at 10:22 AM Andrei Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> But arrangement with an ORDER BY clause doesn't work:
>>
>> DROP INDEX abc;
>> explain SELECT x,w,z FROM t GROUP BY (w,x,z) ORDER BY (x,z,w);
>>
>> I think the reason is that the sort_pathkeys and group_pathkeys are
>> physically different structures, and we can't just compare pointers here.
> 
> I haven't yet looked into the code.  But this looks strange to me.
> Somehow, optimizer currently matches index pathkeys to ORDER BY
> pathkeys.  If GROUP BY pathkeys could be matched to index pathkeys,
> then it should be possible to match them to ORDER BY pathkeys too.
Oh, I found the mistake: I am used to writing GROUP BY and ORDER BY on 
many columns with round brackets. In the case of the grouping list, it 
doesn't change anything. But ordering treats it as a WholeRowVar and 
breaks the group-by arrangement. Look:
explain (COSTS OFF) SELECT relname,reltuples FROM pg_class
GROUP BY relname,reltuples ORDER BY reltuples,relname;

  Group
    Group Key: reltuples, relname
    ->  Sort
          Sort Key: reltuples, relname
          ->  Seq Scan on pg_class
But:
explain (COSTS OFF) SELECT relname,reltuples FROM pg_class
GROUP BY relname,reltuples ORDER BY (reltuples,relname);

  Sort
    Sort Key: (ROW(reltuples, relname))
    ->  Group
          Group Key: relname, reltuples
          ->  Sort
                Sort Key: relname, reltuples
                ->  Seq Scan on pg_class

So, let's continue to work.

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
Here is a new version of GROUP-BY optimization without sort model.

On 21/12/2023 17:53, Alexander Korotkov wrote:
> I'd like to make some notes.
> 
> 1) As already mentioned, there is clearly a repetitive pattern for the
> code following after get_useful_group_keys_orderings() calls.  I think
> it would be good to extract it into a separate function.  Please, do
> this as a separate patch coming before the group-by patch. That would
> simplify the review.
Done. See patch 0001-*. Unfortunately, extracting the whole loop isn't 
practical, because it bloats the interface of the routine.

> 2) I wonder what planning overhead this patch could introduce?  Could
> you try to measure the worst case?  What if we have a table with a lot
> of indexes and a long list of group-by clauses partially patching
> every index.  This should give us an understanding on whether we need
> a separate GUC to control this feature.
In the current implementation I don't anticipate any significant 
overhead. The GUC is needed here to allow users to keep their own 
ordering and to disable the feature in case of problems.

> 4) I think we can do some optimizations when enable_incremental_sort
> == off.  Then in get_useful_group_keys_orderings() we should only deal
> with input_path fully matching the group-by clause, and try only full
> match of group-by output to the required order.
Hm, does it really make sense in the current implementation?

-- 
regards,
Andrei Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
vignesh C
Дата:
On Tue, 9 Jan 2024 at 14:31, Andrei Lepikhov <a.lepikhov@postgrespro.ru> wrote:
>
> Here is a new version of GROUP-BY optimization without sort model.
>
> On 21/12/2023 17:53, Alexander Korotkov wrote:
> > I'd like to make some notes.
> >
> > 1) As already mentioned, there is clearly a repetitive pattern for the
> > code following after get_useful_group_keys_orderings() calls.  I think
> > it would be good to extract it into a separate function.  Please, do
> > this as a separate patch coming before the group-by patch. That would
> > simplify the review.
> Done. See patch 0001-*. Unfortunately, extraction of whole cycle isn't
> practical, because it blows out the interface of the routine.
>
> > 2) I wonder what planning overhead this patch could introduce?  Could
> > you try to measure the worst case?  What if we have a table with a lot
> > of indexes and a long list of group-by clauses partially patching
> > every index.  This should give us an understanding on whether we need
> > a separate GUC to control this feature.
> In current implementation I don't anticipate any significant overhead.
> GUC is needed here to allow users adhere their own ordering and to
> disable feature in the case of problems.
>
> > 4) I think we can do some optimizations when enable_incremental_sort
> > == off.  Then in get_useful_group_keys_orderings() we should only deal
> > with input_path fully matching the group-by clause, and try only full
> > match of group-by output to the required order.
> Hm, is it really make sense in current implementation?

CFBot shows the following errors at [1] with:
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c: In function
‘estimate_num_groups’:
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3389:9: warning:
implicit declaration of function ‘estimate_num_groups_incremental’
[-Wimplicit-function-declaration]
[08:33:28.813] 3389 | return estimate_num_groups_incremental(root, groupExprs,
[08:33:28.813] | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c: At top level:
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3400:1: warning: no
previous prototype for ‘estimate_num_groups_incremental’
[-Wmissing-prototypes]
[08:33:28.813] 3400 | estimate_num_groups_incremental(PlannerInfo
*root, List *groupExprs,
[08:33:28.813] | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3400:1: error:
conflicting types for ‘estimate_num_groups_incremental’
[08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3389:9: note:
previous implicit declaration of ‘estimate_num_groups_incremental’ was
here
[08:33:28.813] 3389 | return estimate_num_groups_incremental(root, groupExprs,

[1] - https://cirrus-ci.com/task/5074942069309440

Regards,
Vignesh



Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
On 9/1/2024 16:45, vignesh C wrote:
> On Tue, 9 Jan 2024 at 14:31, Andrei Lepikhov <a.lepikhov@postgrespro.ru> wrote:
>>
>> Here is a new version of GROUP-BY optimization without sort model.
>>
>> On 21/12/2023 17:53, Alexander Korotkov wrote:
>>> I'd like to make some notes.
>>>
>>> 1) As already mentioned, there is clearly a repetitive pattern for the
>>> code following after get_useful_group_keys_orderings() calls.  I think
>>> it would be good to extract it into a separate function.  Please, do
>>> this as a separate patch coming before the group-by patch. That would
>>> simplify the review.
>> Done. See patch 0001-*. Unfortunately, extraction of whole cycle isn't
>> practical, because it blows out the interface of the routine.
>>
>>> 2) I wonder what planning overhead this patch could introduce?  Could
>>> you try to measure the worst case?  What if we have a table with a lot
>>> of indexes and a long list of group-by clauses partially patching
>>> every index.  This should give us an understanding on whether we need
>>> a separate GUC to control this feature.
>> In current implementation I don't anticipate any significant overhead.
>> GUC is needed here to allow users adhere their own ordering and to
>> disable feature in the case of problems.
>>
>>> 4) I think we can do some optimizations when enable_incremental_sort
>>> == off.  Then in get_useful_group_keys_orderings() we should only deal
>>> with input_path fully matching the group-by clause, and try only full
>>> match of group-by output to the required order.
>> Hm, is it really make sense in current implementation?
> 
> CFBot shows the following errors at [1] with:
> [08:33:28.813] ../src/backend/utils/adt/selfuncs.c: In function
> ‘estimate_num_groups’:
> [08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3389:9: warning:
> implicit declaration of function ‘estimate_num_groups_incremental’
> [-Wimplicit-function-declaration]
> [08:33:28.813] 3389 | return estimate_num_groups_incremental(root, groupExprs,
> [08:33:28.813] | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> [08:33:28.813] ../src/backend/utils/adt/selfuncs.c: At top level:
> [08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3400:1: warning: no
> previous prototype for ‘estimate_num_groups_incremental’
> [-Wmissing-prototypes]
> [08:33:28.813] 3400 | estimate_num_groups_incremental(PlannerInfo
> *root, List *groupExprs,
> [08:33:28.813] | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> [08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3400:1: error:
> conflicting types for ‘estimate_num_groups_incremental’
> [08:33:28.813] ../src/backend/utils/adt/selfuncs.c:3389:9: note:
> previous implicit declaration of ‘estimate_num_groups_incremental’ was
> here
> [08:33:28.813] 3389 | return estimate_num_groups_incremental(root, groupExprs,
Hmm, I don't see this old code in these patches. Resend 0002-* because 
of trailing spaces.

-- 
regards,
Andrei Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
Pavel Borisov
Дата:
Hi, Andrei! 
> Hmm, I don't see this old code in these patches. Resend 0002-* because
> of trailing spaces.

AFAIK, cfbot does not seek old versions of patchset parts in previous messages. So for it to run correctly, a new version of the whole patchset should be sent even if most patches are unchanged.

Regards,
Pavel Borisov

Re: POC: GROUP BY optimization

От
Alexander Korotkov
Дата:
Hi!

On Tue, Jan 9, 2024 at 1:14 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>> Hmm, I don't see this old code in these patches. Resend 0002-* because
>> of trailing spaces.
>
>
> AFAIK, cfbot does not seek old versions of patchset parts in previous messages. So for it to run correctly, a new
> version of the whole patchset should be sent even if most patches are unchanged.

Please, find the revised patchset with some refactoring and comments
improvement from me.  I'll continue looking into this.

------
Regards,
Alexander Korotkov

Вложения

Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
On 11/1/2024 18:30, Alexander Korotkov wrote:
> Hi!
> 
> On Tue, Jan 9, 2024 at 1:14 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>>> Hmm, I don't see this old code in these patches. Resend 0002-* because
>>> of trailing spaces.
>>
>>
>> AFAIK, cfbot does not seek old versions of patchset parts in previous messages. So for it to run correctly, a new
>> version of the whole patchset should be sent even if most patches are unchanged.
 
> 
> Please, find the revised patchset with some refactoring and comments
> improvement from me.  I'll continue looking into this.
The patch looks better, thanks to your refactoring.
I propose additional comments and tests to make the code more 
understandable (see attachment).
I intended to look into this part of the code more, but the tests show a 
difference between PostgreSQL v.15 and v.16, which causes changes in the 
code of this feature.

-- 
regards,
Andrei Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
Alexander Korotkov
Дата:
On Sat, Jan 13, 2024 at 11:09 AM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 11/1/2024 18:30, Alexander Korotkov wrote:
> > On Tue, Jan 9, 2024 at 1:14 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> >>> Hmm, I don't see this old code in these patches. Resend 0002-* because
> >>> of trailing spaces.
> >>
> >>
> >> AFAIK, cfbot does not seek old versions of patchset parts in previous messages. So for it to run correctly, a new
> >> version of the whole patchset should be sent even if most patches are unchanged.
> >
> > Please, find the revised patchset with some refactoring and comments
> > improvement from me.  I'll continue looking into this.
> The patch looks better, thanks to your refactoring.
> I propose additional comments and tests to make the code more
> understandable (see attachment).
> I intended to look into this part of the code more, but the tests show a
> difference between PostgreSQL v.15 and v.16, which causes changes in the
> code of this feature.

Makes sense.  I've incorporated your changes into the patchset.

------
Regards,
Alexander Korotkov

Вложения

Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
On 13/1/2024 22:00, Alexander Korotkov wrote:
> On Sat, Jan 13, 2024 at 11:09 AM Andrei Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> On 11/1/2024 18:30, Alexander Korotkov wrote:
>>> On Tue, Jan 9, 2024 at 1:14 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>>>>> Hmm, I don't see this old code in these patches. Resend 0002-* because
>>>>> of trailing spaces.
>>>>
>>>>
>>>> AFAIK, cfbot does not seek old versions of patchset parts in previous messages. So for it to run correctly, a new
>>>> version of the whole patchset should be sent even if most patches are unchanged.
 
>>>
>>> Please, find the revised patchset with some refactoring and comments
>>> improvement from me.  I'll continue looking into this.
>> The patch looks better, thanks to your refactoring.
>> I propose additional comments and tests to make the code more
>> understandable (see attachment).
>> I intended to look into this part of the code more, but the tests show a
>> difference between PostgreSQL v.15 and v.16, which causes changes in the
>> code of this feature.
> 
> Makes sense.  I've incorporated your changes into the patchset.
One more improvement. To underpin the code change:

-               return cur_ec;  /* Match! */
+           {
+               /*
+                * Match!
+                *
+                * Copy the sortref if it wasn't set yet. That may happen if
+                * the ec was constructed from WHERE clause, i.e. it doesn't
+                * have a target reference at all.
+                */
+               if (cur_ec->ec_sortref == 0 && sortref > 0)
+                   cur_ec->ec_sortref = sortref;
+               return cur_ec;
+           }

I propose a test (see attachment). It shows why we introduce this 
change: GROUP BY should handle not only pathkeys generated by explicit 
sort operators but also planner-made ones, as in this example, produced 
by a MergeJoin.
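
For illustration only (a sketch in the spirit of the attached test, not 
the test itself; it reuses the table t from the example upthread):

SET enable_hashjoin = off;
SET enable_hashagg = off;
EXPLAIN (COSTS OFF)
SELECT t1.x, t1.y, count(*)
FROM t t1 JOIN t t2 ON t1.x = t2.x AND t1.y = t2.y
GROUP BY t1.y, t1.x;

Here the equivalence classes for x and y are built from the join 
clauses, so they may have no sortref at first; with the change above, 
the ordering produced for the MergeJoin can still be matched by the 
GROUP BY keys.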

-- 
regards,
Andrei Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
Alexander Korotkov
Дата:
On Sun, Jan 14, 2024 at 2:14 PM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 13/1/2024 22:00, Alexander Korotkov wrote:
> > On Sat, Jan 13, 2024 at 11:09 AM Andrei Lepikhov
> > <a.lepikhov@postgrespro.ru> wrote:
> >> On 11/1/2024 18:30, Alexander Korotkov wrote:
> >>> On Tue, Jan 9, 2024 at 1:14 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
> >>>>> Hmm, I don't see this old code in these patches. Resend 0002-* because
> >>>>> of trailing spaces.
> >>>>
> >>>>
> >>>> AFAIK, cfbot does not seek old versions of patchset parts in previous messages. So for it to run correctly, a
> >>>> new version of the whole patchset should be sent even if most patches are unchanged.
> >>>
> >>> Please, find the revised patchset with some refactoring and comments
> >>> improvement from me.  I'll continue looking into this.
> >> The patch looks better, thanks to your refactoring.
> >> I propose additional comments and tests to make the code more
> >> understandable (see attachment).
> >> I intended to look into this part of the code more, but the tests show a
> >> difference between PostgreSQL v.15 and v.16, which causes changes in the
> >> code of this feature.
> >
> > Makes sense.  I've incorporated your changes into the patchset.
> One more improvement. To underpin code change:
>
> -               return cur_ec;  /* Match! */
> +           {
> +               /*
> +                * Match!
> +                *
> +                * Copy the sortref if it wasn't set yet. That may happen if
> +                * the ec was constructed from WHERE clause, i.e. it doesn't
> +                * have a target reference at all.
> +                */
> +               if (cur_ec->ec_sortref == 0 && sortref > 0)
> +                   cur_ec->ec_sortref = sortref;
> +               return cur_ec;
> +           }
>
> I propose the test (see attachment). It shows why we introduce this
> change: GROUP-BY should juggle not only pathkeys generated by explicit
> sort operators but also planner-made, likewise in this example, by
> MergeJoin.

Thank you for providing the test case relevant for this code change.
The revised patch incorporating this change is attached.  Now the
patchset looks good to me.  I'm going to push it if there are no
objections.

------
Regards,
Alexander Korotkov

Вложения

Re: POC: GROUP BY optimization

От
Richard Guo
Дата:

On Mon, Jan 15, 2024 at 8:20 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
Thank you for providing the test case relevant for this code change.
The revised patch incorporating this change is attached.  Now the
patchset looks good to me.  I'm going to push it if there are no
objections.

Seems I'm late for the party.  Can we hold for several more days?  I'd
like to have a review on this patch.

Thanks
Richard

Re: POC: GROUP BY optimization

От
Alexander Korotkov
Дата:
On Mon, Jan 15, 2024 at 8:42 AM Richard Guo <guofenglinux@gmail.com> wrote:
> On Mon, Jan 15, 2024 at 8:20 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>>
>> Thank you for providing the test case relevant for this code change.
>> The revised patch incorporating this change is attached.  Now the
>> patchset looks good to me.  I'm going to push it if there are no
>> objections.
>
> Seems I'm late for the party.  Can we hold for several more days?  I'd
> like to have a review on this patch.

Sure, NP.  I'll hold it till your review.

------
Regards,
Alexander Korotkov



Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
On 15/1/2024 13:42, Richard Guo wrote:
> 
> On Mon, Jan 15, 2024 at 8:20 AM Alexander Korotkov <aekorotkov@gmail.com 
> <mailto:aekorotkov@gmail.com>> wrote:
> 
>     Thank you for providing the test case relevant for this code change.
>     The revised patch incorporating this change is attached.  Now the
>     patchset looks good to me.  I'm going to push it if there are no
>     objections.
> 
> 
> Seems I'm late for the party.  Can we hold for several more days?  I'd
> like to have a review on this patch.
Get on board! It looks like this feature needs as much review as 
possible (just like SJE).

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

От
Alena Rybakina
Дата:
On 15.01.2024 12:46, Andrei Lepikhov wrote:
> On 15/1/2024 13:42, Richard Guo wrote:
>>
>> On Mon, Jan 15, 2024 at 8:20 AM Alexander Korotkov 
>> <aekorotkov@gmail.com <mailto:aekorotkov@gmail.com>> wrote:
>>
>>     Thank you for providing the test case relevant for this code change.
>>     The revised patch incorporating this change is attached. Now the
>>     patchset looks good to me.  I'm going to push it if there are no
>>     objections.
>>
>>
>> Seems I'm late for the party.  Can we hold for several more days?  I'd
>> like to have a review on this patch.
> Get on board! It looks like this feature needs as much review as 
> possible (likewise SJE).
>
Hi! Thank you for your work on this issue! I believe that this will help 
the planner to build a more optimal query plan here and therefore speed 
up query execution.

I have reviewed the patches and noticed that we could add some code 
refactoring. I have attached a diff file (group_by.diff) to this email.

The changes involve spelling corrections, renaming variables and 
factoring out some common parts.


In addition, I have a few questions, since some points in the code 
remained unclear to me.

1.  I didn't understand why we have a question mark in the comment next 
to the enable_group_by_reordering variable in the 
src/backend/optimizer/path/pathkeys.c file. I assumed it was a typo 
and fixed it in the diff file.

2. Why do we set the variable here (path = path_save) in the 
add_paths_to_grouping_rel function if we overwrite it just below and 
could pass path_save as a parameter instead?

foreach(lc2, pathkey_orderings)
{
     PathKeyInfo *info = (PathKeyInfo *) lfirst(lc2);

     /* restore the path (we replace it in the loop) */
     path = path_save;

     path = make_ordered_path(root,
                              grouped_rel,
                              path,
                              cheapest_path,
                              info->pathkeys);
     if (path == NULL)
         continue;

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Вложения

Re: POC: GROUP BY optimization

От
Richard Guo
Дата:

On Mon, Jan 15, 2024 at 3:56 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Mon, Jan 15, 2024 at 8:42 AM Richard Guo <guofenglinux@gmail.com> wrote:
> On Mon, Jan 15, 2024 at 8:20 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>>
>> Thank you for providing the test case relevant for this code change.
>> The revised patch incorporating this change is attached.  Now the
>> patchset looks good to me.  I'm going to push it if there are no
>> objections.
>
> Seems I'm late for the party.  Can we hold for several more days?  I'd
> like to have a review on this patch.

Sure, NP.  I'll hold it till your review.

Thanks.  Appreciate that.

I have briefly reviewed this patch and here are some comments.

* When trying to match the ordering of GROUP BY to that of ORDER BY in
get_useful_group_keys_orderings, this patch checks against the length of
path->pathkeys.  This does not make sense.  I guess this is a typo and
the real intention is to check against root->sort_pathkeys.

--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -504,7 +504,7 @@ get_useful_group_keys_orderings(PlannerInfo *root, Path *path)
                                           root->num_groupby_pathkeys);

        if (n > 0 &&
-           (enable_incremental_sort || n == list_length(path->pathkeys)))
+           (enable_incremental_sort || n == list_length(root->sort_pathkeys)))

However, I think this is also incorrect.  When incremental sort is
disabled, it is reasonable to consider reordering the GROUP BY keys only
if the number of matching pathkeys is equal to the length of
root->group_pathkeys i.e. if 'n == list_length(root->group_pathkeys)'.


* When the original ordering of GROUP BY keys matches the path's
pathkeys or ORDER BY keys, get_useful_group_keys_orderings would
generate duplicate PathKeyInfos.  Consequently, this duplication would
lead to the creation and addition of identical paths multiple times.
This is not great.  Consider the query below:

create table t (a int, b int);
create index on t (a, b);
set enable_hashagg to off;

explain select count(*) from t group by a, b;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 GroupAggregate  (cost=0.15..97.27 rows=226 width=16)
   Group Key: a, b
   ->  Index Only Scan using t_a_b_idx on t  (cost=0.15..78.06 rows=2260 width=8)
(3 rows)

The same path with group keys {a, b} is created and added twice.
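
One way to avoid this (just a sketch of the idea, not code from the 
patch) would be to skip an ordering that is already in the list before 
appending it, e.g. using compare_pathkeys():

/* hypothetical guard before adding a new ordering to the list */
foreach(lc, orderings)
{
    PathKeyInfo *info = (PathKeyInfo *) lfirst(lc);

    if (compare_pathkeys(info->pathkeys, new_pathkeys) == PATHKEYS_EQUAL)
        return;         /* this ordering is already known, don't add it */
}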


* Part of the work performed in this patch overlaps with that of
preprocess_groupclause.  They are both trying to adjust the ordering of
the GROUP BY keys to match ORDER BY.  I wonder if it would be better to
perform this work only once.


* When reordering the GROUP BY keys, I think the following checks need
some improvements.

+       /*
+        * Since 1349d27 pathkey coming from underlying node can be in the
+        * root->group_pathkeys but not in the processed_groupClause. So, we
+        * should be careful here.
+        */
+       sgc = get_sortgroupref_clause_noerr(pathkey->pk_eclass->ec_sortref,
+                                           *group_clauses);
+       if (!sgc)
+           /* The grouping clause does not cover this pathkey */
+           break;

I think we need to check or assert 'pathkey->pk_eclass->ec_sortref' is
valid before calling get_sortgroupref_clause_noerr with it.  Also, how
can we guarantee that the returned GROUP BY item is sortable?  Should we
check that with OidIsValid(sgc->sortop)?
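
Something along these lines, perhaps (only a sketch of the suggested 
guards, not code from the patch):

/* hypothetical checks before relying on the pathkey's sortref */
if (pathkey->pk_eclass->ec_sortref == 0)
    break;                  /* no target-list reference to match against */

sgc = get_sortgroupref_clause_noerr(pathkey->pk_eclass->ec_sortref,
                                    *group_clauses);
if (sgc == NULL || !OidIsValid(sgc->sortop))
    break;                  /* not covered by a sortable grouping clause */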

Thanks
Richard

Re: POC: GROUP BY optimization

От
Alexander Korotkov
Дата:
Hi!

Thank you for your review.  The revised patchset is attached.

On Tue, Jan 16, 2024 at 4:48 AM Richard Guo <guofenglinux@gmail.com> wrote:
> On Mon, Jan 15, 2024 at 3:56 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>>
>> On Mon, Jan 15, 2024 at 8:42 AM Richard Guo <guofenglinux@gmail.com> wrote:
>> > On Mon, Jan 15, 2024 at 8:20 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>> >>
>> >> Thank you for providing the test case relevant for this code change.
>> >> The revised patch incorporating this change is attached.  Now the
>> >> patchset looks good to me.  I'm going to push it if there are no
>> >> objections.
>> >
>> > Seems I'm late for the party.  Can we hold for several more days?  I'd
>> > like to have a review on this patch.
>>
>> Sure, NP.  I'll hold it till your review.
>
>
> Thanks.  Appreciate that.
>
> I have briefly reviewed this patch and here are some comments.
>
> * When trying to match the ordering of GROUP BY to that of ORDER BY in
> get_useful_group_keys_orderings, this patch checks against the length of
> path->pathkeys.  This does not make sense.  I guess this is a typo and
> the real intention is to check against root->sort_pathkeys.
>
> --- a/src/backend/optimizer/path/pathkeys.c
> +++ b/src/backend/optimizer/path/pathkeys.c
> @@ -504,7 +504,7 @@ get_useful_group_keys_orderings(PlannerInfo *root, Path *path)
>                                            root->num_groupby_pathkeys);
>
>         if (n > 0 &&
> -           (enable_incremental_sort || n == list_length(path->pathkeys)))
> +           (enable_incremental_sort || n == list_length(root->sort_pathkeys)))
>
> However, I think this is also incorrect.  When incremental sort is
> disabled, it is reasonable to consider reordering the GROUP BY keys only
> if the number of matching pathkeys is equal to the length of
> root->group_pathkeys i.e. if 'n == list_length(root->group_pathkeys)'.

Hmm... Why should this be list_length(root->group_pathkeys) while
we're targeting root->sort_pathkeys?  For now, I've changed that to
list_length(root->sort_pathkeys).

> * When the original ordering of GROUP BY keys matches the path's
> pathkeys or ORDER BY keys, get_useful_group_keys_orderings would
> generate duplicate PathKeyInfos.  Consequently, this duplication would
> lead to the creation and addition of identical paths multiple times.
> This is not great.  Consider the query below:
>
> create table t (a int, b int);
> create index on t (a, b);
> set enable_hashagg to off;
>
> explain select count(*) from t group by a, b;
>                                     QUERY PLAN
> ----------------------------------------------------------------------------------
>  GroupAggregate  (cost=0.15..97.27 rows=226 width=16)
>    Group Key: a, b
>    ->  Index Only Scan using t_a_b_idx on t  (cost=0.15..78.06 rows=2260 width=8)
> (3 rows)
>
> The same path with group keys {a, b} is created and added twice.

I tried to address that.

> * Part of the work performed in this patch overlaps with that of
> preprocess_groupclause.  They are both trying to adjust the ordering of
> the GROUP BY keys to match ORDER BY.  I wonder if it would be better to
> perform this work only once.

Andrei, could you take a look.

> * When reordering the GROUP BY keys, I think the following checks need
> some improvements.
>
> +       /*
> +        * Since 1349d27 pathkey coming from underlying node can be in the
> +        * root->group_pathkeys but not in the processed_groupClause. So, we
> +        * should be careful here.
> +        */
> +       sgc = get_sortgroupref_clause_noerr(pathkey->pk_eclass->ec_sortref,
> +                                           *group_clauses);
> +       if (!sgc)
> +           /* The grouping clause does not cover this pathkey */
> +           break;
>
> I think we need to check or assert 'pathkey->pk_eclass->ec_sortref' is
> valid before calling get_sortgroupref_clause_noerr with it.  Also, how
> can we guarantee that the returned GROUP BY item is sortable?  Should we
> check that with OidIsValid(sgc->sortop)?

Added.

------
Regards,
Alexander Korotkov

Вложения

Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
On 16/1/2024 22:05, Alexander Korotkov wrote:
> On Tue, Jan 16, 2024 at 4:48 AM Richard Guo <guofenglinux@gmail.com> wrote:
>> * When trying to match the ordering of GROUP BY to that of ORDER BY in
>> get_useful_group_keys_orderings, this patch checks against the length of
>> path->pathkeys.  This does not make sense.  I guess this is a typo and
>> the real intention is to check against root->sort_pathkeys.
Thanks! It is really my blunder - a fresh look helps.
>>
>> --- a/src/backend/optimizer/path/pathkeys.c
>> +++ b/src/backend/optimizer/path/pathkeys.c
>> @@ -504,7 +504,7 @@ get_useful_group_keys_orderings(PlannerInfo *root, Path *path)
>>                                             root->num_groupby_pathkeys);
>>
>>          if (n > 0 &&
>> -           (enable_incremental_sort || n == list_length(path->pathkeys)))
>> +           (enable_incremental_sort || n == list_length(root->sort_pathkeys)))
>>
>> However, I think this is also incorrect.  When incremental sort is
>> disabled, it is reasonable to consider reordering the GROUP BY keys only
>> if the number of matching pathkeys is equal to the length of
>> root->group_pathkeys i.e. if 'n == list_length(root->group_pathkeys)'.
> 
> Hmm... Why should this be list_length(root->group_pathkeys) while
> we're targeting root->sort_pathkeys.  I yet changed that to
> list_length(root->sort_pathkeys).
I think that in the first case, when we are trying to arrange group-by 
keys according to the underlying pathkeys with incremental sort = off, 
it makes sense to do so as long as all group-by keys are covered, 
regardless of whether the incoming path contains that many pathkeys or 
more. The code and test case are in step1.txt.
> 
>> * When the original ordering of GROUP BY keys matches the path's
>> pathkeys or ORDER BY keys, get_useful_group_keys_orderings would
>> generate duplicate PathKeyInfos.  Consequently, this duplication would
>> lead to the creation and addition of identical paths multiple times.
>> This is not great.  Consider the query below:
>>
>> create table t (a int, b int);
>> create index on t (a, b);
>> set enable_hashagg to off;
>>
>> explain select count(*) from t group by a, b;
>>                                      QUERY PLAN
>> ----------------------------------------------------------------------------------
>>   GroupAggregate  (cost=0.15..97.27 rows=226 width=16)
>>     Group Key: a, b
>>     ->  Index Only Scan using t_a_b_idx on t  (cost=0.15..78.06 rows=2260 width=8)
>> (3 rows)
>>
>> The same path with group keys {a, b} is created and added twice.
> 
> I tried to address that.
> 
>> * Part of the work performed in this patch overlaps with that of
>> preprocess_groupclause.  They are both trying to adjust the ordering of
>> the GROUP BY keys to match ORDER BY.  I wonder if it would be better to
>> perform this work only once.
> 
> Andrei, could you take a look.
As I see it, the PathKeyInfo list often contains duplicated pathkeys, 
coming from either the sort_pathkeys or the path->pathkeys orderings. 
So I propose to check for duplicates each time (see step2.txt in the 
attachment).
> 
>> * When reordering the GROUP BY keys, I think the following checks need
>> some improvements.
>>
>> +       /*
>> +        * Since 1349d27 pathkey coming from underlying node can be in the
>> +        * root->group_pathkeys but not in the processed_groupClause. So, we
>> +        * should be careful here.
>> +        */
>> +       sgc = get_sortgroupref_clause_noerr(pathkey->pk_eclass->ec_sortref,
>> +                                           *group_clauses);
>> +       if (!sgc)
>> +           /* The grouping clause does not cover this pathkey */
>> +           break;
>>
>> I think we need to check or assert 'pathkey->pk_eclass->ec_sortref' is
>> valid before calling get_sortgroupref_clause_noerr with it.  Also, how
>> can we guarantee that the returned GROUP BY item is sortable?  Should we
>> check that with OidIsValid(sgc->sortop)?
> 
> Added.
Reviewed it, looks good.

-- 
regards,
Andrei Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
Andrei Lepikhov
Дата:
Just the forgotten second attachment.

-- 
regards,
Andrei Lepikhov
Postgres Professional

Вложения

Re: POC: GROUP BY optimization

От
Alexander Korotkov
Дата:
Hi!

I've applied your changes with minor editing, thank you.

On Thu, Jan 18, 2024 at 11:49 AM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> >> * Part of the work performed in this patch overlaps with that of
> >> preprocess_groupclause.  They are both trying to adjust the ordering of
> >> the GROUP BY keys to match ORDER BY.  I wonder if it would be better to
> >> perform this work only once.
> >
> > Andrei, could you take a look.
> As I see, the PathKeyInfo list often contains duplicated pathkeys,
> coming from either sort_pathkeys or path->pathkeys orderings. So, I can
> propose to check duplicates each time (see step2.txt in attachment).

Actually, I asked to recheck whether we can cut some part of
preprocess_groupclause() given that we're reordering the pathkeys
later.  It seems that we can remove everything except the handling of
grouping sets.  I've done this in the revised patchset.

I'm going to push this if there are no objections.

------
Regards,
Alexander Korotkov

Attachments

Re: POC: GROUP BY optimization

From
Nathan Bossart
Date:
A recent buildfarm failure [0] seems to indicate a name collision with the
"abc" index in the aggregates.sql test and the "abc" table in
namespace.sql.

[0] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet&dt=2024-01-24%2014%3A05%3A14

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: POC: GROUP BY optimization

From
Alexander Korotkov
Date:
On Wed, Jan 24, 2024 at 7:38 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> A recent buildfarm failure [0] seems to indicate a name collision with the
> "abc" index in the aggregates.sql test and the "abc" table in
> namespace.sql.
>
> [0] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet&dt=2024-01-24%2014%3A05%3A14

Thank you for catching this.  Fixed.

------
Regards,
Alexander Korotkov



Re: POC: GROUP BY optimization

From
vignesh C
Date:
On Thu, 25 Jan 2024 at 01:15, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> On Wed, Jan 24, 2024 at 7:38 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> > A recent buildfarm failure [0] seems to indicate a name collision with the
> > "abc" index in the aggregates.sql test and the "abc" table in
> > namespace.sql.
> >
> > [0] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet&dt=2024-01-24%2014%3A05%3A14
>
> Thank you for catching this.  Fixed.

Since the patch has been committed, I have marked this entry as committed.

Regards,
Vignesh



Re: POC: GROUP BY optimization

From
Robert Haas
Date:
On Tue, Dec 26, 2023 at 10:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think it's a fool's errand to even try to separate different sort
> column orderings by cost.  We simply do not have sufficiently accurate
> cost information.  The previous patch in this thread got reverted because
> of that (well, also some implementation issues, but mostly that), and
> nothing has happened to make me think that another try will fare any
> better.

I'm late to the party, but I'd like to better understand what's being
argued here. If you're saying that, for some particular planner
problem, we should prefer a solution that doesn't need to know about
the relative cost of various sorts over one that does, I agree, for
exactly the reason that you state: our knowledge of sort costs won't
be reliable, and we will make mistakes. That's true in lots of
situations, not just related to sorts,
because estimation is a hard problem. Heuristics not based on cost are
going to be, in many cases, more accurate than heuristics based on
cost. They're also often cheaper, since they often let us reject
possible approaches very early, without all the bother of a cost
comparison.

But if you're saying that it's utterly impossible to know whether
sorting text will be cheaper or more expensive than sorting 4-byte
integers, and that if a particular problem can be solved only by
knowing which one is cheaper we should just give up, then I disagree.
In the absence of any other information, it must be right, at the very
least, to bank on varlena data types being more expensive to sort than
fixed-length data types. How much more expensive is hard to know,
because toasted blobs are going to be more expensive to sort than
short varlenas. But even before you reach the comparison function, a
pass-by-value datum has a significantly lower access cost than a
pass-by-reference datum. The fact that the pass-by-reference value
might be huge only compounds the problem.
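For instance, a quick and unscientific way to see the kind of gap I mean (a 
sketch; the table and column names are purely illustrative):

-- compare sorting a pass-by-value int4 column against a pass-by-reference
-- text column of the same cardinality
CREATE TABLE sort_demo AS
  SELECT i AS int_col, md5(i::text) || repeat('x', 100) AS text_col
  FROM generate_series(1, 1000000) AS i;
EXPLAIN ANALYZE SELECT * FROM sort_demo ORDER BY int_col;
EXPLAIN ANALYZE SELECT * FROM sort_demo ORDER BY text_col;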

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: POC: GROUP BY optimization

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Dec 26, 2023 at 10:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think it's a fool's errand to even try to separate different sort
>> column orderings by cost.  We simply do not have sufficiently accurate
>> cost information.  The previous patch in this thread got reverted because
>> of that (well, also some implementation issues, but mostly that), and
>> nothing has happened to make me think that another try will fare any
>> better.

> I'm late to the party, but I'd like to better understand what's being
> argued here.

What I am saying is that we don't have sufficiently accurate cost
information to support the sort of logic that got committed and
reverted before.  I did not mean to imply that it's not possible
to have such info, only that it is not present today.  IOW, what
I'm saying is that if you want to write code that tries to make
a cost-based preference of one sorting over another, you *first*
need to put in a bunch of legwork to create more accurate cost
numbers.  Trying to make such logic depend on the numbers we have
today is just going to result in garbage in, garbage out.

Sadly, that's not a small task:

* We'd need to put effort into assigning more realistic procost
values --- preferably across the board, not just comparison functions.
As long as all the comparison functions have procost 1.0, you're
just flying blind.

* As you mentioned, there'd need to be some accounting for the
likely size of varlena inputs, and especially whether they might
be toasted.

* cost_sort knows nothing of the low-level sort algorithm improvements
we've made in recent years, such as abbreviated keys.

That's a lot of work, and I think it has to be done before we try
to build infrastructure on top, not afterwards.

            regards, tom lane



Re: POC: GROUP BY optimization

From
Robert Haas
Date:
On Fri, Jan 26, 2024 at 10:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Sadly, that's not a small task:
>
> * We'd need to put effort into assigning more realistic procost
> values --- preferably across the board, not just comparison functions.
> As long as all the comparison functions have procost 1.0, you're
> just flying blind.
>
> * As you mentioned, there'd need to be some accounting for the
> likely size of varlena inputs, and especially whether they might
> be toasted.
>
> * cost_sort knows nothing of the low-level sort algorithm improvements
> we've made in recent years, such as abbreviated keys.
>
> That's a lot of work, and I think it has to be done before we try
> to build infrastructure on top, not afterwards.

OK, that makes sense to me. Thanks.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: POC: GROUP BY optimization

From
Tom Lane
Date:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> I'm going to push this if there are no objections.

One of the test cases added by this commit has not been very
stable in the buildfarm.  Latest example is here:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04

and I've seen similar failures intermittently on other machines.

I'd suggest building this test atop a table that is more stable
than pg_class.  You're just waving a red flag in front of a bull
if you expect stable statistics from that during a regression run.
Nor do I see any particular reason for pg_class to be especially
suited to the test.

            regards, tom lane



Re: POC: GROUP BY optimization

From
Richard Guo
Date:

On Fri, Feb 2, 2024 at 10:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alexander Korotkov <aekorotkov@gmail.com> writes:
> I'm going to push this if there are no objections.

One of the test cases added by this commit has not been very
stable in the buildfarm.  Latest example is here:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04

and I've seen similar failures intermittently on other machines.

I'd suggest building this test atop a table that is more stable
than pg_class.  You're just waving a red flag in front of a bull
if you expect stable statistics from that during a regression run.
Nor do I see any particular reason for pg_class to be especially
suited to the test.

Yeah, it's not a good practice to use pg_class in this place.  While
looking through the test cases added by this commit, I noticed some
other minor issues that are not great.  Such as

* The table 'btg' is inserted with 10000 tuples, which seems a bit
expensive for a test.  I don't think we need such a big table to test
what we want.

* I don't see why we need to manipulate GUC max_parallel_workers and
max_parallel_workers_per_gather.

* I think we'd better write the tests with the keywords being all upper
or all lower.  A mixed use of upper and lower is not great. Such as in

    explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;

* Some comments for the test queries are not easy to read.

* For this statement

    CREATE INDEX idx_y_x_z ON btg(y,x,w);

I think the index name would cause confusion.  It creates an index on
columns y, x and w, but the name indicates an index on y, x and z.

I'd like to write a draft patch for the fixes.

Thanks
Richard

Re: POC: GROUP BY optimization

From
Andrei Lepikhov
Date:
On 2/2/2024 09:02, Tom Lane wrote:
> Alexander Korotkov <aekorotkov@gmail.com> writes:
>> I'm going to push this if there are no objections.
> 
> One of the test cases added by this commit has not been very
> stable in the buildfarm.  Latest example is here:
> 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04
> 
> and I've seen similar failures intermittently on other machines.
> 
> I'd suggest building this test atop a table that is more stable
> than pg_class.  You're just waving a red flag in front of a bull
> if you expect stable statistics from that during a regression run.
> Nor do I see any particular reason for pg_class to be especially
> suited to the test.
Yeah, it is my fault. Please see the attached patch fixing that.
-- 
regards,
Andrei Lepikhov
Postgres Professional

Attachments

Re: POC: GROUP BY optimization

From
Richard Guo
Date:

On Fri, Feb 2, 2024 at 11:32 AM Richard Guo <guofenglinux@gmail.com> wrote:
On Fri, Feb 2, 2024 at 10:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
One of the test cases added by this commit has not been very
stable in the buildfarm.  Latest example is here:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04

and I've seen similar failures intermittently on other machines.

I'd suggest building this test atop a table that is more stable
than pg_class.  You're just waving a red flag in front of a bull
if you expect stable statistics from that during a regression run.
Nor do I see any particular reason for pg_class to be especially
suited to the test.

Yeah, it's not a good practice to use pg_class in this place.  While
looking through the test cases added by this commit, I noticed some
other minor issues that are not great.  Such as

* The table 'btg' is inserted with 10000 tuples, which seems a bit
expensive for a test.  I don't think we need such a big table to test
what we want.

* I don't see why we need to manipulate GUC max_parallel_workers and
max_parallel_workers_per_gather.

* I think we'd better write the tests with the keywords being all upper
or all lower.  A mixed use of upper and lower is not great. Such as in

    explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;

* Some comments for the test queries are not easy to read.

* For this statement

    CREATE INDEX idx_y_x_z ON btg(y,x,w);

I think the index name would cause confusion.  It creates an index on
columns y, x and w, but the name indicates an index on y, x and z.

I'd like to write a draft patch for the fixes.

Here is the draft patch that fixes the issues I complained about
upthread.

Thanks
Richard
Attachments

Re: POC: GROUP BY optimization

From
Andrei Lepikhov
Date:
On 2/2/2024 11:06, Richard Guo wrote:
> 
> On Fri, Feb 2, 2024 at 11:32 AM Richard Guo <guofenglinux@gmail.com> wrote:
> 
>     On Fri, Feb 2, 2024 at 10:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
>         One of the test cases added by this commit has not been very
>         stable in the buildfarm.  Latest example is here:
> 
>         https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04
> 
>         and I've seen similar failures intermittently on other machines.
> 
>         I'd suggest building this test atop a table that is more stable
>         than pg_class.  You're just waving a red flag in front of a bull
>         if you expect stable statistics from that during a regression run.
>         Nor do I see any particular reason for pg_class to be especially
>         suited to the test.
> 
> 
>     Yeah, it's not a good practice to use pg_class in this place.  While
>     looking through the test cases added by this commit, I noticed some
>     other minor issues that are not great.  Such as
> 
>     * The table 'btg' is inserted with 10000 tuples, which seems a bit
>     expensive for a test.  I don't think we need such a big table to test
>     what we want.
> 
>     * I don't see why we need to manipulate GUC max_parallel_workers and
>     max_parallel_workers_per_gather.
> 
>     * I think we'd better write the tests with the keywords being all upper
>     or all lower.  A mixed use of upper and lower is not great. Such as in
> 
>          explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;
> 
>     * Some comments for the test queries are not easy to read.
> 
>     * For this statement
> 
>          CREATE INDEX idx_y_x_z ON btg(y,x,w);
> 
>     I think the index name would cause confusion.  It creates an index on
>     columns y, x and w, but the name indicates an index on y, x and z.
> 
>     I'd like to write a draft patch for the fixes.
> 
> 
> Here is the draft patch that fixes the issues I complained about in
> upthread.
I went through the patch. It looks like it doesn't break anything. Why do 
you prefer to use count(*) in EXPLAIN instead of a plain target list, like 
"SELECT x,y,..."?
Also, regarding the test mentioned by Tom:
1. I see that PG uses an IndexScan on (x,y), so column x will already be 
sorted before the MergeJoin. Why not use an Incremental Sort on (x,z,w) 
instead of a full Sort?
2. For the record, IMO this test shows the near-term potential of this 
feature: it is cheaper to use grouping order (w,z) instead of (z,w).

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

From
Richard Guo
Date:

On Fri, Feb 2, 2024 at 12:40 PM Andrei Lepikhov <a.lepikhov@postgrespro.ru> wrote:
On 2/2/2024 11:06, Richard Guo wrote:
>
> On Fri, Feb 2, 2024 at 11:32 AM Richard Guo <guofenglinux@gmail.com> wrote:
>
>     On Fri, Feb 2, 2024 at 10:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>         One of the test cases added by this commit has not been very
>         stable in the buildfarm.  Latest example is here:
>
>         https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04
>
>         and I've seen similar failures intermittently on other machines.
>
>         I'd suggest building this test atop a table that is more stable
>         than pg_class.  You're just waving a red flag in front of a bull
>         if you expect stable statistics from that during a regression run.
>         Nor do I see any particular reason for pg_class to be especially
>         suited to the test.
>
>
>     Yeah, it's not a good practice to use pg_class in this place.  While
>     looking through the test cases added by this commit, I noticed some
>     other minor issues that are not great.  Such as
>
>     * The table 'btg' is inserted with 10000 tuples, which seems a bit
>     expensive for a test.  I don't think we need such a big table to test
>     what we want.
>
>     * I don't see why we need to manipulate GUC max_parallel_workers and
>     max_parallel_workers_per_gather.
>
>     * I think we'd better write the tests with the keywords being all upper
>     or all lower.  A mixed use of upper and lower is not great. Such as in
>
>          explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;
>
>     * Some comments for the test queries are not easy to read.
>
>     * For this statement
>
>          CREATE INDEX idx_y_x_z ON btg(y,x,w);
>
>     I think the index name would cause confusion.  It creates an index on
>     columns y, x and w, but the name indicates an index on y, x and z.
>
>     I'd like to write a draft patch for the fixes.
>
>
> Here is the draft patch that fixes the issues I complained about in
> upthread.
 
I passed through the patch. Looks like it doesn't break anything. Why do
you prefer to use count(*) in EXPLAIN instead of plain targetlist, like
"SELECT x,y,..."?

Nothing special.  Just making the test cases as consistent as
possible.
 
Also, according to the test mentioned by Tom:
1. I see, PG uses IndexScan on (x,y). So, column x will be already
sorted before the MergeJoin. Why not use Incremental Sort on (x,z,w)
instead of full sort?

I think that's because the planner chooses to use (z, w, x) to perform
the mergejoin.  I did not delve into the details, but I guess the cost
estimation decides this is cheaper.

Hi Alexander,

What do you think about the revisions for the test cases?

Thanks
Richard

Re: POC: GROUP BY optimization

From
Alexander Korotkov
Date:
Hi, Richard!

> What do you think about the revisions for the test cases?

I've rebased your patch upthread.  Did some minor beautifications.

> * The table 'btg' is inserted with 10000 tuples, which seems a bit
> expensive for a test.  I don't think we need such a big table to test
> what we want.

Your patch reduces the number of rows to 1000 tuples.  I found it
possible to further reduce it to 100 tuples.  That also allowed me to
save the plan in the test case introduced by e1b7fde418.

Please check if you're OK with the patch attached.

------
Regards,
Alexander Korotkov

Attachments

Re: POC: GROUP BY optimization

From
Maxim Orlov
Date:
Hi!

Another issue with the test introduced in 0452b461bc405: I think it may be unstable in some circumstances,
for example, if we try to use a different BLCKSZ. See, I've made a little change to the number of tuples to be inserted:

$ git diff
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index d6ed5d0eff..414078d4ec 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -1187,7 +1187,7 @@ CREATE TABLE btg AS SELECT
   i % 100 AS y,
   'abc' || i % 10 AS z,
   i AS w
-FROM generate_series(1,10000) AS i;
+FROM generate_series(1,11900) AS i;
 CREATE INDEX btg_x_y_idx ON btg(x,y);
 ANALYZE btg;

And bulk extension kicks in, so we get zeroed pages in the relation. The plan also changes,
switching from an index scan to a seq scan:
@@ -2734,7 +2734,7 @@
   i % 100 AS y,
   'abc' || i % 10 AS z,
   i AS w
-FROM generate_series(1,10000) AS i;
+FROM generate_series(1,11900) AS i;
 CREATE INDEX btg_x_y_idx ON btg(x,y);
 ANALYZE btg;
 -- GROUP BY optimization by reorder columns by frequency
@@ -2760,62 +2760,57 @@

 -- Engage incremental sort
 explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;
-                   QUERY PLAN
--------------------------------------------------
+          QUERY PLAN
+------------------------------
  Group
    Group Key: x, y, z, w
-   ->  Incremental Sort
+   ->  Sort
          Sort Key: x, y, z, w
-         Presorted Key: x, y
-         ->  Index Scan using btg_x_y_idx on btg
-(6 rows)
+         ->  Seq Scan on btg
+(5 rows)
... and so on.

So, my proposal is simple. I think we need not just "ANALYZE btg", but
"VACUUM ANALYZE btg", to get rid of zeroed pages in this particular
case. PFA corresponding patch.

--
Best regards,
Maxim Orlov.
Attachments

Re: POC: GROUP BY optimization

From
Richard Guo
Date:

On Wed, Feb 21, 2024 at 6:20 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
Hi, Richard!

> What do you think about the revisions for the test cases?

I've rebased your patch upthread.  Did some minor beautifications.

> * The table 'btg' is inserted with 10000 tuples, which seems a bit
> expensive for a test.  I don't think we need such a big table to test
> what we want.

Your patch reduces the number of rows to 1000 tuples.  I found it
possible to further reduce it to 100 tuples.  That also allowed me to
save the plan in the test case introduced by e1b7fde418.

Please check if you're OK with the patch attached.

I looked through the v2 patch and have two comments.

* The test case under "Check we don't pick aggregate path key instead of
grouping path key" does not have EXPLAIN to show the plan.  So how can
we ensure we do not mistakenly select the aggregate pathkeys instead of
the grouping pathkeys?

* I don't think the test case introduced by e1b7fde418 is still needed,
because we already have one under "Utilize the ordering of merge join to
avoid a full Sort operation".  This kind of test case is just to ensure
that we are able to utilize the ordering of the subplans underneath.  So
it should be parallel to the test cases for utilizing the ordering of
index scan and subquery scan.

See the attached v3 patch.  I also made cosmetic tweaks to the comments,
and simplified a test case query.

Thanks
Richard
Attachments

Re: POC: GROUP BY optimization

From
Andrei Lepikhov
Date:
On 22/2/2024 09:09, Richard Guo wrote:
> 
> On Wed, Feb 21, 2024 at 6:20 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> 
>     Hi, Richard!
> 
>      > What do you think about the revisions for the test cases?
> 
>     I've rebased your patch upthread.  Did some minor beautifications.
> 
>      > * The table 'btg' is inserted with 10000 tuples, which seems a bit
>      > expensive for a test.  I don't think we need such a big table to test
>      > what we want.
> 
>     Your patch reduces the number of rows to 1000 tuples.  I found it
>     possible to further reduce it to 100 tuples.  That also allowed me to
>     save the plan in the test case introduced by e1b7fde418.
> 
>     Please check if you're OK with the patch attached.
> 
> 
> I looked through the v2 patch and have two comments.
> 
> * The test case under "Check we don't pick aggregate path key instead of
> grouping path key" does not have EXPLAIN to show the plan.  So how can
> we ensure we do not mistakenly select the aggregate pathkeys instead of
> the grouping pathkeys?
I confirm it works correctly. I am not sure about the stability of the 
number of zeros in the output on different platforms here:
        avg
--------------------
  4.0000000000000000
  5.0000000000000000
That is why I had used the format() function before. So, may we tweak 
the test and restrict the output?
> 
> * I don't think the test case introduced by e1b7fde418 is still needed,
> because we already have one under "Utilize the ordering of merge join to
> avoid a full Sort operation".  This kind of test case is just to ensure
> that we are able to utilize the ordering of the subplans underneath.  So
> it should be parallel to the test cases for utilize the ordering of
> index scan and subquery scan.
I confirm that this version also checks ec_sortref initialization in the 
case when ECs are constructed from the WHERE clause. Generally, I prefer 
one test per issue instead of one test for everything at once. But it works 
and I don't have any reason to dispute it.

Also, I'm unsure about removing the disabling of the 
max_parallel_workers_per_gather parameter. Have you verified that the 
current plan dominates the partial one? Do cost fluctuations across 
platforms not trigger a parallel plan?

What's more, I suggest addressing the complaint from [1] here. As I see it, 
the cost difference between the Sort and IncrementalSort strategies in that 
case is around 0.5. To make the test more stable, I propose changing it a 
bit and adding a limit:
SELECT count(*) FROM btg GROUP BY z, y, w, x LIMIT 10;
That makes the advantage of IncrementalSort more obvious, with a difference 
of around 10 cost points.

[1] 
https://www.postgresql.org/message-id/CACG=ezaYM1tr6Lmp8PRH1aeZq=rBKXEoTwgzMcLaD5MPhfW0Lg@mail.gmail.com

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

From
Richard Guo
Date:

On Thu, Feb 22, 2024 at 12:18 PM Andrei Lepikhov <a.lepikhov@postgrespro.ru> wrote:
On 22/2/2024 09:09, Richard Guo wrote:
> I looked through the v2 patch and have two comments.
>
> * The test case under "Check we don't pick aggregate path key instead of
> grouping path key" does not have EXPLAIN to show the plan.  So how can
> we ensure we do not mistakenly select the aggregate pathkeys instead of
> the grouping pathkeys?
 
I confirm it works correctly. I am not sure about the stability of the
zeros number in the output on different platforms here:
        avg
--------------------
  4.0000000000000000
  5.0000000000000000
It was why I'd used the format() function before. So, may we elaborate
on the test and restrict the output?

The avg() function on an integer argument is commonly used in
aggregates.sql.  I don't think this is an issue.  See the first test
query in aggregates.sql.

SELECT avg(four) AS avg_1 FROM onek;
       avg_1
--------------------
 1.5000000000000000
(1 row)
 
> * I don't think the test case introduced by e1b7fde418 is still needed,
> because we already have one under "Utilize the ordering of merge join to
> avoid a full Sort operation".  This kind of test case is just to ensure
> that we are able to utilize the ordering of the subplans underneath.  So
> it should be parallel to the test cases for utilize the ordering of
> index scan and subquery scan.

Also, I'm unsure about removing the disabling of the
max_parallel_workers_per_gather parameter. Have you discovered the
domination of the current plan over the partial one? Do the cost
fluctuations across platforms not trigger a parallel plan?

The table used for testing contains only 100 tuples, which is the size
of only one page.  I don't believe it would trigger any parallel plans,
unless we manually change min_parallel_table_scan_size.
 
What's more, I suggest to address here the complaint from [1]. As I see,
cost difference between Sort and IncrementalSort strategies in that case
is around 0.5. To make the test more stable I propose to change it a bit
and add a limit:
SELECT count(*) FROM btg GROUP BY z, y, w, x LIMIT 10;
It makes efficacy of IncrementalSort more obvious difference around 10
cost points.

I don't think that's necessary.  With Incremental Sort the final cost
is:

    GroupAggregate  (cost=1.66..19.00 rows=100 width=25)

while with full Sort it is:

    GroupAggregate  (cost=16.96..19.46 rows=100 width=25)

With the STD_FUZZ_FACTOR (1.01), there is no doubt that the first path
is cheaper on total cost.  Not to say that even if somehow we decide the
two paths are fuzzily the same on total cost, the first path still
dominates because its startup cost is much cheaper.
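(Concretely, using the numbers above: 19.46 > 19.00 * 1.01 = 19.19, so the
full-Sort path isn't even fuzzily comparable on total cost.)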

Thanks
Richard

Re: POC: GROUP BY optimization

From
Andrei Lepikhov
Date:
On 22/2/2024 13:35, Richard Guo wrote:
> The avg() function on integer argument is commonly used in
> aggregates.sql.  I don't think this is an issue.  See the first test
> query in aggregates.sql.
Makes sense.
>      > it should be parallel to the test cases for utilize the ordering of
>      > index scan and subquery scan.
> 
>     Also, I'm unsure about removing the disabling of the
>     max_parallel_workers_per_gather parameter. Have you discovered the
>     domination of the current plan over the partial one? Do the cost
>     fluctuations across platforms not trigger a parallel plan?
> 
> 
> The table used for testing contains only 100 tuples, which is the size
> of only one page.  I don't believe it would trigger any parallel plans,
> unless we manually change min_parallel_table_scan_size.
I don't intend to argue the point, but just for information: I frequently 
reduce it to zero, allowing PostgreSQL to make the decision based on 
costs. That sometimes works much better, because one small table in a 
multi-way join can prevent an effective parallel plan.
> 
>     What's more, I suggest to address here the complaint from [1]. As I
>     see,
>     cost difference between Sort and IncrementalSort strategies in that
>     case
>     is around 0.5. To make the test more stable I propose to change it a
>     bit
>     and add a limit:
>     SELECT count(*) FROM btg GROUP BY z, y, w, x LIMIT 10;
>     It makes efficacy of IncrementalSort more obvious difference around 10
>     cost points.
> 
> 
> I don't think that's necessary.  With Incremental Sort the final cost
> is:
> 
>      GroupAggregate  (cost=1.66..19.00 rows=100 width=25)
> 
> while with full Sort it is:
> 
>      GroupAggregate  (cost=16.96..19.46 rows=100 width=25)
> 
> With the STD_FUZZ_FACTOR (1.01), there is no doubt that the first path
> is cheaper on total cost.  Not to say that even if somehow we decide the
> two paths are fuzzily the same on total cost, the first path still
> dominates because its startup cost is much cheaper.
As before, I won't protest here - it needs some calculation of how much 
cost can be added by bulk extension of the relation's blocks. If Maxim 
answers that it's enough to resolve his issue, why not?

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

From
Tom Lane
Date:
I'm late to the party, and I apologize for not having paid attention
to this thread for months ... but I am wondering why 0452b461b got
committed.  It seems to add a substantial amount of complexity and
planner cycles in return for not much.  The reason why I'm dubious
about it can be found in this comment that the patch deleted:

- * In principle it might be interesting to consider other orderings of the
- * GROUP BY elements, which could match the sort ordering of other
- * possible plans (eg an indexscan) and thereby reduce cost.  We don't
- * bother with that, though.  Hashed grouping will frequently win anyway.

Indeed, if you remove the

SET enable_hashagg = off;
SET enable_seqscan = off;

lines added to aggregates.sql, you find that every single one of the
test cases added there changes plan, except for the one case that is
there to prove that the planner *doesn't* pick this type of plan.
That shows that the planner doesn't believe any of the plans developed
here are cheaper than other alternatives it would normally consider
(usually HashAgg).  So it sure seems like we're spending a lot of
effort on uninteresting plans.  Maybe this is an artifact of testing
against toy tables and a useful win could be shown on bigger tables,
but I don't see any indication in the thread that anyone actually
demonstrated that.

I might hold still for this patch anyway if the patch were simpler
and more obviously correct, but there are scary bits in it:

* The very first hunk causes get_eclass_for_sort_expr to have
side-effects on existing EquivalenceClass data structures.
What is the argument that that's not going to cause problems?
At the very least it introduces asymmetry, in that the first
caller wins the chance to change the EC's ec_sortref and later
callers won't be able to.

* I'm pretty unconvinced by group_keys_reorder_by_pathkeys (which
I notice has already had one band-aid added to it since commit).
In particular, it seems to believe that the pathkeys and clauses
lists match one-for-one, but I seriously doubt that that invariant
remains guaranteed after the cleanup steps

    /* append the remaining group pathkeys (will be treated as not sorted) */
    *group_pathkeys = list_concat_unique_ptr(new_group_pathkeys,
                                             *group_pathkeys);
    *group_clauses = list_concat_unique_ptr(new_group_clauses,
                                            *group_clauses);

For that to be reliable, the SortGroupClauses added to
new_group_clauses in the main loop have to be exactly those
that are associated with the same pathkeys in the old lists.
I doubt that that's necessarily true in the presence of redundant
grouping clauses.  (Maybe we can't get here with any redundant
grouping clauses, but still, we don't really guarantee uniqueness of
SortGroupClauses, much less that they are never copied which is what
you need if you want to believe that pointer equality is sufficient
for de-duping here.  PathKeys are explicitly made to be safe to compare
pointer-wise, but I know of no such guarantee for SortGroupClauses.)

* It seems like root->processed_groupClause no longer has much to do
with the grouping behavior, which is scary given how much code still
believes that it does.  I suspect there are bugs lurking there, and
am not comforted by the fact that the patch changed exactly nothing
in the pathnodes.h documentation of that field.  This comment looks
pretty silly now too:

            /* Preprocess regular GROUP BY clause, if any */
            root->processed_groupClause = list_copy(parse->groupClause);

What "preprocessing" is going on there?  This comment was adequate
before, when the code line invoked preprocess_groupclause which had
a bunch of commentary; but now it has to stand on its own and it's
not doing a great job of that.

* Speaking of pathnodes.h, PathKeyInfo is a pretty uninformative node
type name, and the comments provided for it are not going to educate
anybody.  What is the "association" between the pathkeys and clauses?
I guess the clauses are supposed to be SortGroupClauses, but not all
pathkeys match up to SortGroupClauses, so what then?  This is very
underspecified, and fuzzy thinking about data structures tends to lead
to bugs.

So I'm quite afraid that there are still bugs lurking here.
What's more, given that the plans this patch makes apparently
seldom win when you don't put a thumb on the scales, it might
take a very long time to isolate those bugs.  If the patch
produces a faulty plan (e.g. not sorted properly) we'd never
notice if that plan isn't chosen, and even if it is chosen
it probably wouldn't produce anything as obviously wrong as
a crash.

If this patch were producing better results I'd be more excited
about putting more work into it.  But on the basis of what I'm
seeing right now, I think maybe we ought to give up on it.

            regards, tom lane



Re: POC: GROUP BY optimization

From
Andrei Lepikhov
Date:
On 4/12/24 06:44, Tom Lane wrote:
> If this patch were producing better results I'd be more excited
> about putting more work into it.  But on the basis of what I'm
> seeing right now, I think maybe we ought to give up on it.
First, thanks for the deep review - sometimes only a commit gives us a 
chance to get such observations :))).
On a broader note, introducing automatic GROUP BY order choosing is a 
step towards training the optimiser to handle poorly tuned incoming 
queries. While it's true that this may initially impact performance, 
it's crucial to weigh the potential benefits. So, beforehand, we should 
agree: is it worth it?
If yes, I would say that I often see hashing not working for grouping: 
sometimes because of estimation errors, sometimes because the grouping 
already has sorted input, sometimes in analytical queries when the planner 
doesn't have enough memory for hashing. In analytical cases, the only way 
to speed up a query is sometimes to be smart with features like 
Incremental Sort and this one.
About the low efficiency: remember the previous version of the GROUP BY 
optimisation - we disagreed on operator costs and the cost model in 
general. In the current version, we went the opposite way - adding small 
features step by step. The current commit contains an integral part of 
the feature and is designed for safely testing the approach and adding 
more profitable parts later, like choosing the GROUP BY order according to 
distinct values or unique indexes on grouping columns.
I have gone through the code, steered by the issues you explained in 
detail. I see seven issues. Two of them definitely should be scrutinised 
right now, and I'm ready to do that.

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: POC: GROUP BY optimization

From
Andrei Lepikhov
Date:
On 4/12/24 06:44, Tom Lane wrote:
> * The very first hunk causes get_eclass_for_sort_expr to have
> side-effects on existing EquivalenceClass data structures.
> What is the argument that that's not going to cause problems?
> At the very least it introduces asymmetry, in that the first
> caller wins the chance to change the EC's ec_sortref and later
> callers won't be able to.
Let me resolve the issues bit by bit.
Addressing your first note about 'asymmetry', I looked into this part of 
the code. Before 8d83a5d it could cause some artefacts, but since then a 
kind of asymmetry has been present in the GROUP BY clause list anyway.
As you can see in the attached patch, GROUP BY reordering works even 
when the asymmetry of 8d83a5d chooses different keys.

At the same time, I agree that implicitly setting anything in an 
exported function, which should just look up an EC, is questionable 
coding style. To avoid possible usage issues (in extensions, for 
example), I propose setting it afterwards, explicitly forcing this 
action via an input parameter - see my attempt in the attachment.

-- 
regards,
Andrei Lepikhov
Postgres Professional

Attachments

Re: POC: GROUP BY optimization

From
Andrei Lepikhov
Date:
On 4/12/24 06:44, Tom Lane wrote:
> * I'm pretty unconvinced by group_keys_reorder_by_pathkeys (which
> I notice has already had one band-aid added to it since commit).
> In particular, it seems to believe that the pathkeys and clauses
> lists match one-for-one, but I seriously doubt that that invariant
> remains guaranteed after the cleanup steps
> 
>      /* append the remaining group pathkeys (will be treated as not sorted) */
>      *group_pathkeys = list_concat_unique_ptr(new_group_pathkeys,
>                                               *group_pathkeys);
>      *group_clauses = list_concat_unique_ptr(new_group_clauses,
>                                              *group_clauses);
> 
> For that to be reliable, the SortGroupClauses added to
> new_group_clauses in the main loop have to be exactly those
> that are associated with the same pathkeys in the old lists.
> I doubt that that's necessarily true in the presence of redundant
> grouping clauses.  (Maybe we can't get here with any redundant
> grouping clauses, but still, we don't really guarantee uniqueness of
> SortGroupClauses, much less that they are never copied which is what
> you need if you want to believe that pointer equality is sufficient
> for de-duping here.  PathKeys are explicitly made to be safe to compare
> pointer-wise, but I know of no such guarantee for SortGroupClauses.)
I spent a lot of time inventing situations with SortGroupClause 
duplicates. Unfortunately, it looks impossible so far. But because we 
really don't guarantee uniqueness, I changed the code to survive that 
case. Also, I added assertion checks to be sure we don't have logical 
mistakes here - see the attachment.
About the band-aid mentioned above - as I see it, 4169850 introduces the 
same trick in planner.c. So it looks like a consequence of the design of 
the current code.

-- 
regards,
Andrei Lepikhov
Postgres Professional

Attachments

Re: POC: GROUP BY optimization

From
Andrei Lepikhov
Date:
On 4/12/24 06:44, Tom Lane wrote:
> * It seems like root->processed_groupClause no longer has much to do
> with the grouping behavior, which is scary given how much code still
> believes that it does.  I suspect there are bugs lurking there, and
1. Analysing the code, processed_groupClause is used in:
- adjust_foreign_grouping_path_cost - to decide whether we need to add the 
cost of a sort to the foreign grouping;
- just for replacing relids during the SJE process;
- create_groupingsets_plan(), preprocess_grouping_sets, 
remove_useless_groupby_columns - we don't apply this feature in the case 
of grouping sets;
- get_number_of_groups - the estimation shouldn't depend on the order of 
clauses;
- create_grouping_paths - to analyse the hashability of the grouping clause;
- make_group_input_target, make_partial_grouping_target - doesn't depend 
on the order of clauses;
- planner.c: add_paths_to_grouping_rel, create_partial_grouping_paths - in 
the case of hash grouping.

So, the only case where we can impact current behavior lies in 
postgres_fdw. But there we don't really know which plan will be chosen 
during planning on the foreign node, so keeping the same behavior looks 
legal to me.
> am not comforted by the fact that the patch changed exactly nothing
> in the pathnodes.h documentation of that field.  This comment looks
> pretty silly now too:
> 
>              /* Preprocess regular GROUP BY clause, if any */
>              root->processed_groupClause = list_copy(parse->groupClause);
> 
> What "preprocessing" is going on there?  This comment was adequate
> before, when the code line invoked preprocess_groupclause which had
> a bunch of commentary; but now it has to stand on its own and it's
> not doing a great job of that.
Thanks for picking out this place. I fixed it.
More interesting here is that we move decisions on the order of GROUP BY 
clauses to the stage where we already have the underlying subtree and the 
incoming pathkeys. In principle, we could bring back the preprocessing 
routine and arrange GROUP BY against the ORDER BY clause as it was done 
before. But looking into the code, I found only one reason to do this: 
adjust_group_pathkeys_for_groupagg. Curious about how much profit we get 
there, I think we can explore it later, with no hurry. A good outcome 
here would be a test case that shows the importance of arranging 
GROUP BY and ORDER BY at an early stage.
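Something along these lines (just a speculative sketch against the 
regression-test table btg, not a ready test case) might exercise that path, 
since the ordered aggregate asks for its own pathkeys on top of the grouping:

SET enable_hashagg = off;
-- hypothetical query: the array_agg(... ORDER BY ...) gives
-- adjust_group_pathkeys_for_groupagg something to append to the group pathkeys
EXPLAIN (COSTS OFF)
SELECT x, array_agg(w ORDER BY w) FROM btg GROUP BY x, y;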

-- 
regards,
Andrei Lepikhov
Postgres Professional

Attachments

Re: POC: GROUP BY optimization

From
Andrei Lepikhov
Date:
On 4/12/24 06:44, Tom Lane wrote:
> * Speaking of pathnodes.h, PathKeyInfo is a pretty uninformative node
> type name, and the comments provided for it are not going to educate
> anybody.  What is the "association" between the pathkeys and clauses?
> I guess the clauses are supposed to be SortGroupClauses, but not all
> pathkeys match up to SortGroupClauses, so what then?  This is very
> underspecified, and fuzzy thinking about data structures tends to lead
> to bugs.
I'm not the best at English documentation and naming style, so feel 
free to check my version.
> 
> So I'm quite afraid that there are still bugs lurking here.
> What's more, given that the plans this patch makes apparently
> seldom win when you don't put a thumb on the scales, it might
> take a very long time to isolate those bugs.  If the patch
> produces a faulty plan (e.g. not sorted properly) we'd never
> notice if that plan isn't chosen, and even if it is chosen
> it probably wouldn't produce anything as obviously wrong as
> a crash.
I added checks on the proper order of pathkeys and clauses.
  If you really care about that, we should spend additional time and 
rewrite the code to generate the order of clauses somewhere in the plan 
creation phase. For example, during the create_group_plan call, we could 
use the processed_groupClause list, cycle through subpath->pathkeys, set 
the order, and detect whether the pathkeys list corresponds to the 
GROUP BY or is enough to build a grouping plan.
Anyway, making this part of the code more resistant to blunders is another story.

-- 
regards,
Andrei Lepikhov
Postgres Professional

Attachments

Re: POC: GROUP BY optimization

From
Alexander Korotkov
Date:
Hi, Andrei!

On Thu, Apr 18, 2024 at 11:54 AM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 4/12/24 06:44, Tom Lane wrote:
> > * Speaking of pathnodes.h, PathKeyInfo is a pretty uninformative node
> > type name, and the comments provided for it are not going to educate
> > anybody.  What is the "association" between the pathkeys and clauses?
> > I guess the clauses are supposed to be SortGroupClauses, but not all
> > pathkeys match up to SortGroupClauses, so what then?  This is very
> > underspecified, and fuzzy thinking about data structures tends to lead
> > to bugs.
> I'm not the best in English documentation and naming style. So, feel
> free to check my version.
> >
> > So I'm quite afraid that there are still bugs lurking here.
> > What's more, given that the plans this patch makes apparently
> > seldom win when you don't put a thumb on the scales, it might
> > take a very long time to isolate those bugs.  If the patch
> > produces a faulty plan (e.g. not sorted properly) we'd never
> > notice if that plan isn't chosen, and even if it is chosen
> > it probably wouldn't produce anything as obviously wrong as
> > a crash.
> I added checkings on the proper order of pathkeys and clauses.
>   If you really care about that, we should spend additional time and
> rewrite the code to generate an order of clauses somewhere in the plan
> creation phase. For example, during the create_group_plan call, we could
> use the processed_groupClause list, cycle through subpath->pathkeys, set
> the order, and detect whether the pathkeys list corresponds to the
> group-by or is enough to build a grouping plan.
> Anyway, make this part of code more resistant to blunders is another story.

Thank you for the fixes you've proposed.  I didn't look much into
details yet, but I think the main concern Tom expressed in [1] is
whether the feature is reasonable at all.  I think at this stage the
most important thing is to come up with convincing examples showing
how huge performance benefits it could cause.  I will return to this
later today and will try to provide some convincing examples.

Links
1. https://www.postgresql.org/message-id/266850.1712879082%40sss.pgh.pa.us

------
Regards,
Alexander Korotkov



Re: POC: GROUP BY optimization

From
jian he
Date:
On Thu, Apr 18, 2024 at 6:58 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> Thank you for the fixes you've proposed.  I didn't look much into
> details yet, but I think the main concern Tom expressed in [1] is
> whether the feature is reasonable at all.  I think at this stage the
> most important thing is to come up with convincing examples showing
> how huge performance benefits it could cause.  I will return to this
> later today and will try to provide some convincing examples.
>

hi.
I found a case where it improved performance.

+-- GROUP BY optimization by reorder columns
+CREATE TABLE btg AS SELECT
+ i % 100 AS x,
+ i % 100 AS y,
+ 'abc' || i % 10 AS z,
+ i AS w
+FROM generate_series(1,10000) AS i;
+CREATE INDEX abc ON btg(x,y);
+ANALYZE btg;
+
I changed
+FROM generate_series(1,10000) AS i;
to
+ FROM generate_series(1, 1e6) AS i;

Then I found that the performance of these two queries improved a lot.
A: explain(analyze) SELECT count(*) FROM btg GROUP BY w, x, y, z ORDER
BY y, x \watch i=0.1 c=10
B: explain(analyze) SELECT count(*) FROM btg GROUP BY w, x, z, y ORDER
BY y, x, z, w \watch i=0.1 c=10

After setting (enable_seqscan, enable_hashagg) from on to off:
query A execution time went from 1533.013 ms to 533.430 ms
query B execution time went from 1996.817 ms to 497.020 ms



Re: POC: GROUP BY optimization

From
Andrei Lepikhov
Date:
On 4/12/24 06:44, Tom Lane wrote:
> If this patch were producing better results I'd be more excited
> about putting more work into it.  But on the basis of what I'm
> seeing right now, I think maybe we ought to give up on it.
Let me show current cases where users profit from this tiny 
improvement (see queries and execution results in query.sql):
1. 'Not optimised query text' - when GROUP BY ordering wasn't considered 
during database development.
2. 'Accidental pathkeys' - we don't see any explicit orderings, but, 
accidentally, the planner used a merge join that produced some ordering 
we can utilise (see the sketch below).
3. 'Uncertain scan path' - we have options regarding which index to use, 
and because of that we can't predict the optimal GROUP BY ordering 
before query planning starts.
4. 'HashAgg vs. GroupAgg' - sometimes the GroupAgg strategy outperforms 
HashAgg just because we don't need any ordering procedure at all.

And the last thing here - this code introduces the basics needed to add 
more sophisticated strategies, like ordering according to uniqueness or 
preferring to put constant-width columns first in grouping.
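To give a feel for case 2, here is a rough sketch (not the attached 
query.sql; it reuses the regression-test table btg and illustrative 
settings, with no claim about the exact plan chosen):

SET enable_hashagg = off;
-- if the planner picks a merge join on x, its output is already ordered by x,
-- so the GroupAggregate above it can reuse that ordering instead of sorting
EXPLAIN (COSTS OFF)
SELECT t1.x, count(*)
FROM btg t1 JOIN btg t2 ON t1.x = t2.x
GROUP BY t1.x;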

-- 
regards,
Andrei Lepikhov
Postgres Professional

Attachments

Re: POC: GROUP BY optimization

From
jian he
Date:
On Fri, Apr 19, 2024 at 6:44 PM jian he <jian.universality@gmail.com> wrote:
>
> On Thu, Apr 18, 2024 at 6:58 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >
> > Thank you for the fixes you've proposed.  I didn't look much into
> > details yet, but I think the main concern Tom expressed in [1] is
> > whether the feature is reasonable at all.  I think at this stage the
> > most important thing is to come up with convincing examples showing
> > how huge performance benefits it could cause.  I will return to this
> > later today and will try to provide some convincing examples.
> >

hi.
Previously, preprocess_groupclause would not process cases
where no ORDER BY clause is specified.
Commit 0452b461b will reorder the GROUP BY elements even though no
ORDER BY clause is specified, if there are associated indexes on them
(I hope I understand this correctly).


for example (when enable_hashagg is false):
explain(verbose) select count(*) FROM btg GROUP BY y,x;
in pg16 this will not be reordered; it stays as is: `GROUP BY y,x`

after commit 0452b461b, it will be reordered to `GROUP BY x,y`,
because there is an index `btree (x, y)` (only one) associated with it.
if you drop the index `btree (x, y)`, it will be `GROUP BY y,x` as in pg16.


This reordering of GROUP BY elements when no ORDER BY clause is specified
is useful for performance when work_mem is small.
I've attached some tests comparing master with REL_16_STABLE to
demonstrate that.
all the tests attached are under the condition:
work_mem='64kB', buildtype=release, max_parallel_workers_per_gather=0.


one example:
CREATE TABLE btg5 AS
SELECT i::numeric % 10 AS x, i % 10 AS y, 'abc' || i % 10 AS z, i % 100000 AS w
FROM generate_series(1, 1e6) AS i;
CREATE INDEX btg5_x_y_idx ON btg5(x, y);

explain(analyze) SELECT count(*) FROM btg5 GROUP BY z, y, w, x;
in pg17,  the execution time is: 746.574 ms
in pg16,  the execution time is: 1693.483 ms

if I reorder it manually as:
`explain(analyze) SELECT count(*) FROM btg5 GROUP BY x, y, w, z;`
then in pg16, the execution time is 630.394 ms

Attachments

Re: POC: GROUP BY optimization

From
Alexander Korotkov
Date:
Hi!

On Thu, Apr 18, 2024 at 1:57 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> Thank you for the fixes you've proposed.  I didn't look much into
> details yet, but I think the main concern Tom expressed in [1] is
> whether the feature is reasonable at all.  I think at this stage the
> most important thing is to come up with convincing examples showing
> how huge performance benefits it could cause.  I will return to this
> later today and will try to provide some convincing examples.

I took a fresh look at 0452b461b, and have the following thoughts:
1) Previously, preprocess_groupclause() tries to reorder GROUP BY
clauses to match the required ORDER BY order.  It only reorders if
GROUP BY pathkeys are the prefix of ORDER BY pathkeys or vice versa.
So, both of them need to be satisfied by one ordering.  0452b461b also
tries to match GROUP BY clauses to ORDER BY clauses, but takes into
account an incremental sort.  Actually, instead of removing
preprocess_groupclause(), we could just teach it to take incremental
sort into account.
2) The get_useful_group_keys_orderings() function takes into account 3
orderings of pathkeys and clauses: original order as written in GROUP
BY, matching ORDER BY clauses as much as possible, and matching the
input path as much as possible.  Given that even before 0452b461b,
preprocess_groupclause() could change the original order of GROUP BY
clauses, so we don't need to distinguish the first two.  We could just
consider output of new preprocess_groupclause() taking into account an
incremental sort and the ordering matching the input path as much as
possible.  This seems like significant simplification.
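To illustrate the first point with a rough sketch (reusing the 
regression-test table btg; just an example of the kind of case meant, not a 
claim about the exact plan chosen):

SET enable_hashagg = off;
-- grouping by (y, x) instead of (x, y) leaves the output presorted on y,
-- so the final ORDER BY y, count(*) only needs an incremental sort
EXPLAIN (COSTS OFF)
SELECT y, x, count(*) FROM btg GROUP BY x, y ORDER BY y, count(*);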

Let me also describe my thoughts about the justification of the
feature itself.  As Tom pointed out upthread, Sort + Grouping is generally
unlikely faster than Hash Aggregate.  The main point of this feature
is being able to match the order of GROUP BY clauses to the order of
the input path.  That input path could be Index Scan or Subquery.
Let's concentrate on Index Scan.  Index Scan can give us the required
order, so we can do grouping without Sort or with significantly
cheaper Incremental Sort.  That could be much faster than Hash
Aggregate.  But if we scan the full table (or its significant
fraction), Index Scan is typically much more expensive than Sequential
Scan because of random page access.  However, there are cases when
Index Scan could be cheaper.
1) If the heap row is wide and the index contains all the required
columns, Index Only Scan can be cheaper than Sequential Scan because
of lesser volume.
2) If the query predicate is selective enough and matches the index,
Index Scan might be significantly cheaper.  One may say that if the
query predicate is selective enough then there are not that many
matching rows, so aggregating them either way isn't a big deal anyway.
However, given that data volumes are growing tremendously, it's not
hard to imagine that the index selected a small fraction of table
rows, but they are still enough to be costly for aggregating.
Therefore, I think there are significant cases where matching GROUP BY
clauses to the order of the input path could give a substantial
improvement over Hash Aggregate.
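As a sketch of the first case (a hypothetical table, not part of the 
regression tests; the actual plan will of course depend on costs):

-- wide heap rows, but the index covers everything the query needs, so an
-- Index Only Scan feeding a GroupAggregate may beat Seq Scan + HashAggregate
CREATE TABLE wide_t AS
  SELECT i % 1000 AS grp, i AS val, repeat('x', 500) AS payload
  FROM generate_series(1, 1000000) AS i;
CREATE INDEX wide_t_grp_val_idx ON wide_t (grp, val);
VACUUM ANALYZE wide_t;
EXPLAIN SELECT grp, sum(val) FROM wide_t GROUP BY grp;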

Besides the particular use cases provided by Jian He, I hope the
above gives some rationale.

------
Regards,
Alexander Korotkov



Re: POC: GROUP BY optimization

From
jian he
Date:
hi.
I found an interesting case.

CREATE TABLE t1 AS
  SELECT (i % 10)::numeric AS x,(i % 10)::int8 AS y,'abc' || i % 10 AS
z, i::int4 AS w
  FROM generate_series(1, 100) AS i;
CREATE INDEX t1_x_y_idx ON t1 (x, y);
ANALYZE t1;
SET enable_hashagg = off;
SET enable_seqscan = off;

EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,z,y,w;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,w,y,z;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,z,w,y;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,w,z,y;
the above part will use:
  ->  Incremental Sort
         Sort Key: x, $, $, $
         Presorted Key: x
         ->  Index Scan using t1_x_y_idx on t1

EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY z,y,w,x;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY w,y,z,x;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,z,x,w;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,w,x,z;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,x,z,w;
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,x,w,z;

these will use:
  ->  Incremental Sort
         Sort Key: x, y, $, $
         Presorted Key: x, y
         ->  Index Scan using t1_x_y_idx on t1

I guess this is fine, but not optimal?



Re: POC: GROUP BY optimization

From
jian he
Date:
hi
one more question (maybe a dumb one....)

drop table if exists t1;
CREATE TABLE t1 AS
  SELECT (i % 10)::numeric AS x,(i % 10)::int8 AS y,'abc' || i % 10 AS
z, i::int4 AS w
  FROM generate_series(1, 100) AS i;
CREATE INDEX t1_x_y_idx ON t1 (x, y);
ANALYZE t1;
SET enable_hashagg = off;
SET enable_seqscan = off;


EXPLAIN (COSTS OFF, verbose) SELECT count(*) FROM t1 GROUP BY z,x,y,w
order by w;


                      QUERY PLAN
------------------------------------------------------
 GroupAggregate
   Output: count(*), w, z, x, y
   Group Key: t1.w, t1.x, t1.y, t1.z
   ->  Sort
         Output: w, z, x, y
         Sort Key: t1.w, t1.x, t1.y, t1.z
         ->  Index Scan using t1_x_y_idx on public.t1
               Output: w, z, x, y
(8 rows)


if the sort is
` Sort Key: t1.w, t1.x, t1.y, t1.z`

then shouldn't the output be:

Output: w, x, y, z

?