Обсуждение: Two Window aggregate node for logically same over clause

Поиск
Список
Период
Сортировка

Two Window aggregate node for logically same over clause

От
"\"Anitha S\""
Дата:


Hi team,

      We have observed that for logically same over clause two different window aggregate nodes are created in plan.  
The below query contains two window functions. Both Over clause contain the same partition & order clause in it. But in one over clause ordering option is mentioned as ascending but not in another over clause which represents the default option "ascending".

 
Sample Query:
 
CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
enroll_date date
);

INSERT INTO empsalary VALUES ('develop', 10, 5200, '2007-08-01'), ('sales', 1, 5000, '2006-10-01'),
('personnel', 5, 3500, '2007-12-10'),('sales', 4, 4800, '2007-08-08'),('personnel', 2, 3900, '2006-12-23'),
('develop', 7, 4200, '2008-01-01'),('develop', 9, 4500, '2008-01-01'),('sales', 3, 4800, '2007-08-01'),
('develop', 8, 6000, '2006-10-01'),('develop', 11, 5200, '2007-08-15');

explain verbose select rank() over (partition by depname order by salary), percent_rank() over(partition by depname order by salary asc) from empsalary;
                  QUERY PLAN
------------------------------------------------------------------------------------------
     WindowAgg (cost=10000000074.54..10000000114.66 rows=1070 width=52)
     Output: (rank() OVER (?)), percent_rank() OVER (?), salary, depname
      -> WindowAgg (cost=10000000074.54..10000000095.94 rows=1070 width=44)
          Output: salary, depname, rank() OVER (?)
            -> Sort (cost=10000000074.54..10000000077.21 rows=1070 width=36)
                Output: salary, depname
                Sort Key: empsalary.depname, empsalary.salary
                   -> Seq Scan on pg_temp_7.empsalary (cost=0.00..20.70 rows=1070 width=36)
                       Output: salary, depname


Ordering option of Sort is represented by enum SortByDir (parsenodes.h).

List of SortBy is present in WindowDef structure which has info of order by clause in Over clause

/* Sort ordering options for ORDER BY and CREATE INDEX */
typedef enum SortByDir
{
SORTBY_DEFAULT,
SORTBY_ASC,
SORTBY_DESC,
SORTBY_USING /* not allowed in CREATE INDEX ... */
} SortByDir;
typedef struct SortBy
{
NodeTag type;
Node *node; /* expression to sort on */
SortByDir sortby_dir; /* ASC/DESC/USING/default */
SortByNulls sortby_nulls; /* NULLS FIRST/LAST */
List *useOp; /* name of op to use, if SORTBY_USING */
int location; /* operator location, or -1 if none/unknown */
} SortBy;


In transformWindowFuncCall API, Equality check of order clause in window definition failed while comparing SortByDir enum of both over clause i.e SORT_DEFAULT  is not equal to SORT_ASC. Hence two window clause are created in parse tree resulting in the creation of two different window aggregate node.

This check can be modified to form a single window aggregate node for the above results in faster query execution. Is there any reason for creating two different window aggregate node?

Thanks 
Anitha S

 









Re: Two Window aggregate node for logically same over clause

От
Ashutosh Bapat
Дата:
On Thu, Oct 5, 2023 at 8:53 PM "Anitha S" <anitha.sg@zohocorp.com> wrote:
>
>
>
> Hi team,
>
>       We have observed that for logically same over clause two different window aggregate nodes are created in plan.
> The below query contains two window functions. Both Over clause contain the same partition & order clause in it. But
inone over clause ordering option is mentioned as ascending but not in another over clause which represents the default
option"ascending". 
>
>
> Sample Query:
>
> CREATE TEMPORARY TABLE empsalary (
> depname varchar,
> empno bigint,
> salary int,
> enroll_date date
> );
>
> INSERT INTO empsalary VALUES ('develop', 10, 5200, '2007-08-01'), ('sales', 1, 5000, '2006-10-01'),
> ('personnel', 5, 3500, '2007-12-10'),('sales', 4, 4800, '2007-08-08'),('personnel', 2, 3900, '2006-12-23'),
> ('develop', 7, 4200, '2008-01-01'),('develop', 9, 4500, '2008-01-01'),('sales', 3, 4800, '2007-08-01'),
> ('develop', 8, 6000, '2006-10-01'),('develop', 11, 5200, '2007-08-15');
>
> explain verbose select rank() over (partition by depname order by salary), percent_rank() over(partition by depname
orderby salary asc) from empsalary; 
>                   QUERY PLAN
> ------------------------------------------------------------------------------------------
>      WindowAgg (cost=10000000074.54..10000000114.66 rows=1070 width=52)
>      Output: (rank() OVER (?)), percent_rank() OVER (?), salary, depname
>       -> WindowAgg (cost=10000000074.54..10000000095.94 rows=1070 width=44)
>           Output: salary, depname, rank() OVER (?)
>             -> Sort (cost=10000000074.54..10000000077.21 rows=1070 width=36)
>                 Output: salary, depname
>                 Sort Key: empsalary.depname, empsalary.salary
>                    -> Seq Scan on pg_temp_7.empsalary (cost=0.00..20.70 rows=1070 width=36)
>                        Output: salary, depname
>
>
> Ordering option of Sort is represented by enum SortByDir (parsenodes.h).
>
> List of SortBy is present in WindowDef structure which has info of order by clause in Over clause
>
> /* Sort ordering options for ORDER BY and CREATE INDEX */
> typedef enum SortByDir
> {
> SORTBY_DEFAULT,
> SORTBY_ASC,
> SORTBY_DESC,
> SORTBY_USING /* not allowed in CREATE INDEX ... */
> } SortByDir;
> typedef struct SortBy
> {
> NodeTag type;
> Node *node; /* expression to sort on */
> SortByDir sortby_dir; /* ASC/DESC/USING/default */
> SortByNulls sortby_nulls; /* NULLS FIRST/LAST */
> List *useOp; /* name of op to use, if SORTBY_USING */
> int location; /* operator location, or -1 if none/unknown */
> } SortBy;
>
>
> In transformWindowFuncCall API, Equality check of order clause in window definition failed while comparing SortByDir
enumof both over clause i.e SORT_DEFAULT  is not equal to SORT_ASC. Hence two window clause are created in parse tree
resultingin the creation of two different window aggregate node. 
>
> This check can be modified to form a single window aggregate node for the above results in faster query execution. Is
thereany reason for creating two different window aggregate node? 

I don't see any. https://www.postgresql.org/docs/16/sql-select.html
description of ORDER BY clause clearly says that ASC is assumed when
no direction is mentioned. The only place in code which is used to
create the node treats DEFAULT and ASC as same. May be we want to
allow default to be ASC or DESC based on some setting (read GUC) in
some future.

Another angle is to ask: Why would the query add ASC to one window
specification and not the other?

--
Best Wishes,
Ashutosh Bapat



Re: Two Window aggregate node for logically same over clause

От
Tom Lane
Дата:
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes:
> On Thu, Oct 5, 2023 at 8:53 PM "Anitha S" <anitha.sg@zohocorp.com> wrote:
>> We have observed that for logically same over clause two different window aggregate nodes are created in plan.
>> The below query contains two window functions. Both Over clause contain the same partition & order clause in it. But
inone over clause ordering option is mentioned as ascending but not in another over clause which represents the default
option"ascending". 

> Another angle is to ask: Why would the query add ASC to one window
> specification and not the other?

Yeah.  I can't get excited about doing anything about this.  We
promise to merge identical window clauses, but these aren't identical.
If you say "let's merge semantically equivalent clauses", that's
opening a fairly large can of worms --- for example, ought we to
recognize that "x + 1.0" and "x + 1.00" are equivalent?  Or even
"x" and "x + 0"?  (I'm pretty sure I've seen query hacks recommended
that depend on our *not* detecting that.)

Also, it would be an extremely bad idea IMO to change the way
equal() deals with this, which means that transformWindowFuncCall
would have to use bespoke code not equal() to check for matches.
That'd be ugly and a permanent maintenance gotcha.

            regards, tom lane



Re: Two Window aggregate node for logically same over clause

От
"\"Anitha S\""
Дата:
https://www.postgresql.org/docs/16/sql-select.html#SQL-ORDERBY:~:text=Optionally%20one%20can%20add%20the%20key%20word%20ASC%20(ascending)%20or%20DESC%20(descending)%20after%20any%20expression%20in%20the%20ORDER%20BY%20clause.%20If%20not%20specified%2C%20ASC%20is%20assumed%20by%20default.%20Alternatively

If order by directions is not mentioned it is assumed as ASC. It is mention ORDER by section. Currently PG don't have any GUC to set default order direction. Is there any idea to set default ordering direction via GUC ?

Another angle is to ask: Why would the query add ASC to one window
specification and not the other?

Had seen a query from user where one over clause contains order direction & others not.  The reason for this thread is to get a solution to handle such cases also.



---- On Fri, 06 Oct 2023 18:06:10 +0530 Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote ---

On Thu, Oct 5, 2023 at 8:53 PM "Anitha S" <anitha.sg@zohocorp.com> wrote:
>
>
>
> Hi team,
>
> We have observed that for logically same over clause two different window aggregate nodes are created in plan.
> The below query contains two window functions. Both Over clause contain the same partition & order clause in it. But in one over clause ordering option is mentioned as ascending but not in another over clause which represents the default option "ascending".
>
>
> Sample Query:
>
> CREATE TEMPORARY TABLE empsalary (
> depname varchar,
> empno bigint,
> salary int,
> enroll_date date
> );
>
> INSERT INTO empsalary VALUES ('develop', 10, 5200, '2007-08-01'), ('sales', 1, 5000, '2006-10-01'),
> ('personnel', 5, 3500, '2007-12-10'),('sales', 4, 4800, '2007-08-08'),('personnel', 2, 3900, '2006-12-23'),
> ('develop', 7, 4200, '2008-01-01'),('develop', 9, 4500, '2008-01-01'),('sales', 3, 4800, '2007-08-01'),
> ('develop', 8, 6000, '2006-10-01'),('develop', 11, 5200, '2007-08-15');
>
> explain verbose select rank() over (partition by depname order by salary), percent_rank() over(partition by depname order by salary asc) from empsalary;
> QUERY PLAN
> ------------------------------------------------------------------------------------------
> WindowAgg (cost=10000000074.54..10000000114.66 rows=1070 width=52)
> Output: (rank() OVER (?)), percent_rank() OVER (?), salary, depname
> -> WindowAgg (cost=10000000074.54..10000000095.94 rows=1070 width=44)
> Output: salary, depname, rank() OVER (?)
> -> Sort (cost=10000000074.54..10000000077.21 rows=1070 width=36)
> Output: salary, depname
> Sort Key: empsalary.depname, empsalary.salary
> -> Seq Scan on pg_temp_7.empsalary (cost=0.00..20.70 rows=1070 width=36)
> Output: salary, depname
>
>
> Ordering option of Sort is represented by enum SortByDir (parsenodes.h).
>
> List of SortBy is present in WindowDef structure which has info of order by clause in Over clause
>
> /* Sort ordering options for ORDER BY and CREATE INDEX */
> typedef enum SortByDir
> {
> SORTBY_DEFAULT,
> SORTBY_ASC,
> SORTBY_DESC,
> SORTBY_USING /* not allowed in CREATE INDEX ... */
> } SortByDir;
> typedef struct SortBy
> {
> NodeTag type;
> Node *node; /* expression to sort on */
> SortByDir sortby_dir; /* ASC/DESC/USING/default */
> SortByNulls sortby_nulls; /* NULLS FIRST/LAST */
> List *useOp; /* name of op to use, if SORTBY_USING */
> int location; /* operator location, or -1 if none/unknown */
> } SortBy;
>
>
> In transformWindowFuncCall API, Equality check of order clause in window definition failed while comparing SortByDir enum of both over clause i.e SORT_DEFAULT is not equal to SORT_ASC. Hence two window clause are created in parse tree resulting in the creation of two different window aggregate node.
>
> This check can be modified to form a single window aggregate node for the above results in faster query execution. Is there any reason for creating two different window aggregate node?

I don't see any. https://www.postgresql.org/docs/16/sql-select.html
description of ORDER BY clause clearly says that ASC is assumed when
no direction is mentioned. The only place in code which is used to
create the node treats DEFAULT and ASC as same. May be we want to
allow default to be ASC or DESC based on some setting (read GUC) in
some future.

Another angle is to ask: Why would the query add ASC to one window
specification and not the other?

--
Best Wishes,
Ashutosh Bapat