Обсуждение: Questionable result from lead(0) IGNORE NULLS

Поиск
Список
Период
Сортировка

Questionable result from lead(0) IGNORE NULLS

От
Tatsuo Ishii
Дата:
Hi Oliver,

I noticed a questionable result from "lead(0) IGNORE NULLS".

CREATE TEMP TABLE g(x INT, y INT);
CREATE TABLE
INSERT INTO g (VALUES(NULL,1),(NULL,2),(1,3));
INSERT 0 3
SELECT * FROM g;
 x | y 
---+---
   | 1
   | 2
 1 | 3
(3 rows)

SELECT x, y, lead(x, 0) RESPECT NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
 x | y | lead 
---+---+------
   | 1 |     
   | 2 |     
 1 | 3 |    1
(3 rows)

SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
 x | y | lead 
---+---+------
   | 1 |     
   | 2 |     
 1 | 3 |    1
(3 rows)

As you can see, "lead(x, 0) IGNORE NULLS" shows the same result as
"lead(x, 0) RESPECT NULLS". IMO "lead(x, 0) IGNORE NULLS" should show
something like:

 x | y | lead 
---+---+------
   | 1 |    1 
   | 2 |    1 
 1 | 3 |    1
(3 rows)

The same thing can be said to lag().

Looking into the code, in
WinGetFuncArgInPartition(src/backend/executor/nodeWindowAgg.c) I see
this:
    if (winobj->ignore_nulls == IGNORE_NULLS && relpos != 0)
    {
        null_treatment = true;

Here, if the caller is lead(0), then relpos == 0, thus
"null_treatment" is not set to true and falls into the code later on:

    if (!null_treatment)        /* IGNORE NULLS is not specified */
    {
        /* get tupple and evaluate in a partition */
        datum = gettuple_eval_partition(winobj, argno,

and runs through the same code path as RESPECT NULLS. I think this is
the reason why "lead(0, x) IGNORE NULLS" showed the same result as
"lead(0, x) RESPECT NULLS". "relpos != 0" part was originally in your
patch. Oliver, what's the reason why you excluded relpose==0 case? Can
we eliminate the restriction and let "lead(0) IGNORE NULLS" case run
the same code path as relpos!=0 (of course with proper adjustment in
related code)?

Best regards,

[1] https://www.postgresql.org/message-id/CAGMVOduHcfhh7Wo9W1Tff0DH_ccPuQGc8D_f5S2_y4OHFOjn%3DA%40mail.gmail.com
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Questionable result from lead(0) IGNORE NULLS

От
Oliver Ford
Дата:


On Tue, Oct 7, 2025 at 8:41 AM Tatsuo Ishii <ishii@postgresql.org> wrote:

As you can see, "lead(x, 0) IGNORE NULLS" shows the same result as
"lead(x, 0) RESPECT NULLS". IMO "lead(x, 0) IGNORE NULLS" should show
something like:

 x | y | lead
---+---+------
   | 1 |    1
   | 2 |    1
 1 | 3 |    1
(3 rows)

Looking into the code, in
WinGetFuncArgInPartition(src/backend/executor/nodeWindowAgg.c) I see
this:
        if (winobj->ignore_nulls == IGNORE_NULLS && relpos != 0)
        {
                null_treatment = true;

Here, if the caller is lead(0), then relpos == 0, thus
"null_treatment" is not set to true and falls into the code later on:

        if (!null_treatment)            /* IGNORE NULLS is not specified */
        {
                /* get tupple and evaluate in a partition */
                datum = gettuple_eval_partition(winobj, argno,

and runs through the same code path as RESPECT NULLS. I think this is
the reason why "lead(0, x) IGNORE NULLS" showed the same result as
"lead(0, x) RESPECT NULLS". "relpos != 0" part was originally in your
patch. Oliver, what's the reason why you excluded relpose==0 case? Can
we eliminate the restriction and let "lead(0) IGNORE NULLS" case run
the same code path as relpos!=0 (of course with proper adjustment in
related code)?


 The result looks wrong. So I've just tried removing the "&& relpos != 0" and I get:

SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
 x | y | lead
---+---+------
   | 1 |
   | 2 |
 1 | 3 |
(3 rows)

Nothing appears for lead at all. So it was doing something but doesn't look like it handles the lead(x, 0) case, but it does handle lead(x) - which is the same as lead(x, 1):

SELECT x, y, lead(x) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
 x | y | lead
---+---+------
   | 1 |    1
   | 2 |    1
 1 | 3 |
(3 rows)

Without Ignore Nulls, lead(x,0) is just the row itself so you'd never use that function. But yes this case needs to be handled, I'll look through the code again, handle this for lead/lag, and add tests.

Re: Questionable result from lead(0) IGNORE NULLS

От
Tatsuo Ishii
Дата:
Hi Oliver,

I have just pushed a change to WinGetFuncArgInPartition() in
nodeWindowAgg.c to fix Coverity issues. So please update your git
respository.

>  The result looks wrong. So I've just tried removing the "&& relpos != 0"
> and I get:
> 
> SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g
> WINDOW w AS (ORDER BY y);
>  x | y | lead
> ---+---+------
>    | 1 |
>    | 2 |
>  1 | 3 |
> (3 rows)
> 
> Nothing appears for lead at all. So it was doing something but doesn't look
> like it handles the lead(x, 0) case

I think we need to change this:

        forward = relpos > 0 ? 1 : -1;
:
:
    /*
     * Get the next nonnull value in the partition, moving forward or backward
     * until we find a value or reach the partition's end.
     */
    do
    {
        int            nn_info;    /* NOT NULL info */

        abs_pos += forward;
        if (abs_pos < 0)        /* apparently out of partition */
            break;

In lead(0, x) case, abs_pos==0 and foward==-1. So it exits the loop
due to out of partition. Probably we need to change
        forward = relpos > 0 ? 1 : -1;
        to
        forward = relpos >= 0 ? 1 : -1;
and change the do..while loop to a for loop?

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Questionable result from lead(0) IGNORE NULLS

От
Tatsuo Ishii
Дата:
Hi Oliver,

>> Nothing appears for lead at all. So it was doing something but doesn't look
>> like it handles the lead(x, 0) case
> 
> I think we need to change this:
> 
>         forward = relpos > 0 ? 1 : -1;
> :
> :
>     /*
>      * Get the next nonnull value in the partition, moving forward or backward
>      * until we find a value or reach the partition's end.
>      */
>     do
>     {
>         int            nn_info;    /* NOT NULL info */
> 
>         abs_pos += forward;
>         if (abs_pos < 0)        /* apparently out of partition */
>             break;
> 
> In lead(0, x) case, abs_pos==0 and foward==-1. So it exits the loop
> due to out of partition. Probably we need to change
>         forward = relpos > 0 ? 1 : -1;
>         to
>         forward = relpos >= 0 ? 1 : -1;
> and change the do..while loop to a for loop?

Attached patch is written in this direction. What do you think?
Below are the results. IMO now lead() returns correct results.

psql -a -f window.sql test
CREATE TEMP TABLE g(x INT, y INT);
CREATE TABLE
INSERT INTO g (VALUES(NULL,1),(NULL,2),(1,3));
INSERT 0 3
SELECT * FROM g;
 x | y 
---+---
   | 1
   | 2
 1 | 3
(3 rows)

SELECT x, y, lead(x, 0) RESPECT NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
 x | y | lead 
---+---+------
   | 1 |     
   | 2 |     
 1 | 3 |    1
(3 rows)

SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
 x | y | lead 
---+---+------
   | 1 |    1
   | 2 |    1
 1 | 3 |    1
(3 rows)

SELECT x, y, lead(x, 1) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
 x | y | lead 
---+---+------
   | 1 |     
   | 2 |     
 1 | 3 |     
(3 rows)

While working on this, I found some of window function regression
tests using lead/lag are not quite correct. Below is some of them.

-- lead
SELECT name,
       orbit,
       lead(orbit) OVER w AS lead,
       lead(orbit) RESPECT NULLS OVER w AS lead_respect,
       lead(orbit) IGNORE NULLS OVER w AS lead_ignore
FROM planets
WINDOW w AS (ORDER BY name)
;
  name   | orbit | lead  | lead_respect | lead_ignore 
---------+-------+-------+--------------+-------------
 earth   |       |  4332 |         4332 |        4332
 jupiter |  4332 |       |              |          88
 mars    |       |    88 |           88 |          88
 mercury |    88 | 60182 |        60182 |       60182
 neptune | 60182 | 90560 |        90560 |       90560
 pluto   | 90560 | 24491 |        24491 |       24491
 saturn  | 24491 |       |              |         224
 uranus  |       |   224 |          224 |         224
 venus   |   224 |       |              |            
 xyzzy   |       |       |              |            
(10 rows)

Why lead_ignore shows "4332" on the first row? Since "orbit"'s second
non null row is orbit==88, I think lead(orbit) should return 88,
rather than 4332 if my understanding of the SQL standard is correct.

IMO the right result is as below, which is actually the one after
applying the patch.

  name   | orbit | lead  | lead_respect | lead_ignore 
---------+-------+-------+--------------+-------------
 earth   |       |  4332 |         4332 |          88
 jupiter |  4332 |       |              |          88
 mars    |       |    88 |           88 |       60182
 mercury |    88 | 60182 |        60182 |       60182
 neptune | 60182 | 90560 |        90560 |       90560
 pluto   | 90560 | 24491 |        24491 |       24491
 saturn  | 24491 |       |              |         224
 uranus  |       |   224 |          224 |            
 venus   |   224 |       |              |            
 xyzzy   |       |       |              |            
(10 rows)

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Questionable result from lead(0) IGNORE NULLS

От
Tatsuo Ishii
Дата:
Hi Oliver,

After studying the standard more, it seems I was totally wrong.
In summary, current code is correct. Sorry for noise.

From the standard explaining lead():

> B) If OFFSET = 0 (zero), then the value of <window function> is the
> value of VE1 evaluated for the current row.
(here VE1 referes to the first argument of lead()).

So it seems we need to treat offset==0 case specially. i.e. eval VE1
and do not through away even if the result is NULL.

> I noticed a questionable result from "lead(0) IGNORE NULLS".
> 
> CREATE TEMP TABLE g(x INT, y INT);
> CREATE TABLE
> INSERT INTO g (VALUES(NULL,1),(NULL,2),(1,3));
> INSERT 0 3
> SELECT * FROM g;
>  x | y 
> ---+---
>    | 1
>    | 2
>  1 | 3
> (3 rows)
> 
> SELECT x, y, lead(x, 0) RESPECT NULLS OVER w FROM g
> WINDOW w AS (ORDER BY y);
>  x | y | lead 
> ---+---+------
>    | 1 |     
>    | 2 |     
>  1 | 3 |    1
> (3 rows)
> 
> SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g
> WINDOW w AS (ORDER BY y);
>  x | y | lead 
> ---+---+------
>    | 1 |     
>    | 2 |     
>  1 | 3 |    1
> (3 rows)
> 
> As you can see, "lead(x, 0) IGNORE NULLS" shows the same result as
> "lead(x, 0) RESPECT NULLS". IMO "lead(x, 0) IGNORE NULLS" should show
> something like:

No. The result above is perfectly correct.

>    if (winobj->ignore_nulls == IGNORE_NULLS && relpos != 0)

Now I understand why "relpos != 0" part is necessary.

Also, my email below was wrong.

> While working on this, I found some of window function regression
> tests using lead/lag are not quite correct. Below is some of them.
> 
> -- lead
> SELECT name,
>        orbit,
>        lead(orbit) OVER w AS lead,
>        lead(orbit) RESPECT NULLS OVER w AS lead_respect,
>        lead(orbit) IGNORE NULLS OVER w AS lead_ignore
> FROM planets
> WINDOW w AS (ORDER BY name)
> ;
>   name   | orbit | lead  | lead_respect | lead_ignore 
> ---------+-------+-------+--------------+-------------
>  earth   |       |  4332 |         4332 |        4332
>  jupiter |  4332 |       |              |          88
>  mars    |       |    88 |           88 |          88
>  mercury |    88 | 60182 |        60182 |       60182
>  neptune | 60182 | 90560 |        90560 |       90560
>  pluto   | 90560 | 24491 |        24491 |       24491
>  saturn  | 24491 |       |              |         224
>  uranus  |       |   224 |          224 |         224
>  venus   |   224 |       |              |            
>  xyzzy   |       |       |              |            
> (10 rows)
> 
> Why lead_ignore shows "4332" on the first row? Since "orbit"'s second
> non null row is orbit==88, I think lead(orbit) should return 88,
> rather than 4332 if my understanding of the SQL standard is correct.

According to the standard if OFFSET is not 0, then:
> B) Otherwise, let TX be the sequence of values that is the result of applying VE1 to each
> row of T that follows the current row and eliminating null values, ordered according
> to the window ordering of WDX.

> C) Otherwise, the value of <window function> is the m-th value of
> TX, where m = OFFSET.

Thus TX does not include current row and for the first row, and
lead(orbit) returns the second row, that is orbit==4332 because m==1.
So, current regression test expected file is correct.

Again, sorry for noise.
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp