Обсуждение: Questionable result from lead(0) IGNORE NULLS
Hi Oliver, I noticed a questionable result from "lead(0) IGNORE NULLS". CREATE TEMP TABLE g(x INT, y INT); CREATE TABLE INSERT INTO g (VALUES(NULL,1),(NULL,2),(1,3)); INSERT 0 3 SELECT * FROM g; x | y ---+--- | 1 | 2 1 | 3 (3 rows) SELECT x, y, lead(x, 0) RESPECT NULLS OVER w FROM g WINDOW w AS (ORDER BY y); x | y | lead ---+---+------ | 1 | | 2 | 1 | 3 | 1 (3 rows) SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g WINDOW w AS (ORDER BY y); x | y | lead ---+---+------ | 1 | | 2 | 1 | 3 | 1 (3 rows) As you can see, "lead(x, 0) IGNORE NULLS" shows the same result as "lead(x, 0) RESPECT NULLS". IMO "lead(x, 0) IGNORE NULLS" should show something like: x | y | lead ---+---+------ | 1 | 1 | 2 | 1 1 | 3 | 1 (3 rows) The same thing can be said to lag(). Looking into the code, in WinGetFuncArgInPartition(src/backend/executor/nodeWindowAgg.c) I see this: if (winobj->ignore_nulls == IGNORE_NULLS && relpos != 0) { null_treatment = true; Here, if the caller is lead(0), then relpos == 0, thus "null_treatment" is not set to true and falls into the code later on: if (!null_treatment) /* IGNORE NULLS is not specified */ { /* get tupple and evaluate in a partition */ datum = gettuple_eval_partition(winobj, argno, and runs through the same code path as RESPECT NULLS. I think this is the reason why "lead(0, x) IGNORE NULLS" showed the same result as "lead(0, x) RESPECT NULLS". "relpos != 0" part was originally in your patch. Oliver, what's the reason why you excluded relpose==0 case? Can we eliminate the restriction and let "lead(0) IGNORE NULLS" case run the same code path as relpos!=0 (of course with proper adjustment in related code)? Best regards, [1] https://www.postgresql.org/message-id/CAGMVOduHcfhh7Wo9W1Tff0DH_ccPuQGc8D_f5S2_y4OHFOjn%3DA%40mail.gmail.com -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
On Tue, Oct 7, 2025 at 8:41 AM Tatsuo Ishii <ishii@postgresql.org> wrote:
As you can see, "lead(x, 0) IGNORE NULLS" shows the same result as
"lead(x, 0) RESPECT NULLS". IMO "lead(x, 0) IGNORE NULLS" should show
something like:
x | y | lead
---+---+------
| 1 | 1
| 2 | 1
1 | 3 | 1
(3 rows)
Looking into the code, in
WinGetFuncArgInPartition(src/backend/executor/nodeWindowAgg.c) I see
this:
if (winobj->ignore_nulls == IGNORE_NULLS && relpos != 0)
{
null_treatment = true;
Here, if the caller is lead(0), then relpos == 0, thus
"null_treatment" is not set to true and falls into the code later on:
if (!null_treatment) /* IGNORE NULLS is not specified */
{
/* get tupple and evaluate in a partition */
datum = gettuple_eval_partition(winobj, argno,
and runs through the same code path as RESPECT NULLS. I think this is
the reason why "lead(0, x) IGNORE NULLS" showed the same result as
"lead(0, x) RESPECT NULLS". "relpos != 0" part was originally in your
patch. Oliver, what's the reason why you excluded relpose==0 case? Can
we eliminate the restriction and let "lead(0) IGNORE NULLS" case run
the same code path as relpos!=0 (of course with proper adjustment in
related code)?
The result looks wrong. So I've just tried removing the "&& relpos != 0" and I get:
SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
x | y | lead
---+---+------
| 1 |
| 2 |
1 | 3 |
(3 rows)
WINDOW w AS (ORDER BY y);
x | y | lead
---+---+------
| 1 |
| 2 |
1 | 3 |
(3 rows)
Nothing appears for lead at all. So it was doing something but doesn't look like it handles the lead(x, 0) case, but it does handle lead(x) - which is the same as lead(x, 1):
SELECT x, y, lead(x) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
x | y | lead
---+---+------
| 1 | 1
| 2 | 1
1 | 3 |
(3 rows)
WINDOW w AS (ORDER BY y);
x | y | lead
---+---+------
| 1 | 1
| 2 | 1
1 | 3 |
(3 rows)
Without Ignore Nulls, lead(x,0) is just the row itself so you'd never use that function. But yes this case needs to be handled, I'll look through the code again, handle this for lead/lag, and add tests.
Hi Oliver, I have just pushed a change to WinGetFuncArgInPartition() in nodeWindowAgg.c to fix Coverity issues. So please update your git respository. > The result looks wrong. So I've just tried removing the "&& relpos != 0" > and I get: > > SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g > WINDOW w AS (ORDER BY y); > x | y | lead > ---+---+------ > | 1 | > | 2 | > 1 | 3 | > (3 rows) > > Nothing appears for lead at all. So it was doing something but doesn't look > like it handles the lead(x, 0) case I think we need to change this: forward = relpos > 0 ? 1 : -1; : : /* * Get the next nonnull value in the partition, moving forward or backward * until we find a value or reach the partition's end. */ do { int nn_info; /* NOT NULL info */ abs_pos += forward; if (abs_pos < 0) /* apparently out of partition */ break; In lead(0, x) case, abs_pos==0 and foward==-1. So it exits the loop due to out of partition. Probably we need to change forward = relpos > 0 ? 1 : -1; to forward = relpos >= 0 ? 1 : -1; and change the do..while loop to a for loop? Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Hi Oliver, >> Nothing appears for lead at all. So it was doing something but doesn't look >> like it handles the lead(x, 0) case > > I think we need to change this: > > forward = relpos > 0 ? 1 : -1; > : > : > /* > * Get the next nonnull value in the partition, moving forward or backward > * until we find a value or reach the partition's end. > */ > do > { > int nn_info; /* NOT NULL info */ > > abs_pos += forward; > if (abs_pos < 0) /* apparently out of partition */ > break; > > In lead(0, x) case, abs_pos==0 and foward==-1. So it exits the loop > due to out of partition. Probably we need to change > forward = relpos > 0 ? 1 : -1; > to > forward = relpos >= 0 ? 1 : -1; > and change the do..while loop to a for loop? Attached patch is written in this direction. What do you think? Below are the results. IMO now lead() returns correct results. psql -a -f window.sql test CREATE TEMP TABLE g(x INT, y INT); CREATE TABLE INSERT INTO g (VALUES(NULL,1),(NULL,2),(1,3)); INSERT 0 3 SELECT * FROM g; x | y ---+--- | 1 | 2 1 | 3 (3 rows) SELECT x, y, lead(x, 0) RESPECT NULLS OVER w FROM g WINDOW w AS (ORDER BY y); x | y | lead ---+---+------ | 1 | | 2 | 1 | 3 | 1 (3 rows) SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g WINDOW w AS (ORDER BY y); x | y | lead ---+---+------ | 1 | 1 | 2 | 1 1 | 3 | 1 (3 rows) SELECT x, y, lead(x, 1) IGNORE NULLS OVER w FROM g WINDOW w AS (ORDER BY y); x | y | lead ---+---+------ | 1 | | 2 | 1 | 3 | (3 rows) While working on this, I found some of window function regression tests using lead/lag are not quite correct. Below is some of them. -- lead SELECT name, orbit, lead(orbit) OVER w AS lead, lead(orbit) RESPECT NULLS OVER w AS lead_respect, lead(orbit) IGNORE NULLS OVER w AS lead_ignore FROM planets WINDOW w AS (ORDER BY name) ; name | orbit | lead | lead_respect | lead_ignore ---------+-------+-------+--------------+------------- earth | | 4332 | 4332 | 4332 jupiter | 4332 | | | 88 mars | | 88 | 88 | 88 mercury | 88 | 60182 | 60182 | 60182 neptune | 60182 | 90560 | 90560 | 90560 pluto | 90560 | 24491 | 24491 | 24491 saturn | 24491 | | | 224 uranus | | 224 | 224 | 224 venus | 224 | | | xyzzy | | | | (10 rows) Why lead_ignore shows "4332" on the first row? Since "orbit"'s second non null row is orbit==88, I think lead(orbit) should return 88, rather than 4332 if my understanding of the SQL standard is correct. IMO the right result is as below, which is actually the one after applying the patch. name | orbit | lead | lead_respect | lead_ignore ---------+-------+-------+--------------+------------- earth | | 4332 | 4332 | 88 jupiter | 4332 | | | 88 mars | | 88 | 88 | 60182 mercury | 88 | 60182 | 60182 | 60182 neptune | 60182 | 90560 | 90560 | 90560 pluto | 90560 | 24491 | 24491 | 24491 saturn | 24491 | | | 224 uranus | | 224 | 224 | venus | 224 | | | xyzzy | | | | (10 rows) Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Вложения
Hi Oliver, After studying the standard more, it seems I was totally wrong. In summary, current code is correct. Sorry for noise. From the standard explaining lead(): > B) If OFFSET = 0 (zero), then the value of <window function> is the > value of VE1 evaluated for the current row. (here VE1 referes to the first argument of lead()). So it seems we need to treat offset==0 case specially. i.e. eval VE1 and do not through away even if the result is NULL. > I noticed a questionable result from "lead(0) IGNORE NULLS". > > CREATE TEMP TABLE g(x INT, y INT); > CREATE TABLE > INSERT INTO g (VALUES(NULL,1),(NULL,2),(1,3)); > INSERT 0 3 > SELECT * FROM g; > x | y > ---+--- > | 1 > | 2 > 1 | 3 > (3 rows) > > SELECT x, y, lead(x, 0) RESPECT NULLS OVER w FROM g > WINDOW w AS (ORDER BY y); > x | y | lead > ---+---+------ > | 1 | > | 2 | > 1 | 3 | 1 > (3 rows) > > SELECT x, y, lead(x, 0) IGNORE NULLS OVER w FROM g > WINDOW w AS (ORDER BY y); > x | y | lead > ---+---+------ > | 1 | > | 2 | > 1 | 3 | 1 > (3 rows) > > As you can see, "lead(x, 0) IGNORE NULLS" shows the same result as > "lead(x, 0) RESPECT NULLS". IMO "lead(x, 0) IGNORE NULLS" should show > something like: No. The result above is perfectly correct. > if (winobj->ignore_nulls == IGNORE_NULLS && relpos != 0) Now I understand why "relpos != 0" part is necessary. Also, my email below was wrong. > While working on this, I found some of window function regression > tests using lead/lag are not quite correct. Below is some of them. > > -- lead > SELECT name, > orbit, > lead(orbit) OVER w AS lead, > lead(orbit) RESPECT NULLS OVER w AS lead_respect, > lead(orbit) IGNORE NULLS OVER w AS lead_ignore > FROM planets > WINDOW w AS (ORDER BY name) > ; > name | orbit | lead | lead_respect | lead_ignore > ---------+-------+-------+--------------+------------- > earth | | 4332 | 4332 | 4332 > jupiter | 4332 | | | 88 > mars | | 88 | 88 | 88 > mercury | 88 | 60182 | 60182 | 60182 > neptune | 60182 | 90560 | 90560 | 90560 > pluto | 90560 | 24491 | 24491 | 24491 > saturn | 24491 | | | 224 > uranus | | 224 | 224 | 224 > venus | 224 | | | > xyzzy | | | | > (10 rows) > > Why lead_ignore shows "4332" on the first row? Since "orbit"'s second > non null row is orbit==88, I think lead(orbit) should return 88, > rather than 4332 if my understanding of the SQL standard is correct. According to the standard if OFFSET is not 0, then: > B) Otherwise, let TX be the sequence of values that is the result of applying VE1 to each > row of T that follows the current row and eliminating null values, ordered according > to the window ordering of WDX. > C) Otherwise, the value of <window function> is the m-th value of > TX, where m = OFFSET. Thus TX does not include current row and for the first row, and lead(orbit) returns the second row, that is orbit==4332 because m==1. So, current regression test expected file is correct. Again, sorry for noise. -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp