Re: Underspecified window queries in regression tests

Поиск
Список
Период
Сортировка
От Florian Pflug
Тема Re: Underspecified window queries in regression tests
Дата
Msg-id BEAC1AFC-8A2B-48FA-9682-993AF0A68427@phlo.org
обсуждение исходный текст
Ответ на Re: Underspecified window queries in regression tests  (Greg Stark <stark@mit.edu>)
Ответы Re: Underspecified window queries in regression tests
Список pgsql-hackers
On Oct16, 2011, at 20:04 , Greg Stark wrote:
> On Fri, Oct 14, 2011 at 9:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We could hack around this by adding more columns to the result so that
>> an index-only scan doesn't work.  But I wonder whether it wouldn't be
>> smarter to add ORDER BY clauses to the window function calls.  I've been
>> known to argue against adding just-in-case ORDER BYs to the regression
>> tests in the past; but these cases bother me more because a plan change
>> will not just rearrange the result rows but change their contents,
>> making it really difficult to verify that nothing's seriously wrong.
>
> I'm not sure if it applies to this case but I recall I was recently
> running queries on Oracle that included window functions and it
> wouldn't even let me run them without ORDER BY clauses in the window
> definition. I don't know if it cleverly determines that the ORDER BY
> will change the results or if Oracle just requires ORDER BY on all
> window definitions or what.

I was about to point out that whether or not ORDER BY is required probably
ought to depend on whether the window function acts on the frame, or the
whole partition. And that oracle quite likely knows that for the individual
window functions while we don't.

But, actually, our documentation states in 3.5, Window Functions, that
 By default, if ORDER BY is supplied then the frame consists of all rows from the start of the partition up through the
currentrow, plus any following rows that are equal to the current row according to the ORDER BY clause. When ORDER BY
isomitted the default frame consists of all rows in the partition.  

Either I'm confused, or that doesn't match the observed regression test
failure.

I did a few experiments. Assume that table "d" contains three rows with
v=1, v=2 and v=3.

This case seems to work (the frame is always the whole partition, even
though the frame_clause says "between unbounded preceding and current row"):

postgres=# select array_agg(v) over (range between unbounded preceding                                    and current
row)from d;array_agg  
-----------{1,2,3}{1,2,3}{1,2,3}

Once one adds an ORDER BY, the frame_clause is taken into account, so
that works too:

postgres=# select array_agg(v) over (order by v range between unbounded preceding
andcurrent row) from d;array_agg  
-----------{1}{1,2}{1,2,3}

But some frame clauses (row 1 preceding, for example) have an effect despite
there being no ORDER BY, like here:

postgres=# select array_agg(v) over (rows 1 preceding) from d;array_agg
-----------{1}{1,2}{2,3}

ISTM that we probably should ignore frame clauses, unless there's an ORDER BY.
Or maybe even throw an error?

best regards,
Florian Pflug



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Pushing ScalarArrayOpExpr support into the btree index AM
Следующее
От: Dimitri Fontaine
Дата:
Сообщение: Re: proposal: set GUC variables for single query