Re: Underspecified window queries in regression tests

Поиск
Список
Период
Сортировка
От Florian Pflug
Тема Re: Underspecified window queries in regression tests
Дата
Msg-id D91C5B29-8B22-4EC2-803D-B7C4677E8EA5@phlo.org
обсуждение исходный текст
Ответ на Re: Underspecified window queries in regression tests  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Underspecified window queries in regression tests
Re: Underspecified window queries in regression tests
Список pgsql-hackers
On Oct17, 2011, at 01:09 , Tom Lane wrote:
> Florian Pflug <fgp@phlo.org> writes:
>> ... reading those parts again, I realize the it says "When ORDER BY is omitted
>> the *default* frame consists ... ", and that the second quote is followed
>> by a footnote which says
>
>>  There are options to define the window frame in other ways, but this tutorial
>>  does not cover them. See Section 4.2.8 for details. [3.5, Window Functions]
>
>> So it was just me being thick. Sorry for the noise.
>
> Hmm.  Maybe the use of a <footnote> there is too subtle, and we should
> instead have that text in-line (probably in parentheses)?  Or we could
> use a <note>, but that's probably too much emphasis.

Inline and in parentheses sounds fine.

In addition, I think we should reword the explanation in 4.2.8 (The SQL Language
/ SQL Syntax / Value Expressions / Window Functions). Instead of that rather
long (and IMHO hard to read) paragraph about possible frame clauses and their
behaviour in the presence or absence of an ORDER BY clause, we should go with
a more algorithmic explanation I think.

Something along these lines maybe:

----------
.) PARTITION BY splits the rows into disjoint partitions. All further processing  happens only inside a single
partition

.) In RANGE mode, ORDER BY then splits each partition into an ordered list of  sub-partitions, each containing rows
whichthe ORDER BY considers to be  equivalent. 

.) In ROWS mode, OTOH, each sub-partition contains only a single row. Thus, if  there are rows which are considered to
beequivalent by the ORDER BY, the  ordering of the sub-partition isn't fully determined. 

.) Each row's frame then consists of some consecutive range of sub-partitions.

.) In RANGE mode, that consecutive range can only start at either the first  sub-partition or the current row's
sub-partition,and can only end at either  the current row's sub-partition or the last sub-partitions. 

.) In ROWS mode, the consecutive range may additional start <n> sub-partitions  (or rows, it's the same thing here)
beforethe current row, and may additionally  end <m> sub-partitions/rows after the current row. 

From that, it follows that even with an underspecified sort order, the contents of
each frame are still fully determined in RANGE mode. The ordering of rows within
a frame is not determined, though. So overall, in RANGE mode, a query's result is
only non-deterministic if the window function is sensitive to the ordering of rows
within a frame.

In ROWS mode, OTOH, the contents each frame themselves are not fully determined,
so even an ordering agnostic window function may produce non-deterministic results.
----------

If you think that something along these lines would be an improvement, I can try
to come up with a patch.

best regards,
Florian Pflug



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jan Urbański
Дата:
Сообщение: Re: proposal: set GUC variables for single query
Следующее
От: Fujii Masao
Дата:
Сообщение: termination of backend waiting for sync rep generates a junk log message