Re: window function ordering not working as expected

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: window function ordering not working as expected
Дата
Msg-id 17807.1424218681@sss.pgh.pa.us
обсуждение исходный текст
Ответ на window function ordering not working as expected  (Lonni J Friedman <netllama@gmail.com>)
Ответы Re: window function ordering not working as expected  (Lonni J Friedman <netllama@gmail.com>)
Список pgsql-general
Lonni J Friedman <netllama@gmail.com> writes:
> I'm interested in seeing:
> * the date for the most recent result
> * test name (identifier)
> * most recent result (decimal value)
> * the worst (lowest decimal value) test result from the past 21 days
> * the date which corresponds with the worst test result from the past 21 days
> * the 2nd worst (2nd lowest decimal value) test result
> ...
> The problem that I'm seeing is in the prv_score column. It should show
> a value of 0.6, which corresponds with 2015-02-13, however instead its
> returning 0.7. I thought by ordering by metrics->>'PT TWBR' I'd always
> be sorting by the scores, and as a result, the lead(metrics->>'PT
> TWBR', 1) would give me the next greatest value of the score. Thus my
> confusion as to why ORDER BY metrics->>'PT TWBR' isn't working as
> expected.

lead() and lag() retrieve values from rows that are N away from the
current row in the specified ordering.  That isn't what you want here
AFAICS.

I think the worst test result would be obtained with
    nth_value(metrics->>'PT TWBR', 1)
which is equivalent to what you used,
    first_value(metrics->>'PT TWBR')
while the 2nd worst result would be obtained with
    nth_value(metrics->>'PT TWBR', 2)

However, "worst" and "2nd worst" with this implementation would mean
"worst and 2nd worst within the partition", which isn't the stated
goal either, at least not with the partition definition you're using.

What you really want for the "worst in last 21 days" is something like

    min(metrics->>'PT TWBR') OVER (
       PARTITION BY ... that same mess you used ...
       ORDER BY tstamp
       RANGE BETWEEN '21 days'::interval PRECEDING AND CURRENT ROW)

However Postgres doesn't implement RANGE x PRECEDING yet.  You could
get "worst in last 21 observations" easily:

    min(metrics->>'PT TWBR') OVER (
       PARTITION BY ... that mess ...
       ORDER BY tstamp
       ROWS BETWEEN 20 PRECEDING AND CURRENT ROW)

and maybe that's close enough.

I do not know an easy way to get "second worst" :-(.  You could build a
user-defined aggregate to produce "second smallest value among the inputs"
and then apply it in the same way as I used min() here.

            regards, tom lane


В списке pgsql-general по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: Issue dumping schema using readonly user
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Issue dumping schema using readonly user