Re: window function ordering not working as expected

Поиск
Список
Период
Сортировка
От Lonni J Friedman
Тема Re: window function ordering not working as expected
Дата
Msg-id CAP=oouHDK50ZmHKQt7hCBykrKv1AyqcMAoqR1oRpOzP2AMARjg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: window function ordering not working as expected  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
On Tue, Feb 17, 2015 at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Lonni J Friedman <netllama@gmail.com> writes:
>> I'm interested in seeing:
>> * the date for the most recent result
>> * test name (identifier)
>> * most recent result (decimal value)
>> * the worst (lowest decimal value) test result from the past 21 days
>> * the date which corresponds with the worst test result from the past 21 days
>> * the 2nd worst (2nd lowest decimal value) test result
>> ...
>> The problem that I'm seeing is in the prv_score column. It should show
>> a value of 0.6, which corresponds with 2015-02-13, however instead its
>> returning 0.7. I thought by ordering by metrics->>'PT TWBR' I'd always
>> be sorting by the scores, and as a result, the lead(metrics->>'PT
>> TWBR', 1) would give me the next greatest value of the score. Thus my
>> confusion as to why ORDER BY metrics->>'PT TWBR' isn't working as
>> expected.
>
> lead() and lag() retrieve values from rows that are N away from the
> current row in the specified ordering.  That isn't what you want here
> AFAICS.
>
> I think the worst test result would be obtained with
>         nth_value(metrics->>'PT TWBR', 1)
> which is equivalent to what you used,
>         first_value(metrics->>'PT TWBR')
> while the 2nd worst result would be obtained with
>         nth_value(metrics->>'PT TWBR', 2)
>
> However, "worst" and "2nd worst" with this implementation would mean
> "worst and 2nd worst within the partition", which isn't the stated
> goal either, at least not with the partition definition you're using.
>
> What you really want for the "worst in last 21 days" is something like
>
>         min(metrics->>'PT TWBR') OVER (
>            PARTITION BY ... that same mess you used ...
>            ORDER BY tstamp
>            RANGE BETWEEN '21 days'::interval PRECEDING AND CURRENT ROW)
>
> However Postgres doesn't implement RANGE x PRECEDING yet.  You could
> get "worst in last 21 observations" easily:
>
>         min(metrics->>'PT TWBR') OVER (
>            PARTITION BY ... that mess ...
>            ORDER BY tstamp
>            ROWS BETWEEN 20 PRECEDING AND CURRENT ROW)
>
> and maybe that's close enough.
>
> I do not know an easy way to get "second worst" :-(.  You could build a
> user-defined aggregate to produce "second smallest value among the inputs"
> and then apply it in the same way as I used min() here.

Thanks Tom, much appreciate the fast reply.  I'll chew this over and
see if I have any other questions.


В списке pgsql-general по дате отправления:

Предыдущее
От: David G Johnston
Дата:
Сообщение: Re: which is better- storing data as array or json?
Следующее
От: David G Johnston
Дата:
Сообщение: Re: Revoking access for pg_catalog schema objects