Re: [PATCH] Negative Transition Aggregate Functions (WIP)

Поиск
Список
Период
Сортировка
От Dean Rasheed
Тема Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Дата
Msg-id CAEZATCX3aBPmTp=go+4f-Q9i0Ko11HZtoL3a2r6CA62vdt-03g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Negative Transition Aggregate Functions (WIP)  (Florian Pflug <fgp@phlo.org>)
Ответы Re: [PATCH] Negative Transition Aggregate Functions (WIP)  (Florian Pflug <fgp@phlo.org>)
Список pgsql-hackers
On 3 March 2014 23:00, Florian Pflug <fgp@phlo.org> wrote:
>> * In show_windowagg_info(), this calculation looks suspicious to me:
>>
>>        double tperrow = winaggstate->aggfwdtrans /
>>            (inst->nloops * inst->ntuples);
>>
>> If the node is executed multiple times, aggfwdtrans will be reset in
>> each loop, so the transitions per row figure will be under-estimated.
>> ISTM that if you want to report on this, you'd need aggfwdtrans to be
>> reset once per query, but I'm not sure exactly how to do that.
>>
>> ...
>>
>> Actually, I think it's misleading to only count forward transition
>> function calls, because a call to the inverse transition function
>> still represents a state transition, and is likely to be around the
>> same cost. For a window of size 2, there would not be much advantage
>> to using inverse transition functions, because it would be around 2
>> transitions per row either way.
>
> True. In fact, I pondered whether to avoid using the inverse transition
> function for windows of 2 rows. In the end, I didn't because I felt that
> it makes custom aggregates harder to test.
>
> On the question of whether to count inverse transition function calls -
> the idea of the EXPLAIN VERBOSE ANALYZE output isn't really to show the
> number of state transitions, but rather to show whether the aggregation
> has O(n) or O(n^2) behaviour. The idea being that a value close to "1"
> means "inverse transition function works as expected", and larger values
> mean "not working so well".
>
> Regarding multiple evaluations - I think I based the behaviour on how
> ntuples works, which also only reports the value of the last evaluation
> I think. But maybe I'm confused about this.
>

No, it doesn't look like that's correct for multiple loops. Consider
this example:

explain (verbose, analyse) select * from (values (10), (20), (30), (40)) v(x), lateral (select sum(i) over (rows
between4 preceding and current row)    from generate_series(1, x) i) t;
 
                                                                QUERY
PLAN

--------------------------------------------------------------------------------------------------------------------------------------------Nested
Loop (cost=0.00..170.06 rows=4000 width=12) (actual
 
time=0.027..0.414 rows=100 loops=1)  Output: "*VALUES*".column1, (sum(i.i) OVER (?))  ->  Values Scan on "*VALUES*"
(cost=0.00..0.05rows=4 width=4)
 
(actual time=0.002..0.006 rows=4 loops=1)        Output: "*VALUES*".column1  ->  WindowAgg  (cost=0.00..22.50 rows=1000
width=4)(actual
 
time=0.019..0.094 rows=25 loops=4)        Output: sum(i.i) OVER (?)        Transitions Per Row: 0.2        ->  Function
Scanon pg_catalog.generate_series i
 
(cost=0.00..10.00 rows=1000 width=4) (actual time=0.010..0.015 rows=25
loops=4)              Output: i.i              Function Call: generate_series(1, "*VALUES*".column1)


It turns out that show_windowagg_info() is only called once at the
end, with ntuples=100, nloops=4 and aggfwdtrans=100, so it's computing
tperrow=100/(4*100)=0.25, which then gets truncated to 0.2. So to get
1, you'd have to use this formula:
   double tperrow = winaggstate->aggfwdtrans / inst->ntuples;

I'm still not convinced that's the most useful thing to report though.
Personally, I'd prefer to just see the separate counts, e.g.:
  ->  WindowAgg  (cost=0.00..22.50 rows=1000 width=4) (actual
time=0.019..0.094 rows=25 loops=4)        Output: sum(i.i) OVER (?)        Forward transitions: 25  Inverse
transitions:25
 

IMO that gives a clearer picture of what's going on.

Thoughts?

Regards,
Dean



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: ALTER TABLE lock strength reduction patch is unsafe Reply-To:
Следующее
От: Fabrízio de Royes Mello
Дата:
Сообщение: GSoC propostal - "CREATE SCHEMA ... LIKE ..."