Re: slow count in window query

Поиск
Список
Период
Сортировка
От Hitoshi Harada
Тема Re: slow count in window query
Дата
Msg-id e08cc0400907291020v7381fa35j18dc1eccd01352c7@mail.gmail.com
обсуждение исходный текст
Ответ на Re: slow count in window query  (Hitoshi Harada <umi.tanuki@gmail.com>)
Список pgsql-hackers
2009/7/18 Hitoshi Harada <umi.tanuki@gmail.com>:
> If I understand exlain analyze correctly and it tells us the fact,
> WindowAgg without ORDER BY clause gets unreasonably slow. Let me see.
>

I haven't determined the difference between with and without ORDER BY
clause in OVER(), but I took a benchmark that throws an interesting
result.

$ bin/psql regression -c 'explain analyze select count(*) over() from x'
 QUERY PLAN
 

--------------------------------------------------------------------------------
--------------------------------WindowAgg  (cost=0.00..2741.00 rows=100000 width=0) (actual time=3725.294..4559
.828 rows=100000 loops=1)  ->  Seq Scan on x  (cost=0.00..1491.00 rows=100000 width=0) (actual time=0.11
2..310.349 rows=100000 loops=1)Total runtime: 4811.115 ms
(3 rows)

The query is quite slow because profiling hook function calls
gettimeofday() each time. And here's the result that counted up
eval_windowaggregate() call and its children functions. Elapse time is
in second and it is subtracted with total gettimeofday() overhead.

eval_windowaggregates:
Count 100000
Elapse 0.588426

Address  |Name                          |Count |Elapse(Total)
0x8204067|initialize_windowaggregate    |     1|     0.000277
0x8204d4a|spool_tuples                  |100002|     0.620092
0x83dcd08|tuplestore_select_read_pointer|100001|     0.011080
0x83dda2f|tuplestore_gettupleslot       |100001|     0.049005
0x8204fdd|row_is_in_frame               |100000|     0.014978
0x8204168|advance_windowaggregate       |100000|     0.025675
0x81ead8a|ExecClearTuple                |100000|     0.022105
0x8204462|finalize_windowaggregate      |     1|     0.000015
0x8204120|MemoryContextSwitchTo         |     2|     0.000000

spool_tuples() is dominant in eval_windowaggregates(). I think it is
not needed if the query contains only simple aggregate like count(*)
OVER () but currently we copy all the rows from the source table to
tuplestore. Even if it fits in memory, the copy operation costs too
much.

I am thinking about how to avoid unnecessary copy overhead...


Regards,

---
Hitoshi Harada


В списке pgsql-hackers по дате отправления:

Предыдущее
От: decibel
Дата:
Сообщение: Re: [RFC] new digest datatypes, or generic fixed-len hex types?
Следующее
От: pgsql@mohawksoft.com
Дата:
Сообщение: Re: xpath not a good replacement for xpath_string