Calculating statistic via function rather than with query is slowing my query

Поиск
Список
Период
Сортировка
От Anish Kejariwal
Тема Calculating statistic via function rather than with query is slowing my query
Дата
Msg-id CAOpcnr9q2-q-n3TkuQFFgwc2UANFikZoFqhPwfmBJFu6b_MVrw@mail.gmail.com
обсуждение исходный текст
Ответы Re: Calculating statistic via function rather than with query is slowing my query
Список pgsql-performance
Hi everyone,

I'm using postgres 9.0.3, and here's the OS I'm running this on:
Linux 2.6.18-238.12.1.el5xen #1 SMP Tue May 31 14:02:29 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

I have a fairly straight forward query.  I'm doing a group by on an ID, and then calculating some a statistic on the resulting data.  The problem I'm running into is that when I'm calculating the statistics via a function, it's twice as slow as when I'm calculating the statistics directly in my query.  I want to be able to use a function, since I'll be using this particular calculation in many places.

Any idea of what's going on?  Below, I've included my function, and both queries (I removed the type_ids, and just wrote …ids…

Here's my function (I also tried stable):
CREATE OR REPLACE FUNCTION calc_test(a double precision, b integer, c integer)
RETURNS double precision AS $$
BEGIN
        return a/b/c* 1000000000::double precision;
END;
$$ LANGUAGE plpgsql immutable;


The query that takes 7.6 seconds, when I calculate the statistic from within the query:
explain analyze 
select
   agg.primary_id,
   avg(agg.a / agg.b / agg.c * 1000000000::double precision) foo,
   stddev(agg.a / agg.b / agg.c * 1000000000::double precision) bar
from mytable agg
where agg.type_id in (....ids....)
group by agg.primary_id;

The execution plan:
 HashAggregate  (cost=350380.58..350776.10 rows=9888 width=20) (actual time=7300.414..7331.659 rows=20993 loops=1)
   ->  Bitmap Heap Scan on mytable agg  (cost=28667.90..337509.63 rows=1716127 width=20) (actual time=200.064..2861.600 rows=2309230 loops=1)
         Recheck Cond: (type_id = ANY ('{....ids....}'::integer[]))
         ->  Bitmap Index Scan on mytable_type_id_idx  (cost=0.00..28238.87 rows=1716127 width=0) (actual time=192.725..192.725 rows=2309230 loops=1)
               Index Cond: (type_id = ANY ('{....ids....}'::integer[]))
 Total runtime: 7358.337 ms
(6 rows)




The same query, but now I'm calling the function.  When I call the function it's taking 15.5 seconds.
explain analyze select
   agg.primary_id,
   avg(calc_test(agg.a,agg.b,agg.c)) foo,
   stddev(calc_test(agg.a,agg.b,agg.c)) bar
from mytable agg
where agg.type_id in (....ids....)
group by agg.primary_id;

and, here's the execution plan:

 HashAggregate  (cost=350380.58..355472.90 rows=9888 width=20) (actual time=13660.838..13686.618 rows=20993 loops=1)
   ->  Bitmap Heap Scan on mytable agg  (cost=28667.90..337509.63 rows=1716127 width=20) (actual time=170.385..2881.122 rows=2309230 loops=1)
         Recheck Cond: (type_id = ANY ('{....ids....}'::integer[]))
         ->  Bitmap Index Scan on mytable_type_id_idx  (cost=0.00..28238.87 rows=1716127 width=0) (actual time=162.834..162.834 rows=2309230 loops=1)
               Index Cond: (type_id = ANY ('{....ids....}'::integer[]))
 Total runtime: 13707.560 ms


Thanks!

Anish

В списке pgsql-performance по дате отправления:

Предыдущее
От: "ktm@rice.edu"
Дата:
Сообщение: Re: DBT-5 & Postgres 9.0.3
Следующее
От: Ogden
Дата:
Сообщение: Raid 5 vs Raid 10 Benchmarks Using bonnie++