Обсуждение: Custom function in where clause
Hi, I have searched solution to my problem a few days. On my query, there is big performance problem. It seems to me, that problem is on where-part of sql and it's function. My sql is: select count(*) from table_h where level <= get_level_value(11268,id,area) and (date1 >= '2011-1-1' or date2>='2011-1-1') and (date1 <= '2012-07-09' or date2<='2012-07-09') This takes about 40sek. select count(*) from table_h where (date1 >= '2011-1-1' or date2>='2011-1-1') and (date1 <= '2012-07-09' or date2<='2012-07-09') when ignoring function, it takes <1sek. Function is: CREATE OR REPLACE FUNCTION get_level_value(_user integer, _id, _area) RETURNS integer AS $$ DECLARE found integer; BEGIN SELECT 1 INTO found FROM table_o WHERE userid=_user AND id=_id AND area=_area; IF (found) THEN return 3; ELSE return 1; END IF; END; $$ LANGUAGE plpgsql; On explain, it seems to me that this function is on filter and it will execute on every row. Total resultset contains 1 700000 rows. QUERY PLAN Aggregate (cost=285543.89..285543.90 rows=1 width=0) (actual time=32391.380..32391.380 rows=1 loops=1) -> Bitmap Heap Scan on table_h (cost=11017.63..284987.40 rows=222596 width=0) (actual time=326.946..31857.145 rows=631818loops=1) Recheck Cond: ((date1 >= '2011-01-01'::date) OR (date2 >= '2011-01-01'::date)) Filter: (((date1 <= '2012-07-09'::date) OR (date2 <= '2012-07-09'::date)) AND (level <= get_level_value(11268, id,area))) -> BitmapOr (cost=11017.63..11017.63 rows=669412 width=0) (actual time=321.635..321.635 rows=0 loops=1) -> Bitmap Index Scan on date1 (cost=0.00..10626.30 rows=652457 width=0) (actual time=84.555..84.555 rows=647870loops=1) Index Cond: (date1 >= '2011-01-01'::date) -> Bitmap Index Scan on date2_table_h (cost=0.00..280.03 rows=16955 width=0) (actual time=237.074..237.074rows=15222 loops=1) Index Cond: (date2 >= '2011-01-01'::date) How should I handle this situation and use function? -- kupen -- Wippies-vallankumous on täällä! Varmista paikkasi vallankumouksen eturintamassa ja liity Wippiesiin heti! http://www.wippies.com/
On Tue, Jul 10, 2012 at 6:36 PM, Pena Kupen <kupen@wippies.fi> wrote:
You could not have good performance using function in case where direct JOIN is only way to have reasonable performance.
Stop using function and write join with table_o instead, or put whole query with join inside a function.
Hi,
I have searched solution to my problem a few days. On my query, there is big performance problem.
It seems to me, that problem is on where-part of sql and it's function.
My sql is:
select count(*)
from table_h where level <= get_level_value(11268,id,area) and (date1 >= '2011-1-1' or date2>='2011-1-1') and (date1 <= '2012-07-09' or date2<='2012-07-09')
This takes about 40sek.
select count(*)
from table_h where (date1 >= '2011-1-1' or date2>='2011-1-1') and (date1 <= '2012-07-09' or date2<='2012-07-09')
when ignoring function, it takes <1sek.
Function is:
CREATE OR REPLACE FUNCTION get_level_value(_user integer, _id, _area) RETURNS integer
AS $$
DECLARE found integer;
BEGIN
SELECT 1 INTO found
FROM table_o
WHERE userid=_user AND
id=_id AND
area=_area;
IF (found) THEN
return 3;
ELSE
return 1;
END IF;
END;
$$
LANGUAGE plpgsql;
On explain, it seems to me that this function is on filter and it will execute on every row. Total resultset contains 1 700 000 rows.
QUERY PLAN
Aggregate (cost=285543.89..285543.90 rows=1 width=0) (actual time=32391.380..32391.380 rows=1 loops=1)
-> Bitmap Heap Scan on table_h (cost=11017.63..284987.40 rows=222596 width=0) (actual time=326.946..31857.145 rows=631818 loops=1)
Recheck Cond: ((date1 >= '2011-01-01'::date) OR (date2 >= '2011-01-01'::date))
Filter: (((date1 <= '2012-07-09'::date) OR (date2 <= '2012-07-09'::date)) AND (level <= get_level_value(11268, id, area)))
-> BitmapOr (cost=11017.63..11017.63 rows=669412 width=0) (actual time=321.635..321.635 rows=0 loops=1)
-> Bitmap Index Scan on date1 (cost=0.00..10626.30 rows=652457 width=0) (actual time=84.555..84.555 rows=647870 loops=1)
Index Cond: (date1 >= '2011-01-01'::date)
-> Bitmap Index Scan on date2_table_h (cost=0.00..280.03 rows=16955 width=0) (actual time=237.074..237.074 rows=15222 loops=1)
Index Cond: (date2 >= '2011-01-01'::date)
How should I handle this situation and use function?
You could not have good performance using function in case where direct JOIN is only way to have reasonable performance.
Stop using function and write join with table_o instead, or put whole query with join inside a function.
--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/
Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
МойКруг: http://mboguk.moikrug.ru/
"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."
On 10 July 2012 18:36, Pena Kupen <kupen@wippies.fi> wrote: > Hi, > > I have searched solution to my problem a few days. On my query, there is big > performance problem. > It seems to me, that problem is on where-part of sql and it's function. > > How should I handle this situation and use function? > I would start by rewriting your function in plain SQL rather than PL/pgSQL. As a general rule, don't write a function in PL/pgSQL unless you really need procedurality. This function does not. For example: CREATE OR REPLACE FUNCTION get_level_value(_user integer, _id, _area) RETURNS integer AS $$ -- Return 3 if there are matching records in table_o, otherwise return 1. SELECT CASE WHEN EXISTS ( SELECT id FROM table_o WHERE userid=_user AND id=_id AND area=_area ) THEN 3 ELSE 1 END; $$ LANGUAGE sql STABLE; Cheers, BJ
Hi and thank's guys! First trying this Brendan's recommendation. It seems only a small difference between sql and PL/pgSQL. from 40-->37. Not so good yet. I will try Maxim's little later and you all know. -- kupen Brendan Jurd [direvus@gmail.com] kirjoitti: > On 10 July 2012 18:36, Pena Kupen <kupen@wippies.fi> wrote: > > Hi, > > > > I have searched solution to my problem a few days. On my query, there is big > > performance problem. > > It seems to me, that problem is on where-part of sql and it's function. > > > > > How should I handle this situation and use function? > > > > I would start by rewriting your function in plain SQL rather than > PL/pgSQL. As a general rule, don't write a function in PL/pgSQL > unless you really need procedurality. This function does not. > > For example: > > CREATE OR REPLACE FUNCTION get_level_value(_user integer, _id, _area) > RETURNS integer > AS $$ > -- Return 3 if there are matching records in table_o, otherwise return 1. > SELECT CASE WHEN EXISTS ( > SELECT id > FROM table_o > WHERE userid=_user AND > id=_id AND > area=_area > ) THEN 3 ELSE 1 END; > $$ > LANGUAGE sql STABLE; > > Cheers, > BJ > -- Wippies-vallankumous on täällä! Varmista paikkasi vallankumouksen eturintamassa ja liity Wippiesiin heti! http://www.wippies.com/