Обсуждение: SRF called with optional NULL input runs 7x slower
I've got 2 nearly identical SRFs to retrieve data from the DB(pg 8.2.4)
which goes something like this
================= Function 1 ======================
CREATE OR REPLACE FUNCTION hot(fromdate timestamp without time zone, todate timestamp without time zone, code text[])
RETURNS SETOF trh_hot AS
$BODY$
DECLARE
rec RECORD;
BEGIN
FOR rec IN
SELECT
foo.a,
bar.b
FROM d
INNER JOIN ts
ON ts.id = D.id
inner join trh
ON ts.id = trh.id
AND ts.ttype = trh.ttype
AND ts.run_date = trh.run_date
WHERE d.record_update_date_time BETWEEN fromdate AND todate
AND trh.run_date BETWEEN fromdate AND todate
AND trh.ttype IN (select ttype from lookup_ttype where tsequence = 'hot')
AND d.code = any (code)
LOOP
RETURN NEXT rec;
END LOOP;
RETURN;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
===================================
========== Function 2 ===================
CREATE OR REPLACE FUNCTION hot(fromdate timestamp without time zone, todate timestamp without time zone, code text[],
sntext[])
RETURNS SETOF trh_hot AS
$BODY$
DECLARE
rec RECORD;
BEGIN
FOR rec IN
SELECT
foo.a,
bar.b
FROM d
INNER JOIN ts
ON ts.id = D.id
inner join trh
ON ts.id = trh.id
AND ts.ttype = trh.ttype
AND ts.run_date = trh.run_date
WHERE d.record_update_date_time BETWEEN fromdate AND todate
AND trh.run_date BETWEEN fromdate AND todate
AND trh.ttype IN (select ttype from lookup_ttype where tsequence = 'hot')
--> AND (CASE WHEN code IS NULL THEN true else d.code = any (code) END)
--> AND (CASE WHEN sn IS NULL THEN TRUE else D.id = any(SN) END)
LOOP
RETURN NEXT rec;
END LOOP;
RETURN;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
===================== >8 =============================
The main difference being these 2 lines
AND (CASE WHEN code IS NULL THEN true else d.code = any (code) END)
AND (CASE WHEN sn IS NULL THEN TRUE else D.id = any(SN) END)
which the aim is to check for NULL input and determine whether or not a
filter is necessary for those items.
The query is called by these respectively
Func1: select * from HOT('8/1/2007','9/30/2007','{HUA71}')
Func2: select * from HOT2('8/1/2007','9/30/2007','{HUA71}',NULL)
If I put in all the SN in replacement of the NULL, it's slightly faster
at 37 secs to return the data.
Vmstat 5 on Function1:7 secs 8400 rows
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 268 32296 2908 1096780 1 1 92 75 1083 1021 10 6 82 3
4 1 268 33708 2864 1094856 0 0 5771 21 1134 3518 23 19 51 7
6 0 268 39196 2876 1091396 0 0 2074 27 1230 3477 33 51 0 16
very little on cpu wait.
Vmstat 5 on Function2: 50secs 8400 rows
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
3 0 268 33648 4980 1087552 0 0 1606 103 1216 2943 21 32 0 47
2 1 268 33888 4484 1089576 0 0 2352 14 1209 2811 19 32 0 49
3 1 268 34380 4100 1090564 0 0 1270 13 1187 2890 13 21 0 66
2 1 268 34184 4004 1091408 0 0 1107 98 1206 2949 12 19 0 69
3 2 268 33004 3984 1093312 0 0 1357 34 1203 2931 11 20 0 69
4 0 268 33008 3860 1093556 0 0 1302 8 1193 2897 13 21 0 65
2 1 268 33608 3800 1093140 0 0 1286 6 1195 2954 12 18 0 70
1 1 268 33552 3300 1093776 0 0 1282 109 1213 2896 12 20 0 67
3 1 268 32404 3296 1095368 0 0 1202 1 1188 2908 12 19 0 69
1 1 268 33976 2680 1094476 0 0 1184 0 1180 2905 11 19 0 70
3 1 268 33336 2440 1095924 0 0 1291 8 1196 2944 11 19 0 70
2 1 268 34112 2272 1095412 0 0 1245 118 1221 2932 13 19 0 68
2 0 268 34844 2756 1095364 0 0 698 4 1145 3040 19 49 0 32
Just look at the CPU wait. What Gives? Does checkin on NULL causes a penalty? I
tried running the same query as is on pgadmin3 and it looked to be OK
(the null checking). I do notice that the system (my laptop) was doing
quite a bit of IO.
how can I debug or diagnose where the issues lies? Explain analyse
doesn't do much since this is a Function Scan anyway.
AS of right now, I'm trying to see if I can do Dynamic SQL instead.
Ow Mun Heng wrote: > how can I debug or diagnose where the issues lies? Explain analyse > doesn't do much since this is a Function Scan anyway. Take them out of the function and EXPLAIN ANALYZE them as plain SQL. Note that you should take the parameters out of the literal, so you need to do something like PREPARE foo AS SELECT ... replace code with $1, etc ... EXPLAIN ANALYZE EXECUTE foo(code, ...) Otherwise they would be treated as constants so the queries would be planned differently. (untested, so correct the syntax appropriately) -- Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J "Before you were born your parents weren't as boring as they are now. They got that way paying your bills, cleaning up your room and listening to you tell them how idealistic you are." -- Charles J. Sykes' advice to teenagers
On Mon, 2007-09-24 at 00:18 -0400, Alvaro Herrera wrote: > Ow Mun Heng wrote: > > > how can I debug or diagnose where the issues lies? Explain analyse > > doesn't do much since this is a Function Scan anyway. > > Take them out of the function and EXPLAIN ANALYZE them as plain SQL. > Note that you should take the parameters out of the literal, so you need > to do something like > > PREPARE foo AS SELECT ... replace code with $1, etc ... > EXPLAIN ANALYZE EXECUTE foo(code, ...) > > Otherwise they would be treated as constants so the queries would be > planned differently. > > (untested, so correct the syntax appropriately) > Okay.. I tried that, but it seems like there's an issue w/ the CASE statements. When I tried the prepare w/ AND (CASE WHEN $3 IS NULL THEN true else d.code = any ($3) END) AND (CASE WHEN $4 IS NULL THEN TRUE else D.id = any($4) END) it ERRORs w/ could not determine data type of parameter $3 If I were to just use AND D.code = ANY($3) then it would work. Any other clues?
Ow Mun Heng wrote: > Okay.. I tried that, but it seems like there's an issue w/ the CASE > statements. > > When I tried the prepare w/ > > AND (CASE WHEN $3 IS NULL THEN true else d.code = any ($3) END) > AND (CASE WHEN $4 IS NULL THEN TRUE else D.id = any($4) END) > > > it ERRORs w/ could not determine data type of parameter $3 Cast it to the correct type. -- Alvaro Herrera Valdivia, Chile ICBM: S 39º 49' 18.1", W 73º 13' 56.4" "Cómo ponemos nuestros dedos en la arcilla del otro. Eso es la amistad; jugar al alfarero y ver qué formas se pueden sacar del otro" (C. Halloway en La Feria de las Tinieblas, R. Bradbury)
On Mon, 2007-09-24 at 02:31 -0400, Alvaro Herrera wrote: > Ow Mun Heng wrote: > > > Okay.. I tried that, but it seems like there's an issue w/ the CASE > > statements. > > > > When I tried the prepare w/ > > > > AND (CASE WHEN $3 IS NULL THEN true else d.code = any ($3) END) > > AND (CASE WHEN $4 IS NULL THEN TRUE else D.id = any($4) END) > > > > > > it ERRORs w/ could not determine data type of parameter $3 > > Cast it to the correct type. how do I cast a NULL? Is it Varchar? Anyway.. I tried AND (CASE WHEN CAST($3 as VARCHAR) IS NULL THEN true else d.code = any ($3) END) and I get ERROR: op ANY/ALL (array) requires array on right side >
Ow Mun Heng wrote: > On Mon, 2007-09-24 at 02:31 -0400, Alvaro Herrera wrote: > > > it ERRORs w/ could not determine data type of parameter $3 > > > > Cast it to the correct type. > > how do I cast a NULL? Is it Varchar? I didn't say "pick a random datatype", I said "the correct type", which in this case means the type "code" has in the function. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.