The following bug has been logged online:
Bug reference: 5681
Logged by: Maksym Boguk
Email address: Maxim.Boguk@gmail.com
PostgreSQL version: 8.4.4
Operating system: FreeBSD 7.2
Description: Using set returning function as subrequest can result
losing rows in result set
Details:
There is simplified version of real query:
CREATE OR REPLACE FUNCTION test_entity(integer) RETURNS SETOF integer
LANGUAGE sql STABLE ROWS 1
AS $_$
SELECT id FROM entities WHERE id=$1;
$_$;
Full result query:
SELECT
r.id
,s.id
,s.entity_id
-- ,test_entity(s.entity_id)
FROM resstat_2010_09 r
JOIN services s ON s.id = r.service_id
WHERE r.lbill_id=1200;
id | id | entity_id
-----------+---------+-----------
141798910 | 7004573 | 242023
141798911 | 7004578 | 242023
141798912 | 7004577 | 242023
141798913 | 7004579 | 242023
141798964 | 7004583 | 242023
141799001 | 7004584 |
141802715 | 7004707 |
(7 rows)
Now lets uncomment set returning function:
SELECT
r.id
,s.id
,s.entity_id
,test_entity(s.entity_id)
FROM resstat_2010_09 r
JOIN services s ON s.id = r.service_id
WHERE r.lbill_id=1200 ORDER BY r.id;
id | id | entity_id | test_entity
-----------+---------+-----------+-------------
141798910 | 7004573 | 242023 | 242023
141798911 | 7004578 | 242023 | 242023
141798912 | 7004577 | 242023 | 242023
141798913 | 7004579 | 242023 | 242023
141798964 | 7004583 | 242023 | 242023
(5 rows)
Oops... two rows where function returned zero rows just disappeared.
Explain analyze of the both cases looks completely same:
billing_test=# EXPLAIN ANALYZE SELECT
r.id
,s.id
,s.entity_id
,test_entity(s.entity_id)
FROM resstat_2010_09 r
JOIN services s ON s.id = r.service_id
WHERE r.lbill_id=1200 ORDER BY r.id;
QUERY PLAN
----------------------------------------------------------------------------
----------------------------------------------------------------------------
-----
Sort (cost=9.38..9.41 rows=10 width=12) (actual time=0.872..0.897 rows=5
loops=1)
Sort Key: r.id
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=0.00..9.22 rows=10 width=12) (actual
time=0.307..0.810 rows=5 loops=1)
-> Index Scan using resstat_2010_09_lbill_id_key on
resstat_2010_09 r (cost=0.00..1.62 rows=10 width=8) (actual
time=0.036..0.084 rows=7 loops=1)
Index Cond: (lbill_id = 1200)
-> Index Scan using services_pkey on services s (cost=0.00..0.50
rows=1 width=8) (actual time=0.015..0.021 rows=1 loops=7)
Index Cond: (s.id = r.service_id)
Total runtime: 1.045 ms
(9 rows)
PS: I know using set return function in that way is bad style. But I think
behavior is wrong.
PPS: any chance make evaluating of subrequest function visible in explain
output in the future?
Thanks for youre attention.