We have a serious performance problem with SELECTs when adding a LIMIT: for example, execution time without LIMIT is ~250 ms (316 rows returned), but with LIMIT 20 it is 15–50 seconds.
Hi, Mareks.
As this is not actually a bug, it's better to use pgsql-performance or pgsql-general for such questions next time.
We have a SELECT with a subselect filter:
select a.id, a.jdata
from oss_alarms a
where a.jdata->>'dn' in
(
    select o.jdata->>'ossDn'
    from oss_objects o, tvc_entity e
    where e.jtype = 'object'
      and o.jdata->>'sid' = e.jdata->>'siteId'
      and tvc_unaccent(e.jdata->>'name') like tvc_unaccent('%zaube%')
)
order by (tvc_convert_array_to_date(a.jdata -> 'alarmTime')) desc
limit 20;
As the execution plan without LIMIT shows, your join conditions yield 316 rows, and an explicit sort is fast for that amount.
For the plan with LIMIT, the planner prefers to use the `idx_oss_alarms_alarm_time` index to avoid the extra sort.
This leads to:
Rows Removed by Join Filter: 10717797
I.e., you're reading 10M rows via an index scan and throwing them away later, as they do not match your join condition:
Join Filter: ((a.jdata ->> 'dn'::text) = (o.jdata ->> 'ossDn'::text))
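If you want to confirm that the ordered index scan is the culprit, you can disable index scans for a single transaction; a diagnostic sketch only, not a production setting:

begin;
set local enable_indexscan = off;  -- planner can no longer pick the ordered index-scan plan
explain (analyze, buffers)
select a.id, a.jdata
from oss_alarms a
where a.jdata->>'dn' in
(
    select o.jdata->>'ossDn'
    from oss_objects o, tvc_entity e
    where e.jtype = 'object'
      and o.jdata->>'sid' = e.jdata->>'siteId'
      and tvc_unaccent(e.jdata->>'name') like tvc_unaccent('%zaube%')
)
order by (tvc_convert_array_to_date(a.jdata -> 'alarmTime')) desc
limit 20;
rollback;  -- set local does not leak outside the transaction

With the index-scan path penalized, the planner should fall back to the join-then-sort plan you already see without LIMIT, and the timing should drop back toward the ~250 ms range.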
It looks like a.jdata->>'dn' and a.jdata->'alarmTime' are correlated, although the planner doesn't know that.
PostgreSQL 10 has `CREATE STATISTICS`, but it works on table columns, not expressions.
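Just for illustration, this is the form it takes on ordinary columns; the `dn` and `alarm_time` columns here are hypothetical, since in your schema those values live inside `jdata` and cannot be referenced this way:

create statistics oss_alarms_dep (dependencies)
    on dn, alarm_time from oss_alarms;
analyze oss_alarms;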
You can try to keep the planner from choosing `idx_oss_alarms_alarm_time` by wrapping the expression in `ORDER BY` so that it no longer matches the index definition:
order by (tvc_convert_array_to_date(a.jdata -> 'alarmTime')) + INTERVAL '0' desc
But be warned — this can help in some situations and make things worse in others.
I would recommend moving the JOIN and ORDER BY columns out of JSON and making them direct table columns.
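A minimal sketch of that idea, assuming the key names from your query and that tvc_convert_array_to_date returns a timestamp (untested; you would also need a trigger or application changes to keep the new columns in sync with `jdata`):

alter table oss_alarms
    add column dn text,
    add column alarm_time timestamp;

update oss_alarms
set dn = jdata->>'dn',
    alarm_time = tvc_convert_array_to_date(jdata -> 'alarmTime');

create index idx_oss_alarms_dn_alarm_time
    on oss_alarms (dn, alarm_time desc);

With real columns the planner gets proper per-column statistics on both the join and the sort key, so it can cost the two competing plans correctly instead of guessing.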