Hello,
Our application uses typical queries similar to the following (very simplified):
SELECT
    part_id,
    part_name,
    (SELECT SUM(amount)
     FROM part_movements M
     WHERE P.part_id = M.part_id
    ) AS part_amount
FROM parts P
LIMIT 50
The parts table holds thousands of items, and the part_movements table
stores yearly movement information for those items. We present the
results to users page by page, hence the LIMIT clause.
Users can sort and filter the results. When sorting is introduced, query
performance drops significantly:
SELECT
    part_id,
    part_name,
    (SELECT SUM(amount)
     FROM part_movements M
     WHERE P.part_id = M.part_id
    ) AS part_amount
FROM parts P
ORDER BY part_name
LIMIT 50
Postgres seems to compute part_amount for every row and only then sort
the results, which nearly renders the paging meaningless. Adding a
dummy WHERE clause dramatically improves performance:
SELECT
    part_id,
    part_name,
    (SELECT SUM(amount)
     FROM part_movements M
     WHERE P.part_id = M.part_id
    ) AS part_amount
FROM parts P
WHERE (SELECT SUM(amount)
       FROM part_movements M
       WHERE P.part_id = M.part_id) > -10000000
ORDER BY part_name
LIMIT 50
Is there a way to improve the performance of these queries? Is it
possible to instruct Postgres to sort the rows first and then compute
the inner queries? (We have simulated this using temporary tables and
two-stage queries, but that is not practical because most of our
queries are generated automatically.)
Attached are the real queries and their corresponding EXPLAIN ANALYZE
output.
Regards,
Umit Oztosun