Re: Slow queries consisting inner selects and order bys & hack to speed up

От: Tom Lane
Тема: Re: Slow queries consisting inner selects and order bys & hack to speed up
Дата: ,
Msg-id: 7599.1138043954@sss.pgh.pa.us
(см: обсуждение, исходный текст)
Ответ на: Slow queries consisting inner selects and order bys & hack to speed up  (Ümit Öztosun)
Список: pgsql-performance

=?ISO-8859-1?Q?=DCmit_=D6ztosun?= <> writes:
> Our application uses typical queries similar to following (very simplified):

> SELECT
>  part_id,
>  part_name,
>  (SELECT
>   SUM(amount) FROM part_movements M
>   WHERE P.part_id =3D M.part_id
>  ) as part_amount
> FROM parts P
> ORDER BY part_name
> LIMIT 50

> Postgres seems to compute all possible rows and then sorts the
> results, which nearly renders the paging meaningless.

Yeah.  The general rule is that sorting happens after computing the
SELECT values --- this is more or less required for cases where the
ORDER BY refers to a SELECT-list item.  You'd probably have better
results by writing a sub-select:

SELECT
 part_id,
 part_name,
 (SELECT
  SUM(amount) FROM part_movements M
  WHERE P.part_id = M.part_id
 ) as part_amount
FROM
 (SELECT part_id, part_name FROM parts P
  WHERE whatever ORDER BY whatever LIMIT n) as P;

This will do the part_movements stuff only for rows that make it out of
the sub-select.

Another approach is to make sure the ORDER BY is always on an indexed
column; in cases where the ORDER BY is done by an indexscan instead
of a sort, calculation of the unwanted SELECT-list items does not
happen.  However, this only works as long as LIMIT+OFFSET is fairly
small.

Lastly, are you on a reasonably current Postgres version (performance
complaints about anything older than 8.0 will no longer be accepted
with much grace), and are your statistics up to date?  The ANALYZE
shows rowcount estimates that seem suspiciously far off:

        ->  Seq Scan on scf_stokkart stok  (cost=0.00..142622.54 rows=25 width=503) (actual time=4.726..19324.633
rows=4947loops=1) 
              Filter: (upper((durum)::text) = 'A'::text)

This is important because, if the planner realized that the SELECT-list
items were going to be evaluated 5000 times not 25, it might well choose
a different plan.

            regards, tom lane


В списке pgsql-performance по дате сообщения:

От: "Adnan HOTMAIL"
Дата:
Сообщение: unsubscribe
От: Brad Nicholson
Дата:
Сообщение: Investigating IO Saturation