Re: ORDER BY ... LIMIT.. performance

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: ORDER BY ... LIMIT.. performance
Дата
Msg-id 13210.1039210098@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: ORDER BY ... LIMIT.. performance  ("john cartmell" <john.cartmell@mediaburst.co.uk>)
Список pgsql-performance
"john cartmell" <john.cartmell@mediaburst.co.uk> writes:
> The query:
> SELECT * FROM tblcompany WHERE lower(companyname) like 'a g m%' ORDER BY
> companyname;
> returns 20 rows.
  ^^^^^^^^^^^^^^^

Ahh, light dawns.

> When limit is 19:
>     EXPLAIN ANALYZE SELECT * FROM tblcompany WHERE
> lower(companyname) like 'a g m%' ORDER BY companyname LIMIT 19,0;
>     NOTICE:  QUERY PLAN:

>     Limit  (cost=0.00..4621.68 rows=19 width=223) (actual
> time=561.20..563.11 rows=19 loops=1)
>       ->  Index Scan using idx_tblcompany_companyname on tblcompany
> (cost=0.00..1542006.83 rows=6339 width=223) (actual time=561.19..563.07
> rows=20 loops=1)
>     Total runtime: 563.22 msec

> But when it is 20:
>     EXPLAIN ANALYZE SELECT * FROM tblcompany WHERE
> lower(companyname) like 'a g m%' ORDER BY companyname LIMIT 20,0;
>     NOTICE:  QUERY PLAN:

>     Limit  (cost=0.00..4864.92 rows=20 width=223) (actual
> time=559.58..21895.02 rows=20 loops=1)
>       ->  Index Scan using idx_tblcompany_companyname on tblcompany
> (cost=0.00..1542006.83 rows=6339 width=223) (actual
> time=559.57..21894.97 rows=20 loops=1)
>     Total runtime: 21895.13 msec

The problem here is that in current releases, the Limit plan node tries
to fetch one more row than requested (you can see this in the actual
rowcounts for the first example).  So in your second example, the base
indexscan is actually being run to completion before the Limit gives up.
And since that scan is being used for ordering, not for implementing the
WHERE clause, it visits all the rows.  (When you leave off LIMIT, the
planner chooses a plan that's more amenable to fetching all the data...)

I recently revised the Limit logic so that it doesn't fetch the extra
row.  This takes more code, but you're not the first to complain of
the old behavior.  It'll be in 7.4, or if you're brave you could
probably apply the diff to 7.3.

In the meantime, a more appropriate query would be

SELECT * FROM tblcompany
WHERE lower(companyname) like 'a g m%'
ORDER BY lower(companyname)
LIMIT whatever

so that an index on lower(companyname) could be used both for the WHERE
clause and for the ordering.

            regards, tom lane

В списке pgsql-performance по дате отправления:

Предыдущее
От: Rod Taylor
Дата:
Сообщение: Re: ALTER TABLE .. < ADD | DROP > OIDS
Следующее
От: Hannu Krosing
Дата:
Сообщение: Re: Speeding up aggregates