Re: Query slows after offset of 100K

From: Matthew
Subject: Re: Query slows after offset of 100K
Msg-id: Pine.LNX.4.64.0802151438540.20402@aragorn.flymine.org
In reply to: Query slows after offset of 100K  (Michael Lorenz <mlorenz1@hotmail.com>)
List: pgsql-performance
On Thu, 14 Feb 2008, Michael Lorenz wrote:
> When offsetting up to about 90K records, the EXPLAIN ANALYZE is similar to the following:
> Limit  (cost=15357.06..15387.77 rows=20 width=35) (actual time=19.235..19.276 rows=20 loops=1)
>   ->  Index Scan using account_objectname on "object" o  (cost=0.00..1151102.10 rows=749559 width=35) (actual time=0.086..14.981 rows=10020 loops=1)
>         Index Cond: (accountid = 354)
>         Filter: ((NOT deleted) OR (deleted IS NULL))
> Total runtime: 19.315 ms

Since this is scanning through 10,000 random rows in 19 milliseconds, I
say all this data is already in the cache. If it wasn't, you'd be looking
at 10,000 random seeks on disk, at about 7ms each, which is 70 seconds.
Try dropping the OS caches (on Linux, echo "1" > /proc/sys/vm/drop_caches)
and see if the performance gets worse.

> If I move the offset up to 100K records or higher, I get:
> Limit  (cost=145636.26..145636.31 rows=20 width=35) (actual time=13524.327..13524.355 rows=20 loops=1)
>   ->  Sort  (cost=145386.26..147260.16 rows=749559 width=35) (actual time=13409.216..13481.793 rows=100020 loops=1)
>         Sort Key: objectname
>         ->  Seq Scan on "object" o  (cost=0.00..16685.49 rows=749559 width=35) (actual time=0.011..1600.683 rows=749549 loops=1)
>               Filter: (((NOT deleted) OR (deleted IS NULL)) AND (accountid = 354))
> Total runtime: 14452.374 ms

And here, it only takes 1.5 seconds to fetch the entire table from disc
(or it's already in the cache or something), but 14 seconds to sort the
whole lot in memory.

In any case, Postgres is making a good choice - it's just that in the first
case you get the unexpected benefit that the data is already in cache.
Setting effective_cache_size correctly will help the planner here, and
raising work_mem will improve the performance of the sort in the second
case.
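As a sketch of what that looks like (the values here are illustrative,
not recommendations - tune them to your machine's actual RAM and workload):

```sql
-- Tell the planner roughly how much of the data is likely to be cached
-- by Postgres plus the OS (illustrative value):
SET effective_cache_size = '2GB';

-- Give sorts more memory so a 100K-row sort can stay in RAM
-- instead of spilling to disk (illustrative value):
SET work_mem = '128MB';
```

Both can also be set per-session or per-query rather than globally in
postgresql.conf, which is handy for one-off reporting queries like this.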

Of course, what others have said about trying to avoid large offsets is
good advice. You don't actually need a unique index, but it makes it
simpler if you do.
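For reference, the "avoid the offset" (keyset) approach looks something
like the following. The accountid, objectname and deleted columns come
from the plans above; objectid and the assumption that (objectname,
objectid) is unique per account are hypothetical, for illustration only:

```sql
-- Keyset pagination: remember the last row the client saw and continue
-- after it, instead of OFFSET-ing past all the earlier rows each time.
SELECT objectid, objectname
FROM "object"
WHERE accountid = 354
  AND ((NOT deleted) OR (deleted IS NULL))
  AND (objectname, objectid) > (:last_objectname, :last_objectid)
ORDER BY objectname, objectid
LIMIT 20;
```

With an index on (accountid, objectname, objectid) this reads only the
20 rows it returns, so page 5000 costs the same as page 1.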

Matthew

--
The early bird gets the worm. If you want something else for breakfast, get
up later.
