Re: Performance hit on large row counts

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: Performance hit on large row counts
Дата	26 декабря 2005 г. 19:52:08
Msg-id	15618.1135641128@sss.pgh.pa.us обсуждение исходный текст
Ответ на	Re: Performance hit on large row counts (David Scott <davids@apptechsys.com>)
Список	pgsql-performance

Дерево обсуждения

David Scott <davids@apptechsys.com> writes:
>     I didn't mention I was the only user with transactions open on the
> system during this.  Would cluster eliminate more rows then vacuum full
> if the only open transaction is the one running the vacuum and it is a
> clean transaction?

It wouldn't eliminate more rows, but it could nonetheless produce a
smaller table.  IIRC, VACUUM FULL stops shrinking as soon as it finds
a row that there is no room for in lower-numbered table pages; so a
large row near the end of the table could block squeezing-out of small
amounts of free space in earlier pages of the table.  I doubt this
effect is significant most of the time, but in a table with widely
varying row sizes it might be an issue.

Also, CLUSTER can definitely produce smaller *indexes* than VACUUM FULL.
VACUUM FULL operates at a serious disadvantage when it comes to indexes,
because in order to move a tuple it has to actually make extra index
entries.

>> I'm not at all sure I believe your premise that querying for a different
>> key value excludes cache effects, btw.  On modern hardware it's likely
>> that CLUSTER would leave the *whole* of these tables sitting in kernel
>> disk cache.
>>
>     You are exactly right.  After rebooting the entire box and running
> the query the query time was 15 seconds.  Rebooting the box, running
> cluster on all three tables and then executing the query was 120 ms.  Is
> calling cluster the only way to ensure that these tables get loaded into
> cache?  Running select * appeared to cache some but not all.

Hm, I'd think that SELECT * or SELECT count(*) would cause all of a
table to be cached.  It wouldn't do anything about caching the indexes
though, and that might explain your observations.

            regards, tom lane

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Performance hit on large row counts