Re: Disadvantage to CLUSTER?

Поиск
Список
Период
Сортировка
От Steve Crawford
Тема Re: Disadvantage to CLUSTER?
Дата
Msg-id 4FB2DD74.10001@pinpointresearch.com
обсуждение исходный текст
Ответ на Re: Disadvantage to CLUSTER?  (Robert James <srobertjames@gmail.com>)
Список pgsql-general
On 05/15/2012 02:44 PM, Robert James wrote:
>
> Okay, I understand why we still need VACUUM and why we can't always
> CLUSTER.  But my question remains: assuming I have some down time, do
> I loose anything by CLUSTER.  Your answer is, I believe: Not normally,
> but there is one case where you do.  That's an append-only table,
> where you're generally interested in the most recent data, but you
> cluster on something else.
>
> Does clustering really hurt in that case? Is the planner smart enough
> to realize that the data you want is towards the end only? I would
> think that it doesn't know this, and will, let's say, assume it is
> scattered regardless and perform a full scan.  I guess the question
> is: Does the natural order of data help if there's no explicit means
> for the planner to prove it?
>
The speed benefit of clustering (aside from compacting the files and
reindexing) comes from keeping data that is likely to be used together
stored together. I.e. all the data for this week or all the records for
invoice 12345. If you are really lucky, the data a query needs might be
in one or a few contiguous disk blocks.

But say you clustered your invoice-detail data on item number. Gathering
the detail records for one invoice could require many disk seeks and a
disk-block read per record. In that case clustering would hurt so you
should choose your cluster order intelligently.

BTW, be sure to run ANALYZE after clustering a table.

See http://www.postgresql.org/docs/current/static/sql-cluster.html

Cheers,
Steve


В списке pgsql-general по дате отправления:

Предыдущее
От: Merlin Moncure
Дата:
Сообщение: Re: Disadvantage to CLUSTER?
Следующее
От: Evan Martin
Дата:
Сообщение: Slow queries when functions are inlined