Re: Disadvantage to CLUSTER?

From Merlin Moncure
Subject Re: Disadvantage to CLUSTER?
Date
Msg-id CAHyXU0w5j02xeoLdyceEQLz3SisCKNBazruW5+GpQBEoSaxOMw@mail.gmail.com
In response to Re: Disadvantage to CLUSTER?  (Robert James <srobertjames@gmail.com>)
List pgsql-general
On Tue, May 15, 2012 at 4:44 PM, Robert James <srobertjames@gmail.com> wrote:
> On 5/15/12, Steve Crawford <scrawford@pinpointresearch.com> wrote:
>> On 05/15/2012 02:02 PM, Robert James wrote:
>>> Besides the one time spent CLUSTERing, do I lose anything by doing it
>>> for every table?  Does a CLUSTER slow anything down?
>
>> Cluster should have better performance but it depends on the index you
>> choose relative to the queries you typically run. Let's say that you
>> have an accounting system where you most often grab the most recent
>> month worth of data. Clustering that keeps that data together will be
>> beneficial but you could easily have a different index, item-number for
>> instance, that would, if used for clustering, leave the commonly used
>> data scattered throughout the table. If that table was an append-only
>> detail table the most commonly used data would naturally clump together
>> which clustering would then destroy.
>
>
> Okay, I understand why we still need VACUUM and why we can't always
> CLUSTER.  But my question remains: assuming I have some down time, do
> I lose anything by CLUSTER.  Your answer is, I believe: Not normally,
> but there is one case where you do.  That's an append-only table,
> where you're generally interested in the most recent data, but you
> cluster on something else.
>
> Does clustering really hurt in that case? Is the planner smart enough
> to realize that the data you want is towards the end only? I would
> think that it doesn't know this, and will, let's say, assume it is
> scattered regardless and perform a full scan.  I guess the question
> is: Does the natural order of data help if there's no explicit means
> for the planner to prove it?

If you have regular daily downtime then sure, you can cluster and
there is little disadvantage to doing so.  It will pack indexes and do
all kinds of other things that 24x7 shops have to do in a much more
complicated fashion.

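The planner makes no assumptions about how the data is ordered in the
table; it relies on the statistics gathered by ANALYZE, which include a
per-column correlation between physical row order and column values
(pg_stats.correlation).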
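
To make Steve's append-only example concrete, here is a rough Python
sketch. It is a toy model, not PostgreSQL internals: the table sizes,
the 100-rows-per-page figure, and the helper names (pages_touched,
build_append_only, is_recent) are all assumptions made up for
illustration. It counts how many heap pages hold rows from the most
recent 30 days, first in insertion order and then after reordering the
rows by item number, which is roughly what CLUSTER on an item-number
index would do.

```python
import random

ROWS_PER_PAGE = 100  # assumed; the real figure depends on row width


def pages_touched(rows, wanted):
    """Count distinct pages that hold at least one row matching `wanted`."""
    return len({pos // ROWS_PER_PAGE
                for pos, row in enumerate(rows) if wanted(row)})


def build_append_only(days=360, rows_per_day=500, items=1000, seed=0):
    """Rows arrive in date order; each row is (day, item_number)."""
    rng = random.Random(seed)
    return [(day, rng.randrange(items))
            for day in range(days)
            for _ in range(rows_per_day)]


def is_recent(row):
    """True for rows from the last 30 days of the toy data set."""
    return row[0] >= 330


rows = build_append_only()

# Physical order as inserted: recent rows sit together at the end.
append_pages = pages_touched(rows, is_recent)

# Physical order after clustering on item number: recent rows are
# spread across the table, a few per item.
clustered = sorted(rows, key=lambda row: row[1])
clustered_pages = pages_touched(clustered, is_recent)

print("pages holding recent rows, append order:    ", append_pages)
print("pages holding recent rows, clustered by item:", clustered_pages)
```

In this model the 15,000 recent rows fill 150 contiguous pages in
append order, while the item-ordered copy spreads them over many more
pages. Whether that matters in practice depends on your queries and on
how much of the table is cached.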

merlin
