Re: CLUSTER and clustered indices

Поиск
Список
Период
Сортировка
От Simon Riggs
Тема Re: CLUSTER and clustered indices
Дата
Msg-id 1132264123.4959.283.camel@localhost.localdomain
обсуждение исходный текст
Ответ на Re: CLUSTER and clustered indices  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: CLUSTER and clustered indices  (Alvaro Herrera <alvherre@commandprompt.com>)
Список pgsql-hackers
On Thu, 2005-11-17 at 10:58 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > When a table has been CLUSTERed on a particular index AND that index
> > values is monotonically increasing, then it would be a bad move to use
> > blocks from the FSM since this would tend to destroy the natural
> > clustering sequence.
> 
> By the time there are any blocks in FSM to take, the original clean
> index page sequence is doubtless history.  The pure-increasing-key
> scenario you are thinking of never will have any FSM entries, so it's
> moot.

Well, just because its a pure-increasing-key doesn't imply rows don't
get deleted. Look at the NewOrder table in DBT-2/TPC-C. Order numbers
increase, but mostly they get delivered in the end. 

In the case I describe, tuples are added always to the rightmost edge of
the index, so it seems worthwhile to always add tuples to the top of the
heap only so that the order is consistent.

On Thu, 2005-11-17 at 12:45 -0300, Alvaro Herrera wrote:

> That's a nice idea, but what's the cost?  You will have to check every
> insert: does the table has indexes?  Is any of them clustered?  Is the
> clustered index attached to a sequence?  It seems quite an expensive
> check to be making.

I'd make the check when the relcache is loaded and store it there as a
boolean. heap_insert already has a boolean on it for use_fsm, so it
would transfer very simply at insert time with almost no overhead.

The use case exists and the technique is low overhead, but the main
question is: Does anybody think this behaviour would be beneficial for
them? (I'm actually in two minds myself, but once the idea has arisen,
it seems sensible to discuss this for everybody's sake).

The trade-off is a table that keeps growing in size, even though you
VACUUM it, with the benefit that the clustering is maintained. 

So how would you maintain it? Looks like you'd still have to use regular
CLUSTER commands, but at least it would stay good in between.

Best Regards, Simon Riggs



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: Improving count(*)
Следующее
От: Martijn van Oosterhout
Дата:
Сообщение: Re: Improving count(*)