Re: Simple improvements to freespace allocation

From: Heikki Linnakangas
Subject: Re: Simple improvements to freespace allocation
Message-ID: 52CD01B2.1050703@vmware.com
In reply to: Simple improvements to freespace allocation  (Simon Riggs <simon@2ndQuadrant.com>)
Responses: Re: Simple improvements to freespace allocation  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Simple improvements to freespace allocation  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Simple improvements to freespace allocation  (Jim Nasby <jim@nasby.net>)
List: pgsql-hackers

On 01/08/2014 08:56 AM, Simon Riggs wrote:
> Current freespace code gives a new block insert target (NBT) from
> anywhere in the table. That isn't very useful with bigger tables and it
> would be useful to be able to specify different algorithms for
> producing NBTs.

I've actually been surprised how little demand there has been for 
alternative algorithms. When I wrote the current FSM implementation, I 
expected people to start coming up with all kinds of wishes, but it 
didn't happen. There have been very few complaints; everyone seems to be 
satisfied with the way it works now. So I'm not convinced there's much 
need for this.

> ALTER TABLE foo WITH (freespace = XXXX);
>
> Three simple and useful models come to mind
>
> * CONCURRENT
> This is the standard/current model. Naming it like this emphasises
> why we pick NBTs in the way we do.
>
> * PACK
> We want the table to be smaller, so rather than run a VACUUM FULL we
> want to force the table to choose an NBT at start of table, even at
> the expense of concurrency. By avoiding putting new data at the top of
> the table we allow the possibility that VACUUM will shrink the table size.
> This is the same as current except we always reset the FSM pointer to zero
> and re-seek from there. This takes some time to have an effect, but is
> much less invasive than VACUUM FULL.

We already reset the FSM pointer to zero on vacuum. Would the above 
actually make any difference in practice?
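
To make the two starting points concrete, here is a toy sketch. It is not PostgreSQL's FSM code: the flat `free_bytes` list, the `find_block` helper and the `last_target` variable are all hypothetical stand-ins, used only to show how "resume from the previous target" and "always re-seek from block 0" can pick different blocks.

```python
def find_block(free_bytes, needed, start):
    """Return the first block at or after `start` (wrapping around) that has
    at least `needed` free bytes, or None if no block qualifies."""
    n = len(free_bytes)
    for offset in range(n):
        blk = (start + offset) % n
        if free_bytes[blk] >= needed:
            return blk
    return None


# Hypothetical per-block free space for an 8-block table.
free_bytes = [0, 500, 0, 0, 800, 0, 0, 300]
last_target = 5  # assumed position of the search pointer

# Resuming from the previous position finds block 7, near the end.
concurrent_choice = find_block(free_bytes, 200, last_target)

# Re-seeking from zero (the PACK idea) finds block 1, near the start.
pack_choice = find_block(free_bytes, 200, 0)
```

In this model the two policies only differ while the pointer sits away from zero, which is the substance of the question above: if vacuum already resets the pointer, the open issue is how often it drifts between vacuums.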

> * RECENT
> For large tables with an append-mostly use case it would be useful to
> prefer NBTs from the last two 1GB segments of a table, allowing them
> to be more easily cached. This is the same as current except that when
> we wrap we don't go to block 0; we go to the first block of the
> penultimate (max - 1) segment. For tables <= 2 segments this is no
> change from the existing algorithm. For larger tables it would focus
> updates/inserts into a much reduced and yet still large area and allow
> better caching.

Umm, wouldn't that bloat the table with no limit? Putting my 
DBA/developer hat on, I don't understand when I would want to use that 
setting.
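
The wrap rule in the RECENT proposal can be written down as a small calculation. This is only a sketch of the arithmetic, assuming the default 8kB block size and 1GB segment size; `recent_wrap_target` is a hypothetical name, not an existing function.

```python
# Assumes default build settings: 8kB blocks, 1GB segments.
BLOCK_SIZE = 8192
BLOCKS_PER_SEGMENT = (1024 ** 3) // BLOCK_SIZE  # 131072


def recent_wrap_target(nblocks):
    """Return the block number the search would wrap to under the proposed
    RECENT policy, for a table that is currently `nblocks` blocks long."""
    # Number of segments, rounding up for a partially filled last segment.
    nsegments = -(-nblocks // BLOCKS_PER_SEGMENT)
    if nsegments <= 2:
        # Same as the existing behaviour: wrap to the start of the table.
        return 0
    # First block of the penultimate segment.
    return (nsegments - 2) * BLOCKS_PER_SEGMENT


small = recent_wrap_target(2 * BLOCKS_PER_SEGMENT)   # 0
large = recent_wrap_target(10 * BLOCKS_PER_SEGMENT)  # 8 * 131072 = 1048576
```

Under this rule, free space in any block below the wrap target is never handed out again, which appears to be the bloat concern raised above.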

> These are small patches.
>
> ...other possibilities, though more complex are...
>
> * IN-MEMORY
> A large table may only have some of its blocks in memory. It would be
> useful to force a NBT to be a block already in shared_buffers IFF a
> table is above a certain size (use same threshold as seq scans, i.e.
> 25% of shared_buffers). That may be difficult to achieve in practice,
> so not sure about this one. Like it? Any ideas?

Yeah, that seems nice, although I have a feeling that it's not worth the 
complexity.
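
For illustration only, the IN-MEMORY rule could look roughly like this. The sketch assumes the set of cached blocks is simply handed in, which skips the part described above as hard to achieve in practice; every name here is hypothetical.

```python
def in_memory_target(table_blocks, shared_buffers_blocks, cached_blocks,
                     free_bytes, needed):
    """Return a cached block with at least `needed` free bytes when the table
    is above the size threshold; return None to mean "use the normal policy".

    `free_bytes` maps block number -> free bytes for blocks we know about.
    """
    # Same threshold as quoted for sequential scans: 25% of shared_buffers.
    if table_blocks <= shared_buffers_blocks // 4:
        return None
    for blk in sorted(cached_blocks):
        if free_bytes.get(blk, 0) >= needed:
            return blk
    return None


# Table of 1000 blocks, 400 buffers: above the 100-block threshold.
choice = in_memory_target(1000, 400, {7, 3}, {3: 50, 7: 500}, 200)
```

Even in this simplified form the policy needs a second lookup structure next to the free space data, which gives some idea of the complexity being weighed.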

There's one policy that I'd like to see: maintaining cluster order. When 
inserting a new tuple, try to place it close to other tuples with 
similar keys, to keep the table clustered.

In practice, CLUSTER CONCURRENTLY might be more useful, though.
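
A rough sketch of what a cluster-order policy might do, under strong simplifying assumptions: the table is already in key order, and the smallest key of each block is available in a sorted list (a real implementation would more likely consult the index). `clustered_target` and its arguments are hypothetical.

```python
import bisect


def clustered_target(block_min_keys, free_bytes, key, needed):
    """Pick the block closest to where `key` belongs that still has at least
    `needed` free bytes, or None if no block has room.

    block_min_keys[i] is the smallest key stored on block i, ascending.
    """
    # Block whose key range should contain `key` (clamped to block 0).
    ideal = max(bisect.bisect_right(block_min_keys, key) - 1, 0)
    n = len(free_bytes)
    # Try the ideal block first, then neighbours at increasing distance.
    for dist in range(n):
        for blk in (ideal - dist, ideal + dist):
            if 0 <= blk < n and free_bytes[blk] >= needed:
                return blk
    return None


block_min_keys = [10, 20, 30, 40]
free_bytes = [0, 100, 0, 100]
exact = clustered_target(block_min_keys, free_bytes, 25, 50)    # ideal block 1 has room
nearby = clustered_target(block_min_keys, free_bytes, 35, 50)   # block 2 full, falls back to 1
```

The fallback to neighbouring blocks is one possible choice; it keeps new tuples near similar keys at the cost of a wider search when the ideal block is full.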

> We might also allow a custom NBT policy though allowing user code at
> that point could be dangerous.

Yeah, I don't think there's much need for that. Overall,

- Heikki


