ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The problem is that we've traded splitting a page every few hundred
>> inserts for doing a PageIndexMultiDelete, and emitting an extra WAL
>> record, on *every* insert. This is not good.
> I suspect PageIndexMultiDelete() consumes CPU.
That's part of the problem, but only part: the extra WAL record is
expensive too.
> If there are only one or two
> dead tuples, PageIndexTupleDelete() is called, and a memmove (4KB on
> average) plus an adjustment of the line pointer offsets is performed
> every time. I think this is a heavy operation. But if the uppermost index
> entry is the same size as the dead tuple, we can simply move that entry
> into the hole and avoid modifying all the other tuples. Is this change
> acceptable?
I'm inclined to think that this is too special-purpose to be a good
solution. It will help pgbench because that test uses only integer
keys, but it won't help for any variable-width datatype. In any case
we'd still have the WAL overhead...
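
To make the trade-off concrete, here is a rough sketch of the two deletion
strategies on a toy page. Everything in it (ToyPage, ToyLinePointer, and the
toy_* functions) is invented for illustration and is not the bufpage.c code;
it only assumes that tuple data is packed at the end of the page and grows
downward. It ignores WAL entirely, so it shows only the CPU side of the
argument.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_PAGE_SIZE 256
#define TOY_MAX_ITEMS 16

/* Hypothetical stand-ins for a line pointer and a page; not PostgreSQL types. */
typedef struct { int off; int len; } ToyLinePointer;

typedef struct {
    int nitems;
    int upper;                      /* tuples occupy [upper, TOY_PAGE_SIZE) */
    ToyLinePointer lp[TOY_MAX_ITEMS];
    unsigned char data[TOY_PAGE_SIZE];
} ToyPage;

void toy_init(ToyPage *p)
{
    memset(p, 0, sizeof(*p));
    p->upper = TOY_PAGE_SIZE;
}

/* Append a tuple just below the current tuple area; returns its item index. */
int toy_add(ToyPage *p, const void *tuple, int len)
{
    assert(p->nitems < TOY_MAX_ITEMS && len > 0 && p->upper - len >= 0);
    p->upper -= len;
    memcpy(p->data + p->upper, tuple, (size_t) len);
    p->lp[p->nitems].off = p->upper;
    p->lp[p->nitems].len = len;
    return p->nitems++;
}

/* Drop one line pointer, keeping the remaining ones in order. */
static void toy_remove_lp(ToyPage *p, int idx)
{
    for (int i = idx; i < p->nitems - 1; i++)
        p->lp[i] = p->lp[i + 1];
    p->nitems--;
}

/*
 * General case: slide everything stored below the dead tuple over the hole,
 * then fix the offset of every tuple that moved. Returns bytes moved.
 */
int toy_delete_compact(ToyPage *p, int idx)
{
    assert(idx >= 0 && idx < p->nitems);
    int off = p->lp[idx].off;
    int len = p->lp[idx].len;
    int moved = off - p->upper;

    memmove(p->data + p->upper + len, p->data + p->upper, (size_t) moved);
    p->upper += len;
    toy_remove_lp(p, idx);
    for (int i = 0; i < p->nitems; i++)
        if (p->lp[i].off < off)
            p->lp[i].off += len;
    return moved;
}

/*
 * Proposed shortcut: if the tuple sitting at 'upper' has the same length as
 * the dead one, copy it into the hole and touch a single line pointer.
 * Returns bytes moved, or -1 (page untouched) when the lengths differ.
 */
int toy_delete_swap(ToyPage *p, int idx)
{
    assert(idx >= 0 && idx < p->nitems);
    int off = p->lp[idx].off;
    int len = p->lp[idx].len;
    int top = -1;
    int moved = 0;

    for (int i = 0; i < p->nitems; i++)
        if (p->lp[i].off == p->upper)
            top = i;
    if (top < 0 || p->lp[top].len != len)
        return -1;              /* variable-width case: no shortcut */

    if (top != idx) {
        /* Equal-length, distinct tuples cannot overlap, so memcpy is safe. */
        memcpy(p->data + off, p->data + p->upper, (size_t) len);
        p->lp[top].off = off;
        moved = len;
    }
    p->upper += len;
    toy_remove_lp(p, idx);
    return moved;
}
```

With four 8-byte tuples on the toy page, deleting the first one added moves
24 bytes and rewrites three offsets in toy_delete_compact(), versus 8 bytes
and one offset in toy_delete_swap(). As soon as the tuple at 'upper' has a
different length, toy_delete_swap() returns -1 and the general path is needed
anyway, which is the variable-width objection above.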
regards, tom lane