Re: Set visibility map bit after HOT prune

From: Robert Haas
Subject: Re: Set visibility map bit after HOT prune
Message-ID: CA+TgmoYQ1FsDrZ6mB95KDGM+9YKa3QRvY9kMyTHe19qMAmbJqA@mail.gmail.com
In response to: Re: Set visibility map bit after HOT prune  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses: Re: Set visibility map bit after HOT prune
List: pgsql-hackers
On Thu, Dec 20, 2012 at 11:49 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
> I wonder if we should add a flag to heap_page_prune and try to do some
> additional work if it's being called from lazy vacuum, such as setting
> the VM bit and freezing tuples. IIRC I had put something like that in
> the early patches, but then ripped it out for simplicity. Maybe it's time
> to play with that again.

That seems unlikely to be a good trade-off.  If VACUUM is going to do
extra stuff, it's better to have that in the vacuum-specific code,
rather than in code that is also traversed from other places.
Otherwise the conditional logic might impose a penalty on people who
aren't taking those branches.
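
To make the structural point concrete, here is a minimal toy sketch in Python, not PostgreSQL source. The names Page, prune_page and vacuum_page are hypothetical, and the model assumes that every tuple left after pruning is visible to all transactions. The shared pruning routine carries no vacuum-only branch; the vacuum-only caller layers the extra work on top of it.

```python
# Toy model only: hypothetical names, not PostgreSQL's actual C code.
from dataclasses import dataclass


@dataclass
class Page:
    dead_tuples: int = 0
    all_visible: bool = False
    frozen: bool = False


def prune_page(page: Page) -> int:
    """Shared path, reached from ordinary scans and from vacuum.

    It only removes dead tuples and has no vacuum-specific branches.
    """
    removed = page.dead_tuples
    page.dead_tuples = 0
    return removed


def vacuum_page(page: Page) -> int:
    """Vacuum-only path: reuse the shared pruner, then do the extra work here."""
    removed = prune_page(page)
    # Assumption of this toy model: everything left on the page is
    # visible to all, so freezing and setting the bit are safe.
    page.frozen = True
    page.all_visible = True
    return removed
```

Callers that only prune never evaluate a "called from vacuum?" condition in this arrangement, which is the penalty the paragraph above is concerned about.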

>> IMHO, it's probably fairly hopeless to make a pure pgbench workload
>> show a benefit from index-only scans.  A large table under a very
>> heavy, completely random write workload is just about the worst
>> possible case for index-only scans.  Index-only scans are a way of
>> avoiding unnecessary visibility checks when the target data hasn't
>> changed recently, not a magic bullet to escape all heap access.  If
>> the target data has changed, you're going to have to touch the heap.
>
> Not always. Not clearing the VM bit at HOT update is one such idea we
> discussed. Of course, there are open issues with that, but they are
> not unsolvable. The advantage of not touching heap is just too big to
> ignore.

I don't really agree.  Sure, not touching the heap is nice, but mostly
because you avoid pulling pages into shared_buffers that aren't
otherwise needed.  IIRC, an index-only scan isn't faster than an index
scan if all the necessary table and index pages are already cached.
Touching already-resident pages just isn't that expensive.  And of
course, if a page has recently suffered an insert, update, or delete,
it is more likely to be resident.  You can construct access patterns
where this isn't so - e.g. update the page, wait for it to get paged
out, and then SELECT from it with an index-only scan, wait for it to
get paged out again, etc. - but I'm not sure how much of a problem
that is in the real world.
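
A rough way to picture this argument is the toy Python model below. It is a sketch under stated assumptions, not how the executor is implemented: index_only_scan, vm_all_visible and resident are hypothetical names, heap pages are plain integers, and the buffer cache is an unbounded set. It separates heap fetches (needed whenever the all-visible bit is not set) from cache misses (the case that actually pulls a new page into shared_buffers).

```python
def index_only_scan(entries, vm_all_visible, resident):
    """Toy model of an index-only scan's heap access.

    entries: heap page numbers referenced by the matching index tuples.
    vm_all_visible: set of pages whose all-visible bit is set.
    resident: set of pages already cached; updated in place on a miss.
    Returns (heap_fetches, cache_misses).
    """
    heap_fetches = 0
    cache_misses = 0
    for page in entries:
        if page in vm_all_visible:
            # Visibility is known from the map; no heap access needed.
            continue
        heap_fetches += 1
        if page not in resident:
            # Only this case brings a page in from outside the cache.
            cache_misses += 1
            resident.add(page)
    return heap_fetches, cache_misses
```

With entries [1, 2, 2, 3], only page 1 marked all-visible and only page 2 cached, the function returns (3, 1): three heap fetches, of which one is a miss. In this model a recently modified page that is still cached adds a heap fetch but no miss.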

>> And while I agree that we aren't aggressive enough in setting the VM
>> bits right now, I also think it wouldn't be too hard to go too far in
>> the opposite direction: we could easily spend more effort trying to
>> make index-only scans effective than we could ever hope to recoup from
>> the scans themselves.
>
> I agree. I also started having that worry. We are at one extreme right
> now and it might not help to get to the other extreme. Looks like I'm
> coming around to the idea of somehow detecting if the scan is happening on
> the result relation of a ModifyTable and avoiding setting the VM bit in that
> case.

It's unclear to me that that's the right way to slice it.  There are
several different sets of concerns here: (1) avoiding setting the
all-visible bit when it'll be cleared again just after, (2) avoiding
slowing down SELECT with HOT-pruning, and (3) avoiding slowing down
repeated SELECTs by NOT having the first one do HOT-pruning.  And
maybe others.  The right thing to do depends on which problems you
think are relatively more important.  That question might not even
have one right answer, but even if it does we don't have consensus on
what it is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


