Re: UNDO and in-place update

From Robert Haas
Subject Re: UNDO and in-place update
Date
Msg-id CA+TgmoZYs+VNKE1t9yHDjpgaxFM0D3q16b-iH_m9bawF5ok7cw@mail.gmail.com
In reply to Re: UNDO and in-place update  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
On Fri, Dec 2, 2016 at 5:01 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> Idea of storing just one visibility bit in index tuple is a subject of
> serious doubts for me.
>
> 1. When definitely-all-visible isn't set then we have to recheck during
> scanning heap, right?  But our current recheck (i.e. rechecking scan
> quals) is not enough.  Imagine you have a range index scan
> (val >= 100 AND val < 200), and val field was updated from val = 50 and
> val = 60 in some row.  Then you might have this row duplicated in the
> output.  Removing index scan and sticking to only bitmap index scan
> doesn't seem to be fair.  Thus, we need to introduce another kind of
> recheck that heapTuple.indexKey = indexTuple.indexKey.

Yes.
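
To make the difference between the two rechecks concrete, here is a
minimal sketch in Python.  It is a toy model, not PostgreSQL code: the
function names, the (key, TID) tuples, and the values 150 and 160 are
all hypothetical, chosen so that both the old and the new key fall
inside the scanned range.  One row has been updated in place, so the
index holds two entries pointing at the same TID, while our snapshot
still sees the old value.

```python
from typing import Dict, List, Tuple

IndexEntry = Tuple[int, int]  # (index key, TID); hypothetical layout


def scan_quals_only(index: List[IndexEntry], visible: Dict[int, int],
                    lo: int, hi: int) -> List[Tuple[int, int]]:
    """Range scan that rechecks only the scan quals on the visible tuple."""
    out = []
    for key, tid in sorted(index):
        if lo <= key < hi:
            val = visible[tid]       # version visible to our snapshot
            if lo <= val < hi:       # quals recheck: passes for both entries
                out.append((tid, val))
    return out


def scan_key_recheck(index: List[IndexEntry], visible: Dict[int, int],
                     lo: int, hi: int) -> List[Tuple[int, int]]:
    """Range scan that also requires heap key == index key."""
    out = []
    for key, tid in sorted(index):
        if lo <= key < hi:
            val = visible[tid]
            if val == key:           # heapTuple.indexKey = indexTuple.indexKey
                out.append((tid, val))
    return out


# TID 1 was updated in place from 150 to 160; our snapshot still sees 150.
index = [(150, 1), (160, 1)]
visible = {1: 150}

dup = scan_quals_only(index, visible, 100, 200)
ok = scan_key_recheck(index, visible, 100, 200)
```

With only the quals recheck, the row comes back once per index entry;
with the key-equality recheck, the entry whose key does not match the
visible tuple is skipped.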

> 2. Another question is how it could work while index key being
> intensively updated.  There is a risk that we would have to traverse
> same UNDO-chains multiple times.  In worst case, we would have to spend
> quadratic time of number of row versions since our snapshot taken.  We
> might try to mitigate this by caching TID => heap tuple map for our
> snapshot.  But I don't like this approach.  If we have large enough
> index scan, then we could either run out of cache or consume too much
> memory for that cache.

I agree with the concern, but I don't think that's necessarily the
only mitigation strategy.  The details of what goes into UNDO and what
goes into the TPD aren't really defined yet, and if we structure those
so that you can efficiently find out what happened to a particular TID
with only a bounded number of accesses, then it might not be too bad.
If you imagine having to walk a chain of 1000 UNDO entries once per
TID, and there are 100 TIDs on the page, that sounds pretty bad.  But
maybe we can design the UNDO format in such a way that you never
actually need to walk the entire chain, or only in extremely rare
corner cases.
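
As a rough illustration of why the format matters, the sketch below
counts UNDO entries touched under two layouts.  Everything here is an
assumption made for the example, since the real UNDO and TPD formats
are not defined yet: the UndoEntry fields, the per-TID back-pointer
prev_same_tid, and the rule that changes with xid >= snapshot_xid are
the ones to undo.  It models a page with 100 TIDs and a chain of 1000
entries.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class UndoEntry:
    tid: int
    xid: int                       # transaction that made the change
    old_value: int                 # value before the change
    prev_same_tid: Optional[int]   # hypothetical per-TID back-pointer


def build_chain(changes: List[Tuple[int, int, int]]):
    """changes: (tid, xid, old_value), oldest first, xids increasing."""
    chain: List[UndoEntry] = []
    last: Dict[int, int] = {}      # TID -> position of its newest entry
    for tid, xid, old in changes:
        chain.append(UndoEntry(tid, xid, old, last.get(tid)))
        last[tid] = len(chain) - 1
    return chain, last


def via_page_chain(chain, current, tid, snapshot_xid):
    """Walk the shared page chain newest-first; return (value, visits)."""
    value, visits = current[tid], 0
    for entry in reversed(chain):
        if entry.xid < snapshot_xid:
            break                  # everything older is visible
        visits += 1
        if entry.tid == tid:
            value = entry.old_value
    return value, visits


def via_tid_chain(chain, last, current, tid, snapshot_xid):
    """Follow only this TID's back-pointers; return (value, visits)."""
    value, visits = current[tid], 0
    i = last.get(tid)
    while i is not None and chain[i].xid >= snapshot_xid:
        visits += 1
        value = chain[i].old_value
        i = chain[i].prev_same_tid
    return value, visits


# 1000 changes spread round-robin over 100 TIDs, all newer than the snapshot.
changes = [(k % 100, 1000 + k, k // 100) for k in range(1000)]
chain, last = build_chain(changes)
current = {tid: 10 for tid in range(100)}

page_total = sum(via_page_chain(chain, current, t, 1000)[1] for t in range(100))
tid_total = sum(via_tid_chain(chain, last, current, t, 1000)[1] for t in range(100))
```

In this toy model both walks reconstruct the same value for each TID,
but the shared-chain walk touches 100,000 entries across the page while
the per-TID walk touches 1,000.  Whether a real format can offer
anything like the back-pointer, and at what cost to writers, is exactly
the open design question.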

It strikes me that there are three possibilities.  One is that we can
design the UNDO-based visibility reconstruction so that it is
blazingly fast.  In that case, the only benefit of putting the
visibility information in the index is that index-only scans will be
able to avoid touching the heap in some cases where they presently
cannot.  The second possibility is that despite our best efforts the
UNDO-based visibility reconstruction is doomed to be cripplingly slow.
In that case, even in "good" cases like sequential scans and bitmap
heap scans where we can process the entire page at once instead of
possibly having to make a separate visit per TID, we'll still be
painfully slow.  In that case, we might as well forget this project
altogether.  The third is that we're somewhere in the middle.  If
UNDO-based visibility reconstruction is only a minor annoyance in
sequential scan and bitmap-heap scan cases but, despite our best
efforts, becomes highly painful in index scan cases, then the case for
putting XIDs in the index suddenly becomes a lot more interesting in
my mind.

We may well end up in exactly that place, but I think it's a little
too early to decide yet.  I think we need to write a detailed design
for how UNDO-based visibility reconstruction would actually work, and
maybe even build a prototype to see how that performs, before we can
decide on this.  I kind of hope we don't need to do XIDs-in-the-index;
it sounds like a lot of work, and this project is bound to be
extremely difficult even without that additional complication.
However, if it becomes clear that it's the only way for a system like
this to perform acceptably, then it'll have to be done (unless we give
up on the whole thing).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


