Re: UNDO and in-place update

From Alexander Korotkov
Subject Re: UNDO and in-place update
Date
Msg-id CAPpHfdtiLK55eT9uJu6U=g12q+mNMkegvtGhz0Qdic5H+kuSzA@mail.gmail.com
In reply to Re: UNDO and in-place update  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tue, Nov 29, 2016 at 8:21 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 28, 2016 at 11:01 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Nov 27, 2016 at 10:44 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Mon, Nov 28, 2016 at 4:50 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> Well, my original email did contain a discussion of the need for
>>> delete-marking.  I said that you couldn't do in-place updates when
>>> indexed columns were modified unless the index AMs had support for
>>> delete-marking, which is the same point you are making here.
>>
>> Sorry, I had not read that part earlier, but now that I read it, I
>> think there is a slight difference in what I am saying.   I thought
>> along with delete-marking, we might need transaction visibility
>> information in the index as well.
>
> I think we need to avoid putting the visibility information in the
> index because that will make the index much bigger.
>

I agree that there will be an increase in index size, but it shouldn't
be much if we have transaction information (and that too only active
transactions) at page level instead of at tuple level.  I think the
number of concurrent writes on the index will be lesser as compared to
the heap.  There are a couple of benefits of having visibility
information in the index.

a. Heap and index could be independently cleaned up and in most cases
by foreground transactions only.  The work for vacuum will be quite
less as compared to now.  I think this will help us in avoiding the
bloat to a great degree.

b. Improved index-only scans, because now we don't need visibility map
of the heap to check tuple visibility.

c. Some speed improvements for index scans can also be expected
because with this we don't need to perform heap fetch for invisible
index tuples.

+1
I think that once we are considering delete-marking of index tuples, we should provide visibility-aware indexes as an option.
It probably shouldn't be the only option for an UNDO-based table engine, but it definitely should be one of the options.
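
To make the page-level idea quoted above more concrete, here is a minimal Python sketch of an index page that keeps transaction slots at page level and delete-marks its tuples, so a scan can filter invisible entries without a heap fetch.  All names (Snapshot, IndexTuple, IndexPage, the slot layout) are hypothetical and not taken from PostgreSQL; the visibility rule is deliberately simplified (a transaction's own changes and slot recycling are ignored).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Snapshot:
    """Hypothetical snapshot: an xid horizon plus xids in progress at snapshot time."""
    xmax: int               # first xid not yet assigned when the snapshot was taken
    in_progress: Set[int]   # xids that were still running at that moment

    def sees(self, xid: int, committed: Set[int]) -> bool:
        # Simplified rule: an xid's effects are visible only if it started
        # before the horizon, was not running, and has committed.
        return xid < self.xmax and xid not in self.in_progress and xid in committed


@dataclass
class IndexTuple:
    key: int
    tid: int                           # pointer into the heap
    insert_slot: Optional[int] = None  # page-level slot of the inserter; None = visible to all
    delete_slot: Optional[int] = None  # set when the entry is delete-marked


@dataclass
class IndexPage:
    # Assumption: transaction information lives once per page (slot -> xid),
    # and each tuple stores only a small slot number.
    slots: Dict[int, int] = field(default_factory=dict)
    tuples: List[IndexTuple] = field(default_factory=list)

    def visible_tids(self, key: int, snap: Snapshot, committed: Set[int]) -> List[int]:
        """Return heap TIDs for `key` that the snapshot can see, using index data only."""
        result = []
        for t in self.tuples:
            if t.key != key:
                continue
            if t.insert_slot is not None and not snap.sees(self.slots[t.insert_slot], committed):
                continue  # the inserting transaction is not visible to this snapshot
            if t.delete_slot is not None and snap.sees(self.slots[t.delete_slot], committed):
                continue  # the deleting transaction is visible, so the entry is gone
            result.append(t.tid)
        return result
```

In this sketch the per-tuple overhead is only a slot number, which is the sense in which page-level information should cost less than a full xid pair on every index tuple.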

d. This is the way to eventually have index-organized tables.  Once an index is visibility-aware, it becomes possible
to store data there without a heap, but with snapshots and transactions.  It would also be possible to achieve more
unification between heap and index access methods.  Imagine that the heap is a TID => tuple map and an index is an index_key => tuple
map.  I think the reason we distinguish between the heap access method (which is hardcoded) and index access methods is that
the heap is visibility-aware and indexes are not.  Once both are visibility-aware, there is not much difference between them:
they are just different kinds of maps, and they could implement the same interface.  Imagine you could choose between a heap-organized table
and an index-organized table just by choosing its primary access method.  If you select heap as the primary access method, indexes would
refer to TIDs.  If you select a btree on column id as the primary access method, indexes would refer to the id column.  That would be great extensibility,
way better than what we have now.
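
As a rough illustration of the "different kinds of maps behind one interface" idea, here is a small Python sketch.  The class names (PrimaryAccessMethod, HeapAM, BtreeOrganizedAM, SecondaryIndex) are hypothetical, a dict stands in for a real btree, and visibility handling is left out entirely; the only point is that a secondary index stores whatever locator the primary access method hands back.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Hashable, List, Optional

Row = Dict[str, Any]


class PrimaryAccessMethod(ABC):
    """Hypothetical common interface: a map from a locator to a tuple."""

    @abstractmethod
    def insert(self, row: Row) -> Hashable:
        """Store the row and return the locator that secondary indexes should keep."""

    @abstractmethod
    def fetch(self, locator: Hashable) -> Optional[Row]:
        """Return the row stored under the locator, or None if there is none."""


class HeapAM(PrimaryAccessMethod):
    """Heap-organized storage: the locator is a TID (here just a counter)."""

    def __init__(self) -> None:
        self._rows: Dict[int, Row] = {}
        self._next_tid = 0

    def insert(self, row: Row) -> int:
        tid = self._next_tid
        self._next_tid += 1
        self._rows[tid] = row
        return tid

    def fetch(self, locator: Hashable) -> Optional[Row]:
        return self._rows.get(locator)


class BtreeOrganizedAM(PrimaryAccessMethod):
    """Index-organized storage: the locator is the value of the key column."""

    def __init__(self, key_column: str) -> None:
        self.key_column = key_column
        self._rows: Dict[Hashable, Row] = {}  # stand-in for a real btree

    def insert(self, row: Row) -> Hashable:
        key = row[self.key_column]
        if key in self._rows:
            raise KeyError(f"duplicate key {key!r}")
        self._rows[key] = row
        return key

    def fetch(self, locator: Hashable) -> Optional[Row]:
        return self._rows.get(locator)


class SecondaryIndex:
    """Maps a column value to locators; it does not care whether they are TIDs or keys."""

    def __init__(self, primary: PrimaryAccessMethod, column: str) -> None:
        self.primary = primary
        self.column = column
        self._entries: Dict[Hashable, List[Hashable]] = {}

    def add(self, row: Row, locator: Hashable) -> None:
        self._entries.setdefault(row[self.column], []).append(locator)

    def lookup(self, value: Hashable) -> List[Optional[Row]]:
        return [self.primary.fetch(loc) for loc in self._entries.get(value, [])]
```

Switching a table from HeapAM() to BtreeOrganizedAM("id") changes what the secondary index stores (a TID versus an id value) without changing the SecondaryIndex code, which is the kind of extensibility described above.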

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
