Re: Index only scan paving the way for "auto" clustered tables?

From: Kääriäinen Anssi
Subject: Re: Index only scan paving the way for "auto" clustered tables?
Date:
Msg-id: BC19EF15D84DC143A22D6A8F2590F0A7886413307D@EXMAIL.stakes.fi
In reply to: Re: Index only scan paving the way for "auto" clustered tables?  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Index only scan paving the way for "auto" clustered tables?
List: pgsql-hackers
Robert Haas wrote:
"""
And it seems to me that there could easily be format changes that
would make sense for particular cases, but not across the board,
like:

- index-organized tables (heap is a btree, and secondary indexes
reference the PK rather than the TID; this is how MySQL does it, and
Oracle offers it as an option)
- WORM tables (no updates or deletes, and no inserts after creating
transaction commits, allowing a much smaller tuple header)
- non-transactional tables (tuples visible as soon as they're written,
again allowing for smaller tuple header; useful for internal stuff and
perhaps for insert-only log tables)
"""

This is probably a silly idea, but I have been wondering about the
following: instead of storing visibility info in each row header,
have a couple of row visibility slots in the page header. These slots
could be shared between rows in the page, so that a bulk
insert/update/delete would use only one slot. If the slots
overflow, you would use an external slots buffer.

When the row is all visible, no slot would be used at all.

The xmin, xmax and cid would be in the slots. ctid would keep its
current meaning, except when an external slot is used: then ctid
would point to the external slot, and the external slot would hold
the real row header. I don't know if there are any other row header
parts which could be shared.

The external slots buffer would then contain xmin, xmax, cid and
the real ctid.
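
To make the slot idea concrete, here is a rough Python sketch of the
bookkeeping described above. It is only an illustration, not PostgreSQL
code: the names Visibility, Row, Page and PAGE_SLOTS are made up, the limit
of 4 slots per page is an assumption (the text only says "a couple"), and
the ("ext", index) tuple is a stand-in for a ctid that points at an
external slot.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

PAGE_SLOTS = 4  # assumption: the post only says "a couple" of slots per page


@dataclass(frozen=True)
class Visibility:
    xmin: int
    xmax: int  # 0 stands for "not deleted" in this sketch
    cid: int


@dataclass
class Row:
    data: str
    ctid: Tuple          # (block, offset), or ("ext", index) after overflow
    slot: Optional[int]  # index into Page.slots; None when the row overflowed


@dataclass
class Page:
    block: int
    slots: List[Visibility] = field(default_factory=list)
    rows: List[Row] = field(default_factory=list)

    def insert(self, data, vis, ext):
        real_ctid = (self.block, len(self.rows) + 1)
        if vis in self.slots:                # share an existing in-page slot
            row = Row(data, real_ctid, self.slots.index(vis))
        elif len(self.slots) < PAGE_SLOTS:   # claim a free in-page slot
            self.slots.append(vis)
            row = Row(data, real_ctid, len(self.slots) - 1)
        else:                                # overflow: external entry keeps the real ctid
            ext.append((vis, real_ctid))
            row = Row(data, ("ext", len(ext) - 1), None)
        self.rows.append(row)
        return row


ext = []                      # hypothetical external slots buffer
page = Page(block=0)

# A bulk insert by one transaction shares a single slot.
bulk = Visibility(xmin=100, xmax=0, cid=0)
for i in range(50):
    page.insert(f"row{i}", bulk, ext)
slots_after_bulk = len(page.slots)

# Many small transactions exhaust the in-page slots and spill over.
for xid in range(101, 110):
    page.insert("small", Visibility(xmin=xid, xmax=0, cid=0), ext)
```

In this toy run the 50 bulk-inserted rows consume one slot, while the nine
single-row transactions fill the remaining three slots and push six rows
into the external buffer.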

Updates would write the new rows to another page in the heap,
and old rows would stay in place, just as now. So there would not
be any redo-log-like configuration. Also, the external slots buffer
would be small (18 bytes per row), so it would not fall out of
cache too easily.
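
The 18-byte figure matches 4 bytes each for xmin, xmax and cid plus a
6-byte ctid (4-byte block number and 2-byte offset), which are the widths
these fields have in the current heap tuple header. The packed layout below
is an assumption made for illustration, and EXTERNAL_SLOT and pack_slot are
hypothetical names:

```python
import struct

# Assumed layout: xmin, xmax, cid as 32-bit unsigned integers, then the real
# ctid as a 32-bit block number and a 16-bit offset. "<" disables padding.
EXTERNAL_SLOT = struct.Struct("<IIIIH")


def pack_slot(xmin, xmax, cid, block, offset):
    """Pack one external slot entry into its fixed-size byte form."""
    return EXTERNAL_SLOT.pack(xmin, xmax, cid, block, offset)


entry = pack_slot(100, 0, 0, 7, 3)
entries_per_mib = (1024 * 1024) // EXTERNAL_SLOT.size

print(EXTERNAL_SLOT.size)   # 18
print(entries_per_mib)      # 58254
```

So under these assumptions one megabyte of buffer would hold entries for
roughly 58 thousand rows.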

The performance would suck if you had lots of small updates or
long-running transactions. On the other hand, in data warehousing,
where bulk loads are normal and there are a lot of small rows,
this could actually work.

As said, this is probably a silly idea. But as pluggable heap types
came up, I thought I would ask if this could actually work. If
speculative posts like this are inappropriate for this list, please
tell me so that I can avoid them in the future.
- Anssi Kääriäinen

