Re: MaxOffsetNumber for Table AMs

From	Robert Haas
Subject	Re: MaxOffsetNumber for Table AMs
Date	May 5, 2021, 20:56:47
Msg-id	CA+TgmoZeVGJ0_SNhJkFG=+OPD4GUKYEMwNJNo4vPtSK4Tn2cJQ@mail.gmail.com
In reply to	Re: MaxOffsetNumber for Table AMs (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: MaxOffsetNumber for Table AMs (Peter Geoghegan <pg@bowt.ie>)
List	pgsql-hackers

On Wed, May 5, 2021 at 1:15 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > I don't think this is true at all. If you have a clustered index -
> > i.e. the table is physically arranged according to the index ordering
> > - then your secondary indexes all pretty much have to be what we're
> > calling indirect indexes. They can hardly point to a physical
> > identifier if rows are being moved around. I believe InnoDB works this
> > way, and I think Oracle's index-organized tables do too. I suspect
> > there are other examples.
>
> But these systems don't have indirect indexes *on a heap table*! Why
> would they ever do it that way? They already have rowid/TID as a
> stable identifier of logical rows, so having indirect indexes that
> point to a heap table's rows would be strictly worse than the generic
> approach for indexes on a heap table.

One advantage of indirect indexes is that you can potentially avoid a
lot of writes to the index. If a non-HOT update is performed, but the
primary key is not updated, the index does not need to be touched. I
think that's a potentially significant savings, even if bottom-up
index deletion would have prevented the page splits. Similarly, you
can mark a dead line pointer unused without having to scan the
indirect index, because the index isn't pointing to that dead line
pointer anyway.
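
To make the write-savings argument concrete, here is a toy model, not
PostgreSQL code. Everything in it (ToyTable, the "email" column, the write
counters) is made up for illustration. It assumes every update is non-HOT,
so each new tuple version lands at a new TID, and it counts how many
secondary-index entries a direct (value -> TID) index and an indirect
(value -> primary key) index would each have to write.

```python
class ToyTable:
    """Toy heap with one secondary index on 'email', kept in two flavors."""

    def __init__(self):
        self.heap = {}             # tid -> row (one entry per tuple version)
        self.next_tid = 0
        self.pk_index = {}         # primary key -> TID of the live version
        self.direct = set()        # direct index entries: (email, tid)
        self.indirect = set()      # indirect index entries: (email, pk)
        self.direct_writes = 0     # entries written to the direct index
        self.indirect_writes = 0   # entries written to the indirect index

    def _store(self, row):
        # Every version gets a fresh TID (assumption: no HOT updates).
        tid = self.next_tid
        self.next_tid += 1
        self.heap[tid] = dict(row)
        return tid

    def insert(self, row):
        tid = self._store(row)
        self.pk_index[row["id"]] = tid
        self.direct.add((row["email"], tid))
        self.direct_writes += 1
        self.indirect.add((row["email"], row["id"]))
        self.indirect_writes += 1

    def update(self, pk, **changes):
        old = self.heap[self.pk_index[pk]]
        new = {**old, **changes}
        tid = self._store(new)
        # The primary key index itself still has to follow the new version.
        self.pk_index[pk] = tid
        # Direct index: the TID changed, so a new entry is always needed.
        self.direct.add((new["email"], tid))
        self.direct_writes += 1
        # Indirect index: touched only if the indexed value changed.
        entry = (new["email"], pk)
        if entry not in self.indirect:
            self.indirect.add(entry)
            self.indirect_writes += 1
```

After one insert followed by an update that changes only a non-indexed
column (say, t.update(1, name="B")), direct_writes is 2 while
indirect_writes stays at 1.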

Hmm, but I guess you have another cleanup problem. What prevents
someone from inserting a new row with the same primary key as a
previously-deleted row but different values in some indirectly-indexed
column? Then the old index entries, if still present, could mistakenly
refer to the new row. I don't know whether Alvaro thought of that
problem when he was working on this previously, or whether he solved
it somehow. Possibly that's a big enough problem that the whole idea
is dead in the water, but it's not obvious to me that this is so.
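
The hazard can be sketched in the same toy style. This is again an
illustration with made-up names (ToyIndirectIndex, lookup_naive,
lookup_recheck), not a description of what Alvaro's patch did. It assumes
index entries are cleaned up lazily, so an entry for a deleted row can
outlive it. The recheck variant is just one conceivable guard, included to
show what "mistakenly refer to the new row" means; I'm not claiming it is
sufficient in general.

```python
class ToyIndirectIndex:
    """Rows keyed by primary key, plus an indirect index on 'email'."""

    def __init__(self):
        self.rows = {}      # pk -> live row
        self.entries = []   # indirect index entries: (email, pk)

    def insert(self, pk, email):
        self.rows[pk] = {"id": pk, "email": email}
        self.entries.append((email, pk))

    def delete(self, pk):
        # Assumption: index cleanup is lazy, so the old (email, pk)
        # entry stays behind after the row is gone.
        del self.rows[pk]

    def lookup_naive(self, email):
        # Follow every matching entry to whatever row now owns that pk.
        return [self.rows[pk] for e, pk in self.entries
                if e == email and pk in self.rows]

    def lookup_recheck(self, email):
        # Hypothetical guard: re-verify the indexed value on the row.
        return [r for r in self.lookup_naive(email) if r["email"] == email]
```

If you insert pk 1 with "old@example.com", delete it, and insert pk 1
again with "new@example.com", then lookup_naive("old@example.com") returns
the new row, while lookup_recheck("old@example.com") returns nothing.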

And, anyway, this whole argument is predicated on the fact that the
only table AM we have right now is heapam. If we had a table AM that
organized the data by primary key value, we'd still want to be able to
have secondary indexes, and they'd have to use the primary key value
as the TID.

> I think that global indexes are well worth having, and should be
> solved some completely different way. The partition key can be an
> additive thing.

I agree that the partition identifier should be an additive thing, but
where would we add it? It seems to me that the obvious answer is to
make it a column of the index tuple. And if we can do that, why can't
we put whatever kind of TID-like stuff people want in the index tuple,
too? Maybe part of the problem here is that I don't actually
understand how posting lists are represented...

-- 
Robert Haas
EDB: http://www.enterprisedb.com

