Re: HOT documentation README

Поиск
Список
Период
Сортировка
От Pavan Deolasee
Тема Re: HOT documentation README
Дата
Msg-id 2e78013d0709112349k111dc1den7d49cb30d34599f8@mail.gmail.com
обсуждение исходный текст
Ответ на Re: HOT documentation README  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: HOT documentation README  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-patches


On 9/12/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Bruce Momjian <bruce@momjian.us> writes:
> Current README.HOT version:
>       ftp://momjian.us/pub/postgresql/mypatches/README.HOT

I've done some further editorializing and wordsmithing on this,
and marked with XXX some areas that seem like they're still red flags
(or at least inadequately described).


Thanks a lot.
 



CREATE INDEX CONCURRENTLY
-------------------------

In the concurrent case we take a different approach.  We create the
pg_index entry immediately, before we scan the table.  The pg_index entry
is marked as "not ready for inserts".  Then we commit and wait for any
transactions which have the table open to finish.  This ensures that no
*new* HOT updates will change the key value for our new index, because all
transactions will see the existence of the index and will respect its
constraint on which updates can be HOT.

After waiting for transactions which had the table open, we build the
index with a snapshot.  Because we waited for anyone who existed before
the index was created, any tuples seen in the snapshot will have only
valid forward-growing HOT chains.  (They might still have older HOT
updates behind them which are broken.)  As above, we point the index
pointer at the root of the HOT-update chain but we use the key from the
end of the chain.

We mark the index open for inserts (but still not ready for reads) then
we again wait for transactions which have the table open.  Then we take
a second reference snapshot and validate the index.  This searches for
tuples missing from the index --- but it again has to look up the root
of the HOT chains and search for those tuples in the index.

Then we wait until every transaction that could have a snapshot older than
the second reference snapshot is finished.  This ensures that nobody is
alive any longer who could possibly see any of the older tuples in a
broken HOT chain.  (This condition is actually stronger than what is
needed to protect the broken HOT chains --- the problem we are dealing
with is tuples inserted after we took our first reference snapshot.
Those will be part of valid HOT-chains, but might not be in the index.)

XXX the above being the case, why should we bother with all this extra
mechanism?  Seems we could dispense with "not ready for inserts".



The "not ready for inserts" logic is necessary because we have separated
the steps of creating the catalog entry for the new index and actual bottom-up
building of the index into separate transactions. My worry was once the catalog
entry is committed, an index insert won't work with the concurrent index build.

I guess I also need to explain why we need to separate the catalog entry
creation and index building activities into separate transactions. The primary
purpose is to handle existing broken HOT chains and avoid any new
broken chains being created while we are building the index.

Without this additional step, while we are building the index in the first phase
using a snapshot, its possible that a broken HOT chain may get created
forwarding to the tuple that we index. For example:

A table with two columns a and b. Say, we already have an index on 'a'
and creating a new index on 'b'. Say, there is a tuple (0, 1): (1, 11) in
the table where (0, 1) in the line pointer of the tuple.
This is the visible tuple when we took snapshot in the first phase and so
we shall index this tuple with key 11 and TID (0, 1).  But before the first phase
finishes, the tuple is updated to (0, 2): (1, 22). This would be a HOT update
resulting in a broken chain. The root line pointer for the new tuple is still (0, 1).
CIC now waits for the updating transaction,
takes a fresh snapshot and starts the second phase. In the second phase,
it must index the new tuple (1, 22) using the key 22 and TID (0, 1) because
thats the root line pointer for the chain. Now there is already an index entry
in the new index with key 11 and TID (0, 1). Inserting another index entry with
a different key and the same TID would certainly corrupt the index.

We solve this issue by first creating a catalog entry for the new index and
committing the transaction. We then wait
out for any conflicting transactions before taking snapshot for building the
index in the first phase. This arrangement guarantees that the snapshot
used in the first phase would always satisfy tuples at the head of the
existing HOT chains. At the same time, no new broken HOT chains are
created while we are building the index because the new index
(even though not yet ready) is consulted for all subsequent UPDATEs.

Hope this helps.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

В списке pgsql-patches по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: prevent invalidly encoded input
Следующее
От: "Pavan Deolasee"
Дата:
Сообщение: Re: HOT documentation README