Thread: Another HOT thought: why do we need indcreatexid at all?

Another HOT thought: why do we need indcreatexid at all?

From: Tom Lane

AFAICS, the whole indcreatexid and validForTxn business is a waste of
code.  By the time CREATE INDEX CONCURRENTLY is ready to set indisvalid,
surely any transactions that could see the broken HOT chains are gone.
There might have been some reason for this contraption before we had
plan invalidation, but what use is it now?
        regards, tom lane


Re: Another HOT thought: why do we need indcreatexid at all?

From: Gregory Stark

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> AFAICS, the whole indcreatexid and validForTxn business is a waste of
> code.  By the time CREATE INDEX CONCURRENTLY is ready to set indisvalid,
> surely any transactions that could see the broken HOT chains are gone.
> There might have been some reason for this contraption before we had
> plan invalidation, but what use is it now?

It sounds like you're missing one of the big problems HOT ran into. When you
create a new index your new index could include columns which were previously
not covered in any index. So there could be pre-existing HOT chains which
would no longer be eligible for HOT treatment. The README called such chains
"broken HOT chains" and has some more information about them.

Nobody who can see any old tuples in such chains can risk using your new index
since the chain will be indexed under the "wrong" key. *New* transactions can
use the index however since they'll only see the head of the chain which is
the key the chain is indexed under. It's the old transactions which can see
the old key values which aren't included in the index.

Or do you see some other reason that plan invalidation can solve this problem?
We looked for and tried a lot of different approaches to solve this problem.
This was the lowest impact solution and the only one that was convincingly
correct (imho).


--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com
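
A minimal sketch of the broken-chain hazard described in the message above, as a toy Python model. Everything here is hypothetical (TupleVersion, visible, seq_scan, build_index, index_scan, and the simplified visibility rule); none of it is PostgreSQL code. It assumes the new index holds a single entry per chain, filed under the key of the version that new transactions see.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TupleVersion:
    key: str              # value of the newly indexed column
    xmin: int             # transaction that created this version
    xmax: Optional[int]   # transaction that replaced it, None if still live


def visible(v, snapshot_xid):
    # Toy rule: created before the snapshot and not yet replaced as of it.
    created_before = v.xmin < snapshot_xid
    not_yet_replaced = v.xmax is None or v.xmax >= snapshot_xid
    return created_before and not_yet_replaced


def seq_scan(chain, key, snapshot_xid):
    # Ground truth: look at every version in the chain.
    return [v for v in chain if v.key == key and visible(v, snapshot_xid)]


def build_index(chain):
    # Assumption: one index entry for the whole chain, keyed by the
    # latest version only. Older key values are not indexed.
    return {chain[-1].key: chain}


def index_scan(index, key, snapshot_xid):
    return [v for v in index.get(key, [])
            if v.key == key and visible(v, snapshot_xid)]


# The indexed column changed from "a" to "b" in transaction 20.
chain = [TupleVersion("a", 10, 20), TupleVersion("b", 20, None)]
index = build_index(chain)
old_snapshot, new_snapshot = 15, 30

# The old snapshot still sees key "a", but the index has no entry for it.
print(seq_scan(chain, "a", old_snapshot))
print(index_scan(index, "a", old_snapshot))
# A new snapshot sees only key "b", which is where the chain is indexed.
print(index_scan(index, "b", new_snapshot))
```

In this model the old snapshot gets a row from the sequential scan and nothing from the index, which is the wrong-answer case the message warns about. The new snapshot gets the same answer either way.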


Re: Another HOT thought: why do we need indcreatexid at all?

From: Gregory Stark

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> AFAICS, the whole indcreatexid and validForTxn business is a waste of
> code.  By the time CREATE INDEX CONCURRENTLY is ready to set indisvalid,
> surely any transactions that could see the broken HOT chains are gone.
> There might have been some reason for this contraption before we had
> plan invalidation, but what use is it now?

Argh, sorry, rereading your message I see there are a few details which I
missed which completely change the meaning of it. Ignore my previous mail :(


--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Re: Another HOT thought: why do we need indcreatexid at all?

From: Gregory Stark

"Gregory Stark" <stark@enterprisedb.com> writes:

> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>
>> AFAICS, the whole indcreatexid and validForTxn business is a waste of
>> code.  By the time CREATE INDEX CONCURRENTLY is ready to set indisvalid,
>> surely any transactions that could see the broken HOT chains are gone.
>> There might have been some reason for this contraption before we had
>> plan invalidation, but what use is it now?
>
> Argh, sorry, rereading your message I see there are a few details which I
> missed which completely change the meaning of it. Ignore my previous mail :(

In answer to the real question you were actually asking, I believe you're
correct that CREATE INDEX CONCURRENTLY should never need to set indcreatexid.
Only regular non-concurrent CREATE INDEX needs to protect against that
problem.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Re: Another HOT thought: why do we need indcreatexid at all?

From: Tom Lane

Gregory Stark <stark@enterprisedb.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>> AFAICS, the whole indcreatexid and validForTxn business is a waste of
>>> code.  By the time CREATE INDEX CONCURRENTLY is ready to set indisvalid,
>>> surely any transactions that could see the broken HOT chains are gone.

> In answer to the real question you were actually asking, I believe you're
> correct that CREATE INDEX CONCURRENTLY should never need to set indcreatexid.
> Only regular non-concurrent CREATE INDEX needs to protect against that
> problem.

Argh, I'd momentarily gotten concurrent and nonconcurrent cases backwards.

I would still desperately like to get rid of indcreatexid, though,
because the patch's existing mechanism for clearing it is junk.
There's no guarantee that it will get cleared before it wraps around,
because the clearing is attached to vacuuming of the wrong table.
Maybe you could make it work by special-casing vacuuming of pg_index
itself, but the whole thing's a crock anyway.

[ thinks some more ... ] Hmm, maybe instead of an explicit XID stored in
the pg_index row proper, we could use the xmin of the pg_index row
itself?  That's already got a working mechanism for getting frozen.
        regards, tom lane
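
A small sketch of why a stored XID that never gets cleared becomes a problem at wraparound. It assumes the modulo-2^32 comparison PostgreSQL applies to normal transaction IDs and ignores the special permanent XIDs; xid_precedes and the sample values are illustrative names, not code from the patch.

```python
MASK32 = 0xFFFFFFFF


def xid_precedes(a, b):
    # Circular comparison: a is "older" than b when the 32-bit
    # difference (a - b), read as a signed integer, is negative.
    diff = (a - b) & MASK32
    return diff >= 0x80000000


stored_xid = 1000                        # e.g. an XID saved in a catalog row
soon_after = 2000
much_later = stored_xid + 2**31 + 5      # more than 2^31 transactions later

print(xid_precedes(stored_xid, soon_after))   # stored XID looks old
print(xid_precedes(stored_xid, much_later))   # stored XID now looks "in the future"
```

Once more than 2^31 transactions have passed, the unchanged stored value no longer compares as older than the current XID, so any check based on it flips. Freezing exists to replace such values before that point, which is why reusing a field that is already frozen by vacuum is attractive.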


Re: Another HOT thought: why do we need indcreatexid at all?

From: Pavan Deolasee



On 9/14/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I would still desperately like to get rid of indcreatexid, though,
> because the patch's existing mechanism for clearing it is junk.
> There's no guarantee that it will get cleared before it wraps around,
> because the clearing is attached to vacuuming of the wrong table.
> Maybe you could make it work by special-casing vacuuming of pg_index
> itself, but the whole thing's a crock anyway.


Hmm.. I kind of agree, though I thought the base table must receive
a vacuum for wrap-around purpose (because it contained a
RECENTLY_DEAD tuple) and we should be able to fix indcreatexid
in that context. But if we do anything to get away from it, that will be great.


> [ thinks some more ... ] Hmm, maybe instead of an explicit XID stored in
> the pg_index row proper, we could use the xmin of the pg_index row
> itself?  That's already got a working mechanism for getting frozen.


I think this is a great idea. I think we can use relation->indextuple to
get the pg_index row's xmin. But we need to add appropriate relcache
invalidation when we freeze a tuple (at least for pg_index tuples) and
reload this information into relation->indextuple in RelationReloadIndexInfo().
Am I on the right track?

Thanks,
Pavan
 
--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com
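
One way to picture the proposal in this last exchange, as a toy Python check rather than actual backend code. The function index_usable and its arguments are hypothetical. The sketch assumes that a frozen xmin is represented by the reserved value 2 (PostgreSQL's FrozenTransactionId) and that an index is safe for a transaction once the pg_index row's xmin is older than the oldest XID that transaction's snapshot can still see.

```python
FROZEN_XID = 2          # reserved value meaning "older than everything"
MASK32 = 0xFFFFFFFF


def xid_precedes(a, b):
    # Circular 32-bit comparison for normal XIDs.
    return ((a - b) & MASK32) >= 0x80000000


def index_usable(index_row_xmin, snapshot_xmin):
    # A frozen row is visible to everyone, so no old transaction
    # that could see a broken HOT chain can remain.
    if index_row_xmin == FROZEN_XID:
        return True
    # Otherwise the index is safe only if it was created before
    # the oldest transaction this snapshot can still see.
    return xid_precedes(index_row_xmin, snapshot_xmin)


print(index_usable(FROZEN_XID, 500))   # frozen row: always usable
print(index_usable(1000, 900))         # snapshot predates the index: skip it
print(index_usable(1000, 1500))        # snapshot is newer: usable
```

The point of the design is that nothing extra has to be cleared: ordinary vacuum freezing of the pg_index row makes the first branch true before the comparison in the second branch could wrap around. The cached copy of the row would still need refreshing after a freeze, which is the relcache invalidation concern raised in the message.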