Обсуждение: Re: [GENERAL] Using an SMP machine to make multiple indices on the same table

Поиск
Список
Период
Сортировка

Re: [GENERAL] Using an SMP machine to make multiple indices on the same table

От
Tom Lane
Дата:
Martin Weinberg <weinberg@osprey.astro.umass.edu> writes:
> Yes, I understand locking the table, but empirically, two index
> creations will not run simultaneously on the same table.

Hmm, on trying it you are right.  The second index creation blocks here:

#6  0x1718e0 in XactLockTableWait (xid=17334) at lmgr.c:344
#7  0x9e530 in heap_mark4update (relation=0xc1be62f8, tuple=0x7b03b7f0,   buffer=0x7b03b828) at heapam.c:1686
#8  0xcb410 in LockClassinfoForUpdate (relid=387785, rtup=0x7b03b7f0,   buffer=0x7b03b828, confirmCommitted=0 '\000')
atindex.c:1131
 
#9  0xcb534 in IndexesAreActive (relid=387785, confirmCommitted=1 '\001')   at index.c:1176
#10 0xf0f04 in DefineIndex (heapRelationName=0x400aab20 "tenk1",   indexRelationName=0x400aab00 "anotherj",
accessMethodName=0x59f48"btree",   attributeList=0x400aab80, unique=0, primary=0, predicate=0x0,   rangetable=0x0) at
indexcmds.c:133
#11 0x17e118 in ProcessUtility (parsetree=0x400aaba0, dest=Remote)   at utility.c:905

Essentially it's trying to do a SELECT FOR UPDATE on the pg_class tuple
of the relation before it starts building the index.

I have opined before that LockClassinfoForUpdate is a mistake that
shouldn't exist at all, since acquiring the proper lock on the relation
ought to be sufficient.  I see no need for locking the pg_class tuple,
and certainly none for doing so at the beginning of the operation rather
than the end.

Hiroshi, I think you defended it last time; any comments?
        regards, tom lane


Re: [GENERAL] Using an SMP machine to make multiple indices on the same

От
Hiroshi Inoue
Дата:
Tom Lane wrote:
> 
> Martin Weinberg <weinberg@osprey.astro.umass.edu> writes:
> > Yes, I understand locking the table, but empirically, two index
> > creations will not run simultaneously on the same table.
> 
> Hmm, on trying it you are right.  The second index creation blocks here:
> 
> #6  0x1718e0 in XactLockTableWait (xid=17334) at lmgr.c:344
> #7  0x9e530 in heap_mark4update (relation=0xc1be62f8, tuple=0x7b03b7f0,
>     buffer=0x7b03b828) at heapam.c:1686
> #8  0xcb410 in LockClassinfoForUpdate (relid=387785, rtup=0x7b03b7f0,
>     buffer=0x7b03b828, confirmCommitted=0 '\000') at index.c:1131
> #9  0xcb534 in IndexesAreActive (relid=387785, confirmCommitted=1 '\001')
>     at index.c:1176
> #10 0xf0f04 in DefineIndex (heapRelationName=0x400aab20 "tenk1",
>     indexRelationName=0x400aab00 "anotherj", accessMethodName=0x59f48 "btree",
>     attributeList=0x400aab80, unique=0, primary=0, predicate=0x0,
>     rangetable=0x0) at indexcmds.c:133
> #11 0x17e118 in ProcessUtility (parsetree=0x400aaba0, dest=Remote)
>     at utility.c:905
> 
> Essentially it's trying to do a SELECT FOR UPDATE on the pg_class tuple
> of the relation before it starts building the index.
> 
> I have opined before that LockClassinfoForUpdate is a mistake that
> shouldn't exist at all, since acquiring the proper lock on the relation
> ought to be sufficient.

As I've already mentioned many times I never agree with you.

> I see no need for locking the pg_class tuple,
> and certainly none for doing so at the beginning of the operation rather
> than the end.
> 
> Hiroshi, I think you defended it last time; any comments?

Hmm the excluive row level lock by FOR UPDATE is too strong
in this case. OK I would change IndexesAreActive() to not
acquire a lock on the pg_class tuple for user tables because
reindex doesn't need to handle relhasindex for user tables
since 7.1.

regards,
Hiroshi Inoue


Re: [GENERAL] Using an SMP machine to make multiple indices on the same

От
"Hiroshi Inoue"
Дата:
> -----Original Message-----
> From: Hiroshi Inoue
>
> Tom Lane wrote:
> >
> > Martin Weinberg <weinberg@osprey.astro.umass.edu> writes:
> > > Yes, I understand locking the table, but empirically, two index
> > > creations will not run simultaneously on the same table.
> >
> > Hmm, on trying it you are right.  The second index creation blocks here:
> >
> > #6  0x1718e0 in XactLockTableWait (xid=17334) at lmgr.c:344
> > #7  0x9e530 in heap_mark4update (relation=0xc1be62f8, tuple=0x7b03b7f0,
> >     buffer=0x7b03b828) at heapam.c:1686
> > #8  0xcb410 in LockClassinfoForUpdate (relid=387785, rtup=0x7b03b7f0,
> >     buffer=0x7b03b828, confirmCommitted=0 '\000') at index.c:1131
> > #9  0xcb534 in IndexesAreActive (relid=387785,
> confirmCommitted=1 '\001')
> >     at index.c:1176
> > #10 0xf0f04 in DefineIndex (heapRelationName=0x400aab20 "tenk1",
> >     indexRelationName=0x400aab00 "anotherj",
> accessMethodName=0x59f48 "btree",
> >     attributeList=0x400aab80, unique=0, primary=0, predicate=0x0,
> >     rangetable=0x0) at indexcmds.c:133
> > #11 0x17e118 in ProcessUtility (parsetree=0x400aaba0, dest=Remote)
> >     at utility.c:905
> >
> > Essentially it's trying to do a SELECT FOR UPDATE on the pg_class tuple
> > of the relation before it starts building the index.
> >
> > I have opined before that LockClassinfoForUpdate is a mistake that
> > shouldn't exist at all, since acquiring the proper lock on the relation
> > ought to be sufficient.
>
> As I've already mentioned many times I never agree with you.
>
> > I see no need for locking the pg_class tuple,
> > and certainly none for doing so at the beginning of the operation rather
> > than the end.
> >
> > Hiroshi, I think you defended it last time; any comments?
>
> Hmm the excluive row level lock by FOR UPDATE is too strong
> in this case. OK I would change IndexesAreActive() to not
> acquire a lock on the pg_class tuple for user tables because
> reindex doesn't need to handle relhasindex for user tables
> since 7.1.

In the end, I changed DefineIndex() to not call IndexesAreActive().

regards,
Hiroshi Inoue



Re: [GENERAL] Using an SMP machine to make multiple indices on the same

От
Tom Lane
Дата:
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> In the end, I changed DefineIndex() to not call IndexesAreActive().

I saw that.  But is it a good solution?  If someone has deactivated
indexes on a user table (ie turned off relhasindex), then creating a
new index would activate them again, which would probably be bad.

I have realized that this code is wrong anyway, because it doesn't
acquire ShareLock on the relation until far too late; all the setup
processing is done with no lock at all :-(.  LockClassinfoForUpdate
provided a little bit of security against concurrent schema changes,
though not enough.

Also, I'm now a little worried about whether concurrent index creations
will actually work.  Both CREATE INDEX operations will try to update
the pg_class tuple to set relhasindex true.  Since they use
simple_heap_update for that, the second one is likely to fail
because simple_heap_update doesn't handle concurrent updates.

I think what we probably want is
1. Acquire ShareLock at the very start.
2. Check for indexes present but relhasindex = false,   if so complain.
3. Build the index.
4. Update pg_class tuple, being prepared for concurrent   updates (ie, do NOT use simple_heap_update here).

I still don't see any value in LockClassinfoForUpdate, however.
        regards, tom lane


Re: [GENERAL] Using an SMP machine to make multiple indices on

От
Hiroshi Inoue
Дата:
Tom Lane wrote:
> 
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > In the end, I changed DefineIndex() to not call IndexesAreActive().
> 
> I saw that.  But is it a good solution?  If someone has deactivated
> indexes on a user table (ie turned off relhasindex), then creating a
> new index would activate them again, which would probably be bad.

I apolgize my neglect of reconsidering the activte/deactivate
stuff for indexes. Probably it is no longer needed now(since 7.1).
Reindex under postmaster for user tables has been available 
from the first. I didn't write a documentation about it inten-
tionally in 7.0 though it was my neglect also in 7.1 sorry.
In 7.0 REINDEX set relhasindex to false first to tell all 
backends that the indexes are unavailable because we wasn't 
able to recreate indexes safely in case of abort. Note
that relhasindex was set immediately(out of transactional
control) in 7.0 and acruiring a lock for the pg_class tuple
was very critical.
Since 7.1 we are able to recreate indexes safely under 
postmaster and REINDEX doesn't set relhasindex to false
for user tables. Though REINDEX deactivates the indexes of
system tables the deactivation is done under transactional
control and other backends never see the deactivated 
relhasindex.

> 
> I have realized that this code is wrong anyway, because it doesn't
> acquire ShareLock on the relation until far too late; all the setup
> processing is done with no lock at all :-(.  LockClassinfoForUpdate
> provided a little bit of security against concurrent schema changes,
> though not enough.
> 
> Also, I'm now a little worried about whether concurrent index creations
> will actually work.  Both CREATE INDEX operations will try to update
> the pg_class tuple to set relhasindex true.

Yes but there's a big difference. It's at the end of the creation
not at the beginning. Also note that UpdateStats() updates pg_class
tuple in case of B-trees etc before updating relhasindex. I'm
suspicios if we should update Stats under the transactional control. 
 Since they use
> simple_heap_update for that, the second one is likely to fail
> because simple_heap_update doesn't handle concurrent updates.
> 
> I think what we probably want is
> 
>         1. Acquire ShareLock at the very start.
> 
>         2. Check for indexes present but relhasindex = false,
>            if so complain.
> 
>         3. Build the index.
> 
>         4. Update pg_class tuple, being prepared for concurrent
>            updates (ie, do NOT use simple_heap_update here).
> 
> I still don't see any value in LockClassinfoForUpdate, however.

ISTM to rely on completely the lock for the corresponding
relation is a little misplaced. For example ALTER TABLE OWNER
doesn't acquire any lock on the table but it seems natural to me.
UPDATE pg_class set .. doesn't acquire any lock on the correspoding
relations of the target pg_class tuples but it seems natural to me,

regards,
Hiroshi Inoue


Re: [GENERAL] Using an SMP machine to make multiple indices on the same

От
Tom Lane
Дата:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> Tom Lane wrote:
>> Also, I'm now a little worried about whether concurrent index creations
>> will actually work.  Both CREATE INDEX operations will try to update
>> the pg_class tuple to set relhasindex true.

> Yes but there's a big difference. It's at the end of the creation
> not at the beginning. Also note that UpdateStats() updates pg_class
> tuple in case of B-trees etc before updating relhasindex. I'm
> suspicios if we should update Stats under the transactional control. 

It would probably be good to fix things so that there's only one update
done for both stats and relhasindex, instead of two.  But we *will* get
failures in simple_heap_update if we continue to use that routine.
The window for failure may be relatively short but it's real.  It's not
necessarily short, either; consider multiple CREATE INDEX commands
executed in a transaction block.

>> I still don't see any value in LockClassinfoForUpdate, however.

> ISTM to rely on completely the lock for the corresponding
> relation is a little misplaced.

Surely we *must* be able to rely on the relation lock.  For example:
how does SELECT FOR UPDATE of the relation's pg_class tuple prevent
writers from adding tuples to the relation?  It does not and cannot.
Only getting the appropriate relation lock provides a semantically
correct guarantee that the relation isn't changing underneath us.
Locking the pg_class tuple only locks the tuple itself, it has no wider
scope of meaning.

> For example ALTER TABLE OWNER
> doesn't acquire any lock on the table but it seems natural to me.

Seems like a bug to me.  Consider this scenario:

Backend 1                Backend 2

begin;

lock table1;

select from table1;  -- works
                alter table1 set owner ...

select from table1;  -- fails, no permissions

That should not happen.  It wouldn't happen if ALTER TABLE OWNER
were acquiring an appropriate lock on the relation.

> UPDATE pg_class set .. doesn't acquire any lock on the correspoding
> relations of the target pg_class tuples but it seems natural to me,

While we allow knowledgeable users to poke at the system catalogs
directly, I feel that that is very much a "let the user beware"
facility.  I have no urge to try to guarantee cross-backend
transactional safety for changes executed that way.  But CREATE INDEX,
ALTER TABLE, and so forth should have safe concurrent behavior.
        regards, tom lane


Re: [GENERAL] Using an SMP machine to make multiple indices on

От
Hiroshi Inoue
Дата:
Tom Lane wrote:
> 
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > Tom Lane wrote:
> >> Also, I'm now a little worried about whether concurrent index creations
> >> will actually work.  Both CREATE INDEX operations will try to update
> >> the pg_class tuple to set relhasindex true.
> 
> > Yes but there's a big difference. It's at the end of the creation
> > not at the beginning. Also note that UpdateStats() updates pg_class
> > tuple in case of B-trees etc before updating relhasindex. I'm
> > suspicios if we should update Stats under the transactional control.
> 
> It would probably be good to fix things so that there's only one update
> done for both stats and relhasindex, instead of two.  

I don't fully agree with you at this point. It's pretty painful
to update relatively irrevalent items at a time in some cases.
UpdateStats() had updated both reltuples and relhasindex before 7.0.
It's me who changed UpdateStats() to not update relhasindex when
I implemented REINDEX command. Reindex has to set relhasindex to 
true after all the indexes of a table were recreated.

> But we *will* get
> failures in simple_heap_update if we continue to use that routine.
> The window for failure may be relatively short but it's real.  It's not
> necessarily short, either; consider multiple CREATE INDEX commands
> executed in a transaction block.
> 
> >> I still don't see any value in LockClassinfoForUpdate, however.
> 
> > ISTM to rely on completely the lock for the corresponding
> > relation is a little misplaced.
> 
> Surely we *must* be able to rely on the relation lock.  For example:
> how does SELECT FOR UPDATE of the relation's pg_class tuple prevent
> writers from adding tuples to the relation? It does not and cannot.

I've never said that the relation lock is unnecessary.
The stuff around relhasindex is(was) an exception that keeps
a (possibly) long term lock for the pg_class tuple apart from 
the relevant relation lock.
What I've mainly intended is to guard our(at least my) code.
If our(my) code acquires an AccessExclusiveLock on a relation
and would update the correspoing pg_class tuple, I'd like to
get the locked tuple not the unlocked one because I couldn't
change unlocked tuples without anxiety. That's almost all.
In most cases the AccessExclusiveLock on the relation would
already block other backends which must be blocked as you
say and so the lock on the pg_class tuple would cause few
additional lock conflicts. Where are disadvantages to get
locked pg_class tuples ?

> Only getting the appropriate relation lock provides a semantically
> correct guarantee that the relation isn't changing underneath us.
> Locking the pg_class tuple only locks the tuple itself, it has no wider
> scope of meaning.
> 
> > For example ALTER TABLE OWNER
> > doesn't acquire any lock on the table but it seems natural to me.
> 
> Seems like a bug to me.  Consider this scenario:
> 
> Backend 1                               Backend 2
> 
> begin;
> 
> lock table1;
> 
> select from table1;  -- works
> 
>                                         alter table1 set owner ...
> 
> select from table1;  -- fails, no permissions
> 
> That should not happen.  It wouldn't happen if ALTER TABLE OWNER
> were acquiring an appropriate lock on the relation.

Hmm ok agreed. One of my intentions is to guard our(my) code
from such careless(?) applications.

> 
> > UPDATE pg_class set .. doesn't acquire any lock on the correspoding
> > relations of the target pg_class tuples but it seems natural to me,
> 
> While we allow knowledgeable users to poke at the system catalogs
> directly, I feel that that is very much a "let the user beware"
> facility.

Me too. Again what I intend is to guard our(my) code from
such knowledgeable users not guarantee them an expected(?)
result.

regards,
Hiroshi Inoue