Обсуждение: Indexes vs. cache flushes

Поиск
Список
Период
Сортировка

Indexes vs. cache flushes

От
Tom Lane
Дата:
I've been working on getting the system to pass regression tests cleanly
when forcing a cache flush at every possible instant.  The main tests
pass now (in 8.1 --- HEAD remains broken pending lookup_rowtype_tupdesc
fix), but contrib is still crashing.  On investigation the problem turns
out to be in index_getprocinfo(), which tries to load up a cached
FmgrInfo for an index support function.  If the support function is not
a built-in C function, then fmgr_info() will need to open pg_proc to
look it up.  If a cache flush occurs in the course of that lookup,
the FmgrInfo we're trying to store into goes away!  Havoc ensues of
course.

After looking at this for a bit, it seems the cleanest fix is for
RelationClearRelation() to treat any open index the same way it
currently handles nailed indexes --- ie, don't do anything except
re-read the pg_class record.  Then we won't try to flush and rebuild the
cached index support info, and the problem doesn't arise.

This would still support REINDEX (which changes pg_class.relfilenode in
order to replace the physical file) and ALTER INDEX SET TABLESPACE.
But you couldn't make any meaningful changes in the definition of an
index, such as changing its column set, operator classes, partial-index
predicate, etc, except by dropping and recreating it.

Now this is true today, and it doesn't seem likely to me that we'd
ever want to relax it (since any such change would probably require
rebuilding the index anyway).  But does anyone see that differently?
        regards, tom lane


Re: Indexes vs. cache flushes

От
Greg Stark
Дата:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> This would still support REINDEX (which changes pg_class.relfilenode in
> order to replace the physical file) and ALTER INDEX SET TABLESPACE.
> But you couldn't make any meaningful changes in the definition of an
> index, such as changing its column set, operator classes, partial-index
> predicate, etc, except by dropping and recreating it.
> 
> Now this is true today, and it doesn't seem likely to me that we'd
> ever want to relax it (since any such change would probably require
> rebuilding the index anyway).  But does anyone see that differently?

The only example that comes to mind of something you might want to be able to
twiddle and wouldn't expect to be a slow operation is making a unique index a
non-unique index.

-- 
greg



Re: Indexes vs. cache flushes

От
Tom Lane
Дата:
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> But you couldn't make any meaningful changes in the definition of an
>> index, such as changing its column set, operator classes, partial-index
>> predicate, etc, except by dropping and recreating it.

> The only example that comes to mind of something you might want to be able to
> twiddle and wouldn't expect to be a slow operation is making a unique index a
> non-unique index.

I think actually that that would still work, so long as you acquired
exclusive lock on the parent table first (which you'd have to do anyway,
because this would constitute a significant change to the table's schema
--- it could invalidate plans for example).  The lock would guarantee
that no one has the index open.  It's only in the case of an opened
index that I propose not flushing the index support info.

The concerns that I find more interesting are changes in the underlying
objects.  We don't have an ALTER OPERATOR CLASS, much less an ALTER
ACCESS METHOD, but it's certainly theoretically possible to change the
definition of a support function used by an index.  There isn't
presently any mechanism to force timely propagation of such a change,
and so you'd be largely on your own --- but realistically, wouldn't such
a change require rebuilding the index anyway?
        regards, tom lane


Re: Indexes vs. cache flushes

От
Martijn van Oosterhout
Дата:
On Thu, Jan 19, 2006 at 02:46:11AM -0500, Tom Lane wrote:
> The concerns that I find more interesting are changes in the underlying
> objects.  We don't have an ALTER OPERATOR CLASS, much less an ALTER
> ACCESS METHOD, but it's certainly theoretically possible to change the
> definition of a support function used by an index.  There isn't
> presently any mechanism to force timely propagation of such a change,
> and so you'd be largely on your own --- but realistically, wouldn't such
> a change require rebuilding the index anyway?

I wondered about the same problem when dealing with the collation
stuff. If you change anything about a collation, you essentially have
to invalidate any indexes, plans or views using it. Like you say,
there's isn't really a way of doing this.

This isn't the first time I've wondered about a flag on the an index
stating "broken, pending rebuild". If one of these details changes, we
really need to stop using the indexes ASAP until they're back into a
consistant state. The only question remaining is when to do the
rebuild: you can either wait for manual intervention or (probably
better) simply do the REINDEXes in the same transaction as the ALTER
OPERATOR CLASS et al.

The locking would be painful, perhaps a better way would be to clone
the opclass, clone the indexes with the new opclass, build them and
then drop the old indexes. Once you've rebuilt the views and indexes,
simply delete the old opclass next database restart since plans don't
survive DB restart, right?. (Isn't this Read-Copy-Update style
locking?).

Have a nice day,

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: Indexes vs. cache flushes

От
"Jim C. Nasby"
Дата:
On Thu, Jan 19, 2006 at 02:46:11AM -0500, Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
> > Tom Lane <tgl@sss.pgh.pa.us> writes:
> >> But you couldn't make any meaningful changes in the definition of an
> >> index, such as changing its column set, operator classes, partial-index
> >> predicate, etc, except by dropping and recreating it.
> 
> > The only example that comes to mind of something you might want to be able to
> > twiddle and wouldn't expect to be a slow operation is making a unique index a
> > non-unique index.
> 
> I think actually that that would still work, so long as you acquired
> exclusive lock on the parent table first (which you'd have to do anyway,
> because this would constitute a significant change to the table's schema
> --- it could invalidate plans for example).  The lock would guarantee
> that no one has the index open.  It's only in the case of an opened
> index that I propose not flushing the index support info.
> 
> The concerns that I find more interesting are changes in the underlying
> objects.  We don't have an ALTER OPERATOR CLASS, much less an ALTER
> ACCESS METHOD, but it's certainly theoretically possible to change the
> definition of a support function used by an index.  There isn't
> presently any mechanism to force timely propagation of such a change,
> and so you'd be largely on your own --- but realistically, wouldn't such
> a change require rebuilding the index anyway?

How would this affect changing the type of a column? Specifically, I'm
thinking of the case of domains, where it would be very handy if the
type of a domain could be changed (possibly with some restrictions). IE:
CREATE DOMAIN name varchar(50); is now too small and you want to up it
to varchar(64). Granted, not supported now, though changing constraints
is currently supported.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: Indexes vs. cache flushes

От
Tom Lane
Дата:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> How would this affect changing the type of a column?

It doesn't, because we drop and rebuild indexes completely during ALTER
COLUMN TYPE.
        regards, tom lane