Обсуждение: Useless code in RelationCacheInitializePhase3

Поиск
Список
Период
Сортировка

Useless code in RelationCacheInitializePhase3

От
Tom Lane
Дата:
While looking at the pending patch to clean up management of
rd_partcheck, I noticed that RelationCacheInitializePhase3 has code that
purports to reload rd_partkey and rd_partdesc, but none for rd_partcheck.
However, that reload code is dead code, as is easily confirmed by
checking the code coverage report, because we have no partitioned system
catalogs.

Moreover, if somebody tried to add such a catalog, I'd bet a good deal
of money that this code would not work.  It seems highly unlikely that
we could run RelationBuildPartitionKey or RelationBuildPartitionDesc
successfully when we haven't even finished bootstrapping the relcache.

I don't think that this foolishness is entirely the fault of the
partitioning work; it's evidently modeled on the adjacent code to reload
rules, triggers, and row security code.  But that code is all equally
dead, equally unlikely to work if somebody tried to invoke it, and
equally likely to be forever unused because there are many other
problems you'd have to surmount to support something like triggers or
row security on system catalogs.

I'm inclined to remove almost everything below the comment
"Fix data that isn't saved in relcache cache file", and replace
it with either assertions or test-and-elogs that say that a
relation appearing in the cache file can't have triggers or rules
or row security or be partitioned.

I am less sure about whether the table-access-method stanza is as silly
as the rest, but I do see that it's unreached in current testing.
So I wonder whether there is any thought that we'd realistically support
catalogs with nondefault AMs, and if there is, does anyone think that
this code would work?

            regards, tom lane



Re: Useless code in RelationCacheInitializePhase3

От
Andres Freund
Дата:
Hi,

On 2019-04-12 14:17:11 -0400, Tom Lane wrote:
> While looking at the pending patch to clean up management of
> rd_partcheck, I noticed that RelationCacheInitializePhase3 has code that
> purports to reload rd_partkey and rd_partdesc, but none for rd_partcheck.
> However, that reload code is dead code, as is easily confirmed by
> checking the code coverage report, because we have no partitioned system
> catalogs.
> 
> Moreover, if somebody tried to add such a catalog, I'd bet a good deal
> of money that this code would not work.  It seems highly unlikely that
> we could run RelationBuildPartitionKey or RelationBuildPartitionDesc
> successfully when we haven't even finished bootstrapping the relcache.

But it sure would be nice if we made it work at some point. Having
e.g. global, permanent + unlogged, and temporary tables attributes in a
separate pg_attribute would be quite an advantage (and much easier than
a separate pg_class).  Obviously even that is *far* from trivial.


> I don't think that this foolishness is entirely the fault of the
> partitioning work; it's evidently modeled on the adjacent code to reload
> rules, triggers, and row security code.  But that code is all equally
> dead, equally unlikely to work if somebody tried to invoke it, and
> equally likely to be forever unused because there are many other
> problems you'd have to surmount to support something like triggers or
> row security on system catalogs.

I don't see us wanting to go to supporting triggers, but I could see us
desiring RLS at some point. To hide rows a user doesn't have access to.


> I am less sure about whether the table-access-method stanza is as silly
> as the rest, but I do see that it's unreached in current testing.
> So I wonder whether there is any thought that we'd realistically support
> catalogs with nondefault AMs, and if there is, does anyone think that
> this code would work?

Right now it definitely won't work, most importantly because there's a
fair bit of catalog related code that triggers direct
heap_insert/update/delete, and expects systable_getnext() to not need
memory to allocate the result in the current context (hence the
!shouldFree assert) and just generally because a lot of places just
straight up assume the catalog is heap.

Most of that would be fairly easy to fix however. A lot of rote work,
but technically not hard. The hardest is probably a bunch of code that
uses xmin for cache validation and such, but that seems solvable.

I don't quite know however how we'd use the ability to technically be
able to have a different AM for catalog tables. One possible thing would
be using different builtin AMs for different catalog tables, that seems
like it'd not be too hard.  But after that it gets harder - e.g. doing
an initdb with a different default AM sounds not impossible, but also
far from easy (we can't do pg_proc lookups before having initialized it,
which is why formrdesc hardcodes GetHeapamTableAmRoutine()).  And having
different AMs per database seems even harder.

I think it probably would work for catalog tables, as it's coded right
now. There's no catalog lookups RelationInitTableAccessMethod() for
tables that return true for IsCatalogTable(). In fact, I think we should
apply something like:

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 64f3c2e8870..7ff64b108c4 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1746,6 +1746,7 @@ RelationInitTableAccessMethod(Relation relation)
         * seem prudent to show that in the catalog. So just overwrite it
         * here.
         */
+       Assert(relation->rd_rel->relam == InvalidOid);
        relation->rd_amhandler = HEAP_TABLE_AM_HANDLER_OID;
    }
    else if (IsCatalogRelation(relation))
@@ -1935,8 +1936,7 @@ formrdesc(const char *relationName, Oid relationReltype,
    /*
     * initialize the table am handler
     */
-   relation->rd_rel->relam = HEAP_TABLE_AM_OID;
-   relation->rd_tableam = GetHeapamTableAmRoutine();
+   RelationInitTableAccessMethod(relation);
 
    /*
     * initialize the rel-has-index flag, using hardwired knowledge

To a) ensure that that is and stays the case b) avoid having the
necessary information in multiple places.  Not sure why we not ended up
doing the thing in the second hunk earlier. Just using
RelationInitTableAccessMethod() seems cleaner to me.

Greetings,

Andres Freund



Re: Useless code in RelationCacheInitializePhase3

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2019-04-12 14:17:11 -0400, Tom Lane wrote:
>> While looking at the pending patch to clean up management of
>> rd_partcheck, I noticed that RelationCacheInitializePhase3 has code that
>> purports to reload rd_partkey and rd_partdesc, but none for rd_partcheck.
>> However, that reload code is dead code, as is easily confirmed by
>> checking the code coverage report, because we have no partitioned system
>> catalogs.
>> 
>> Moreover, if somebody tried to add such a catalog, I'd bet a good deal
>> of money that this code would not work.  It seems highly unlikely that
>> we could run RelationBuildPartitionKey or RelationBuildPartitionDesc
>> successfully when we haven't even finished bootstrapping the relcache.

> But it sure would be nice if we made it work at some point.

Whether it would be nice or not is irrelevant to my point: this code
doesn't work, and it's unlikely that it would ever be part of a working
solution.  I don't think there's any way that it'd be sane to attempt
catalog accesses during RelationCacheInitializePhase3.  If we want any
of these features for system catalogs, I think the route to a real fix
would be to make them load-on-demand data so that they can be fetched
later on.  Or, possibly, the easiest way is to include these data
structures in the dumped cache file.  But what's here is a dead end.
I'd even call it an attractive nuisance, because it encourages people
to add yet more nonfunctional code, rather than pointing them in the
direction of doing something useful.

>> I am less sure about whether the table-access-method stanza is as silly
>> as the rest, but I do see that it's unreached in current testing.
>> So I wonder whether there is any thought that we'd realistically support
>> catalogs with nondefault AMs, and if there is, does anyone think that
>> this code would work?

> Right now it definitely won't work,

Sure, I wasn't expecting that.  The question is the same as above:
is it plausible that this code would appear in this form in a complete
working implementation?  If not, I think we should rip it out rather
than leave the impression that we think it does something useful.

> I think it probably would work for catalog tables, as it's coded right
> now. There's no catalog lookups RelationInitTableAccessMethod() for
> tables that return true for IsCatalogTable(). In fact, I think we should
> apply something like:

Makes sense, and I'd also add some comments pointing out that there had
better not be any catalog lookups when this is called for a system
catalog.

            regards, tom lane



Re: Useless code in RelationCacheInitializePhase3

От
Tom Lane
Дата:
I wrote:
> Whether it would be nice or not is irrelevant to my point: this code
> doesn't work, and it's unlikely that it would ever be part of a working
> solution.  I don't think there's any way that it'd be sane to attempt
> catalog accesses during RelationCacheInitializePhase3.

BTW, to clarify that: obviously, this loop *does* access pg_class, and
pg_class's indexes too.  The issue here is that if any of these other
stanzas ever really executed, we would be doing accesses to a bunch of
other catalogs as well, meaning that their relcache entries would have to
already exist in a state valid enough to permit access.  That would mean
that they'd have to be treated as bootstrap catalogs so that we could
create hardwired entries with formrdesc.  That's not a direction I want
to go in.  Bootstrap catalogs are a huge pain to maintain; we don't want
any more than the absolute minimum of them.

            regards, tom lane



Re: Useless code in RelationCacheInitializePhase3

От
Alvaro Herrera from 2ndQuadrant
Дата:
Hello,

On 2019-Apr-13, Tom Lane wrote:

> Andres Freund <andres@anarazel.de> writes:

> > I think it probably would work for catalog tables, as it's coded right
> > now. There's no catalog lookups RelationInitTableAccessMethod() for
> > tables that return true for IsCatalogTable(). In fact, I think we should
> > apply something like:
> 
> Makes sense, and I'd also add some comments pointing out that there had
> better not be any catalog lookups when this is called for a system
> catalog.

I think this was forgotten ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services