On Wed, Jun 13, 2012 at 7:28 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> This patch is problematic because formally indexes used by syscaches needs to
> be unique, this one is not though because of 0/InvalidOids entries for
> nailed/shared catalog entries. Those values aren't allowed to be queried though.
That's not the only reason it's not unique. Take a look at
GetRelFileNode(). We really only guarantee that <database OID,
tablespace OID, relfilenode, backend-ID>, taken as a four-tuple, is
unique. You could have the same relfilenode in different tablespaces,
or even within the same tablespace with different backend-IDs. The
latter might not matter for you because you're presumably disregarding
temp tables, but the former probably does. It's an uncommon scenario
because we normally set relid = relfilenode, and of course relid is
unique across the database, but the table gets rewritten then you end
up with relid != relfilenode, and I don't think there's anything at
that point that will prevent the new relfilenode from being chosen as
some other relations relfilenode, as long as it's in a different
tablespace.
I think the solution may be to create a specialized cache for this,
rather than relying on the general syscache infrastructure. You might
look at, e.g., attoptcache.c for an example. That would allow you to
build a cache that is aware of things like the relmapper
infrastructure, and the fact that temp tables are ignorable for your
purposes. But I think you will need to include at least the
tablespace OID in the key along with the relfilenode to make it
bullet-proof.
I haven't read through the patch series far enough to know what this
is being used for yet, but my fear is that you're using it to handle
mapping a relflenode extracted from the WAL stream back to a relation
OID. The problem with that is that relfilenode assignments obey
transaction semantics. So, if someone begins a transaction, truncates
a table, inserts a tuple, and commits, the heap_insert record is going
to refer to a relfilenode that, according to the system catalogs,
doesn't exist. This is similar to one of the worries in my other
email, so I won't belabor the point too much more here...
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company