Обсуждение: CreateFakeRelcacheEntry versus Hot Standby
I was rather surprised to find this code still present: /* * We set up the lockRelId in case anything tries to lock the dummy * relation. Note that this is fairly bogus since relNodemay be * different from the relation's OID. It shouldn't really matter though, * since we are presumably runningby ourselves and can't have any lock * conflicts ... */rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;rel->rd_lockInfo.lockRelId.relId= rnode.relNode; Seems quite unsafe in HS. regards, tom lane
Tom Lane wrote: > I was rather surprised to find this code still present: > > /* > * We set up the lockRelId in case anything tries to lock the dummy > * relation. Note that this is fairly bogus since relNode may be > * different from the relation's OID. It shouldn't really matter though, > * since we are presumably running by ourselves and can't have any lock > * conflicts ... > */ > rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode; > rel->rd_lockInfo.lockRelId.relId = rnode.relNode; > > Seems quite unsafe in HS. Good catch. I started looking at the callers of CreateFakeRelcacheEntry to see if any of them actually take a lock on the Relation. It seems not, so maybe we should just set those to InvalidOid, and add an Assert(OidIsValid(dbId) && OidIsValid(relId)) to LockAcquire() to ensure that a fake relcache entry is not used to grab locks in the future either. However, I spotted another bug. In ginContinueSplit(), we call CreateFakeRelcacheEntry(), and pass the fake relcache entry to prepareEntryScan() or prepareDataScan(). Those functions store the passed-in Relation ptr in the passed in GinBtreeData struct. After the call, we free the fake relcache entry, but the pointer is still present in the GinBtreeData. The findParents() call later in the function uses it, and will thus access free'd memory. A trivial fix is to delay freeing the fake relcache entry in ginContinueSplit(). But this once again shows that our WAL redo functions are not as well tested as they should be, and the WAL redo cleanup functions in particularly. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > I started looking at the callers of CreateFakeRelcacheEntry to see if > any of them actually take a lock on the Relation. It seems not, so maybe > we should just set those to InvalidOid, and add an > Assert(OidIsValid(dbId) && OidIsValid(relId)) to LockAcquire() to ensure > that a fake relcache entry is not used to grab locks in the future either. +1, I was thinking that would be a reasonable thing to try. (You can't assert dbID != 0, but the relid test should be sufficient.) > However, I spotted another bug. In ginContinueSplit(), we call > CreateFakeRelcacheEntry(), and pass the fake relcache entry to > prepareEntryScan() or prepareDataScan(). Those functions store the > passed-in Relation ptr in the passed in GinBtreeData struct. After the > call, we free the fake relcache entry, but the pointer is still present > in the GinBtreeData. The findParents() call later in the function uses > it, and will thus access free'd memory. > A trivial fix is to delay freeing the fake relcache entry in > ginContinueSplit(). But this once again shows that our WAL redo > functions are not as well tested as they should be, and the WAL redo > cleanup functions in particularly. Well, HS will at least help us shake out the redo functions. Cleanup is a hard case to test ... regards, tom lane