missing RelationCloseSmgr in FreeFakeRelcacheEntry?

From Andres Freund
Subject missing RelationCloseSmgr in FreeFakeRelcacheEntry?
Date
Msg-id 20131029011623.GJ20248@awork2.anarazel.de
Replies Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?
Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?
List pgsql-hackers
Hi,

Earlier I started running the regression tests under valgrind with
valgrind --error-exitcode=122 (to make the regression tests fail
visibly), but it crashed frequently...

One of them was:

==2184== Invalid write of size 8
==2184==    at 0x76787F: smgrclose (smgr.c:284)
==2184==    by 0x4ED717: xact_redo_commit_internal (xact.c:4676)
==2184==    by 0x4ED7FF: xact_redo_commit (xact.c:4712)
==2184==    by 0x4EDB0D: xact_redo (xact.c:4838)
==2184==    by 0x50355D: StartupXLOG (xlog.c:6809)
==2184==    by 0x70AA1E: StartupProcessMain (startup.c:224)
==2184==    by 0x512B38: AuxiliaryProcessMain (bootstrap.c:429)
==2184==    by 0x709C43: StartChildProcess (postmaster.c:5063)
==2184==    by 0x7086EA: PostmasterStateMachine (postmaster.c:3544)
==2184==    by 0x7072F1: reaper (postmaster.c:2801)
==2184==    by 0x57B325F: ??? (in /lib/x86_64-linux-gnu/libc-2.17.so)
==2184==    by 0x585F822: __select_nocancel (syscall-template.S:81)
==2184==  Address 0x5f63410 is 5,584 bytes inside a recently re-allocated block of size 8,192 alloc'd
==2184==    at 0x402ACEC: malloc (vg_replace_malloc.c:270)
==2184==    by 0x8B3F8E: AllocSetAlloc (aset.c:854)
==2184==    by 0x8B623B: MemoryContextAlloc (mcxt.c:581)
==2184==    by 0x8B5F93: MemoryContextCreate (mcxt.c:522)
==2184==    by 0x8B33C4: AllocSetContextCreate (aset.c:444)
==2184==    by 0x8B55DD: MemoryContextInit (mcxt.c:110)
==2184==    by 0x703B17: PostmasterMain (po

Which seems to indicate a dangling ->rd_smgr pointer, confusing the heck
out of me because I couldn't see how that'd happen.

Looking a bit closer, it seems to me that the fake relcache
infrastructure neglects the chance that something used the fake entry
to read something, which will have done a RelationOpenSmgr(). That in
turn will have set rel->rd_smgr->smgr_owner to point back at rel.
When we then pfree() the fake relation in FreeFakeRelcacheEntry() we
have a dangling pointer.
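
To make the mechanism concrete, here is a minimal sketch in plain C. It is a toy model, not PostgreSQL code: ToySMgr, ToyRelation, shared_smgr and all toy_* functions are hypothetical stand-ins for the smgr relation, the fake relcache entry, and the RelationOpenSmgr()/RelationCloseSmgr() pair. It only assumes that the long-lived storage object keeps a back-pointer into the relation that opened it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; these are NOT the real PostgreSQL structs. */
typedef struct ToySMgr
{
    struct ToySMgr **owner;     /* back-pointer to the field referencing us, or NULL */
} ToySMgr;

typedef struct ToyRelation
{
    ToySMgr *smgr;              /* plays the role of rd_smgr */
} ToyRelation;

/* One long-lived storage object that outlives any fake relation. */
static ToySMgr shared_smgr = { NULL };

/* Analogue of opening the smgr: attach it and record who owns it. */
static void
toy_open_smgr(ToyRelation *rel)
{
    if (rel->smgr == NULL)
    {
        rel->smgr = &shared_smgr;
        shared_smgr.owner = &rel->smgr;
    }
}

/* Analogue of closing the smgr: clear the back-pointer, then detach. */
static void
toy_close_smgr(ToyRelation *rel)
{
    if (rel->smgr != NULL)
    {
        rel->smgr->owner = NULL;
        rel->smgr = NULL;
    }
}

/*
 * Free a fake relation.  with_fix mirrors the idea of the proposed patch.
 * Returns true if, at the moment of freeing, the storage object still
 * pointed into rel, i.e. the back-pointer is about to dangle.
 */
static bool
toy_free_fake_relation(ToyRelation *rel, bool with_fix)
{
    bool        dangling;

    if (with_fix)
        toy_close_smgr(rel);

    /* Check before free() so we never inspect a pointer to freed memory. */
    dangling = (shared_smgr.owner == &rel->smgr);
    free(rel);
    return dangling;
}

/* Create a fake relation, "read" through it, then free it. */
static bool
toy_scenario(bool with_fix)
{
    ToyRelation *rel = calloc(1, sizeof(ToyRelation));
    bool        dangling;

    assert(rel != NULL);
    toy_open_smgr(rel);
    dangling = toy_free_fake_relation(rel, with_fix);

    /* Reset the toy state so the stale pointer is never dereferenced. */
    shared_smgr.owner = NULL;
    return dangling;
}
```

In this model toy_scenario(false) reports that the back-pointer is left dangling, while toy_scenario(true) does not, because the close step clears the back-pointer before the memory is released.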

It confuses me a bit that this doesn't cause issues during recovery more
frequently... The code seems to have been that way since
a213f1ee6c5a1bbe1f074ca201975e76ad2ed50c, which introduced fake relcache
entries. It looks like, before that, XLogCloseRelationCache() indirectly
did a RelationCloseSmgr().

diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 5429d5e..ee7c892 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -433,6 +433,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
 void
 FreeFakeRelcacheEntry(Relation fakerel)
 {
+   RelationCloseSmgr(fakerel);
    pfree(fakerel);
 }

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


