Обсуждение: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

Поиск

Список

Период

Сортировка

missing RelationCloseSmgr in FreeFakeRelcacheEntry?

От

Andres Freund

Дата:

29 октября 2013 г., 01:16:30

Hi,

I've started a valgrind run earlier when trying to run the regression
tests with valgrind --error-exitcode=122 (to cause the regression tests
to fail visibly) but it crashed frequently...

One of them was:

==2184== Invalid write of size 8
==2184==    at 0x76787F: smgrclose (smgr.c:284)
==2184==    by 0x4ED717: xact_redo_commit_internal (xact.c:4676)
==2184==    by 0x4ED7FF: xact_redo_commit (xact.c:4712)
==2184==    by 0x4EDB0D: xact_redo (xact.c:4838)
==2184==    by 0x50355D: StartupXLOG (xlog.c:6809)
==2184==    by 0x70AA1E: StartupProcessMain (startup.c:224)
==2184==    by 0x512B38: AuxiliaryProcessMain (bootstrap.c:429)
==2184==    by 0x709C43: StartChildProcess (postmaster.c:5063)
==2184==    by 0x7086EA: PostmasterStateMachine (postmaster.c:3544)
==2184==    by 0x7072F1: reaper (postmaster.c:2801)
==2184==    by 0x57B325F: ??? (in /lib/x86_64-linux-gnu/libc-2.17.so)
==2184==    by 0x585F822: __select_nocancel (syscall-template.S:81)
==2184==  Address 0x5f63410 is 5,584 bytes inside a recently re-allocated block of size 8,192 alloc'd
==2184==    at 0x402ACEC: malloc (vg_replace_malloc.c:270)
==2184==    by 0x8B3F8E: AllocSetAlloc (aset.c:854)
==2184==    by 0x8B623B: MemoryContextAlloc (mcxt.c:581)
==2184==    by 0x8B5F93: MemoryContextCreate (mcxt.c:522)
==2184==    by 0x8B33C4: AllocSetContextCreate (aset.c:444)
==2184==    by 0x8B55DD: MemoryContextInit (mcxt.c:110)
==2184==    by 0x703B17: PostmasterMain (po

Which seems to indicate a dangling ->rd_smgr pointer, confusing the heck
out of me because I couldn't see how that'd happen.

Looking a bit closer it seems to me that the fake relcache
infrastructure seems to neglect the chance that something used the fake
entry to read something which will have done a RelationOpenSmgr(). Which
in turn will have set rel->rd_smgr->smgr_owner to rel.
When we then pfree() the fake relation in FreeFakeRelcacheEntry we have
a dangling pointer.

It confuses me a bit that this doesn't cause issues during recovery more
frequently... The code seems to have been that way since
a213f1ee6c5a1bbe1f074ca201975e76ad2ed50c which introduced fake relcache
entries. It looks like XLogCloseRelationCache() indirectly has done a
RelationCloseSmgr().

diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 5429d5e..ee7c892 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -433,6 +433,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)voidFreeFakeRelcacheEntry(Relation fakerel){
+   RelationCloseSmgr(fakerel);   pfree(fakerel);}

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

От

Andres Freund

Дата:

01 ноября 2013 г., 18:36:39

Hi Heikki, All,

On 2013-10-29 02:16:23 +0100, Andres Freund wrote:
> Looking a bit closer it seems to me that the fake relcache
> infrastructure seems to neglect the chance that something used the fake
> entry to read something which will have done a RelationOpenSmgr(). Which
> in turn will have set rel->rd_smgr->smgr_owner to rel.
> When we then pfree() the fake relation in FreeFakeRelcacheEntry we have
> a dangling pointer.
> 
> It confuses me a bit that this doesn't cause issues during recovery more
> frequently... The code seems to have been that way since
> a213f1ee6c5a1bbe1f074ca201975e76ad2ed50c which introduced fake relcache
> entries. It looks like XLogCloseRelationCache() indirectly has done a
> RelationCloseSmgr().

This looks like it was caused by
a213f1ee6c5a1bbe1f074ca201975e76ad2ed50c which you committed, any chance
you could commit the trivial fix?

> diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
> index 5429d5e..ee7c892 100644
> --- a/src/backend/access/transam/xlogutils.c
> +++ b/src/backend/access/transam/xlogutils.c
> @@ -433,6 +433,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
>  void
>  FreeFakeRelcacheEntry(Relation fakerel)
>  {
> +   RelationCloseSmgr(fakerel);
>     pfree(fakerel);
>  }

Greetings,

Andres Freund

Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

От

Heikki Linnakangas

Дата:

04 ноября 2013 г., 07:38:44

On 29.10.2013 03:16, Andres Freund wrote:
> Hi,
>
> I've started a valgrind run earlier when trying to run the regression
> tests with valgrind --error-exitcode=122 (to cause the regression tests
> to fail visibly) but it crashed frequently...
>
> One of them was:
>
> ==2184== Invalid write of size 8
> ==2184==    at 0x76787F: smgrclose (smgr.c:284)
> ==2184==    by 0x4ED717: xact_redo_commit_internal (xact.c:4676)
> ==2184==    by 0x4ED7FF: xact_redo_commit (xact.c:4712)
> ==2184==    by 0x4EDB0D: xact_redo (xact.c:4838)
> ==2184==    by 0x50355D: StartupXLOG (xlog.c:6809)
> ==2184==    by 0x70AA1E: StartupProcessMain (startup.c:224)
> ==2184==    by 0x512B38: AuxiliaryProcessMain (bootstrap.c:429)
> ==2184==    by 0x709C43: StartChildProcess (postmaster.c:5063)
> ==2184==    by 0x7086EA: PostmasterStateMachine (postmaster.c:3544)
> ==2184==    by 0x7072F1: reaper (postmaster.c:2801)
> ==2184==    by 0x57B325F: ??? (in /lib/x86_64-linux-gnu/libc-2.17.so)
> ==2184==    by 0x585F822: __select_nocancel (syscall-template.S:81)
> ==2184==  Address 0x5f63410 is 5,584 bytes inside a recently re-allocated block of size 8,192 alloc'd
> ==2184==    at 0x402ACEC: malloc (vg_replace_malloc.c:270)
> ==2184==    by 0x8B3F8E: AllocSetAlloc (aset.c:854)
> ==2184==    by 0x8B623B: MemoryContextAlloc (mcxt.c:581)
> ==2184==    by 0x8B5F93: MemoryContextCreate (mcxt.c:522)
> ==2184==    by 0x8B33C4: AllocSetContextCreate (aset.c:444)
> ==2184==    by 0x8B55DD: MemoryContextInit (mcxt.c:110)
> ==2184==    by 0x703B17: PostmasterMain (po
>
> Which seems to indicate a dangling ->rd_smgr pointer, confusing the heck
> out of me because I couldn't see how that'd happen.
>
> Looking a bit closer it seems to me that the fake relcache
> infrastructure seems to neglect the chance that something used the fake
> entry to read something which will have done a RelationOpenSmgr(). Which
> in turn will have set rel->rd_smgr->smgr_owner to rel.
> When we then pfree() the fake relation in FreeFakeRelcacheEntry we have
> a dangling pointer.

Yeah, that's clearly a bug.

> diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
> index 5429d5e..ee7c892 100644
> --- a/src/backend/access/transam/xlogutils.c
> +++ b/src/backend/access/transam/xlogutils.c
> @@ -433,6 +433,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
>   void
>   FreeFakeRelcacheEntry(Relation fakerel)
>   {
> +   RelationCloseSmgr(fakerel);
>      pfree(fakerel);
>   }

Hmm, I don't think that's a very good solution. Firstly, it will force
the underlying files to be closed, hurting performance. Fake relcache
entries are used in heapam when clearing visibility map bits, which
might happen frequently enough for that to matter. Secondly, it will
fail if you create two fake relcache entries for the same relfilenode.
Freeing the first will close the smgr entry, and freeing the second will
try to close the same smgr entry again. We never do that in the current
code, but it seems like a possible source of bugs in the future.

How about the attached instead?

- Heikki

Вложения

fake-relcache-fix-2.patch

Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

От

Andres Freund

Дата:

04 ноября 2013 г., 09:35:33

On 2013-11-04 09:38:27 +0200, Heikki Linnakangas wrote:
> On 29.10.2013 03:16, Andres Freund wrote:
> >Hi,
> >
> >I've started a valgrind run earlier when trying to run the regression
> >tests with valgrind --error-exitcode=122 (to cause the regression tests
> >to fail visibly) but it crashed frequently...
> >
> >One of them was:
> >...
> >Which seems to indicate a dangling ->rd_smgr pointer, confusing the heck
> >out of me because I couldn't see how that'd happen.
> >
> >Looking a bit closer it seems to me that the fake relcache
> >infrastructure seems to neglect the chance that something used the fake
> >entry to read something which will have done a RelationOpenSmgr(). Which
> >in turn will have set rel->rd_smgr->smgr_owner to rel.
> >When we then pfree() the fake relation in FreeFakeRelcacheEntry we have
> >a dangling pointer.
> 
> Yeah, that's clearly a bug.
> 
> >diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
> >index 5429d5e..ee7c892 100644
> >--- a/src/backend/access/transam/xlogutils.c
> >+++ b/src/backend/access/transam/xlogutils.c
> >@@ -433,6 +433,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
> >  void
> >  FreeFakeRelcacheEntry(Relation fakerel)
> >  {
> >+   RelationCloseSmgr(fakerel);
> >     pfree(fakerel);
> >  }
> 
> Hmm, I don't think that's a very good solution. Firstly, it will force the
> underlying files to be closed, hurting performance. Fake relcache entries
> are used in heapam when clearing visibility map bits, which might happen
> frequently enough for that to matter.

Well, it will only close them when they actually were smgropen()ed which
will not always be the case. Although it probably is in the visibility
map case.

Feels like premature optimization to me.

> Secondly, it will fail if you create
> two fake relcache entries for the same relfilenode. Freeing the first will
> close the smgr entry, and freeing the second will try to close the same smgr
> entry again. We never do that in the current code, but it seems like a
> possible source of bugs in the future.

I think if we go there we'll need refcounted FakeRelcacheEntry's
anyway. Otherwise we'll end up with a relation smgropen()ed multiple
times in the same backend and such which doesn't seem like a promising
course to me since the smgr itself isn't refcounted.

> How about the attached instead?

> diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
> index 5429d5e..f732e71 100644
> --- a/src/backend/access/transam/xlogutils.c
> +++ b/src/backend/access/transam/xlogutils.c
> @@ -422,7 +422,11 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
>      rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
>      rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
>  
> -    rel->rd_smgr = NULL;
> +    /*
> +     * Open the underlying storage, but don't set rd_owner. We want the
> +     * SmgrRelation to persist after the fake relcache entry is freed.
> +     */
> +    rel->rd_smgr = smgropen(rnode, InvalidBackendId);
>  
>      return rel;
>  }

I don't really like that - that will mean we'll leak open smgr handles
for every relation touched via smgr during recovery since they will
never be closed unless the relation is dropped or such. And in some
databases there can be huge amounts of relations.
Since recovery is long lived that doesn't seem like a good idea, besides
the memory usage it will also bloat smgr's internal hash which actually
seems just as likely to hurt performance.

I think intentionally not using the owner mechanism also is dangerous -
what if we have longer lived fake relcache entries and we've just
processed sinval messages? Then we'll have a ->rd_smgr pointing into
nowhere.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

От

Heikki Linnakangas

Дата:

04 ноября 2013 г., 12:38:04

On 04.11.2013 11:35, Andres Freund wrote:
> On 2013-11-04 09:38:27 +0200, Heikki Linnakangas wrote:
>> Secondly, it will fail if you create
>> two fake relcache entries for the same relfilenode. Freeing the first will
>> close the smgr entry, and freeing the second will try to close the same smgr
>> entry again. We never do that in the current code, but it seems like a
>> possible source of bugs in the future.
>
> I think if we go there we'll need refcounted FakeRelcacheEntry's
> anyway. Otherwise we'll end up with a relation smgropen()ed multiple
> times in the same backend and such which doesn't seem like a promising
> course to me since the smgr itself isn't refcounted.
>
>> How about the attached instead?
>
>> diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
>> index 5429d5e..f732e71 100644
>> --- a/src/backend/access/transam/xlogutils.c
>> +++ b/src/backend/access/transam/xlogutils.c
>> @@ -422,7 +422,11 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
>>       rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
>>       rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
>>
>> -    rel->rd_smgr = NULL;
>> +    /*
>> +     * Open the underlying storage, but don't set rd_owner. We want the
>> +     * SmgrRelation to persist after the fake relcache entry is freed.
>> +     */
>> +    rel->rd_smgr = smgropen(rnode, InvalidBackendId);
>>
>>       return rel;
>>   }
>
> I don't really like that - that will mean we'll leak open smgr handles
> for every relation touched via smgr during recovery since they will
> never be closed unless the relation is dropped or such. And in some
> databases there can be huge amounts of relations.
> Since recovery is long lived that doesn't seem like a good idea, besides
> the memory usage it will also bloat smgr's internal hash which actually
> seems just as likely to hurt performance.

That's the way SmgrRelations are supposed to be used. Opened on first 
use, and kept open forever. We never smgrclose() during normal operation 
either, unless the relation is dropped or something like that. And see 
how XLogReadBuffer() also calls smgropen() with no corresponding 
smgrclose().

> I think intentionally not using the owner mechanism also is dangerous -
> what if we have longer lived fake relcache entries and we've just
> processed sinval messages? Then we'll have a ->rd_smgr pointing into
> nowhere.

Hmm, the startup process doesn't participate in sinval messaging at all, 
does it?

- Heikki

Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

От

Andres Freund

Дата:

04 ноября 2013 г., 12:48:41

On 2013-11-04 14:37:53 +0200, Heikki Linnakangas wrote:
> On 04.11.2013 11:35, Andres Freund wrote:
> >On 2013-11-04 09:38:27 +0200, Heikki Linnakangas wrote:
> >>diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
> >>index 5429d5e..f732e71 100644
> >>--- a/src/backend/access/transam/xlogutils.c
> >>+++ b/src/backend/access/transam/xlogutils.c
> >>@@ -422,7 +422,11 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
> >>      rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
> >>      rel->rd_lockInfo.lockRelId.relId = rnode.relNode;
> >>
> >>-    rel->rd_smgr = NULL;
> >>+    /*
> >>+     * Open the underlying storage, but don't set rd_owner. We want the
> >>+     * SmgrRelation to persist after the fake relcache entry is freed.
> >>+     */
> >>+    rel->rd_smgr = smgropen(rnode, InvalidBackendId);
> >>
> >>      return rel;
> >>  }
> >
> >I don't really like that - that will mean we'll leak open smgr handles
> >for every relation touched via smgr during recovery since they will
> >never be closed unless the relation is dropped or such. And in some
> >databases there can be huge amounts of relations.
> >Since recovery is long lived that doesn't seem like a good idea, besides
> >the memory usage it will also bloat smgr's internal hash which actually
> >seems just as likely to hurt performance.
> 
> That's the way SmgrRelations are supposed to be used. Opened on first use,
> and kept open forever. We never smgrclose() during normal operation either,
> unless the relation is dropped or something like that. And see how
> XLogReadBuffer() also calls smgropen() with no corresponding smgrclose().

Ok, that's quite the fair argument although I'd say normal backends
won't open many relations in comparison to the startup process when
doing replication. But that certainly isn't sufficient argument for
changing logic in the back branches or even HEAD.

> >I think intentionally not using the owner mechanism also is dangerous -
> >what if we have longer lived fake relcache entries and we've just
> >processed sinval messages? Then we'll have a ->rd_smgr pointing into
> >nowhere.

> Hmm, the startup process doesn't participate in sinval messaging at all,
> does it?

Well, not sinval but inval, in hot standby via commit messages.

What about just unowning the smgr entry with
if (rel->rd_smgr != NULL)  smgrsetowner(NULL, rel->rd_smgr)
when closing the fake relcache entry?

That shouldn't require any further changes than changing the comment in
smgrsetowner() that this isn't something expected to frequently happen.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

От

Andres Freund

Дата:

04 ноября 2013 г., 13:04:46

On 2013-11-04 13:48:32 +0100, Andres Freund wrote:
> > Hmm, the startup process doesn't participate in sinval messaging at all,
> > does it?
> 
> Well, not sinval but inval, in hot standby via commit messages.

Err, that's bullshit, sorry for that. We send the messages via sinval,
but never (probably at least?) process sinval messages.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

От

Andres Freund

Дата:

05 ноября 2013 г., 19:36:39

On 2013-11-04 13:48:32 +0100, Andres Freund wrote:
> What about just unowning the smgr entry with
> if (rel->rd_smgr != NULL)
>    smgrsetowner(NULL, rel->rd_smgr)
> when closing the fake relcache entry?
>
> That shouldn't require any further changes than changing the comment in
> smgrsetowner() that this isn't something expected to frequently happen.

Attached is a patch doing it like that, it required slightmy more
invasive changes than that. With the patch applied we survive replay of
a primary's make check run under valgrind without warnings.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

0001-Un-own-SMgrRelations-in-FreeFakeRelcacheEntry.patch

Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

От

Bruce Momjian

Дата:

06 марта 2014 г., 00:19:00

On Tue, Nov  5, 2013 at 08:36:32PM +0100, Andres Freund wrote:
> On 2013-11-04 13:48:32 +0100, Andres Freund wrote:
> > What about just unowning the smgr entry with
> > if (rel->rd_smgr != NULL)
> >    smgrsetowner(NULL, rel->rd_smgr)
> > when closing the fake relcache entry?
> > 
> > That shouldn't require any further changes than changing the comment in
> > smgrsetowner() that this isn't something expected to frequently happen.
> 
> Attached is a patch doing it like that, it required slightmy more
> invasive changes than that. With the patch applied we survive replay of
> a primary's make check run under valgrind without warnings.

Where are we on this patch?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +

Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

От

Heikki Linnakangas

Дата:

07 марта 2014 г., 11:50:37

On 03/06/2014 02:18 AM, Bruce Momjian wrote:
> On Tue, Nov  5, 2013 at 08:36:32PM +0100, Andres Freund wrote:
>> On 2013-11-04 13:48:32 +0100, Andres Freund wrote:
>>> What about just unowning the smgr entry with
>>> if (rel->rd_smgr != NULL)
>>>     smgrsetowner(NULL, rel->rd_smgr)
>>> when closing the fake relcache entry?
>>>
>>> That shouldn't require any further changes than changing the comment in
>>> smgrsetowner() that this isn't something expected to frequently happen.
>>
>> Attached is a patch doing it like that, it required slightmy more
>> invasive changes than that. With the patch applied we survive replay of
>> a primary's make check run under valgrind without warnings.
>
> Where are we on this patch?

Committed now, with small changes. I made the new smgrclearowner 
function to check that the SmgrRelation object is indeed owned by the 
owner we expect, so that it doesn't unown it if it's actually owned by 
someone else. That shouldn't happen, but it seemed prudent to check.

Thanks Andres. I tried to reproduce the valgrind message you reported, 
but couldn't. How did you do it? Did this commit fix it?

And thanks for the nudge, Bruce.

- Heikki

Re: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

От

Andres Freund

Дата:

04 мая 2014 г., 14:14:34

On 2014-03-07 13:50:27 +0200, Heikki Linnakangas wrote:
> Thanks Andres. I tried to reproduce the valgrind message you reported, but
> couldn't. How did you do it? Did this commit fix it?

I previously could reproduce the issue by either forcing a server into
recovery or using a replica. That seems to be fine now.

Thanks,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: missing RelationCloseSmgr in FreeFakeRelcacheEntry?

Вложения

Вложения