Re: buildfarm: strange OOM failures on markhor (running CLOBBER_CACHE_RECURSIVELY)

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: buildfarm: strange OOM failures on markhor (running CLOBBER_CACHE_RECURSIVELY)
Дата
Msg-id 20140519150804.GD11150@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: buildfarm: strange OOM failures on markhor (running CLOBBER_CACHE_RECURSIVELY)  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: buildfarm: strange OOM failures on markhor (running CLOBBER_CACHE_RECURSIVELY)
Re: buildfarm: strange OOM failures on markhor (running CLOBBER_CACHE_RECURSIVELY)
Список pgsql-hackers
Hi,

On 2014-05-18 14:49:10 -0400, Tom Lane wrote:
> AFAICT, the inner invocation's Relation should always have zero reference
> count by the time we get back to the outer invocation.  Therefore it
> should be possible for RelationCacheInsert to just delete the
> about-to-be-unreachable Relation struct.  I'm experimenting with a patch
> that adds logic like this to RelationCacheInsert:
> 
>     if (found)
>     {
>     Relation oldrel = idhentry->reldesc;
>     idhentry->reldesc = RELATION;
>     if (RelationHasReferenceCountZero(oldrel))
>         RelationDestroyRelation(oldrel, false);
>     else
>         elog(WARNING, "leaking still-referenced duplicate relation");
>     }
> 
> and so far it looks good.

If that happens we'd essentially have a too low reference count on the
entry remaining in the relcache. I'd consider putting an Assert() in
that branch. Perhaps it should also be only allowed for system
relations? If it somehow happens for user defined ones we'd loose
rd_newRelfilenodeSubid and such. Which we treat as reliable these days.

> With both of these things fixed, I'm not seeing any significant memory
> bloat during the first parallel group of the regression tests.  I don't
> think I'll have the patience to let it run much further than that
> (the uuid and enum tests are still running after an hour :-().

Yea, it's really unbearable. A *single* UPDATE took close to 5 minutes
here when testing the decoding stuff.

> BTW, it strikes me that we could probably improve the runtime of the
> CLOBBER tests noticeably if we were to nail AttrDefaultIndexId,
> IndexIndrelidIndexId, and ConstraintRelidIndexId into cache.  I see
> no very good reason not to do that; it should help performance a bit
> in normal cases too.

Sounds like a good plan.

From what I've seen here, the mojority of the time is spent iterating
over various hashes. Primarily relcache.c and catcache.c. Removing
some syscache lookups will reduce the repetitiveness of tha ta bit.

> While I'm at it: I could not help noticing RememberToFreeTupleDescAtEOX,
> which was not there last time I looked at this code.  Isn't that broken
> by design?  It's basically a deliberately induced transaction-lifespan
> memory leak, and AFAICS if it does anything at all, it's supporting
> incorrect calling code.  There should *never* be any situation where it's
> not okay to free a tupledesc with zero refcount.

I am not sure I understand that code. The whole keep_tupdesc/keep_rules
logic isn't new and a bit underdocumented as well :(. Seems to come from
491dd4a9.
If I understand correctly the issue is that callers have accessed the
tupledesc with RelationGetDescr() which doesn't increase the
refcount. If then there's rebuild of that relation's relcache entry
they'll still be looking at the old tupledesc. Which will now have a
reference count of 0. The existing code tries to handle that by not
invalidating the old tupledescs if equal to the rebuilt one. Thereby
leaving the uncounted reference from the outside in a valid state.
The reason that mostly works < 9.4 is that the tupledesc normally won't
change without an AEL lock on the relation which can't happen while
there are still open references.
But I am wondering, is that actually true, even before today? ... No,
doesn't look that way:
CREATE TABLE a(id serial primary key);
BEGIN;
DECLARE foo CURSOR FOR SELECT * FROM a;
ALTER TABLE a RENAME COLUMN id TO id2;

So, renaming a column doesn't enforce CheckTableNotInUse() which means
that it's possible to access a RelationGetDescr() before calling an
external function which might do a ALTER TABLE .. RENAME COLUMN thereby
triggering !keep_tupdesc. And freeing the existing tupledesc.
Now, it will likely work in this case because the DECLARE has probably
properly incremented the refcount, but nonetheless... I am pretty sure I
could construct a crasher with that, even < 9.4.

This is a mess...

I think the code is really only triggering if !keep_tupdesc is false
which should be a fairly minor amount of cases. So it's probably not too
bad performancewise.

>          * If we Rebuilt a relcache entry during a transaction then its
>          * possible we did that because the TupDesc changed as the result of
>          * an ALTER TABLE that ran at less than AccessExclusiveLock. It's
>          * possible someone copied that TupDesc, in which case the copy would
>          * point to free'd memory. So if we rebuild an entry we keep the
> 
> If someone copied the tupdesc, how would freeing the original result
> in the copy being damaged?

The sentence only seems to make sense with s/copy/accessed without
increasing the refcoung, e.g. with RelationGetDescr,/.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Christoph Berg
Дата:
Сообщение: Re: 9.4 beta1 crash on Debian sid/i386
Следующее
От: Tom Lane
Дата:
Сообщение: Re: buildfarm: strange OOM failures on markhor (running CLOBBER_CACHE_RECURSIVELY)