On 2016/05/19 2:48, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
>> On 2016/05/18 2:22, Tom Lane wrote:
>>> The two ways that we've dealt with this type of hazard are to copy data
>>> out of the relcache before using it; or to give the relcache the
>>> responsibility of not moving a particular portion of data if it did not
>>> change. From memory, the latter applies to the tuple descriptor and
>>> trigger data, but we've done most other things the first way.
>
> After actually looking at the code, we do things that way for the
> tupledesc, the relation's rules if any, and RLS policies --- see
> RelationClearRelation().

I think I confused the refcounting method of keeping things around with
RelationClearRelation()'s method.  I now understand that you meant the
latter in your original message.

>> It seems that tuple descriptor is reference-counted; however trigger data
>> is copied. The former seems to have been done on performance grounds (I
>> found 06e10abc).
>
> We do refcount tuple descriptors, but we've been afraid to try to rely
> completely on that; there are too many places that assume a relcache
> entry's tupdesc is safe to reference. It's not that easy to go over to
> a fully refcounted approach, because that creates a new problem of being
> sure that refcounts are decremented when necessary --- that's a pain,
> particularly when a query is abandoned due to an error.

I see.

>> So for a performance-sensitive relcache data structure, refcounting is the
>> way to go (although done quite rarely)?
>
> I'd be suspicious of this because of the cleanup problem. The
> don't-replace-unless-changed approach is the one that's actually battle
> tested.

OK, I will try RelationClearRelation()'s method of keeping the partition
descriptor data around, so that no repeated copying is necessary.
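
For illustration only, the don't-replace-unless-changed idea can be sketched
as a standalone toy, loosely modeled on the approach described above.  This
is not relcache code: ToyPartDesc, ToyRelation and the toy_* functions are
all hypothetical names, and plain malloc/free stand in for the memory
contexts the real code would use.  The point is just that a rebuild keeps
the old pointer when the freshly built data is equal, so callers holding
that pointer are not left dangling.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_PARTS 8

/* Hypothetical stand-in for a partition descriptor; not a PostgreSQL type. */
typedef struct ToyPartDesc
{
    int nparts;
    int oids[TOY_MAX_PARTS];
} ToyPartDesc;

/* Hypothetical stand-in for a relcache entry. */
typedef struct ToyRelation
{
    ToyPartDesc *partdesc;
} ToyRelation;

/* Build a descriptor from an array of ids; returns NULL on bad input or OOM. */
static ToyPartDesc *
toy_make_partdesc(int nparts, const int *oids)
{
    ToyPartDesc *pd;

    if (nparts < 0 || nparts > TOY_MAX_PARTS)
        return NULL;
    pd = calloc(1, sizeof(ToyPartDesc));
    if (pd == NULL)
        return NULL;
    pd->nparts = nparts;
    if (nparts > 0)
        memcpy(pd->oids, oids, sizeof(int) * (size_t) nparts);
    return pd;
}

/* Content equality, analogous in spirit to an equalXXX() comparison. */
static int
toy_partdesc_equal(const ToyPartDesc *a, const ToyPartDesc *b)
{
    if (a->nparts != b->nparts)
        return 0;
    return memcmp(a->oids, b->oids, sizeof(int) * (size_t) a->nparts) == 0;
}

/*
 * "Rebuild" the entry with a freshly built descriptor.  If the contents
 * did not change, the old descriptor stays at its old address and the
 * fresh one is thrown away.  Returns 1 if the old pointer was preserved,
 * 0 if it was replaced.
 */
static int
toy_rebuild(ToyRelation *rel, ToyPartDesc *fresh)
{
    assert(fresh != NULL);

    if (rel->partdesc != NULL && toy_partdesc_equal(rel->partdesc, fresh))
    {
        free(fresh);            /* unchanged: keep the old pointer valid */
        return 1;
    }
    free(rel->partdesc);        /* changed: old pointer is now invalid */
    rel->partdesc = fresh;
    return 0;
}
```

With this shape, a caller that saved rel->partdesc before a rebuild can keep
using it afterwards as long as the partition set did not actually change.
A real change still invalidates the pointer, so this only covers the
"nothing changed" case.
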
Thanks,
Amit