Re: hyrax vs. RelationBuildPartitionDesc

From: Amit Langote
Subject: Re: hyrax vs. RelationBuildPartitionDesc
Date:
Msg-id: CA+HiwqFtb_kvmrSeSHrYD-T7JiFg6Hgo8WA6Kk-Mq=27bMJ7EQ@mail.gmail.com
In reply to: Re: hyrax vs. RelationBuildPartitionDesc  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses: Re: hyrax vs. RelationBuildPartitionDesc  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Mon, Apr 15, 2019 at 5:05 PM Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2019/04/15 2:38, Tom Lane wrote:
> > So the point here is that that reasoning is faulty.  You *cannot* assume,
> > no matter how strong a lock or how many pins you hold, that a relcache
> > entry will not get rebuilt underneath you.  Cache flushes happen
> > regardless.  And unless relcache.c takes special measures to prevent it,
> > a rebuild will result in moving subsidiary data structures and thereby
> > breaking any pointers you may have pointing into those data structures.
> >
> > For certain subsidiary structures such as the relation tupdesc,
> > we do take such special measures: that's what the "keep_xxx" dance in
> > RelationClearRelation is.  However, that's expensive, both in cycles
> > and maintenance effort: it requires having code that can decide equality
> > of the subsidiary data structures, which we might well have no other use
> > for, and which we certainly don't have strong tests for correctness of.
> > It's also very error-prone for callers, because there isn't any good way
> > to cross-check that code using a long-lived pointer to a subsidiary
> > structure is holding a lock that's strong enough to guarantee non-mutation
> > of that structure, or even that relcache.c provides any such guarantee
> > at all.  (If our periodic attempts to reduce lock strength for assorted
> > DDL operations don't scare the pants off you in this connection, you have
> > not thought hard enough about it.)  So I think that even though we've
> > largely gotten away with this approach so far, it's also a half-baked
> > kluge that we should be looking to get rid of, not extend to yet more
> > cases.
>
> Thanks for the explanation.
>
> I understand that simply having a lock and a nonzero refcount on a
> relation doesn't prevent someone else from changing it concurrently.
>
> I get that we want to get rid of the keep_* kludge in the long term, but
> is it wrong to think, for example, that having keep_partdesc today allows
> us to keep the pointer to rd_partdesc as long as we're holding the
> relation open or holding a refcount on the whole relation, as with the
> PartitionDirectory mechanism?

Ah, we're also trying to fix the memory leak caused by the current
design of PartitionDirectory.  AIUI, the design assumes that the leak
would occur only in fairly rare cases, but maybe that's not so?  If
partitions are frequently attached/detached concurrently (which may
not be too uncommon if reduced lock levels encourage it), causing the
PartitionDesc of a given relation to change all the time, then a
planning session holding the PartitionDirectory that contains that
relation would leak as many PartitionDescs as there were concurrent
changes, I guess.
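A toy sketch of that accumulation (again plain Python with invented names, not PostgreSQL code): if the directory pins every PartitionDesc version it encounters so that planning sees a stable view, then each concurrent change adds one more pinned desc that stays alive until the planning session ends.

```python
# Toy model of the leak scenario: a directory that pins every
# PartitionDesc version it sees keeps all of them alive until the
# planning session ends.  Names are hypothetical.

class PartitionDirectory:
    def __init__(self):
        self.pinned = []   # desc versions kept alive for planning stability

    def lookup(self, relcache_entry):
        desc = relcache_entry.partdesc
        if desc not in self.pinned:
            self.pinned.append(desc)   # pin this version
        return desc

class Entry:
    def __init__(self):
        self.partdesc = object()

entry = Entry()
directory = PartitionDirectory()

n_concurrent_changes = 5
for _ in range(n_concurrent_changes):
    directory.lookup(entry)
    entry.partdesc = object()   # concurrent attach/detach rebuilds the desc
directory.lookup(entry)

# One desc retained per concurrent change, plus the final one.
assert len(directory.pinned) == n_concurrent_changes + 1
```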

I see that you've proposed changing the PartitionDirectory design to
copy the PartitionDesc as a way of keeping it around instead of
holding the relation open, but having to resort to that would be
unfortunate.

Thanks,
Amit


