Re: [bug?] Missed parallel safety checks, and wrong parallel safety

Поиск
Список
Период
Сортировка
От Zhihong Yu
Тема Re: [bug?] Missed parallel safety checks, and wrong parallel safety
Дата
Msg-id CALNJ-vQFqzbYK8g8WV_wqW4U8VO38oZVfUZjKz8kFEG28YRqPg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [bug?] Missed parallel safety checks, and wrong parallel safety  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: [bug?] Missed parallel safety checks, and wrong parallel safety  (Greg Nancarrow <gregn4422@gmail.com>)
RE: [bug?] Missed parallel safety checks, and wrong parallel safety  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Список pgsql-hackers


On Wed, Jun 23, 2021 at 8:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jun 16, 2021 at 6:10 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tuesday, June 15, 2021 10:01 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > Now, maybe it could be done, and I think that's worth a little more thought. For
> > example, perhaps whenever we invalidate a relation, we could also somehow
> > send some new, special kind of invalidation for its parent saying, essentially,
> > "hey, one of your children has changed in a way you might care about." But
> > that's not good enough, because it only goes up one level. The grandparent
> > would still be unaware that a change it potentially cares about has occurred
> > someplace down in the partitioning hierarchy. That seems hard to patch up,
> > again because of the locking rules. The child can know the OID of its parent
> > without locking the parent, but it can't know the OID of its grandparent without
> > locking its parent. Walking up the whole partitioning hierarchy might be an
> > issue for a number of reasons, including possible deadlocks, and possible race
> > conditions where we don't emit all of the right invalidations in the face of
> > concurrent changes. So I don't quite see a way around this part of the problem,
> > but I may well be missing something. In fact I hope I am missing something,
> > because solving this problem would be really nice.
>
> For partition, I think postgres already have the logic about recursively finding
> the parent table[1]. Can we copy that logic to send serval invalid messages that
> invalidate the parent and grandparent... relcache if change a partition's parallel safety ?
> Although, it means we need more lock(on its parents) when the parallel safety
> changed, but it seems it's not a frequent scenario and looks acceptable.
>
> [1] In generate_partition_qual()
>         parentrelid = get_partition_parent(RelationGetRelid(rel), true);
>         parent = relation_open(parentrelid, AccessShareLock);
>         ...
>         /* Add the parent's quals to the list (if any) */
>         if (parent->rd_rel->relispartition)
>                 result = list_concat(generate_partition_qual(parent), my_qual);
>

As shown by me in another email [1], such a coding pattern can lead to
deadlock. It is because in some DDL operations we walk the partition
hierarchy from top to down and if we walk from bottom to upwards, then
that can lead to deadlock. I think this is a dangerous coding pattern
and we shouldn't try to replicate it.

[1] - https://www.postgresql.org/message-id/CAA4eK1LsFpjK5gL%2B0HEvoqB2DJVOi19vGAWbZBEx8fACOi5%2B_A%40mail.gmail.com

--
With Regards,
Amit Kapila.


Hi,
How about walking the partition hierarchy bottom up, recording the parents but not taking the locks.
Once top-most parent is found, take the locks in reverse order (top down) ?

Cheers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Decoding speculative insert with toast leaks memory
Следующее
От: Greg Nancarrow
Дата:
Сообщение: Re: [bug?] Missed parallel safety checks, and wrong parallel safety