Re: Assertion failure with barriers in parallel hash join

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Re: Assertion failure with barriers in parallel hash join
Дата
Msg-id CAAKRu_am6AJEY+Y-tuWYQSK=9kMzknSCkMPiNO_tHyoNZSwGzg@mail.gmail.com
обсуждение исходный текст
Ответ на Assertion failure with barriers in parallel hash join  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: Assertion failure with barriers in parallel hash join
Список pgsql-hackers
On Wed, Mar 17, 2021 at 8:18 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Wed, Mar 17, 2021 at 6:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > According to BF animal elver there is something wrong with this
> > commit.  Looking into it.
>
> Assertion failure reproduced here and understood, but unfortunately
> it'll take some more time to fix this.  I've reverted the commit for
> now to unbreak the ~5 machines that are currently showing red in the
> build farm.

So, I think the premise of the patch
v6-0001-Fix-race-condition-in-parallel-hash-join-batch-cl.patch makes
sense: freeing the hashtable memory should happen atomically with
updating the flag that indicates to all others that the memory is not to
be accessed any longer.

As was likely reported by the buildfarm leading to you reverting the
patch, when the inner side is empty and we dump out before advancing the
build barrier past PHJ_BUILD_HASHING_OUTER, we trip the new Asserts
you've added in ExecHashTableDetach().

Assert(!pstate ||
            BarrierPhase(&pstate->build_barrier) >= PHJ_BUILD_RUNNING);

Hmm.

Maybe if the inner side is empty, we can advance the build barrier to
the end?
We help batch 0 along like this in ExecParallelHashJoinSetUpBatches().

But, I'm not sure we can expect the process executing this code to be
attached to the build barrier, can we?

@@ -296,7 +304,19 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
                                 * outer relation.
                                 */
                                if (hashtable->totalTuples == 0 &&
!HJ_FILL_OUTER(node))
+                               {
+                                       if (parallel)
+                                       {
+                                               Barrier    *build_barrier;
+
+                                               build_barrier =
¶llel_state->build_barrier;
+                                               while
(BarrierPhase(build_barrier) < PHJ_BUILD_DONE)
+
BarrierArriveAndWait(build_barrier, 0);
+                                               BarrierDetach(build_barrier);
+                                       }
+
                                        return NULL;
+                               }



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: Crash in BRIN minmax-multi indexes
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: New IndexAM API controlling index vacuum strategies