Обсуждение: Parallel bt build crashes when DSM_NONE

Поиск
Список
Период
Сортировка

Parallel bt build crashes when DSM_NONE

От
Kyotaro HORIGUCHI
Дата:
Hello.

I happend to find that server crashes during regtest when
DSM_NONE is enforced. The attached patch fixes that.

The cause is the fact that _bt_spools_heapscan runs
_bt_begin_parallel() even if dynamic_shared_memory_type is
DSM_NONE. It is because plan_create_index_workers() is ignoring
dynamic_shared_memory_type.

We can reproduce this by letting initdb set
dynamic_shared_memory_type=none regardless of actual
availability. (Second attached) and just "make check".

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 740de49..3e8cd14 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -5825,7 +5825,8 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
     double        allvisfrac;
 
     /* Return immediately when parallelism disabled */
-    if (max_parallel_maintenance_workers == 0)
+    if (dynamic_shared_memory_type == DSM_IMPL_NONE ||
+        max_parallel_maintenance_workers == 0)
         return 0;
 
     /* Set up largely-dummy planner state */
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 2efd3b7..876e153 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -871,6 +871,7 @@ choose_dsm_implementation(void)
 #ifdef HAVE_SHM_OPEN
     int            ntries = 10;
 
+    return "none";
     while (ntries > 0)
     {
         uint32        handle;

Re: Parallel bt build crashes when DSM_NONE

От
Michael Paquier
Дата:
On Fri, Feb 09, 2018 at 05:06:35PM +0900, Kyotaro HORIGUCHI wrote:
> I happend to find that server crashes during regtest when
> DSM_NONE is enforced. The attached patch fixes that.
>
> The cause is the fact that _bt_spools_heapscan runs
> _bt_begin_parallel() even if dynamic_shared_memory_type is
> DSM_NONE. It is because plan_create_index_workers() is ignoring
> dynamic_shared_memory_type.

Adding Peter Geoghegan as the author and Robert as the committer in CC,
as that's a mistake from 9da0cc35.

> We can reproduce this by letting initdb set
> dynamic_shared_memory_type=none regardless of actual
> availability. (Second attached) and just "make check".

Or more simply you can just setup an instance with this configuration
and run installcheck.  No need to patch initdb for that.

4 regression tests fail when using dynamic_shared_memory_type=none:
join, aggregates, select_parallel and write_parallel.  test_shm_mq of
course blows up.  Could that justify getting rid of DSM_IMPL_NONE?  As
far as I can see there is an alternative on Windows, and we fallback to
sysv in the worst case.  So I am wondering what's actually the use case
for "none".  And it is good to keep alternate outputs at a minimum,
those tend to rot easily.

Except for those mind errands your patch looks good to me.
--
Michael

Вложения

Re: Parallel bt build crashes when DSM_NONE

От
Peter Geoghegan
Дата:
On Fri, Feb 9, 2018 at 12:06 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> I happend to find that server crashes during regtest when
> DSM_NONE is enforced. The attached patch fixes that.
>
> The cause is the fact that _bt_spools_heapscan runs
> _bt_begin_parallel() even if dynamic_shared_memory_type is
> DSM_NONE. It is because plan_create_index_workers() is ignoring
> dynamic_shared_memory_type.

I think that your patch does the right thing.
plan_create_index_workers() is supposed to take care of parallel
safety, and this is a parallel safety issue.

Thanks
-- 
Peter Geoghegan


Re: Parallel bt build crashes when DSM_NONE

От
Robert Haas
Дата:
On Mon, Feb 12, 2018 at 12:26 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> I think that your patch does the right thing.
> plan_create_index_workers() is supposed to take care of parallel
> safety, and this is a parallel safety issue.

Committed the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Parallel bt build crashes when DSM_NONE

От
Kyotaro HORIGUCHI
Дата:
At Mon, 12 Feb 2018 15:00:54 -0500, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoZot0Pm9JqLs=_jspm78YCjeq=2u+0Oa4kg+gGVvO7Urg@mail.gmail.com>
> On Mon, Feb 12, 2018 at 12:26 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> > I think that your patch does the right thing.
> > plan_create_index_workers() is supposed to take care of parallel
> > safety, and this is a parallel safety issue.
> 
> Committed the patch.

Thank you Robert for Committing, and Peter and Michael for the
comments and supplement.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Parallel bt build crashes when DSM_NONE

От
Kyotaro HORIGUCHI
Дата:
At Mon, 12 Feb 2018 22:05:36 +0900, Michael Paquier <michael@paquier.xyz> wrote in
<20180212130536.GA18625@paquier.xyz>
> On Fri, Feb 09, 2018 at 05:06:35PM +0900, Kyotaro HORIGUCHI wrote:
> 4 regression tests fail when using dynamic_shared_memory_type=none:
> join, aggregates, select_parallel and write_parallel.  test_shm_mq of
> course blows up.  Could that justify getting rid of DSM_IMPL_NONE?  As

A query runs into the fallback route in the case, which even
though the regtest doesn't care about. So it alone doesn't
justify that.

> far as I can see there is an alternative on Windows, and we fallback to
> sysv in the worst case.  So I am wondering what's actually the use case
> for "none".  And it is good to keep alternate outputs at a minimum,
> those tend to rot easily.

Agreed. As Tom mentioned, no bf animal haven't complained with
that since 9.4 and I belive they won't. initdb doesn't create a
database with DSM_IMPL_NONE. Although it can be manually
deactivated, the fact that we haven't have a complain about that
can justify to remove it to some extent. I'll post a patch that
eliminate DSM_IMPL_NONE after this. I'd like it to be shipped on
11.

> Except for those mind errands your patch looks good to me.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Parallel bt build crashes when DSM_NONE

От
Michael Paquier
Дата:
On Wed, Feb 14, 2018 at 10:38:58AM +0900, Kyotaro HORIGUCHI wrote:
> I'll post a patch that eliminate DSM_IMPL_NONE after this. I'd like it
> to be shipped on 11.

Feel free to.  Be just careful that the connection attempts in
test_config_settings() should try to use the DSM implementation defined
by choose_dsm_implementation().
--
Michael

Вложения