Re: postgres_fdw vs. force_parallel_mode on ppc

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: postgres_fdw vs. force_parallel_mode on ppc
Дата
Msg-id CA+TgmoZQh4EJ_SbAUWhSPS8uy=Sm+VaUiAGF+foJDjbuFsVLmA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: postgres_fdw vs. force_parallel_mode on ppc  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Tue, Feb 23, 2016 at 11:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Even if there were, it would not fix this bug, because AFAICS the only
> thing that set_rel_consider_parallel is chartered to do is set the
> per-relation consider_parallel flag.  The failure that is happening in
> that regression test with force_parallel_mode turned on happens because
> standard_planner plasters a Gather node at the top of the plan, causing
> the whole plan including the FDW access to happen inside a parallel
> worker.  The only way to prevent that is to clear the
> wholePlanParallelSafe flag, which as far as I can tell (not that any of
> this is documented worth a damn) isn't something that
> set_rel_consider_parallel is supposed to do.

Hmm.  Well, if you tested it, or looked at the places where
wholePlanParallelSafe is cleared, you would find that it DOES fix the
bug.  create_plan() clears wholePlanParallelSafe if the plan is not
parallel-safe, and the plan won't be parallel-safe unless
consider_parallel was set for the underlying relation.  In case you'd
like to test it for yourself, here's the PoC patch I wrote:

diff --git a/src/backend/optimizer/path/allpaths.c
b/src/backend/optimizer/path/allpaths.c
index bcb668f..8a4179e 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -527,6 +527,11 @@ set_rel_consider_parallel(PlannerInfo *root,
RelOptInfo *rel,                                       return;                               return;
  }
 
+
+                       /* Not for foreign tables. */
+                       if (rte->relkind == RELKIND_FOREIGN_TABLE)
+                               return;
+                       break;
               case RTE_SUBQUERY:

Adding that makes the postgres_fdw case pass.

> It looks to me like there is a good deal of fuzzy thinking here about the
> difference between locally parallelizable and globally parallelizable
> plans, ie Gather at the top vs Gather somewhere else.

If you have a specific complaint, I'm happy to try to improve things,
or you can.  I think however that it is also possible that you haven't
fully understood the code I've spent the last year or so developing
yet, possibly because I haven't documented it well enough, but
possibly also because you haven't spent much time looking on it yet.
I'm glad you are, by the way, because I'm sure there are a bunch of
things here that you can improve over what I was able to do,
especially on the planner side of things, and that would be great.
However, a bit of forbearance would be appreciated.

> I also note with
> dismay that turning force_parallel_mode on seems to pretty much disable
> any testing of local parallelism.

No, I don't think so.  It doesn't push a Gather node on top of a plan
that already contains a Gather, because such a plan isn't
parallel_safe.   Nor does it suppress generation of parallel paths
otherwise.

>> In view of 69d34408e5e7adcef8ef2f4e9c4f2919637e9a06, we shouldn't
>> blindly assume that foreign scans are not parallel-safe, but we can't
>> blindly assume the opposite either.  Maybe we should assume that the
>> foreign scan is parallel-safe only if one or more of the new methods
>> introduced by the aforementioned commit are set, but actually that
>> doesn't seem quite right.  That would tell us whether the scan itself
>> can be parallelized, not whether it's safe to run serially but within
>> a parallel worker.  I think maybe we need a new FDW API that gets
>> called from set_rel_consider_parallel() with the root, rel, and rte as
>> arguments and which can return a Boolean.  If the callback is not set,
>> assume false.
>
> Meh.  As things stand, postgres_fdw would have to aver that it can't ever
> be safely parallelized, which doesn't seem like a very satisfactory answer
> even if there are other FDWs that work differently (and which would those
> be?  None that use a socket-style connection to an external server.)

file_fdw is parallel-safe, and KaiGai posted a patch that makes it
parallel-aware, though that would have needed more work than I'm
willing to put in right now to make it committable.  So in other
words...

> The commit you mention above seems to me to highlight the dangers of
> accepting hook patches with no working use-case to back them up.
> AFAICT it's basically useless for typical FDWs because of this
> multiple-connection problem.

...I didn't ignore this principal.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Victor Wagner
Дата:
Сообщение: Re: Convert pltcl from strings to objects
Следующее
От: Ashutosh Bapat
Дата:
Сообщение: Re: Pushing down sorted joins