Re: [HACKERS] postgres_fdw bug in 9.6

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: [HACKERS] postgres_fdw bug in 9.6
Дата
Msg-id CA+TgmoYsaVVi0MYrfhCww1=uP4cZtZ+uNwLfimqZN10PxzN3HQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] postgres_fdw bug in 9.6  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: [HACKERS] postgres_fdw bug in 9.6
Re: [HACKERS] postgres_fdw bug in 9.6
Список pgsql-hackers
On Fri, Apr 7, 2017 at 2:33 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 3, 2017 at 2:12 AM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>> On 2017/04/01 1:32, Jeff Janes wrote:
>>> On Thu, Mar 23, 2017 at 5:20 AM, Etsuro Fujita
>>> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:
>>>     Done.  Attached is a new version of the patch.
>>> Is the fix for 9.6.3 going to be just a back port of this, or will it
>>> look different?
>>
>> +1 for backporting; although that requires that GetForeignJoinPaths be
>> updated so that the FDW uses a new function to create an alternative local
>> join path (ie, CreateLocalJoinPath), that would make maintenance of the code
>> easy.
>
> Well, the problem here is that this breaks ABI compatibility.  If we
> applied this to 9.6, and somebody tried to use a previously-compiled
> FDW .so against a new server version, it would fail after the upgrade,
> because the new server wouldn't have GetExistingLocalJoinPath and also
> possibly because of the change to the structure of JoinPathExtraData.
> Maybe there's no better alternative, and maybe nothing outside of
> postgres_fdw is using this stuff anyway, but it seems like a concern.
>
> Also, the CommitFest entry for this seems to be a bit sketchy.  It
> claims that Tom Lane is a co-author of this patch which, AFAICS, is
> not the case.  It is listed under Miscellaneous rather than "Bug
> Fixes", which seems like a surprising decision.  And it uses a subject
> line which is neither very clear nor the same as the (also not
> particularly helpful) subject line of the email thread.

Looking at the code itself, I find the changes to joinpath.c rather alarming.

+    /* Save hashclauses for possible use by the FDW */
+    if (extra->consider_foreignjoin && hashclauses)
+        extra->hashclauses = hashclauses;

A minor consideration is that this is fairly far away from where
hashclauses actually gets populated, so if someone later added an
early "return" statement to this function -- after creating some paths
-- it could subtly break join pushdown.  But I also think there's no
real need for this.  The loop that extracts hash clauses is simple
enough that we could just refactor it into a separate function, or if
necessary duplicate the logic.

+        /* Save first mergejoin data for possible use by the FDW */
+        if (extra->consider_foreignjoin && outerkeys == all_pathkeys)
+        {
+            extra->mergeclauses = cur_mergeclauses;
+            extra->outersortkeys = outerkeys;
+            extra->innersortkeys = innerkeys;
+        }

Similarly here.  select_outer_pathkeys_for_merge(),
find_mergeclauses_for_pathkeys(), and make_inner_pathkeys_for_merge()
are all extern, so there's nothing to keep CreateLocalJoinPath() from
just doing that work itself instead of getting help from joinpath,
which I guess seems better to me.  I think it's just better if we
don't burden joinpath.c with keeping little bits of data around that
CreateLocalJoinPath() can easily get for itself.

There appears to be no regression test covering the case where we get
a Merge Full Join with a non-empty list of mergeclauses.  Hash Full
Join is tested, as is Merge Full Join without merge clauses, but
there's no test for Merge Full Join with mergeclauses, and since that
is a separate code path it seems like it should be tested.

-        /*
-         * If either inner or outer path is a ForeignPath corresponding to a
-         * pushed down join, replace it with the fdw_outerpath, so that we
-         * maintain path for EPQ checks built entirely of local join
-         * strategies.
-         */
-        if (IsA(joinpath->outerjoinpath, ForeignPath))
-        {
-            ForeignPath *foreign_path;
-
-            foreign_path = (ForeignPath *) joinpath->outerjoinpath;
-            if (IS_JOIN_REL(foreign_path->path.parent))
-                joinpath->outerjoinpath = foreign_path->fdw_outerpath;
-        }
-
-        if (IsA(joinpath->innerjoinpath, ForeignPath))
-        {
-            ForeignPath *foreign_path;
-
-            foreign_path = (ForeignPath *) joinpath->innerjoinpath;
-            if (IS_JOIN_REL(foreign_path->path.parent))
-                joinpath->innerjoinpath = foreign_path->fdw_outerpath;
-        }

This logic is removed and not replaced with anything, but I don't see
what keeps this code...

+        Path       *outer_path = outerrel->cheapest_total_path;
+        Path       *inner_path = innerrel->cheapest_total_path;

...from picking a ForeignPath?

There's probably more to think about here, but those are my question
on an initial read-through.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: [HACKERS] Undefined psql variables
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: [HACKERS] Remaining 2017-03 CF entries