Re: [HACKERS] postgres_fdw bug in 9.6

Поиск
Список
Период
Сортировка
От Etsuro Fujita
Тема Re: [HACKERS] postgres_fdw bug in 9.6
Дата
Msg-id bb597d1f-ad78-b47e-3255-cd3303662aa8@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: [HACKERS] postgres_fdw bug in 9.6  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Ответы Re: [HACKERS] postgres_fdw bug in 9.6  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Список pgsql-hackers
On 2017/01/09 22:36, Ashutosh Bapat wrote:

I wrote:
>> Done.  Attached is the new version of the patch.

> Why is this so?
>              * If the outer cheapest-total path is parameterized by the inner
>              * rel, we can't generate a nestloop path.

That parameterization means that for each row scanned from the inner 
rel, the nestloop has to pass down the param values of the row into the 
scan of the outer rel; but it's not possible for the nestloop to work 
like that.  You can find the same tests in match_unsorted_outer.

>              * If either cheapest-total path is parameterized by the other
>              * rel, we can't generate a hashjoin/mergejoin path.

The above applies to hashjoins and mergejoins.  In addition, those joins 
cannot pass down the param values of each outer-rel row into the 
inner-rel scan, like nestloop joins, so the inner path mustn't be 
parameterized by the outer rel in those joins.  See the same tests in 
sort_inner_and_outer and hash_inner_and_outer.

While re-reading the patch, I noticed this could be simplified:

!         case JOIN_FULL:
!             /*
!              * If either cheapest-total path is parameterized by the other
!              * rel, we can't generate a hashjoin/mergejoin path.  (There's no
!              * use looking for alternative input paths, since these should
!              * already be the least-parameterized available paths.)
!              */
!             if (PATH_PARAM_BY_REL(outer_path, innerrel) ||
!                 PATH_PARAM_BY_REL(inner_path, outerrel))
!                 return NULL;
!             /* If proposed path is still parameterized, we can't use it. */
!             required_outer = calc_non_nestloop_required_outer(outer_path,
!                                                               inner_path);

This would mean that both the outer and inner paths must be 
unparameterized.  Will simplify if it's ok.

> I don't think we need to provide details of what kind of path the function
> builds.
> +     join plan.  <literal>CreateLocalJoinPath</> builds a nested loop join
> +     path for the specified join relation, except when the join type is
> +     <literal>FULL</>, in which case a merge or hash join path is built.

Ok, will remove.

> I am not able to understand the code or the comment below
> +
> +        /* Save first mergejoin information for possible use by the FDW */
> +        if (outerkeys == all_pathkeys)
> +        {
> +            extra->mergeclauses = cur_mergeclauses;
> +            extra->merge_pathkeys = merge_pathkeys;
> +            extra->merge_outerkeys = outerkeys;
> +            extra->merge_innerkeys = innerkeys;
> +        }

In the case when mergejoin is allowed and we have merge clauses but no 
hash clauses, CreateLocalJoinPath creates a mergejoin from the 
cheapest-total paths for the outer and inner rels, by explicitly sorting 
both the outer and inner rels.  The above is for creating the mergejoin 
in CreateLocalJoinPath; since sort_inner_and_outer also creates 
mergejoins from the cheapest-total paths, the above code saves data 
about one of mergejoins created there so that CreateLocalJoinPath uses 
that data to create the mergejoin.

On second thought, I noticed that this could be simplified a bit as 
well; since the output sort order (merge_pathkeys) of the mergejoin 
implementing a FULL join is NIL (see build_join_pathkeys), we wouldn't 
need to save that info in merge_pathkeys.  Will simplify if it's ok.

> If the inner and/or outer paths are not ordered as required, we will need to
> order them. Code below doesn't seem to handle that case.
>                     /*
>                      * If the paths are already well enough ordered, we can
>                      * skip doing an explicit sort.
>                      */
>                     if (outerkeys &&
>                         pathkeys_contained_in(outerkeys, outer_path->pathkeys))
>                         outerkeys = NIL;
>                     if (innerkeys &&
>                         pathkeys_contained_in(innerkeys, inner_path->pathkeys))
>                         innerkeys = NIL;

I might be missing something, but if the outer/inner paths are already 
well sorted, we wouldn't need an explicit sort, so we can set the 
outerkeys/innerkeys to NIL safely.  (You can find the same optimization 
in try_mergejoin_path.)

Thanks for the review!

Best regards,
Etsuro Fujita





В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] DROP FUNCTION of multiple functions
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Re: [HACKERS] Block level parallel vacuum WIP