Re: [HACKERS] Proposal : Parallel Merge Join

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: [HACKERS] Proposal : Parallel Merge Join
Дата
Msg-id CAA4eK1LNeu8s_LygVdPBWqAur_e0q8YHGQucP5pqx2Mg6qoHHg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Proposal : Parallel Merge Join  (Dilip Kumar <dilipbalaut@gmail.com>)
Ответы Re: [HACKERS] Proposal : Parallel Merge Join  (Dilip Kumar <dilipbalaut@gmail.com>)
Список pgsql-hackers
On Wed, Dec 21, 2016 at 9:23 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Wed, Dec 21, 2016 at 8:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Committed the refactoring patch with some mild cosmetic adjustments.
>
> Thanks..
>>
>> As to the second patch:
>>
>> +        if (jointype == JOIN_UNIQUE_INNER)
>> +            jointype = JOIN_INNER;
>>
>> Isn't this dead code?  save_jointype might that value, but jointype won't.
>
> Yes, it is.
>
> I have fixed this, and also observed that comment for
> try_partial_mergejoin_path header was having some problem, fixed that
> too.
>

Review comments:
1.
+ bool is_partial);
+

Seems additional new line is not required.

2.
+ * try_partial_mergejoin_path
+ *  Consider a partial merge join path; if it appears useful,
push it into
+ *  the joinrel's pathlist via add_path().
+ */

I think in above comment, you should say ".. joinrel's
partial_pathlist via add_partial_path()"

3.
+ /*
+ * See comments in try_partial_nestloop_path().
+ */
+ Assert(bms_is_empty
(joinrel->lateral_relids));
+ if (inner_path->param_info != NULL)
+ {
+ Relids
inner_paramrels = inner_path->param_info->ppi_req_outer;
+
+ if (!bms_is_subset
(inner_paramrels, outer_path->parent->relids))
+ return;
+ }

This same code exists in try_partial_nestloop_path() and
try_partial_hashjoin_path() with minor difference of code in
try_partial_hashjoin_path().  Can't we write a separate inline
function for this code and call from all the three places.

4.
+ /*
+ * See comments in try_nestloop_path().
+ */
+ initial_cost_mergejoin(root,
&workspace, jointype, mergeclauses,
+   outer_path,
inner_path,
+   outersortkeys, innersortkeys,
+ extra->sjinfo);

I think in above comments, you want to say try_partial_nestloop_path().

5.
- if (joinrel->consider_parallel && nestjoinOK &&
- save_jointype != JOIN_UNIQUE_OUTER &&
- bms_is_empty(joinrel->lateral_relids))
+ if (!joinrel->consider_parallel ||
+ save_jointype == JOIN_UNIQUE_OUTER ||
+ !bms_is_empty(joinrel->lateral_relids) ||
+ jointype == JOIN_FULL ||
+ jointype == JOIN_RIGHT)
+ return;
+
+ if (nestjoinOK)

Here, we only want to create partial paths when
outerrel->partial_pathlist is not NILL, so I think you can include
that check as well.  Another point to note is that in HEAD, we are not
checking for jointype as JOIN_RIGHT or JOIN_FULL for considering
parallel nestloop paths, whereas with your patch, it will do those
checks, is it a bug of HEAD or is there any other reason for including
those checks for parallel nestloop paths?

6.
+ /* Can't generate mergejoin path if inner rel is parameterized by outer */
+ if (inner_cheapest_total != NULL)
+ {
+ ListCell   *lc1;
+
+ /* generate merge join path for each partial outer path */
+ foreach(lc1, outerrel->partial_pathlist)
+ {
+ Path   *outerpath = (Path *) lfirst(lc1);
+ List   *merge_pathkeys;
+
+ /*
+ * Figure out what useful ordering any paths we create
+ * will have.
+ */
+ merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
+   outerpath->pathkeys);
+
+ generate_mergejoin_paths(root, joinrel, innerrel, outerpath,
+ save_jointype, extra, false,
+ inner_cheapest_total, merge_pathkeys,
+ true);
+ }
+
+ }

Won't it be better if you encapsulate the above chunk of code in
function consider_parallel_merjejoin() similar to
consider_parallel_nestloop()?  I think that way code will look
cleaner.

7.
+ /*
+ * Figure out what useful ordering any paths we create
+ * will have.
+ */
+ merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
+   outerpath->pathkeys);

indentation problem with variable outerpath->pathkeys.

8.
- try_mergejoin_path(root,
-   joinrel,
-   outerpath,
-   inner_cheapest_total,
-   merge_pathkeys,
-   mergeclauses,
-   NIL,
-   innersortkeys,
-   jointype,
-   extra);
+ if (!is_partial)
+ try_mergejoin_path(root,
+ joinrel,
+ outerpath,
+ inner_cheapest_total,
+ merge_pathkeys,
+ mergeclauses,
+ NIL,
+ innersortkeys,
+ jointype,
+ extra);
+
+ /* Generate partial path if inner is parallel safe. */
+ else if (inner_cheapest_total->parallel_safe)
+ try_partial_mergejoin_path(root,
+ joinrel,
+ outerpath,
+ inner_cheapest_total,
+ merge_pathkeys,
+ mergeclauses,
+ NIL,
+ innersortkeys,
+ jointype,
+ extra);

In above code indentation is broken, similarly, there is another code
in a patch where it is broken, try using pgindent.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Abhijit Menon-Sen
Дата:
Сообщение: Re: [HACKERS] [COMMITTERS] pgsql: Update copyright for 2017
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] Potential data loss of 2PC files