Observations in Parallel Append

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Observations in Parallel Append
Дата
Msg-id CAA4eK1+qcbeai3coPpRW=GFCzFeLUsuY4T-AKHqMjxpEGZBPQg@mail.gmail.com
обсуждение исходный текст
Ответы Re: Observations in Parallel Append  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Few observations in Parallel Append commit (ab727167)

1.
+++ b/src/include/nodes/execnodes.h
@@ -21,6 +21,7 @@
 #include "lib/pairingheap.h"
 #include "nodes/params.h"
 #include "nodes/plannodes.h"
+#include "storage/spin.h"
..

There doesn't seem to be any need for including spin.h.  I think some
prior version of the patch might need it.  Patch attached to remove
it.

2.
+ *     choose_next_subplan_for_worker
..
+ *     We start from the first plan and advance through the list;
+ *     when we get back to the end, we loop back to the first
+ *     nonpartial plan.
..
+choose_next_subplan_for_worker(AppendState *node)
{
..
+       if (pstate->pa_next_plan < node->as_nplans - 1)
+       {
+           /* Advance to next plan. */
+           pstate->pa_next_plan++;
+       }
+       else if (append->first_partial_plan < node->as_nplans)
+       {
+           /* Loop back to first partial plan. */
+           pstate->pa_next_plan = append->first_partial_plan;
+       }
..

The code and comment don't seem to match.  The comments indicate that
after reaching the end of the list, we loop back to first nonpartial
plan whereas code indicates that we loop back to first partial plan.
I think one of those needs to be changed unless I am missing something
obvious.

3.
+cost_append(AppendPath *apath)
{
..
+           /*
+            * Apply parallel divisor to non-partial subpaths.  Also add the
+            * cost of partial paths to the total cost, but ignore non-partial
+            * paths for now.
+            */
+           if (i < apath->first_partial_path)
+               apath->path.rows += subpath->rows / parallel_divisor;
+           else
+           {
+               apath->path.rows += subpath->rows;
+               apath->path.total_cost += subpath->total_cost;
+           }
..
}

I think it is better to use clamp_row_est for rows for the case where
we use parallel_divisor so that the value of rows is always sane.
Also, don't we need to use parallel_divisor for partial paths instead
of non-partial paths as those will be actually distributed among
workers?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: Using ProcSignal to get memory context stats from a running backend
Следующее
От: Oliver Ford
Дата:
Сообщение: Re: Add RANGE with values and exclusions clauses to the Window Functions