Re: Parallel Seq Scan

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Parallel Seq Scan
Дата
Msg-id CAA4eK1++YrbjcKU8dph8_ueCQ9GAEXXJsLcVmYmLMh5w+-x3pA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Parallel Seq Scan  (Thom Brown <thom@linux.com>)
Re: Parallel Seq Scan  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Список pgsql-hackers
On Wed, Mar 11, 2015 at 1:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Mar 3, 2015 at 7:47 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > There is still some work left like integrating with
> > access-parallel-safety patch (use parallelmodeok flag to decide
> > whether parallel path can be generated, Enter/Exit parallel mode is still
> > done during execution of funnel node).
> >
> > I think these are minor points which can be fixed once we decide
> > on the other major parts of patch.  Find modified patch attached with
> > this mail.
>
> This is definitely progress.  I do think you need to integrate it with
> the access-parallel-safety patch.

I have tried, but there are couple of failures while applying latest
access-parallel-safety patch, so left it as it is for now.

> Other comments:
>
> - There's not much code left in shmmqam.c.  I think that the remaining
> logic should be integrated directly into nodeFunnel.c, with the two
> bools in worker_result_state becoming part of the FunnelState.  It
> doesn't make sense to have a separate structure for two booleans and
> 20 lines of code.  If you were going to keep this file around, I'd
> complain about its name and its location in the source tree, too, but
> as it is I think we can just get rid of it altogether.
>

Agreed.  Moved the code/logic to nodeFunnel.c

> - Something is deeply wrong with the separation of concerns between
> nodeFunnel.c and nodePartialSeqscan.c.  nodeFunnel.c should work
> correctly with *any arbitrary plan tree* as its left child, and that
> is clearly not the case right now.  shm_getnext() can't just do
> heap_getnext().  Instead, it's got to call ExecProcNode() on its left
> child and let the left child decide what to do about that.

Agreed and made the required changes.

>  The logic
> in InitFunnelRelation() belongs in the parallel seq scan node, not the
> funnel.

I think we should retain initialization of parallelcontext in InitFunnel().
Apart from that, I have moved other stuff to partial seq scan node.

> ExecReScanFunnel() cannot be calling heap_parallel_rescan();
> it needs to *not know* that there is a parallel scan under it.

Agreed.  I think it is better to be do that as part of partial seq scan
node.

>  The
> comment in FunnelRecheck is a copy-and-paste from elsewhere that is
> not applicable to a generic funnel mode.
>

With new changes, this API is not required.

> - The comment in execAmi.c refers to says "Backward scan is not
> suppotted for parallel sequiantel scan".  "Sequential" is mis-spelled
> here, but I think you should just nuke the whole comment.  The funnel
> node is not, in the long run, just for parallel sequential scan, so
> putting that comment above it is not right.  If you want to keep the
> comment, it's got to be more general than that somehow, like "parallel
> nodes do not support backward scans", but I'd just drop it.
>
> - Can we rename create_worker_scan_plannedstmt to
> create_parallel_worker_plannedstmt?
>

Agreed and changed as per suggestion. 

> - I *strongly* suggest that, for the first version of this, we remove
> all of the tts_fromheap stuff.  Let's make no special provision for
> returning a tuple stored in a tuple queue; instead, just copy it and
> store it in the slot as a pfree-able tuple.  That may be slightly less
> efficient, but I think it's totally worth it to avoid the complexity
> of tinkering with the slot mechanism.
>

Sure, removed (tts_fromheap becomes redundant with new changes).

> - InstrAggNode claims that we only need the master's information for
> statistics other than buffer usage and tuple counts, but is that
> really true?  The parallel backends can be working on the parallel
> part of the plan while the master is doing something else, so the
> amount of time the *master* spent in a particular node may not be that
> relevant.  

Yes, but isn't other nodes also work this way, example join node will
display the accumulated stats for buffer usage, but for timing, it will
just use the time for that node (which automatically includes some
part of execution of child nodes, but it is not direct accumulation)?


> We might need to think carefully about what it makes sense
> to display in the EXPLAIN output in parallel cases.
>

Currently the Explain for parallel scan on relation will display the
Funnel node which contains aggregated stat of all workers and the
number of workers and Partial Seq Scan node containing stats for
the scan done by master backend.  Do we want to display something
more?

Current result of Explain statement is as below:
postgres=# explain (analyze,buffers) select c1 from t1 where c1 > 90000;
                                                        QUERY PLAN

--------------------------------------------------------------------------------
-------------------------------------------
 Funnel on t1  (cost=0.00..43750.44 rows=9905 width=4) (actual time=1097.236..15
30.416 rows=10000 loops=1)
   Filter: (c1 > 90000)
   Rows Removed by Filter: 65871
   Number of Workers: 2
   Buffers: shared hit=96 read=99905
   ->  Partial Seq Scan on t1  (cost=0.00..101251.01 rows=9905 width=4) (actual
time=1096.188..1521.810 rows=2342 loops=1)
         Filter: (c1 > 90000)
         Rows Removed by Filter: 24130
         Buffers: shared hit=33 read=26439
 Planning time: 0.143 ms
 Execution time: 1533.438 ms
(11 rows)

> - The header comment on nodeFunnel.h capitalizes the filename incorrectly.
>

Changed.


One additional change (we need to SetLatch() in HandleParallelMessageInterrupt)
is done to handle the hang issue reported on parallel-mode thread.
Without this change it is difficult to verify the patch (will remove this change
once new version of parallel-mode patch containing this change will be posted).


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: logical column ordering
Следующее
От: Andres Freund
Дата:
Сообщение: recovery_target_action doesn't work for anything but shutdown