Re: Parallel Seq Scan

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Parallel Seq Scan
Дата
Msg-id CAA4eK1LUQ2LV51QqC907AW7_Huv5fi=U-Zm9jKmL6c0meTK+2w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel Seq Scan  (Jeff Davis <pgsql@j-davis.com>)
Ответы Re: Parallel Seq Scan  (Jeff Davis <pgsql@j-davis.com>)
Список pgsql-hackers
On Mon, Jul 6, 2015 at 3:26 AM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Fri, 2015-07-03 at 17:35 +0530, Amit Kapila wrote:
>
> > Attached, find the rebased version of patch.
> >
>
> Comments:
>
>
> * The heapam.c changes seem a little ad-hoc. Conceptually, which
> portions should be affected by parallelism? How do we know we didn't
> miss something?

The main reason of changing heapam.c is that we want to scan blocks
parallely by multiple workers and heapam.c seems to be the best
place to make such a change.  As of now, the changes  are mainly
required to identify the next block to scan by each worker.  So
we can focus on that aspect and see if anything is missing.

> * Why is initscan getting the number of blocks from the structure? Is it
> just to avoid an extra syscall, or is there a correctness issue there?

Yes, there is a correctness issue.  All the parallel workers should see
the same scan information during scan as is seen by master backend.
master backend fills this structure and then that is used by all workers
to avoid any problem.

> Is initscan expecting that heap_parallelscan_initialize is always called
> first (if parallel)? Please add a comment explaining above.

okay.

> * What's the difference between scan->rs_nblocks and
> scan->rs_parallel->phs_nblocks?

scan->rs_parallel->phs_nblocks is once initialized in master
backend and then propagated to all other worker backends and
then worker backends use that value to initialize scan->rs_nblocks
(and if master backend itself is involved in scan, then it also
uses it in same way)

> Same for rs_rd->rd_id and phs_relid.

This is also similar to phs_nblocks.  The basic idea is that parallel
heap scan descriptor is formed in master backend containing all the
necessary members that are required for performing the scan in master
as well as worker backends.  Once we initialize the parallel heap scan
descriptor, it is passed to all the worker backends and used by them
to scan the heap.

> * It might be good to separate out some fields which differ between the
> normal heap scan and the parallel heap scan. Perhaps put rs_ctup,
> rs_cblock, and rs_cbuf into a separate structure, which is always NULL
> during a parallel scan. That way we don't accidentally use a
> non-parallel field when doing a parallel scan.

Or the other way to look at it could be separate out fields which are
required for parallel scan which is done currently by forming a
separate structure ParallelHeapScanDescData.

> * Is there a reason that partial scans can't work with syncscan? It
> looks like you're not choosing the starting block in the same way, so it
> always starts at zero and never does syncscan.

The reason why partial scan can't be mixed with sync scan is that in parallel
scan, it performs the scan of heap by synchronizing blocks (each parallel worker
scans a block and then asks for a next block to scan) among parallel workers.
Now if we try to make sync scans work along with it, the synchronization among
parallel workers will go for a toss.  It might not be impossible to make that
work in some way, but not sure if it is important enough for sync scans to work
along with parallel scan.

> If we don't want to mix
> syncscan and partial scan, that's fine, but it should be more explicit.
>

makes sense to me, I think in initscan, we should mark syncscan
as false for parallel scan case.

> I'm trying to understand where tqueue.c fits in. It seems very closely
> tied to the Funnel operator, because any change to the way Funnel works
> would almost certainly require changes in tqueue.c.

tqueue.c is mainly designed to pass tuples between parallel workers
and currently it is used in Funnel operator to gather the tuples generated
by all the parallel workers.  I think we can use it for any other operator
which needs tuple communication among parallel workers.

> But "tqueue" is a
> generic name for the file, so something seems off. Either we should
> explicitly make it the supporting routines for the Funnel operator, or
> we should try to generalize it a little.
>

It has been designed to be generic way of communication for tuples,
but let me know if you have any specific suggestions.

> I still have quite a bit to look at, but this is a start.
>

Thanks for the review.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: pg_archivecleanup, and backup filename to specify as an argument
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: pg_archivecleanup, and backup filename to specify as an argument