Re: CustomScan under the Gather node?

From: Kouhei Kaigai
Subject: Re: CustomScan under the Gather node?
Date:
Msg-id: 9A28C8860F777E439AA12E8AEA7694F8011A6841@BPXM15GP.gisp.nec.co.jp
In response to: CustomScan under the Gather node?  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
List: pgsql-hackers
> -----Original Message-----
> From: Robert Haas [mailto:robertmhaas@gmail.com]
> Sent: Thursday, February 04, 2016 2:54 AM
> To: Kaigai Kouhei(海外 浩平)
> Cc: pgsql-hackers@postgresql.org
> Subject: ##freemail## Re: [HACKERS] CustomScan under the Gather node?
> 
> On Thu, Jan 28, 2016 at 8:14 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> >>              total         ForeignScan        diff
> >> 0 workers: 17584.319 ms   17555.904 ms      28.415 ms
> >> 1 workers: 18464.476 ms   18110.968 ms     353.508 ms
> >> 2 workers: 19042.755 ms   14580.335 ms    4462.420 ms
> >> 3 workers: 19318.254 ms   12668.912 ms    6649.342 ms
> >> 4 workers: 21732.910 ms   13596.788 ms    8136.122 ms
> >> 5 workers: 23486.846 ms   14533.409 ms    8953.437 ms
> >>
> >> This workstation has 4 CPU cores, so it is natural nworkers=3 records the
> >> peak performance on ForeignScan portion. On the other hands, nworkers>1 also
> >> recorded unignorable time consumption (probably, by Gather node?)
> >   :
> >> Further investigation will need....
> >>
> > It was a bug of my file_fdw patch. ForeignScan node in the master process was
> > also kicked by the Gather node, however, it didn't have coordinate information
> > due to oversight of the initialization at InitializeDSMForeignScan callback.
> > In the result, local ForeignScan node is still executed after the completion
> > of coordinated background worker processes, and returned twice amount of rows.
> >
> > In the revised patch, results seems to me reasonable.
> >              total         ForeignScan      diff
> > 0 workers: 17592.498 ms   17564.457 ms     28.041ms
> > 1 workers: 12152.998 ms   11983.485 ms    169.513 ms
> > 2 workers: 10647.858 ms   10502.100 ms    145.758 ms
> > 3 workers:  9635.445 ms    9509.899 ms    125.546 ms
> > 4 workers: 11175.456 ms   10863.293 ms    312.163 ms
> > 5 workers: 12586.457 ms   12279.323 ms    307.134 ms
> 
> Hmm.  Is the file_fdw part of this just a demo, or do you want to try
> to get that committed?  If so, maybe start a new thread with a more
> appropriate subject line to just talk about that.  I haven't
> scrutinized that part of the patch in any detail, but the general
> infrastructure for FDWs and custom scans to use parallelism seems to
> be in good shape, so I rewrote the documentation and committed that
> part.
>
Thanks. I intend the file_fdw part only as a demonstration.
Unlike GpuScan of PG-Strom, it does not require any special hardware
to reproduce this parallel execution.

> Do you have any idea why this isn't scaling beyond, uh, 1 worker?
> That seems like a good thing to try to figure out.
>
The hardware on which I ran the above query has 4 CPU cores, so it is not
surprising that 3 workers (+ 1 master) recorded the peak performance.

In addition, the file_fdw enhancement is a corner-cutting implementation.

Each worker picks up the next line number to be fetched from the shared
memory segment using pg_atomic_add_fetch_u32(), then reads the input file
until it reaches the target line. Unrelated lines are skipped.
Each worker parses only the lines it is responsible for, so parallel
execution makes sense for this part. On the other hand, the total number
of CPU cycles spent on the file scan increases, because every worker
still has to read through all the lines.
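
The scheme described above can be sketched roughly as follows. This is not
the code from the patch: it is a single-process simulation that uses C11
<stdatomic.h> as a stand-in for pg_atomic_add_fetch_u32(), and the names
Worker, add_fetch_u32, worker_next and simulate are hypothetical. It only
counts how many lines get read versus parsed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-worker counters; not taken from the file_fdw patch. */
typedef struct Worker
{
    uint32_t cur_line;  /* lines read so far from this worker's own handle */
    uint32_t scanned;   /* lines read, whether ours or not (scan cost) */
    uint32_t parsed;    /* lines this worker actually parsed (parse cost) */
} Worker;

/* Stand-in for pg_atomic_add_fetch_u32(): add, then return the NEW value. */
static uint32_t
add_fetch_u32(atomic_uint_fast32_t *ptr, uint32_t add)
{
    return (uint32_t) atomic_fetch_add(ptr, add) + add;
}

/* Claim the next 1-based line number, skip forward to it, and parse it.
 * Returns 0 once the claimed line lies beyond the end of the file. */
static int
worker_next(Worker *w, atomic_uint_fast32_t *shared_next, uint32_t total_lines)
{
    uint32_t target = add_fetch_u32(shared_next, 1);

    while (w->cur_line < total_lines && w->cur_line < target)
    {
        w->cur_line++;  /* read one line, even if another worker owns it */
        w->scanned++;
    }
    if (target > total_lines)
        return 0;       /* reached end of file without finding our line */
    w->parsed++;        /* cur_line == target: this line is ours */
    return 1;
}

/* Run nworkers (1..16) round-robin over a file of total_lines lines and
 * report the summed scan and parse counts. */
static void
simulate(uint32_t nworkers, uint32_t total_lines,
         uint32_t *scanned, uint32_t *parsed)
{
    atomic_uint_fast32_t shared_next;
    Worker   workers[16] = {{0, 0, 0}};
    int      done[16] = {0};
    uint32_t active = nworkers;

    assert(nworkers >= 1 && nworkers <= 16);
    atomic_init(&shared_next, 0);

    while (active > 0)
    {
        for (uint32_t i = 0; i < nworkers; i++)
        {
            if (!done[i] &&
                !worker_next(&workers[i], &shared_next, total_lines))
            {
                done[i] = 1;
                active--;
            }
        }
    }

    *scanned = 0;
    *parsed = 0;
    for (uint32_t i = 0; i < nworkers; i++)
    {
        *scanned += workers[i].scanned;
        *parsed += workers[i].parsed;
    }
}
```

In this sketch, simulate(4, 1000, &s, &p) leaves p at 1000 (each line is
parsed exactly once) while s grows to 4000, because each of the four
workers reads the whole file.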

If we simply split the elapsed time of the 0-worker case into two factors:
 (time to scan file; TSF) + (time to parse lines; TPL)

The total workload when we distribute file_fdw across N workers is:
 N * (TSF) + (TPL)

Thus, each worker has to handle the following amount of work:
 (TSF) + (TPL)/N

This is the typical form of Amdahl's law when the sequential part is not
small. The results above suggest that TSF is about 7.4s and TPL is
about 10.1s.
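
For reference, the two formulas above can be evaluated with a few lines of
C. This is only a sketch of the model as written: the function names are
hypothetical, and how N maps onto the "workers" column of the measurements
(for example, whether the master process counts toward N) is an assumption
left open here.

```c
#include <assert.h>
#include <stdio.h>

/* Estimated elapsed time per worker: TSF + TPL / N. */
static double
per_worker_time(double tsf, double tpl, int n)
{
    assert(n >= 1);
    return tsf + tpl / (double) n;
}

/* Total work summed over all workers: N * TSF + TPL. */
static double
total_work(double tsf, double tpl, int n)
{
    assert(n >= 1);
    return (double) n * tsf + tpl;
}

/* Print the model's estimates for N = 1 .. max_n. */
static void
print_model(double tsf, double tpl, int max_n)
{
    for (int n = 1; n <= max_n; n++)
        printf("N=%d  per-worker=%.2fs  total=%.2fs\n",
               n, per_worker_time(tsf, tpl, n), total_work(tsf, tpl, n));
}
```

Calling print_model(7.4, 10.1, 4) tabulates the estimates for the figures
quoted above. As N grows, the per-worker time approaches TSF, so under this
model the scan portion bounds the achievable speedup.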

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>

