Re: [WIP] speeding up GIN build with parallel workers

From: Constantin S. Pan
Subject: Re: [WIP] speeding up GIN build with parallel workers
Msg-id: 20160316031115.5856920c@monster
In reply to: Re: [WIP] speeding up GIN build with parallel workers  (David Steele <david@pgmasters.net>)
Responses: Re: [WIP] speeding up GIN build with parallel workers  (Amit Kapila <amit.kapila16@gmail.com>)
Re: [WIP] speeding up GIN build with parallel workers  (Dmitry Ivanov <d.ivanov@postgrespro.ru>)
List: pgsql-hackers
On Mon, 14 Mar 2016 08:42:26 -0400
David Steele <david@pgmasters.net> wrote:

> On 2/18/16 10:10 AM, Constantin S. Pan wrote:
> > On Wed, 17 Feb 2016 23:01:47 +0300
> > Oleg Bartunov <obartunov@gmail.com> wrote:
> >
> >> My feedback is (Mac OS X 10.11.3)
> >>
> >> set gin_parallel_workers=2;
> >> create index message_body_idx on messages using gin(body_tsvector);
> >> LOG:  worker process: parallel worker for PID 5689 (PID 6906) was
> >> terminated by signal 11: Segmentation fault
> >
> > Fixed this, try the new patch. The bug was in incorrect handling
> > of some GIN categories.
>
> Oleg, it looks like Constantin has updated the patch to address the
> issue you were seeing.  Do you have time to retest and review?
>
> Thanks,

Actually, there has been some progress since then. The patch is
attached.

1. Added another GUC parameter, 'gin_shared_mem', for setting the
amount of shared memory for parallel GIN workers.

2. Changed the way results are merged. It now uses a shared memory
message queue.

3. Tested on some real data (a GIN index on email message body
tsvectors). Here are the timings for different values of
'gin_shared_mem' and 'gin_parallel_workers' on a 4-CPU
machine. 'gin_shared_mem' seems to have no visible effect.

wnum mem(MB) time(s)
   0      16     247
   1      16     256
   2      16     126
   4      16      89
   0      32     247
   1      32     270
   2      32     123
   4      32      92
   0      64     254
   1      64     272
   2      64     123
   4      64      88
   0     128     250
   1     128     263
   2     128     126
   4     128      85
   0     256     247
   1     256     269
   2     256     130
   4     256      88
   0     512     257
   1     512     275
   2     512     129
   4     512      92
   0    1024     255
   1    1024     273
   2    1024     130
   4    1024      90
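
As a rough aid for reading the table, here is a minimal Python sketch that
computes the speedup of each run relative to the 0-worker run with the same
memory setting. The timings are copied from the table; the names TIMINGS,
speedups and mean_speedup_by_workers are made up for this illustration and
are not part of the patch.

```python
# Timings transcribed from the table above:
# (gin_parallel_workers, gin_shared_mem in MB) -> build time in seconds.
TIMINGS = {
    (0, 16): 247, (1, 16): 256, (2, 16): 126, (4, 16): 89,
    (0, 32): 247, (1, 32): 270, (2, 32): 123, (4, 32): 92,
    (0, 64): 254, (1, 64): 272, (2, 64): 123, (4, 64): 88,
    (0, 128): 250, (1, 128): 263, (2, 128): 126, (4, 128): 85,
    (0, 256): 247, (1, 256): 269, (2, 256): 130, (4, 256): 88,
    (0, 512): 257, (1, 512): 275, (2, 512): 129, (4, 512): 92,
    (0, 1024): 255, (1, 1024): 273, (2, 1024): 130, (4, 1024): 90,
}


def speedups(timings):
    """Speedup of each run over the 0-worker run with the same memory setting."""
    return {
        (workers, mem): timings[(0, mem)] / seconds
        for (workers, mem), seconds in timings.items()
    }


def mean_speedup_by_workers(timings):
    """Average the speedups over all memory settings, per worker count."""
    per_worker = {}
    for (workers, _mem), speedup in speedups(timings).items():
        per_worker.setdefault(workers, []).append(speedup)
    return {w: sum(v) / len(v) for w, v in sorted(per_worker.items())}


if __name__ == "__main__":
    for workers, mean in mean_speedup_by_workers(TIMINGS).items():
        print(f"{workers} worker(s): mean speedup {mean:.2f}x")
```

On these numbers the 4-worker runs come out a bit under 3x faster than the
0-worker baseline, the 2-worker runs about 2x, and the 1-worker runs slightly
slower than the baseline.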

On Wed, 17 Feb 2016 12:26:05 -0800
Peter Geoghegan <pg@heroku.com> wrote:

> On Wed, Feb 17, 2016 at 7:55 AM, Constantin S. Pan <kvapen@gmail.com>
> wrote:
> > 4. Hit the 8x speedup limit. Made some analysis of the reasons (see
> > the attached plot or the data file).
>
> Did you actually compare this to the master branch? I wouldn't like to
> assume that the one worker case was equivalent. Obviously that's the
> really interesting baseline.

Yes, I compared it with the master branch. The 0-worker case is
indeed equivalent to the master branch.

Regards,
Constantin