Re: Parallel bitmap heap scan

Поиск
Список
Период
Сортировка
От Dilip Kumar
Тема Re: Parallel bitmap heap scan
Дата
Msg-id CAFiTN-uOv70=aJPZk6NL7wbbozYi48g-eygeoM2yefPhc2BkYA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel bitmap heap scan  (Dilip Kumar <dilipbalaut@gmail.com>)
Ответы Re: Parallel bitmap heap scan
Список pgsql-hackers
On Wed, Nov 23, 2016 at 12:31 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

I tried to address these comments in my new version, All comments are
fixed except below

>> + *
>> + *    #2. Bitmap processing (Iterate and process the pages).
>> + *        . In this phase each worker will iterate over page and
>> chunk array and
>> + *        select heap pages one by one. If prefetch is enable then there will
>> + *        be two iterator.
>> + *        . Since multiple worker are iterating over same page and chunk array
>> + *        we need to have a shared iterator, so we grab a spin lock and iterate
>> + *        within a lock.
>>
>> The formatting of this comment is completely haphazard.
>
> I will fix this..
I have changed the formatting, and also moved the algorithm
description inside function body.
I am not sure does this meet your expectation or we should change it further ?

>
>> +    /* reset parallel bitmap scan info, if present */
>> +    if (node->parallel_bitmap)
>> +    {
>> +        ParallelBitmapInfo *pbminfo = node->parallel_bitmap;
>> +
>> +        pbminfo->state = PBM_INITIAL;
>> +        pbminfo->tbmiterator.schunkbit = 0;
>> +        pbminfo->tbmiterator.spageptr = 0;
>> +        pbminfo->tbmiterator.schunkptr = 0;
>> +        pbminfo->prefetch_iterator.schunkbit = 0;
>> +        pbminfo->prefetch_iterator.spageptr = 0;
>> +        pbminfo->prefetch_iterator.schunkptr = 0;
>> +        pbminfo->prefetch_pages = 0;
>> +        pbminfo->prefetch_target = -1;
>> +    }
>>
>> This is obviously not going to work in the face of concurrent
>> activity.  When we did Parallel Seq Scan, we didn't make any changes
>> to the rescan code at all.  I think we are assuming that there's no
>> way to cause a rescan of the driving table of a parallel query; if
>> that's wrong, we'll need some fix, but this isn't it.  I would just
>> leave this out.
>
> In heap_rescan function we have similar change..
>
> if (scan->rs_parallel != NULL)
> {
>   parallel_scan = scan->rs_parallel;
>   SpinLockAcquire(¶llel_scan->phs_mutex);
>   parallel_scan->phs_cblock = parallel_scan->phs_startblock;
>   SpinLockRelease(¶llel_scan->phs_mutex);
> }
>
> And this is not for driving table, it's required for the case when
> gather is as inner node for JOIN.
> consider below example. I know it's a bad plan but we can produce this plan :)
>
> Outer Node                Inner Node
> SeqScan(t1)   NLJ   (Gather -> Parallel SeqScan (t2)).
>
> Reason for doing same is that, during ExecReScanGather we don't
> recreate new DSM instead of that we just reinitialise the DSM
> (ExecParallelReinitialize).

This is not fixed, reason is already explained.

>> +static bool
>> +pbms_is_leader(ParallelBitmapInfo *pbminfo)
>>
>> I think you should see if you can use Thomas Munro's barrier stuff for
>> this instead.
>
> Okay, I will think over it.

IMHO, barrier is used when multiple worker are doing some work
together in phase1, and before moving to next phase all have to
complete phase1, so we put barrier, so that before starting next phase
all cross the barrier.

But here case is different, only one worker need to finish the phase1,
and as soon as it's over all can start phase2. But we don't have
requirement that all worker should cross certain barrier. In this case
even though some worker did not start, other worker can do their work.

>>
>> In general, the amount of change in nodeBitmapHeapScan.c seems larger
>> than I would have expected.  My copy of that file has 655 lines; this
>> patch adds 544 additional lines.  I think/hope that some of that can
>> be simplified away.
>
> I will work on this.

I have removed some function which was actually not required, and code
can be merged in main function. Almost reduced by 100 lines.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: make default TABLESPACE belong to target table.
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: Parallel bitmap heap scan