Re: [HACKERS] Parallel bitmap heap scan

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: [HACKERS] Parallel bitmap heap scan
Дата
Msg-id CA+TgmobJpmQwqt+aVi80D6FmtNGxwc1pkJ_G3xuA=GjNuzVWaw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Parallel bitmap heap scan  (Dilip Kumar <dilipbalaut@gmail.com>)
Список pgsql-hackers
On Mon, Feb 27, 2017 at 10:30 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Sun, Feb 26, 2017 at 9:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> tbm_free_shared_area because the two iterators share one copy of the
>> underlying iteration arrays, and the TBM code isn't smart enough to
>> avoid freeing them twice.  You're going to have to come up with a
>> better solution to that problem; nodeBitmapHeapScan.c shouldn't know
>> about the way the underlying storage details are managed.  (Maybe you
>> need to reference-count the iterator arrays?)
>
> Yeah, I also think current way doesn't look so clean, currently, these
> arrays are just integers array, may be we can use a first slot of the
> array for reference-count? or convert to the structure which has space
> for reference-count and an integers array. What do you suggest?

Maybe something like typedef struct { int refcnt; SomeType
somename[FLEXIBLE_ARRAY_MEMBER]; } SomeOtherType; would be a good
approach.

>> +    if (node->inited)
>> +        goto start_iterate;
>>
>> My first programming teacher told me not to use goto.  I've
>> occasionally violated that rule, but I need a better reason than you
>> have here.  It looks very easy to avoid.
>
> Yes, this can be avoided, I was just trying to get rid of multi-level
> if nesting and end up with the "goto".

That's what I figured.

>> +            pbms_set_parallel(outerPlanState(node));
>>
>> I think this should be a flag in the plan, and the planner should set
>> it correctly, instead of having it be a flag in the executor that the
>> executor sets.  Also, the flag itself should really be called
>> something that involves the word "shared" rather than "parallel",
>> because the bitmap will not be created in parallel, but it will be
>> shared.
>
> Earlier, I thought that it will be not a good idea to set that flag in
> BitmapIndexScan path because the same path can be used either under
> ParallelBitmapHeapPath or under normal BitmapHeapPath.  But, now after
> putting some thought, I realised that we can do that in create_plan.
> Therefore, I will change this.

Cool.

>> pbms_is_leader() looks horrifically inefficient.  Every worker will
>> reacquire the spinlock for every tuple.  You should only have to enter
>> this spinlock-acquiring loop for the first tuple.  After that, either
>> you became the leader, did the initialization, and set the state to
>> PBM_FINISHED, or you waited until the pre-existing leader did the
>> same.  You should have a backend-local flag that keeps you from
>> touching the spinlock for every tuple.  I wouldn't be surprised if
>> fixing this nets a noticeable performance increase for this patch at
>> high worker counts.
>
> I think there is some confusion, if you notice, below code will avoid
> calling pbms_is_leader for every tuple.
> It will be called only first time. And, after that node->inited will
> be true and it will directly jump to start_iterate for subsequent
> calls.  Am I missing something?
>
>> +    if (node->inited)
>> +        goto start_iterate;

Oh, OK.  I guess I was just confused, then.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] SerializedSnapshotData alignment
Следующее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] Creation of "Future" commit fest, named 2017-07