Re: Parallel bitmap index scan
От | Bharath Rupireddy |
---|---|
Тема | Re: Parallel bitmap index scan |
Дата | |
Msg-id | CALj2ACUEHw32XzcGO77XwOmhit_Q4Xv+cLmjk2jTjmT2LtTdnQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Parallel bitmap index scan (Dilip Kumar <dilipbalaut@gmail.com>) |
Ответы |
Re: Parallel bitmap index scan
(Dilip Kumar <dilipbalaut@gmail.com>)
|
Список | pgsql-hackers |
On Sun, Jul 26, 2020 at 6:43 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > I would like to propose a patch for enabling the parallelism for the > bitmap index scan path. > > Background: > Currently, we support only a parallel bitmap heap scan path. Therein, > the underlying bitmap index scan is done by a single worker called the > leader. The leader creates a bitmap in shared memory and once the > bitmap is ready it creates a shared iterator and after that, all the > workers process the shared iterator and scan the heap in parallel. > While analyzing the TPCH plan we have observed that some of the > queries are spending significant time in preparing the bitmap. So the > idea of this patch is to use the parallel index scan for preparing the > underlying bitmap in parallel. > > Design: > If underlying index AM supports the parallel path (currently only > BTREE support it), then we will create a parallel bitmap heap scan > path on top of the parallel bitmap index scan path. So the idea of > this patch is that each worker will do the parallel index scan and > generate their part of the bitmap. And, we will create a barrier so > that we can not start preparing the shared iterator until all the > worker is ready with their bitmap. The first worker, which is ready > with the bitmap will keep a copy of its TBM and the page table in the > shared memory. And, all the subsequent workers will merge their TBM > with the shared TBM. Once all the TBM are merged we will get one > common shared TBM and after that stage, the worker can continue. The > remaining part is the same, basically, again one worker will scan the > shared TBM and prepare the shared iterator and once it is ready all > the workers will jointly scan the heap in parallel using shared > iterator. > Though I have not looked at the patch or code for the existing parallel bitmap heap scan, one point keeps bugging in my mind. I may be utterly wrong or my question may be so silly, anyways I would like to ask here: From the above design: each parallel worker creates partial bitmaps for the index data that they looked at. Why should they merge these bitmaps to a single bitmap in shared memory? Why can't each parallel worker do a bitmap heap scan using the partial bitmaps they built during it's bitmap index scan and emit qualified tuples/rows so that the gather node can collect them? There may not be even lock contention as bitmap heap scan takes read locks for the heap pages/tuples. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления:
Следующее
От: Andy FanДата:
Сообщение: Re: Improve planner cost estimations for alternative subplans