Re: Parallel Seq Scan

From Haribabu Kommi
Subject Re: Parallel Seq Scan
Date
Msg-id CAJrrPGd28BLMhD_yQTWdRcap8TW_Nf=yJKEJF+RS3GWRm0cfrQ@mail.gmail.com
In response to Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Sat, Feb 21, 2015 at 12:57 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Feb 18, 2015 at 6:44 PM, Andres Freund <andres@2ndquadrant.com>
> wrote:
>> On 2015-02-18 16:59:26 +0530, Amit Kapila wrote:
>>
>> > There could be some cases where it could be beneficial for worker
>> > to process a sub-tree, but I think there will be more cases where
>> > it will just work on a part of node and send the result back to either
>> > master backend or another worker for further processing.
>>
>> I think many parallelism projects start out that way, and then notice
>> that it doesn't parallelize very efficiently.
>>
>> The most extreme example, but common, is aggregation over large amounts
>> of data - unless you want to ship huge amounts of data between processes
>> to parallelize it you have to do the sequential scan and the
>> pre-aggregate step (that e.g. selects count() and sum() to implement an
>> avg over all the workers) inside one worker.
>>
>
> OTOH if someone wants to parallelize scan (including expensive qual) and
> sort then it will be better to perform scan (or part of scan by one worker)
> and sort by other worker.
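
To make the aggregation example quoted above concrete, here is a minimal
sketch of computing avg from per-worker count() and sum() pre-aggregates.
It is only an illustration in Python, not PostgreSQL code; the names
partial_avg and final_avg are hypothetical, and plain lists stand in for
the portion of the table each worker scans.

```python
def partial_avg(chunk):
    """Per-worker step: scan the chunk and pre-aggregate it to (count, sum)."""
    return len(chunk), sum(chunk)


def final_avg(partials):
    """Master step: combine the per-worker (count, sum) pairs into one avg."""
    total_count = sum(count for count, _ in partials)
    total_sum = sum(value_sum for _, value_sum in partials)
    # An empty input has no average; None stands in for SQL NULL here.
    return total_sum / total_count if total_count else None


# Each worker ships one (count, sum) pair instead of every scanned value.
partials = [partial_avg([1, 2, 3]), partial_avg([4, 5])]
average = final_avg(partials)
```

Under this model each worker sends a single pair to the master regardless
of how many rows it scanned, which is the saving the quoted text describes.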

Performing the SCAN in one worker and the SORT in another has a performance
problem: every tuple has to be transferred twice, first from the scan worker
to the sort worker and then from the sort worker to the backend, and tuple
transfer between processes is costly. It is better to combine the SCAN and
SORT operations into a single worker job. This can be targeted once the
parallel scan code is stable.
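
As a rough illustration of the transfer count, the sketch below models the
two arrangements in Python. It is not PostgreSQL code: Channel, split_plan
and combined_plan are hypothetical names, lists stand in for the blocks each
worker scans, and having the backend merge the sorted runs in the combined
case is my assumption about how the results would be put together.

```python
import heapq


class Channel:
    """Stand-in for inter-process tuple transfer; counts tuples sent."""

    def __init__(self):
        self.sent = 0

    def send_all(self, tuples):
        out = list(tuples)
        self.sent += len(out)
        return out


def split_plan(chunks):
    """Scan workers feed a separate sort worker, which feeds the backend."""
    ch = Channel()
    scanned = []
    for chunk in chunks:
        # Hop 1: scan worker -> sort worker.
        scanned.extend(ch.send_all(chunk))
    # Hop 2: sort worker -> backend.
    result = ch.send_all(sorted(scanned))
    return result, ch.sent


def combined_plan(chunks):
    """Each worker scans and sorts its own chunk; the backend merges runs."""
    ch = Channel()
    # Single hop: worker -> backend, already sorted.
    runs = [ch.send_all(sorted(chunk)) for chunk in chunks]
    return list(heapq.merge(*runs)), ch.sent


chunks = [[5, 1, 4], [3, 2, 6]]
split_result, split_sent = split_plan(chunks)
combined_result, combined_sent = combined_plan(chunks)
```

Both arrangements return the same sorted output, but in this model the
split plan moves each tuple twice and the combined plan moves it once.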

Regards,
Hari Babu
Fujitsu Australia


