Re: Parallel Seq Scan

From Amit Kapila
Subject Re: Parallel Seq Scan
Date
Msg-id CAA4eK1KsT4UakMHHP5nAuzq0ETmfV7mgjDj=LThfSHnv6vXt9Q@mail.gmail.com
In reply to Re: Parallel Seq Scan  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Parallel Seq Scan  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
On Wed, Feb 18, 2015 at 6:44 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2015-02-18 16:59:26 +0530, Amit Kapila wrote:
>
> > There could be some cases where it could be beneficial for a worker
> > to process a sub-tree, but I think there will be more cases where
> > it will just work on a part of a node and send the result back to either
> > the master backend or another worker for further processing.
>
> I think many parallelism projects start out that way, and then notice
> that it doesn't parallelize very efficiently.
>
> The most extreme example, but common, is aggregation over large amounts
> of data - unless you want to ship huge amounts of data between processes
> to parallelize it you have to do the sequential scan and the
> pre-aggregate step (that e.g. selects count() and sum() to implement an
> avg over all the workers) inside one worker.
>
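
To make the pre-aggregation point concrete, here is a minimal sketch in Python (an illustration only, not PostgreSQL code). It assumes each worker scans its own chunk and returns just a (count, sum) pair, which the master combines into the avg. The function names are hypothetical, and threads stand in for worker processes.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_avg_state(chunk):
    """Worker side: scan one chunk and pre-aggregate it to (count, sum)."""
    count = 0
    total = 0
    for value in chunk:
        count += 1
        total += value
    return count, total


def combine_avg(states):
    """Master side: merge the per-worker (count, sum) pairs into one avg."""
    count = sum(c for c, _ in states)
    total = sum(s for _, s in states)
    return total / count if count else None


def parallel_avg(values, nworkers=4):
    # Split the "table" into one contiguous chunk per worker.
    step = max(1, -(-len(values) // nworkers))  # ceiling division
    chunks = [values[i:i + step] for i in range(0, len(values), step)]
    # Threads stand in for worker processes here (an assumption of this
    # sketch); only the small (count, sum) tuples cross the worker
    # boundary, never the scanned rows themselves.
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        states = list(pool.map(partial_avg_state, chunks))
    return combine_avg(states)
```

The amount of data returned per worker is constant regardless of chunk size, which is why scan and pre-aggregation are kept together in one worker in this scheme.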

OTOH, if someone wants to parallelize a scan (including an expensive qual)
and a sort, then it will be better to have one worker perform the scan (or
part of the scan) and another worker perform the sort.
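
A rough sketch of that split, in Python rather than backend code: scan workers evaluate the qual on their part of the input and pass matching tuples through a queue to a separate sort worker. Everything here is an assumption made for illustration (the function names, the strided partitioning, and the use of threads with queue.Queue in place of backends and a shared-memory queue).

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of one scan worker's stream


def scan_worker(rows, qual, out):
    """Scan worker: evaluate the (possibly expensive) qual, forward matches."""
    for row in rows:
        if qual(row):
            out.put(row)
    out.put(_DONE)


def sort_worker(inp, nscanners, key, result):
    """Sort worker: collect tuples from all scan workers, then sort them."""
    collected = []
    finished = 0
    while finished < nscanners:
        item = inp.get()
        if item is _DONE:
            finished += 1
        else:
            collected.append(item)
    result.extend(sorted(collected, key=key))


def scan_then_sort(rows, qual, key=None, nscanners=2):
    q = queue.Queue()
    result = []
    # Each scan worker gets a strided part of the input (hypothetical
    # partitioning; a real scan would more likely hand out block ranges).
    scanners = [
        threading.Thread(target=scan_worker, args=(rows[i::nscanners], qual, q))
        for i in range(nscanners)
    ]
    sorter = threading.Thread(target=sort_worker, args=(q, nscanners, key, result))
    for t in scanners + [sorter]:
        t.start()
    for t in scanners + [sorter]:
        t.join()
    return result
```

Unlike the aggregation case, every qualifying tuple has to travel to the sort worker here, so the split pays off mainly when the qual is expensive or filters out most rows.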


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
