Re: Parallel tuplesort (for parallel B-Tree index creation)

From Heikki Linnakangas
Subject Re: Parallel tuplesort (for parallel B-Tree index creation)
Date
Msg-id b4615f37-70e7-58e4-3e68-6122b02f15c9@iki.fi
In response to Parallel tuplesort (for parallel B-Tree index creation)  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Parallel tuplesort (for parallel B-Tree index creation)  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 08/02/2016 01:18 AM, Peter Geoghegan wrote:
> No merging in parallel
> ----------------------
>
> Currently, merging worker *output* runs may only occur in the leader
> process. In other words, we always keep n worker processes busy with
> scanning-and-sorting (and maybe some merging), but then all processes
> but the leader process grind to a halt (note that the leader process
> can participate as a scan-and-sort tuplesort worker, just as it will
> everywhere else, which is why I specified "parallel_workers = 7" but
> talked about 8 workers).
>
> One leader process is kept busy with merging these n output runs on
> the fly, so things will bottleneck on that, which you saw in the
> example above. As already described, workers will sometimes merge in
> parallel, but only their own runs -- never another worker's runs. I
> did attempt to address the leader merge bottleneck by implementing
> cross-worker run merging in workers. I got as far as implementing a
> very rough version of this, but initial results were disappointing,
> and so that was not pursued further than the experimentation stage.
>
> Parallel merging is a possible future improvement that could be added
> to what I've come up with, but I don't think that it will move the
> needle in a really noticeable way.

It'd be good if you could overlap the final merges in the workers with
the merge in the leader. ISTM it would be quite straightforward to
replace the final tape of each worker with a shared memory queue, so
that the leader could start merging and returning tuples as soon as it
gets the first tuple from each worker, instead of having to wait for all
the workers to complete first.

- Heikki
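
The overlap suggested above can be illustrated with a small sketch. This is
not PostgreSQL code and not the patch's design: Python threads stand in for
worker processes, a bounded queue.Queue stands in for the shared memory
queue, and the names worker_final_merge, drain and merge_in_leader are
hypothetical. Each worker streams the output of its own final merge into its
queue, and the leader runs a lazy k-way merge over the queues, so it can
emit its first tuple once it has seen the first tuple from every worker.

```python
import heapq
import queue
import threading

_DONE = object()  # sentinel: this worker has no more tuples


def worker_final_merge(sorted_runs, out_q):
    """Merge this worker's own sorted runs, streaming the result to the leader.

    The queue replaces the worker's final output tape (an assumption of this
    sketch). put() blocks while the bounded queue is full.
    """
    for item in heapq.merge(*sorted_runs):
        out_q.put(item)
    out_q.put(_DONE)


def drain(q):
    """Yield one worker's tuples in arrival order until the sentinel shows up."""
    while True:
        item = q.get()
        if item is _DONE:
            return
        yield item


def merge_in_leader(runs_per_worker, queue_size=4):
    """Start one worker per entry and merge their streams in the leader."""
    queues = [queue.Queue(maxsize=queue_size) for _ in runs_per_worker]
    threads = [
        threading.Thread(target=worker_final_merge, args=(runs, q))
        for runs, q in zip(runs_per_worker, queues)
    ]
    for t in threads:
        t.start()
    # heapq.merge is lazy: it only needs the head of each input stream
    # before it can yield, so the leader does not wait for workers to finish.
    merged = list(heapq.merge(*(drain(q) for q in queues)))
    for t in threads:
        t.join()
    return merged
```

For example, merge_in_leader([[[1, 5, 9], [2, 6]], [[3, 4], [7, 8, 10]]])
returns the ten values in ascending order. The small queue_size is
deliberate: it keeps workers from running far ahead of the leader, which is
roughly how a fixed-size shared queue would behave.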



