Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Дата
Msg-id CAH2-Wzn=+MDmhOVHQ=M3TYxbCtPQsWStt0GT273XhADkiNoOvg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)  (Thomas Munro <thomas.munro@enterprisedb.com>)
Ответы Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)  (Robert Haas <robertmhaas@gmail.com>)
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)  (Michael Paquier <michael.paquier@gmail.com>)
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
On Mon, Jan 30, 2017 at 8:46 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Wed, Jan 4, 2017 at 12:53 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Attached is V7 of the patch.
>
> I am doing some testing.  First, some superficial things from first pass:
>
> [Various minor cosmetic issues]

Oops.

> Just an observation:  if you ask for a large number of workers, but
> only one can be launched, it will be constrained to a small fraction
> of maintenance_work_mem, but use only one worker.  That's probably OK,
> and I don't see how to do anything about it unless you are prepared to
> make workers wait for an initial message from the leader to inform
> them how many were launched.

Actually, the leader-owned worker Tuplesort state will have the
appropriate amount, so you'd still need to have 2 participants (1
worker + leader-as-worker). And, sorting is much less sensitive to
having a bit less memory than hashing (at least when there isn't
dozens of runs to merge in the end, or multiple passes). So, I agree
that this isn't worth worrying about for a DDL statement.

> Should this 64KB minimum be mentioned in the documentation?

You mean user-visible documentation, and not just tuplesort.h? I don't
think that that's necessary. That's a ludicrously low amount of memory
for a worker to be limited to anyway. It will never come up with
remotely sensible use of the feature.

> +   if (!btspool->isunique)
> +   {
> +       shm_toc_estimate_keys(&pcxt->estimator, 2);
> +   }
>
> Project style: people always tell me to drop the curlies in cases like
> that.  There are a few more examples in the patch.

I only do this when there is an "else" that must have curly braces,
too. There are plenty of examples of this from existing code, so I
think it's fine.

> + /* Wait for workers */
> + ConditionVariableSleep(&shared->workersFinishedCv,
> +   WAIT_EVENT_PARALLEL_FINISH);
>
> I don't think we should reuse WAIT_EVENT_PARALLEL_FINISH in
> tuplesort_leader_wait and worker_wait.  That belongs to
> WaitForParallelWorkersToFinish, so someone who see that in
> pg_stat_activity won't know which it is.

Noted.

> IIUC worker_wait() is only being used to keep the worker around so its
> files aren't deleted.  Once buffile cleanup is changed to be
> ref-counted (in an on_dsm_detach hook?) then workers might as well
> exit sooner, freeing up a worker slot... do I have that right?

Yes. Or at least I think it's very likely that that will end up happening.

> Incidentally, barrier.c could probably be used for this
> synchronisation instead of these functions.  I think
> _bt_begin_parallel would call BarrierInit(&shared->barrier,
> scantuplesortstates) and then after LaunchParallelWorkers() it'd call
> a new interface BarrierDetachN(&shared->barrier, scantuplesortstates -
> pcxt->nworkers_launched) to forget about workers that failed to
> launch.  Then you could use BarrierWait where the leader waits for the
> workers to finish, and BarrierDetach where the workers are finished
> and want to exit.

I thought about doing that, actually, but I don't like creating
dependencies on some other uncommited patch, which is a moving target
(barrier stuff isn't committed yet). It makes life difficult for
reviewers. I put off adopting condition variables until they were
committed for the same reason -- it's was easy to do without them for
a time. I'll probably get around to it before too long, but feel no
urgency about it. Barriers will only allow me to make a modest net
removal of code, AFAIK.

Thanks
-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Vitaly Burovoy
Дата:
Сообщение: Re: [HACKERS] macaddr 64 bit (EUI-64) datatype support
Следующее
От: Michael Paquier
Дата:
Сообщение: [HACKERS] Refactoring of replication commands using printsimple