Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
От | Peter Geoghegan |
---|---|
Тема | Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) |
Дата | |
Msg-id | CAH2-WznUWoxD71cGjvA1Oo_9xheG4A77HKVKPFd-Q8UOES30UQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) (Amit Kapila <amit.kapila16@gmail.com>) |
Ответы |
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) |
Список | pgsql-hackers |
On Thu, Jan 25, 2018 at 10:00 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: >> At this point, my preferred solution is for someone to go implement >> Amit's WaitForParallelWorkersToAttach() idea [1] (Amit himself seems >> like the logical person for the job). >> > > I can implement it and share a prototype patch with you which you can > use to test parallel sort stuff. That would be great. Thank you. > I would like to highlight the > difference which you will see with WaitForParallelWorkersToAttach as > compare to WaitForParallelWorkersToFinish() is that the former will > give you how many of nworkers_launched workers are actually launched > whereas latter gives an error if any of the expected workers is not > launched. I feel former is good and your proposed way of calling it > after the leader is done with its work has alleviated the minor > disadvantage of this API which is that we need for workers to startup. I'm not sure that it makes much difference, though, since in the end WaitForParallelWorkersToFinish() is called anyway, much like nodeGather.c. Have I missed something? I had imagined that WaitForParallelWorkersToAttach() would give me an error in the style of WaitForParallelWorkersToFinish(), without actually waiting for the parallel workers to finish. > However, now I see that you and Thomas are trying to find a different > way to overcome this problem differently, so not sure if I should go > ahead or not. I have seen that you told you wanted to look at > Thomas's proposed stuff carefully tomorrow, so I will wait for you > guys to decide which way is appropriate. I suspect that the overhead of Thomas' experimental approach is going to causes problems in certain cases. Cases that are hard to foresee. That patch makes HandleParallelMessages() set ParallelMessagePending artificially, pending confirmation of having launched all workers. It was an interesting experiment, but I think that your WaitForParallelWorkersToAttach() idea has a better chance of working out. >> Once that's committed, I can >> post a new version of the patch that uses that new infrastructure -- >> I'll add a call to the new function, without changing anything else. >> Failing that, we could actually just use >> WaitForParallelWorkersToFinish(). I still don't want to use a barrier, >> mostly because it complicates parallel_leader_participation=off, >> something that Amit is in agreement with [2][3]. >> > > I think if we want we can use barrier API's to solve this problem, but > I kind of have a feeling that it doesn't seem to be the most > appropriate API, especially because existing API like > WaitForParallelWorkersToFinish() can serve the need in a similar way. I can't see a way in which using a barrier can have less complexity. I think it will have quite a bit more, and I suspect that you share this feeling. > Just to conclude, following are proposed ways to solve this problem: > > 1. Implement a new API WaitForParallelWorkersToAttach and use that to > solve this problem. Peter G. and Amit thinks, this is a good way to > solve this problem. > 2. Use existing API WaitForParallelWorkersToFinish to solve this > problem. Peter G. feels that if API mentioned in #1 is not available, > we can use this to solve the problem and I agree with that position. > Thomas is not against it. > 3. Use Thomas's new way to detect such failures. It is not clear to > me at this stage if any one of us have accepted it to be the way to > proceed, but Thomas and Peter G. want to investigate it further. > 4. Use of Barrier API to solve this problem. Robert appears to be > strongly in favor of this approach. That's a good summary. The next revision of the patch will make the leader-participates-as-worker spool/Tuplelsortstate start and finish sorting before the main leader spool/Tuplelsortstate is even started. I did this with the intention of making it very clear that my approach does not assume a number of participants up-front -- that is actually something we only need a final answer on at the point that the leader merges, which is logically the last possible moment. Hopefully this will reassure Robert. It is quite a small change, but leads to a slightly cleaner organization within nbtsort.c, since _bt_begin_parallel() is the only point that has to deal with leader participation. Another minor advantage is that this makes the trace_sort overheads/duration for each of the two tuplesort within the leader non-overlapping (when the leader participates as a worker). -- Peter Geoghegan
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Amit KapilaДата:
Сообщение: Re: [HACKERS] Restrict concurrent update/delete with UPDATE ofpartition key
Следующее
От: Peter MoserДата:
Сообщение: Re: [HACKERS] [PROPOSAL] Temporal query processing with range types