Re: parallel mode and parallel contexts
От | Robert Haas |
---|---|
Тема | Re: parallel mode and parallel contexts |
Дата | |
Msg-id | CA+TgmoYmp_=XcJEhvJZt9P8drBgW-pDpjHxBhZA79+M4o-CZQA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: parallel mode and parallel contexts (Simon Riggs <simon@2ndQuadrant.com>) |
Ответы |
Re: parallel mode and parallel contexts
Re: parallel mode and parallel contexts |
Список | pgsql-hackers |
On Mon, Jan 5, 2015 at 6:21 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > PARTIAL src/backend/access/transam/parallel.c > wait_patiently mentioned in comment but doesn’t exist Woops. Fixed. Originally, DestroyParallelContext() had a Boolean argument wait_patiently, specifying whether it should exit at once or do the equivalent of WaitForParallelWorkersToFinish() first. But that turned out not to work well - for example, in the included parallel_count example, it's very important to (1) first wait for all the workers to exit, (2) then read the final count, and (3) only after that destroy the dynamic shared memory segment. So WaitForParallelWorkersToFinish() got separated out into a different function, but I failed to update the comments. > Why not make nworkers into a uint? We don't have or use a type called uint. I could make it uint32 or uint64 or uint16, but I don't see much advantage in that. I could also make it unsigned, but we make very little use of the unsigned type generally, so I picked int as most consistent with our general practice. If there's a consensus that something else is much better than int, I will change it throughout, but I think we have bigger fish to fry. > Trampoline? Really? I think we should define what we mean by that, > somewhere, rather than just use the term as if it was self-evident. Comments improved. > Entrypints? Already noted by Andres; fixed in the attached version. > These comments don’t have any explanation or justification OK, I rewrote them. Hopefully it's better now. > This comment is copied-and-pasted too many times for safety and elegance It's not the same comment each time it appears; it appears in the exact form you quoted just twice. It does get a little long-winded, I guess, but I think that parallelism is a sufficiently-unfamiliar concept to us as a group that it makes sense to have a detailed comment in each place explaining exactly why that particular callsite requires a check, so that's what I've tried to do. > * Parallel stuff at start sounded OK > * Transaction stuff strikes me as overcomplicated and error prone. > Given there is no explanation or justification for most of it, I’d be > inclined to question why its there Gosh, I was pretty pleased with how simple the transaction integration turned out to be. Most of what's there right now is either (a) code to copy state from the master to the parallel workers or (b) code to throw errors if the workers try to things that aren't safe. I suspect there are a few things missing, but I don't see anything there that looks unnecessary. > * If we have a trampoline for DSM, why don’t we use a trampoline for > snapshots, then you wouldn’t need to do all this serialize stuff The trampoline is just to let extensions use this infrastructure if they want to; there's no way to avoid the serialization-and-restore stuff unless we switch to creating the child processes using fork(), but that would: 1. Not work on Windows. 2. Require the postmaster to deal with processes that are not its immediate children. 3. Possibly introduce other bugs. Since I've spent a year and a half trying to get this method to work and am now, I think, almost there, I'm not particularly sanguine about totally changing the approach. > This is very impressive, but it looks like we’re trying to lift too > much weight on the first lift. If we want to go anywhere near this, we > need to have very clear documentation about how and why its like that. > I’m actually very sorry to say that because the review started very > well, much much better than most. When I posted the group locking patch, it got criticized because it didn't actually do anything useful by itself; similarly, the pg_background patch was criticized for not being a large enough step toward parallelism. So, this time, I posted something more comprehensive. I don't think it's quite complete yet. I expect a committable version of this patch to be maybe another 500-1000 lines over what I have here right now -- I think it needs to do something about heavyweight locking, and I expect that there are some unsafe things that aren't quite prohibited yet. But the current patch is only 2300 lines, which is not astonishingly large for a feature of this magnitude; if anything, I'd say it's surprisingly small, due to a year and a half of effort laying the necessary groundwork via long series of preliminary commits. I'm not unwilling to divide this up some more if we can agree on a way to do that that makes sense, but I think we're nearing the point where we need to take the plunge and say, look, this is version one of parallelism. Thunk. In addition to the changes mentioned above, the attached version prohibits a few more things (as suggested by Andres) and passes the dsm_segment to the user-supplied entrypoint (as requested off-list by Andres, because otherwise you can't set up additional shm_mq structures). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Вложения
В списке pgsql-hackers по дате отправления:
Следующее
От: Alvaro HerreraДата:
Сообщение: Re: Re: Patch to add functionality to specify ORDER BY in CREATE FUNCTION for SRFs