Re: Race to build pg_isolation_regress in "make -j check-world"
От | Noah Misch |
---|---|
Тема | Re: Race to build pg_isolation_regress in "make -j check-world" |
Дата | |
Msg-id | 20181224221601.GA3227827@rfd.leadboat.com обсуждение исходный текст |
Ответ на | Re: Race to build pg_isolation_regress in "make -j check-world" (Noah Misch <noah@leadboat.com>) |
Список | pgsql-hackers |
On Mon, Jan 01, 2018 at 05:39:56PM -0800, Noah Misch wrote: > On Sat, Dec 16, 2017 at 08:06:05PM -0800, Noah Misch wrote: > > On Mon, Nov 06, 2017 at 12:07:52AM -0800, Noah Misch wrote: > > > I now see similar trouble from multiple > > "make" processes running "make -C contrib/test_decoding install" concurrently. > > This is a risk for any directory named in an EXTRA_INSTALL variable of more > > than one makefile. Under the right circumstances, this would affect > > contrib/hstore and others in addition to contrib/test_decoding. That brings > > me back to the locking idea: > > > > > The problem of multiple "make" processes in a directory (especially src/port) > > > shows up elsewhere. In a cleaned tree, "make -j -C src/bin" or "make -j > > > installcheck-world" will do it. For more-prominent use cases, src/Makefile > > > prevents this with ".NOTPARALLEL:" and building first the directories that are > > > frequent submake targets. Perhaps we could fix the general problem with > > > directory locking; targets that call "$(MAKE) -C FOO" would first sleep until > > > FOO's lock is available. That could be tricky to make robust. > Performance has been the principal snare. I value "make -j" being fast when > there's little to rebuild, but that shell script approach slowed an empty > build by 340% (GNU/Linux w/ SSD) to 2300% (Cygwin). In a build having nothing > to do, merely adding a no-op wrapper around "make -C" (does nothing but > execvp() the real GNU make) slowed the build by over 10%[1]. To get what I > considered decent performance took several design changes: > 3. Lock only directories known to be entered more than once per top-level > target. Preliminarily, this reduced the 147 lock acquisitions to 25. > > I regret that (3) entails ongoing maintenance of a whitelist of such > directories; adding a "make -C" call can expand the list. However, I can > automate verification of that whitelist. As I developed this more, I found it dissatisfying. To verify the whitelist, one empties the whitelist, rebuilds each makefile target, and reassembles the whitelist from the list of directories visited more than once within a target. That's too expensive to run all the time (e.g. with every "make check"). I could make a buildfarm animal turn red when the whitelist is out of date, but I wouldn't enjoy discovering or fixing whitelist drift that way. Hence, I'm back to preferring fixes targeting the known problems: - Fix conflicting writes in src/test/regress: https://postgr.es/m/flat/20181224034411.GA3224776%40rfd.leadboat.com - Fix conflicting EXTRA_INSTALL, detailed above. In the MAKELEVEL-0 part of the temp-install Makefile target, serially process all applicable EXTRA_INSTALL. See attached patch. On my usual development system, this adds <2s to "make check-world" and <0.1s to "make check" or "make -C contrib/earthdistance check". To corroborate the sufficiency of those two patches, I ran 110 iterations of "make -j40" and "make -j40 check-world" without encountering a failure attributable to parallel make. I also ran those targets at excess parallelism (-j120) and audited the list of commands appearing more than once: make clean && make -j120 2>&1 | sort | uniq -c | sort -rn make -j120 check-world 2>&1 | sort | uniq -c | sort -rn Unlike the locking method, this doesn't fix clean-tree "make -j -C src/bin" or clean-tree "make -j installcheck-world". Thanks, nm
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Alexander KorotkovДата:
Сообщение: Re: GIN predicate locking slows down valgrind isolationtests tremendously