Re: src/test/recovery regression failure on bionic
From | Tom Lane |
---|---|
Subject | Re: src/test/recovery regression failure on bionic |
Date | |
Msg-id | 1462.1578522666@sss.pgh.pa.us |
In response to | Re: src/test/recovery regression failure on bionic (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: src/test/recovery regression failure on bionic |
List | pgsql-hackers |
I wrote:
> This would happen if anything is causing the postmaster to have
> a few more open files than the test added by commit
> d207038053837ae9365df2776371632387f6f655 is allowing for. It's
> a test bug and nothing more.

> Why sidewinder is not showing this in HEAD too is an interesting
> question, but it isn't. However, it could be that on another
> platform (ie bionic) the problem does manifest in HEAD.

I set up a NetBSD 7 installation locally, and while I have not directly reproduced the failure, I believe I understand all the components of it now.

(1) d20703805's test will clearly fall over if there are more than six FDs open in the postmaster when set_max_safe_fds is called, because it sets max_files_per_process = 26 while set_max_safe_fds requires at least 20 usable FDs to be available.

(2) The postmaster's stdin/stdout/stderr will surely eat up three of those.

(3) In HEAD, that's actually all the FDs there are normally, but in the back branches there is one more (under the conditions of this test), because in the back branches we open the postmaster's listen sockets before we run set_max_safe_fds. (9a86f03b4 changed this.)

(4) NetBSD 7.0's cron leaves three extra open FDs in processes that it spawns. I have not looked into why, but I have experimentally observed this.
For example, lsof on a "sleep" launched from cron shows

COMMAND  PID USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
sleep   7824  tgl  cwd   VDIR                0,0      512  795201 /home/tgl
sleep   7824  tgl  txt   VREG                0,0    10431 1613152 /bin/sleep
sleep   7824  tgl  txt   VREG                0,0  1616564   22726 /lib/libc.so.12.193.1
sleep   7824  tgl  txt   VREG                0,0    55295   22747 /lib/libgcc_s.so.1.0
sleep   7824  tgl  txt   VREG                0,0   187183   22762 /lib/libm.so.0.11
sleep   7824  tgl  txt   VREG                0,0    92195 1499524 /libexec/ld.elf_so
sleep   7824  tgl    0r  PIPE 0xfffffe803131eb58    16384
sleep   7824  tgl    1w  PIPE 0xfffffe8007ec4a30        0         ->0xfffffe800cc0d2c0
sleep   7824  tgl    2w  PIPE 0xfffffe8007ec4a30        0         ->0xfffffe800cc0d2c0
sleep   7824  tgl    7u  unknown                                  file system type: 0
sleep   7824  tgl    8u  unknown                                  file system type: 0
sleep   7824  tgl    9w  PIPE 0xfffffe80036c4dc0        0

while of course "sleep" launched by hand has only 0/1/2 open.

We may conclude that when the regression tests are launched from cron, as would be typical for a buildfarm animal, HEAD has exactly zero FDs to spare in this test, while the back branches are one FD underwater and fail. This matches the observed results from sidewinder.

It's not clear whether any of this info applies to Christoph's trouble with bionic. If the extra FDs are an old cron bug, it could be that bionic shares that bug --- but to explain failure on HEAD, you'd have to posit four excess FDs not three. I'm not convinced that what Christoph is seeing matches this anyway; he hasn't shown the telltale "insufficient file descriptors" message, at least. Still, maybe launched-by-cron vs launched-by-hand is a relevant point there.

regards, tom lane
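
The FD accounting in points (1) to (4) can be tallied with a short script. This is only a sketch: the numbers (26, 20, three stdio FDs, one listen socket, three cron leftovers) come from the message, but the headroom formula is a simplified model of what the message describes, not a copy of the logic in set_max_safe_fds. The scenario labels and the helper names `fd_headroom` and `count_open_fds` are made up for illustration.

```python
import os

# Figures quoted in the message; the formula below is a simplified model,
# not the actual code in PostgreSQL's fd.c.
MAX_FILES_PER_PROCESS = 26  # set by the test from commit d20703805
MIN_USABLE_FDS = 20         # what set_max_safe_fds needs, per the message

STDIO = 3          # stdin/stdout/stderr
LISTEN_SOCKET = 1  # back branches only, under this test's conditions
CRON_EXTRA = 3     # observed on NetBSD 7.0 (FDs 7, 8, 9 in the lsof output)

# Hypothetical scenario labels -> FDs already open in the postmaster.
SCENARIOS = {
    "HEAD, by hand": STDIO,
    "HEAD, from cron": STDIO + CRON_EXTRA,
    "back branch, by hand": STDIO + LISTEN_SOCKET,
    "back branch, from cron": STDIO + LISTEN_SOCKET + CRON_EXTRA,
}


def fd_headroom(already_open: int) -> int:
    """FDs to spare under the test's settings; negative means failure."""
    return MAX_FILES_PER_PROCESS - already_open - MIN_USABLE_FDS


def count_open_fds(probe_limit: int = 256) -> int:
    """Count descriptors open in this process by probing each with fstat."""
    count = 0
    for fd in range(probe_limit):
        try:
            os.fstat(fd)
        except OSError:
            continue  # not an open descriptor
        count += 1
    return count


if __name__ == "__main__":
    for name, already_open in SCENARIOS.items():
        print(f"{name}: {already_open} open, headroom {fd_headroom(already_open)}")
    print(f"this process currently has {count_open_fds()} open FDs")
```

Running the script once by hand and once from a cron job, then comparing the last line of output, is one way to check whether a given cron leaves extra descriptors open. The interpreter may hold descriptors of its own, so only the difference between the two runs is meaningful.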