SIGUSR1 pingpong between master na autovacum launcher causes crash
From | Zdenek Kotala |
---|---|
Subject | SIGUSR1 pingpong between master na autovacum launcher causes crash |
Date | |
Msg-id | 1250860954.1239.114.camel@localhost |
Responses | Re: SIGUSR1 pingpong between master na autovacum launcher causes crash (Alvaro Herrera <alvherre@commandprompt.com>) |
List | pgsql-hackers |
I found the following core file of PG 8.4.0 on my system (Solaris Nevada b119):

fe8ae42d _dowrite (85bf6e8, 3a, 8035e3c, 80350e8) + 8d
fe8ae743 _ndoprnt (85bf6e8, 8035ec8, 8035e3c, 0) + 2ba
fe8b322d vsnprintf (85bfaf0, 3ff, 85bf6e8, 8035ec8, 0, 0) + 65
082194ea appendStringInfoVA (8035e9c, 85bf6e8, 8035ec8) + 4a
083ca5d3 errmsg (849c340, 0) + 103
0829272d StartAutoVacWorker (fe97f000, 32, 85b82b0, 8035ef4, 82a1496, c) + 3d
082a1901 StartAutovacuumWorker (c, 8035f08, fe8ed28f, 10, 0, 8035fbc) + 71
082a1496 sigusr1_handler (10, 0, 8035fbc) + 186
fe8ed28f __sighndlr (10, 0, 8035fbc, 82a1310) + f
fe8e031f call_user_handler (10) + 2af
fe8e054f sigacthandler (10, 0, 8035fbc) + df
--- called from signal handler with signal 16 (SIGUSR1) ---
fe8f37f6 __systemcall (3, fec32b88, 0, fe8e0b46) + 6
fe8e0c71 thr_sigsetmask (3, 85abd50, 0, fe8e0d18) + 139
fe8e0d3f sigprocmask (3, 85abd50, 0) + 31
082a14a4 sigusr1_handler (10, 0, 8036340) + 194
fe8ed28f __sighndlr (10, 0, 8036340, 82a1310) + f
fe8e031f call_user_handler (10) + 2af
fe8e054f sigacthandler (10, 0, 8036340) + df
... 80x same sighandler stack
--- called from signal handler with signal 16 (SIGUSR1) ---
fe8f37f6 __systemcall (3, fec32b88, 0, fe8e0b46) + 6
fe8e0c71 thr_sigsetmask (3, 85abd50, 0, fe8e0d18) + 139
fe8e0d3f sigprocmask (3, 85abd50, 0) + 31
082a14a4 sigusr1_handler (10, 0, 80478fc) + 194
fe8ed28f __sighndlr (10, 0, 80478fc, 82a1310) + f
fe8e031f call_user_handler (10) + 2af
fe8e054f sigacthandler (10, 0, 80478fc) + df
--- called from signal handler with signal 16 (SIGUSR1) ---
fe8f1867 __pollsys (8047b50, 2, 8047c04, 0) + 7
fe89ce61 pselect (6, 8047c44, 0, 0, 8047c04, 0) + 199
fe89d236 select (6, 8047c44, 0, 0, 8047c38, 0) + 78
0829dc20 ServerLoop (feffb804, bd26003b, 41b21fcb, 85c1de0, 1, 0) + c0
0829d5d0 PostmasterMain (3, 85b72c8) + dd0
08227abf main (3, 85b72c8, 8047df0, 8047d9c) + 22f
080b893d _start (3, 8047e80, 8047ea5, 8047ea8, 0, 8047ec2) + 7d

The problem I see here is that StartAutovacuumWorker() fails and sends SIGUSR1 to the postmaster, but it sends it too quickly, while the signal handler is still active. When the signal mask is unblocked in sigusr1_handler(), the signal handler is run again...

The reason why StartAutovacuumWorker() fails is interesting. The log says:

LOG: could not fork autovacuum worker process: Not enough space

It is strange and I don't understand it. Maybe too many nested signal handler calls could cause it. It is also strange that the 100 ms sleep is not enough to protect against this situation, but I think the sleep could be interrupted by a signal.

My suggestion is to set, for example, gotUSR1 = true in sigusr1_handler() and to check in the server loop whether we got a USR1 signal. That avoids any problems with the signal handler, which is currently not POSIX compliant anyway.

Any other ideas?

Zdenek
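
For illustration, here is a minimal standalone sketch of what the flag-based approach suggested above might look like. It is not postmaster code. Only the name gotUSR1 comes from the message; sigusr1_flag_handler, install_sigusr1_flag_handler, process_usr1_request, server_loop_iteration and the work_runs counter are hypothetical names made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Flag set by the handler; the name follows the gotUSR1 suggestion above. */
static volatile sig_atomic_t gotUSR1 = 0;

/* Hypothetical counter: how many times the deferred work has run. */
int work_runs = 0;

/* The handler only records that SIGUSR1 arrived; it does no real work. */
void
sigusr1_flag_handler(int signo)
{
    (void) signo;
    gotUSR1 = 1;
}

/* Install the handler. Returns 0 on success, -1 on failure. */
int
install_sigusr1_flag_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigusr1_flag_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Hypothetical stand-in for the real work, e.g. starting a worker. */
void
process_usr1_request(void)
{
    work_runs++;
}

/*
 * One pass of a hypothetical server loop: the work runs here, in normal
 * context, never inside the handler. Returns 1 if a request was processed.
 */
int
server_loop_iteration(void)
{
    if (!gotUSR1)
        return 0;

    /* Clear the flag first so a signal arriving during the work is kept. */
    gotUSR1 = 0;
    process_usr1_request();
    return 1;
}
```

In this sketch the handler never calls anything that is not async-signal-safe, and handler invocations cannot nest into deep stacks because each one returns immediately. Several signals arriving between two loop passes collapse into a single pass, so the approach assumes the real work can tolerate that.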