Re: anole: assorted stability problems
From | Andres Freund |
---|---|
Subject | Re: anole: assorted stability problems |
Date | |
Msg-id | 20150526030551.GU32396@alap3.anarazel.de |
In response to | anole: assorted stability problems (Alvaro Herrera <alvherre@2ndquadrant.com>) |
List | pgsql-hackers |
On 2015-05-20 16:21:57 -0300, Alvaro Herrera wrote:
> In HEAD only. Previous branches seem mostly clean, so there's something
> going wrong. Spinlocks going wrong perhaps?
>
> http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=anole&dt=2015-05-20%2016%3A30%3A26&stg=check
> ! PANIC: stuck spinlock (c00000000d6f4140) detected at lwlock.c:816
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2015-05-09%2020%3A30%3A29
> ! PANIC: semop(id=0) failed: Result too large
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost
>
> http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=anole&dt=2015-05-05%2018%3A39%3A38&stg=check
> ! FATAL: semop(id=0) failed: File too large
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2015-05-03%2012%3A30%3A18
> ! PANIC: semop(id=-1073741824) failed: Invalid argument
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2015-04-29%2004%3A30%3A25
> ! PANIC: stuck spinlock (c00000000d335360) detected at lwlock.c:767
> ! server closed the connection unexpectedly
> ! This probably means the server terminated abnormally
> ! before or while processing the request.
> ! connection to server was lost

And now:

! FATAL: semop(id=-2013265921) failed: Invalid argument
! CONTEXT: SQL statement "CREATE TEMP TABLE brin_result (cid tid)"
! PL/pgSQL function inline_code_block line 20 at SQL statement
! server closed the connection unexpectedly
! This probably means the server terminated abnormally
! before or while processing the request.
! connection to server was lost

Uhm:

    void
    s_init_lock_sema(volatile slock_t *lock)
    {
        static int counter = 0;

        *lock = (++counter) % NUM_SPINLOCK_SEMAPHORES;
    }

One problem here might be that counter is signed. Once s_init_lock_sema
has been called often enough for counter to wrap around, strange things
will happen.

But - I don't see why this codepath would even be hit once on this
platform? It's only built with !HAVE_SPINLOCKS, which isn't the case on
anole.

So this appears to be an independent bug (9.4+). One that has led me to
find an atomics bug (9.5+, stupid forgotten codepath for atomics on
spinlocks on semaphores) - which again should be independent, because it
again is only relevant when spinlocks aren't used...

I'll fix both. But that leaves this problem.
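
To illustrate the signedness concern, here is a minimal sketch, not the actual fix. It passes the counter value in explicitly so that no signed overflow (undefined behavior in C) occurs, and it assumes a value of 1024 for NUM_SPINLOCK_SEMAPHORES purely for illustration. The helper names slot_signed, slot_unsigned and demo_checks are hypothetical and do not exist in PostgreSQL.

```c
#include <assert.h>
#include <limits.h>

/* Assumed value, for illustration only. */
#define NUM_SPINLOCK_SEMAPHORES 1024

/*
 * Same arithmetic as the quoted function, but with the counter value
 * passed in.  In C99 and later, % truncates toward zero, so a negative
 * counter yields a negative (or zero) result, which is not a usable index.
 */
static int
slot_signed(int counter)
{
    return counter % NUM_SPINLOCK_SEMAPHORES;
}

/*
 * Unsigned variant: wraparound is well defined (modulo UINT_MAX + 1) and
 * the result always lies in [0, NUM_SPINLOCK_SEMAPHORES).
 */
static unsigned int
slot_unsigned(unsigned int counter)
{
    return counter % NUM_SPINLOCK_SEMAPHORES;
}

/* Hypothetical self-check showing the difference between the two. */
static void
demo_checks(void)
{
    assert(slot_signed(5) == 5);
    assert(slot_signed(-5) == -5);              /* negative "index" */
    assert(slot_unsigned(UINT_MAX) < NUM_SPINLOCK_SEMAPHORES);
    assert(slot_unsigned(UINT_MAX + 1u) == 0u); /* wraps cleanly to 0 */
}
```

Calling demo_checks() from a test program exercises both variants. The sketch only covers the modulo arithmetic; as the message itself notes, this code path should not be built on anole, so it does not explain the failures reported there.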