isolation test deadlocking on buildfarm member coypu

Поиск
Список
Период
Сортировка
От Rémi Zara
Тема isolation test deadlocking on buildfarm member coypu
Дата
Msg-id 4082D6E2-8965-4D90-A3AC-E1008DFEF23F@mac.com
обсуждение исходный текст
Ответы Re: isolation test deadlocking on buildfarm member coypu
Список pgsql-hackers
Hi,

Isolation tests seem to deadlock on buildfarm member coypu (NetBSD/powerpc 5.1).

Here are the process for one deadlock (several days) :

pgbuildfarm  2405  0.0  1.2 26948  7916 ?     Is   Wed02AM    0:00.25 postgres: pgbuildfarm isolationtest [local]
INSERTwaiting  
pgbuildfarm  6559  0.0  0.4 10288  2684 ?     I    Wed02AM    0:00.05 ./pg_isolation_regress
--psqldir=/home/pgbuildfarm/workdir/HEAD/inst/bin--inputdir=. --schedule=./isolation_schedule (pg_isolation_reg) 
pgbuildfarm  8296  0.0  1.8 25924 11880 ?     Ss   Wed02AM    0:59.79 postgres: writer process
pgbuildfarm 10630  0.0  0.4 16212  2916 ?     Ss   Wed02AM    1:43.16 postgres: stats collector process
pgbuildfarm 11814  0.0  0.7  9084  4408 ?     I    Wed01AM    0:00.39 /usr/pkg/bin/perl ./run_branches.pl --run-one
pgbuildfarm 14785  0.0  0.2  3344  1276 ?     I    Wed02AM    0:00.03 gmake NO_LOCALE=1 installcheck
pgbuildfarm 15353  0.0  0.8 26948  4960 ?     Ss   Wed02AM    1:10.55 postgres: autovacuum launcher process
pgbuildfarm 16619  0.0  1.8 15992 11512 ?     I    Wed01AM    0:04.90 /usr/pkg/bin/perl ./run_build.pl --config
build-farm.confHEAD  
pgbuildfarm 17532  0.0  0.5 25924  3052 ?     Ss   Wed02AM    0:34.49 postgres: wal writer process
pgbuildfarm 18073  0.0  1.0 25924  6844 ?     I    Wed02AM    0:19.83 /home/pgbuildfarm/workdir/HEAD/inst/bin/postgres
-Ddata-C  
pgbuildfarm 21310  0.0  1.2 26948  7948 ?     Is   Wed02AM    0:00.06 postgres: pgbuildfarm isolationtest [local] idle
intransaction  
pgbuildfarm 24137  0.0  0.2  3456  1064 ?     Is   Wed01AM    0:00.01 /bin/sh -c cd /home/pgbuildfarm/client-code &&
./run_branches.pl--run-one  
pgbuildfarm 26932  0.0  0.2  3456  1072 ?     I    Wed02AM    0:00.01 sh -c cd pgsql.16619/src/test/isolation && gmake
NO_LOCALE=1installcheck 2>&1  
pgbuildfarm 28430  0.0  1.3 26948  8816 ?     Ss   Wed02AM  408:43.91 postgres: pgbuildfarm isolationtest [local] idle
pgbuildfarm 28809  0.0  0.4 10476  2844 ?     R    Wed02AM   27:48.24 ./isolationtester dbname=isolationtest


I killed one of the postgres process, and it lead to a failure of the buildfarm perl script (probably due to the 2GB+
logfile), and persistance of the postgres cluster (which I had to clean manually) 
I've kept the source and install directories, if useful for investigation.


Here are the processes for a second lock (several hours):

pgbuildfarm  1423  0.0  0.4 16212  2888 ?     Ss    2:11PM    0:04.07 postgres: stats collector process
pgbuildfarm  1604  0.0  1.2 26948  7724 ?     Is    2:13PM    0:00.05 postgres: pgbuildfarm isolationtest [local] idle
intransaction  
pgbuildfarm  2380  0.0  0.4 10288  2624 ?     I     2:12PM    0:00.05 ./pg_isolation_regress
--psqldir=/home/pgbuildfarm/workdir/HEAD/inst/bin--inputdir=. --schedule=./isolation_schedule (pg_isolation_reg) 
pgbuildfarm  3156  0.0  0.8 26948  4920 ?     Ss    2:11PM    0:02.71 postgres: autovacuum launcher process
pgbuildfarm  3253  0.0  0.5 25924  3044 ?     Ss    2:11PM    0:01.65 postgres: wal writer process
pgbuildfarm  9889  0.0  1.8 25924 11856 ?     Ss    2:11PM    0:04.87 postgres: writer process
pgbuildfarm 11359  0.0  1.4 26948  9032 ?     Ss    2:13PM   15:16.63 postgres: pgbuildfarm isolationtest [local] idle
pgbuildfarm 13738  0.0  1.2 26948  7860 ?     Ss    2:13PM    0:00.06 postgres: pgbuildfarm isolationtest [local]
UPDATEwaiting  
pgbuildfarm 14031  0.0  1.0 25924  6800 ?     S     2:11PM    0:01.82 /home/pgbuildfarm/workdir/HEAD/inst/bin/postgres
-Ddata-C  
pgbuildfarm 16555  0.0  0.4 10476  2764 ?     S     2:13PM    1:01.77 ./isolationtester dbname=isolationtest
pgbuildfarm 18038  0.0  0.2  3456  1008 ?     I     2:11PM    0:00.01 sh -c cd pgsql.20707/src/test/isolation && gmake
NO_LOCALE=1installcheck 2>&1  
pgbuildfarm 19057  0.0  0.2  3456  1000 ?     Is    1:30PM    0:00.01 /bin/sh -c cd /home/pgbuildfarm/client-code &&
./run_branches.pl--run-one  
pgbuildfarm 20707  0.0  1.8 15992 11460 ?     I     1:30PM    0:04.92 /usr/pkg/bin/perl ./run_build.pl --config
build-farm.confHEAD  
pgbuildfarm 21749  0.0  0.2  3344  1248 ?     I     2:11PM    0:00.04 gmake NO_LOCALE=1 installcheck
pgbuildfarm 23054  0.0  0.7  9084  4352 ?     I     1:30PM    0:00.38 /usr/pkg/bin/perl ./run_branches.pl --run-one
pgbuildfarm  2125  0.0  0.0  2652     0 ttyp0 R+    5:09PM    0:00.00 grep pgbuildf (bash)
pgbuildfarm  9930  0.0  0.3  2652  2072 ttyp0 S     5:03PM    0:00.09 bash
pgbuildfarm 29194  0.0  0.1  3552   872 ttyp0 O+    5:09PM    0:00.01 ps -aux
pgbuildfarm@cube:workdir$date
Sat Jul 16 17:10:13 CEST 2011

The log fills with:

[4e21807d.2c5f:1060074] LOG:  execute isolationtester_waiting: SELECT 1 WHERE pg_stat_get_backend_waiting($1)
[4e21807d.2c5f:1060075] DETAIL:  parameters: $1 = '5'
[4e21807d.2c5f:1060076] LOG:  execute isolationtester_waiting: SELECT 1 WHERE pg_stat_get_backend_waiting($1)
[4e21807d.2c5f:1060077] DETAIL:  parameters: $1 = '5'

I killed process 1604 (postgres: pgbuildfarm isolationtest [local] idle in transaction). Again, the buildfarm
perlscriptkicked into action. See http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=coypu&dt=2011-07-16%2011%3A30%3A02
forthe logs 

Regards,

Rémi Zara


В списке pgsql-hackers по дате отправления:

Предыдущее
От: "ktm@rice.edu"
Дата:
Сообщение: Re: Mysterious server crashes
Следующее
От: Florian Pflug
Дата:
Сообщение: Re: Commitfest Status: Sudden Death Overtime