Re: Streaming replication - unable to stop the standby
From | Stefan Kaltenbrunner |
---|---|
Subject | Re: Streaming replication - unable to stop the standby |
Date | |
Msg-id | 4BDF1458.1040807@kaltenbrunner.cc |
In response to | Re: Streaming replication - unable to stop the standby (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Streaming replication - unable to stop the standby (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>> I'm currently testing SR/HS in 9.0beta1 and I noticed that it seems
>> quite easy to end up in a situation where you have a standby that seems
>> to be stuck in:
>
>> $ psql -p 5433
>> psql: FATAL: the database system is shutting down
>
>> but not actually shutting down ever. I ran into that a few times now
>> (mostly because I'm trying to chase a recovery issue I hit during
>> earlier testing) by simply having the master iterate between a pgbench
>> run and "idle" while simply doing pg_ctl restart in a loop on the standby.
>> I do vaguely recall some discussions of that but I thought the issue got
>> settled somehow?
>
> Hm, I haven't pushed this hard but "pg_ctl stop" seems to stop the
> standby for me. Which subprocesses of the slave postmaster are still
> around? Could you attach to them with gdb and get stack traces?

It is not always failing to shut down - it only fails sometimes. I have not yet pinpointed exactly what is causing this, but the standby is in a weird state now:

* the master is currently idle
* the standby has no connections at all

Logs from the standby:

    FATAL: the database system is shutting down
    FATAL: the database system is shutting down
    FATAL: replication terminated by primary server
    LOG: restored log file "000000010000001900000054" from archive
    cp: cannot stat `/mnt/space/wal-archive/000000010000001900000055': No such file or directory
    LOG: record with zero length at 19/55000078
    cp: cannot stat `/mnt/space/wal-archive/000000010000001900000055': No such file or directory
    FATAL: could not connect to the primary server: could not connect to server: Connection refused
        Is the server running on host "localhost" and accepting
        TCP/IP connections on port 5432?
    could not connect to server: Connection refused
        Is the server running on host "localhost" and accepting
        TCP/IP connections on port 5432?
    cp: cannot stat `/mnt/space/wal-archive/000000010000001900000055': No such file or directory
    cp: cannot stat `/mnt/space/wal-archive/000000010000001900000055': No such file or directory
    LOG: streaming replication successfully connected to primary
    FATAL: the database system is shutting down

The first two "FATAL: the database system is shutting down" messages are from me trying to connect using psql after I noticed that pg_ctl had failed to shut down the slave. The next thing I tried was restarting the master - which led to the following logs and the standby noticing that and reconnecting, but you still cannot actually connect...

The process tree for the standby is:

    29523 pts/2 S  0:00 /home/postgres9/pginst/bin/postgres -D /mnt/space/pgdata_standby
    29524 ?     Ss 0:06  \_ postgres: startup process waiting for 000000010000001900000055
    29529 ?     Ss 0:00  \_ postgres: writer process
    29835 ?     Ss 0:00  \_ postgres: wal receiver process streaming 19/55000078

Stefan
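One way to collect the stack traces Tom asks for is to run gdb in batch mode against each child of the standby postmaster. The following is only a minimal sketch and not something from the thread: it assumes `ps` output in the format shown above, and the helper names `child_pids` and `gdb_commands` are hypothetical. It only builds the command lines; actually running them requires gdb to be installed and permission to attach to the processes.

```python
import re

# Sample input: the standby process tree quoted in the message above.
PS_OUTPUT = r"""
29523 pts/2 S 0:00 /home/postgres9/pginst/bin/postgres -D /mnt/space/pgdata_standby
29524 ? Ss 0:06 \_ postgres: startup process waiting for 000000010000001900000055
29529 ? Ss 0:00 \_ postgres: writer process
29835 ? Ss 0:00 \_ postgres: wal receiver process streaming 19/55000078
"""

# A child line starts with the PID and contains the "\_ postgres: <title>" marker.
CHILD_LINE = re.compile(r"^\s*(\d+)\s.*\\_ postgres: (.+?)\s*$")


def child_pids(ps_text):
    """Return (pid, process title) pairs for the postmaster's children.

    Hypothetical helper: assumes forest-style ps output as in PS_OUTPUT.
    """
    pairs = []
    for line in ps_text.splitlines():
        match = CHILD_LINE.match(line)
        if match:
            pairs.append((int(match.group(1)), match.group(2)))
    return pairs


def gdb_commands(pairs):
    """Build one non-interactive gdb call per process that prints a backtrace."""
    return [f"gdb -p {pid} -batch -ex bt" for pid, _title in pairs]


if __name__ == "__main__":
    for (pid, title), cmd in zip(child_pids(PS_OUTPUT), gdb_commands(child_pids(PS_OUTPUT))):
        # Print the title as a comment so each trace can be matched to its process.
        print(f"# {title}")
        print(cmd)
```

The postmaster itself (the first line, without the `\_` marker) is deliberately skipped here, since the question was about its subprocesses; it could be traced the same way by passing its PID to gdb directly.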