Re: URGENT issue: pg-xlog growing on master!

From Niels Kristian Schjødt
Subject Re: URGENT issue: pg-xlog growing on master!
Date
Msg-id 4DB42015-B9C6-4019-9A4D-68E356C8C947@autouncle.com
In reply to Re: URGENT issue: pg-xlog growing on master!  (bricklen <bricklen@gmail.com>)
Responses Re: URGENT issue: pg-xlog growing on master!  (bricklen <bricklen@gmail.com>)
Re: URGENT issue: pg-xlog growing on master!  (Jeff Janes <jeff.janes@gmail.com>)
Re: URGENT issue: pg-xlog growing on master!  (Matheus de Oliveira <matioli.matheus@gmail.com>)
List pgsql-performance

On 10/06/2013 at 16:36, bricklen <bricklen@gmail.com> wrote:

On Mon, Jun 10, 2013 at 4:29 AM, Niels Kristian Schjødt <nielskristian@autouncle.com> wrote:

2013-06-10 11:21:45 GMT FATAL:  could not connect to the primary server: could not connect to server: No route to host
                Is the server running on host "192.168.0.4" and accepting
                TCP/IP connections on port 5432?

Did anything get changed on the standby or master around the time this message started occurring?
On the master, what do the following show?
show port;
show listen_addresses;

The master's IP is still 192.168.0.4?

Have you tried connecting to the master using something like:
psql -h 192.168.0.4 -p 5432 -U postgres -d postgres
 
Does that throw a useful error or warning?


It turned out that the switch port the server was connected to was faulty, so no connection between master and slave could be established. This resulted in pg_xlog building up very fast, because our system performs a lot of changes on the data we store.
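A minimal sketch for keeping an eye on how far pg_xlog has grown, assuming WAL segments use the standard 24-hex-character file names and the default 16 MB segment size. The function names and the directory argument are hypothetical; the functions only read the directory and change nothing.

```shell
# Count WAL segment files (names made of exactly 24 hex characters)
# in the pg_xlog directory given as $1. Read-only.
count_wal_segments() {
    ls -1 "$1" 2>/dev/null | grep -E -c '^[0-9A-F]{24}$' || true
}

# Rough size estimate in MB, assuming the default 16 MB per segment.
estimate_wal_mb() {
    echo $(( $(count_wal_segments "$1") * 16 ))
}
```

It could be called as `estimate_wal_mb /path/to/data/pg_xlog`, where the path is a placeholder for the master's actual data directory.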

I ended up running pg_archivecleanup on the master to free some space urgently. Then I had the switch replaced with a new one. Now I'm trying to set up streaming replication from scratch again, but with no luck.

I can't figure out which steps I need to take to get the standby server wiped and started as a streaming replica again from scratch. I tried to follow the steps, from step 6 onwards, at http://wiki.postgresql.org/wiki/Streaming_Replication but the process fails when I reach the point where I run psql -c "SELECT pg_stop_backup()". It just says:

NOTICE:  pg_stop_backup cleanup done, waiting for required WAL segments to be archived
WARNING:  pg_stop_backup still waiting for all required WAL segments to be archived (60 seconds elapsed)
HINT:  Check that your archive_command is executing properly.  pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
WARNING:  pg_stop_backup still waiting for all required WAL segments to be archived (120 seconds elapsed)
HINT:  Check that your archive_command is executing properly.  pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
WARNING:  pg_stop_backup still waiting for all required WAL segments to be archived (240 seconds elapsed)
HINT:  Check that your archive_command is executing properly.  pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
WARNING:  pg_stop_backup still waiting for all required WAL segments to be archived (480 seconds elapsed)
HINT:  Check that your archive_command is executing properly.  pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
WARNING:  pg_stop_backup still waiting for all required WAL segments to be archived (960 seconds elapsed)
HINT:  Check that your archive_command is executing properly.  pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
WARNING:  pg_stop_backup still waiting for all required WAL segments to be archived (1920 seconds elapsed)
HINT:  Check that your archive_command is executing properly.  pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
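For orientation, the base-backup sequence being attempted here (start backup, copy the data directory, stop backup) can be sketched as a dry run that only prints the commands. This is a sketch under assumptions, not a verified procedure for this setup: the function name, the backup label 'reseed' and all three arguments are hypothetical placeholders.

```shell
# Dry run: print (do not execute) the base-backup sequence for re-seeding a standby.
# All three arguments are hypothetical placeholders supplied by the caller.
print_reseed_steps() {
    master_pgdata="$1"   # data directory on the master
    standby_host="$2"    # hostname or IP of the standby
    standby_pgdata="$3"  # data directory on the standby
    # 1. Mark the start of the backup on the master (true = immediate checkpoint).
    echo "psql -c \"SELECT pg_start_backup('reseed', true)\""
    # 2. Copy the data directory while the backup is in progress.
    echo "rsync -ac ${master_pgdata}/ ${standby_host}:${standby_pgdata}/ --exclude postmaster.pid"
    # 3. End the backup; with archiving enabled this waits for the required WAL to be archived.
    echo "psql -c \"SELECT pg_stop_backup()\""
}
```

The third step is the one that hangs in the output above: with archiving enabled, pg_stop_backup() does not return until archive_command has succeeded for every required segment.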

When looking at ps aux on the master, I see the following:

postgres 30930  0.0  0.0  98412  1632 ?        Ss   15:59   0:02 postgres: archiver process   failed on 0000000200000E1B000000A9

The file mentioned is the one it was about to archive when the standby server failed. Somehow it must still be trying to "catch up" from that file, which of course isn't there any more, since I had to remove those files to free up space on the HDD. Instead of trying to catch up from the last successfully archived file, I want it to start the replication over from scratch - I just don't know how.
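PostgreSQL tracks segments that are still waiting to be archived as `<segment>.ready` files under pg_xlog/archive_status, and renames them to `.done` once archive_command succeeds. A minimal read-only sketch for listing what the archiver still considers pending follows; the function name and the directory argument are hypothetical.

```shell
# List WAL segments still marked as waiting for archiving, i.e. those with a
# <segment>.ready marker in archive_status under the pg_xlog directory in $1.
# Read-only: nothing is created, renamed or deleted.
pending_archive_segments() {
    status_dir="$1/archive_status"
    for f in "$status_dir"/*.ready; do
        [ -e "$f" ] || continue   # glob matched nothing
        basename "$f" .ready
    done
}
```

If the segment named in the archiver's process title shows up in this list while the segment file itself is gone from pg_xlog, that would be consistent with the archiver retrying a file that no longer exists.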


