Re: str replication failed, restart fixed it

From: Willy-Bas Loos
Subject: Re: str replication failed, restart fixed it
Msg-id: CAHnozTg4PAR-JLjwp++CNPDc2sM=VWhQfvD-=K_0XqpE2C1JOA@mail.gmail.com
In reply to: str replication failed, restart fixed it (Willy-Bas Loos <willybas@gmail.com>)
List: pgsql-general

This is very probably an OpenVZ issue; it can be solved by lowering shared_buffers considerably.
The restart works because the server is in fact down. I think pg_lsclusters showed "online" because of a stale runfile.
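
As a rough illustration of the stale-runfile theory, the sketch below checks whether the PID recorded in a PID file still belongs to a live process. It relies on the first line of PostgreSQL's postmaster.pid being the postmaster PID. The function name pid_file_is_stale and any path passed to it are assumptions made for this example; it does not claim to reproduce how pg_lsclusters decides that a cluster is online.

```python
import os
from pathlib import Path


def pid_file_is_stale(pid_file: str) -> bool:
    """Return True if pid_file names a process that no longer exists.

    Hypothetical helper for illustration; POSIX only.
    """
    path = Path(pid_file)
    if not path.exists():
        return False  # no PID file at all, so nothing can be stale

    lines = path.read_text().splitlines()
    first_line = lines[0].strip() if lines else ""
    if not first_line.isdigit() or int(first_line) <= 0:
        return True  # no usable PID recorded: treat the file as stale

    pid = int(first_line)
    try:
        os.kill(pid, 0)  # signal 0 checks for existence, delivers nothing
    except ProcessLookupError:
        return True  # no such process: the file was left behind
    except PermissionError:
        return False  # process exists but belongs to another user
    return False
```

Calling pid_file_is_stale on the cluster's postmaster.pid (the location depends on the data directory, which is not stated in this thread) would return True if the postmaster died without removing the file. A recycled PID can make a dead server look alive, so this is only a first check.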

I was hoping that the memory allocation improvements in PostgreSQL 9.3 would solve these issues, but this post makes me think that they won't:
http://www.postgresql.org/message-id/CAHyXU0xa5EgvjeH=4vp-eZDJdS5kMQuiDivvTRLjY-uZ62Y44w@mail.gmail.com

Does anyone know of a solution?

Cheers,

WBL


On Wed, Feb 26, 2014 at 10:53 AM, Willy-Bas Loos <willybas@gmail.com> wrote:
Hi,

I had a problem today and I fixed it by restarting postgres.
That doesn't make sense to me. What could have been going on?

This is the log:
2014-02-26 04:30:45 CET db: ip: us: FATAL:  could not send data to WAL stream: SSL error: sslv3 alert unexpected message
       
cp: cannot stat `/data/postgresql/9.1/main/wal_archive/000000010000006400000062': No such file or directory
2014-02-26 04:30:45 CET db: ip: us: LOG:  unexpected pageaddr 64/3FBC6000 in log file 100, segment 98, offset 12345344
cp: cannot stat `/data/postgresql/9.1/main/wal_archive/000000010000006400000062': No such file or directory
2014-02-26 04:30:45 CET db: ip: us: LOG:  streaming replication successfully connected to primary
2014-02-26 04:32:09 CET db: ip: us: LOG:  startup process (PID 5385) was terminated by signal 7: Bus error
2014-02-26 04:32:09 CET db: ip: us: LOG:  terminating any other active server processes
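
For readers decoding the log above, here is a minimal sketch of how the WAL file name relates to the "log file 100, segment 98" message. It assumes the pre-9.3 naming scheme (8 hex digits each for timeline, log file and segment) and the default 16 MB segment size; the function names are made up for this example.

```python
from typing import Tuple

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments


def parse_wal_name(name: str) -> Tuple[int, int, int]:
    """Split a 24-character WAL file name into (timeline, log file, segment)."""
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL file name")
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    segment = int(name[16:24], 16)
    return timeline, log_id, segment


def page_address(log_id: int, segment: int, offset: int) -> str:
    """Format the WAL location of a byte offset within a segment."""
    return f"{log_id:X}/{segment * WAL_SEGMENT_SIZE + offset:X}"


print(parse_wal_name("000000010000006400000062"))  # (1, 100, 98)
print(page_address(100, 98, 12345344))             # 64/62BC6000
```

So the file that cp could not find in wal_archive is the same segment the startup process was reading. At offset 12345344 the page would be expected to carry address 64/62BC6000, while the log reports 64/3FBC6000. That would be consistent with the standby having reached the end of valid WAL in a recycled segment, although the log alone does not prove it.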

The cluster was "online" according to pg_lsclusters, but it was not possible to connect to it:
psql: could not connect to server: No such file or directory
    Is the server running locally and accepting
    connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

uptime tells me this:
postgres@myserver:~$ uptime
 10:47:27 up 89 days, 42 min,  1 user,  load average: 0.00, 0.00, 0.00

This is PostgreSQL 9.1 on Ubuntu 12.04 on OpenVZ.

The weirdest thing is that restarting the postgres cluster fixed it.
Does this make any sense to you?

Cheers,

WBL
--
"Quality comes from focus and clarity of purpose" -- Mark Shuttleworth



