Re: str replication failed, restart fixed it

From	Willy-Bas Loos
Subject	Re: str replication failed, restart fixed it
Date	26 February 2014, 15:08:01
Msg-id	CAHnozTg4PAR-JLjwp++CNPDc2sM=VWhQfvD-=K_0XqpE2C1JOA@mail.gmail.com
In reply to	str replication failed, restart fixed it (Willy-Bas Loos <willybas@gmail.com>)
List	pgsql-general

This is very probably an OpenVZ issue; it can be solved by lowering shared_buffers considerably.
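
For anyone who wants to confirm that the container is hitting an OpenVZ limit: the kernel exposes per-container counters in /proc/user_beancounters, and a non-zero failcnt means an allocation was refused. Below is a minimal Python sketch that picks those out. It assumes the usual column layout (resource, held, maxheld, barrier, limit, failcnt); the function name is made up for illustration.

```python
def failed_beancounters(text):
    """Return {resource: failcnt} for every counter whose failcnt is non-zero."""
    failures = {}
    for line in text.splitlines():
        fields = line.split()
        # The first counter line of a container starts with "<uid>:"; drop it.
        if fields and fields[0].endswith(":"):
            fields = fields[1:]
        # Counter lines have six columns and the last five are integers;
        # this also skips the version and header lines.
        if len(fields) != 6 or not all(f.isdigit() for f in fields[1:]):
            continue
        resource, failcnt = fields[0], int(fields[5])
        if failcnt > 0:
            failures[resource] = failcnt
    return failures
```

The function takes the file contents as a string, so it would be fed something like open("/proc/user_beancounters").read(). That file is normally readable by root only.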

The restart worked because the server was in fact down. I think pg_lsclusters showed "online" because of a stale runfile.
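
I have not checked which file pg_lsclusters actually consults, so treat this as an illustration of the general idea only: a PID file is stale when the process it names no longer exists. The sketch assumes the PID is on the first line of the file (as it is in postmaster.pid); the function name is hypothetical.

```python
import os


def pid_file_is_stale(pid_file):
    """Return True if pid_file names a process that no longer exists."""
    with open(pid_file) as f:
        pid = int(f.readline().strip())  # assumes the PID is on the first line
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return True  # no such process: the file is stale
    except PermissionError:
        return False  # the process exists but belongs to another user
    return False
```

A PID can be reused by an unrelated process, so a False result does not prove that the postmaster itself is still alive.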

I was hoping that the memory allocation improvements in postgres 9.3 would solve these issues, but this post makes me think that they won't:
http://www.postgresql.org/message-id/CAHyXU0xa5EgvjeH=4vp-eZDJdS5kMQuiDivvTRLjY-uZ62Y44w@mail.gmail.com

Does anyone know of a solution?

Cheers,

WBL

On Wed, Feb 26, 2014 at 10:53 AM, Willy-Bas Loos <willybas@gmail.com> wrote:

Hi,

I had a problem today and I fixed it by restarting postgres.
That doesn't make sense to me. What could have been going on?

This is the log:
2014-02-26 04:30:45 CET db: ip: us: FATAL: could not send data to WAL stream: SSL error: sslv3 alert unexpected message

cp: cannot stat `/data/postgresql/9.1/main/wal_archive/000000010000006400000062': No such file or directory
2014-02-26 04:30:45 CET db: ip: us: LOG: unexpected pageaddr 64/3FBC6000 in log file 100, segment 98, offset 12345344
cp: cannot stat `/data/postgresql/9.1/main/wal_archive/000000010000006400000062': No such file or directory
2014-02-26 04:30:45 CET db: ip: us: LOG: streaming replication successfully connected to primary
2014-02-26 04:32:09 CET db: ip: us: LOG: startup process (PID 5385) was terminated by signal 7: Bus error
2014-02-26 04:32:09 CET db: ip: us: LOG: terminating any other active server processes
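
The line that matters in a log like this is the one where a process "was terminated by signal". Here is a rough Python sketch for pulling such lines out of a larger log, assuming the message wording shown above. The function and pattern names are my own, and the signal-name lookup depends on the platform the script runs on.

```python
import re
import signal

# Matches e.g. "startup process (PID 5385) was terminated by signal 7: Bus error"
CRASH_RE = re.compile(r"\(PID (\d+)\) was terminated by signal (\d+)")


def find_signal_crashes(log_lines):
    """Return (pid, signal number, signal name) for each signal-killed process."""
    crashes = []
    for line in log_lines:
        m = CRASH_RE.search(line)
        if not m:
            continue
        pid, signum = int(m.group(1)), int(m.group(2))
        try:
            name = signal.Signals(signum).name  # platform-specific numbering
        except ValueError:
            name = "unknown"
        crashes.append((pid, signum, name))
    return crashes
```

Run on Linux against the log above, this reports PID 5385 and signal 7, which is SIGBUS there.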

The cluster was "online" according to pg_lsclusters, but it was not possible to connect to it:
psql: could not connect to server: No such file or directory
    Is the server running locally and accepting
    connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
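
The same check psql performs here can be done without psql: try to connect to the Unix domain socket and see whether anything answers. This is a minimal Python sketch; the function name is hypothetical, and it only tests that something is listening, not that it is a healthy PostgreSQL server.

```python
import socket


def unix_socket_accepts(path, timeout=2.0):
    """Return True if something is listening on the Unix domain socket at path."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        return True
    except OSError:
        # Covers a missing socket file, a refused connection and a timeout.
        return False
    finally:
        s.close()
```

With the server in the state described, unix_socket_accepts("/var/run/postgresql/.s.PGSQL.5432") would return False, because the socket file does not exist.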

uptime tells me this:
postgres@myserver:~$ uptime
10:47:27 up 89 days, 42 min, 1 user, load average: 0.00, 0.00, 0.00

This is PostgreSQL 9.1 on Ubuntu 12.04 on OpenVZ.

The weirdest thing is that restarting the postgres cluster fixed it.
Does this make any sense to you?

Cheers,

WBL
--
"Quality comes from focus and clarity of purpose" -- Mark Shuttleworth

