Re: Re: "missing chunk number XX for toast value YY in pg_toast ..." after pg_basebackup.

Поиск

Список

Период

Сортировка

От
Тема	Re: Re: "missing chunk number XX for toast value YY in pg_toast ..." after pg_basebackup.
Дата	9 марта 2016 г. 18:42:01
Msg-id	1457548913211.38205.2418@webmail7 обсуждение
Ответ на	Re: Re: "missing chunk number XX for toast value YY in pg_toast ..." after pg_basebackup. (Adrian Klaver <adrian.klaver@aklaver.com>)
Ответы	Re: Re: "missing chunk number XX for toast value YY in pg_toast ..." after pg_basebackup.
Список	pgsql-general

Дерево обсуждения

Hi Adrian,

thank you very much for your patience. I apologise for the missing information.

On 9 March 2016 16:13:00 +01:00, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 03/09/2016 04:56 AM, fredrik@huitfeldt.com wrote:
Hi Adrian,

thank you very much for your response.

I ran the "VACUUM ANALYZE" command on the master node.

Regarding log messages.

Here is the contents of the log (excluding connections/disconnections):

Assuming the below is from the replica database.

the "LOG: recovery was paused" message was indeed from the replica.

2016-02-22 02:30:08 GMT 24616 LOG: recovery has paused

So what happened to cause the above?

we automatically pause recovery on the replica before running pg_dump. This is in order to make certain that we get a consistent dump of the database.

I am not seeing anything below that indicates the recovery started again.

the reason why we do not see a matching "resume" is that the pg_dump failed and our error handling was insufficient.

2016-02-22 02:30:08 GMT 24616 HINT: Execute pg_xlog_replay_resume() to
continue.
2016-02-22 02:37:19 GMT 23859 DBNAME ERROR: missing chunk number 0 for
toast value 2747579 in pg_toast_22066
2016-02-22 02:37:19 GMT 23859 DBNAME STATEMENT: COPY public.room_shape
(room_uuid, data) TO stdout;
2016-02-22 02:37:41 GMT 2648 DBNAME LOG: could not receive data from
client: Connection reset by peer
2016-02-22 02:37:41 GMT 2648 DBNAME LOG: unexpected EOF on client
connection

What does the log from the master show?

It doesnt seem to show much. It does have these repeated messages, however:

2016-02-22 02:12:18 GMT 30908 LOG: using stale statistics instead of current ones because stats collector is not responding

2016-02-22 02:13:01 GMT 30908 LOG: using stale statistics instead of current ones because stats collector is not responding

2016-02-22 02:13:52 GMT 30908 LOG: using stale statistics instead of current ones because stats collector is not responding

There are lots of these mesages within the timeframe. There seems to be a couple of them every 2-4 hours.

Best regards,
Fredrik Huitfeldt

On 7 March 2016 16:35:29 +01:00, Adrian Klaver
<adrian.klaver@aklaver.com> wrote:
On 03/06/2016 10:18 PM, fredrik@huitfeldt.com
<mailto:fredrik@huitfeldt.com> wrote:

HI All,

i would really appreciate any help I can get on this issue.

basically, a pg_basebackup + streaming attach, led to a database
that we
could not read from afterwards.

From original post:

http://www.postgresql.org/message-id/1456919678340.31300.116900@webmail2

"The issue remained until we ran a full vacuum analyze on the cluster."

Which cluster was that, the master or the slave?

"I have logfiles from the incident, but I cannot see anything out of
the ordinary (despite having a fair amount of experience investigating
postgresql logs)."

Can we see the section before and after ERROR?

Beset regards,
Fredrik

PS please advise if this is better posted on another list.

--
Adrian Klaver
adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>

--
Adrian Klaver
adrian.klaver@aklaver.com

Best regards,

Fredrik

В списке pgsql-general по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Re: "missing chunk number XX for toast value YY in pg_toast ..." after pg_basebackup.