Re: BUG #13822: Slave terminated - WAL contains references to invalid page

From Michael Paquier
Subject Re: BUG #13822: Slave terminated - WAL contains references to invalid page
Date
Msg-id CAB7nPqQyhuJjQerCBxiS1bOg46OvE-EV9Om2bTyKrfaUhFHHVg@mail.gmail.com
In reply to Re: BUG #13822: Slave terminated - WAL contains references to invalid page  (<Marek.Petr@tieto.com>)
Responses Re: BUG #13822: Slave terminated - WAL contains references to invalid page  (<Marek.Petr@tieto.com>)
List pgsql-bugs
On Tue, Dec 22, 2015 at 9:05 PM,  <Marek.Petr@tieto.com> wrote:
> 2015-12-22 00:25:11 CET @  WARNING:  page 71566 of relation base/16422/23253 is uninitialized
> 2015-12-22 00:25:11 CET @  CONTEXT:  xlog redo visible: rel 1663/16422/23253; blk 71566
> 2015-12-22 00:25:11 CET @  PANIC:  WAL contains references to invalid pages
> 2015-12-22 00:25:11 CET @  CONTEXT:  xlog redo visible: rel 1663/16422/23253; blk 71566
> 2015-12-22 00:25:12 CET @  LOG:  startup process (PID 24434) was terminated by signal 6: Aborted
> 2015-12-22 00:25:12 CET @  LOG:  terminating any other active server processes

Looking more closely at that, this is the code path of the redo
routine for XLOG_HEAP2_VISIBLE. I have been looking at the area of the
code around visibilitymap_set to try to see if there could be a race
condition with another backend extending the relation and causing the
page to be uninitialized but have not found anything yet. 9.4 has been
out for some time, and this is the first report of this kind for this
redo routine. Still, you have been able to reproduce the problem
twice, so this has the smell of a bug... Others, opinions?

Did you rebuild a new slave while leaving the master running? Perhaps
some data corruption is coming from the master. What is the state of
the same pages on the master? Are they zeroed?
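For reference, one way to answer the question about zeroed pages is to read
the block straight from the relation file. The following is only a minimal
sketch under stated assumptions: the default 8 kB block size, the default
1 GB segment size, and a relation in the default tablespace (the
base/16422/23253 path shown in the WARNING line). The helper names are
hypothetical, not part of PostgreSQL. It tests for an all-zero page, which
is stricter than the server's own "uninitialized" check, and it only sees
what is on disk, so it is best run against a stopped instance or a
file-level copy.

```python
import os
import re

BLOCK_SIZE = 8192        # assumption: default BLCKSZ
SEGMENT_BLOCKS = 131072  # assumption: 1 GB segment files / 8 kB blocks


def parse_redo_context(line):
    """Extract (tablespace, database, relfilenode, block) from a CONTEXT line."""
    m = re.search(r"rel (\d+)/(\d+)/(\d+); blk (\d+)", line)
    if m is None:
        raise ValueError("no 'rel a/b/c; blk n' found in line")
    return tuple(int(g) for g in m.groups())


def locate_block(rel_path, blkno):
    """Return (segment file path, byte offset) holding block blkno."""
    segno, blk_in_seg = divmod(blkno, SEGMENT_BLOCKS)
    # Segment 0 has no suffix; later segments are named <relfilenode>.<n>.
    path = rel_path if segno == 0 else f"{rel_path}.{segno}"
    return path, blk_in_seg * BLOCK_SIZE


def page_state(data_dir, rel_path, blkno):
    """Classify the on-disk page as 'missing', 'zeroed' or 'has data'."""
    path, offset = locate_block(rel_path, blkno)
    with open(os.path.join(data_dir, path), "rb") as f:
        f.seek(offset)
        page = f.read(BLOCK_SIZE)
    if len(page) < BLOCK_SIZE:
        return "missing"  # the file ends before this block
    return "zeroed" if page == bytes(BLOCK_SIZE) else "has data"
```

With the values from the log, parse_redo_context() yields
(1663, 16422, 23253, 71566), and locate_block("base/16422/23253", 71566)
points into the first segment file. Running page_state() with the data
directory of the master and of the slave would then show whether the two
sides agree on that block.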

Also, are you using any parameters with non-default values? I am
thinking of fsync, full_page_writes...
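As a rough illustration of that check, the sketch below scans the text of
a postgresql.conf file for active (uncommented) settings and reports fsync
or full_page_writes when they are set to something other than "on", which
is their default. The function names are hypothetical and the parser is an
assumption-laden simplification: it ignores include directives,
postgresql.auto.conf, command-line options and abbreviated boolean
spellings, so the pg_settings view on the running server remains the
authoritative source.

```python
import re

# name, optional "=", then a quoted or a bare value; "#..." lines never match
SETTING_RE = re.compile(r"^\s*([A-Za-z_][\w.]*)\s*=?\s*('(?:[^']|'')*'|[^#\s]+)")

# Durability settings named in this thread; both default to "on".
DEFAULTS = {"fsync": "on", "full_page_writes": "on"}

BOOL_WORDS = {"on": "on", "true": "on", "yes": "on", "1": "on",
              "off": "off", "false": "off", "no": "off", "0": "off"}


def active_settings(conf_text):
    """Return {name: value} for uncommented settings; the last one wins."""
    settings = {}
    for line in conf_text.splitlines():
        m = SETTING_RE.match(line)
        if m is None:
            continue
        name, value = m.group(1).lower(), m.group(2)
        if value.startswith("'"):
            # strip the quotes and undo doubled single quotes
            value = value[1:-1].replace("''", "'")
        settings[name] = value
    return settings


def risky_overrides(conf_text):
    """Return the settings from DEFAULTS whose configured value differs."""
    found = active_settings(conf_text)
    result = {}
    for name, default in DEFAULTS.items():
        if name in found:
            normalized = BOOL_WORDS.get(found[name].lower(), found[name])
            if normalized != default:
                result[name] = found[name]
    return result
```

An empty result from risky_overrides() only means the file itself does not
turn those two settings off; it says nothing about values set elsewhere.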

> select relname from pg_class where relfilenode in ('17230','23253');
>     relname
> ----------------
>  pg_toast_17225
>  pg_toast_23246
> (2 rows)
>
> First toast's relation has 34GB, second 2452 MB.
> Is it possible to get more info from some deeper logging for the case it will occur again?

I am not sure I understand what you are looking for here. You could
make the logs more verbose, but this would bloat your log partition...
--
Michael
