Thread: pg_ctl promote causes error "could not read block" (PG 9.5.0 and 9.5.4)

pg_ctl promote causes error "could not read block" (PG 9.5.0 and 9.5.4)

From: raj 1988

Hi there,

We are running into a weird issue where a table effectively becomes read-only, with the error below:

ERROR:  could not read block 54 in file "base/<databaseoid>/215619": read only 0 of 8192 bytes

We see this whenever we promote a streaming standby using the pg_ctl promote command. It is happening on PG 9.5.0 and 9.5.4, on OEL 6.9.

Are we hitting a bug? We looked around but could not confirm whether this is one. It happens consistently on different servers whenever we run pg_ctl promote, and it then blocks writes to that table.

For now we get rid of the error by running either VACUUM FULL or CTAS, but I am worried about what we will do if this happens to our tables that are a few TB in size.


Thanks a lot in advance

-raj

Re: pg_ctl promote causes error "could not read block" (PG 9.5.0 and 9.5.4)

From: Michael Paquier

On Wed, Mar 28, 2018 at 09:36:11AM -0700, raj 1988 wrote:
> Are we hitting a bug? We looked around but could not confirm whether this is
> one. It happens consistently on different servers whenever we run pg_ctl
> promote, and it then blocks writes to that table.

This has the strong smell of the FSM bug fixed in 9.5.5:
https://www.postgresql.org/docs/devel/static/release-9-5-5.html
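
To see why a stale free space map would produce this particular message, here is a toy Python model (not PostgreSQL code). It assumes the FSM still advertises free space on pages that were truncated away from the table, so an insert gets directed at a block past the end of the file and the read returns 0 bytes. The function names and the dictionary standing in for the FSM are invented for illustration.

```python
BLOCK_SIZE = 8192  # default PostgreSQL block size, as seen in the error message


def blocks_in_file(file_size_bytes, block_size=BLOCK_SIZE):
    """Number of whole blocks stored in a relation file."""
    return file_size_bytes // block_size


def stale_fsm_entries(fsm, file_size_bytes, block_size=BLOCK_SIZE):
    """Return FSM entries that point at blocks beyond the end of the file.

    `fsm` is a hypothetical dict mapping block number -> advertised free bytes.
    """
    nblocks = blocks_in_file(file_size_bytes, block_size)
    return {blk: free for blk, free in fsm.items() if blk >= nblocks and free > 0}


# A table truncated to 54 blocks (numbered 0..53) whose FSM still lists block 54.
file_size = 54 * BLOCK_SIZE
fsm = {10: 120, 53: 4000, 54: 8160}

print(stale_fsm_entries(fsm, file_size))  # {54: 8160}
```

In this model, any insert that trusts the entry for block 54 would try to read 8192 bytes at an offset where the file has none, which matches the "read only 0 of 8192 bytes" wording.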

So, in order to get things right:
1) Update to the latest version of Postgres 9.5.
2) Make sure that your cluster gets in a clean state.  There are
instructions here:
https://wiki.postgresql.org/wiki/Free_Space_Map_Problems
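
As a rough aid for step 1, the sketch below compares a server version string against 9.5.5. It only covers the 9.5 branch discussed in this thread; the function name is made up, and the version string is assumed to come from something like `SHOW server_version`.

```python
def predates_fsm_fix(server_version, fixed=(9, 5, 5)):
    """True if a 9.5.x version string is older than the release with the fix."""
    parts = tuple(int(p) for p in server_version.split(".")[:3])
    if parts[:2] != fixed[:2]:
        # Other branches are outside the scope of this sketch.
        raise ValueError("only the 9.5 branch is covered by this sketch")
    return parts < fixed


for v in ("9.5.0", "9.5.4", "9.5.5"):
    print(v, predates_fsm_fix(v))
```

Upgrading by itself does not repair a map that is already damaged, which is why step 2 still applies after the update.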

> For now we get rid of the error by running either VACUUM FULL or CTAS, but
> I am worried about what we will do if this happens to our tables that are a
> few TB in size.

This rebuilds the free space map, which is why the error goes away.  You
really want to do the work I am mentioning above to get back to a clean state.
--
Michael
