standby recovery fails (tablespace related) (tentative patch and discussion)

From Paul Guo
Subject standby recovery fails (tablespace related) (tentative patch and discussion)
Date
Msg-id CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
Responses Re: standby recovery fails (tablespace related) (tentative patch and discussion)  (Asim R P <apraveen@pivotal.io>)
List pgsql-hackers
Hello postgres hackers,

Recently my colleagues and I encountered an issue: a standby cannot recover after an unclean shutdown, and the cause is related to tablespaces.
The standby re-replays some xlog records that need tablespace directories (e.g. create a database with a tablespace),
but those tablespace directories have already been removed by the previous replay.

In detail, the standby finishes replaying the operations below normally, but because of the unclean shutdown the redo LSN
in pg_control is not updated and still holds a value from before the 'create db with tablespace' xlog record. Since the tablespace
directories have already been removed, the standby reports an error when it replays the database-create WAL record.

create db with tablespace
drop database
drop tablespace.
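
The sequence above can be illustrated with a toy model. This is only a sketch in Python, not PostgreSQL code: the `WAL` list, the `replay()` helper and the `ts`/`ts/db` paths are all hypothetical stand-ins, and the "redo pointer" is just a list index. It shows why a second replay that starts after the tablespace creation but before the database creation fails with a missing parent directory.

```python
import os
import tempfile

# Toy WAL: (operation, path relative to a scratch "data directory").
WAL = [
    ("mkdir", "ts"),      # create tablespace
    ("mkdir", "ts/db"),   # create database in that tablespace
    ("rmdir", "ts/db"),   # drop database
    ("rmdir", "ts"),      # drop tablespace
]

def replay(root, start):
    """Replay WAL[start:] under root; return None, or the name of the error hit."""
    for op, rel in WAL[start:]:
        path = os.path.join(root, rel)
        try:
            if op == "mkdir":
                os.mkdir(path)   # like the redo code here, does not create parents
            else:
                os.rmdir(path)
        except OSError as exc:
            return type(exc).__name__
    return None

def simulate():
    with tempfile.TemporaryDirectory() as root:
        first = replay(root, 0)    # normal replay runs to the end
        # Unclean shutdown: the toy redo pointer still points at record 1.
        second = replay(root, 1)   # re-replay starts at "create database"
        return first, second
```

Calling `simulate()` replays everything once and then replays again from the stale position; the second pass stops at the database-create record because `ts` no longer exists, which mirrors the "No such file or directory" error in the log below.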

Here is the log on the standby.
2019-04-17 14:52:14.926 CST [23029] LOG:  starting PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bit
2019-04-17 14:52:14.927 CST [23029] LOG:  listening on IPv4 address "192.168.35.130", port 5432
2019-04-17 14:52:14.929 CST [23029] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-04-17 14:52:14.943 CST [23030] LOG:  database system was interrupted while in recovery at log time 2019-04-17 14:48:27 CST
2019-04-17 14:52:14.943 CST [23030] HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2019-04-17 14:52:14.949 CST [23030] LOG:  entering standby mode                
2019-04-17 14:52:14.950 CST [23030] LOG:  redo starts at 0/30105B8              
2019-04-17 14:52:14.951 CST [23030] FATAL:  could not create directory "pg_tblspc/65546/PG_12_201904072/65547": No such file or directory
2019-04-17 14:52:14.951 CST [23030] CONTEXT:  WAL redo at 0/3011650 for Database/CREATE: copy dir 1663/1 to 65546/65547
2019-04-17 14:52:14.951 CST [23029] LOG:  startup process (PID 23030) exited with exit code 1
2019-04-17 14:52:14.951 CST [23029] LOG:  terminating any other active server processes
2019-04-17 14:52:14.953 CST [23029] LOG:  database system is shut down          

Steps to reproduce:

1. Set up a master and a standby.
2. On both sides, run: mkdir /tmp/some_isolation2_pg_basebackup_tablespace

3. Run the following SQL:
drop tablespace if exists some_isolation2_pg_basebackup_tablespace; 
create tablespace some_isolation2_pg_basebackup_tablespace location '/tmp/some_isolation2_pg_basebackup_tablespace';

4. Cleanly shut down and restart both postgres instances.

5. Run the following SQL:

drop database if exists some_database_with_tablespace; 
create database some_database_with_tablespace tablespace some_isolation2_pg_basebackup_tablespace; 
drop database some_database_with_tablespace;
drop tablespace some_isolation2_pg_basebackup_tablespace; 
\! pkill -9 postgres; ssh host70 pkill -9 postgres

Note that an immediate shutdown via pg_ctl should also be able to reproduce the issue, and the above steps probably do not reproduce it 100% of the time.

I created an initial patch for this issue (see the attachment). The idea is to re-create those directories recursively. The above issue exists in dbase_redo(),
but TablespaceCreateDbspace() (used when redoing relation file creation) is probably buggy as well, so I modified that function too. Even if there is no bug
in that function, using a simple pg_mkdir_p() seems cleaner. Note that when reading TablespaceCreateDbspace(), I found that this issue
seems to have been considered already, though insufficiently. Frankly, this solution (directory re-creation) does not seem perfect, given that this should
really have been the responsibility of tablespace creation (which also does more, such as creating the symlink). Also, I'm not sure whether
we need to use the invalid page mechanism (see xlogutils.c).
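
To make the directory re-creation idea concrete, here is a minimal sketch in Python rather than the C of the actual patch. The helper name `redo_create_db_dir()` is hypothetical, and `os.makedirs(..., exist_ok=True)` is used only as a rough analogue of a recursive mkdir; the path components are taken from the log above and are created under a scratch directory.

```python
import os
import tempfile

def redo_create_db_dir(path):
    """Hypothetical redo helper: create path together with any missing parents."""
    os.makedirs(path, exist_ok=True)

def demo():
    with tempfile.TemporaryDirectory() as root:
        # Nothing below root exists yet, as after the tablespace was dropped.
        target = os.path.join(root, "pg_tblspc", "65546", "PG_12_201904072", "65547")
        redo_create_db_dir(target)   # succeeds although the parents are missing
        redo_create_db_dir(target)   # replaying the same record again is harmless
        return os.path.isdir(target)
```

Note what the sketch does not do: it creates the tablespace entry as a plain directory, not as a symlink to the tablespace location, which is exactly why this approach feels imperfect.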

Another solution: we already create a checkpoint on createdb/movedb/dropdb/droptablespace, so maybe we should force the standby to create a
restartpoint for this special kind of checkpoint WAL record. That means we need to set a flag in the checkpoint WAL record and let the checkpoint redo
code create a restartpoint if that flag is set. This solution seems safer.
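
A rough way to picture the flag idea, again as a Python toy model and not PostgreSQL code: the `Record` class, the `force_restartpoint` field and the `restart_lsn()` helper are all hypothetical, the LSNs are made-up integers, and the position of the checkpoints in the toy WAL is an assumption. A real restartpoint would also restart from the checkpoint's redo pointer and not from the checkpoint record itself; the model ignores that difference.

```python
from dataclasses import dataclass

@dataclass
class Record:
    lsn: int
    kind: str
    force_restartpoint: bool = False  # hypothetical flag, not an existing WAL field

def restart_lsn(records, honor_flag):
    """Return the LSN a toy standby would restart replay from after a crash."""
    redo = 0  # value left in the toy pg_control before replay
    for rec in records:
        if rec.kind == "checkpoint" and rec.force_restartpoint and honor_flag:
            redo = rec.lsn  # forced restartpoint: later recovery starts here
    return redo

WAL = [
    Record(10, "create_db"),
    Record(20, "checkpoint", force_restartpoint=True),
    Record(30, "drop_db"),
    Record(40, "checkpoint", force_restartpoint=True),
    Record(50, "drop_tablespace"),
]
```

With the flag honored, the restart position ends up past the create_db record, so a crash after the drops would not replay the database creation again; with the flag ignored, replay starts from the old position and runs into the missing directory.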

Thanks,
Paul
