Re: pg_rewind failure by file deletion in source server

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: pg_rewind failure by file deletion in source server
Дата
Msg-id 55894EB4.4080109@iki.fi
обсуждение исходный текст
Ответ на Re: pg_rewind failure by file deletion in source server  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: pg_rewind failure by file deletion in source server
Re: pg_rewind failure by file deletion in source server
Список pgsql-hackers
On 06/23/2015 07:51 AM, Michael Paquier wrote:
> So... Attached are a set of patches dedicated at fixing this issue:

Thanks for working on this!

> - 0001, add if_not_exists to pg_tablespace_location, returning NULL if
> path does not exist
> - 0002, same with pg_stat_file, returning NULL if file does not exist
> - 0003, same with pg_read_*file. I added them to all the existing
> functions for consistency.
> - 0004, pg_ls_dir extended with if_not_exists and include_dot_dirs
> (thanks Robert for the naming!)
> - 0005, as things get complex, a set of regression tests aimed to
> covering those things. pg_tablespace_location is platform-dependent,
> so there are no tests for it.
> - 0006, the fix for pg_rewind, using what has been implemented before.

With thes patches, pg_read_file() will return NULL for any failure to 
open the file, which makes pg_rewind to assume that the file doesn't 
exist in the source server, and will remove the file from the 
destination. That's dangerous, those functions should check specifically 
for ENOENT.

There's still a small race condition with tablespaces. If you run CREATE 
TABLESPACE in the source server while pg_rewind is running, it's 
possible that the recursive query that pg_rewind uses sees the symlink 
in pg_tblspc/ directory, but its snapshot doesn't see the row in 
pg_tablespace yet. It will think that the symlink is a regular file, try 
to read it, and fail (if we checked for ENOENT).

Actually, I think we need try to deal with symlinks a bit harder. 
Currently, pg_rewind assumes that anything in pg_tblspace that has a 
matching row in pg_tablespace is a symlink, and nothing else is. I think 
symlinks to directories. I just noticed that pg_rewind fails miserable 
if pg_xlog is a symlink, because of that:

----
The servers diverged at WAL position 0/3023F08 on timeline 1.
Rewinding from last common checkpoint at 0/2000060 on timeline 1

"data-master//pg_xlog" is not a directory
Failure, exiting
----

I think we need to add a column to pg_stat_file output, to indicate 
symbolic links, and add a pg_readlink() function. That still leaves a 
race condition if the type of a file changes, i.e. a file is deleted and 
a directory with the same name is created in its place, but that seems 
acceptable. I don't think PostgreSQL ever does such a thing, so that 
could only happen if you mess with the data directory manually while the 
server is running.

I just realized another problem: We recently learned the hard way that 
some people have files in the data directory that are not writeable by 
the 'postgres' user 
(http://www.postgresql.org/message-id/20150523172627.GA24277@msg.df7cb.de). 
pg_rewind will try to overwrite all files it doesn't recognize as 
relation files, so it's going to fail on those. A straightforward fix 
would be to first open the destination file in read-only mode, and 
compare its contents, and only open the file in write mode if it has 
changed. It would still fail when the files really differ, but I think 
that's acceptable.

I note that pg_rewind doesn't need to distinguish between an empty and a 
non-existent directory, so it's quite silly for it to pass 
include_dot_dirs=true, and then filter out "." and ".." from the result 
set. The documentation should mention the main reason for including "." 
and "..": to distinguish between an empty and non-existent directory.

- Heikki




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Sandro Santilli
Дата:
Сообщение: Memory context at PG_init call ?
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: A couple of newlines missing in pg_rewind log entries