Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"

From Thomas Munro
Subject Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Date
Msg-id CA+hUKGJqWyqH08WDe=XvW_Aka3Vf7379vC1MfGLeGfizfKR2MA@mail.gmail.com
In reply to Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Fri, Feb 17, 2023 at 4:21 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> While contemplating what else a mandatory file lock might break, I
> remembered that basebackup.c also reads the control file.  Hrmph.  Not
> addressed yet; I guess it might need to acquire/release around
> sendFile(sink, XLOG_CONTROL_FILE, ...)?

If we go this way, I suppose, in theory at least, someone with
external pg_backup_start()-based tools might also want to hold the
lock while copying pg_control.  Otherwise they might fail to open it
on Windows (where that patch uses a mandatory lock) or copy garbage on
Linux (as they can today, I assume), with non-zero probability -- at
least when copying files from a hot standby.  Or backup tools might
want to get the file contents through some entirely different
mechanism that does the right interlocking (whatever that might be,
maybe inside the server).  Perhaps this is not so much the localised
systems programming curiosity I thought it was, and has implications
that'd need to be part of the documented low-level backup steps.  It
makes me like the idea a bit less.  It'd be good to hear from backup
gurus what they think about that.

One cute hack I thought of to make the file lock effectively advisory
on Windows is to lock a byte range *past the end* of the file (the
documentation says you can do that).  That shouldn't stop programs
that want to read the file without locking and don't know/care about
our scheme (that is, pre-existing backup tools that haven't considered
this problem and remain oblivious or accept the (low) risk of torn
reads), but will block other participants in our scheme.
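
To make the past-the-end idea concrete, here is a minimal sketch of the
offset convention. It is not the patch: it uses POSIX fcntl locks from
Python, which are advisory anyway, so it only illustrates that a byte
range beyond EOF can be locked without growing the file or getting in
the way of a reader that ignores the scheme. On Windows the analogous
call would be LockFileEx() with an offset beyond the file size. The
names LOCK_OFFSET, lock_past_eof() and demo() are made up for
illustration.

```python
import fcntl
import os
import tempfile

# Hypothetical convention (an assumption, not taken from the patch):
# every participant locks one byte at a fixed offset far beyond EOF.
LOCK_OFFSET = 1 << 30
LOCK_LENGTH = 1


def lock_past_eof(fd, exclusive=True):
    """Block until a byte-range lock beyond the end of the file is granted."""
    op = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    fcntl.lockf(fd, op, LOCK_LENGTH, LOCK_OFFSET, os.SEEK_SET)


def unlock_past_eof(fd):
    """Release the byte-range lock taken by lock_past_eof()."""
    fcntl.lockf(fd, fcntl.LOCK_UN, LOCK_LENGTH, LOCK_OFFSET, os.SEEK_SET)


def demo():
    """Hold the lock as a 'writer' while an oblivious reader reads the file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "pg_control")  # stand-in file, not a real one
        with open(path, "wb") as f:
            f.write(b"x" * 512)

        writer_fd = os.open(path, os.O_RDWR)
        try:
            lock_past_eof(writer_fd)
            # A reader that knows nothing about the scheme just reads the file.
            with open(path, "rb") as f:
                data = f.read()
            # Locking past EOF does not extend the file.
            size = os.path.getsize(path)
            unlock_past_eof(writer_fd)
        finally:
            os.close(writer_fd)
        return data, size
```

Participants in the scheme would all call lock_past_eof() on the same
offset before reading or writing; everyone else is unaffected.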

If we went back to the keep-rereading-until-it-stops-changing model,
then an external backup tool would need to be prepared to do that too,
in theory at least.  Maybe some already do something like that?
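
For reference, the keep-rereading-until-it-stops-changing model amounts
to something like the sketch below. This is an illustration under my
own assumptions, not what any existing backup tool or the server does:
read_until_stable(), its max_attempts and delay parameters, and the
read_once callable are all hypothetical names.

```python
import time


def read_until_stable(read_once, max_attempts=10, delay=0.0):
    """Call read_once() repeatedly until two consecutive reads match.

    read_once is any zero-argument callable returning the file contents.
    Raises RuntimeError if the contents never settle within max_attempts
    re-reads.
    """
    previous = read_once()
    for _ in range(max_attempts):
        if delay:
            time.sleep(delay)
        current = read_once()
        if current == previous:
            # Two identical reads in a row: treat the contents as untorn.
            return current
        previous = current
    raise RuntimeError("contents kept changing; giving up")


def read_file(path):
    """Read a whole file as bytes; usable as read_once via a lambda."""
    with open(path, "rb") as f:
        return f.read()
```

An external tool would call it as
read_until_stable(lambda: read_file(path)) on its copy source. Two
matching reads only make a torn read unlikely, they do not rule it out.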

Or maybe the problem is/was too theoretical before...
