Re: race condition when writing pg_control

From Bossart, Nathan
Subject Re: race condition when writing pg_control
Date
Msg-id D162D812-D724-4B79-8D6C-CB1EFC9A796E@amazon.com
In response to Re: race condition when writing pg_control  (Michael Paquier <michael@paquier.xyz>)
Responses Re: race condition when writing pg_control  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
On 5/22/20, 10:40 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
> On Sat, May 23, 2020 at 01:00:17AM +0900, Fujii Masao wrote:
>> Per my quick check, XLogReportParameters() seems to have a similar issue,
>> i.e., it updates the control file without taking ControlFileLock.
>> Maybe we should fix this at the same time?
>
> Yeah.  It also checks the control file values, implying that we should
> have LW_SHARED taken at least at the beginning, but this lock cannot
> be upgraded, so we need LW_EXCLUSIVE the whole time.  I am wondering if we
> should check with an assert if ControlFileLock is taken when going
> through UpdateControlFile().  We have one code path at the beginning
> of redo where we don't need a lock close to the backup_label file
> checks, but we could just pass down a boolean flag to the routine to
> handle that case.  Another good thing in having an assert is that any
> new caller of UpdateControlFile() would need to think about the need
> of a lock.

While an assertion in UpdateControlFile() would not have helped us
catch the problem I initially reported, it does seem worthwhile to add
it.  I have attached a patch that adds this assertion and also
attempts to fix XLogReportParameters().  Since there is only one place
where we feel it is safe to call UpdateControlFile() without a lock, I
just changed it to take the lock.  I don't think this adds any sort of
significant contention risk, and IMO it is a bit cleaner than the
boolean flag.
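
To illustrate the idea (this is not the attached patch), here is a minimal
standalone sketch of an assertion that the lock is held exclusively before
the control file is written.  The MockLWLock type, the Mock* functions, and
the *Sketch functions are hypothetical stand-ins for the real LWLock API and
xlog.c routines, and the single-process "held" flag only models the
"held by me" check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the LWLock API; not PostgreSQL's implementation. */
typedef enum { MOCK_LW_EXCLUSIVE, MOCK_LW_SHARED } MockLWLockMode;

typedef struct
{
    bool           held;   /* true while this (single) process holds the lock */
    MockLWLockMode mode;   /* mode in which the lock is currently held */
} MockLWLock;

static MockLWLock ControlFileLock;      /* models the real ControlFileLock */
static int        control_file_writes;  /* counts simulated pg_control writes */

static void
MockLWLockAcquire(MockLWLock *lock, MockLWLockMode mode)
{
    assert(!lock->held);    /* the sketch does not model waiting or re-entry */
    lock->held = true;
    lock->mode = mode;
}

static void
MockLWLockRelease(MockLWLock *lock)
{
    assert(lock->held);
    lock->held = false;
}

static bool
MockLWLockHeldByMeInMode(const MockLWLock *lock, MockLWLockMode mode)
{
    return lock->held && lock->mode == mode;
}

/* Sketch of UpdateControlFile() with the proposed assertion added. */
static void
UpdateControlFileSketch(void)
{
    /* Any caller that forgot to take the lock trips this in assert builds. */
    assert(MockLWLockHeldByMeInMode(&ControlFileLock, MOCK_LW_EXCLUSIVE));
    control_file_writes++;  /* stands in for writing pg_control to disk */
}

/* Sketch of the early-redo caller: take the lock rather than pass a flag. */
static void
StartupRedoPathSketch(void)
{
    MockLWLockAcquire(&ControlFileLock, MOCK_LW_EXCLUSIVE);
    UpdateControlFileSketch();
    MockLWLockRelease(&ControlFileLock);
}
```

Calling UpdateControlFileSketch() directly, without the acquire, aborts on
the assertion.  That is the property that would force any new caller to
think about locking.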

For the XLogReportParameters() fix, I simply added an exclusive lock
acquisition for the portion that updates the values in shared memory
and calls UpdateControlFile().  IIUC the first part of this function
that accesses several ControlFile values should be safe, as none of
them can be updated after server start.
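
The shape of that change, again as a rough standalone sketch rather than
the patch itself: the comparison against the control file values runs
without the lock, and only the update of the shared copy plus the write
happens under an exclusive lock.  The struct, the two example fields, the
mock lock, and the function name below are all hypothetical
simplifications; the real function checks more parameters and also emits a
WAL record, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the real PostgreSQL definitions. */
typedef struct { bool held_exclusive; } MockLWLock;
typedef struct { int MaxConnections; int wal_level; } MockControlFileData;

static MockLWLock          ControlFileLock;
static MockControlFileData ControlFile = { 100, 1 };  /* shared-memory copy */
static int                 control_file_writes;       /* simulated writes */

/* Current settings of the running server (example values). */
static int MaxConnections = 100;
static int wal_level = 1;

static void
ReportParametersSketch(void)
{
    /*
     * Unlocked reads: the assumption stated in the thread is that none of
     * these control file values can change after server start.
     */
    if (wal_level != ControlFile.wal_level ||
        MaxConnections != ControlFile.MaxConnections)
    {
        /* Exclusive lock covers only the update and the write. */
        assert(!ControlFileLock.held_exclusive);
        ControlFileLock.held_exclusive = true;     /* models LWLockAcquire */

        ControlFile.MaxConnections = MaxConnections;
        ControlFile.wal_level = wal_level;
        control_file_writes++;                     /* models UpdateControlFile() */

        ControlFileLock.held_exclusive = false;    /* models LWLockRelease */
    }
}
```

If the settings match what is already recorded, the function never touches
the lock.  After a setting changes, exactly one locked update and write
takes place.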

Nathan

