Re: Possible missing segments in archiving on standby

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: Possible missing segments in archiving on standby
Дата
Msg-id 20210908.164033.807290883158662930.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Re: Possible missing segments in archiving on standby  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Ответы Re: Possible missing segments in archiving on standby  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Список pgsql-hackers
At Wed, 8 Sep 2021 16:01:22 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2021/09/08 10:45, Kyotaro Horiguchi wrote:
> > Anyway there's no guarantee on the archive ordering. As discussed in
> > the nearby thread [1], newer segment is often archived earlier. I
> > agree that that happens mainly on busy servers, though. The archiver
> > is designed to handle such "gaps" and/or out-of-order segment
> > notifications.  We could impose a strict ordering on archiving but I
> > think we would take total performance than such strictness.
> 
> Yes, there are other cases causing newer WAL file to be archived
> eariler.
> The issue can happen if XLogArchiveNotify() fails to create .ready
> file,
> for example. Fixing only the case that we're discussing here is not
> enough.
> If *general* fix is discussed at the thread you told, it's better to
> do nothing here for the issue and to just make the startup process
> call
> XLogArchiveCheckDone() if it finds the WAL file including XLOG_SWITCH
> record.

No. The discussion taken there is not about permanently missing .ready
files, but about .ready files created out-of-order.  So I don't think
the outcome from the thread does *fix* this issue.

> > At least currently, recovery fetches segments by a single process and
> > every file is marked immediately after being filled-up, so all files
> > other than the latest one in pg_wal including history files should
> > have been marked for sure unless file system gets into a trouble.
> 
> You can reproduce that situation easily by starting the server with
> archive_mode=off, generating WAL files, sometimes running
> pg_switch_wal(),
> causing the server to crash, and then restarting the server with
> archive_mode=on. At the beginning of recovery, all the WAL files in
> pg_wal
> don't have their archive notification files at all. Then, with the
> patch,
> only WAL files including XLOG_SWITCH are notified for WAL archiving
> during recovery. The other WAL files will be notified at the
> subsequent
> checkpoint.

I don't think we want such extent of perfectness at all for the case
where some archive-related parameters are changed after a
crash. Anyway we need to take a backup after that and at least all
segments required for the backup will be properly archived.  The
segments up to the XLOG_SWITCH segment are harmless garbage, or a bit
of food for disk.

> > I'm not sure I like that XLogWalRcvClose hides the segment-switch
> > condition.  If we do that check in the function, I'd like to name the
> > function XLogWalRcvCloseIfSwitched or something indicates the
> > condition.  I'd like to invert the condition to reduce indentation,
> > too.
> 
> We can move the condition-check out of the function like the attached
> patch.

Thanks!

> > Why don't we call it just after writing data, as my first proposal
> > did? There's no difference in functionality between doing that and the
> > patch.  If we do so, recvFile>=0 is always true and that condition can
> > be removed but that would be optional.  Anyway, by doing that, no
> > longer need to call the function twice or we can even live without the
> > new function.
> 
> I think that it's better and more robust to confirm that the
> currently-opened
> WAL file is valid target one to write WAL *before* actually writing
> any data
> into it.

Sounds convincing.  Ok, I agree to that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От:
Дата:
Сообщение: RE: [PATCH] New default role allowing to change per-role/database settings
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: Gather performance analysis