Re: Stronger safeguard for archive recovery not to miss data

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: Stronger safeguard for archive recovery not to miss data
Дата
Msg-id 20210405.161327.326176026982929551.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Re: Stronger safeguard for archive recovery not to miss data  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Ответы Re: Stronger safeguard for archive recovery not to miss data
Список pgsql-hackers
At Mon, 5 Apr 2021 12:34:53 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2021/04/04 11:58, osumi.takamichi@fujitsu.com wrote:
> >> IMO it's better to comment why this server restart is necessary.
> >> As far as I understand correctly, this is necessary to ensure the WAL
> >> file
> >> containing the record about the change of wal_level (to minimal) is
> >> archived,
> >> so that the subsequent archive recovery will be able to replay it.
> > OK, added some comments. Further, I felt the way I wrote this part was
> > not good at all and self-evident
> > and developers who read this test would feel uneasy about that point.
> > So, a little bit fixed that test so that we can get clearer conviction
> > for wal archive.
> 
> LGTM. Thanks for updating the patch!
> 
> Attached is the updated version of the patch. I applied the following
> changes.

+                 errhint("Use a backup taken after setting wal_level to higher than minimal "
+                         "or recover to the point in time before wal_level was changed to minimal even though it may
causedata loss.")));
 

Looking the HINT message, I thought that it's hard to find where up to
I should recover.  The caller knows the LSN of last PARAMETER_CHANGE
record so we can show that as the recovery-end LSN.  On the other
hand, I'm not sure it's worth considering tough, we can reach here
when starting archive recovery on a server with wal_level=minimal.  We
can pass 0 as LSN to notify that case.

If we do that, we can emit more clear message like the following.

WAL-while-minimal case
 FATAL:  WAL generated with wal_level=minimal at 0/40000A0, data may be missing
 HINT:  Use a backup later than, or recover up to before that LSN.

Mis-setting case
 FATAL:  archive recovery is not available on server with wal_level=minimal
 HINT:  Start this server with setting wal_level=replica or higher


The value "replica" is not double-quoted there (since before this
patch), but double-quoted in another error message about hot standby
just below.  Maybe we should let them in the same style.

> Could you review this version? Barring any objection, I'm thinking to
> commit this.
> 
> +sub check_wal_level
> +{
> +    my ($target_wal_level, $explanation) = @_;
> +
> +    is( $node->safe_psql(
> +            'postgres', q{show wal_level}),
> +        $target_wal_level,
> +        $explanation);
> 
> Do we really need this test? This test doesn't seem to be directly
> related
> to the test purpose. It seems to be testing the behavior of the
> PostgresNode
> functions. So I removed this from the patch.

+1.

> +# This protection should apply to recovery mode
> +my $another_node = get_new_node('another_node');
> 
> The same test is performed twice with different settings. So isn't it
> better
> to merge the code for both tests into one common function, to simplify
> the code? I did that.

Sounds reasonable for that size.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Any objections to implementing LogicalDecodeMessageCB for pgoutput?
Следующее
От: Bharath Rupireddy
Дата:
Сообщение: Re: Replication slot stats misgivings