Обсуждение: Add checkpoint/redo LSNs to recovery errors.
Hackers, This patch adds checkpoint/redo LSNs to recovery error messages where they may be useful for debugging. When backup_label is present the LSNs are already output in a log message, but it still seems like a good idea to repeat them. When backup_label is not present, the checkpoint LSN is not logged unless backup recovery is in progress or the checkpoint is found. In the case where a backup is restored but the backup_label is missing, the checkpoint LSN is not logged so it is useful for debugging to have it in the error message. Regards, -David
Вложения
On Thu, Feb 29, 2024 at 10:53:15AM +1300, David Steele wrote:
> This patch adds checkpoint/redo LSNs to recovery error messages where they
> may be useful for debugging.
Thanks for following up on that!
> When backup_label is not present, the checkpoint LSN is not logged unless
> backup recovery is in progress or the checkpoint is found. In the case where
> a backup is restored but the backup_label is missing, the checkpoint LSN is
> not logged so it is useful for debugging to have it in the error message.
ereport(PANIC,
- (errmsg("could not locate a valid checkpoint record")));
+ (errmsg("could not locate a valid checkpoint record at %X/%X",
+ LSN_FORMAT_ARGS(CheckPointLoc))));
}
I've seen this one in the field occasionally, so that's really a
welcome addition IMO.
I've scanned a bit xlogrecovery.c, and I have spotted a couple of that
could gain more information.
ereport(PANIC,
(errmsg("invalid redo record in shutdown checkpoint")));
[...]
ereport(PANIC,
(errmsg("invalid redo in checkpoint record")));
These two could mention CheckPointLoc and checkPoint.redo.
ereport(PANIC,
(errmsg("invalid next transaction ID")));
Perhaps some XID information could be added here?
ereport(FATAL,
(errmsg("WAL ends before consistent recovery point")));
[...]
ereport(FATAL,
(errmsg("WAL ends before end of online backup"),
These two are in xlog.c, and don't mention backupStartPoint for the
first one. Perhaps there's a point in adding some information about
EndOfLog and LocalMinRecoveryPoint as well?
--
Michael
Вложения
On 2/29/24 16:42, Michael Paquier wrote:
> On Thu, Feb 29, 2024 at 10:53:15AM +1300, David Steele wrote:
>> This patch adds checkpoint/redo LSNs to recovery error messages where they
>> may be useful for debugging.
>
> I've scanned a bit xlogrecovery.c, and I have spotted a couple of that
> could gain more information.
>
> ereport(PANIC,
> (errmsg("invalid redo record in shutdown checkpoint")));
> [...]
> ereport(PANIC,
> (errmsg("invalid redo in checkpoint record")));
> These two could mention CheckPointLoc and checkPoint.redo.
>
> ereport(PANIC,
> (errmsg("invalid next transaction ID")));
> Perhaps some XID information could be added here?
>
> ereport(FATAL,
> (errmsg("WAL ends before consistent recovery point")));
> [...]
> ereport(FATAL,
> (errmsg("WAL ends before end of online backup"),
>
> These two are in xlog.c, and don't mention backupStartPoint for the
> first one. Perhaps there's a point in adding some information about
> EndOfLog and LocalMinRecoveryPoint as well?
For now I'd like to just focus on these three messages that are related
to a missing backup_label or a misconfiguration of restore_command when
backup_label is present.
No doubt there are many other recovery log messages that could be
improved, but I'd rather not bog down the patch.
Regards,
-David
On Sun, Mar 10, 2024 at 04:58:19PM +1300, David Steele wrote: > No doubt there are many other recovery log messages that could be improved, > but I'd rather not bog down the patch. Fair argument, and these additions are useful when taken independently. I've applied your suggestions for now. -- Michael