Обсуждение: Assertion failure in pg_copy_logical_replication_slot()
Hi,
If restart_lsn of logical replication slot gets behind more than
max_slot_wal_keep_size from the current LSN, the logical replication
slot would be invalidated and its restart_lsn is reset to an invalid LSN.
If this logical replication slot with an invalid restart_lsn is specified
as the source slot in pg_copy_logical_replication_slot(), the function
causes the following assertion failure.
TRAP: FailedAssertion("!logical_slot", File: "slotfuncs.c", Line: 727)
This assertion failure is caused by
/* Copying non-reserved slot doesn't make sense */
if (XLogRecPtrIsInvalid(src_restart_lsn))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot copy a replication slot that doesn't reserve WAL")));
I *guess* this assertion check was added because restart_lsn should
not be invalid before. But in v13, it can be invalid thanks to max_slot_wal_keep_size.
I think that this assertion check seems useless and should be removed in v13.
Patch attached. Thought?
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Вложения
At Tue, 23 Jun 2020 00:17:47 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
> Hi,
>
> If restart_lsn of logical replication slot gets behind more than
> max_slot_wal_keep_size from the current LSN, the logical replication
> slot would be invalidated and its restart_lsn is reset to an invalid
> LSN.
> If this logical replication slot with an invalid restart_lsn is
> specified
> as the source slot in pg_copy_logical_replication_slot(), the function
> causes the following assertion failure.
Good catch!
> TRAP: FailedAssertion("!logical_slot", File: "slotfuncs.c", Line: 727)
>
> This assertion failure is caused by
>
> /* Copying non-reserved slot doesn't make sense */
> if (XLogRecPtrIsInvalid(src_restart_lsn))
> ereport(ERROR,
> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> errmsg("cannot copy a replication slot that doesn't reserve
> WAL")));
>
> I *guess* this assertion check was added because restart_lsn should
> not be invalid before. But in v13, it can be invalid thanks to
> max_slot_wal_keep_size.
> I think that this assertion check seems useless and should be removed
> in v13.
> Patch attached. Thought?
Your diagnosis looks correct to me. The assertion failure means that
copy_replication_slot was not exercised at least for a non-reserving
logical slots. Greping "pg_copy_logical_replication_slot" on src/test
showed nothing so I doubt we are exercising the function.
Don't we need some?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 2020/06/23 18:42, Kyotaro Horiguchi wrote:
> At Tue, 23 Jun 2020 00:17:47 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>> Hi,
>>
>> If restart_lsn of logical replication slot gets behind more than
>> max_slot_wal_keep_size from the current LSN, the logical replication
>> slot would be invalidated and its restart_lsn is reset to an invalid
>> LSN.
>> If this logical replication slot with an invalid restart_lsn is
>> specified
>> as the source slot in pg_copy_logical_replication_slot(), the function
>> causes the following assertion failure.
>
> Good catch!
>
>> TRAP: FailedAssertion("!logical_slot", File: "slotfuncs.c", Line: 727)
>>
>> This assertion failure is caused by
>>
>> /* Copying non-reserved slot doesn't make sense */
>> if (XLogRecPtrIsInvalid(src_restart_lsn))
>> ereport(ERROR,
>> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>> errmsg("cannot copy a replication slot that doesn't reserve
>> WAL")));
>>
>> I *guess* this assertion check was added because restart_lsn should
>> not be invalid before. But in v13, it can be invalid thanks to
>> max_slot_wal_keep_size.
>> I think that this assertion check seems useless and should be removed
>> in v13.
>> Patch attached. Thought?
>
> Your diagnosis looks correct to me.
Thanks for the check! I will commit the patch later.
> The assertion failure means that
> copy_replication_slot was not exercised at least for a non-reserving
> logical slots. Greping "pg_copy_logical_replication_slot" on src/test
> showed nothing so I doubt we are exercising the function.
>
> Don't we need some?
Yes, increasing the test coverage sounds helpful!
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
On 2020-Jun-23, Fujii Masao wrote:
> If restart_lsn of logical replication slot gets behind more than
> max_slot_wal_keep_size from the current LSN, the logical replication
> slot would be invalidated and its restart_lsn is reset to an invalid LSN.
> If this logical replication slot with an invalid restart_lsn is specified
> as the source slot in pg_copy_logical_replication_slot(), the function
> causes the following assertion failure.
>
> TRAP: FailedAssertion("!logical_slot", File: "slotfuncs.c", Line: 727)
Oops.
> This assertion failure is caused by
>
> /* Copying non-reserved slot doesn't make sense */
> if (XLogRecPtrIsInvalid(src_restart_lsn))
> ereport(ERROR,
> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> errmsg("cannot copy a replication slot that doesn't reserve WAL")));
Heh, you pasted the code after your patch rather than the original.
I think the errcode is a bit bogus considering the new case.
IMO ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE is more appropriate.
One could argue that the error message could also be different for the
case of a logical slot (or even a physical slot that has the upcoming
"invalidated_at" LSN set), maybe "cannot copy a replication slot that
has been invalidated" but maybe that's a pointless distinction.
I don't object to the patch as presented.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020/06/24 9:38, Alvaro Herrera wrote:
> On 2020-Jun-23, Fujii Masao wrote:
>
>> If restart_lsn of logical replication slot gets behind more than
>> max_slot_wal_keep_size from the current LSN, the logical replication
>> slot would be invalidated and its restart_lsn is reset to an invalid LSN.
>> If this logical replication slot with an invalid restart_lsn is specified
>> as the source slot in pg_copy_logical_replication_slot(), the function
>> causes the following assertion failure.
>>
>> TRAP: FailedAssertion("!logical_slot", File: "slotfuncs.c", Line: 727)
>
> Oops.
>
>> This assertion failure is caused by
>>
>> /* Copying non-reserved slot doesn't make sense */
>> if (XLogRecPtrIsInvalid(src_restart_lsn))
>> ereport(ERROR,
>> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>> errmsg("cannot copy a replication slot that doesn't reserve WAL")));
>
> Heh, you pasted the code after your patch rather than the original.
oh.... sorry.
> I think the errcode is a bit bogus considering the new case.
> IMO ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE is more appropriate.
Agreed. So I updated the patch so this errcode is used instead.
Patch attached.
> One could argue that the error message could also be different for the
> case of a logical slot (or even a physical slot that has the upcoming
> "invalidated_at" LSN set), maybe "cannot copy a replication slot that
> has been invalidated" but maybe that's a pointless distinction.
> I don't object to the patch as presented.
I have no strong opinion about this, but for now I kept the message as it is.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Вложения
On 2020-Jun-24, Fujii Masao wrote: > > I think the errcode is a bit bogus considering the new case. > > IMO ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE is more appropriate. > > Agreed. So I updated the patch so this errcode is used instead. > Patch attached. LGTM. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020/06/24 23:58, Alvaro Herrera wrote: > On 2020-Jun-24, Fujii Masao wrote: > >>> I think the errcode is a bit bogus considering the new case. >>> IMO ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE is more appropriate. >> >> Agreed. So I updated the patch so this errcode is used instead. >> Patch attached. > > LGTM. Thanks! Pushed. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION