Обсуждение: Forbid the use of invalidated physical slots in streaming replication.

Поиск
Список
Период
Сортировка

Forbid the use of invalidated physical slots in streaming replication.

От
"Zhijie Hou (Fujitsu)"
Дата:
Hi,

When testing streaming replication with a physical slot. I found an unexpected
behavior that the walsender could use an invalidated physical slot for
streaming.

This occurs when the primary slot is invalidated due to reaching the
max_slot_wal_keep_size before initializing the streaming replication
(e.g. before creating the standby). Attach a draft script(test_invalidated_slot.sh)
which can reproduce this.

Once the slot is invalidated, it can no longer protect the WALs and
Rows, as these invalidated slots are not considered in functions like
ReplicationSlotsComputeRequiredXmin().

Besides, the walsender could advance the restart_lsn of an invalidated slot,
then user won't be able to know that if the slot is actually validated or not,
because the 'conflicting' of view pg_replication_slot could be set back to
null.

So, I think it's a bug and one idea to fix is to check the validity of the physical slot in
StartReplication() after acquiring the slot like what the attachment does,
what do you think ?

Best Regards,
Hou Zhijie


Вложения

Re: Forbid the use of invalidated physical slots in streaming replication.

От
Ashutosh Bapat
Дата:
On Wed, Dec 6, 2023 at 6:25 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
> Hi,
>
> When testing streaming replication with a physical slot. I found an unexpected
> behavior that the walsender could use an invalidated physical slot for
> streaming.
>
> This occurs when the primary slot is invalidated due to reaching the
> max_slot_wal_keep_size before initializing the streaming replication
> (e.g. before creating the standby). Attach a draft script(test_invalidated_slot.sh)
> which can reproduce this.

Interesting. Thanks for the script. It reproduces the problem for me easily.

>
> Once the slot is invalidated, it can no longer protect the WALs and
> Rows, as these invalidated slots are not considered in functions like
> ReplicationSlotsComputeRequiredXmin().
>
> Besides, the walsender could advance the restart_lsn of an invalidated slot,
> then user won't be able to know that if the slot is actually validated or not,
> because the 'conflicting' of view pg_replication_slot could be set back to
> null.

In this case, since the basebackup was taken after the slot was
invalidated, it does not require the WAL that was removed. But it
seems that once the streaming starts, the slot sprints to life again
and gets validated again. Here's pg_replication_slot output after the
standby starts
#select * from pg_replication_slots ;
  slot_name  |    plugin     | slot_type | datoid | database |
temporary | active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phase | conflic
ting

-------------+---------------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------+--------
-----
 logicalslot | test_decoding | logical   |      5 | postgres | f
  | f      |            |      |          739 |             |
0/1513B08           | lost       |               | f         | t
 physical    |               | physical  |        |          | f
  | t      |     341925 |  752 |              | 0/404CB78   |
           | unreserved |      16462984 | f         |
(2 rows)

which looks quite similar to the output when slot was valid after creation
  slot_name  |    plugin     | slot_type | datoid | database |
temporary | active | active_pid | xmin | catalog_xmin | restart_lsn |
confirmed_flush_lsn | wal_status | safe_wal_size | two_phase | conflic
ting

-------------+---------------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------+--------
-----
 logicalslot | test_decoding | logical   |      5 | postgres | f
  | f      |            |      |          739 | 0/1513AD0   |
0/1513B08           | unreserved |      -1591888 | f         | f
 physical    |               | physical  |        |          | f
  | f      |            |      |              | 0/14F0DF0   |
           | unreserved |      -1591888 | f         |
(2 rows)

>
> So, I think it's a bug and one idea to fix is to check the validity of the physical slot in
> StartReplication() after acquiring the slot like what the attachment does,
> what do you think ?

I am not sure whether that's necessarily a bug. Of course, we don't
expect invalid slots to be used but in this case I don't see any harm.
The invalid slot has been revived and has all the properties set just
like a valid slot. So it must be considered in
ReplicationSlotsComputeRequiredXmin() as well. I haven't verified it
myself though. In case the WAL is really lost and is requested by the
standby it will throw an error "requested WAL segment [0-9A-F]+ has
already been removed". So no harm there as well.

I haven't been able to locate the code which makes the slot valid
though. So I can't say whether the behaviour is intended or not.
Looking at StartReplication() comment
/*
* We don't need to verify the slot's restart_lsn here; instead we
* rely on the caller requesting the starting point to use. If the
* WAL segment doesn't exist, we'll fail later.
*/
it looks like the behaviour is not completely unexpected.

--
Best Wishes,
Ashutosh Bapat



RE: Forbid the use of invalidated physical slots in streaming replication.

От
"Zhijie Hou (Fujitsu)"
Дата:
On Thursday, December 7, 2023 7:43 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:

Hi,

> 
> On Wed, Dec 6, 2023 at 6:25 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> wrote:
> >
> > When testing streaming replication with a physical slot. I found an
> > unexpected behavior that the walsender could use an invalidated
> > physical slot for streaming.
> >
> > This occurs when the primary slot is invalidated due to reaching the
> > max_slot_wal_keep_size before initializing the streaming replication
> > (e.g. before creating the standby). Attach a draft
> > script(test_invalidated_slot.sh) which can reproduce this.
> 
> Interesting. Thanks for the script. It reproduces the problem for me easily.

Thanks for testing and replying!

> 
> >
> > Once the slot is invalidated, it can no longer protect the WALs and
> > Rows, as these invalidated slots are not considered in functions like
> > ReplicationSlotsComputeRequiredXmin().
> >
> > Besides, the walsender could advance the restart_lsn of an invalidated
> > slot, then user won't be able to know that if the slot is actually
> > validated or not, because the 'conflicting' of view
> > pg_replication_slot could be set back to null.
> 
> In this case, since the basebackup was taken after the slot was invalidated, it
> does not require the WAL that was removed. But it seems that once the
> streaming starts, the slot sprints to life again and gets validated again. Here's
> pg_replication_slot output after the standby starts.

Actually, It doesn't bring the invalidated slot back to life completely.
The slot's view data looks valid while the 'invalidated' flag of this slot is still
RS_INVAL_WAL_REMOVED (user are not aware of it.)


> 
> >
> > So, I think it's a bug and one idea to fix is to check the validity of
> > the physical slot in
> > StartReplication() after acquiring the slot like what the attachment
> > does, what do you think ?
> 
> I am not sure whether that's necessarily a bug. Of course, we don't expect
> invalid slots to be used but in this case I don't see any harm.
> The invalid slot has been revived and has all the properties set just like a valid
> slot. So it must be considered in
> ReplicationSlotsComputeRequiredXmin() as well. I haven't verified it myself
> though. In case the WAL is really lost and is requested by the standby it will
> throw an error "requested WAL segment [0-9A-F]+ has already been
> removed". So no harm there as well.

Since the 'invalidated' field of the slot is still (RS_INVAL_WAL_REMOVED), even
if the walsender advances the restart_lsn, the slot will not be considered in the
ReplicationSlotsComputeRequiredXmin(), so the WALs and Rows are not safe
and that's why I think it's a bug.

After looking closer, it seems this behavior started from 15f8203 which introduced the
ReplicationSlotInvalidationCause 'invalidated', after that we check the invalidated enum
in Xmin/Lsn computation function.

If we want to go back to previous behavior, we need to revert/adjust the check
for invalidated in ReplicationSlotsComputeRequiredXmin(), but since the logical
decoding(walsender) disallow using invalidated slot, so I feel it's consistent
to do similar check for physical one. Besides, pg_replication_slot_advance()
also disallow passing invalidated slot to it as well. What do you think ?

Best Regards,
Hou zj 

Re: Forbid the use of invalidated physical slots in streaming replication.

От
Ashutosh Bapat
Дата:
> > > pg_replication_slot could be set back to null.
> >
> > In this case, since the basebackup was taken after the slot was invalidated, it
> > does not require the WAL that was removed. But it seems that once the
> > streaming starts, the slot sprints to life again and gets validated again. Here's
> > pg_replication_slot output after the standby starts.
>
> Actually, It doesn't bring the invalidated slot back to life completely.
> The slot's view data looks valid while the 'invalidated' flag of this slot is still
> RS_INVAL_WAL_REMOVED (user are not aware of it.)
>

I was mislead by the code in pg_get_replication_slots(). I did not
read it till the following

--- code ----
case WALAVAIL_REMOVED:

/*
* If we read the restart_lsn long enough ago, maybe that file
* has been removed by now. However, the walsender could have
* moved forward enough that it jumped to another file after
* we looked. If checkpointer signalled the process to
* termination, then it's definitely lost; but if a process is
* still alive, then "unreserved" seems more appropriate.
*
* If we do change it, save the state for safe_wal_size below.
*/
--- code ---

I see now how an invalid slot's wal status can be reported as
unreserved. So I think it's a bug.

>
> >
> > >
> > > So, I think it's a bug and one idea to fix is to check the validity of
> > > the physical slot in
> > > StartReplication() after acquiring the slot like what the attachment
> > > does, what do you think ?
> >
> > I am not sure whether that's necessarily a bug. Of course, we don't expect
> > invalid slots to be used but in this case I don't see any harm.
> > The invalid slot has been revived and has all the properties set just like a valid
> > slot. So it must be considered in
> > ReplicationSlotsComputeRequiredXmin() as well. I haven't verified it myself
> > though. In case the WAL is really lost and is requested by the standby it will
> > throw an error "requested WAL segment [0-9A-F]+ has already been
> > removed". So no harm there as well.
>
> Since the 'invalidated' field of the slot is still (RS_INVAL_WAL_REMOVED), even
> if the walsender advances the restart_lsn, the slot will not be considered in the
> ReplicationSlotsComputeRequiredXmin(), so the WALs and Rows are not safe
> and that's why I think it's a bug.
>
> After looking closer, it seems this behavior started from 15f8203 which introduced the
> ReplicationSlotInvalidationCause 'invalidated', after that we check the invalidated enum
> in Xmin/Lsn computation function.
>
> If we want to go back to previous behavior, we need to revert/adjust the check
> for invalidated in ReplicationSlotsComputeRequiredXmin(), but since the logical
> decoding(walsender) disallow using invalidated slot, so I feel it's consistent
> to do similar check for physical one. Besides, pg_replication_slot_advance()
> also disallow passing invalidated slot to it as well. What do you think ?

What happens if you run your script on a build prior to 15f8203?
Purely from reading the code, it looks like the physical slot would
sprint back to life since its restart_lsn would be updated. But I am
not able to see what happens to invalidated_at. It probably remains a
valid LSN and the slot would still not be considred in xmin
calculation. It will be good to be compatible to pre-15f8203
behaviour.

I think making logical and physical slot behaviour consistent would be
better but if the inconsistent behaviour is out there for some
releases, changing that now will break backward compatibility.

-- 
Best Wishes,
Ashutosh Bapat



Re: Forbid the use of invalidated physical slots in streaming replication.

От
vignesh C
Дата:
On Fri, 8 Dec 2023 at 19:15, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> > > > pg_replication_slot could be set back to null.
> > >
> > > In this case, since the basebackup was taken after the slot was invalidated, it
> > > does not require the WAL that was removed. But it seems that once the
> > > streaming starts, the slot sprints to life again and gets validated again. Here's
> > > pg_replication_slot output after the standby starts.
> >
> > Actually, It doesn't bring the invalidated slot back to life completely.
> > The slot's view data looks valid while the 'invalidated' flag of this slot is still
> > RS_INVAL_WAL_REMOVED (user are not aware of it.)
> >
>
> I was mislead by the code in pg_get_replication_slots(). I did not
> read it till the following
>
> --- code ----
> case WALAVAIL_REMOVED:
>
> /*
> * If we read the restart_lsn long enough ago, maybe that file
> * has been removed by now. However, the walsender could have
> * moved forward enough that it jumped to another file after
> * we looked. If checkpointer signalled the process to
> * termination, then it's definitely lost; but if a process is
> * still alive, then "unreserved" seems more appropriate.
> *
> * If we do change it, save the state for safe_wal_size below.
> */
> --- code ---
>
> I see now how an invalid slot's wal status can be reported as
> unreserved. So I think it's a bug.
>
> >
> > >
> > > >
> > > > So, I think it's a bug and one idea to fix is to check the validity of
> > > > the physical slot in
> > > > StartReplication() after acquiring the slot like what the attachment
> > > > does, what do you think ?
> > >
> > > I am not sure whether that's necessarily a bug. Of course, we don't expect
> > > invalid slots to be used but in this case I don't see any harm.
> > > The invalid slot has been revived and has all the properties set just like a valid
> > > slot. So it must be considered in
> > > ReplicationSlotsComputeRequiredXmin() as well. I haven't verified it myself
> > > though. In case the WAL is really lost and is requested by the standby it will
> > > throw an error "requested WAL segment [0-9A-F]+ has already been
> > > removed". So no harm there as well.
> >
> > Since the 'invalidated' field of the slot is still (RS_INVAL_WAL_REMOVED), even
> > if the walsender advances the restart_lsn, the slot will not be considered in the
> > ReplicationSlotsComputeRequiredXmin(), so the WALs and Rows are not safe
> > and that's why I think it's a bug.
> >
> > After looking closer, it seems this behavior started from 15f8203 which introduced the
> > ReplicationSlotInvalidationCause 'invalidated', after that we check the invalidated enum
> > in Xmin/Lsn computation function.
> >
> > If we want to go back to previous behavior, we need to revert/adjust the check
> > for invalidated in ReplicationSlotsComputeRequiredXmin(), but since the logical
> > decoding(walsender) disallow using invalidated slot, so I feel it's consistent
> > to do similar check for physical one. Besides, pg_replication_slot_advance()
> > also disallow passing invalidated slot to it as well. What do you think ?
>
> What happens if you run your script on a build prior to 15f8203?
> Purely from reading the code, it looks like the physical slot would
> sprint back to life since its restart_lsn would be updated. But I am
> not able to see what happens to invalidated_at. It probably remains a
> valid LSN and the slot would still not be considred in xmin
> calculation. It will be good to be compatible to pre-15f8203
> behaviour.

I have changed the patch status to "Waiting on Author", as some of the
queries requested by Ashutosh are not yet addressed.

Regards,
Vignesh



Re: Forbid the use of invalidated physical slots in streaming replication.

От
Robert Haas
Дата:
On Thu, Dec 7, 2023 at 8:00 AM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
> After looking closer, it seems this behavior started from 15f8203 which introduced the
> ReplicationSlotInvalidationCause 'invalidated', after that we check the invalidated enum
> in Xmin/Lsn computation function.

Adding Andres in Cc, as that was his commit.

It's not entirely clear to me how this feature was intended to
interact with physical replication slots. I found seemingly-relevant
documentation in two places:

https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-MAX-SLOT-WAL-KEEP-SIZE
https://www.postgresql.org/docs/current/view-pg-replication-slots.html

In the latter, it says "unreserved means that the slot no longer
retains the required WAL files and some of them are to be removed at
the next checkpoint. This state can return to reserved or extended."
But if a slot becomes invalid in such a way that it cannot return to a
valid state later, then this isn't accurate.

I have a feeling that the actual behavior here has evolved and the
documentation hasn't kept up. And I wonder whether we need a more
comprehensive explanation of the intended behavior in this section:

https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Forbid the use of invalidated physical slots in streaming replication.

От
vignesh C
Дата:
On Mon, 8 Jan 2024 at 10:25, vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 8 Dec 2023 at 19:15, Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > > > > pg_replication_slot could be set back to null.
> > > >
> > > > In this case, since the basebackup was taken after the slot was invalidated, it
> > > > does not require the WAL that was removed. But it seems that once the
> > > > streaming starts, the slot sprints to life again and gets validated again. Here's
> > > > pg_replication_slot output after the standby starts.
> > >
> > > Actually, It doesn't bring the invalidated slot back to life completely.
> > > The slot's view data looks valid while the 'invalidated' flag of this slot is still
> > > RS_INVAL_WAL_REMOVED (user are not aware of it.)
> > >
> >
> > I was mislead by the code in pg_get_replication_slots(). I did not
> > read it till the following
> >
> > --- code ----
> > case WALAVAIL_REMOVED:
> >
> > /*
> > * If we read the restart_lsn long enough ago, maybe that file
> > * has been removed by now. However, the walsender could have
> > * moved forward enough that it jumped to another file after
> > * we looked. If checkpointer signalled the process to
> > * termination, then it's definitely lost; but if a process is
> > * still alive, then "unreserved" seems more appropriate.
> > *
> > * If we do change it, save the state for safe_wal_size below.
> > */
> > --- code ---
> >
> > I see now how an invalid slot's wal status can be reported as
> > unreserved. So I think it's a bug.
> >
> > >
> > > >
> > > > >
> > > > > So, I think it's a bug and one idea to fix is to check the validity of
> > > > > the physical slot in
> > > > > StartReplication() after acquiring the slot like what the attachment
> > > > > does, what do you think ?
> > > >
> > > > I am not sure whether that's necessarily a bug. Of course, we don't expect
> > > > invalid slots to be used but in this case I don't see any harm.
> > > > The invalid slot has been revived and has all the properties set just like a valid
> > > > slot. So it must be considered in
> > > > ReplicationSlotsComputeRequiredXmin() as well. I haven't verified it myself
> > > > though. In case the WAL is really lost and is requested by the standby it will
> > > > throw an error "requested WAL segment [0-9A-F]+ has already been
> > > > removed". So no harm there as well.
> > >
> > > Since the 'invalidated' field of the slot is still (RS_INVAL_WAL_REMOVED), even
> > > if the walsender advances the restart_lsn, the slot will not be considered in the
> > > ReplicationSlotsComputeRequiredXmin(), so the WALs and Rows are not safe
> > > and that's why I think it's a bug.
> > >
> > > After looking closer, it seems this behavior started from 15f8203 which introduced the
> > > ReplicationSlotInvalidationCause 'invalidated', after that we check the invalidated enum
> > > in Xmin/Lsn computation function.
> > >
> > > If we want to go back to previous behavior, we need to revert/adjust the check
> > > for invalidated in ReplicationSlotsComputeRequiredXmin(), but since the logical
> > > decoding(walsender) disallow using invalidated slot, so I feel it's consistent
> > > to do similar check for physical one. Besides, pg_replication_slot_advance()
> > > also disallow passing invalidated slot to it as well. What do you think ?
> >
> > What happens if you run your script on a build prior to 15f8203?
> > Purely from reading the code, it looks like the physical slot would
> > sprint back to life since its restart_lsn would be updated. But I am
> > not able to see what happens to invalidated_at. It probably remains a
> > valid LSN and the slot would still not be considred in xmin
> > calculation. It will be good to be compatible to pre-15f8203
> > behaviour.
>
> I have changed the patch status to "Waiting on Author", as some of the
> queries requested by Ashutosh are not yet addressed.

The patch which you submitted has been awaiting your attention for
quite some time now.  As such, we have moved it to "Returned with
Feedback" and removed it from the reviewing queue. Depending on
timing, this may be reversible.  Kindly address the feedback you have
received, and resubmit the patch to the next CommitFest.

Regards,
Vignesh