Обсуждение: Re: pgsql: walreceiver uses a temporary replication slot by default

Поиск
Список
Период
Сортировка

Re: pgsql: walreceiver uses a temporary replication slot by default

От
Michael Paquier
Дата:
Hi Peter,
(Adding Andres and Sergei in CC.)

On Tue, Jan 14, 2020 at 01:57:34PM +0000, Peter Eisentraut wrote:
> walreceiver uses a temporary replication slot by default
>
> If no permanent replication slot is configured using
> primary_slot_name, the walreceiver now creates and uses a temporary
> replication slot.  A new setting wal_receiver_create_temp_slot can be
> used to disable this behavior, for example, if the remote instance is
> out of replication slots.

A recent message from Seigei Kornilov has attracted my attention to
this commit:
https://www.postgresql.org/message-id/370331579618998@vla3-6a5326aeb4ee.qloud-c.yandex.net

In the thread about switching primary_conninfo to be reloadable, we
have argued at great lengths that we should never have the WAL
receiver fetch by itself the GUC parameters used for the connection
with its primary.  Here is the main area of the discussion:
https://www.postgresql.org/message-id/20190217192720.qphwrraj66rht5lj@alap3.anarazel.de

The previous thread was long enough so it can easily be missed.
However, it seems to me that we may need to revisit a couple of things
for this commit?  In short, the following things:
- wal_receiver_create_temp_slot should be made PGC_POSTMASTER,
similarly to primary_slot_name and primary_conninfo.
- WalReceiverMain() should not load the parameter from the GUC context
by itself.
- RequestXLogStreaming(), called by the startup process, should be in
charge of defining if a temp slot should be used or not.
--
Michael

Вложения

Re: pgsql: walreceiver uses a temporary replication slot by default

От
Sergei Kornilov
Дата:
Hello

> In short, the following things:
> - wal_receiver_create_temp_slot should be made PGC_POSTMASTER,
> similarly to primary_slot_name and primary_conninfo.
> - WalReceiverMain() should not load the parameter from the GUC context
> by itself.
> - RequestXLogStreaming(), called by the startup process, should be in
> charge of defining if a temp slot should be used or not.

I would like to cross-post here a patch with such changes that I posted in "allow online change primary_conninfo"
thread.
This thread is more appropriate for discussion about wal_receiver_create_temp_slot.

PS: I posted this patch in both threads mostly to make cfbot happy.

regards, Sergei
Вложения

Re: pgsql: walreceiver uses a temporary replication slot by default

От
Peter Eisentraut
Дата:
On 2020-01-22 06:55, Michael Paquier wrote:
> In the thread about switching primary_conninfo to be reloadable, we
> have argued at great lengths that we should never have the WAL
> receiver fetch by itself the GUC parameters used for the connection
> with its primary.  Here is the main area of the discussion:
> https://www.postgresql.org/message-id/20190217192720.qphwrraj66rht5lj@alap3.anarazel.de

The way I understood that discussion was that the issue is having both 
the startup process and the WAL receiver having possibly inconsistent 
knowledge about the current configuration.  That doesn't apply in this 
case, because the setting is only used by the WAL receiver.  Maybe I 
misunderstood.

> The previous thread was long enough so it can easily be missed.
> However, it seems to me that we may need to revisit a couple of things
> for this commit?  In short, the following things:
> - wal_receiver_create_temp_slot should be made PGC_POSTMASTER,
> similarly to primary_slot_name and primary_conninfo.
> - WalReceiverMain() should not load the parameter from the GUC context
> by itself.
> - RequestXLogStreaming(), called by the startup process, should be in
> charge of defining if a temp slot should be used or not.

That would be a reasonable fix if we think the above is really an issue.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgsql: walreceiver uses a temporary replication slot by default

От
Andres Freund
Дата:
Hi,

On 2020-02-10 16:46:04 +0100, Peter Eisentraut wrote:
> On 2020-01-22 06:55, Michael Paquier wrote:
> > In the thread about switching primary_conninfo to be reloadable, we
> > have argued at great lengths that we should never have the WAL
> > receiver fetch by itself the GUC parameters used for the connection
> > with its primary.  Here is the main area of the discussion:
> > https://www.postgresql.org/message-id/20190217192720.qphwrraj66rht5lj@alap3.anarazel.de
> 
> The way I understood that discussion was that the issue is having both the
> startup process and the WAL receiver having possibly inconsistent knowledge
> about the current configuration.  That doesn't apply in this case, because
> the setting is only used by the WAL receiver.  Maybe I misunderstood.

Yes, that was my concern there. I do agree there's much less of an issue
here.

I still architecturally don't find it attractive that the active
configuration between walreceiver and startup process can diverge
though. Imagine if we e.g. added the ability to receive WAL over
multiple connections from one host, or from multiple hosts (e.g. to be
able to get the bulk of the WAL from a cascading node, but also to
provide syncrep acknowledgements directly to the primary), or to allow
for logical replication without needing all WAL locally on a standby
doing decoding.  It seems not great if there's potentially diverging
configuration (hot standby feedback, temporary slots, ... ) between
those walreceivers, just depending on when they started.  Here the model
e.g. paralell workers use, which explicitly ensure that the GUC state is
the same in workers and the leader, is considerably better, imo.

So I think adding more of these parameters affecting walreceivers
without coordination is not going quite in the right direction.

Greetings,

Andres Freund



Re: pgsql: walreceiver uses a temporary replication slot by default

От
Michael Paquier
Дата:
On Mon, Feb 10, 2020 at 01:46:04PM -0800, Andres Freund wrote:
> I still architecturally don't find it attractive that the active
> configuration between walreceiver and startup process can diverge
> though. Imagine if we e.g. added the ability to receive WAL over
> multiple connections from one host, or from multiple hosts (e.g. to be
> able to get the bulk of the WAL from a cascading node, but also to
> provide syncrep acknowledgements directly to the primary), or to allow
> for logical replication without needing all WAL locally on a standby
> doing decoding.  It seems not great if there's potentially diverging
> configuration (hot standby feedback, temporary slots, ... ) between
> those walreceivers, just depending on when they started.  Here the model
> e.g. parallel workers use, which explicitly ensure that the GUC state is
> the same in workers and the leader, is considerably better, imo.

Yes, I still think that we should fix that inconsistency, mark the new
GUC wal_receiver_create_temp_slot as PGC_POSTMASTER, and add a note at
the top of RequestXLogStreaming() and walreceiver.c about the
assumptions we'd prefer rely to for the GUCs starting a WAL receiver.

> So I think adding more of these parameters affecting walreceivers
> without coordination is not going quite in the right direction.

Indeed.  Adding more comments would be one way to prevent the
situation to happen here, I fear that others may forget this stuff in
the future.
--
Michael

Вложения

Re: pgsql: walreceiver uses a temporary replication slot by default

От
Michael Paquier
Дата:
On Wed, Jan 22, 2020 at 06:58:46PM +0300, Sergei Kornilov wrote:
> I would like to cross-post here a patch with such changes that I posted in "allow online change primary_conninfo"
thread.
> This thread is more appropriate for discussion about wal_receiver_create_temp_slot.
>
> PS: I posted this patch in both threads mostly to make cfbot happy.

Thanks for posting this patch, Sergei.  Here is a review to make
things move on.

-    * Create temporary replication slot if no slot name is configured or
-    * the slot from the previous run was temporary, unless
-    * wal_receiver_create_temp_slot is disabled.  We also need to handle
-    * the case where the previous run used a temporary slot but
-    * wal_receiver_create_temp_slot was changed in the meantime.  In that
-    * case, we delete the old slot name in shared memory.  (This would
+    * Create temporary replication slot if requested.  In that
+    * case, we update slot name in shared memory. (This would

The set of comments you are removing from walreceiver.c to decide if a
temporary slot needs to be created or not should be moved to
walreceiverfuncs.c as you move the logic from the WAL receiver startup
phase to the moment the WAL receiver spawn is requested.

I agree with the simplifications in WalReceiverMain() as you have
switched wal_receiver_create_temp_slot to be PGC_POSTMASTER, so
modifications are no longer a state that matter.

It would be more consistent with primary_conn_info and
primary_slot_name if wal_receiver_create_temp_slot is passed down as
an argument of RequestXLogStreaming().

As per the discussion done on this thread, let's also switch the
parameter default to be disabled.  Peter, as the committer of 3297308,
it would be good if you could chime in.
--
Michael

Вложения

Re: pgsql: walreceiver uses a temporary replication slot by default

От
Sergei Kornilov
Дата:
Hello

> Thanks for posting this patch, Sergei. Here is a review to make
> things move on.

Thank you, here is updated patch

> The set of comments you are removing from walreceiver.c to decide if a
> temporary slot needs to be created or not should be moved to
> walreceiverfuncs.c as you move the logic from the WAL receiver startup
> phase to the moment the WAL receiver spawn is requested.

I changed this comments because they describes behavior during change value of wal_receiver_create_temp_slot.
But yes, I need to add some comments to RequestXLogStreaming.

> It would be more consistent with primary_conn_info and
> primary_slot_name if wal_receiver_create_temp_slot is passed down as
> an argument of RequestXLogStreaming().

Yep, I thought about that. Changed.

> As per the discussion done on this thread, let's also switch the
> parameter default to be disabled.

Done (my vote is also for disabling this option by default).

regards, Sergei
Вложения

Re: pgsql: walreceiver uses a temporary replication slot by default

От
Michael Paquier
Дата:
On Mon, Feb 17, 2020 at 04:57:04PM +0300, Sergei Kornilov wrote:
> Thank you, here is updated patch

Thanks

> I changed this comments because they describes behavior during
> change value of wal_receiver_create_temp_slot.  But yes, I need to
> add some comments to RequestXLogStreaming.

I have reworked that part, adding more comments about the use of GUC
parameters when establishing the connection to the primary for a WAL
receiver.  And also I have added an extra comment to walreceiver.c
about the use of GUcs in general, to avoid this stuff again in the
future.  There were some extra nits with the format of
postgresql.conf.sample.

>> As per the discussion done on this thread, let's also switch the
>> parameter default to be disabled.
>
> Done (my vote is also for disabling this option by default).

We visibly tend to move in this direction, at least based on our
discussion.  Let's see where this leads.  For now, I have registered
this patch to next CF (https://commitfest.postgresql.org/27/2456/),
with yourself as author and myself as reviewer, and then let's wait
for mainly Peter E. and others for more input.
--
Michael

Вложения

Re: pgsql: walreceiver uses a temporary replication slot by default

От
Sergei Kornilov
Дата:
Hello

> I have reworked that part, adding more comments about the use of GUC
> parameters when establishing the connection to the primary for a WAL
> receiver. And also I have added an extra comment to walreceiver.c
> about the use of GUcs in general, to avoid this stuff again in the
> future. There were some extra nits with the format of
> postgresql.conf.sample.

Thank you! I just noticed that you removed my proposed change to this condition in RequestXLogStreaming

-    if (slotname != NULL)
+    if (slotname != NULL && slotname[0] != '\0')

We need this change to set is_temp_slot properly. PrimarySlotName GUC can usually be an empty string, so just "slotname
!=NULL" is not enough.
 

I attached patch with this change.

regards, Sergei
Вложения

Re: pgsql: walreceiver uses a temporary replication slot by default

От
Michael Paquier
Дата:
On Tue, Mar 17, 2020 at 11:39:11PM +0300, Sergei Kornilov wrote:
> We need this change to set is_temp_slot properly. PrimarySlotName
> GUC can usually be an empty string, so just "slotname != NULL" is
> not enough.

Yep, or a temporary slot would never be created even if there is no
slot defined, and the priority goes to primary_slot_name if set.

> I attached patch with this change.

Thanks, I have added a new open item for v13 to track this effort:
https://wiki.postgresql.org/wiki/PostgreSQL_13_Open_Items
--
Michael

Вложения