Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
| От | Peter Smith |
|---|---|
| Тема | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
| Дата | |
| Msg-id | CAHut+Puq-i+-DSkN2T_G275zFedhWSHRBtMQRxUkt5rPM35RMA@mail.gmail.com обсуждение исходный текст |
| Ответ на | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Masahiko Sawada <sawada.mshk@gmail.com>) |
| Ответы |
Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
|
| Список | pgsql-hackers |
On Mon, Dec 1, 2025 at 4:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Nov 27, 2025 at 12:33 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Hi Sawada-San.
> >
> > Some review comments for v30-0001.
>
> Thank you for the comments!
>
> > ======
> > src/backend/replication/logical/logicalctl.c
> >
> > 6b.
> > Would it be better to have a self-documenting enum {UNKNOWN, ENABLED,
> > DISABLED} instead of some magic int values that need lots of comments
> > explaining how to use them?
>
> No, we should not cast enum values directly to a boolean. We have a
> precedent of such usage like LocalXLogInsertAllowed, and I don't think
> we can reduce the comments even if we use a self-documenting enum
> since the usage is complicated. We could consider #define values 0, 1,
> and -1 but it seems to be overkill to this usage.
>
I never suggested casting enum values directly to booleans. FYI - I've
attached a diff patch to demonstrate what I meant. There are fewer
comments since there is no need to explain special values -1,0,1.
//////////
Anyway, here are some more review comments for v31-0001.
======
src/backend/replication/logical/decode.c
1.
+ /*
+ * a XLOG_PARAMETER_CHANGE record might change wal_level to
+ * 'replica' or 'minimal', but we don't check the logical decoding
+ * availability here because it's updated by a
+ * XLOG_LOGICAL_DECODING_STATUS_CHANGE record.
+ */
Missed a previous review comment. Start this comment with a capital 'A'.
======
src/backend/replication/logical/logicalctl.c
CheckXLogLogicalInfo:
2.
+ XLogLogicalInfoXactCache = (int) XLogLogicalInfo;
+
+ return XLogLogicalInfoXactCache;
+}
This function should be returning the bool -- i.e. the
XLogLogicalInfo, not the int XLogLogicalInfoCache.
======
src/test/recovery/t/050_effective_wal_level.pl
3.
+# Wait for the checkpointer to decrease effective_wal_level to 'replica'.
+sub wait_for_logical_decoding_disabled
+{
+ my ($node) = @_;
+
+ $node->poll_query_until('postgres',
+ qq[select current_setting('effective_wal_level') = 'replica';]);
+}
Should this subroutine also execute a CHECKPOINT here? Otherwise, it
might be a long wait before the checkpoint happens automatically.
Also, it would align better with the comment; otherwise, this
subroutine has nothing really to do with checkpoints at all.
~~~
4.
+# Initialize the primary server with wal_level = 'replica'.
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 1);
+$primary->append_conf('postgresql.conf', "log_min_messages = debug1");
+$primary->start();
Would it be nicer to explicitly set wal_level? Otherwise, perhaps the
comment should mention that this is the default.
~~~
5.
+# Create a physical slot.
+$primary->safe_psql('postgres',
+ qq[select pg_create_physical_replication_slot('test_phy_slot', false, false)]
+);
+
+# Physical slots don't affect effective_wal_level.
+test_wal_level($primary, "replica|replica",
+ "effective_wal_level doesn't change with a new physical slot");
+$primary->safe_psql('postgres',
+ qq[select pg_drop_replication_slot('test_phy_slot')]);
That seems like a single test case, so the code/comments can be combined.
SUGGESTION
Check that creating a physical slot doesn't affect effective_wal_level.
~~~
6.
+# Create a temporary logical slot but exits without releasing it explicitly.
+# This enables logical decoding but skips disabling it and delegates to the
+# checkpointer.
typo: /but exits/ but exit/
~~~
7.
+# Create a new logical slot and check if effective_wal_level must be increased
+# to 'logical'.
typo: /if effective_wal_level must be increased/that
effective_wal_level is increased/
~~~
8.
+# Check if the server cannot start with wal_level='minimal' as long as there is
+# at least one replication slot.
+$primary->adjust_conf('postgresql.conf', 'wal_level', 'minimal');
+$primary->adjust_conf('postgresql.conf', 'max_wal_senders', '0');
+$primary->stop;
+
+command_fails(
+ [
+ 'pg_ctl',
+ '--pgdata' => $primary->data_dir,
+ '--log' => $primary->logfile,
+ 'start',
+ ],
+ "cannot server with wal_level='minimal' as there is in-use logical slot");
+
8a.
Tweak comment.
SUGGESTION
Verify that the server cannot start with wal_level='minimal' when
there is at least one replication slot.
~
8b.
Typo (missing word "start"?) in "cannot server"
Typo /is in-use/is an in-use/
~~~
9.
+# Revert the modified settings.
+$primary->adjust_conf('postgresql.conf', 'wal_level', 'replica');
+$primary->adjust_conf('postgresql.conf', 'max_wal_senders', '10');
+
+# Add other settings to test if we disable logical decoding when
invalidating the last
+# logical slot.
+$primary->append_conf(
+ 'postgresql.conf',
+ qq[
+min_wal_size = 32MB
+max_wal_size = 32MB
+max_slot_wal_keep_size = 16MB
+]);
+$primary->start;
Can those setting changes all be combined so there is only one primary->start?
~~~
10.
+# Check if logical decoding is disabled after invalidating the last
logical slot.
There are quite a few "check if" comments like this one that could be
worded more strongly. I won't make review comments for all of them,
but here is an example.
SUGGESTION:
Verify that logical decoding is disabled after invalidating the last
logical slot.
~~~
11.
+# Recreate the logical slot to enable logical decoding again.
+$primary->safe_psql('postgres',
+ qq[select pg_drop_replication_slot('test_slot')]);
+$primary->safe_psql('postgres',
+ qq[select pg_create_logical_replication_slot('test_slot', 'pgoutput')])
Should this also call 'test_wal_level' subroutine to verify the states?
~~~
12.
+# Initialize standby2 node and start it with wal_level = 'logical'.
+my $standby2 = PostgreSQL::Test::Cluster->new('standby2');
+$standby2->init_from_backup($primary, 'my_backup', has_streaming => 1);
I got confused around here ... Probably I am mistaken, but I thought:
First, there was $primary (a primary) and $standby1 (a standby)
Then, the $standby1 got promoted (so now $standby1 is a primary and
$primary is the standby)
And, now we are making another $standby2 from the $primary.
AFAIK, that $primary is no longer a primary instance, right?
(???)
So, if $primary is not really the primary anymore, then now I am
unsure if later test comments like below are correct:
+# Regardless of their wal_level values, effective_wal_level values on the
+# standby and the cascaded standby depend on the primary's value, 'logical'.
and
+# Drop the primary's last logical slot, decreasing effective_wal_level to
+# 'replica' on all nodes.
etc.
~~~
13.
+# Check if the effective_wal_level remains 'logical' on both nodes
typo: "the both"
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Вложения
В списке pgsql-hackers по дате отправления: