Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
| From | Masahiko Sawada |
|---|---|
| Subject | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
| Date | |
| Msg-id | CAD21AoACvm7UNa13yx4_=QXWu4RDrn-FbDsN9PownMDApRieSQ@mail.gmail.com |
| In response to | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart (Amit Kapila <amit.kapila16@gmail.com>) |
| List | pgsql-hackers |
On Wed, Jan 7, 2026 at 9:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jan 8, 2026 at 5:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jan 7, 2026 at 4:56 AM Matthias van de Meent
> > <boekewurm+postgres@gmail.com> wrote:
> > >
> > > Sorry for the belated reply. I noticed this patch got committed, and
> > > after reading its commit message (and now, code) I'm concerned that
> > > I'm now unable to disable wal_level=logical without removing streaming
> > > replication as feature.
> > > When I configure wal_level=replica, to me that means to NOT enable
> > > wal_level=logical, and that means that I do *not* want the increased
> > > overhead in my cluster's table updates that is associated with
> > > wal_level=logical (but still want to be able to have streaming
> > > replication).
> > >
> > > I had expected the topical feature to be implemented through changing
> > > wal_level to PGC_SIGHUP from PGC_POSTMASTER (and then propagating that
> > > through a similar system), which would've required an explicit
> > > agreement of the cluster owner to increase the WAL overhead in favour
> > > of being able to do logical decoding. However, by making
> > > effective_wal_level controlled by CREATE_REPLICATION_SLOT, this guc is
> > > suddenly effectively set-able by users with the REPLICATION privilege,
> > > which it previously wasn't. And I don't trust my physical subscribers'
> > > roles to _not_ also create a logical replication slot.
> > >
> > > So, sorry I'm late, but I don't agree with the way this decides to
> > > change the effective wal level. It elevates REPLICATION users to be
> > > able to control wal_level without actually going through the security
> > > controls of the system. And no, granting SET ON PARAMETER wal_level
> > > for REPLICATION roles isn't a solution IMO - replication roles
> > > shouldn't decide which types of replication are allowed in the
> > > cluster, only the system owner (and its explicit delegates) should.
> > >
> > > NB. I'm not opposed to changing wal_level in a running cluster, and I
> > > do think that the current xact+checkpoint -based approach to selecting
> > > the local effective_wal_level is fine, as well as standby picking up
> > > the primary's current setting; it's the trigger condition for the
> > > decision to change effective_wal_level that I have problems with.
> > >
> >
> > Thank you for the comments.
> >
> > I understand the concern that users with the REPLICATION privilege can
> > now effectively control wal_level, potentially increasing system-wide
> > overhead. While the REPLICATION privilege already implies a high
> > degree of trust as we allow it to take a basebackup and create a
> > physical slot etc., I agree that this feature might elevate that power
> > further, and we may need a mechanism to address this.
> >
>
> If we allow taking the entire physical data via the REPLICATION
> privilege, then the user must already be highly privileged. Such a
> user is already allowed to read every byte of data in the database via
> physical streaming. Now, such a user influencing wal_level to be
> changed from 'replica' to 'logical' is of lesser harm. I agree that it
> can lead to some non-malicious impact, like disk space (due to
> increased WAL volume), and extra CPU consumption due to extra WAL
> volume. But I think REPLICATION privilege can already lead to extra
> CPU consumption due to wal_sender activity, and even disk space by not
> letting the slot advance, which can even crash the system.
>
> Since these users already have the power to access all data and cause
> a Denial of Service (DoS) via disk exhaustion, the ability to
> "upgrade" WAL logging from replica to logical can be seen as an
> incremental addition to an already highly trusted role. I think we can
> update the documentation of the REPLICATION privilege.

Yeah, the documentation updates would be necessary anyway.

>
> >
> > To address your concerns, I have come up with the following ideas:
> >
>
> I feel, If an administrator does not want to allow logical decoding,
> they can set max_replication_slots to a value that only covers their
> known physical replicas. So, they can still control the additional CPU
> consumption if they are worried that it can cause harm. The other
> possibility is to have a separate GUC for logical slots such as
> max_logical_replication_slots. So, still, an administrator can keep
> control.

max_logical_replication_slots is an interesting idea. It can control
how many logical slots can be created within max_replication_slots
limit and can be defined as PGC_SIGHUP (i.e., similar to the
relationship between autovacuum_max_workers and
autovacuum_worker_slots).

Another idea would be to require for users to have both REPLICATION
privilege and the SET privilege on effective_wal_level in order to
toggle logical replication. Probably we can make users who have
REPLICATION privilege have the SET privilege by default, and users can
REVOKE it if necessary.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com