Обсуждение: Bump default wal_level to logical

Поиск
Список
Период
Сортировка

Bump default wal_level to logical

От
David Fetter
Дата:
Hi,

I'd like to propose $subject, as embodied in the attached patch. This
makes it possible to discover and fulfill a need for logical
replication that can arise at a time when bouncing the server has
become impractical, i.e. when there is already high demand on it.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

Re: Bump default wal_level to logical

От
Amit Kapila
Дата:
On Mon, Jun 8, 2020 at 10:08 AM David Fetter <david@fetter.org> wrote:
>
> Hi,
>
> I'd like to propose $subject, as embodied in the attached patch. This
> makes it possible to discover and fulfill a need for logical
> replication that can arise at a time when bouncing the server has
> become impractical, i.e. when there is already high demand on it.
>

I think we should first do performance testing to see what is the
overhead of making this default.  I think pgbench read-write at
various scale factors would be a good starting point.  Also, we should
see how much additional WAL it generates as compared to current
default.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Bump default wal_level to logical

От
Michael Paquier
Дата:
On Mon, Jun 08, 2020 at 11:59:14AM +0530, Amit Kapila wrote:
> I think we should first do performance testing to see what is the
> overhead of making this default.  I think pgbench read-write at
> various scale factors would be a good starting point.  Also, we should
> see how much additional WAL it generates as compared to current
> default.

+1.  I recall that changing wal_level to logical has been discussed in
the past and performance was the actual take to debate on.
--
Michael

Вложения

Re: Bump default wal_level to logical

От
Magnus Hagander
Дата:


On Mon, Jun 8, 2020 at 8:46 AM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Jun 08, 2020 at 11:59:14AM +0530, Amit Kapila wrote:
> I think we should first do performance testing to see what is the
> overhead of making this default.  I think pgbench read-write at
> various scale factors would be a good starting point.  Also, we should
> see how much additional WAL it generates as compared to current
> default.

+1.  I recall that changing wal_level to logical has been discussed in
the past and performance was the actual take to debate on.

That was at least the discussion (long-going and multi-repeated) before we upped it from minimal to replica. There were some pretty extensive benchmarking done to prove that the difference was very small, and this was weighed against the ability to take basic backups of the system (which arguably is more important than being able to do logical replication).

I agree that we should consider changing it *if* it does not come with a substantial overhead, but that has to be shown.

Of course, what would be even neater would be if it could be changed so you don't have to bounce the server to change the wal_level. That's a bigger change though, but perhaps it is now possible once we have the "global barriers" in 13?

--

Re: Bump default wal_level to logical

От
Tomas Vondra
Дата:
On Mon, Jun 08, 2020 at 11:10:38AM +0200, Magnus Hagander wrote:
>On Mon, Jun 8, 2020 at 8:46 AM Michael Paquier <michael@paquier.xyz> wrote:
>
>> On Mon, Jun 08, 2020 at 11:59:14AM +0530, Amit Kapila wrote:
>> > I think we should first do performance testing to see what is the
>> > overhead of making this default.  I think pgbench read-write at
>> > various scale factors would be a good starting point.  Also, we should
>> > see how much additional WAL it generates as compared to current
>> > default.
>>
>> +1.  I recall that changing wal_level to logical has been discussed in
>> the past and performance was the actual take to debate on.
>>
>
>That was at least the discussion (long-going and multi-repeated) before we
>upped it from minimal to replica. There were some pretty extensive
>benchmarking done to prove that the difference was very small, and this was
>weighed against the ability to take basic backups of the system (which
>arguably is more important than being able to do logical replication).
>
>I agree that we should consider changing it *if* it does not come with a
>substantial overhead, but that has to be shown.
>

I agree performance evaluation is necessary, and I'm willing to spend
some time on it. But I don't think the difference will be much worse
than for the wal_level=replica, at least for common workloads. It's
certainly possible to construct workloads with significant impact, due
to the extra stuff (assignments, cache invalidations and so on).

In general I think the case is somewhat weaker compared to the replica
case, which was required for such basic things like physical backups.


>Of course, what would be even neater would be if it could be changed so
>you don't have to bounce the server to change the wal_level. That's a
>bigger change though, but perhaps it is now possible once we have the
>"global barriers" in 13?
>

Yeah. That would sidestep a lot of the performance concerns, because if
switching from replica to logical is fairly easy / without restart, we
could keep the current default.

Not sure if it's sufficient, though, because switching to logical may
require bumping up number of slots, walsenders, etc. At which point you
actually need a restart. Not to mention that extensions using logical
decoding (like pglogical) need to allocate shared memory etc. But for
the built-in logical replication that is not an issue, ofc.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Bump default wal_level to logical

От
David Fetter
Дата:
On Mon, Jun 08, 2020 at 11:10:38AM +0200, Magnus Hagander wrote:
> On Mon, Jun 8, 2020 at 8:46 AM Michael Paquier <michael@paquier.xyz> wrote:
> 
> > On Mon, Jun 08, 2020 at 11:59:14AM +0530, Amit Kapila wrote:
> > > I think we should first do performance testing to see what is the
> > > overhead of making this default.  I think pgbench read-write at
> > > various scale factors would be a good starting point.  Also, we should
> > > see how much additional WAL it generates as compared to current
> > > default.
> >
> > +1.  I recall that changing wal_level to logical has been discussed in
> > the past and performance was the actual take to debate on.
> >
> 
> That was at least the discussion (long-going and multi-repeated) before we
> upped it from minimal to replica. There were some pretty extensive
> benchmarking done to prove that the difference was very small, and this was
> weighed against the ability to take basic backups of the system (which
> arguably is more important than being able to do logical replication).

I'd argue this a different direction.

Logical replication has been at fundamental to how a lot of systems
operate since Slony came out for the very good reason that it was far
and away the simplest way to accomplish a bunch of design goals. There
are now, and have been for some years, both free and proprietary
systems whose sole purpose is change data capture. PostgreSQL can play
nicely with those systems with this switch flipped on by default.

Looking into the future of PostgreSQL itself, there are things we've
been unable to do thus far that logical replication makes tractable.
These include:

- Zero downtime version changes
- Substantive changes to our on-disk representations between versions
  because upgrading in place places sharp limits on what we could do.

> I agree that we should consider changing it *if* it does not come
> with a substantial overhead, but that has to be shown.

What overhead would be substantial enough to require more work than
changing the default, and under what circumstances?

I ask this because on a heavily loaded system, the kind where
differences could be practical as opposed to merely statistically
significant, statement logging at even the most basic level is a much
bigger burden than the maxed-out WALs are.  Any overhead those WALs
might impose simply disappears in the noise.  The difference is even
more stark in systems subject to audit.

> Of course, what would be even neater would be if it could be changed
> so you don't have to bounce the server to change the wal_level.
> That's a bigger change though, but perhaps it is now possible once
> we have the "global barriers" in 13?

As much as I would love to have this capability, I was hoping to keep
the scope of this contained.  As pointed out down-thread, there's lots
more to doing this dynamically that just turning up the wal_level.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Bump default wal_level to logical

От
Alvaro Herrera
Дата:
On 2020-Jun-08, Tomas Vondra wrote:

> Not sure if it's sufficient, though, because switching to logical may
> require bumping up number of slots, walsenders, etc. At which point you
> actually need a restart. Not to mention that extensions using logical
> decoding (like pglogical) need to allocate shared memory etc. But for
> the built-in logical replication that is not an issue, ofc.

I think it's reasonable to push our default limits for slots,
walsenders, max_bgworkers etc a lot higher than current value (say 10 ->
100).  An unused slot wastes essentially no resources; an unused
walsender is just one PGPROC entry.  If we did that, and also allowed
wal_level to be changed on the fly, we wouldn't need to restart in order
to enable logical replication, so there would be little or no pressure
to change the wal_level default.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Bump default wal_level to logical

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I think it's reasonable to push our default limits for slots,
> walsenders, max_bgworkers etc a lot higher than current value (say 10 ->
> 100).  An unused slot wastes essentially no resources; an unused
> walsender is just one PGPROC entry.  If we did that, and also allowed
> wal_level to be changed on the fly, we wouldn't need to restart in order
> to enable logical replication, so there would be little or no pressure
> to change the wal_level default.

Unused PGPROC entries will still consume semaphores, which is problematic
on at least some OSes.  It's not really clear to me why the default for
walsenders would need to be O(100) anyway.  The existing default of 10
already ought to be enough to cover approximately 99.999% of use cases.

If we can allow wal_level to be changed on the fly, I agree that would
help reduce the pressure to make the default setting more expensive.
I don't recall why it's PGC_POSTMASTER right now, but I suppose there
was a reason for that ...

            regards, tom lane



Re: Bump default wal_level to logical

От
Robert Haas
Дата:
On Mon, Jun 8, 2020 at 1:16 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> I think it's reasonable to push our default limits for slots,
> walsenders, max_bgworkers etc a lot higher than current value (say 10 ->
> 100).  An unused slot wastes essentially no resources; an unused
> walsender is just one PGPROC entry.  If we did that, and also allowed
> wal_level to be changed on the fly, we wouldn't need to restart in order
> to enable logical replication, so there would be little or no pressure
> to change the wal_level default.

Wouldn't having a whole bunch of extra PGPROC entries have negative
implications for the performance of GetSnapshotData() and other things
that don't scale well at high connection counts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Bump default wal_level to logical

От
Kenneth Marshall
Дата:
On Mon, Jun 08, 2020 at 02:58:03PM -0400, Robert Haas wrote:
> On Mon, Jun 8, 2020 at 1:16 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > I think it's reasonable to push our default limits for slots,
> > walsenders, max_bgworkers etc a lot higher than current value (say 10 ->
> > 100).  An unused slot wastes essentially no resources; an unused
> > walsender is just one PGPROC entry.  If we did that, and also allowed
> > wal_level to be changed on the fly, we wouldn't need to restart in order
> > to enable logical replication, so there would be little or no pressure
> > to change the wal_level default.
> 
> Wouldn't having a whole bunch of extra PGPROC entries have negative
> implications for the performance of GetSnapshotData() and other things
> that don't scale well at high connection counts?
> 

+1

I think just having the defaults raised enough to allow even a couple DB
replication slots would be advantageous and allow it to be used to
address spur of the moment needs for systems that need to stay up. It
does seem wasteful to by default support large numbers of slots and
seems to be contrary to the project stance on initial limits.

Regards,
Ken



Re: Bump default wal_level to logical

От
Robert Haas
Дата:
On Mon, Jun 8, 2020 at 5:11 AM Magnus Hagander <magnus@hagander.net> wrote:
> I agree that we should consider changing it *if* it does not come with a substantial overhead, but that has to be
shown.

I think the big overhead is that you log the old version of each row's
primary key (or whatever the replica identity is) when performing an
UPDATE or DELETE. So if you test it with integer keys probably it's
not bad, and I suspect (though I haven't looked) that we don't do the
extra logging when they key hasn't changed. But if you have wide text
columns as keys and you update them a lot then things might not look
so good. I think in the bad cases for this feature the overhead is
vastly more than going from minimal to replica.

As many people here probably know, I am in general skeptical of this
kind of change. It's based on the premise that reconfiguring the
system is either too hard for users to figure out, or too disruptive
because they'll need a restart. I tend to feel that the first problem
should be solved by making it easier to figure out, and the second one
should be solved by not requiring a restart. I don't think that's easy
engineering, because while I think barriers help, they only address
one relatively small aspect of what's probably a pretty difficult
engineering problem. But the real-life analogue of what's being
proposed here seems to be "the people who are buying this house might
not be able to figure out how to turn the lights on if they need more
light, so let's just turn on all the lights before we hand over the
keys, and that way if they just leave them on forever it'll be cool."
To which any reasonable person would say - "if your electrical
switches are that hard to locate and use, that house has got a serious
design problem."

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Bump default wal_level to logical

От
Peter Geoghegan
Дата:
On Mon, Jun 8, 2020 at 12:09 PM Robert Haas <robertmhaas@gmail.com> wrote:
> I think the big overhead is that you log the old version of each row's
> primary key (or whatever the replica identity is) when performing an
> UPDATE or DELETE. So if you test it with integer keys probably it's
> not bad, and I suspect (though I haven't looked) that we don't do the
> extra logging when they key hasn't changed. But if you have wide text
> columns as keys and you update them a lot then things might not look
> so good. I think in the bad cases for this feature the overhead is
> vastly more than going from minimal to replica.
>
> As many people here probably know, I am in general skeptical of this
> kind of change. It's based on the premise that reconfiguring the
> system is either too hard for users to figure out, or too disruptive
> because they'll need a restart.

I completely agree with your analysis, and your conclusions.

-- 
Peter Geoghegan



Re: Bump default wal_level to logical

От
Alvaro Herrera
Дата:
On 2020-Jun-08, Robert Haas wrote:

> On Mon, Jun 8, 2020 at 1:16 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > I think it's reasonable to push our default limits for slots,
> > walsenders, max_bgworkers etc a lot higher than current value (say 10 ->
> > 100).  An unused slot wastes essentially no resources; an unused
> > walsender is just one PGPROC entry.  If we did that, and also allowed
> > wal_level to be changed on the fly, we wouldn't need to restart in order
> > to enable logical replication, so there would be little or no pressure
> > to change the wal_level default.
> 
> Wouldn't having a whole bunch of extra PGPROC entries have negative
> implications for the performance of GetSnapshotData() and other things
> that don't scale well at high connection counts?

On a quantum-mechanics level, sure, but after Andres's snapshot
scalability patches, will it be measurable?  (Besides, if your workload
is so high that you're measurably affected by the additional unused
PGPROC entries, you can always tune it lower.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Bump default wal_level to logical

От
Peter Geoghegan
Дата:
On Mon, Jun 8, 2020 at 12:28 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> On a quantum-mechanics level, sure, but after Andres's snapshot
> scalability patches, will it be measurable?  (Besides, if your workload
> is so high that you're measurably affected by the additional unused
> PGPROC entries, you can always tune it lower.)

The point that Robert went on to make about the increased WAL volume
from logging old primary key (or replica identity) values was a
stronger argument, IMV.

-- 
Peter Geoghegan



Re: Bump default wal_level to logical

От
Andres Freund
Дата:
Hi,

On 2020-06-08 14:58:03 -0400, Robert Haas wrote:
> On Mon, Jun 8, 2020 at 1:16 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > I think it's reasonable to push our default limits for slots,
> > walsenders, max_bgworkers etc a lot higher than current value (say 10 ->
> > 100).  An unused slot wastes essentially no resources; an unused
> > walsender is just one PGPROC entry.  If we did that, and also allowed
> > wal_level to be changed on the fly, we wouldn't need to restart in order
> > to enable logical replication, so there would be little or no pressure
> > to change the wal_level default.
> 
> Wouldn't having a whole bunch of extra PGPROC entries have negative
> implications for the performance of GetSnapshotData() and other things
> that don't scale well at high connection counts?

Some things yes, but I don't think it should have a significant effect
on GetSnapshotData():

We currently don't touch unused PGPROCs for it (by virtue of
procArray->pgprocnos), and we wouldn't with my proposed / pending
changes (because the relevant arrays contain data for connected backends
at the "front").
You can have some effect if you have temporary spikes to very high
connection counts, and then reduce that again. That can lead to a lot of
unused PGXACT entries being interleaved with used ones, leading to
higher cache miss ratios (data cache as well as tlb). But that cannot
become a problem without those PGPROCs ever being used, because IIRC we
otherwise ensure they're used "densely".

There are a few places where we actually look over all PGPROCs, or size
resources according to that, but I think most of those shouldn't be in
hot paths.

There are also effects like the lock hashtables being sized larger,
which then can reduce the cache hit ratio. I guess we could check
whether it'd make sense to charge less than max_locks_per_transaction
for everything but user processes, but I'm a bit doubtful it's worth it.

Greetings,

Andres Freund



Re: Bump default wal_level to logical

От
Andres Freund
Дата:
Hi,

On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
> If we can allow wal_level to be changed on the fly, I agree that would
> help reduce the pressure to make the default setting more expensive.
> I don't recall why it's PGC_POSTMASTER right now, but I suppose there
> was a reason for that ...

There's reasons, but IIRC they're all solvable with reasonable effort. I
think most of it boils down to only being able to rely on the new
wal_level after a while. For minimal->recovery we basically need a
checkpoint started after the change in configuration, and for
recovery->logical we need to wait until all sessions have a) read the
new config setting b) finished the transaction that used the old
setting.

Greetings,

Andres Freund



Re: Bump default wal_level to logical

От
Peter Eisentraut
Дата:
On 2020-06-08 23:32, Andres Freund wrote:
> On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
>> If we can allow wal_level to be changed on the fly, I agree that would
>> help reduce the pressure to make the default setting more expensive.
>> I don't recall why it's PGC_POSTMASTER right now, but I suppose there
>> was a reason for that ...
> 
> There's reasons, but IIRC they're all solvable with reasonable effort. I
> think most of it boils down to only being able to rely on the new
> wal_level after a while. For minimal->recovery we basically need a
> checkpoint started after the change in configuration, and for
> recovery->logical we need to wait until all sessions have a) read the
> new config setting b) finished the transaction that used the old
> setting.

The best behavior from a user's perspective would be if the WAL level 
automatically switched to logical if logical replication slots are 
present.  You might not even need 'logical' as an actual value of 
wal_level anymore, you just need to keep a flag in shared memory that 
records whether at least one logical slot exists.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Bump default wal_level to logical

От
Kyotaro Horiguchi
Дата:
At Tue, 9 Jun 2020 08:52:24 +0200, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in 
> On 2020-06-08 23:32, Andres Freund wrote:
> > On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
> >> If we can allow wal_level to be changed on the fly, I agree that would
> >> help reduce the pressure to make the default setting more expensive.
> >> I don't recall why it's PGC_POSTMASTER right now, but I suppose there
> >> was a reason for that ...
> > There's reasons, but IIRC they're all solvable with reasonable
> > effort. I
> > think most of it boils down to only being able to rely on the new
> > wal_level after a while. For minimal->recovery we basically need a
> > checkpoint started after the change in configuration, and for
> > recovery->logical we need to wait until all sessions have a) read the
> > new config setting b) finished the transaction that used the old
> > setting.
> 
> The best behavior from a user's perspective would be if the WAL level
> automatically switched to logical if logical replication slots are
> present.  You might not even need 'logical' as an actual value of
> wal_level anymore, you just need to keep a flag in shared memory that
> records whether at least one logical slot exists.

Currently logical slots cannot be created while wal_level <
logical. Thus a database that has a logical slot must have been once
executed with wal_level >= logical before the creation of the slot.

It seems to me setting wal_level = logical would be better than
creating a dummy logical slot while initdb..

Coulnd't we add an option to speicfy wal_level for initdb? Or an
option to append arbitrary config lines to postgresql.conf?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Bump default wal_level to logical

От
Magnus Hagander
Дата:


On Tue, Jun 9, 2020 at 10:53 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Tue, 9 Jun 2020 08:52:24 +0200, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in
> On 2020-06-08 23:32, Andres Freund wrote:
> > On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
> >> If we can allow wal_level to be changed on the fly, I agree that would
> >> help reduce the pressure to make the default setting more expensive.
> >> I don't recall why it's PGC_POSTMASTER right now, but I suppose there
> >> was a reason for that ...
> > There's reasons, but IIRC they're all solvable with reasonable
> > effort. I
> > think most of it boils down to only being able to rely on the new
> > wal_level after a while. For minimal->recovery we basically need a
> > checkpoint started after the change in configuration, and for
> > recovery->logical we need to wait until all sessions have a) read the
> > new config setting b) finished the transaction that used the old
> > setting.
>
> The best behavior from a user's perspective would be if the WAL level
> automatically switched to logical if logical replication slots are
> present.  You might not even need 'logical' as an actual value of
> wal_level anymore, you just need to keep a flag in shared memory that
> records whether at least one logical slot exists.

Currently logical slots cannot be created while wal_level <
logical. Thus a database that has a logical slot must have been once
executed with wal_level >= logical before the creation of the slot.

It seems to me setting wal_level = logical would be better than
creating a dummy logical slot while initdb..


I don't think Peter is suggesting a dummy slot. What he's suggesting is allow the creation of a logical slot even when wal_level is not set to logical, and instead automatically raise the wal level to the equivalent of logical when you do. That way, the operation becomes transparent to the user.

 
Coulnd't we add an option to speicfy wal_level for initdb? Or an
option to append arbitrary config lines to postgresql.conf?

That wouldn't solve the problem David raised initially. The whole reason for being able to do this is that you *didn't*k now when you did initdb that you were going to need logical replication at a later stage. So you are already in front of a running cluster with wal_level=replica, and now the cost of turning it to logical includes restarting the db and kicking all sessions out. If you know when you're setting the system up that you're going to need it, then the work of setting it isn't that big.

It might be useful to have a functionality to append arbitrary config lines in initdb, but then that's not all that different from just copying in a pre-made postgresql.auto.conf file into the newly initialized cluster after it's done -- I'm not sure it buys very much.
 
--

Re: Bump default wal_level to logical

От
Amit Kapila
Дата:
On Tue, Jun 9, 2020 at 2:31 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> On Tue, Jun 9, 2020 at 10:53 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>>
>> At Tue, 9 Jun 2020 08:52:24 +0200, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in
>> > On 2020-06-08 23:32, Andres Freund wrote:
>> > > On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
>> > >> If we can allow wal_level to be changed on the fly, I agree that would
>> > >> help reduce the pressure to make the default setting more expensive.
>> > >> I don't recall why it's PGC_POSTMASTER right now, but I suppose there
>> > >> was a reason for that ...
>> > > There's reasons, but IIRC they're all solvable with reasonable
>> > > effort. I
>> > > think most of it boils down to only being able to rely on the new
>> > > wal_level after a while. For minimal->recovery we basically need a
>> > > checkpoint started after the change in configuration, and for
>> > > recovery->logical we need to wait until all sessions have a) read the
>> > > new config setting b) finished the transaction that used the old
>> > > setting.
>> >
>> > The best behavior from a user's perspective would be if the WAL level
>> > automatically switched to logical if logical replication slots are
>> > present.  You might not even need 'logical' as an actual value of
>> > wal_level anymore, you just need to keep a flag in shared memory that
>> > records whether at least one logical slot exists.
>>
>> Currently logical slots cannot be created while wal_level <
>> logical. Thus a database that has a logical slot must have been once
>> executed with wal_level >= logical before the creation of the slot.
>>

I think the creation of slot would take a lot more time in that case
as it needs to wait for existing transactions to finish which I feel
could be confusing to users.  Sure, the cost would have to be incurred
the first time but still the user might tempt to cancel such an
operation if he is not aware of the internals.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Bump default wal_level to logical

От
Magnus Hagander
Дата:
On Tue, Jun 9, 2020 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 9, 2020 at 2:31 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> On Tue, Jun 9, 2020 at 10:53 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>>
>> At Tue, 9 Jun 2020 08:52:24 +0200, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in
>> > On 2020-06-08 23:32, Andres Freund wrote:
>> > > On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
>> > >> If we can allow wal_level to be changed on the fly, I agree that would
>> > >> help reduce the pressure to make the default setting more expensive.
>> > >> I don't recall why it's PGC_POSTMASTER right now, but I suppose there
>> > >> was a reason for that ...
>> > > There's reasons, but IIRC they're all solvable with reasonable
>> > > effort. I
>> > > think most of it boils down to only being able to rely on the new
>> > > wal_level after a while. For minimal->recovery we basically need a
>> > > checkpoint started after the change in configuration, and for
>> > > recovery->logical we need to wait until all sessions have a) read the
>> > > new config setting b) finished the transaction that used the old
>> > > setting.
>> >
>> > The best behavior from a user's perspective would be if the WAL level
>> > automatically switched to logical if logical replication slots are
>> > present.  You might not even need 'logical' as an actual value of
>> > wal_level anymore, you just need to keep a flag in shared memory that
>> > records whether at least one logical slot exists.
>>
>> Currently logical slots cannot be created while wal_level <
>> logical. Thus a database that has a logical slot must have been once
>> executed with wal_level >= logical before the creation of the slot.
>>

I think the creation of slot would take a lot more time in that case
as it needs to wait for existing transactions to finish which I feel
could be confusing to users.  Sure, the cost would have to be incurred
the first time but still the user might tempt to cancel such an
operation if he is not aware of the internals.

Yeah, I am unsure if this is doable, but I think that's what Peter was trying to explain, because that's what would be most user-friendly. But it may definitely not be worth the complexity, I'm guessing.

Being able to change wal_level on reload instead of restart would be less user friendly than that, but more user friendly than what we have now. 

--

Re: Bump default wal_level to logical

От
Amit Kapila
Дата:
On Tue, Jun 9, 2020 at 4:58 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> On Tue, Jun 9, 2020 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Jun 9, 2020 at 2:31 PM Magnus Hagander <magnus@hagander.net> wrote:
>> >
>> > On Tue, Jun 9, 2020 at 10:53 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>> >>
>> >> At Tue, 9 Jun 2020 08:52:24 +0200, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in
>> >> > On 2020-06-08 23:32, Andres Freund wrote:
>> >> > > On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
>> >> > >> If we can allow wal_level to be changed on the fly, I agree that would
>> >> > >> help reduce the pressure to make the default setting more expensive.
>> >> > >> I don't recall why it's PGC_POSTMASTER right now, but I suppose there
>> >> > >> was a reason for that ...
>> >> > > There's reasons, but IIRC they're all solvable with reasonable
>> >> > > effort. I
>> >> > > think most of it boils down to only being able to rely on the new
>> >> > > wal_level after a while. For minimal->recovery we basically need a
>> >> > > checkpoint started after the change in configuration, and for
>> >> > > recovery->logical we need to wait until all sessions have a) read the
>> >> > > new config setting b) finished the transaction that used the old
>> >> > > setting.
>> >> >
>> >> > The best behavior from a user's perspective would be if the WAL level
>> >> > automatically switched to logical if logical replication slots are
>> >> > present.  You might not even need 'logical' as an actual value of
>> >> > wal_level anymore, you just need to keep a flag in shared memory that
>> >> > records whether at least one logical slot exists.
>> >>
>> >> Currently logical slots cannot be created while wal_level <
>> >> logical. Thus a database that has a logical slot must have been once
>> >> executed with wal_level >= logical before the creation of the slot.
>> >>
>>
>> I think the creation of slot would take a lot more time in that case
>> as it needs to wait for existing transactions to finish which I feel
>> could be confusing to users.  Sure, the cost would have to be incurred
>> the first time but still the user might tempt to cancel such an
>> operation if he is not aware of the internals.
>
>
> Yeah, I am unsure if this is doable, but I think that's what Peter was trying to explain, because that's what would
bemost user-friendly. But it may definitely not be worth the complexity, I'm guessing.
 
>

Also, I think we might need to think shall we allow wal_level to be
changed back to replica? If so, how?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Bump default wal_level to logical

От
Amit Kapila
Дата:
On Tue, Jun 9, 2020 at 3:02 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
> > If we can allow wal_level to be changed on the fly, I agree that would
> > help reduce the pressure to make the default setting more expensive.
> > I don't recall why it's PGC_POSTMASTER right now, but I suppose there
> > was a reason for that ...
>
> There's reasons, but IIRC they're all solvable with reasonable effort. I
> think most of it boils down to only being able to rely on the new
> wal_level after a while. For minimal->recovery we basically need a
> checkpoint started after the change in configuration, and for
> recovery->logical we need to wait until all sessions have a) read the
> new config setting b) finished the transaction that used the old
> setting.
>

What if we note down the highest transaction id when we set wal_level
= logical and won't allow a snapshot in logical decoding to reach a
consistent state till we see at least that xid as committed?  I think
this will mean that it won't allow to decode any transaction which is
operated with wal_level < logical and that might serve the purpose.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Bump default wal_level to logical

От
Amit Kapila
Дата:
On Tue, Jun 9, 2020 at 6:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 9, 2020 at 3:02 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
> > > If we can allow wal_level to be changed on the fly, I agree that would
> > > help reduce the pressure to make the default setting more expensive.
> > > I don't recall why it's PGC_POSTMASTER right now, but I suppose there
> > > was a reason for that ...
> >
> > There's reasons, but IIRC they're all solvable with reasonable effort. I
> > think most of it boils down to only being able to rely on the new
> > wal_level after a while. For minimal->recovery we basically need a
> > checkpoint started after the change in configuration, and for
> > recovery->logical we need to wait until all sessions have a) read the
> > new config setting b) finished the transaction that used the old
> > setting.
> >
>
> What if we note down the highest transaction id when we set wal_level
> = logical and won't allow a snapshot in logical decoding to reach a
> consistent state till we see at least that xid as committed?  I think
> this will mean that it won't allow to decode any transaction which is
> operated with wal_level < logical and that might serve the purpose.
>

I intend to say that if the above is possible then we don't need to
wait for (b).

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Bump default wal_level to logical

От
Andres Freund
Дата:
Hi,

On 2020-06-09 08:52:24 +0200, Peter Eisentraut wrote:
> On 2020-06-08 23:32, Andres Freund wrote:
> > On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
> > > If we can allow wal_level to be changed on the fly, I agree that would
> > > help reduce the pressure to make the default setting more expensive.
> > > I don't recall why it's PGC_POSTMASTER right now, but I suppose there
> > > was a reason for that ...
> > 
> > There's reasons, but IIRC they're all solvable with reasonable effort. I
> > think most of it boils down to only being able to rely on the new
> > wal_level after a while. For minimal->recovery we basically need a
> > checkpoint started after the change in configuration, and for
> > recovery->logical we need to wait until all sessions have a) read the
> > new config setting b) finished the transaction that used the old
> > setting.
> 
> The best behavior from a user's perspective would be if the WAL level
> automatically switched to logical if logical replication slots are present.
> You might not even need 'logical' as an actual value of wal_level anymore,
> you just need to keep a flag in shared memory that records whether at least
> one logical slot exists.

Yea, it'd be good to have that. But you'd need the same type of
coordination mentioned above, no?

Greetings,

Andres Freund