Re: [Patch] ALTER SYSTEM READ ONLY

From: Andres Freund
Subject: Re: [Patch] ALTER SYSTEM READ ONLY
Date:
Msg-id: 20200617180546.yucxtiupvxghxss6@alap3.anarazel.de
In reply to: Re: [Patch] ALTER SYSTEM READ ONLY  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

Hi,

On 2020-06-17 12:07:22 -0400, Robert Haas wrote:
> On Wed, Jun 17, 2020 at 10:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I also think that putting such a thing into ALTER SYSTEM has got big
> > logical problems.  Someday we will probably want to have ALTER SYSTEM
> > write WAL so that standby servers can absorb the settings changes.
> > But if writing WAL is disabled, how can you ever turn the thing off again?
> 
> I mean, the syntax that we use for a feature like this is arbitrary. I
> picked this one, so I like it, but it can easily be changed if other
> people want something else. The rest of this argument doesn't seem to
> me to make very much sense. The existing ALTER SYSTEM functionality to
> modify a text configuration file isn't replicated today and I'm not
> sure why we should make it so, considering that replication generally
> only considers things that are guaranteed to be the same on the master
> and the standby, which this is not. But even if we did, that has
> nothing to do with whether some functionality that changes the system
> state without changing a text file ought to also be replicated. This
> is a piece of cluster management functionality and it makes no sense
> to replicate it. And no right-thinking person would ever propose to
> change a feature that renders the system read-only in such a way that
> it was impossible to deactivate it. That would be nuts.

I agree that the concrete syntax here doesn't seem to matter much. If
this worked by actually putting a GUC into the config file, it would
perhaps matter a bit more, but it doesn't afaict.  It seems good to
avoid new top-level statements, and ALTER SYSTEM seems to fit well.


I wonder if there's an argument about wanting to be able to execute this
command over a physical replication connection? I think this feature
fairly obviously is a building block for "gracefully fail over to this
standby", and it seems like it'd be nicer if that didn't potentially
require two pg_hba.conf entries for the to-be-promoted primary on the
current/old primary?


> > Lastly, the arguments in favor seem pretty bogus.  HA switchover normally
> > involves just killing the primary server, not expecting that you can
> > leisurely issue some commands to it first.
> 
> Yeah, that's exactly the problem I want to fix. If you kill the master
> server, then you have interrupted service, even for read-only queries.
> That sucks. Also, even if you don't care about interrupting service on
> the master, it's actually sorta hard to guarantee a clean switchover.
> The walsenders are supposed to send all the WAL from the master before
> exiting, but if the connection is broken for some reason, then the
> master is down and the standbys can't stream the rest of the WAL. You
> can start it up again, but then you might generate more WAL. You can
> try to copy the WAL around manually from one pg_wal directory to
> another, but that's not a very nice thing for users to need to do
> manually, and seems buggy and error-prone.

Also (I'm sure you're aware) if you just non-gracefully shut down the
old primary, you're going to have to rewind the old primary to be able
to use it as a standby. And if you non-gracefully stop you're gonna
incur checkpoint overhead, which is *massive* on non-toy
databases. There's a huge practical difference between a minor version
upgrade causing 10s of unavailability and causing 5min-30min.


> And how do you figure out where the WAL ends on the master and make
> sure that the standby replayed it all? If the master is up, it's easy:
> you just use the same queries you use all the time. If the master is
> down, you have to use some different technique that involves manually
> examining files or scrutinizing pg_controldata output. It's actually
> very difficult to get this right.
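
[Illustration, not part of the original message: a minimal sketch of the
"master is up" case described above. It assumes the two values were read
with pg_current_wal_lsn() on the primary and pg_last_wal_replay_lsn() on
the standby, and that writes on the primary have already stopped, since
otherwise the primary's position keeps moving. The helper names
parse_lsn and standby_caught_up are hypothetical.]

```python
def parse_lsn(text: str) -> int:
    """Convert a textual pg_lsn such as '16/B374D848' to a 64-bit integer.

    The text form is two hexadecimal numbers separated by a slash:
    the high 32 bits, then the low 32 bits.
    """
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def standby_caught_up(primary_lsn: str, standby_replay_lsn: str) -> bool:
    """Return True if the standby has replayed at least up to primary_lsn.

    Both arguments are LSN strings as returned by the server
    (assumed here to come from the two functions named in the lead-in).
    """
    return parse_lsn(standby_replay_lsn) >= parse_lsn(primary_lsn)


if __name__ == "__main__":
    # Comparing the strings directly would get this wrong:
    # "F/FFFFFFFF" sorts after "10/0" as text, but is the smaller LSN.
    print(standby_caught_up("F/FFFFFFFF", "10/0"))
```

Comparing the parsed integers rather than the raw strings matters because
the hexadecimal halves are not zero-padded.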

Yea, it's absurdly hard. I think it's really kind of ridiculous that we
expect others to get this right if we, the developers of this stuff,
can't really get it right because it's so complicated. Which imo makes
this:

> > Commands that involve a whole
> > bunch of subtle interlocking --- and, therefore, aren't going to work if
> > anything has gone wrong already anywhere in the server --- seem like a
> > particularly poor thing to be hanging your HA strategy on.

more of an argument for having this type of stuff built in.


> It's important not to conflate controlled switchover with failover.
> When there's a failover, you have to accept some risk of data loss or
> service interruption; but a controlled switchover does not need to
> carry the same risks and there are plenty of systems out there where
> it doesn't.

Yup.

Greetings,

Andres Freund


