Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions
Дата
Msg-id 20190617154133.GF2480@tamriel.snowman.net
обсуждение исходный текст
Ответ на Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions  (Ian Barwick <ian.barwick@2ndquadrant.com>)
Ответы Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions  (Magnus Hagander <magnus@hagander.net>)
Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions  (Ian Barwick <ian.barwick@2ndquadrant.com>)
Список pgsql-hackers
Greetings,

* Ian Barwick (ian.barwick@2ndquadrant.com) wrote:
> On 6/15/19 1:08 AM, Stephen Frost wrote:
> > * Ian Barwick (ian.barwick@2ndquadrant.com) wrote:
> >> Consider the following cascading standby setup with PostgreSQL 12:
> >>
> >> - there exists a running primary "A"
> >> - standby "B" is cloned from primary "A" using "pg_basebackup --write-recovery-conf"
> >> - standby "C" is cloned from standby "B" using "pg_basebackup --write-recovery-conf"
> (...)
> >> However, executing "ALTER SYSTEM SET primary_conninfo = 'host=someothernode'" left
> >> standby "C"'s "postgresql.auto.conf" file looking like this:
> >>
> >>     # Do not edit this file manually!
> >>     # It will be overwritten by the ALTER SYSTEM command.
> >>     primary_conninfo = 'host=someothernode'
> >>     primary_conninfo = 'host=node_B'
> >>
> >> which seems somewhat broken, to say the least.
> >
> > Yes, it's completely broken, which I've complained about at least twice
> > on this list to no avail.
> >
> > Thanks for putting together an example case pointing out why it's a
> > serious issue.  The right thing to do here it so create an open item for
> > PG12 around this.
>
> Done.

Thanks.

> >> Attached patch attempts to rectify this situation by having replace_auto_config_value()
> >> deleting any duplicate entries first, before making any changes to the last entry.
> >
> > While this might be a good belt-and-suspenders kind of change to
> > include, I don't think pg_basebackup should be causing us to have
> > multiple entries in the file in the first place..
> (...)
> >> Also attached is a set of TAP tests to check ALTER SYSTEM works as expected (or
> >> at least as seems correct to me).
> >
> > In my view, at least, we should have a similar test for pg_basebackup to
> > make sure that it doesn't create an invalid .auto.conf file.
>
> Indeed... I'd be happy to create tests... but first we need a definition of what
> constitutes a valid .auto.conf file.
>
> If that definition includes requiring that a parameter may occur only once, then
> we need to provide a way for utilities such as pg_basebackup to write the replication
> configuration to a configuration file in such a way that it doesn't somehow
> render that file invalid.

Yes, I think that we do need to require that a parameter only occur once
and pg_basebackup and friends need to be able to manage that.

> In Pg11 and earlier, it was just a case of writing (or overwriting) recovery.conf.

Right.

> In Pg12, the code in pg_basebackup implies the correct thing to do is append to .auto.conf,
> but as demonstrated that can cause problems with duplicate entries.

Code can have bugs. :)  I'd argue that this is such a bug that needs to
be fixed..

> Having pg_basebackup, or any other utility which clones a standby, parse the file
> itself to remove duplicates seems like a recipe for pain and badly duplicated effort
> (unless there's some way of calling the configuration parsing code while the
> server is not running).

I don't really see that there's much hope for it.

> We could declare that the .auto.conf file will be reset to the default state when
> a standby is cloned, but the implicit behaviour so far has been to copy the file
> as-is (as would happen with any other configuration files in the data directory).
>
> We could avoid the need for modifying the .auto.conf file by declaring that a
> configuration with a specific name in the data directory (let's call it
> "recovery.conf" or "replication.conf") can be used by any utilities writing
> replication configuration (though of course in contrast to the old recovery.conf
> it would be included, if exists, as a normal configuration file, though then the
> precedence would need to be defined, etc..). I'm not sure off the top of my head
> whether something like that has already been discussed, though it's presumably a
> bit late in the release cycle to make such changes anyway?

This was discussed a fair bit, including suggestions along exactly those
lines.  There were various arguments for and against, so you might want
to review the threads where that discussion happened to see what the
reasoning was for not having such an independent file.

> >>> This is absolutely the fault of the system for putting in multiple
> >>> entries into the auto.conf, which it wasn't ever written to handle.
> >>
> > * Amit Kapila (amit.kapila16@gmail.com) wrote:
> >> Right.  I think if possible, it should use existing infrastructure to
> >> write to postgresql.auto.conf rather than inventing a new way to
> >> change it.  Apart from this issue, if we support multiple ways to edit
> >> postgresql.auto.conf, we might end up with more problems like this in
> >> the future where one system is not aware of the way file being edited
> >> by another system.
> >
> > I agere that there should have been some effort put into making the way
> > ALTER SYSTEM is modified be consistent between the backend and utilities
> > like pg_basebackup (which would also help third party tools understand
> > how a non-backend application should be modifying the file).
>
> Did you mean to say "the way postgresql.auto.conf is modified"?

Ah, yes, more-or-less.  I think I was going for 'the way ALTER SYSTEM
modifies postgresql.auto.conf'.

> I suggest explicitly documenting postgresql.auto.conf behaviour (and the circumstances
> where it's acceptable to modify it outside of ALTER SYSTEM calls) in the documentation
> (and possibly in the code), so anyone writing utilities which need to
> append to postgresql.auto.conf knows what the situation is.

Yeah, I would think that, ideally, we'd have some code in the common
library that other utilities could leverage and which the backend would
also use.

> - postgresql.auto.conf is maintained by PostgreSQL and can be rewritten at will by the system
>   at any time

I'd further say something along the lines of 'utilities should not
modify a postgresql.auto.conf that's in place under a running PostgreSQL
cluster'.

> - there is no guarantee that items in postgresql.auto.conf will be in a particular order
> - it  must never be manually edited (though it may be viewed)

'must' is perhaps a bit strong...  I would say something like
"shouldn't", as users may *have* to modify it, if PostgreSQL won't
start due to some configuration in it.

> - postgresql.auto.conf may be appended to by utilities which need to write configuration
>   items and which and need a guarantee that the items will be read by the server at startup
>   (but only when the server is down of course)

Well, I wouldn't support saying "append" since that's what got us into
this mess. :)

> - any duplicate items will be removed when ALTER SYSTEM is executed to change or reset
>   an item (a WARNING will be emitted about duplicate items removed)
> - comment lines (apart from the warning at the top of the file) will be silently removed
>   (this is currently the case anyway)

I'd rather say that 'any duplicate items should be removed, and a
WARNING emitted when detected', or something along those lines.  Same
for comment lines...

> I will happily work on those changes in the next few days if agreed.

Great!

Thanks,

Stephen

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: how to run encoding-dependent tests by default
Следующее
От: Andrew Dunstan
Дата:
Сообщение: Re: how to run encoding-dependent tests by default