Thread: [HACKERS] v10 pg_ctl compatibility


[HACKERS] v10 pg_ctl compatibility

From
Jeff Janes
Date:
Should the release notes have a compatibility entry about pg_ctl restart, being used against a running pre-10 server, no longer being able to detect when startup is complete?

I don't know if cross-version use of pg_ctl restart was ever officially supported, but the current behavior is rather confusing (waiting for a long time, and then reporting failure, even though it started successfully).

Cheers,

Jeff

Re: [HACKERS] v10 pg_ctl compatibility

From
Andres Freund
Date:
Hi,

On 2017-09-26 11:59:42 -0700, Jeff Janes wrote:
> Should the release notes have a compatibility entry about pg_ctl restart,
> being used against a running pre-10 server, no longer being able to detect
> when startup is complete?
> 
> I don't know if cross-version use of pg_ctl restart was ever officially
> supported, but the current behavior is rather confusing (waiting for a long
> time, and then reporting failure, even though it started successfully).

I'm actually tempted to just make pg_ctl verify the right version of
postgres is being used. Maybe I'm missing something, but what's the
use-case for allowing it, and every couple releases have some breakage?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] v10 pg_ctl compatibility

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2017-09-26 11:59:42 -0700, Jeff Janes wrote:
>> I don't know if cross-version use of pg_ctl restart was ever officially
>> supported, but the current behavior is rather confusing (waiting for a long
>> time, and then reporting failure, even though it started successfully).

> I'm actually tempted to just make pg_ctl verify the right version of
> postgres is being used. Maybe I'm missing something, but what's the
> use-case for allowing it, and every couple releases have some breakage?

At a high level the use case is
        yum upgrade postgresql
        service postgresql restart

In practice, that wouldn't work across a major-version upgrade anyway,
because of data directory incompatibility.  It should work for minor
versions though, so the version check couldn't be strict.

I'm not really feeling the need to insert a version check though.
It seems like it's more likely to reject cases that would have worked
than to do anything helpful.  The API that pg_ctl relies on is one
that we don't change very often, viz the contents of postmaster.pid.
The fact that we did change it this time is the source of Jeff's
surprise.

Also, what would you check exactly?  Inquiring into what
"postgres --version" returns is not very conclusive about what is
actually running in the data directory pg_ctl has been pointed at.
        regards, tom lane
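
To make the postmaster.pid dependency concrete, here is a minimal Python sketch (not pg_ctl's actual code). It assumes, from a reading of the v10 sources rather than from anything stated in this thread, that v10 added an eighth line to postmaster.pid holding a status word such as "ready", and that v10 pg_ctl waits for that word. Under that assumption a pre-10 server never writes the line, so the wait can only time out. The function name and the sample file contents are hypothetical.

```python
from typing import Optional

# Assumption: in v10 the status word lives on line 8 of postmaster.pid.
PM_STATUS_LINE = 8


def pm_status(pidfile_text: str) -> Optional[str]:
    """Return the status word from postmaster.pid text, or None if absent."""
    lines = pidfile_text.splitlines()
    if len(lines) < PM_STATUS_LINE:
        return None  # older layout: there is nothing to wait for
    return lines[PM_STATUS_LINE - 1].strip() or None


# Hypothetical file contents, for illustration only.
pre10 = "64924\n/data/96\n1506459398\n5432\n/tmp\nlocalhost\n  5432001    163840\n"
v10 = pre10 + "ready   \n"

print(pm_status(pre10))  # None
print(pm_status(v10))    # ready
```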



Re: [HACKERS] v10 pg_ctl compatibility

From
Andres Freund
Date:
On 2017-09-26 15:40:39 -0400, Tom Lane wrote:
> I'm not really feeling the need to insert a version check though.

It's only a mild preference here.


> Also, what would you check exactly?  Inquiring into what
> "postgres --version" returns is not very conclusive about what is
> actually running in the data directory pg_ctl has been pointed at.

Reading PG_VERSION ought to do the trick.

Greetings,

Andres Freund
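
The PG_VERSION idea could be sketched roughly as follows, in Python for brevity. This is an illustration only, not a proposed patch: the function names are hypothetical, and it relies on PG_VERSION in the data directory holding just the major version ("9.6", "10"). Comparing major versions only keeps the minor-version upgrade case Tom describes working.

```python
import re
import tempfile
from pathlib import Path


def data_dir_major(datadir: str) -> str:
    """Return the major version recorded in <datadir>/PG_VERSION."""
    return (Path(datadir) / "PG_VERSION").read_text().strip()


def tool_major(version: str) -> str:
    """Reduce a version such as '9.6.5', '10.0' or '10rc1' to its major part."""
    parts = re.findall(r"\d+", version)
    # Before v10 the major version had two components (9.6); from v10 on, one.
    return parts[0] if int(parts[0]) >= 10 else f"{parts[0]}.{parts[1]}"


def compatible(datadir: str, pg_ctl_version: str) -> bool:
    """True when pg_ctl's major version matches the data directory's."""
    return data_dir_major(datadir) == tool_major(pg_ctl_version)


# Demo against a throwaway directory standing in for a 9.6 data directory.
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "PG_VERSION").write_text("9.6\n")
    print(compatible(d, "9.6.5"))  # True: a minor-version difference is fine
    print(compatible(d, "10rc1"))  # False: v10 pg_ctl, 9.6 data directory
```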



Re: [HACKERS] v10 pg_ctl compatibility

From
Jeff Janes
Date:
On Tue, Sep 26, 2017 at 12:29 PM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2017-09-26 11:59:42 -0700, Jeff Janes wrote:
> > Should the release notes have a compatibility entry about pg_ctl restart,
> > being used against a running pre-10 server, no longer being able to detect
> > when startup is complete?
> >
> > I don't know if cross-version use of pg_ctl restart was ever officially
> > supported, but the current behavior is rather confusing (waiting for a long
> > time, and then reporting failure, even though it started successfully).
>
> I'm actually tempted to just make pg_ctl verify the right version of
> postgres is being used. Maybe I'm missing something, but what's the
> use-case for allowing it, and every couple releases have some breakage?

The use case for me is that I have multiple versions of postgres running on the same machine, and don't want to check which pg_ctl is in my path each time I change one of their configurations and need to restart it.  Or at least, I never had to check before, as pg_ctl goes out of its way to restart it with the same bin/postgres that the server is currently running with, rather than restarting with the bin/postgres currently in the path, or whichever bin/postgres is in the same directory as the bin/pg_ctl.  (Which is itself not an unmitigated win, but it is what I'm used to).

Admittedly, this is a much bigger problem with hacking/testing than it is with production use, but still I do have production machines with mixed versions running, and can see myself screwing this up a few times once one of them gets upgraded to v10, even with an incompatibility warning in the release notes.  But at least I had fair warning.

To add insult to injury, when v10 pg_ctl does restart a pre-10 server and it sits there for a long time waiting for it to start up even though it has already started up, if I hit ctrl-C because I assume something is horribly wrong, it then goes ahead and kills the successfully started-and-running server.

If we do want to do a version check and fail out, I think it should check before shutting it down, rather than shutting it down and then refusing to start.

Cheers,

Jeff
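
The ordering Jeff asks for (validate first, and only then stop the server) can be illustrated with a small Python sketch. Everything here is hypothetical: `restart`, its parameters, and the recorded "stop"/"start" actions merely stand in for whatever pg_ctl would really do.

```python
from typing import Callable, List


def restart(tool_major: str, server_major: str,
            stop: Callable[[], None], start: Callable[[], None]) -> None:
    """Restart only if the versions match; refuse before touching the server."""
    if tool_major != server_major:
        raise RuntimeError(
            f"pg_ctl {tool_major} cannot manage a {server_major} server; "
            "server left running"
        )
    stop()
    start()


# Record which actions ran, standing in for the real stop/start steps.
actions: List[str] = []
try:
    restart("10", "9.6", lambda: actions.append("stop"),
            lambda: actions.append("start"))
except RuntimeError as err:
    print(err)
print(actions)  # [] because the mismatched server was never stopped
```

Because the check runs before `stop()`, a mismatch costs nothing: the old server keeps running and the operator just picks the right pg_ctl.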

Re: [HACKERS] v10 pg_ctl compatibility

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> To add insult to injury, when v10 pg_ctl does restart a pre-10 server and
> it sits there for a long time waiting for it to start up even though it has
> already started up, if I hit ctrl-C because I assume something is horribly
> wrong, it then goes ahead and kills the successfully started-and-running
> server.

Really?  The server should have detached itself from your terminal
group long before that.  What platform is this?
        regards, tom lane



Re: [HACKERS] v10 pg_ctl compatibility

From
Jeff Janes
Date:
On Tue, Sep 26, 2017 at 1:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
> > To add insult to injury, when v10 pg_ctl does restart a pre-10 server and
> > it sits there for a long time waiting for it to start up even though it has
> > already started up, if I hit ctrl-C because I assume something is horribly
> > wrong, it then goes ahead and kills the successfully started-and-running
> > server.
>
> Really?  The server should have detached itself from your terminal
> group long before that.  What platform is this?


CentOS release 6.9 (Final)

The server log file (9.6) says:


   64926  2017-09-26 13:56:38.284 PDT LOG:  database system was shut down at 2017-09-26 13:56:37 PDT
   64926  2017-09-26 13:56:38.299 PDT LOG:  MultiXact member wraparound protections are now enabled
   64930  2017-09-26 13:56:38.311 PDT LOG:  autovacuum launcher started
   64924  2017-09-26 13:56:38.313 PDT LOG:  database system is ready to accept connections
...  hit ctrl-C
   64924  2017-09-26 13:56:47.237 PDT LOG:  received fast shutdown request
   64924  2017-09-26 13:56:47.237 PDT LOG:  aborting any active transactions
   64930  2017-09-26 13:56:47.244 PDT LOG:  autovacuum launcher shutting down
   64927  2017-09-26 13:56:47.261 PDT LOG:  shutting down
...


I can try a different platform (ubuntu 16.04, probably) tonight and see if it does the same thing.

Cheers,

Jeff

Re: [HACKERS] v10 pg_ctl compatibility

From
Andres Freund
Date:
On 2017-09-26 15:15:39 -0700, Jeff Janes wrote:
> On Tue, Sep 26, 2017 at 1:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> > Jeff Janes <jeff.janes@gmail.com> writes:
> > > To add insult to injury, when v10 pg_ctl does restart a pre-10 server and
> > > it sits there for a long time waiting for it to start up even though it has
> > > already started up, if I hit ctrl-C because I assume something is horribly
> > > wrong, it then goes ahead and kills the successfully started-and-running
> > > server.
> >
> > Really?  The server should have detached itself from your terminal
> > group long before that.  What platform is this?
> >
> 
> 
> CentOS release 6.9 (Final)
> 
> The server log file (9.6) says:
> 
> 
>    64926  2017-09-26 13:56:38.284 PDT LOG:  database system was shut down at 2017-09-26 13:56:37 PDT
>    64926  2017-09-26 13:56:38.299 PDT LOG:  MultiXact member wraparound protections are now enabled
>    64930  2017-09-26 13:56:38.311 PDT LOG:  autovacuum launcher started
>    64924  2017-09-26 13:56:38.313 PDT LOG:  database system is ready to accept connections
> ...  hit ctrl-C
>    64924  2017-09-26 13:56:47.237 PDT LOG:  received fast shutdown request
>    64924  2017-09-26 13:56:47.237 PDT LOG:  aborting any active transactions
>    64930  2017-09-26 13:56:47.244 PDT LOG:  autovacuum launcher shutting down
>    64927  2017-09-26 13:56:47.261 PDT LOG:  shutting down
> ...

It's reproducible here.  Postmaster never calls setsid() for itself, nor
does pg_ctl for it, therefore ctrl-c's SIGTERM to pg_ctl also go to its
children, namely postmaster.

Greetings,

Andres Freund



Re: [HACKERS] v10 pg_ctl compatibility

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> On Tue, Sep 26, 2017 at 1:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Really?  The server should have detached itself from your terminal
>> group long before that.  What platform is this?

> CentOS release 6.9 (Final)

Hm, same as here.  Are you perhaps not using pg_ctl's -l option?
If not, the postmaster's stderr would remain attached to your tty,
which might be the reason why a terminal ^C affects it.
        regards, tom lane



Re: [HACKERS] v10 pg_ctl compatibility

From
Andres Freund
Date:
On 2017-09-26 18:54:17 -0400, Tom Lane wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
> > On Tue, Sep 26, 2017 at 1:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Really?  The server should have detached itself from your terminal
> >> group long before that.  What platform is this?
> 
> > CentOS release 6.9 (Final)
> 
> Hm, same as here.  Are you perhaps not using pg_ctl's -l option?
> If not, the postmaster's stderr would remain attached to your tty,
> which might be the reason why a terminal ^C affects it.

Doesn't make a difference here (Debian sid), I see postgres get
SIGTERMed either way. GDBing in and doing a setsid() in postmaster
before ctrl-c'ing pg_ctl "fixes" it.  Is there a reason we're not doing
so after the fork in pg_ctl?

Greetings,

Andres Freund
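
The effect of setsid() discussed here can be observed outside PostgreSQL with a short Python sketch (POSIX only; the helper name is hypothetical). A terminal interrupt is delivered to the whole foreground process group (on typical setups Ctrl-C generates SIGINT, which the postmaster treats as a fast shutdown request, matching the log above), so a child that stays in pg_ctl's group receives it as well. A child started in a new session does not share that group.

```python
import os
import subprocess
import sys

# Child program: report its own session ID and process group ID.
REPORT = "import os; print(os.getsid(0), os.getpgrp())"


def child_ids(new_session: bool) -> tuple:
    """Spawn a child and return its (session ID, process group ID)."""
    out = subprocess.run(
        [sys.executable, "-c", REPORT],
        start_new_session=new_session,  # True makes the child call setsid()
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    sid, pgid = (int(x) for x in out.split())
    return sid, pgid


inherited = child_ids(new_session=False)
detached = child_ids(new_session=True)

# Without setsid() the child stays in our process group, so a terminal
# interrupt aimed at that group reaches it too.
print(inherited[1] == os.getpgrp())
# With setsid() the child leads its own session and process group.
print(detached[1] == os.getpgrp())
```

Here `start_new_session=True` plays the role of the setsid() call Andres made by hand in gdb; whether pg_ctl should do the equivalent after fork is exactly the open question in this message.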



Re: [HACKERS] v10 pg_ctl compatibility

From
Jeff Janes
Date:
On Tue, Sep 26, 2017 at 3:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
> > On Tue, Sep 26, 2017 at 1:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Really?  The server should have detached itself from your terminal
> >> group long before that.  What platform is this?
>
> > CentOS release 6.9 (Final)
>
> Hm, same as here.  Are you perhaps not using pg_ctl's -l option?
> If not, the postmaster's stderr would remain attached to your tty,
> which might be the reason why a terminal ^C affects it.

I was not using -l.  Instead I set logging_collector=on in postgresql.conf, but I suppose that that is not sufficient?

But I just retried with -l, and it still gets the fast shutdown.

Cheers,

Jeff

Re: [HACKERS] v10 pg_ctl compatibility

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> I was not using -l.  Instead I set logging_collector=on in postgresql.conf,
> but I suppose that that is not sufficient?

No, because initial stderr is still connected to whatever.

> But I just retried with -l, and it still gets the fast shutdown.

Hmph.  Doesn't work that way for me on RHEL6, which surely oughta
behave the same as your CentOS.
        regards, tom lane



Re: [HACKERS] v10 pg_ctl compatibility

From
Jeff Janes
Date:
On Tue, Sep 26, 2017 at 4:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
> > I was not using -l.  Instead I set logging_collector=on in postgresql.conf,
> > but I suppose that that is not sufficient?
>
> No, because initial stderr is still connected to whatever.
>
> > But I just retried with -l, and it still gets the fast shutdown.
>
> Hmph.  Doesn't work that way for me on RHEL6, which surely oughta
> behave the same as your CentOS.
>
>                         regards, tom lane

I happened to have a virgin install of CentOS7 here, so I tried it on that.  I installed 9.2 and 10rc1 from repositories (CentOS and PGDG, respectively) rather than from source, and repeated it and still got the fast shutdown when hitting ctrl-C when 10rc1 has restarted the 9.2 server.  

I did the initdb, start, and restart from pg_ctl, not from whatever management tool comes with the packages.  I did it both with the database running as OS user jjanes, and separately running as OS user 'postgres', so it doesn't seem to matter whether uid is high or low.  And with '-l logfile'.

So, I don't know.

Cheers,

Jeff