Обсуждение: BUG #5118: start-status-insert-fatal
The following bug has been logged online: Bug reference: 5118 Logged by: Gerhard Leykam Email address: gel123@sealsystems.de PostgreSQL version: 8.4.0 Operating system: linux Description: start-status-insert-fatal Details: Hi! I am using a start script to set up my PostgreSQL database: it runs initdb, if not done yet, starts the instance with pg_ctl start and checks everything is fine by pg_ctl status. If there is another PostgreSQL database on the same machine listening to the same port, postmaster comes up, pg_ctl status says everthings fine, but postmaster falls down with appropriate message in postgres.log. All SQL commands in my script after status check are running against the other database! How do you think about some sort of postmaster.up file next to postmaster.pid, which is created just as really everything is up and OK? Regards, Gerhard
"Gerhard Leykam" <gel123@sealsystems.de> wrote: > I am using a start script to set up my PostgreSQL database: it runs > initdb, if not done yet, starts the instance with pg_ctl start and > checks everything is fine by pg_ctl status. > > If there is another PostgreSQL database on the same machine > listening to the same port, postmaster comes up, pg_ctl status says > everthings fine, but postmaster falls down with appropriate message > in postgres.log. This is definitely not a PostgreSQL bug. Perhaps the best place to start, before suggesting a new PostgreSQL feature to solve this, would be to post to one of the other lists (admin, maybe?) and describe what you are trying to accomplish with your script, along with the problems you've found with your current version of the script. With a little more information, someone might be able to suggest a solution. (Since you're running on Linux, something involving the lockfile utility might suffice.) -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > "Gerhard Leykam" <gel123@sealsystems.de> wrote: >> I am using a start script to set up my PostgreSQL database: it runs >> initdb, if not done yet, starts the instance with pg_ctl start and >> checks everything is fine by pg_ctl status. >> >> If there is another PostgreSQL database on the same machine >> listening to the same port, postmaster comes up, pg_ctl status says >> everthings fine, but postmaster falls down with appropriate message >> in postgres.log. > This is definitely not a PostgreSQL bug. Well, it's arguably a start-script bug, but I think his point is that it's hard to fix it without any additional support from PG. While mulling that it occurred to me that some additional output from the postmaster would help to solve another thing that's an acknowledged shortcoming of pg_ctl, namely that it can't parse postgresql.conf to find out where the postmaster's communication socket is; cf http://archives.postgresql.org/pgsql-bugs/2009-10/msg00024.php and other older complaints. We could redefine things so that it doesn't need to do that (and also doesn't need to try to intuit the postmaster's port number, which it does do now, but not terribly well). Suppose that after the postmaster is fully up, it writes a file $PGDATA/postmaster.ports, with contents along the lines of 5432 /tmp/.s.PGSQL.5432 ie, IP port number and full socket location (the latter missing on Windows of course). Then pg_ctl only needs to know $PGDATA, and it can get the socket address without any guessing or hard work. (Alternatively we could add this info to postmaster.pid, but a separate file seems like a better idea, if only because the contents of the pid file are subject to change that pg_ctl shouldn't have to deal with.) I am not convinced that this solves the OP's complaint 100%. The failure case that remains is that if the postmaster crashes without deleting this file, then pg_ctl could still be fooled by a competing postmaster that's taken the same port number. I don't know that there's any good way around that one, though. The main thing it does do is remove the pressure to add a lot of code to pg_ctl to try to read postgresql.conf. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Well, it's arguably a start-script bug OK. > While mulling that it occurred to me that some additional output > from the postmaster would help to solve another thing that's an > acknowledged shortcoming of pg_ctl, namely that it can't parse > postgresql.conf to find out where the postmaster's communication > socket is; > cf http://archives.postgresql.org/pgsql-bugs/2009-10/msg00024.php > and other older complaints. > > We could redefine things so that it doesn't need to do that (and > also doesn't need to try to intuit the postmaster's port number, > which it does do now, but not terribly well). Suppose that after > the postmaster is fully up, it writes a file > $PGDATA/postmaster.ports, with contents along the lines of > > 5432 > /tmp/.s.PGSQL.5432 The listen_addresses setting would need to figure in, too. http://archives.postgresql.org/pgsql-hackers/2009-10/msg00022.php Matching that stuff up could start to get a little messy, but it should be doable somehow. This seems likely to overlap the review I was soon going to do of the differences between pg_ctl behavior and what is required for LSB conformance. I'll make sure to test this behavior along with others. One of my current complaints is that pg_ctl doesn't wait until it is actually ready to receive connections before returning an indication of success. I see that I neglected that point in my recently proposed LSB conforming script, but I'm guessing that this fits with other points in the argument that if what I'm doing in the script is demonstrably better than current pg_ctl behavior, we should change pg_ctl to support it rather than scripting around it. (Not that it would be hard to add ten or twenty lines to the script to cover this....) -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Suppose that after the postmaster is fully up, it writes a file >> $PGDATA/postmaster.ports, with contents along the lines of >> >> 5432 >> /tmp/.s.PGSQL.5432 > The listen_addresses setting would need to figure in, too. Yeah, I'm not entirely sure how we'd want to deal with IP addresses, but in principle there could be a line for each postmaster socket not only the Unix-domain socket. > This seems likely to overlap the review I was soon going to do of the > differences between pg_ctl behavior and what is required for LSB > conformance. Agreed, it would be good to do a holistic review of what pg_ctl needs. regards, tom lane
Kevin Grittner wrote: > This seems likely to overlap the review I was soon going to do of the > differences between pg_ctl behavior and what is required for LSB > conformance. I'll make sure to test this behavior along with others. > One of my current complaints is that pg_ctl doesn't wait until it is > actually ready to receive connections before returning an indication > of success. Maybe write the file as postmaster.ports.starting or some such and rename it to its final name when recovery has finished? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote: > I neglected that point in my recently proposed LSB conforming script Hmmm... On review, I see that I assumed that the -w switch on pg_ctl start would cover this. I see that the problem is that this uses psql to connect to the specified port. Besides the problems Tom mentioned with its heuristics to find the right port number for this cluster, there is the OP's point that connections will go to the competing cluster. One thought that occurs to me is that instead of, or in addition to, the new file Tom proposes, the "other cluster" issue could be solved by having a pg_postmaster_pid function in addition to the pg_backend_pid function. This would allow pg_ctl or a script to connect to a port and see if it is the expected postmaster process. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Hmmm... On review, I see that I assumed that the -w switch on pg_ctl > start would cover this. I see that the problem is that this uses psql > to connect to the specified port. Besides the problems Tom mentioned > with its heuristics to find the right port number for this cluster, > there is the OP's point that connections will go to the competing > cluster. One thought that occurs to me is that instead of, or in > addition to, the new file Tom proposes, the "other cluster" issue > could be solved by having a pg_postmaster_pid function in addition to > the pg_backend_pid function. This would allow pg_ctl or a script to > connect to a port and see if it is the expected postmaster process. I would rather see us implement the hypothetical pg_ping protocol and remember to include the postmaster's PID in the response. One of the worst misfeatures of pg_ctl is the need to be able to authenticate itself to the postmaster, and having it rely on being able to actually issue a SQL command would set that breakage in stone. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > I would rather see us implement the hypothetical pg_ping protocol > and remember to include the postmaster's PID in the response. One > of the worst misfeatures of pg_ctl is the need to be able to > authenticate itself to the postmaster, and having it rely on being > able to actually issue a SQL command would set that breakage in > stone. Sounds good to me, other than it stalls pg_ctl revamp until pg_ping is done. I don't remember a clear design of what pg_ping should look like. Does anyone have a clear plan in their head? -Kevin
I wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> ... This would allow pg_ctl or a script to >> connect to a port and see if it is the expected postmaster process. > I would rather see us implement the hypothetical pg_ping protocol > and remember to include the postmaster's PID in the response. Although on second thought, any such test is worth approximately nothing anyway. You can check that the postmaster answering the doorbell reports the same PID that you see in $PGDATA/postmaster.pid, but that still doesn't prove that that postmaster is using that data directory. It could be a random coincidence of PIDs. And in the case of a start script, the probability of random PID match to a stale lockfile is many orders of magnitude higher than you might think; see prior discussions. This could be addressed by having the postmaster report its $PGDATA value in the pg_ping response, but I would be against that on security grounds. We don't let nonprivileged users know where PGDATA is, why would we make the information available without any authentication at all? [ thinks... ] Maybe we could have the postmaster generate a random number at start and include that in both the postmaster.ports file and its pg_ping responses. That would have a substantially lower collision probability than PID, if the number generation process were well designed; and it wouldn't risk exposing anything sensitive in the ping response. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > [ thinks... ] Maybe we could have the postmaster generate a random > number at start and include that in both the postmaster.ports file > and its pg_ping responses. That would have a substantially lower > collision probability than PID, if the number generation process > were well designed; and it wouldn't risk exposing anything sensitive > in the ping response. Unless two postmasters could open the same server socket within a microsecond of one another, a timestamp value captured on opening the server socket seems even better than a random number. Well, I guess if someone subverted the clock it could mislead, but is that really more likely to cause a false match than a random number? -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Sounds good to me, other than it stalls pg_ctl revamp until pg_ping is > done. I don't remember a clear design of what pg_ping should look > like. Does anyone have a clear plan in their head? I don't think anyone's written down a full spec, but it seems like a relatively trivial thing to me. * Client connects to the usual place and sends a packet that has a special "protocol number" (similar to the way we handle SSL requests). AFAICS there wouldn't need to be anything else in the packet. * Postmaster responds with a suitable message and closes the connection. The message should at least include the current postmaster CanAcceptConnections status and the PID/magic number we were just discussing. I can't think of anything else offhand --- anyone else? I'm not sure whether we'd want to provide a function within libpq for this, or just code it in pg_ctl. Within libpq the natural thing would be to take a conninfo connection string, but I'm not sure that suits pg_ctl's purposes. regards, tom lane
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> [ thinks... ] Maybe we could have the postmaster generate a random >> number at start and include that in both the postmaster.ports file >> and its pg_ping responses. > Unless two postmasters could open the same server socket within a > microsecond of one another, a timestamp value captured on opening the > server socket seems even better than a random number. Well, that raises the question of whether postmaster uptime could be considered security-sensitive info. I'd still rather use a random number. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > I'm not sure whether we'd want to provide a function within libpq > for this, or just code it in pg_ctl. I'm inclined to think there would be value to a pg_ping utility to support automated monitoring by unprivileged users on other boxes. That both suggests libpq as the location, and one or two additional pieces of information. An indication of "in archive recovery" versus production or shutdown, for example, might be useful. I'm not sure what else might make sense. > Within libpq the natural thing would be to take a conninfo > connection string, but I'm not sure that suits pg_ctl's purposes. I'm a little lost on that. Would it cause any problems for pg_ctl, or just be more than it would need if it's only implemented there? -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm not sure whether we'd want to provide a function within libpq >> for this, or just code it in pg_ctl. > I'm inclined to think there would be value to a pg_ping utility to > support automated monitoring by unprivileged users on other boxes. True. I had first thought that pg_ctl itself could serve that purpose, but it's really designed around the assumption that it has direct access to $PGDATA, so it wouldn't fit well for monitoring from another machine. > That both suggests libpq as the location, and one or two additional > pieces of information. An indication of "in archive recovery" versus > production or shutdown, for example, might be useful. I'm not sure > what else might make sense. IIRC, that's already covered by the CanAcceptConnections state. We need to be pretty conservative about how much information we expose here, anyhow, since it will be handed out to absolutely anybody who can reach the postmaster port. >> Within libpq the natural thing would be to take a conninfo >> connection string, but I'm not sure that suits pg_ctl's purposes. > I'm a little lost on that. Would it cause any problems for pg_ctl, > or just be more than it would need if it's only implemented there? Well, given what we were saying about a postmaster.ports file, pg_ctl would typically be working with an absolute path to the socket file. Which is not what normally goes into a conninfo string. Perhaps that could be addressed by specifying the file contents differently, but I'd be wary of assuming that *all* users of the ports file will be libpq-based --- for instance a Java version of pg_ctl wouldn't be. regards, tom lane
Tom Lane wrote: > This could be addressed by having the postmaster report its $PGDATA > value in the pg_ping response, but I would be against that on security > grounds. We don't let nonprivileged users know where PGDATA is, why > would we make the information available without any authentication at > all? Maybe a hash of it?
Pedro Gimeno <pgsql-003@personal.formauri.es> wrote: > Tom Lane wrote: > >> This could be addressed by having the postmaster report its $PGDATA >> value in the pg_ping response, but I would be against that on >> security grounds. We don't let nonprivileged users know where >> PGDATA is, why would we make the information available without any >> authentication at all? > > Maybe a hash of it? I'm not really clear on why it's a security issue for someone to know the $PGDATA value, but if it is, there are some "typical" locations for which a hash could be generated and matched against the returned hash; so a hash of it would only be safe for those who chose sufficiently "creative" directory paths. On top of that, I'm not sure it's a very useful way to confirm that you've connected to the correct instance. We often get requests to replace the contents of a development or test database with a dump from a production database. More than once, the DBA doing this has forgotten to stop PostgreSQL before deleting the $PGDATA directory and creating it fresh for the restore of the PITR dump. When we attempt to start the new copy, which has the same $PGDATA, owner, and port number as the copy still running in the deleted directory, we have similar issues to those described in the original post. So, personally, I consider the data directory a less reliable test than the pid. (We don't have a lot of OS crash & reboot occurrences.) -Kevin
On Fri, Oct 16, 2009 at 10:33 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Pedro Gimeno <pgsql-003@personal.formauri.es> wrote: >> Tom Lane wrote: >> >>> This could be addressed by having the postmaster report its $PGDATA >>> value in the pg_ping response, but I would be against that on >>> security grounds. =A0We don't let nonprivileged users know where >>> PGDATA is, why would we make the information available without any >>> authentication at all? >> >> Maybe a hash of it? > > I'm not really clear on why it's a security issue for someone to know > the $PGDATA value, but if it is, there are some "typical" locations > for which a hash could be generated and matched against the returned > hash; so a hash of it would only be safe for those who chose > sufficiently "creative" directory paths. > > On top of that, I'm not sure it's a very useful way to confirm that > you've connected to the correct instance. =A0We often get requests to > replace the contents of a development or test database with a dump > from a production database. =A0More than once, the DBA doing this has > forgotten to stop PostgreSQL before deleting the $PGDATA directory and > creating it fresh for the restore of the PITR dump. When we attempt to > start the new copy, which has the same $PGDATA, owner, and port number > as the copy still running in the deleted directory, we have similar > issues to those described in the original post. =A0So, personally, I > consider the data directory a less reliable test than the pid. =A0(We > don't have a lot of OS crash & reboot occurrences.) Well, then Tom's idea of using a random number seems pretty solid no matter how you slice it. Maybe a UUID. ...Robert
Robert Haas <robertmhaas@gmail.com> wrote: > Well, then Tom's idea of using a random number seems pretty solid no > matter how you slice it. Maybe a UUID. A random number is looking like the best option. I'm not sure why I'd want to generate a perfectly good 128 bit random number and then throw away six of the bits to dress it up as a UUID, though. Do the libraries for that do enough to introduce entropy to compensate for the lost bits? Any other benefit I'm missing? -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Robert Haas <robertmhaas@gmail.com> wrote: >> Well, then Tom's idea of using a random number seems pretty solid no >> matter how you slice it. Maybe a UUID. > A random number is looking like the best option. I'm not sure why I'd > want to generate a perfectly good 128 bit random number and then throw > away six of the bits to dress it up as a UUID, though. Do the > libraries for that do enough to introduce entropy to compensate for > the lost bits? Any other benefit I'm missing? I was envisioning just using PostmasterRandom() (after initializing the seed from time(NULL) as we do now). I don't think we need a super-wide random number. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > I was envisioning just using PostmasterRandom() (after initializing > the seed from time(NULL) as we do now). I don't think we need a > super-wide random number. Fine with me. Just that and CanAcceptConnections in the response? It seems like pg_ping (client utility and related postmaster support) should be a discrete patch. Improvements to pg_ctl and init scripts would come later, as separate patches? -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > It seems like pg_ping (client utility and related postmaster support) > should be a discrete patch. Improvements to pg_ctl and init scripts > would come later, as separate patches? Sounds sane to me. Alternatively, do the postmaster support and make the presumably-minor pg_ctl mods to use it, and then a standalone pg_ping utility could come later. I'm not sure how big the utility would be, but surely bigger than the delta in pg_ctl. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alternatively, do the postmaster support and make the > presumably-minor pg_ctl mods to use it, and then a standalone > pg_ping utility could come later. I'm not sure how big the utility > would be, but surely bigger than the delta in pg_ctl. Bigger than the delta for *just this change* to pg_ctl. I was thinking of addressing all pg_ctl issues at once, but perhaps this one makes sense on its own. If so, your alternative does sound better. -Kevin
On Fri, Oct 16, 2009 at 11:08 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Robert Haas <robertmhaas@gmail.com> wrote: > >> Well, then Tom's idea of using a random number seems pretty solid no >> matter how you slice it. =A0Maybe a UUID. > > A random number is looking like the best option. =A0I'm not sure why I'd > want to generate a perfectly good 128 bit random number and then throw > away six of the bits to dress it up as a UUID, though. =A0Do the > libraries for that do enough to introduce entropy to compensate for > the lost bits? =A0Any other benefit I'm missing? I'm confused. UUIDs throw away 6 bits? Anyway, some smaller random number might be fine, too - not trying to throw a monkey wrench into the process. ...Robert
Robert Haas <robertmhaas@gmail.com> wrote: > UUIDs throw away 6 bits? http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Version_4_.28random.29 -Kevin
On Fri, Oct 16, 2009 at 2:04 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Robert Haas <robertmhaas@gmail.com> wrote: > >> UUIDs throw away 6 bits? > > http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Version_4_.28random.29 How about that. You learn something new every day. ...Robert
Was this ever addressed? --------------------------------------------------------------------------- Tom Lane wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > > Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> I'm not sure whether we'd want to provide a function within libpq > >> for this, or just code it in pg_ctl. > > > I'm inclined to think there would be value to a pg_ping utility to > > support automated monitoring by unprivileged users on other boxes. > > True. I had first thought that pg_ctl itself could serve that purpose, > but it's really designed around the assumption that it has direct access > to $PGDATA, so it wouldn't fit well for monitoring from another machine. > > > That both suggests libpq as the location, and one or two additional > > pieces of information. An indication of "in archive recovery" versus > > production or shutdown, for example, might be useful. I'm not sure > > what else might make sense. > > IIRC, that's already covered by the CanAcceptConnections state. > We need to be pretty conservative about how much information we > expose here, anyhow, since it will be handed out to absolutely > anybody who can reach the postmaster port. > > >> Within libpq the natural thing would be to take a conninfo > >> connection string, but I'm not sure that suits pg_ctl's purposes. > > > I'm a little lost on that. Would it cause any problems for pg_ctl, > > or just be more than it would need if it's only implemented there? > > Well, given what we were saying about a postmaster.ports file, pg_ctl > would typically be working with an absolute path to the socket file. > Which is not what normally goes into a conninfo string. Perhaps that > could be addressed by specifying the file contents differently, but > I'd be wary of assuming that *all* users of the ports file will be > libpq-based --- for instance a Java version of pg_ctl wouldn't be. > > regards, tom lane > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-bugs -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> wrote: > Was this ever addressed? It should probably be on the TODO list. I was going to try to do this along with other items which came out of generating an LSB conforming init script, but have been pulled in different directions for now. When I get the time I've been intending to get back to this, if nobody beats me to it. Do we want one entry with all the miscellaneous pg_ctl issues I've got, or would it be better to keep the separate? -Kevin
Kevin Grittner wrote: > Bruce Momjian <bruce@momjian.us> wrote: > > > Was this ever addressed? > > It should probably be on the TODO list. I was going to try to do > this along with other items which came out of generating an LSB > conforming init script, but have been pulled in different directions > for now. When I get the time I've been intending to get back to > this, if nobody beats me to it. Do we want one entry with all the > miscellaneous pg_ctl issues I've got, or would it be better to keep > the separate? I think you should just edit the TODO wiki and list all the things we agree need fixing: http://wiki.postgresql.org/wiki/Todo -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> wrote: > I think you should just edit the TODO wiki and list all the things > we agree need fixing Will do. It may take me a while to chase down all the issues from the LSB script work. If I recall correctly, some of what I was looking at seems necessary for a well-behaved LSB conforming script, and there was support for the ideas from some quarters, but it fell short of what I'd consider a consensus (for example, escalating the stop request level over time). Do we want to get into a discussion of such issues at this point, add them to the list with a note that discussion is needed before undertaking them, or should I keep them just in my own list? A couple comments which I think were the last on the escalation of stop mode: http://archives.postgresql.org/pgsql-hackers/2009-09/msg01145.php http://archives.postgresql.org/pgsql-hackers/2009-09/msg01158.php Before continuing that discussion, I think we need to decide if *now is the time* to continue that discussion (and other related ones); and I would argue such discussion should wait. These links are just provided to help answer the question of what I should do with such disputed issues when I update the TODO list. I don't want to slip something in without consensus, but I don't want the issue to be lost, or missed by anyone who takes on pg_ctl work. -Kevin
Bruce Momjian <bruce@momjian.us> wrote: > I think you should just edit the TODO wiki and list all the things > we agree need fixing: Done, although with the wealth of opinions and dearth of agreement I referenced much material and said that more discussion was needed before starting development. If anyone figures I missed anything or got overly expansive, have it it. ;-) -Kevin