Backend working directories and absolute file paths

From
Tom Lane
Date:
Ciprian Popovici discovered an entirely new way to break the safety
interlocks that are meant to prevent you from starting a postmaster
in a data directory of the wrong version:
http://archives.postgresql.org/pgsql-general/2005-06/msg01349.php

While one could say this is pilot error, it's still annoying that
the database manages to hose itself so thoroughly.  The problem
as I see it is that we address all data files (including xlog,
pg_control, etc) via absolute path names, and so renaming a
different data directory into place exposes its contents to being
clobbered by the already-running postmaster.

What I am speculating about is:
    1. At postmaster start (or standalone backend start),
       chdir into $PGDATA.
    2. Henceforth, address everything under $PGDATA by
       relative paths; don't use DataDir in the path at all.

This way, if someone moves a data directory with a running postmaster
in it, nothing breaks at all.  It would probably run a bit faster too,
since file open calls would have fewer directories to traverse through.
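
The effect described above can be sketched outside PostgreSQL. The following is a minimal illustration, not PostgreSQL code: it assumes a POSIX system (where a process's working directory follows the directory itself when it is renamed), and the names "data", "data.moved" and write_after_rename are hypothetical stand-ins chosen for the example. It writes a file after the "data directory" has been renamed and a different directory has been put in its place, once via an absolute path and once via chdir plus a relative path.

```python
import os
import shutil
import tempfile


def write_after_rename(use_relative: bool) -> str:
    """Return the name of the directory that receives a write issued
    after the stand-in data directory has been renamed away."""
    saved_cwd = os.getcwd()
    root = os.path.realpath(tempfile.mkdtemp())
    old_dir = os.path.join(root, "data")        # stand-in for $PGDATA
    new_dir = os.path.join(root, "data.moved")  # where it gets moved to
    os.mkdir(old_dir)
    try:
        if use_relative:
            os.chdir(old_dir)       # step 1: chdir into the data directory
            target = "pg_control"   # step 2: relative path, no DataDir prefix
        else:
            target = os.path.join(old_dir, "pg_control")  # absolute path

        os.rename(old_dir, new_dir)  # directory moved while "running"
        os.mkdir(old_dir)            # a different directory renamed into place

        with open(target, "w") as f:
            f.write("checkpoint")

        # Report which of the two directories now holds the file.
        for name in ("data", "data.moved"):
            if os.path.exists(os.path.join(root, name, "pg_control")):
                return name
        return ""
    finally:
        os.chdir(saved_cwd)
        shutil.rmtree(root)


if __name__ == "__main__":
    print("absolute path wrote into:", write_after_rename(False))
    print("relative path wrote into:", write_after_rename(True))
```

With the absolute path the write lands in whatever directory currently carries the old name; with the relative path it stays with the directory the process started in.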

The only downside I can see to it is that backend and postmaster crashes
would all consistently dump core into $PGDATA (on platforms where cores
dump into the working directory, which is many but not all).  The
current arrangement makes backends dump core into the subdirectory for
the database they are in, which sometimes makes it a bit easier to
identify what's what.  But I can't see that that's a valuable enough
property to override the advantages of using relative paths.

Thoughts?
        regards, tom lane


Re: Backend working directories and absolute file paths

From
David Fetter
Date:
On Thu, Jun 30, 2005 at 10:55:58AM -0400, Tom Lane wrote:
> Ciprian Popovici discovered an entirely new way to break the safety
> interlocks that are meant to prevent you from starting a postmaster
> in a data directory of the wrong version:
> http://archives.postgresql.org/pgsql-general/2005-06/msg01349.php

> While one could say this is pilot error, it's still annoying that
> the database manages to hose itself so thoroughly.

There will always be a way for a user with enough knowledge to hose a
database completely.  I think it's significant that Mr. Popovici is
the first to manage this one, in the sense that it takes an especially
creative combination of a little knowledge and rushing in where angels
fear to tread to reproduce the problem.  There will never be a
solution to human foolishness, so I say we just tell him and others
like him to restore from backup and move on.

Just my $.02

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!


Re: Backend working directories and absolute file paths

From
Tom Lane
Date:
David Fetter <david@fetter.org> writes:
> On Thu, Jun 30, 2005 at 10:55:58AM -0400, Tom Lane wrote:
>> Ciprian Popovici discovered an entirely new way to break the safety
>> interlocks that are meant to prevent you from starting a postmaster
>> in a data directory of the wrong version:
>> http://archives.postgresql.org/pgsql-general/2005-06/msg01349.php

>> While one could say this is pilot error, it's still annoying that
>> the database manages to hose itself so thoroughly.

> There will always be a way for a user with enough knowledge to hose a
> database completely.  I think it's significant that Mr. Popovici is
> the first to manage this one, in the sense that it takes an especially
> creative combination of a little knowledge and rushing in where angels
> fear to tread to reproduce the problem.  There will never be a
> solution to human foolishness, so I say we just tell him and others
> like him to restore from backup and move on.

Well, I'm not sure that he's the first to manage it --- he's just the
first to report it in an identifiable way (which is the usual criterion
for assigning credit for discoveries ;-)).

Renaming data directories around is not that uncommon, especially if
you're using a platform that really really wants the active database to
be /var/lib/pgsql/data (if you're running Red Hat's current selinux
policy, you don't have a whole lotta choice about that).  All you have
to do is rename and shut down the postmaster in the wrong order, and
you're hosed.  (The terminating checkpoint will be able to write some
files and not others, depending on what it already had open, so I think
this could be a recipe for corrupting the moved-away database as well as
the moved-in one :-()
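
The "some files and not others" behaviour can be reproduced in miniature. This is only a sketch under assumptions: a POSIX system, and hypothetical file and directory names ("relfile_a", "relfile_b", "data", "data.old") that are not PostgreSQL's. A file opened before the rename keeps writing into the moved-away directory, while a file opened afterwards by absolute path lands in the moved-in one.

```python
import os
import shutil
import tempfile


def split_checkpoint() -> dict:
    """Show which directory each write ends up in when the data
    directory is swapped between two opens."""
    root = os.path.realpath(tempfile.mkdtemp())
    data = os.path.join(root, "data")       # the path the server was given
    moved = os.path.join(root, "data.old")  # where the directory is moved to
    os.mkdir(data)
    try:
        # Opened before the rename: the descriptor stays bound to this file.
        already_open = open(os.path.join(data, "relfile_a"), "w")

        os.rename(data, moved)  # moved away while the file is still open
        os.mkdir(data)          # a different directory takes over the name

        already_open.write("dirty page")
        already_open.close()

        # Opened after the rename, by absolute path: hits the new directory.
        with open(os.path.join(data, "relfile_b"), "w") as f:
            f.write("dirty page")

        return {
            "moved_away": sorted(os.listdir(moved)),
            "moved_in": sorted(os.listdir(data)),
        }
    finally:
        shutil.rmtree(root)


if __name__ == "__main__":
    print(split_checkpoint())
```

Each directory ends up with part of the "checkpoint", which is the mixed state described above.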

Do you have a specific objection to switching over to relative paths,
or are you just saying that this one report doesn't excite you enough
to change it?
        regards, tom lane


Re: Backend working directories and absolute file paths

From
David Fetter
Date:
On Thu, Jun 30, 2005 at 11:42:59AM -0400, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > On Thu, Jun 30, 2005 at 10:55:58AM -0400, Tom Lane wrote:
> >> Ciprian Popovici discovered an entirely new way to break the safety
> >> interlocks that are meant to prevent you from starting a postmaster
> >> in a data directory of the wrong version:
> >> http://archives.postgresql.org/pgsql-general/2005-06/msg01349.php
> 
> >> While one could say this is pilot error, it's still annoying that
> >> the database manages to hose itself so thoroughly.
> 
> > There will always be a way for a user with enough knowledge to hose a
> > database completely.  I think it's significant that Mr. Popovici is
> > the first to manage this one, in the sense that it takes an especially
> > creative combination of a little knowledge and rushing in where angels
> > fear to tread to reproduce the problem.  There will never be a
> > solution to human foolishness, so I say we just tell him and others
> > like him to restore from backup and move on.
> 
> Well, I'm not sure that he's the first to manage it --- he's just the
> first to report it in an identifiable way (which is the usual criterion
> for assigning credit for discoveries ;-)).

True ;)

> Renaming data directories around is not that uncommon,

With all due respect, I believe that this falls under the category of
prying off cover plates.  When people do this, they're responsible for
knowing what they're about, and taking the consequences if they don't.

In other words, it's pilot error, and that's Not Our Problem.

> especially if you're using a platform that really really wants the
> active database to be /var/lib/pgsql/data (if you're running Red
> Hat's current selinux policy, you don't have a whole lotta choice
> about that).  All you have to do is rename and shut down the
> postmaster in the wrong order, and you're hosed.  (The terminating
> checkpoint will be able to write some files and not others,
> depending on what it already had open, so I think this could be a
> recipe for corrupting the moved-away database as well as the
> moved-in one :-()
> 
> Do you have a specific objection to switching over to relative
> paths, or are you just saying that this one report doesn't excite
> you enough to change it?

The latter, because I believe that this isn't a situation a reasonable
person can stumble into.

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!


Re: Backend working directories and absolute file paths

From
Andrew Dunstan
Date:

Tom Lane wrote:

>What I am speculating about is:
>    1. At postmaster start (or standalone backend start),
>       chdir into $PGDATA.
>    2. Henceforth, address everything under $PGDATA by
>       relative paths; don't use DataDir in the path at all.
>
>This way, if someone moves a data directory with a running postmaster
>in it, nothing breaks at all.  It would probably run a bit faster too,
>since file open calls would have fewer directories to traverse through.

Makes plenty of sense, and is a common way of working.

>The only downside I can see to it is that backend and postmaster crashes
>would all consistently dump core into $PGDATA (on platforms where cores
>dump into the working directory, which is many but not all).  The
>current arrangement makes backends dump core into the subdirectory for
>the database they are in, which sometimes makes it a bit easier to
>identify what's what.  But I can't see that that's a valuable enough
>property to override the advantages of using relative paths.

Maybe I have misunderstood. Could the backends not chdir into the db 
subdir and then do everything relative to that (using .. if necessary)?

How does this all play with tablespaces?

cheers

andrew


Re: Backend working directories and absolute file paths

From
Andrew Dunstan
Date:

David Fetter wrote:

>On Thu, Jun 30, 2005 at 11:42:59AM -0400, Tom Lane wrote:
>
>>Renaming data directories around is not that uncommon,
>
>With all due respect, I believe that this falls under the category of
>prying off cover plates.  When people do this, they're responsible for
>knowing what they're about, and taking the consequences if they don't.
>
>In other words, it's pilot error, and that's Not Our Problem.

We provide many defences against pilot error. So does the Air Force - 
that's part of why you see pilots wearing parachutes.

More to the point, there's not much compelling reason *not* to do this.

cheers

andrew


Re: Backend working directories and absolute file paths

From
David Fetter
Date:
On Thu, Jun 30, 2005 at 02:31:01PM -0400, Andrew Dunstan wrote:
> 
> 
> David Fetter wrote:
> 
> >On Thu, Jun 30, 2005 at 11:42:59AM -0400, Tom Lane wrote:
> >
> >>Renaming data directories around is not that uncommon,
> >
> >With all due respect, I believe that this falls under the category
> >of prying off cover plates.  When people do this, they're
> >responsible for knowing what they're about, and taking the
> >consequences if they don't.
> >
> >In other words, it's pilot error, and that's Not Our Problem.
> 
> We provide many defences against pilot error. So does the Air Force
> - that's part of why you see pilots wearing parachutes.
> 
> More to the point, there's not much compelling reason *not* to do
> this.

OK, let's.  I'm hesitant to talk about doing new stuff, as I'm still
not qualified to do any of it, and there are things that I think are
higher priority ahead of this.

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!


Re: Backend working directories and absolute file paths

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Maybe I have misunderstood. Could the backends not chdir into the db 
> subdir and then do everything relative to that (using .. if necessary)?

If we do that then the path to things from the postmaster is different
than it is for the children, which is going to make things quite a bit
more complicated (eg, md.c will have to be aware of whether it is
running in a backend or the bgwriter).  I'm certain we can make it work
if everyplace uses the same relative paths, but I'm less certain about
the reliability of using varying paths.

Also that would break setups where $PGDATA/base or one of its immediate
children is a symlink.  Now the need to set things up that way is
certainly a lot less than it was before we had tablespaces, but I'm
still inclined to avoid depending on .. for addressing stuff.
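
The symlink problem can be shown with a small sketch. It assumes a POSIX system with symlink support, and the layout ("pgdata", "bigdisk", database subdirectory "1") is a hypothetical stand-in, not a real cluster. Once "base" is a symlink to another location, ".." from inside a database subdirectory resolves against the physical parent of the symlink target, so "../../global" no longer reaches the data directory.

```python
import os
import shutil
import tempfile


def dotdot_through_symlink() -> tuple:
    """Return (found_global, landed_in) for '../..' taken from a database
    subdirectory that is reached through a symlinked 'base'."""
    saved_cwd = os.getcwd()
    root = os.path.realpath(tempfile.mkdtemp())
    pgdata = os.path.join(root, "pgdata")      # stand-in for $PGDATA
    elsewhere = os.path.join(root, "bigdisk")  # where base/ really lives
    os.makedirs(os.path.join(pgdata, "global"))
    os.makedirs(os.path.join(elsewhere, "base", "1"))
    # $PGDATA/base is a symlink to a directory on another "disk".
    os.symlink(os.path.join(elsewhere, "base"), os.path.join(pgdata, "base"))
    try:
        # A backend that chdirs into its database subdirectory ...
        os.chdir(os.path.join(pgdata, "base", "1"))
        # ... and then tries to reach cluster-wide files via '..'.
        found_global = os.path.exists(os.path.join("..", "..", "global"))
        landed_in = os.path.basename(os.path.realpath(os.path.join("..", "..")))
        return found_global, landed_in
    finally:
        os.chdir(saved_cwd)
        shutil.rmtree(root)


if __name__ == "__main__":
    print(dotdot_through_symlink())
```

A single working directory at the top of the data directory avoids the issue, because every path then goes downward through the symlink rather than back up through it.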

> How does this all play with tablespaces?

I don't think it matters, since we address those via pg_tblspc anyway.
        regards, tom lane


Re: Backend working directories and absolute file paths

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> This way, if someone moves a data directory with a running postmaster
> in it, nothing breaks at all.  It would probably run a bit faster too,
> since file open calls would have fewer directories to traverse through.

On reasonable platforms the time spent traversing shouldn't be a problem --
however if there are a lot of metadata operations happening at the same time
absolute file paths can cause contention, especially on the root and first few
path elements.

> The only downside I can see to it is that backend and postmaster crashes
> would all consistently dump core into $PGDATA (on platforms where cores
> dump into the working directory, which is many but not all).  The
> current arrangement makes backends dump core into the subdirectory for
> the database they are in, which sometimes makes it a bit easier to
> identify what's what.  But I can't see that that's a valuable enough
> property to override the advantages of using relative paths.

Having dumps occur in per-database directories vs per-cluster directories
isn't really that big a deal.

However it might be nice to have dumps go to a configurable place. Even to a
place that can be set by a session-settable GUC. That would make debugging by
non-root users feasible. (You might need a second GUC to enable this feature
for security reasons though.)

There's another approach that seems more robust. When initdb is run, randomly
generate a unique id. Then whenever creating files, include that unique id in
the first block of the file. Whenever you open a file, sanity check the first
block. If it doesn't match, PANIC immediately. (Hm, actually you don't even
need to PANIC; just shutting down the one backend should be enough.)

This would ensure that you don't accidentally restore the wrong files from
your cold backup too. Or anything else anyone might try involving swapping
files around.
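
A rough sketch of that idea, purely illustrative: the 16-byte id, the helper names (new_cluster_id, create_stamped, open_checked) and the exception are all assumptions made for the example, and no such mechanism is claimed to exist in PostgreSQL. The id is written at the start of each new file and compared on every open.

```python
import os
import shutil
import tempfile
import uuid

ID_LEN = 16  # length of the hypothetical per-cluster id, in bytes


class ForeignFileError(Exception):
    """Raised when a file carries a different cluster's id."""


def new_cluster_id() -> bytes:
    # What a hypothetical initdb would generate once per cluster.
    return uuid.uuid4().bytes


def create_stamped(path: str, cluster_id: bytes) -> None:
    # Write the id at the start of the file's first block.
    with open(path, "wb") as f:
        f.write(cluster_id)


def open_checked(path: str, cluster_id: bytes):
    # Sanity check the first block on every open; refuse on mismatch.
    f = open(path, "r+b")
    if f.read(ID_LEN) != cluster_id:
        f.close()
        raise ForeignFileError(path)
    return f


def demo() -> tuple:
    """Return (own_ok, foreign_rejected) for one stamped file."""
    tmp = tempfile.mkdtemp()
    try:
        ours, theirs = new_cluster_id(), new_cluster_id()
        path = os.path.join(tmp, "relfile")
        create_stamped(path, ours)

        open_checked(path, ours).close()  # same cluster: open succeeds
        own_ok = True

        try:
            open_checked(path, theirs).close()  # other cluster: must fail
            foreign_rejected = False
        except ForeignFileError:
            foreign_rejected = True
        return own_ok, foreign_rejected
    finally:
        shutil.rmtree(tmp)


if __name__ == "__main__":
    print(demo())
```

The extra read on every open is the cost of the check.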

-- 
greg



Re: Backend working directories and absolute file paths

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> However it might be nice to have dumps go to a configurable place. 

You'd have to talk to your kernel provider about that one; we don't have
any direct control over where or even whether core dumps occur.

> There's another approach that seems more robust. When initdb is run, randomly
> generate a unique id. Then whenever creating files, include that unique id in
> the first block of the file. Whenever you open a file, sanity check the first
> block. If it doesn't match, PANIC immediately. (Hm, actually you don't even
> need to PANIC; just shutting down the one backend should be enough.)

This adds overhead, rather than removing it as I was hoping to do.
        regards, tom lane


Re: Backend working directories and absolute file paths

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > However it might be nice to have dumps go to a configurable place. 
> 
> You'd have to talk to your kernel provider about that one; we don't have
> any direct control over where or even whether core dumps occur.

Well on most platforms setting the cwd would suffice.

This would also potentially allow you to control profiling output (though I
suspect that gets created at fork time, which would be too late) and other
such things.


For that matter, would depending on the cwd interact well with trusted PL
languages that can change the cwd? Would they have to guarantee to set it back
when they're done?

> This adds overhead, rather than removing it as I was hoping to do.

That's true. Hm. If the id were short it could go on every page.

-- 
greg



Re: Backend working directories and absolute file paths

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> For that matter, would depending on the cwd interact well with trusted PL
> languages that can change the cwd?

That would definitely be in the category of "don't do that" --- but
there is such a long list of ways to hose your backend in a trusted PL
that adding one more doesn't make me blink.
        regards, tom lane


Re: Backend working directories and absolute file paths

From
Peter Eisentraut
Date:
Tom Lane wrote:
> You'd have to talk to your kernel provider about that one; we don't
> have any direct control over where or even whether core dumps occur.

Apache used to have (still has?) a way to configure that.  I think they 
must have done the chdir() in the SIGSEGV handler.  Not that I'm 
proposing we do that... ;-)

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/