Обсуждение: Re: [GENERAL] pg_upgrade & tablespaces

Поиск
Список
Период
Сортировка

Re: [GENERAL] pg_upgrade & tablespaces

От
Bruce Momjian
Дата:
On Sat, Jan 11, 2014 at 10:40:20AM -0800, Adrian Klaver wrote:
> >Right.  I know there were multiple issue with this upgrade, jails
> >probably being the biggest, but a new one I had never heard is that _if_
> >you are placing your tablespaces in the PGDATA directory, and you are
> >upgrading from pre-9.2, if you rename the old data directory, you also
> >need to start the old server and update pg_tablespace.spclocation.
> >
> >No one has ever reported that failure, but it would certainly happen.  I
> >wonder if pg_upgrade should be modified to check that
> >pg_tablespace.spclocation point to real directories for pre-9.2 servers.
> >
> 
> I thought I was understanding, now I am not. This starts with your
> post of last night. So in pre-9.2 cases the tablespace location is
> recorded in two places pg_tablespace and the symlinks in pg_tblspc/.

[ I am moving this discussion to hackers to get developer feedback. ]

Right.

> When you upgrade pg_upgrade only looks at the pg_tablespace  entry
> for pre-9.2 instances or does it look at the pg_tblspc symlinks
> also? If it looks at the symlinks would they need to be changed
> also?

pg_upgrade looks in the pg_tablespace in pre-9.2, and uses a function in
9.2+.  The query is:
   snprintf(query, sizeof(query),            "SELECT    %s "            "FROM  pg_catalog.pg_tablespace "
"WHEREspcname != 'pg_default' AND "            "      spcname != 'pg_global'",   /* 9.2 removed the spclocation column
*/           (GET_MAJOR_VERSION(old_cluster.major_version) <= 901) ?
 
--> "spclocation" : "pg_catalog.pg_tablespace_location(oid) AS spclocation");

> As to your check for directories that sounds like a good idea,
> though I have one question. What constitutes a 'real' directory? I
> can see a situation where someone moves an existing instance from
> $PGDATA to $PGDATA.old and the installs a new version in $PGDATA.
> Then before they do the upgrade they create a new tablespace
> directory in $PGDATA. If they did not upgrade the spclocation in the
> old instance and ran the check it would find a directory but there
> would be nothing in it. So would the check look for actual
> tablespace data?

I would probably just look for the directory.  People are not supposed
to be modifying their clusters during the upgrade, though, as stated, if
they move the old cluster, the are required to update pg_tablespace if
they have tablespaces in PGDATA, which is unfortunate.

I think the big question on adding a check is, how many users of 9.4 are
going to be upgrading from pre-9.2 and have tablespaces in PGDATA, and
will be renaming their old PGDATA directory during the upgrade?  We
could add the test to 9.3 too, of course.

Having pg_tablespace and the symlinks get out of sync was the reason
Magnus removed that duplication in 9.2, but I was unaware of how
pg_upgrade really magnifies the problem for tablespaces in PGDATA by
recommending a PGDATA rename.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: [GENERAL] pg_upgrade & tablespaces

От
Adrian Klaver
Дата:
On 01/11/2014 10:55 AM, Bruce Momjian wrote:
> On Sat, Jan 11, 2014 at 10:40:20AM -0800, Adrian Klaver wrote:
>>> Right.  I know there were multiple issue with this upgrade, jails
>>> probably being the biggest, but a new one I had never heard is that _if_
>>> you are placing your tablespaces in the PGDATA directory, and you are
>>> upgrading from pre-9.2, if you rename the old data directory, you also
>>> need to start the old server and update pg_tablespace.spclocation.
>>>
>>> No one has ever reported that failure, but it would certainly happen.  I
>>> wonder if pg_upgrade should be modified to check that
>>> pg_tablespace.spclocation point to real directories for pre-9.2 servers.
>>>
>>
>> I thought I was understanding, now I am not. This starts with your
>> post of last night. So in pre-9.2 cases the tablespace location is
>> recorded in two places pg_tablespace and the symlinks in pg_tblspc/.
>
> [ I am moving this discussion to hackers to get developer feedback. ]
>
> Right.
>
>> When you upgrade pg_upgrade only looks at the pg_tablespace  entry
>> for pre-9.2 instances or does it look at the pg_tblspc symlinks
>> also? If it looks at the symlinks would they need to be changed
>> also?
>
> pg_upgrade looks in the pg_tablespace in pre-9.2, and uses a function in
> 9.2+.  The query is:
>
>      snprintf(query, sizeof(query),
>               "SELECT    %s "
>               "FROM  pg_catalog.pg_tablespace "
>               "WHERE spcname != 'pg_default' AND "
>               "      spcname != 'pg_global'",
>      /* 9.2 removed the spclocation column */
>               (GET_MAJOR_VERSION(old_cluster.major_version) <= 901) ?
> --> "spclocation" : "pg_catalog.pg_tablespace_location(oid) AS spclocation");


I see, though I have another question. If pg_tablespace and the symlinks 
can get out of sync, as you say below, why is pg_tablespace considered 
the authority? Or to put it another way, why not just look at the 
symlinks as in 9.2+?

>
>> As to your check for directories that sounds like a good idea,
>> though I have one question. What constitutes a 'real' directory? I
>> can see a situation where someone moves an existing instance from
>> $PGDATA to $PGDATA.old and the installs a new version in $PGDATA.
>> Then before they do the upgrade they create a new tablespace
>> directory in $PGDATA. If they did not upgrade the spclocation in the
>> old instance and ran the check it would find a directory but there
>> would be nothing in it. So would the check look for actual
>> tablespace data?
>
> I would probably just look for the directory.  People are not supposed
> to be modifying their clusters during the upgrade, though, as stated, if
> they move the old cluster, the are required to update pg_tablespace if
> they have tablespaces in PGDATA, which is unfortunate.
>
> I think the big question on adding a check is, how many users of 9.4 are
> going to be upgrading from pre-9.2 and have tablespaces in PGDATA, and
> will be renaming their old PGDATA directory during the upgrade?  We
> could add the test to 9.3 too, of course.

Well it is not generally accepted that users should even be creating 
tablespaces in $PGDATA, but it is allowed by the program. My inclination 
is to say that it is then the programs'(Postgres) responsibility to deal 
with it. The alternative is to clarify the documentation and make it the 
users responsibility. As to users upgrading from 9.1- to 9.2+, I see 
still a lot of users posting to --general using 9.1- versions. At some 
point they will likely migrate, so I can see a fix being worthwhile.

>
> Having pg_tablespace and the symlinks get out of sync was the reason
> Magnus removed that duplication in 9.2, but I was unaware of how
> pg_upgrade really magnifies the problem for tablespaces in PGDATA by
> recommending a PGDATA rename.
>

Well it was based on the valid assumption that people would create new 
tablespaces outside $PGDATA because that is really is what is meant to 
happen. You know us users we like to test assumptions:)

-- 
Adrian Klaver
adrian.klaver@gmail.com



Re: [GENERAL] pg_upgrade & tablespaces

От
Bruce Momjian
Дата:
On Sat, Jan 11, 2014 at 12:48:51PM -0800, Adrian Klaver wrote:
> >pg_upgrade looks in the pg_tablespace in pre-9.2, and uses a function in
> >9.2+.  The query is:
> >
> >     snprintf(query, sizeof(query),
> >              "SELECT    %s "
> >              "FROM  pg_catalog.pg_tablespace "
> >              "WHERE spcname != 'pg_default' AND "
> >              "      spcname != 'pg_global'",
> >     /* 9.2 removed the spclocation column */
> >              (GET_MAJOR_VERSION(old_cluster.major_version) <= 901) ?
> >--> "spclocation" : "pg_catalog.pg_tablespace_location(oid) AS spclocation");
> 
> 
> I see, though I have another question. If pg_tablespace and the
> symlinks can get out of sync, as you say below, why is pg_tablespace
> considered the authority? Or to put it another way, why not just
> look at the symlinks as in 9.2+?

Uh, good question.  I think I used the system tables because they were
easier to access.  I can't remember if we used the symlinks for some
things and pg_tablespace for other things in pre-9.2.

> >I think the big question on adding a check is, how many users of 9.4 are
> >going to be upgrading from pre-9.2 and have tablespaces in PGDATA, and
> >will be renaming their old PGDATA directory during the upgrade?  We
> >could add the test to 9.3 too, of course.
> 
> Well it is not generally accepted that users should even be creating
> tablespaces in $PGDATA, but it is allowed by the program. My
> inclination is to say that it is then the programs'(Postgres)
> responsibility to deal with it. The alternative is to clarify the
> documentation and make it the users responsibility. As to users
> upgrading from 9.1- to 9.2+, I see still a lot of users posting to
> --general using 9.1- versions. At some point they will likely
> migrate, so I can see a fix being worthwhile.

True.

> >Having pg_tablespace and the symlinks get out of sync was the reason
> >Magnus removed that duplication in 9.2, but I was unaware of how
> >pg_upgrade really magnifies the problem for tablespaces in PGDATA by
> >recommending a PGDATA rename.
> >
> 
> Well it was based on the valid assumption that people would create
> new tablespaces outside $PGDATA because that is really is what is
> meant to happen. You know us users we like to test assumptions:)

Also true.  :-)

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: [GENERAL] pg_upgrade & tablespaces

От
Magnus Hagander
Дата:
On Sun, Jan 12, 2014 at 5:26 AM, Bruce Momjian <bruce@momjian.us> wrote:
On Sat, Jan 11, 2014 at 12:48:51PM -0800, Adrian Klaver wrote:
> >pg_upgrade looks in the pg_tablespace in pre-9.2, and uses a function in
> >9.2+.  The query is:
> >
> >     snprintf(query, sizeof(query),
> >              "SELECT    %s "
> >              "FROM  pg_catalog.pg_tablespace "
> >              "WHERE spcname != 'pg_default' AND "
> >              "      spcname != 'pg_global'",
> >     /* 9.2 removed the spclocation column */
> >              (GET_MAJOR_VERSION(old_cluster.major_version) <= 901) ?
> >--> "spclocation" : "pg_catalog.pg_tablespace_location(oid) AS spclocation");
>
>
> I see, though I have another question. If pg_tablespace and the
> symlinks can get out of sync, as you say below, why is pg_tablespace
> considered the authority? Or to put it another way, why not just
> look at the symlinks as in 9.2+?

Uh, good question.  I think I used the system tables because they were
easier to access.  I can't remember if we used the symlinks for some
things and pg_tablespace for other things in pre-9.2.

If you mean PostgreSQL internally then no, we didn't use pg_tablespace for anything ever. We only used the symlinks. Which is why it was so easy to remove.

If you were using it for something inside pg_upgrade I don't know, but the backend didn't.


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: [GENERAL] pg_upgrade & tablespaces

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> On Sat, Jan 11, 2014 at 12:48:51PM -0800, Adrian Klaver wrote:
>> I see, though I have another question. If pg_tablespace and the
>> symlinks can get out of sync, as you say below, why is pg_tablespace
>> considered the authority? Or to put it another way, why not just
>> look at the symlinks as in 9.2+?

> Uh, good question.  I think I used the system tables because they were
> easier to access.  I can't remember if we used the symlinks for some
> things and pg_tablespace for other things in pre-9.2.

Well, pre-9.2 pg_dumpall is going to make use of the pg_tablespace
entries, because it has no other choice.  We could conceivably teach
pg_upgrade to look at the symlinks for itself, but we're not going
to do that in pg_dumpall.  Which means that the intermediate dump
script would contain inconsistent location values anyway if the
catalog entries are wrong.  So I don't see any value in changing the
quoted code in pg_upgrade.

It does however seem reasonable for pg_upgrade to note whether any
of the paths are prefixed by old PGDATA and warn about the risks
involved.
        regards, tom lane



Re: [GENERAL] pg_upgrade & tablespaces

От
Bruce Momjian
Дата:
On Sun, Jan 12, 2014 at 12:48:40PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Sat, Jan 11, 2014 at 12:48:51PM -0800, Adrian Klaver wrote:
> >> I see, though I have another question. If pg_tablespace and the
> >> symlinks can get out of sync, as you say below, why is pg_tablespace
> >> considered the authority? Or to put it another way, why not just
> >> look at the symlinks as in 9.2+?
> 
> > Uh, good question.  I think I used the system tables because they were
> > easier to access.  I can't remember if we used the symlinks for some
> > things and pg_tablespace for other things in pre-9.2.
> 
> Well, pre-9.2 pg_dumpall is going to make use of the pg_tablespace
> entries, because it has no other choice.  We could conceivably teach
> pg_upgrade to look at the symlinks for itself, but we're not going
> to do that in pg_dumpall.  Which means that the intermediate dump
> script would contain inconsistent location values anyway if the
> catalog entries are wrong.  So I don't see any value in changing the
> quoted code in pg_upgrade.

OK, agreed.

> It does however seem reasonable for pg_upgrade to note whether any
> of the paths are prefixed by old PGDATA and warn about the risks
> involved.

Uh, the problem is that once you rename the old PGDATA, the
pg_tablespace contents no longer point to the current PGDATA.  The
symlinks, if they used absolute paths, wouldn't point to the current
PGDATA either.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: [GENERAL] pg_upgrade & tablespaces

От
Adrian Klaver
Дата:
On 01/12/2014 07:02 PM, Bruce Momjian wrote:
> On Sun, Jan 12, 2014 at 12:48:40PM -0500, Tom Lane wrote:
>> Bruce Momjian <bruce@momjian.us> writes:
>>> On Sat, Jan 11, 2014 at 12:48:51PM -0800, Adrian Klaver wrote:
>>>> I see, though I have another question. If pg_tablespace and the
>>>> symlinks can get out of sync, as you say below, why is pg_tablespace
>>>> considered the authority? Or to put it another way, why not just
>>>> look at the symlinks as in 9.2+?
>>
>>> Uh, good question.  I think I used the system tables because they were
>>> easier to access.  I can't remember if we used the symlinks for some
>>> things and pg_tablespace for other things in pre-9.2.
>>
>> Well, pre-9.2 pg_dumpall is going to make use of the pg_tablespace
>> entries, because it has no other choice.  We could conceivably teach
>> pg_upgrade to look at the symlinks for itself, but we're not going
>> to do that in pg_dumpall.  Which means that the intermediate dump
>> script would contain inconsistent location values anyway if the
>> catalog entries are wrong.  So I don't see any value in changing the
>> quoted code in pg_upgrade.
>
> OK, agreed.
>
>> It does however seem reasonable for pg_upgrade to note whether any
>> of the paths are prefixed by old PGDATA and warn about the risks
>> involved.
>
> Uh, the problem is that once you rename the old PGDATA, the
> pg_tablespace contents no longer point to the current PGDATA.  The
> symlinks, if they used absolute paths, wouldn't point to the current
> PGDATA either.
>

Well the problem is that it actually points to a current PGDATA just the 
wrong one. To use the source installation path and the suggested upgrade 
method from pg_upgrade.

Start.

/usr/local/pgsql/data/tblspc_dir

mv above to

/usr/local/pgsql_old/

install new version of Postgres to

/usr/local/pgsql/data/


In the pgsql_old installation you have symlinks pointing back to the 
current default location. As well pg_tablespace points back to 
/usr/local/pgsql/data/ The issue is that there is not actually anything 
there in the way of a tablespace. So when pg_upgrade runs it tries to 
upgrade from /usr/local/pgsql/data/tblspc_dir to 
/usr/local/pgsql/data/tblspc_dir where the first directory either does 
not exist. or if the user went ahead and created the directory in the 
new installation, is empty. What is really wanted is to upgrade from 
/usr/local/pgsql_old/data/tblspc_dir to 
/usr/local/pgsql/data/tblspc_dir. Right now the only way that happens is 
with user intervention.

-- 
Adrian Klaver
adrian.klaver@gmail.com



Re: [GENERAL] pg_upgrade & tablespaces

От
Bruce Momjian
Дата:
On Sun, Jan 12, 2014 at 07:58:52PM -0800, Adrian Klaver wrote:
> Well the problem is that it actually points to a current PGDATA just
> the wrong one. To use the source installation path and the suggested
> upgrade method from pg_upgrade.
> 
> Start.
> 
> /usr/local/pgsql/data/tblspc_dir
> 
> mv above to
> 
> /usr/local/pgsql_old/
> 
> install new version of Postgres to
> 
> /usr/local/pgsql/data/
> 
> 
> In the pgsql_old installation you have symlinks pointing back to the
> current default location. As well pg_tablespace points back to
> /usr/local/pgsql/data/ The issue is that there is not actually
> anything there in the way of a tablespace. So when pg_upgrade runs
> it tries to upgrade from /usr/local/pgsql/data/tblspc_dir to
> /usr/local/pgsql/data/tblspc_dir where the first directory either
> does not exist. or if the user went ahead and created the directory
> in the new installation, is empty. What is really wanted is to
> upgrade from /usr/local/pgsql_old/data/tblspc_dir to
> /usr/local/pgsql/data/tblspc_dir. Right now the only way that
> happens is with user intervention.

Right, it points to _nothing_ in the _new_ cluster.  Perhaps the
simplest approach would be to check all the pg_tablespace locations to
see if they point at real directories.  If not, we would have to have
the user update pg_tablespace and the symlinks.  :-(  Actually, even in
9.2+, those symlinks are going to point at the same "nothing".  That
would support checking the symlinks in all versions.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: [GENERAL] pg_upgrade & tablespaces

От
Adrian Klaver
Дата:
On 01/12/2014 08:04 PM, Bruce Momjian wrote:
> On Sun, Jan 12, 2014 at 07:58:52PM -0800, Adrian Klaver wrote:
>> Well the problem is that it actually points to a current PGDATA just
>> the wrong one. To use the source installation path and the suggested
>> upgrade method from pg_upgrade.
>>

>> In the pgsql_old installation you have symlinks pointing back to the
>> current default location. As well pg_tablespace points back to
>> /usr/local/pgsql/data/ The issue is that there is not actually
>> anything there in the way of a tablespace. So when pg_upgrade runs
>> it tries to upgrade from /usr/local/pgsql/data/tblspc_dir to
>> /usr/local/pgsql/data/tblspc_dir where the first directory either
>> does not exist. or if the user went ahead and created the directory
>> in the new installation, is empty. What is really wanted is to
>> upgrade from /usr/local/pgsql_old/data/tblspc_dir to
>> /usr/local/pgsql/data/tblspc_dir. Right now the only way that
>> happens is with user intervention.
>
> Right, it points to _nothing_ in the _new_ cluster.  Perhaps the
> simplest approach would be to check all the pg_tablespace locations to
> see if they point at real directories.  If not, we would have to have
> the user update pg_tablespace and the symlinks.  :-(  Actually, even in
> 9.2+, those symlinks are going to point at the same "nothing".  That
> would support checking the symlinks in all versions.
>

Agreed.


-- 
Adrian Klaver
adrian.klaver@gmail.com



Re: [GENERAL] pg_upgrade & tablespaces

От
Bruce Momjian
Дата:
On Sun, Jan 12, 2014 at 11:04:41PM -0500, Bruce Momjian wrote:
> > In the pgsql_old installation you have symlinks pointing back to the
> > current default location. As well pg_tablespace points back to
> > /usr/local/pgsql/data/ The issue is that there is not actually
> > anything there in the way of a tablespace. So when pg_upgrade runs
> > it tries to upgrade from /usr/local/pgsql/data/tblspc_dir to
> > /usr/local/pgsql/data/tblspc_dir where the first directory either
> > does not exist. or if the user went ahead and created the directory
> > in the new installation, is empty. What is really wanted is to
> > upgrade from /usr/local/pgsql_old/data/tblspc_dir to
> > /usr/local/pgsql/data/tblspc_dir. Right now the only way that
> > happens is with user intervention.
>
> Right, it points to _nothing_ in the _new_ cluster.  Perhaps the
> simplest approach would be to check all the pg_tablespace locations to
> see if they point at real directories.  If not, we would have to have
> the user update pg_tablespace and the symlinks.  :-(  Actually, even in
> 9.2+, those symlinks are going to point at the same "nothing".  That
> would support checking the symlinks in all versions.

I have developed the attached patch which checks all tablespaces to make
sure the directories exist.  I plan to backpatch this.

The reason we haven't seen this bug reported more frequently is that a
_database_ defined in a non-existent tablespace directory already throws
an backend error, so this check is only necessary where tables/indexes
(not databases) are defined in non-existant tablespace directories.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Вложения

Re: [GENERAL] pg_upgrade & tablespaces

От
Bruce Momjian
Дата:
On Wed, Apr 16, 2014 at 01:49:20PM -0400, Bruce Momjian wrote:
> On Sun, Jan 12, 2014 at 11:04:41PM -0500, Bruce Momjian wrote:
> > > In the pgsql_old installation you have symlinks pointing back to the
> > > current default location. As well pg_tablespace points back to
> > > /usr/local/pgsql/data/ The issue is that there is not actually
> > > anything there in the way of a tablespace. So when pg_upgrade runs
> > > it tries to upgrade from /usr/local/pgsql/data/tblspc_dir to
> > > /usr/local/pgsql/data/tblspc_dir where the first directory either
> > > does not exist. or if the user went ahead and created the directory
> > > in the new installation, is empty. What is really wanted is to
> > > upgrade from /usr/local/pgsql_old/data/tblspc_dir to
> > > /usr/local/pgsql/data/tblspc_dir. Right now the only way that
> > > happens is with user intervention.
> > 
> > Right, it points to _nothing_ in the _new_ cluster.  Perhaps the
> > simplest approach would be to check all the pg_tablespace locations to
> > see if they point at real directories.  If not, we would have to have
> > the user update pg_tablespace and the symlinks.  :-(  Actually, even in
> > 9.2+, those symlinks are going to point at the same "nothing".  That
> > would support checking the symlinks in all versions.
> 
> I have developed the attached patch which checks all tablespaces to make
> sure the directories exist.  I plan to backpatch this.
> 
> The reason we haven't seen this bug reported more frequently is that a
> _database_ defined in a non-existent tablespace directory already throws
> an backend error, so this check is only necessary where tables/indexes
> (not databases) are defined in non-existant tablespace directories.

Patch applied and backpatched to 9.3.  I beefed up the C comment to
explain how this can happen:
      Check that the tablespace path exists and is a directory.      Effectively, this is checking only for
tables/indexesin      non-existent tablespace directories.  Databases located      in non-existent tablespaces already
throwa backend error.      Non-existent tablespace directories can occur when a data directory      that contains user
tablespacesis moved as part of pg_upgrade      preparation and the symbolic links are not updated.
 

Thanks for the report and debugging.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +