Обсуждение: create tablespace fails silently, or succeeds improperly
as seen below create tablespace does not throw an error or appear to do anything other than register the tablespace. postgres@db01:~> less /opt/pg/data/jnj_indexes/PG_VERSION 8.4 postgres@db01:~> /opt/pg91/bin/psql -p 5433 psql (9.0.1) Type "help" for help. postgres=# select version(); version -------------------------------------------------------------------------------------------------------------------PostgreSQL 9.0.1on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11), 64-bit (1 row) postgres=# create TABLESPACE jnj_indexes location '/opt/pg/data/jnj_indexes'; CREATE TABLESPACE Dave
Hi Dave, Excerpts from Dave Cramer's message of lun oct 18 12:23:40 -0300 2010: > as seen below create tablespace does not throw an error or appear to > do anything other than register the tablespace. > > postgres@db01:~> less /opt/pg/data/jnj_indexes/PG_VERSION > 8.4 > postgres@db01:~> /opt/pg91/bin/psql -p 5433 > psql (9.0.1) > Type "help" for help. > > postgres=# select version(); > version > ------------------------------------------------------------------------------------------------------------------- > PostgreSQL 9.0.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc > (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11), 64-bit > (1 row) > > postgres=# create TABLESPACE jnj_indexes location '/opt/pg/data/jnj_indexes'; > CREATE TABLESPACE IIRC the reason this works is that the tablespace code now creates a version-specific subdirectory inside the specified directory. This was done to help binary upgrades. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Dave Cramer <pg@fastcrypt.com> writes: > as seen below create tablespace does not throw an error or appear to > do anything other than register the tablespace. I suspect this behavior is partially intentional, because tablespace creation now involves an extra level of subdirectory. However, it's not clear to me why CREATE TABLESPACE is still changing the permissions on the parent directory. Bruce, exactly what is the rationale here? regards, tom lane
On Mon, Oct 18, 2010 at 11:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Dave Cramer <pg@fastcrypt.com> writes: >> as seen below create tablespace does not throw an error or appear to >> do anything other than register the tablespace. > > I suspect this behavior is partially intentional, because tablespace > creation now involves an extra level of subdirectory. However, it's > not clear to me why CREATE TABLESPACE is still changing the permissions > on the parent directory. Bruce, exactly what is the rationale here? OK, it appears there are a few loose ends here then as the documentation http://www.postgresql.org/docs/9.0/interactive/sql-createtablespace.html says the directory needs to be empty. Dave
Dave Cramer wrote: > On Mon, Oct 18, 2010 at 11:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Dave Cramer <pg@fastcrypt.com> writes: > >> as seen below create tablespace does not throw an error or appear to > >> do anything other than register the tablespace. > > > > I suspect this behavior is partially intentional, because tablespace > > creation now involves an extra level of subdirectory. ?However, it's > > not clear to me why CREATE TABLESPACE is still changing the permissions > > on the parent directory. ?Bruce, exactly what is the rationale here? > > OK, it appears there are a few loose ends here then as the > documentation http://www.postgresql.org/docs/9.0/interactive/sql-createtablespace.html > says the directory needs to be empty. Docs updated to say "should", not "must", be empty. Backpatched to 9.0.X. If we need more change, I can do those too. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/doc/src/sgml/ref/create_tablespace.sgml b/doc/src/sgml/ref/create_tablespace.sgml index 24fb79e..beda454 100644 --- a/doc/src/sgml/ref/create_tablespace.sgml +++ b/doc/src/sgml/ref/create_tablespace.sgml @@ -81,7 +81,7 @@ CREATE TABLESPACE <replaceable class="parameter">tablespace_name</replaceable> [ <listitem> <para> The directory that will be used for the tablespace. The directory - must be empty and must be owned by the + should be empty and must be owned by the <productname>PostgreSQL</> system user. The directory must be specified by an absolute path name. </para>
On Mon, Oct 18, 2010 at 2:17 PM, Bruce Momjian <bruce@momjian.us> wrote: > Dave Cramer wrote: >> On Mon, Oct 18, 2010 at 11:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> > Dave Cramer <pg@fastcrypt.com> writes: >> >> as seen below create tablespace does not throw an error or appear to >> >> do anything other than register the tablespace. >> > >> > I suspect this behavior is partially intentional, because tablespace >> > creation now involves an extra level of subdirectory. ?However, it's >> > not clear to me why CREATE TABLESPACE is still changing the permissions >> > on the parent directory. ?Bruce, exactly what is the rationale here? >> >> OK, it appears there are a few loose ends here then as the >> documentation http://www.postgresql.org/docs/9.0/interactive/sql-createtablespace.html >> says the directory needs to be empty. > > Docs updated to say "should", not "must", be empty. Backpatched to > 9.0.X. If we need more change, I can do those too. Perhaps we should fix the behavior rather than the documentation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Alvaro Herrera wrote: > Hi Dave, > > Excerpts from Dave Cramer's message of lun oct 18 12:23:40 -0300 2010: > > as seen below create tablespace does not throw an error or appear to > > do anything other than register the tablespace. > > > > postgres@db01:~> less /opt/pg/data/jnj_indexes/PG_VERSION > > 8.4 > > postgres@db01:~> /opt/pg91/bin/psql -p 5433 > > psql (9.0.1) > > Type "help" for help. > > > > postgres=# select version(); > > version > > ------------------------------------------------------------------------------------------------------------------- > > PostgreSQL 9.0.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc > > (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11), 64-bit > > (1 row) > > > > postgres=# create TABLESPACE jnj_indexes location '/opt/pg/data/jnj_indexes'; > > CREATE TABLESPACE > > IIRC the reason this works is that the tablespace code now creates a > version-specific subdirectory inside the specified directory. This was > done to help binary upgrades. Right, the directory is catalog-version named, which was done to allow for pg_upgrade to work for alpha/beta upgrades (pretty cool). The case above happened because 8.4 still has data in that tablespace. pg_upgrade does supply a script to delete old data files, but it was not used in the case above. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Tom Lane wrote: > Dave Cramer <pg@fastcrypt.com> writes: > > as seen below create tablespace does not throw an error or appear to > > do anything other than register the tablespace. > > I suspect this behavior is partially intentional, because tablespace > creation now involves an extra level of subdirectory. However, it's > not clear to me why CREATE TABLESPACE is still changing the permissions > on the parent directory. Bruce, exactly what is the rationale here? Well, the symbolic link from data/pg_tblspc points to the top directory, not to the catalog-version-named subdirectory. This was done for several reasons, particularly so the directory pointed to by the symlink would be exactly the same as that specified by CREATE TABLESPACE, for code clarity. Tom, is there a particular permission change you were wondering about? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Robert Haas wrote: > On Mon, Oct 18, 2010 at 2:17 PM, Bruce Momjian <bruce@momjian.us> wrote: > > Dave Cramer wrote: > >> On Mon, Oct 18, 2010 at 11:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> > Dave Cramer <pg@fastcrypt.com> writes: > >> >> as seen below create tablespace does not throw an error or appear to > >> >> do anything other than register the tablespace. > >> > > >> > I suspect this behavior is partially intentional, because tablespace > >> > creation now involves an extra level of subdirectory. ?However, it's > >> > not clear to me why CREATE TABLESPACE is still changing the permissions > >> > on the parent directory. ?Bruce, exactly what is the rationale here? > >> > >> OK, it appears there are a few loose ends here then as the > >> documentation http://www.postgresql.org/docs/9.0/interactive/sql-createtablespace.html > >> says the directory needs to be empty. > > > > Docs updated to say "should", not "must", be empty. ?Backpatched to > > 9.0.X. ?If we need more change, I can do those too. > > Perhaps we should fix the behavior rather than the documentation. We can't fix the behavior because we have to allow an old cluster to keep its tablespace files during a pg_upgrade. They are given a script to delete those files later, if they want. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Mon, Oct 18, 2010 at 2:20 PM, Bruce Momjian <bruce@momjian.us> wrote: > Alvaro Herrera wrote: >> Hi Dave, >> >> Excerpts from Dave Cramer's message of lun oct 18 12:23:40 -0300 2010: >> > as seen below create tablespace does not throw an error or appear to >> > do anything other than register the tablespace. >> > >> > postgres@db01:~> less /opt/pg/data/jnj_indexes/PG_VERSION >> > 8.4 >> > postgres@db01:~> /opt/pg91/bin/psql -p 5433 >> > psql (9.0.1) >> > Type "help" for help. >> > >> > postgres=# select version(); >> > version >> > ------------------------------------------------------------------------------------------------------------------- >> > PostgreSQL 9.0.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc >> > (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11), 64-bit >> > (1 row) >> > >> > postgres=# create TABLESPACE jnj_indexes location '/opt/pg/data/jnj_indexes'; >> > CREATE TABLESPACE >> >> IIRC the reason this works is that the tablespace code now creates a >> version-specific subdirectory inside the specified directory. This was >> done to help binary upgrades. > > Right, the directory is catalog-version named, which was done to allow > for pg_upgrade to work for alpha/beta upgrades (pretty cool). The case > above happened because 8.4 still has data in that tablespace. > pg_upgrade does supply a script to delete old data files, but it was not > used in the case above. > right that's because I did not use pg_upgrade. I was manually running create tablespace. Dave
Dave Cramer wrote: > On Mon, Oct 18, 2010 at 2:20 PM, Bruce Momjian <bruce@momjian.us> wrote: > > Alvaro Herrera wrote: > >> Hi Dave, > >> > >> Excerpts from Dave Cramer's message of lun oct 18 12:23:40 -0300 2010: > >> > as seen below create tablespace does not throw an error or appear to > >> > do anything other than register the tablespace. > >> > > >> > postgres@db01:~> less /opt/pg/data/jnj_indexes/PG_VERSION > >> > 8.4 > >> > postgres@db01:~> /opt/pg91/bin/psql -p 5433 > >> > psql (9.0.1) > >> > Type "help" for help. > >> > > >> > postgres=# select version(); > >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? version > >> > ------------------------------------------------------------------------------------------------------------------- > >> > ?PostgreSQL 9.0.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc > >> > (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11), 64-bit > >> > (1 row) > >> > > >> > postgres=# create TABLESPACE jnj_indexes location '/opt/pg/data/jnj_indexes'; > >> > CREATE TABLESPACE > >> > >> IIRC the reason this works is that the tablespace code now creates a > >> version-specific subdirectory inside the specified directory. ?This was > >> done to help binary upgrades. > > > > Right, the directory is catalog-version named, which was done to allow > > for pg_upgrade to work for alpha/beta upgrades (pretty cool). ?The case > > above happened because 8.4 still has data in that tablespace. > > pg_upgrade does supply a script to delete old data files, but it was not > > used in the case above. > > > > right that's because I did not use pg_upgrade. I was manually running > create tablespace. OK, so you were sharing the tablespace with old and new clusters. You are right that in the past that would not have been possible because PG_VERSION would have conflicted, but it is now possible with all new releases because of the catalog-version-named subdirectory. That seems like I a feature, I guess. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Mon, Oct 18, 2010 at 2:39 PM, Bruce Momjian <bruce@momjian.us> wrote: > Dave Cramer wrote: >> On Mon, Oct 18, 2010 at 2:20 PM, Bruce Momjian <bruce@momjian.us> wrote: >> > Alvaro Herrera wrote: >> >> Hi Dave, >> >> >> >> Excerpts from Dave Cramer's message of lun oct 18 12:23:40 -0300 2010: >> >> > as seen below create tablespace does not throw an error or appear to >> >> > do anything other than register the tablespace. >> >> > >> >> > postgres@db01:~> less /opt/pg/data/jnj_indexes/PG_VERSION >> >> > 8.4 >> >> > postgres@db01:~> /opt/pg91/bin/psql -p 5433 >> >> > psql (9.0.1) >> >> > Type "help" for help. >> >> > >> >> > postgres=# select version(); >> >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? version >> >> > ------------------------------------------------------------------------------------------------------------------- >> >> > ?PostgreSQL 9.0.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc >> >> > (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11), 64-bit >> >> > (1 row) >> >> > >> >> > postgres=# create TABLESPACE jnj_indexes location '/opt/pg/data/jnj_indexes'; >> >> > CREATE TABLESPACE >> >> >> >> IIRC the reason this works is that the tablespace code now creates a >> >> version-specific subdirectory inside the specified directory. ?This was >> >> done to help binary upgrades. >> > >> > Right, the directory is catalog-version named, which was done to allow >> > for pg_upgrade to work for alpha/beta upgrades (pretty cool). ?The case >> > above happened because 8.4 still has data in that tablespace. >> > pg_upgrade does supply a script to delete old data files, but it was not >> > used in the case above. >> > >> >> right that's because I did not use pg_upgrade. I was manually running >> create tablespace. > > OK, so you were sharing the tablespace with old and new clusters. You > are right that in the past that would not have been possible because > PG_VERSION would have conflicted, but it is now possible with all new > releases because of the catalog-version-named subdirectory. That seems > like I a feature, I guess. > Sounds unintended. As it turns out I was expecting it to fail and was surprised when it succeeded. Dave
Dave Cramer wrote: > >> >> > postgres=# select version(); > >> >> > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? version > >> >> > ------------------------------------------------------------------------------------------------------------------- > >> >> > ?PostgreSQL 9.0.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc > >> >> > (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11), 64-bit > >> >> > (1 row) > >> >> > > >> >> > postgres=# create TABLESPACE jnj_indexes location '/opt/pg/data/jnj_indexes'; > >> >> > CREATE TABLESPACE > >> >> > >> >> IIRC the reason this works is that the tablespace code now creates a > >> >> version-specific subdirectory inside the specified directory. ?This was > >> >> done to help binary upgrades. > >> > > >> > Right, the directory is catalog-version named, which was done to allow > >> > for pg_upgrade to work for alpha/beta upgrades (pretty cool). ?The case > >> > above happened because 8.4 still has data in that tablespace. > >> > pg_upgrade does supply a script to delete old data files, but it was not > >> > used in the case above. > >> > > >> > >> right that's because I did not use pg_upgrade. I was manually running > >> create tablespace. > > > > OK, so you were sharing the tablespace with old and new clusters. ?You > > are right that in the past that would not have been possible because > > PG_VERSION would have conflicted, but it is now possible with all new > > releases because of the catalog-version-named subdirectory. ?That seems > > like I a feature, I guess. > > > > Sounds unintended. As it turns out I was expecting it to fail and was > surprised when it succeeded. Well, it was intended, or rather required by pg_upgrade. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Dave Cramer wrote: > >> >> IIRC the reason this works is that the tablespace code now creates a > >> >> version-specific subdirectory inside the specified directory. ?This was > >> >> done to help binary upgrades. > >> > > >> > Right, the directory is catalog-version named, which was done to allow > >> > for pg_upgrade to work for alpha/beta upgrades (pretty cool). ?The case > >> > above happened because 8.4 still has data in that tablespace. > >> > pg_upgrade does supply a script to delete old data files, but it was not > >> > used in the case above. > >> > > >> > >> right that's because I did not use pg_upgrade. I was manually running > >> create tablespace. > > > > OK, so you were sharing the tablespace with old and new clusters. ?You > > are right that in the past that would not have been possible because > > PG_VERSION would have conflicted, but it is now possible with all new > > releases because of the catalog-version-named subdirectory. ?That seems > > like I a feature, I guess. > > > > Sounds unintended. As it turns out I was expecting it to fail and was > surprised when it succeeded. Well, it was intended, or rather required by pg_upgrade. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes: > Tom Lane wrote: >> I suspect this behavior is partially intentional, because tablespace >> creation now involves an extra level of subdirectory. However, it's >> not clear to me why CREATE TABLESPACE is still changing the permissions >> on the parent directory. Bruce, exactly what is the rationale here? > Tom, is there a particular permission change you were wondering about? In testing it, I noticed that the permissions of the parent directory (the one named in LOCATION) were changed to 0700, which is not where I'd had them set before. I'm not sure that that is still necessary or reasonable. We should make the subdirectory (eg PG_9.1_201010151) mode 0700, but I am dubious that it's still sensible to require ownership on the parent, much less to change its permissions. The argument for locking down the parent seems to be to prevent a bad guy from renaming the subdirectory out of the way and substituting his own --- but if we're trying to prevent that type of attack, then we have to insist on restrictive permissions all the way up the path, not just on the immediate parent. And we do not try to prevent such attacks on the $PGDATA directory itself, so why should we do it on a tablespace? So basically I think this requires some re-thinking that it didn't get. Perhaps we should just be satisfied if we are able to create the subdirectory as owned by postgres, and leave it to the user as to whether the parent directory is a secure place to put the subdirectory. regards, tom lane
Bruce Momjian <bruce@momjian.us> writes: > Robert Haas wrote: >> Perhaps we should fix the behavior rather than the documentation. > We can't fix the behavior because we have to allow an old cluster to > keep its tablespace files during a pg_upgrade. They are given a script > to delete those files later, if they want. The new *layout* of the files may be forced by pg_upgrade considerations, but I don't think that proves much of anything about ownership and permissions settings --- especially not the fact that we are now enforcing settings that were designed for the data directory itself on what is effectively one level up from that. regards, tom lane
On Mon, Oct 18, 2010 at 3:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Bruce Momjian <bruce@momjian.us> writes: >> Robert Haas wrote: >>> Perhaps we should fix the behavior rather than the documentation. > >> We can't fix the behavior because we have to allow an old cluster to >> keep its tablespace files during a pg_upgrade. They are given a script >> to delete those files later, if they want. > > The new *layout* of the files may be forced by pg_upgrade > considerations, but I don't think that proves much of anything about > ownership and permissions settings --- especially not the fact that we > are now enforcing settings that were designed for the data directory > itself on what is effectively one level up from that. I haven't yet been convinced we need or want to relax the rule about the directory being empty. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Tom Lane wrote: > >> I suspect this behavior is partially intentional, because tablespace > >> creation now involves an extra level of subdirectory. However, it's > >> not clear to me why CREATE TABLESPACE is still changing the permissions > >> on the parent directory. Bruce, exactly what is the rationale here? > > > Tom, is there a particular permission change you were wondering about? > > In testing it, I noticed that the permissions of the parent directory > (the one named in LOCATION) were changed to 0700, which is not where > I'd had them set before. I'm not sure that that is still necessary > or reasonable. We should make the subdirectory (eg PG_9.1_201010151) > mode 0700, but I am dubious that it's still sensible to require > ownership on the parent, much less to change its permissions. The > argument for locking down the parent seems to be to prevent a bad guy > from renaming the subdirectory out of the way and substituting his own > --- but if we're trying to prevent that type of attack, then we have to > insist on restrictive permissions all the way up the path, not just on > the immediate parent. And we do not try to prevent such attacks on the > $PGDATA directory itself, so why should we do it on a tablespace? > > So basically I think this requires some re-thinking that it didn't get. > Perhaps we should just be satisfied if we are able to create the > subdirectory as owned by postgres, and leave it to the user as to > whether the parent directory is a secure place to put the subdirectory. Good point. I did not think through the security restrictions of the parent, but because we were symlinking to it, I thought we should lock it down. I see no problem in relaxing the restrictions as you suggest. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Robert Haas wrote: > >> Perhaps we should fix the behavior rather than the documentation. > > > We can't fix the behavior because we have to allow an old cluster to > > keep its tablespace files during a pg_upgrade. They are given a script > > to delete those files later, if they want. > > The new *layout* of the files may be forced by pg_upgrade > considerations, but I don't think that proves much of anything about > ownership and permissions settings --- especially not the fact that we > are now enforcing settings that were designed for the data directory > itself on what is effectively one level up from that. Agreed. I was speaking only about the ability to have something in that top-level directory when you are creating the tablespace. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Robert Haas wrote: > On Mon, Oct 18, 2010 at 3:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Bruce Momjian <bruce@momjian.us> writes: > >> Robert Haas wrote: > >>> Perhaps we should fix the behavior rather than the documentation. > > > >> We can't fix the behavior because we have to allow an old cluster to > >> keep its tablespace files during a pg_upgrade. ?They are given a script > >> to delete those files later, if they want. > > > > The new *layout* of the files may be forced by pg_upgrade > > considerations, but I don't think that proves much of anything about > > ownership and permissions settings --- especially not the fact that we > > are now enforcing settings that were designed for the data directory > > itself on what is effectively one level up from that. > > I haven't yet been convinced we need or want to relax the rule about > the directory being empty. Uh, how would pg_upgrade work then? It would require renaming the top-level tablespace directory, which might require root permissions. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Mon, Oct 18, 2010 at 3:05 PM, Bruce Momjian <bruce@momjian.us> wrote: > Robert Haas wrote: >> On Mon, Oct 18, 2010 at 3:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> > Bruce Momjian <bruce@momjian.us> writes: >> >> Robert Haas wrote: >> >>> Perhaps we should fix the behavior rather than the documentation. >> > >> >> We can't fix the behavior because we have to allow an old cluster to >> >> keep its tablespace files during a pg_upgrade. ?They are given a script >> >> to delete those files later, if they want. >> > >> > The new *layout* of the files may be forced by pg_upgrade >> > considerations, but I don't think that proves much of anything about >> > ownership and permissions settings --- especially not the fact that we >> > are now enforcing settings that were designed for the data directory >> > itself on what is effectively one level up from that. >> >> I haven't yet been convinced we need or want to relax the rule about >> the directory being empty. > > Uh, how would pg_upgrade work then? It would require renaming the > top-level tablespace directory, which might require root permissions. Huh? Whether or not we choose to store our data files in a subdirectory is an independent question from whether or not we verify that the directory is empty before we start scribbling on it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote: > On Mon, Oct 18, 2010 at 3:05 PM, Bruce Momjian <bruce@momjian.us> wrote: > > Robert Haas wrote: > >> On Mon, Oct 18, 2010 at 3:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> > Bruce Momjian <bruce@momjian.us> writes: > >> >> Robert Haas wrote: > >> >>> Perhaps we should fix the behavior rather than the documentation. > >> > > >> >> We can't fix the behavior because we have to allow an old cluster to > >> >> keep its tablespace files during a pg_upgrade. ?They are given a script > >> >> to delete those files later, if they want. > >> > > >> > The new *layout* of the files may be forced by pg_upgrade > >> > considerations, but I don't think that proves much of anything about > >> > ownership and permissions settings --- especially not the fact that we > >> > are now enforcing settings that were designed for the data directory > >> > itself on what is effectively one level up from that. > >> > >> I haven't yet been convinced we need or want to relax the rule about > >> the directory being empty. > > > > Uh, how would pg_upgrade work then? ?It would require renaming the > > top-level tablespace directory, which might require root permissions. > > Huh? Whether or not we choose to store our data files in a > subdirectory is an independent question from whether or not we verify > that the directory is empty before we start scribbling on it. But the directory will not be empty when pg_upgrade creates the tablespace subdirectory. Backing up, the problem was that we originally stored the data in the symlink directory, and that made it impossible for pg_upgrade to create a parallel tablespace directory for the new version. By using a subdirectory, pg_upgrade knows where to store the tablespace subdirectory and does not interfere with other PG versions stored in the same top-leavel directory. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Oct 18, 2010 at 3:05 PM, Bruce Momjian <bruce@momjian.us> wrote: >> Robert Haas wrote: >>> I haven't yet been convinced we need or want to relax the rule about >>> the directory being empty. >> >> Uh, how would pg_upgrade work then? �It would require renaming the >> top-level tablespace directory, which might require root permissions. > Huh? Whether or not we choose to store our data files in a > subdirectory is an independent question from whether or not we verify > that the directory is empty before we start scribbling on it. No, you're missing the point. If a pre-9.0 DB is told that tablespace T has location '/foo/bar', it'll start creating stuff right in /foo/bar. pg_upgrade will tell 9.0 to create the tablespace with location /foo/bar as well. If 9.0 refuses because that directory contains stuff already, the upgrade will fail. Instead, we make a version-numbered subdirectory and start creating 9.0's stuff there. Given the use of the version-numbered subdirectory, I see no real merit in insisting that the parent directory be empty anyway. It'd be precisely analogous to "initdb -D /foo/bar/data" insisting that /foo/bar be empty, which we have never done and nobody's ever suggested would be a good idea. What *should* be tested, and I hope is, is that if the version-numbered subdirectory already exists then it be empty. Or do we just reject the create whenever the subdirectory exists already? That seems like a bad idea for the same reasons that we allow initdb on an existing data directory: sometimes it's impractical to open up the permissions on the parent directory for exactly the window needed to issue the creation command. I think we should be testing for empty and then putting a PG_VERSION file in it to convert it to nonempty ... which doesn't seem to be happening now. regards, tom lane
On Mon, Oct 18, 2010 at 3:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, Oct 18, 2010 at 3:05 PM, Bruce Momjian <bruce@momjian.us> wrote: >>> Robert Haas wrote: >>>> I haven't yet been convinced we need or want to relax the rule about >>>> the directory being empty. >>> >>> Uh, how would pg_upgrade work then? It would require renaming the >>> top-level tablespace directory, which might require root permissions. > >> Huh? Whether or not we choose to store our data files in a >> subdirectory is an independent question from whether or not we verify >> that the directory is empty before we start scribbling on it. > > No, you're missing the point. If a pre-9.0 DB is told that tablespace T > has location '/foo/bar', it'll start creating stuff right in /foo/bar. > pg_upgrade will tell 9.0 to create the tablespace with location /foo/bar > as well. If 9.0 refuses because that directory contains stuff already, > the upgrade will fail. Instead, we make a version-numbered subdirectory > and start creating 9.0's stuff there. > > Given the use of the version-numbered subdirectory, I see no real merit > in insisting that the parent directory be empty anyway. It'd be > precisely analogous to "initdb -D /foo/bar/data" insisting that /foo/bar > be empty, which we have never done and nobody's ever suggested would be > a good idea. There aren't a lot of sane use cases for storing other bits of data inside either $PGDATA or one of your tablespace directories. However... I guess you might have something like an empty lost+found directory if you're creating the tablespace directly on top of a mount point, and perhaps there's a good argument that that shouldn't interfere. Or, I think I've run across NAS devices where every directory on the system contains a subdirectory called .snapshot, or something like that. So maybe insisting on empty isn't right after all. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Oct 18, 2010 at 3:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Given the use of the version-numbered subdirectory, I see no real merit >> in insisting that the parent directory be empty anyway. �It'd be >> precisely analogous to "initdb -D /foo/bar/data" insisting that /foo/bar >> be empty, which we have never done and nobody's ever suggested would be >> a good idea. > There aren't a lot of sane use cases for storing other bits of data > inside either $PGDATA or one of your tablespace directories. > However... I guess you might have something like an empty lost+found > directory if you're creating the tablespace directly on top of a mount > point, and perhaps there's a good argument that that shouldn't > interfere. Or, I think I've run across NAS devices where every > directory on the system contains a subdirectory called .snapshot, or > something like that. So maybe insisting on empty isn't right after > all. Yeah. We have gotten complaints in the past from people who tried to specify a mount point as a tablespace, and it failed because of lost+found or the mount dir being root-owned. We've told them to make a subdirectory, but that always seemed like a workaround. With the new layout there's no longer any strong reason to prevent this case from working. Basically, I'm thinking that given CREATE TABLESPACE LOCATION '/foo/bar' the creation and properties of /foo/bar/PG_9.0_201004261 ought to be handled *exactly* the way that the -D target directory of initdb is. We have more than ten years experience behind the assertion that we're dealing with that case in a good way. We should transfer that behavior over to tablespace directories rather than inventing something that works a shade differently. Barring objections, I'll go make it work that way in HEAD and 9.0. regards, tom lane
Tom Lane wrote: > Yeah. We have gotten complaints in the past from people who tried to > specify a mount point as a tablespace, and it failed because of > lost+found or the mount dir being root-owned. We've told them to make a > subdirectory, but that always seemed like a workaround. With the new > layout there's no longer any strong reason to prevent this case from > working. > > Basically, I'm thinking that given CREATE TABLESPACE LOCATION '/foo/bar' > the creation and properties of /foo/bar/PG_9.0_201004261 ought to be > handled *exactly* the way that the -D target directory of initdb is. > We have more than ten years experience behind the assertion that we're > dealing with that case in a good way. We should transfer that behavior > over to tablespace directories rather than inventing something that > works a shade differently. > > Barring objections, I'll go make it work that way in HEAD and 9.0. Yes please, thanks. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
I'm finally getting around to something that's been on my todo list for a couple of months. I wrote: > Basically, I'm thinking that given CREATE TABLESPACE LOCATION '/foo/bar' > the creation and properties of /foo/bar/PG_9.0_201004261 ought to be > handled *exactly* the way that the -D target directory of initdb is. > We have more than ten years experience behind the assertion that we're > dealing with that case in a good way. We should transfer that behavior > over to tablespace directories rather than inventing something that > works a shade differently. > Barring objections, I'll go make it work that way in HEAD and 9.0. Looking at initdb, there's a couple of hundred lines worth of code involved here. Some of it is not directly sharable with the backend because of the way it manages error cases, but at least the two functions mkdir_p() and check_data_dir() could conceivably be put into src/port/. The former is about 100 lines and the latter about 50. Is sharing them worth doing, or should I just copy-and-paste into commands/tablespace.c? If we're not sharing mkdir_p in toto, I'd be inclined to not bother with duplicating initdb's willingness to create parent directories --- it's not clear to me that that's very sensible for a tablespace creation command anyway. Another question is whether we're really hot enough about this to back-patch the change into 9.0. Given the lack of other complaints since October, maybe we shouldn't take any risk here. Messing around with new modules in src/port/ would be more appetizing if it's HEAD only. Thoughts? regards, tom lane
On Fri, Dec 10, 2010 at 3:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I'm finally getting around to something that's been on my todo list for > a couple of months. > > I wrote: >> Basically, I'm thinking that given CREATE TABLESPACE LOCATION '/foo/bar' >> the creation and properties of /foo/bar/PG_9.0_201004261 ought to be >> handled *exactly* the way that the -D target directory of initdb is. >> We have more than ten years experience behind the assertion that we're >> dealing with that case in a good way. We should transfer that behavior >> over to tablespace directories rather than inventing something that >> works a shade differently. > >> Barring objections, I'll go make it work that way in HEAD and 9.0. > > Looking at initdb, there's a couple of hundred lines worth of code > involved here. Some of it is not directly sharable with the backend > because of the way it manages error cases, but at least the two > functions mkdir_p() and check_data_dir() could conceivably be put > into src/port/. The former is about 100 lines and the latter about 50. > Is sharing them worth doing, or should I just copy-and-paste into > commands/tablespace.c? If we're not sharing mkdir_p in toto, I'd be > inclined to not bother with duplicating initdb's willingness to create > parent directories --- it's not clear to me that that's very sensible > for a tablespace creation command anyway. +1 for src/port. > Another question is whether we're really hot enough about this to > back-patch the change into 9.0. Given the lack of other complaints > since October, maybe we shouldn't take any risk here. Messing around > with new modules in src/port/ would be more appetizing if it's HEAD > only. > > Thoughts? At the moment, I'm not feeling hot to back-patch this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > +1 for src/port. > ... > At the moment, I'm not feeling hot to back-patch this. Yeah, that squares with my feelings. Will go do it that way, unless other people object. regards, tom lane
On 12/10/2010 06:01 PM, Tom Lane wrote: > Robert Haas<robertmhaas@gmail.com> writes: >> +1 for src/port. >> ... >> At the moment, I'm not feeling hot to back-patch this. > Yeah, that squares with my feelings. Will go do it that way, > unless other people object. > > I think this is the sensible way to go. cheers andrew
I wrote: >> Basically, I'm thinking that given CREATE TABLESPACE LOCATION '/foo/bar' >> the creation and properties of /foo/bar/PG_9.0_201004261 ought to be >> handled *exactly* the way that the -D target directory of initdb is. One interesting point here is that initdb uses the equivalent of mkdir -p, so it will automatically try to create parent directories of whatever path you specify. Duplicating that behavior in CREATE TABLESPACE causes this diff in the regression tests: -- Will fail with bad path CREATE TABLESPACE badspace LOCATION '/no/such/location'; ! ERROR: directory "/no/such/location" does not exist -- No such tablespace CREATE TABLE bar (i int) TABLESPACE nosuchspace;ERROR: tablespace "nosuchspace" does not exist --- 65,71 ---- -- Will fail with bad path CREATE TABLESPACE badspace LOCATION '/no/such/location'; ! ERROR: could not create directory "/no": Permission denied -- No such tablespace CREATE TABLE bar (i int) TABLESPACE nosuchspace;ERROR: tablespace "nosuchspace" does not exist I'm not sure that this is a bad thing. In particular, it makes WAL replay noticeably more robust since it will do what it can to regenerate the whole path if you deleted parent directories. It will of course still fail, as here, if the server doesn't have write permissions on the last existing dir in the path. Anybody have a problem with adopting this behavior? regards, tom lane
On Fri, Dec 10, 2010 at 10:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >>> Basically, I'm thinking that given CREATE TABLESPACE LOCATION '/foo/bar' >>> the creation and properties of /foo/bar/PG_9.0_201004261 ought to be >>> handled *exactly* the way that the -D target directory of initdb is. > > One interesting point here is that initdb uses the equivalent of mkdir > -p, so it will automatically try to create parent directories of > whatever path you specify. Duplicating that behavior in CREATE > TABLESPACE causes this diff in the regression tests: > > -- Will fail with bad path > CREATE TABLESPACE badspace LOCATION '/no/such/location'; > ! ERROR: directory "/no/such/location" does not exist > -- No such tablespace > CREATE TABLE bar (i int) TABLESPACE nosuchspace; > ERROR: tablespace "nosuchspace" does not exist > --- 65,71 ---- > > -- Will fail with bad path > CREATE TABLESPACE badspace LOCATION '/no/such/location'; > ! ERROR: could not create directory "/no": Permission denied > -- No such tablespace > CREATE TABLE bar (i int) TABLESPACE nosuchspace; > ERROR: tablespace "nosuchspace" does not exist > > I'm not sure that this is a bad thing. In particular, it makes WAL > replay noticeably more robust since it will do what it can to regenerate > the whole path if you deleted parent directories. It will of course > still fail, as here, if the server doesn't have write permissions on the > last existing dir in the path. > > Anybody have a problem with adopting this behavior? Seems a bit surprising. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote: > On Fri, Dec 10, 2010 at 10:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Anybody have a problem with adopting this behavior? > > Seems a bit surprising. Yeahh.. I'm not really sure about mkdir -p type actions from a SQL command. Not entirely sure why but it doesn't feel 'right' to me. I'd rather have PG complain "that directory doesn't exist". It's not a terribly strong feeling though, more of a "that's kind of suprising", since, as you point out, PG would have to have the necessary permissions to create the directories, and I know that it probably wouldn't on systems that I set up, but I don't really like relying on the FS permissions to realize something like this was happening... At the end of the day, I think you'll get people who will see a permission denied and go "oh, I need to grant PG rights to mkdir in /, sure, why not", even though that might mean creating a directory on a small root FS which will then run out of space quickly... Just my 2c. Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Robert Haas (robertmhaas@gmail.com) wrote: >> On Fri, Dec 10, 2010 at 10:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Anybody have a problem with adopting this behavior? >> Seems a bit surprising. > Yeahh.. I'm not really sure about mkdir -p type actions from a SQL > command. Not entirely sure why but it doesn't feel 'right' to me. I'd > rather have PG complain "that directory doesn't exist". OK. Is there any value in doing mkdir -p in WAL-recovery execution of CREATE TABLESPACE but not regular execution? regards, tom lane
On Sat, Dec 11, 2010 at 10:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Stephen Frost <sfrost@snowman.net> writes: >> * Robert Haas (robertmhaas@gmail.com) wrote: >>> On Fri, Dec 10, 2010 at 10:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> Anybody have a problem with adopting this behavior? > >>> Seems a bit surprising. > >> Yeahh.. I'm not really sure about mkdir -p type actions from a SQL >> command. Not entirely sure why but it doesn't feel 'right' to me. I'd >> rather have PG complain "that directory doesn't exist". > > OK. Is there any value in doing mkdir -p in WAL-recovery execution of > CREATE TABLESPACE but not regular execution? I don't think so. If someone creates a directory that is not fsync'd, and then creates a subdirectory and puts a tablespace on it, and then crashes after this has been WAL-logged but before the directory entries have hit the disk, well, unlucky for them, but that's a vanishingly rare situation. There's no guarantee that we'd set properties on the parent directory that would match the user's expectation anyway, especially if SE-Linux or something is involved. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, Dec 11, 2010 at 10:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> OK. �Is there any value in doing mkdir -p in WAL-recovery execution of >> CREATE TABLESPACE but not regular execution? > I don't think so. If someone creates a directory that is not fsync'd, > and then creates a subdirectory and puts a tablespace on it, and then > crashes after this has been WAL-logged but before the directory > entries have hit the disk, well, unlucky for them, but that's a > vanishingly rare situation. I'm not really thinking about crash recovery, but replication slaves. Omitting to create the tablespace location directories on slaves doesn't seem far-fetched at all. Of course, it's likely that the slave server will lack permissions to create in the location directory's parent; but if it can, the outcome shouldn't be too unreasonable. > There's no guarantee that we'd set > properties on the parent directory that would match the user's > expectation anyway, especially if SE-Linux or something is involved. If SELinux doesn't like it, it's going to fail no matter what we do. I'm just suggesting that trying to create the directory path will fail in fewer cases than not trying. regards, tom lane
On Sat, Dec 11, 2010 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sat, Dec 11, 2010 at 10:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> OK. Is there any value in doing mkdir -p in WAL-recovery execution of >>> CREATE TABLESPACE but not regular execution? > >> I don't think so. If someone creates a directory that is not fsync'd, >> and then creates a subdirectory and puts a tablespace on it, and then >> crashes after this has been WAL-logged but before the directory >> entries have hit the disk, well, unlucky for them, but that's a >> vanishingly rare situation. > > I'm not really thinking about crash recovery, but replication slaves. > Omitting to create the tablespace location directories on slaves > doesn't seem far-fetched at all. Of course, it's likely that > the slave server will lack permissions to create in the location > directory's parent; but if it can, the outcome shouldn't be too > unreasonable. Creating the tablespace directory itself would be reasonable, but recursing all the way up seems dubious. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, Dec 11, 2010 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm not really thinking about crash recovery, but replication slaves. >> Omitting to create the tablespace location directories on slaves >> doesn't seem far-fetched at all. �Of course, it's likely that >> the slave server will lack permissions to create in the location >> directory's parent; but if it can, the outcome shouldn't be too >> unreasonable. > Creating the tablespace directory itself would be reasonable, but > recursing all the way up seems dubious. Well, it's *very* unlikely that the slave server would have permissions to create in the root directory or close to it. If you grant that it's reasonable to create the location directory itself, why not the parent too, if that's still in a directory that's writable? I agree that the reasonableness of the behavior drops off the more you go up, but so does the probability of having the needed permissions. I don't agree that it's a binary choice where creating exactly one directory is reasonable but exactly two isn't. regards, tom lane
On Sat, Dec 11, 2010 at 1:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sat, Dec 11, 2010 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I'm not really thinking about crash recovery, but replication slaves. >>> Omitting to create the tablespace location directories on slaves >>> doesn't seem far-fetched at all. Of course, it's likely that >>> the slave server will lack permissions to create in the location >>> directory's parent; but if it can, the outcome shouldn't be too >>> unreasonable. > >> Creating the tablespace directory itself would be reasonable, but >> recursing all the way up seems dubious. > > Well, it's *very* unlikely that the slave server would have permissions > to create in the root directory or close to it. If you grant that it's > reasonable to create the location directory itself, why not the parent > too, if that's still in a directory that's writable? I agree that the > reasonableness of the behavior drops off the more you go up, but so does > the probability of having the needed permissions. I don't agree that > it's a binary choice where creating exactly one directory is reasonable > but exactly two isn't. I'm just giving you my opinion. Take it for what it's worth. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On mån, 2010-10-18 at 15:50 -0400, Tom Lane wrote: > Yeah. We have gotten complaints in the past from people who tried to > specify a mount point as a tablespace, and it failed because of > lost+found or the mount dir being root-owned. We've told them to make > a subdirectory, but that always seemed like a workaround. With the > new layout there's no longer any strong reason to prevent this case > from working. > > Basically, I'm thinking that given CREATE TABLESPACE LOCATION > '/foo/bar' > the creation and properties of /foo/bar/PG_9.0_201004261 ought to be > handled *exactly* the way that the -D target directory of initdb is. > We have more than ten years experience behind the assertion that we're > dealing with that case in a good way. We should transfer that > behavior over to tablespace directories rather than inventing > something that works a shade differently. I'm still struggling with the above argument. In one case you are applying a behavior to the argument given to initdb, in the other case you are applying the behavior to a subdirectory of the argument given to CREATE TABLESPACE. I'm not saying the solution is necessarily wrong, but it doesn't seem that this will make things easier or more consistent. An idle thought: How about creating a version-subdirectory also in the PGDATA path. The point about mountpoint annoyance applies here just as well. And it could also make the directory juggling during in-place upgrade more normalized and robust.
Peter Eisentraut <peter_e@gmx.net> writes: > On mån, 2010-10-18 at 15:50 -0400, Tom Lane wrote: >> Basically, I'm thinking that given CREATE TABLESPACE LOCATION '/foo/bar' >> the creation and properties of /foo/bar/PG_9.0_201004261 ought to be >> handled *exactly* the way that the -D target directory of initdb is. >> We have more than ten years experience behind the assertion that we're >> dealing with that case in a good way. We should transfer that >> behavior over to tablespace directories rather than inventing >> something that works a shade differently. > I'm still struggling with the above argument. In one case you are > applying a behavior to the argument given to initdb, in the other case > you are applying the behavior to a subdirectory of the argument given to > CREATE TABLESPACE. I'm not saying the solution is necessarily wrong, > but it doesn't seem that this will make things easier or more > consistent. Well, it is only an argument by analogy, but the proposal does fix the IMO-clear misbehavior complained of way back at the start of this thread. > An idle thought: How about creating a version-subdirectory also in the > PGDATA path. The point about mountpoint annoyance applies here just as > well. And it could also make the directory juggling during in-place > upgrade more normalized and robust. I can't get excited about it. That would break every existing tool that looks into PGDATA, for a fairly marginal simplification during version upgrades. To give just one example of the pain we'd be letting ourselves in for, pg_ctl would now become extremely version-specific. You couldn't even get away with using the wrong copy of pg_ctl during a reinstall after a catversion bump during development. regards, tom lane