Обсуждение: problem/bug in drop tablespace?
While researching a problem reported on the -general list by a user who lost a disk containing his index tablespace, I raninto something, but I'm not sure is a serious bug or just an inconsistency in how \d shows tables. <br /><br />Hereare the steps I took.<br /><br />1. Create a new database 'MYDB' and connect to it.<br />2. Create a new tablespace'MYTBLSP'<br />3. Create a table 'MYTABLE' and populate it.<br />4. Create an index 'MYIND' on that table, withthe index in the new tablespace, MYTBLSP.<br /><br />Now, exit psql and delete the files in the tablespace directorycreated in step 2, simulating the problem the user had.<br /><br />Trying to execute an SQL command on the tableMYTABLE will, as expected, generate an error.<br /><br />Now, drop tablespace MYTBLSP. This will produce an error,but it will delete the tablespace according to \db.<br /><br />Recreate tablespace MYTBLSP.<br /><br />Regenerate theindex on MYTABLE. Queries will work on this table again, as expected.<br /><br />Now, here's the problem I ran into:<br/><br />The index will be rebuilt in tablespace MYTBLSP, but \d on table MYTABLE will not show the index as beingin that tablespace.<br />--<br />Mike Nolan<br /><br /><br /><br />
<br />The last portion of my original post got edited out by mistake.<br /><br />The tests I ran were on version 9.1.3, runningFedora 14, kernel 2.6.35.14-106.fc14-i686.<br /><br />It seems to me that DROP TABLESPACE should check to see if thereare references in the system catalog to the tablespace before dropping it, not just that the tablespace itself is empty. That way it would have thrown an error when I tried to drop the tablespace.<br /><br />A somewhat separate issue iswhat to do when a tablespace is inaccessible, such as due to a disk failure. The thread on -general that prompted my testswas a relatively easy one to suggest how to repair, because the lost tablespace only had indexes in it. But that'snot exactly a -hackers issue, more of a question of better backup protocols.<br /> --<br />Mike Nolan<br />
Michael Nolan wrote: > While researching a problem reported on the -general list by a user who lost a disk containing his > index tablespace, I ran into something, but I'm not sure is a serious bug or just an inconsistency in > how \d shows tables. > > Here are the steps I took. > > 1. Create a new database 'MYDB' and connect to it. > 2. Create a new tablespace 'MYTBLSP' > 3. Create a table 'MYTABLE' and populate it. > 4. Create an index 'MYIND' on that table, with the index in the new tablespace, MYTBLSP. > > Now, exit psql and delete the files in the tablespace directory created in step 2, simulating the > problem the user had. > > Trying to execute an SQL command on the table MYTABLE will, as expected, generate an error. > > Now, drop tablespace MYTBLSP. This will produce an error, but it will delete the tablespace according > to \db. > > Recreate tablespace MYTBLSP. > > Regenerate the index on MYTABLE. Queries will work on this table again, as expected. > > Now, here's the problem I ran into: > > The index will be rebuilt in tablespace MYTBLSP, but \d on table MYTABLE will not show the index as > being in that tablespace. I cannot reproduce this on 9.1.3: test=# CREATE TABLESPACE mytbsp LOCATION '/home/laurenz/x'; CREATE TABLESPACE test=# CREATE TABLE mytable(id integer PRIMARY KEY USING INDEX TABLESPACE mytbsp, val text); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "mytable_pkey" for table "mytable" CREATE TABLE test=# INSERT INTO mytable VALUES (1, 'eins'), (2, 'zwei'); INSERT 0 2 test=# \d mytable Table "laurenz.mytable"Column | Type | Modifiers --------+---------+-----------id | integer | not nullval | text | Indexes: "mytable_pkey" PRIMARY KEY, btree (id), tablespace "mytbsp" $ rm -rf /home/laurenz/x/PG_9.1_201105231/* test=# SELECT * FROM mytable; ERROR: could not open file "pg_tblspc/46752/PG_9.1_201105231/16420/46759": No such file or directory Ok, that's expected. test=# DROP TABLESPACE mytbsp; DROP TABLESPACE No error. test=# CREATE TABLESPACE mytbsp LOCATION '/home/laurenz/x'; CREATE TABLESPACE test=# REINDEX INDEX mytable_pkey; ERROR: could not create directory "pg_tblspc/46752/PG_9.1_201105231/16420": No such file or directory Sure, the tablespace OID has changed. test=# ALTER TABLE mytable DROP CONSTRAINT mytable_pkey; ALTER TABLE test=# ALTER TABLE mytable ADD PRIMARY KEY (id) USING INDEX TABLESPACE mytbsp; NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "mytable_pkey" for table "mytable" ALTER TABLE test=# \d mytable Table "laurenz.mytable"Column | Type | Modifiers --------+---------+-----------id | integer | not nullval | text | Indexes: "mytable_pkey" PRIMARY KEY, btree (id), tablespace "mytbsp" Looks ok. Yours, Laurenz Albe
On 5/9/12, Albe Laurenz <laurenz.albe@wien.gv.at> wrote: > I cannot reproduce this on 9.1.3: Odd, I've tried it another two times, with similar results to my initial post. Here's what I get starting with the point where I deleted the files in the tablespace: mytest=# select * from mytable; select * from mytable; ERROR: could not open file "pg_tblspc/289477766/PG_9.1_201105231/289477763/289477785": No such file or directory mytest=# \d mytableTable "public.mytable" Column Type Modifiers ------ ------- --------- id integer not null val text Indexes: "mytable_pkey" PRIMARY KEY, btree (id), tablespace "mytblspc" mytest=# drop tablespace mytblspc; drop tablespace mytblspc; WARNING: could not open directory "pg_tblspc/289477766/PG_9.1_201105231": No such file or directory DROP TABLESPACE Time: 16.460 ms mytest=# \d mytableTable "public.mytable" Column Type Modifiers ------ ------- --------- id integer not null val text Indexes: "mytable_pkey" PRIMARY KEY, btree (id) mytest=# create tablespace mytblspc location '/home/postgres/mytb'; create tablespace mytblspc location '/home/postgres/mytb'; CREATE TABLESPACE Time: 42.396 ms mytest=# \d mytableTable "public.mytable" Column Type Modifiers ------ ------- --------- id integer not null val text Indexes: "mytable_pkey" PRIMARY KEY, btree (id) mytest=# reindex table mytable; reindex table mytable; REINDEX Time: 112.981 ms mytest=# \d mytableTable "public.mytable" Column Type Modifiers ------ ------- --------- id integer not null val text Indexes: "mytable_pkey" PRIMARY KEY, btree (id) Here's what's in the mytb directory now: [postgres@romaine PG_9.1_201105231]$ ls -lR : total 4 drwx------. 2 postgres postgres 4096 May 9 13:22 289477763 ./289477763: total 16 -rw-------. 1 postgres postgres 16384 May 9 13:22 289477790 It appears that the index has been rebuilt in the mytblspc tablespace, though \d mytable does not show that. I get the same results whether I rebuild the specific index as you did or reindex the table, as I did. I'm running on 9.1.3 built from the source code, not a distribution. -- Mike Nolan
I see one potential difference between your results and mine. When I rebuild the tablespace, I wind up with the same filename/OID as before, I'm not sure you do. -- Mike Nolan
Michael Nolan wrote: > I see one potential difference between your results and mine. > > When I rebuild the tablespace, I wind up with the same filename/OID as > before, I'm not sure you do. Right. That's strange. Usually OIDs get incremented, so you shouldn't end up with the same OID for the new tablespace. Can you provide a complete testcase? Yours, Laurenz Albe
On 5/11/12, Albe Laurenz <laurenz.albe@wien.gv.at> wrote: > Michael Nolan wrote: >> I see one potential difference between your results and mine. >> >> When I rebuild the tablespace, I wind up with the same filename/OID as >> before, I'm not sure you do. > > Right. That's strange. > Usually OIDs get incremented, so you shouldn't end up with the same > OID for the new tablespace. > > Can you provide a complete testcase? I thought I had, until you were unable to reproduce it. :-) -- Mike Nolan
On 5/11/12, Michael Nolan <htfoot@gmail.com> wrote: > On 5/11/12, Albe Laurenz <laurenz.albe@wien.gv.at> wrote: >> Michael Nolan wrote: >>> I see one potential difference between your results and mine. >>> >>> When I rebuild the tablespace, I wind up with the same filename/OID as >>> before, I'm not sure you do. >> >> Right. That's strange. >> Usually OIDs get incremented, so you shouldn't end up with the same >> OID for the new tablespace. >> >> Can you provide a complete testcase? > > I thought I had, until you were unable to reproduce it. :-) > -- > Mike Nolan > My plan at this point is to wait until beta 1 of 9.2 is out, then see if I can reproduce the problem there. -- Mike Nolan
"Albe Laurenz" <laurenz.albe@wien.gv.at> writes: > Michael Nolan wrote: >> I see one potential difference between your results and mine. >> When I rebuild the tablespace, I wind up with the same filename/OID as >> before, I'm not sure you do. > Right. That's strange. > Usually OIDs get incremented, so you shouldn't end up with the same > OID for the new tablespace. I believe I see what's happening here, and the difference is that Michael deleted the tablespace's directory while Albe only deleted the files in it. The former case causes destroy_tablespace_directories to exit early, without throwing a hard error, and without having removed the symlink for the tablespace OID in $PGDATA/pg_tblspc/. This means that after re-creating a new tablespace in the same directory location, that old symlink works again, even though it no longer corresponds to any OID in pg_tablespace. Thus, Michael doesn't see an error in his REINDEX; physical access to the index still works even though the index's reltablespace is no longer really valid. The reported symptom of \d not showing the tablespace is because the code in psql's describe.c will silently ignore a reltablespace entry that does not match any OID in pg_tablespace. We could prevent this scenario if we changed destroy_tablespace_directories so that for any non-fatal-error situation, it continues to march on and try to destroy the remaining infrastructure, particularly the symlink. I'm not sure that's really a good idea, but it definitely seems to be a bad idea to leave the symlink in place when we're removing the pg_tablespace row. Alternatively we could make more of those cases be errors rather than warnings, so that the pg_tablespace row removal would be rolled back. The comment in destroy_tablespace_directories indicates that what we're trying to support by not throwing an error for missing directory is cleaning up a dead pg_tablespace row, which suggests that removing the symlink too would be reasonable. A larger question is whether we should start making pg_shdepend entries for table/index usage of non-default tablespaces, so that you couldn't DROP a tablespace that the catalogs think still has tables/indexes in it. I'm not sure that that'd be particularly helpful though. It certainly wouldn't do anything to protect against the scenario discussed here of an external agency zapping all the files. regards, tom lane
On Fri, May 11, 2012 at 7:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > A larger question is whether we should start making pg_shdepend entries > for table/index usage of non-default tablespaces, so that you couldn't > DROP a tablespace that the catalogs think still has tables/indexes in > it. I'm astonished we don't do that already. Seems inconsistent with other SQL object types - most obviously, schemas - and a potentially giant foot-gun. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Robert Haas's message of vie may 11 20:28:28 -0400 2012: > On Fri, May 11, 2012 at 7:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > A larger question is whether we should start making pg_shdepend entries > > for table/index usage of non-default tablespaces, so that you couldn't > > DROP a tablespace that the catalogs think still has tables/indexes in > > it. > > I'm astonished we don't do that already. Seems inconsistent with > other SQL object types - most obviously, schemas - and a potentially > giant foot-gun. The original patch did contain tablespace tracking (though I don't remember considering whether they were default or not), but it got ripped out because during the subsequent discussion we considered that it wasn't necessary to keep track of it -- supposedly, whenever you were going to delete a tablespace, the existing files in the directory would be sufficient evidence to stop the deletion. Evidently I failed to consider the case at hand. I don't think there's any particular reason we can't put it back. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Excerpts from Robert Haas's message of vie may 11 20:28:28 -0400 2012: >> I'm astonished we don't do that already. Seems inconsistent with >> other SQL object types - most obviously, schemas - and a potentially >> giant foot-gun. > The original patch did contain tablespace tracking (though I don't > remember considering whether they were default or not), but it got > ripped out because during the subsequent discussion we considered that > it wasn't necessary to keep track of it -- supposedly, whenever you were > going to delete a tablespace, the existing files in the directory would > be sufficient evidence to stop the deletion. Evidently I failed to > consider the case at hand. Well, the question to me is exactly how much good it will do to stop deletion of the pg_tablespace entry, if the underlying files are gone. I'm having a hard time getting excited about expending cycles on that. regards, tom lane
On Fri, May 11, 2012 at 10:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
There could be multiple reasons why the underlying files are not there, such as a filesystem that isn't currently mounted for some reason.
It seems prudent to throw an error on drop tablespace if there are references to that tablespace in the catalog, or perhaps require a 'force' clause to override any errors, but it probably isn't something most DBAs would run into very often.
Thanks for figuring it out, Tom.
--
MIke Nolan
Well, the question to me is exactly how much good it will do to stop
deletion of the pg_tablespace entry, if the underlying files are gone.
I'm having a hard time getting excited about expending cycles on that.
There could be multiple reasons why the underlying files are not there, such as a filesystem that isn't currently mounted for some reason.
It seems prudent to throw an error on drop tablespace if there are references to that tablespace in the catalog, or perhaps require a 'force' clause to override any errors, but it probably isn't something most DBAs would run into very often.
Thanks for figuring it out, Tom.
--
MIke Nolan