Обсуждение: Weird behaviour with ALTER TABLE ... SET TABLESPACE ... statement

Поиск
Список
Период
Сортировка

Weird behaviour with ALTER TABLE ... SET TABLESPACE ... statement

От
Guillaume Lelarge
Дата:
Hi,

I just found a weird behaviour with this statement. Here is a complete
log of my session with a 8.3(.4) server:

guillaume@laptop$ mkdir /home/guillaume/ts1
guillaume@laptop$ createdb db1
guillaume@laptop$ LANG=C psql db1
Welcome to psql 8.3.4, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms      \h for help with SQL commands      \? for help with psql commands      \g
orterminate with semicolon to execute query      \q to quit
 

db1=# create table t1 (id integer);
CREATE TABLE
db1=# create tablespace ts1 location '/home/guillaume/ts1';
CREATE TABLESPACE
db1=# insert into t1 values (1);
INSERT 0 1
db1=# select pg_database.oid as db_folder, relfilenode as table_filename
db1-# from pg_database, pg_class where datname='db1' and relname='t1';db_folder | table_filename
-----------+----------------    74472 |          74475
(1 row)

db1=# \! ls -l /opt/postgresql-8.3/data/base/74472/74475
-rw------- 1 guillaume guillaume 8192 Oct  6 10:59
/opt/postgresql-8.3/data/base/74472/74475

Till now, everything seems good. I have one 8k file for my table t1.

db1=# alter table t1 set tablespace ts1;
ALTER TABLE
/opt/postgresql-8.3/data/base/74472/74475
db1=# \! ls -l /home/guillaume/ts1/74472/74475
-rw------- 1 guillaume guillaume 8192 Oct  6 11:00
/home/guillaume/ts1/74472/74475

My table moved to my own tablespace.

db1=# \! ls -l /opt/postgresql-8.3/data/base/74472/74475
-rw------- 1 guillaume guillaume 0 Oct  6 11:00

This seems weird. I expected to have no file 74475 in the pg_default
tablespace after the ALTER TABLE.

Of course, now, I can't get my table back on the previous tablespace.

db1=# alter table t1 set tablespace pg_default;
ERROR:  could not create relation 1663/74472/74475: File exists
db1=# alter table t1 set tablespace pg_default;
ERROR:  could not create relation 1663/74472/74475: File exists

I finally discovered that a CHECKPOINT resolves my issue.

db1=# checkpoint;
CHECKPOINT

And I can move my table back to the previous tablespace.

db1=# alter table t1 set tablespace pg_default;
ALTER TABLE
db1=# \! ls -l /opt/postgresql-8.3/data/base/74472/74475
-rw------- 1 guillaume guillaume 8192 Oct  6 11:01
/opt/postgresql-8.3/data/base/74472/74475
db1=# \! ls -l /home/guillaume/ts1/74472/74475
-rw------- 1 guillaume guillaume 0 Oct  6 11:01
/home/guillaume/ts1/74472/74475

This behaviour happens on the REL8_3_STABLE and HEAD branches. In
REL8_2_STABLE, it works as I expected it to work :

guillaume@laptop$ mkdir /home/guillaume/ts2

guillaume@laptop$ createdb db2

CREATE DATABASE
guillaume@laptop$ psql db2
Bienvenue dans psql 8.2.10, l'interface interactive de PostgreSQL.

Saisissez:   \copyright pour les termes de distribution   \h pour l'aide-mémoire des commandes SQL   \? pour
l'aide-mémoiredes commandes psql   \g ou point-virgule en fin d'instruction pour exécuter la requête   \q pour quitter
 

db2=# create table t2 (id integer);
CREATE TABLE
db2=# create tablespace ts2 location '/home/guillaume/ts2';
CREATE TABLESPACE
db2=# insert into t2 values (1);
INSERT 0 1
db2=# select pg_database.oid as db_folder, relfilenode as table_filename
from pg_database, pg_class where datname='db2' and relname='t2';db_folder | table_filename
-----------+----------------    47944 |          47945
(1 ligne)

db2=# \! ls -l /opt/postgresql-8.2/data/base/47944/47945
-rw------- 1 guillaume guillaume 8192 2008-10-06 11:12
/opt/postgresql-8.2/data/base/47944/47945
db2=# alter table t2 set tablespace ts2;
ALTER TABLE
db2=# \! ls -l /opt/postgresql-8.2/data/base/47944/47945
ls: cannot access /opt/postgresql-8.2/data/base/47944/47945: No such
file or directory
db2=# \! ls -l /home/guillaume/ts2/47944/47945
-rw------- 1 guillaume guillaume 8192 2008-10-06 11:13
/home/guillaume/ts2/47944/47945
db2=# alter table t2 set tablespace pg_default;
ALTER TABLE
db2=# \! ls -l /home/guillaume/ts2/47944/47945
ls: cannot access /home/guillaume/ts2/47944/47945: No such file or directory
db2=# \! ls -l /opt/postgresql-8.2/data/base/47944/47945
-rw------- 1 guillaume guillaume 8192 2008-10-06 11:13
/opt/postgresql-8.2/data/base/47944/47945

I don't need a checkpoint to be able to move my table back.

It doesn't seem a big issue because checkpoints are issued frequently
but it deserves to get fixed. I looked a bit at the source code. The old
file gets in a queue of to-be-removed files (see smgrscheduleunlink()
function in storage/smgr/smgr.c). But I failed to see where it really
gets deleted. I would welcome any pointer.

Regards.


-- 
Guillaume.http://www.postgresqlfr.orghttp://dalibo.com


Re: Weird behaviour with ALTER TABLE ... SET TABLESPACE ... statement

От
Heikki Linnakangas
Дата:
Guillaume Lelarge wrote:
> db1=# alter table t1 set tablespace ts1;
> ALTER TABLE
> /opt/postgresql-8.3/data/base/74472/74475
> db1=# \! ls -l /home/guillaume/ts1/74472/74475
> -rw------- 1 guillaume guillaume 8192 Oct  6 11:00
> /home/guillaume/ts1/74472/74475
> 
> My table moved to my own tablespace.
> 
> db1=# \! ls -l /opt/postgresql-8.3/data/base/74472/74475
> -rw------- 1 guillaume guillaume 0 Oct  6 11:00
> 
> This seems weird. I expected to have no file 74475 in the pg_default
> tablespace after the ALTER TABLE.
> 
> Of course, now, I can't get my table back on the previous tablespace.
> 
> db1=# alter table t1 set tablespace pg_default;
> ERROR:  could not create relation 1663/74472/74475: File exists
> db1=# alter table t1 set tablespace pg_default;
> ERROR:  could not create relation 1663/74472/74475: File exists
> 
> I finally discovered that a CHECKPOINT resolves my issue.

Hmm. We force a checkpoint in dropdb() for similar reasons.

> It doesn't seem a big issue because checkpoints are issued frequently
> but it deserves to get fixed. I looked a bit at the source code. The old
> file gets in a queue of to-be-removed files (see smgrscheduleunlink()
> function in storage/smgr/smgr.c). But I failed to see where it really
> gets deleted. I would welcome any pointer.

In mdpostckpt().

The trivial fix is to just force a checkpoint in ALTER TABLE SET 
TABLESPACE. Can we do better than that? Perhaps only force a checkpoint 
when we find that the file already exists.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Weird behaviour with ALTER TABLE ... SET TABLESPACE ... statement

От
Tom Lane
Дата:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> The trivial fix is to just force a checkpoint in ALTER TABLE SET 
> TABLESPACE. Can we do better than that? Perhaps only force a checkpoint 
> when we find that the file already exists.

If ALTER TABLE SET TABLESPACE is assuming that it can always use the
same relfilenode number in the new space as in the old, it's just plain
broken.  We need to fix that assumption.
        regards, tom lane


Re: Weird behaviour with ALTER TABLE ... SET TABLESPACE ... statement

От
Guillaume Lelarge
Дата:
Tom Lane a écrit :
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> The trivial fix is to just force a checkpoint in ALTER TABLE SET 
>> TABLESPACE. Can we do better than that? Perhaps only force a checkpoint 
>> when we find that the file already exists.
> 
> If ALTER TABLE SET TABLESPACE is assuming that it can always use the
> same relfilenode number in the new space as in the old, it's just plain
> broken.  We need to fix that assumption.
> 

Do you mean the backend should change the relfilenode number whenever we
use the ALTER TABLE SET TABLESPACE ? I can't think of a way to get twice
the same relfilenode.

Perhaps a better question would be : how the next relfilenode is computed?


-- 
Guillaume.http://www.postgresqlfr.orghttp://dalibo.com


Re: Weird behaviour with ALTER TABLE ... SET TABLESPACE ... statement

От
Heikki Linnakangas
Дата:
Guillaume Lelarge wrote:
> Tom Lane a écrit :
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>> The trivial fix is to just force a checkpoint in ALTER TABLE SET 
>>> TABLESPACE. Can we do better than that? Perhaps only force a checkpoint 
>>> when we find that the file already exists.
>> If ALTER TABLE SET TABLESPACE is assuming that it can always use the
>> same relfilenode number in the new space as in the old, it's just plain
>> broken.  We need to fix that assumption.
> 
> Do you mean the backend should change the relfilenode number whenever we
> use the ALTER TABLE SET TABLESPACE ? I can't think of a way to get twice
> the same relfilenode.
> 
> Perhaps a better question would be : how the next relfilenode is computed?

1. Get the next OID from the counter.
2. Check if a file with that name exists. If it does, goto 1.

Yeah, seems like we need to allocate a new relfilenode in the new 
tablespace.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Weird behaviour with ALTER TABLE ... SET TABLESPACE ... statement

От
Tom Lane
Дата:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Yeah, seems like we need to allocate a new relfilenode in the new 
> tablespace.

I looked into tablecmds.c and verified that ATExecSetTableSpace doesn't
worry about selecting a new relfilenode.  I'm also noticing a number of
permissions-type checks that seem like they'd better be done in
ATPrepSetTableSpace, because we don't go through ATExecSetTableSpace
if the table requires rewriting for other reasons.  All in all this
code seems to need more careful review than it's gotten so far.
You want to do it?
        regards, tom lane


Re: Weird behaviour with ALTER TABLE ... SET TABLESPACE ... statement

От
Heikki Linnakangas
Дата:
Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Yeah, seems like we need to allocate a new relfilenode in the new 
>> tablespace.
> 
> I looked into tablecmds.c and verified that ATExecSetTableSpace doesn't
> worry about selecting a new relfilenode.  I'm also noticing a number of
> permissions-type checks that seem like they'd better be done in
> ATPrepSetTableSpace, because we don't go through ATExecSetTableSpace
> if the table requires rewriting for other reasons.  All in all this
> code seems to need more careful review than it's gotten so far.
> You want to do it?

I'll give it a shot.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Weird behaviour with ALTER TABLE ... SET TABLESPACE ... statement

От
Heikki Linnakangas
Дата:
Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Yeah, seems like we need to allocate a new relfilenode in the new 
>> tablespace.
> 
> I looked into tablecmds.c and verified that ATExecSetTableSpace doesn't
> worry about selecting a new relfilenode.  I'm also noticing a number of
> permissions-type checks that seem like they'd better be done in
> ATPrepSetTableSpace, because we don't go through ATExecSetTableSpace
> if the table requires rewriting for other reasons.

The same tests are performed in the rewriting code path in 
ATRewriteTables() and in heap_create_with_catalog().

I fixed the relfilenode allocation in 8.1-HEAD. Doesn't seem worth 
fixing in 8.0, because GetNewRelFileNode() didn't exist before 8.1, so 
we couldn't check for collisions anyway.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Weird behaviour with ALTER TABLE ... SET TABLESPACE ... statement

От
Guillaume Lelarge
Дата:
Heikki Linnakangas a écrit :
> Tom Lane wrote:
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>> Yeah, seems like we need to allocate a new relfilenode in the new
>>> tablespace.
>>
>> I looked into tablecmds.c and verified that ATExecSetTableSpace doesn't
>> worry about selecting a new relfilenode.  I'm also noticing a number of
>> permissions-type checks that seem like they'd better be done in
>> ATPrepSetTableSpace, because we don't go through ATExecSetTableSpace
>> if the table requires rewriting for other reasons.
> 
> The same tests are performed in the rewriting code path in
> ATRewriteTables() and in heap_create_with_catalog().
> 
> I fixed the relfilenode allocation in 8.1-HEAD. Doesn't seem worth
> fixing in 8.0, because GetNewRelFileNode() didn't exist before 8.1, so
> we couldn't check for collisions anyway.
> 

Thanks. It works great!


-- 
Guillaume.http://www.postgresqlfr.orghttp://dalibo.com