Обсуждение: Unlogged tables cannot be truncated twice

Поиск
Список
Период
Сортировка

Unlogged tables cannot be truncated twice

От
Greg Sabino Mullane
Дата:
Wow, this one took a bit to narrow down. Here's the failing case:

# create unlogged table foo (a text);
CREATE TABLE
# begin;
BEGIN
#* truncate table foo;
TRUNCATE TABLE
#* truncate table foo;
ERROR:  could not create file "base/19131/19183_init": File exists

Very reproducible. The column types matter: if the only column=20
is an INT, for example, the problem does not occur.

--=20
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8

Re: Unlogged tables cannot be truncated twice

От
Alvaro Herrera
Дата:
Excerpts from Greg Sabino Mullane's message of lun may 30 12:00:43 -0400 2011:
> Wow, this one took a bit to narrow down. Here's the failing case:
>
> # create unlogged table foo (a text);
> CREATE TABLE
> # begin;
> BEGIN
> #* truncate table foo;
> TRUNCATE TABLE
> #* truncate table foo;
> ERROR:  could not create file "base/19131/19183_init": File exists
>
> Very reproducible. The column types matter: if the only column
> is an INT, for example, the problem does not occur.

So 19183 is the toast table OID?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Unlogged tables cannot be truncated twice

От
Andres Freund
Дата:
On Monday, May 30, 2011 11:18:20 PM Alvaro Herrera wrote:
> Excerpts from Greg Sabino Mullane's message of lun may 30 12:00:43 -0400
2011:
> > Wow, this one took a bit to narrow down. Here's the failing case:
> >
> > # create unlogged table foo (a text);
> > CREATE TABLE
> > # begin;
> > BEGIN
> > #* truncate table foo;
> > TRUNCATE TABLE
> > #* truncate table foo;
> > ERROR:  could not create file "base/19131/19183_init": File exists
> >
> > Very reproducible. The column types matter: if the only column
> > is an INT, for example, the problem does not occur.
>
> So 19183 is the toast table OID?
Nope. Its any index.

You can provoke it with:
begin;create unlogged table foo (a int primary key);truncate foo;rollback;
or
begin;create unlogged table foo (a text);truncate foo;rollback;


The problem is this tidbit from tablecmds.c's ExecuteTruncate:

        /*
         * Normally, we need a transaction-safe truncation here.  However, if
         * the table was either created in the current (sub)transaction or has
         * a new relfilenode in the current (sub)transaction, then we can just
         * truncate it in-place, because a rollback would cause the whole
         * table or the current physical file to be thrown away anyway.
         */
        if (rel->rd_createSubid == mySubid ||
            rel->rd_newRelfilenodeSubid == mySubid)
        {
            /* Immediate, non-rollbackable truncation is OK */
            heap_truncate_one_rel(rel);
        }

in combination with index.c's index_build:


    /*
     * If this is an unlogged index, we need to write out an init fork for it.
     */
    if (heapRelation->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED)
    {
        RegProcedure ambuildempty = indexRelation->rd_am->ambuildempty;

        RelationOpenSmgr(indexRelation);
        smgrcreate(indexRelation->rd_smgr, INIT_FORKNUM, false);
        OidFunctionCall1(ambuildempty, PointerGetDatum(indexRelation));
    }

Andres

Re: Unlogged tables cannot be truncated twice

От
Cédric Villemain
Дата:
2011/5/31 Andres Freund <andres@anarazel.de>:
> On Monday, May 30, 2011 11:18:20 PM Alvaro Herrera wrote:
>> Excerpts from Greg Sabino Mullane's message of lun may 30 12:00:43 -0400
> 2011:
>> > Wow, this one took a bit to narrow down. Here's the failing case:
>> >
>> > # create unlogged table foo (a text);
>> > CREATE TABLE
>> > # begin;
>> > BEGIN
>> > #* truncate table foo;
>> > TRUNCATE TABLE
>> > #* truncate table foo;
>> > ERROR:  could not create file "base/19131/19183_init": File exists
>> >
>> > Very reproducible. The column types matter: if the only column
>> > is an INT, for example, the problem does not occur.
>>
>> So 19183 is the toast table OID?
> Nope. Its any index.
>
> You can provoke it with:
> begin;create unlogged table foo (a int primary key);truncate foo;rollback;
> or
> begin;create unlogged table foo (a text);truncate foo;rollback;
>
>
> The problem is this tidbit from tablecmds.c's ExecuteTruncate:
>
>                /*
>                 * Normally, we need a transaction-safe truncation here.  However, if
>                 * the table was either created in the current (sub)transaction or has
>                 * a new relfilenode in the current (sub)transaction, then we can just
>                 * truncate it in-place, because a rollback would cause the whole
>                 * table or the current physical file to be thrown away anyway.
>                 */
>                if (rel->rd_createSubid == mySubid ||
>                        rel->rd_newRelfilenodeSubid == mySubid)
>                {
>                        /* Immediate, non-rollbackable truncation is OK */
>                        heap_truncate_one_rel(rel);
>                }
>
> in combination with index.c's index_build:
>
>
>        /*
>         * If this is an unlogged index, we need to write out an init fork for it.
>         */
>        if (heapRelation->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED)
>        {
>                RegProcedure ambuildempty = indexRelation->rd_am->ambuildempty;
>
>                RelationOpenSmgr(indexRelation);
>                smgrcreate(indexRelation->rd_smgr, INIT_FORKNUM, false);
>                OidFunctionCall1(ambuildempty, PointerGetDatum(indexRelation));
>        }
>

I remove my own explanations as we conclude on the same thing.
Attached is the fix by adding a (!reindex)  in the index.c if().

It looks enough from early testing.

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

Вложения

Re: Unlogged tables cannot be truncated twice

От
Andres Freund
Дата:
On Tuesday, May 31, 2011 01:56:05 AM C=E9dric Villemain wrote:
> I remove my own explanations as we conclude on the same thing.
> Attached is the fix by adding a (!reindex)  in the index.c if().
Thats imo wrong because it will break a plain REINDEX?

I think one possible correct fix would be the attached:


Andres

Re: Unlogged tables cannot be truncated twice

От
Andres Freund
Дата:
On Tuesday, May 31, 2011 02:14:00 AM Andres Freund wrote:
> On Tuesday, May 31, 2011 01:56:05 AM C=E9dric Villemain wrote:
> > I remove my own explanations as we conclude on the same thing.
> > Attached is the fix by adding a (!reindex)  in the index.c if().
>=20
> Thats imo wrong because it will break a plain REINDEX?
>=20
> I think one possible correct fix would be the attached:
My version was wrong as well because it  did not observe RelationTruncate's=
=20
nblocks argument. That function is used to "shorten" the relation in vacuum=
.=20
So dropping the init fork there is not a good idea.

So I think it is the simpler version of simply checking the existance of th=
e=20
fork before creating is ok.

Andres

Re: Unlogged tables cannot be truncated twice

От
Andres Freund
Дата:
On Tuesday, May 31, 2011 02:35:58 AM Andres Freund wrote:
> On Tuesday, May 31, 2011 02:14:00 AM Andres Freund wrote:
> > On Tuesday, May 31, 2011 01:56:05 AM C=E9dric Villemain wrote:
> > > I remove my own explanations as we conclude on the same thing.
> > > Attached is the fix by adding a (!reindex)  in the index.c if().
> >=20
> > Thats imo wrong because it will break a plain REINDEX?
>=20
> > I think one possible correct fix would be the attached:
> My version was wrong as well because it  did not observe RelationTruncate=
's
> nblocks argument. That function is used to "shorten" the relation in
> vacuum. So dropping the init fork there is not a good idea.
>=20
> So I think it is the simpler version of simply checking the existance of
> the fork before creating is ok.
Gna. gnargl. Coffe. Bed. ;)

There was an accidental hunk I added while removing some whitespace. That=
=20
would not have been good on a real commit.

Argh.

Re: Unlogged tables cannot be truncated twice

От
Alvaro Herrera
Дата:
Excerpts from Andres Freund's message of lun may 30 20:47:49 -0400 2011:
> On Tuesday, May 31, 2011 02:35:58 AM Andres Freund wrote:
> > On Tuesday, May 31, 2011 02:14:00 AM Andres Freund wrote:
> > > On Tuesday, May 31, 2011 01:56:05 AM Cédric Villemain wrote:
> > > > I remove my own explanations as we conclude on the same thing.
> > > > Attached is the fix by adding a (!reindex)  in the index.c if().
> > >
> > > Thats imo wrong because it will break a plain REINDEX?
> >
> > > I think one possible correct fix would be the attached:
> > My version was wrong as well because it  did not observe RelationTruncate's
> > nblocks argument. That function is used to "shorten" the relation in
> > vacuum. So dropping the init fork there is not a good idea.
> >
> > So I think it is the simpler version of simply checking the existance of
> > the fork before creating is ok.

Hmm, I wonder if what we should be doing here is observe isreindex in
index_build to avoid creating the init fork.  Doing smgr access at that
level seems wrong.

> Gna. gnargl. Coffe. Bed. ;)
>
> There was an accidental hunk I added while removing some whitespace. That
> would not have been good on a real commit.
>
> Argh.

Hah :-)

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Unlogged tables cannot be truncated twice

От
Andres Freund
Дата:
On Tuesday, May 31, 2011 03:27:22 Alvaro Herrera wrote:
> Excerpts from Andres Freund's message of lun may 30 20:47:49 -0400 2011:
> > On Tuesday, May 31, 2011 02:35:58 AM Andres Freund wrote:
> > > On Tuesday, May 31, 2011 02:14:00 AM Andres Freund wrote:
> > > > On Tuesday, May 31, 2011 01:56:05 AM C=C3=A9dric Villemain wrote:
> > > > > I remove my own explanations as we conclude on the same thing.
> > > > > Attached is the fix by adding a (!reindex)  in the index.c if().
> > > >=20
> > > > Thats imo wrong because it will break a plain REINDEX?
> > >=20
> > > > I think one possible correct fix would be the attached:
> > > My version was wrong as well because it  did not observe
> > > RelationTruncate's nblocks argument. That function is used to
> > > "shorten" the relation in vacuum. So dropping the init fork there is
> > > not a good idea.
> > >=20
> > > So I think it is the simpler version of simply checking the existance
> > > of the fork before creating is ok.
>=20
> Hmm, I wonder if what we should be doing here is observe isreindex in
> index_build to avoid creating the init fork.  Doing smgr access at that
> level seems wrong.
isreindex doesn't contain the necessary informormation as its set doing a=
=20
REINDEX even though a new relfilenode is created and thus the fork needs to=
 be=20
created.

It doesn't seem terribly clean do do the !smgrexists(), I aggree with you=
=20
there. On the other hand we are calling smgrcreate() two lines down anyway.=
 I=20
personally don't realy like the placement of that piece of code very much.=
=20
Doing it in index_build seems to be the wrong place. I don't think there=20
really is a good place for it right now.

Andres

Re: Unlogged tables cannot be truncated twice

От
Cédric Villemain
Дата:
2011/5/31 Andres Freund <andres@anarazel.de>:
> On Tuesday, May 31, 2011 03:27:22 Alvaro Herrera wrote:
>> Excerpts from Andres Freund's message of lun may 30 20:47:49 -0400 2011:
>> > On Tuesday, May 31, 2011 02:35:58 AM Andres Freund wrote:
>> > > On Tuesday, May 31, 2011 02:14:00 AM Andres Freund wrote:
>> > > > On Tuesday, May 31, 2011 01:56:05 AM Cédric Villemain wrote:
>> > > > > I remove my own explanations as we conclude on the same thing.
>> > > > > Attached is the fix by adding a (!reindex)  in the index.c if().
>> > > >
>> > > > Thats imo wrong because it will break a plain REINDEX?
>> > >
>> > > > I think one possible correct fix would be the attached:
>> > > My version was wrong as well because it  did not observe
>> > > RelationTruncate's nblocks argument. That function is used to
>> > > "shorten" the relation in vacuum. So dropping the init fork there is
>> > > not a good idea.
>> > >
>> > > So I think it is the simpler version of simply checking the existance
>> > > of the fork before creating is ok.
>>
>> Hmm, I wonder if what we should be doing here is observe isreindex in
>> index_build to avoid creating the init fork.  Doing smgr access at that
>> level seems wrong.
> isreindex doesn't contain the necessary informormation as its set doing a
> REINDEX even though a new relfilenode is created and thus the fork needs to be
> created.
>
> It doesn't seem terribly clean do do the !smgrexists(), I aggree with you
> there. On the other hand we are calling smgrcreate() two lines down anyway. I
> personally don't realy like the placement of that piece of code very much.
> Doing it in index_build seems to be the wrong place. I don't think there
> really is a good place for it right now.

Robert, I wonder if all the logic with INIT_FORK is only to simulate
truncate on server restart ?
I failled to understand why not simply use truncate around the server
start (when it is in recovery).

The current large amount of code changed just to be able to flush
unlogged files on server start looks crazy, if the only point is a
possible  performance impact on server restart, can't we find
something more elegant ?

While here I also wonder why GiST unlogged are not supported ?!

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


Re: Unlogged tables cannot be truncated twice

От
Robert Haas
Дата:
On Tue, May 31, 2011 at 7:04 AM, C=E9dric Villemain
<cedric.villemain.debian@gmail.com> wrote:
> Robert, I wonder if all the logic with INIT_FORK is only to simulate
> truncate on server restart ?

That is correct.

> I failled to understand why not simply use truncate around the server
> start (when it is in recovery).

We need to do the equivalent of a truncate but without relying in any
way on the system catalogs.  The startup process is not bound to a
database, and starting a separate worker process for each database
would be both very complicated and very expensive.

Also note that while truncate is very simple for a table (just zero
out the file) it's not so simple for an index (a zero-length file is
an invalid index, not an empty one).

> While here I also wonder why GiST unlogged are not supported ?!

Because they use LSNs internally to guarantee proper synchronization,
which presumes that WAL records are being omitted.  Temporary GIST
indexes were broken too, for the same reason, but Heikki fixed that
using GetXLogRecPtrForTemp().  We could engineer a similar solution
for unlogged GIST indexes using a shared counter that is saved in
pg_control across clean shutdowns, but I think that's a 9.2 project.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Unlogged tables cannot be truncated twice

От
Robert Haas
Дата:
2011/5/31 Andres Freund <andres@anarazel.de>:
> On Tuesday, May 31, 2011 03:27:22 Alvaro Herrera wrote:
>> Excerpts from Andres Freund's message of lun may 30 20:47:49 -0400 2011:
>> > On Tuesday, May 31, 2011 02:35:58 AM Andres Freund wrote:
>> > > On Tuesday, May 31, 2011 02:14:00 AM Andres Freund wrote:
>> > > > On Tuesday, May 31, 2011 01:56:05 AM C=E9dric Villemain wrote:
>> > > > > I remove my own explanations as we conclude on the same thing.
>> > > > > Attached is the fix by adding a (!reindex) =A0in the index.c if(=
).
>> > > >
>> > > > Thats imo wrong because it will break a plain REINDEX?
>> > >
>> > > > I think one possible correct fix would be the attached:
>> > > My version was wrong as well because it =A0did not observe
>> > > RelationTruncate's nblocks argument. That function is used to
>> > > "shorten" the relation in vacuum. So dropping the init fork there is
>> > > not a good idea.
>> > >
>> > > So I think it is the simpler version of simply checking the existance
>> > > of the fork before creating is ok.
>>
>> Hmm, I wonder if what we should be doing here is observe isreindex in
>> index_build to avoid creating the init fork. =A0Doing smgr access at that
>> level seems wrong.
> isreindex doesn't contain the necessary informormation as its set doing a
> REINDEX even though a new relfilenode is created and thus the fork needs =
to be
> created.
>
> It doesn't seem terribly clean do do the !smgrexists(), I aggree with you
> there. On the other hand we are calling smgrcreate() two lines down anywa=
y. I
> personally don't realy like the placement of that piece of code very much.
> Doing it in index_build seems to be the wrong place. I don't think there
> really is a good place for it right now.

I'm open to suggestions on how to rearrange this, but I think for
right now the approach you proposed upthread (add a smgrexists() test)
is probably the simplest way to fix this.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Unlogged tables cannot be truncated twice

От
Cédric Villemain
Дата:
2011/6/1 Robert Haas <robertmhaas@gmail.com>:
> On Tue, May 31, 2011 at 7:04 AM, Cédric Villemain
> <cedric.villemain.debian@gmail.com> wrote:
>> Robert, I wonder if all the logic with INIT_FORK is only to simulate
>> truncate on server restart ?
>
> That is correct.
>
>> I failled to understand why not simply use truncate around the server
>> start (when it is in recovery).
>
> We need to do the equivalent of a truncate but without relying in any
> way on the system catalogs.  The startup process is not bound to a
> database, and starting a separate worker process for each database
> would be both very complicated and very expensive.
>
> Also note that while truncate is very simple for a table (just zero
> out the file) it's not so simple for an index (a zero-length file is
> an invalid index, not an empty one).
>
>> While here I also wonder why GiST unlogged are not supported ?!
>
> Because they use LSNs internally to guarantee proper synchronization,
> which presumes that WAL records are being omitted.  Temporary GIST
> indexes were broken too, for the same reason, but Heikki fixed that
> using GetXLogRecPtrForTemp().  We could engineer a similar solution
> for unlogged GIST indexes using a shared counter that is saved in
> pg_control across clean shutdowns, but I think that's a 9.2 project.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

All right, things under a new light!
Thank you.


--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


Re: Unlogged tables cannot be truncated twice

От
Robert Haas
Дата:
On Wed, Jun 1, 2011 at 1:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> 2011/5/31 Andres Freund <andres@anarazel.de>:
>> On Tuesday, May 31, 2011 03:27:22 Alvaro Herrera wrote:
>>> Excerpts from Andres Freund's message of lun may 30 20:47:49 -0400 2011:
>>> > On Tuesday, May 31, 2011 02:35:58 AM Andres Freund wrote:
>>> > > On Tuesday, May 31, 2011 02:14:00 AM Andres Freund wrote:
>>> > > > On Tuesday, May 31, 2011 01:56:05 AM C=E9dric Villemain wrote:
>>> > > > > I remove my own explanations as we conclude on the same thing.
>>> > > > > Attached is the fix by adding a (!reindex) =A0in the index.c if=
().
>>> > > >
>>> > > > Thats imo wrong because it will break a plain REINDEX?
>>> > >
>>> > > > I think one possible correct fix would be the attached:
>>> > > My version was wrong as well because it =A0did not observe
>>> > > RelationTruncate's nblocks argument. That function is used to
>>> > > "shorten" the relation in vacuum. So dropping the init fork there is
>>> > > not a good idea.
>>> > >
>>> > > So I think it is the simpler version of simply checking the existan=
ce
>>> > > of the fork before creating is ok.
>>>
>>> Hmm, I wonder if what we should be doing here is observe isreindex in
>>> index_build to avoid creating the init fork. =A0Doing smgr access at th=
at
>>> level seems wrong.
>> isreindex doesn't contain the necessary informormation as its set doing a
>> REINDEX even though a new relfilenode is created and thus the fork needs=
 to be
>> created.
>>
>> It doesn't seem terribly clean do do the !smgrexists(), I aggree with you
>> there. On the other hand we are calling smgrcreate() two lines down anyw=
ay. I
>> personally don't realy like the placement of that piece of code very muc=
h.
>> Doing it in index_build seems to be the wrong place. I don't think there
>> really is a good place for it right now.
>
> I'm open to suggestions on how to rearrange this, but I think for
> right now the approach you proposed upthread (add a smgrexists() test)
> is probably the simplest way to fix this.

Done.  Your patch tested for FSM_FORKNUM instead of INIT_FORKNUM,
which seemed wrong, so I changed it.  I also added comments.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Unlogged tables cannot be truncated twice

От
Andres Freund
Дата:
On Thursday, June 02, 2011 07:31:33 PM Robert Haas wrote:
> On Wed, Jun 1, 2011 at 1:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > 2011/5/31 Andres Freund <andres@anarazel.de>:
> >> On Tuesday, May 31, 2011 03:27:22 Alvaro Herrera wrote:
> >>> Excerpts from Andres Freund's message of lun may 30 20:47:49 -0400 20=
11:
> >>> > On Tuesday, May 31, 2011 02:35:58 AM Andres Freund wrote:
> >>> > > On Tuesday, May 31, 2011 02:14:00 AM Andres Freund wrote:
> >>> > > > On Tuesday, May 31, 2011 01:56:05 AM C=E9dric Villemain wrote:
> >>> > > > > I remove my own explanations as we conclude on the same thing.
> >>> > > > > Attached is the fix by adding a (!reindex)  in the index.c
> >>> > > > > if().
> >>> > > >=20
> >>> > > > Thats imo wrong because it will break a plain REINDEX?
> >>> > >=20
> >>> > > > I think one possible correct fix would be the attached:
> >>> > > My version was wrong as well because it  did not observe
> >>> > > RelationTruncate's nblocks argument. That function is used to
> >>> > > "shorten" the relation in vacuum. So dropping the init fork there
> >>> > > is not a good idea.
> >>> > >=20
> >>> > > So I think it is the simpler version of simply checking the
> >>> > > existance of the fork before creating is ok.
> >>>=20
> >>> Hmm, I wonder if what we should be doing here is observe isreindex in
> >>> index_build to avoid creating the init fork.  Doing smgr access at th=
at
> >>> level seems wrong.
> >>=20
> >> isreindex doesn't contain the necessary informormation as its set doing
> >> a REINDEX even though a new relfilenode is created and thus the fork
> >> needs to be created.
> >>=20
> >> It doesn't seem terribly clean do do the !smgrexists(), I aggree with
> >> you there. On the other hand we are calling smgrcreate() two lines down
> >> anyway. I personally don't realy like the placement of that piece of
> >> code very much. Doing it in index_build seems to be the wrong place. I
> >> don't think there really is a good place for it right now.
> >=20
> > I'm open to suggestions on how to rearrange this, but I think for
> > right now the approach you proposed upthread (add a smgrexists() test)
> > is probably the simplest way to fix this.
>=20
> Done.  Your patch tested for FSM_FORKNUM instead of INIT_FORKNUM,
> which seemed wrong, so I changed it.  I also added comments.
Wow. I don't think I ever made so many stupid mistakes when doing a two lin=
e=20
change. I abviously wasn't really awake that evening. As evidenced excessiv=
ely=20
in that thread ;)

Thanks,

Andres

Re: Unlogged tables cannot be truncated twice

От
Robert Haas
Дата:
On Thu, Jun 2, 2011 at 2:52 PM, Andres Freund <andres@anarazel.de> wrote:
> Wow. I don't think I ever made so many stupid mistakes when doing a two line
> change. I abviously wasn't really awake that evening. As evidenced excessively
> in that thread ;)

I think I once found two bugs in a one-line patch...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company