Обсуждение: More than one pg_database entry for database

Поиск
Список
Период
Сортировка

More than one pg_database entry for database

От
"James Wilford"
Дата:
Hi all,

I'm trying to do a pg_dump of my database but I'm getting this error:

# pg_dump misp > misp.sql
pg_dump: query returned more than one (2) pg_database entry for database
"misp"

I can connect to the database OK, I just can't dump it. So I connected
as postgres to look at the pg_database table:

misp=# SELECT oid,* from pg_database;
   oid    |  datname  | datdba | encoding | datistemplate | datallowconn
| datlastsysoid | datvacuumxid | datfrozenxid | datpath | datconfig |
datacl
----------+-----------+--------+----------+---------------+-------------
-+---------------+--------------+--------------+---------+-----------+--
---------------
  9019080 | webstats  |    101 |        0 | f             | t
|         16974 |          427 |          427 |         |           |
 21676718 | webstats  |      1 |        0 | f             | t
|         16974 |          427 |          427 |         |           |
        1 | template1 |      1 |        0 | t             | t
|         16974 |          427 |          427 |         |           |
{=,postgres=CT}
    16975 | template0 |      1 |        0 | t             | f
|         16974 |          427 |          427 |         |           |
{=,postgres=CT}
 15062595 | misp_jon2 |    101 |        0 | f             | t
|         16974 |          427 |          427 |         |           |
 25823532 | misptest  |      1 |        0 | f             | t
|         16974 |          427 |          427 |         |           |
  3988824 | devices   |      1 |        0 | f             | t
|         16974 |          427 |          427 |         |           |
 30149940 | status    |      1 |        0 | f             | t
|         16974 |          427 |          427 |         |           |
 30153251 | mispcopy  |      1 |        0 | f             | t
|         16974 |          427 |          427 |         |           |
 31238435 | misp      |      1 |        0 | f             | t
|         16974 |    440751129 |   3661976602 |         |           |
  6790290 | misp      |      1 |        0 | f             | t
|         16974 |   2300049162 |   1226307340 |         |           |
  7511185 | misp_jon  |    101 |        0 | f             | t
|         16974 |          427 |          427 |         |           |
(12 rows)


This shows 2 entries for "misp" with different OIDs. Only the first one
(oid 31238435) exists in the data/base directory. So I tried to delete
the other row but it doesn't work:

misp=# DELETE from pg_database where oid = 6790290;
DELETE 0

Several sources I found via Googling suggested this should work. But it
doesn't, so I'm stuck. Things I've tried are:

1. Same thing in standalone mode - no difference
2. VACUUM on misp and pg_database
3. UPDATE on pg_database to change the name of the bogus row - this
doesn't match any rows either just like the DELETE.

Any help would be greatly appreciated.

Thanks,

James

Re: More than one pg_database entry for database

От
Tom Lane
Дата:
"James Wilford" <jwilford@mistral.net> writes:
> This shows 2 entries for "misp" with different OIDs. Only the first one
> (oid 31238435) exists in the data/base directory. So I tried to delete
> the other row but it doesn't work:

> misp=# DELETE from pg_database where oid = 6790290;
> DELETE 0

That's pretty strange.  I wonder if the index on pg_database.oid is
corrupt.  It certainly seems like the one on datname must be corrupt,
else it should have disallowed two such entries.

You could try deleting the bogus row by selecting it by ctid instead
of oid.  Also see if you can REINDEX pg_database (this will only work
in standalone mode I think).

What PG version is this exactly?

            regards, tom lane

Re: More than one pg_database entry for database

От
Tom Lane
Дата:
"James Wilford" <jwilford@mistral.net> writes:
> Its PG version 7.3.

Please tell us it's a relatively recent 7.3.x, at least ...

> I've just tried all your suggestions and
> unfortunately I still can't delete the bogus row. I think I managed to
> reindex the system tables:

Curious.  I'd have thought the reindex would fail because of the
duplicate datname entries.

> And trying ctid didn't work, assuming this syntax is correct:
> backend> delete from pg_database where ctid = '(0,37)';

Even more curious.  I'm starting to think it's some sort of XID
wraparound problem, except if you can see it in SELECT then why doesn't
the DELETE see it?

Could you show all the system columns of pg_database, viz

select xmin, cmin, xmax, cmax, oid, ctid, datname from pg_database;

            regards, tom lane

Re: More than one pg_database entry for database

От
"James Wilford"
Дата:
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: 05 July 2007 15:53
To: James Wilford
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] More than one pg_database entry for database

"James Wilford" <jwilford@mistral.net> writes:
> This shows 2 entries for "misp" with different OIDs. Only the first
> one (oid 31238435) exists in the data/base directory. So I tried to
> delete the other row but it doesn't work:

> misp=# DELETE from pg_database where oid = 6790290; DELETE 0

That's pretty strange.  I wonder if the index on pg_database.oid is
corrupt.  It certainly seems like the one on datname must be corrupt,
else it should have disallowed two such entries.

You could try deleting the bogus row by selecting it by ctid instead of
oid.  Also see if you can REINDEX pg_database (this will only work in
standalone mode I think).

What PG version is this exactly?

            regards, tom lane

Hi Tom,

Its PG version 7.3. I've just tried all your suggestions and
unfortunately I still can't delete the bogus row. I think I managed to
reindex the system tables:

backend> reindex database misp force
NOTICE:  relation 16416 was reindexed
NOTICE:  relation 1261 was reindexed
NOTICE:  relation 1255 was reindexed
NOTICE:  relation 16410 was reindexed
NOTICE:  relation 1247 was reindexed
NOTICE:  relation 1249 was reindexed
NOTICE:  relation 1259 was reindexed
NOTICE:  relation 16388 was reindexed
NOTICE:  relation 16390 was reindexed
NOTICE:  relation 16392 was reindexed
NOTICE:  relation 16394 was reindexed
NOTICE:  relation 16396 was reindexed
NOTICE:  relation 16398 was reindexed
NOTICE:  relation 16400 was reindexed
NOTICE:  relation 16402 was reindexed
NOTICE:  relation 16404 was reindexed
NOTICE:  relation 16406 was reindexed
NOTICE:  relation 16412 was reindexed
NOTICE:  relation 16418 was reindexed
NOTICE:  relation 16594 was reindexed
NOTICE:  relation 1260 was reindexed
NOTICE:  relation 16596 was reindexed
NOTICE:  relation 16598 was reindexed
NOTICE:  relation 16384 was reindexed
NOTICE:  relation 16386 was reindexed
NOTICE:  relation 1262 was reindexed
NOTICE:  relation 16408 was reindexed

However I still can't delete the row, even from standalone mode:


backend> delete from pg_database where oid = 6790290
blank
         1: ctid        (typeid = 27, len = 6, typmod = -1, byval = f)
        ----

And trying ctid didn't work, assuming this syntax is correct:

backend> delete from pg_database where ctid = '(0,37)';
blank
         1: ctid        (typeid = 27, len = 6, typmod = -1, byval = f)
        ----


I might try creating a new database cluster on a test box and copying
the data directory over, would this work? Could I then create a new
database and just rename the data directory to the new OID?

Thanks,

James


Re: More than one pg_database entry for database

От
"James Wilford"
Дата:

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: 05 July 2007 17:27
To: James Wilford
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] More than one pg_database entry for database

"James Wilford" <jwilford@mistral.net> writes:
> Its PG version 7.3.

Please tell us it's a relatively recent 7.3.x, at least ...

> I've just tried all your suggestions and unfortunately I still can't
> delete the bogus row. I think I managed to reindex the system tables:

Curious.  I'd have thought the reindex would fail because of the
duplicate datname entries.

> And trying ctid didn't work, assuming this syntax is correct:
> backend> delete from pg_database where ctid = '(0,37)';

Even more curious.  I'm starting to think it's some sort of XID
wraparound problem, except if you can see it in SELECT then why doesn't
the DELETE see it?

Could you show all the system columns of pg_database, viz

select xmin, cmin, xmax, cmax, oid, ctid, datname from pg_database;

            regards, tom lane

It's a redhat version of 7.3 - rh-postgresql-server-7.3.8-2 is the
package.

Here's the system colums:

misp=# select xmin, cmin, xmax, cmax, oid, ctid, datname from
pg_database;
   xmin   |    cmin    |    xmax    | cmax |   oid    |  ctid  |
datname
----------+------------+------------+------+----------+--------+--------
---
        2 | 2304646034 | 2304646034 |    0 |  9019080 |  (0,3) |
webstats
        2 |          0 |          0 |    0 | 21676718 |  (0,4) |
webstats
        2 |          0 |          0 |    0 |        1 |  (0,6) |
template1
        2 |          0 |          0 |    0 |    16975 |  (0,7) |
template0
        2 | 2425306827 | 2425306827 |    0 | 15062595 | (0,16) |
misp_jon2
        2 |          0 |          0 |    0 | 25823532 | (0,19) |
misptest
        2 |          0 |          0 |    0 |  3988824 | (0,23) | devices
 45097349 |          0 |          0 |    0 | 30149940 | (0,26) | status
 59253569 |          0 |          0 |    0 | 30153251 | (0,27) |
mispcopy
 59254726 |          0 |          0 |    0 | 31238435 | (0,28) | misp
        2 | 2430588419 | 2430588419 |    0 |  6790290 | (0,37) | misp
        2 | 2425306819 | 2425306819 |    0 |  7511185 | (0,38) |
misp_jon
(12 rows)

The "webstats" database also appears twice, although we're more
concerned about "misp" at the moment.

James

Re: More than one pg_database entry for database

От
Tom Lane
Дата:
"James Wilford" <jwilford@mistral.net> writes:
> It's a redhat version of 7.3 - rh-postgresql-server-7.3.8-2 is the
> package.

That was obsoleted by Red Hat well over two years ago; you are overdue
for an update.

>    xmin   |    cmin    |    xmax    | cmax |   oid    |  ctid  | datname
> ----------+------------+------------+------+----------+--------+--------
>  59254726 |          0 |          0 |    0 | 31238435 | (0,28) | misp
>         2 | 2430588419 | 2430588419 |    0 |  6790290 | (0,37) | misp

It definitely looks to me like you have a problem with XID wraparound :-(
The (0,37) row was apparently deleted long ago, but never vacuumed away,
and now its xmax has wrapped around past the 2-billion-transaction event
horizon, causing it to appear visible to MVCC-using queries.  So there's
something broken about your routine vacuuming procedures; you'd better
take a look at that.

As far as getting out of the situation, the only really good answer is a
dump and reload.  I can't think of any simple way of getting rid of the
bogus row, but what you should be able to do to let pg_dump work is to
rename misp to something else.  You can rename it back after getting
through the dump/reload, of course.

            regards, tom lane

Re: More than one pg_database entry for database

От
Alvaro Herrera
Дата:
Tom Lane wrote:
> "James Wilford" <jwilford@mistral.net> writes:
> > It's a redhat version of 7.3 - rh-postgresql-server-7.3.8-2 is the
> > package.
>
> That was obsoleted by Red Hat well over two years ago; you are overdue
> for an update.
>
> >    xmin   |    cmin    |    xmax    | cmax |   oid    |  ctid  | datname
> > ----------+------------+------------+------+----------+--------+--------
> >  59254726 |          0 |          0 |    0 | 31238435 | (0,28) | misp
> >         2 | 2430588419 | 2430588419 |    0 |  6790290 | (0,37) | misp
>
> It definitely looks to me like you have a problem with XID wraparound :-(
> The (0,37) row was apparently deleted long ago, but never vacuumed away,
> and now its xmax has wrapped around past the 2-billion-transaction event
> horizon, causing it to appear visible to MVCC-using queries.  So there's
> something broken about your routine vacuuming procedures; you'd better
> take a look at that.
>
> As far as getting out of the situation, the only really good answer is a
> dump and reload.  I can't think of any simple way of getting rid of the
> bogus row, but what you should be able to do to let pg_dump work is to
> rename misp to something else.  You can rename it back after getting
> through the dump/reload, of course.

Or roll the XID counter back, vacuum the table, and restore the XID to
the original value.  This is done with pg_resetxlog, though I am not
sure if we shipped it in 7.3.

--
Alvaro Herrera                 http://www.amazon.com/gp/registry/DXLWNGRJD34J
"The West won the world not by the superiority of its ideas or values
or religion but rather by its superiority in applying organized violence.
Westerners often forget this fact, non-Westerners never do."
(Samuel P. Huntington)

Re: More than one pg_database entry for database

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> As far as getting out of the situation, the only really good answer is a
>> dump and reload.  I can't think of any simple way of getting rid of the
>> bogus row, but what you should be able to do to let pg_dump work is to
>> rename misp to something else.  You can rename it back after getting
>> through the dump/reload, of course.

> Or roll the XID counter back, vacuum the table, and restore the XID to
> the original value.  This is done with pg_resetxlog, though I am not
> sure if we shipped it in 7.3.

That seems fairly hazardous, in particular there might be undesirable
side-effects on other system catalogs while you are running with the
set-back XID counter.

Also, I see no very good reason to assume that this is the only
wraparound problem present in the DB.  A dump and reload would probably
be useful to help check for other inconsistencies.

            regards, tom lane

Re: More than one pg_database entry for database

От
"James Wilford"
Дата:

-----Original Message-----
From: pgsql-admin-owner@postgresql.org
[mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Tom Lane
Sent: 05 July 2007 19:17
To: Alvaro Herrera
Cc: James Wilford; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] More than one pg_database entry for database

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> As far as getting out of the situation, the only really good answer
>> is a dump and reload.  I can't think of any simple way of getting rid

>> of the bogus row, but what you should be able to do to let pg_dump
>> work is to rename misp to something else.  You can rename it back
>> after getting through the dump/reload, of course.

> Or roll the XID counter back, vacuum the table, and restore the XID to

> the original value.  This is done with pg_resetxlog, though I am not
> sure if we shipped it in 7.3.

That seems fairly hazardous, in particular there might be undesirable
side-effects on other system catalogs while you are running with the
set-back XID counter.

Also, I see no very good reason to assume that this is the only
wraparound problem present in the DB.  A dump and reload would probably
be useful to help check for other inconsistencies.

            regards, tom lane


I'd like to try the dump and reload - the only problem is that I can't
use pg_dump because of the error. And I can't rename the entry in
pg_database because my update statement won't match any rows.

I'm going to set up a test box with the same OS/PG version and try
copying the raw data across from the data directory. It looks like the
only option at the moment. And then I'll definitely upgrade!

Thanks,

James

Re: More than one pg_database entry for database

От
Tom Lane
Дата:
"James Wilford" <jwilford@mistral.net> writes:
> I'd like to try the dump and reload - the only problem is that I can't
> use pg_dump because of the error. And I can't rename the entry in
> pg_database because my update statement won't match any rows.

You can't change the dead entry, but it should work to rename the live
one.

BTW, you have one of the most annoying quoting styles I have ever seen:
it's impossible to tell your words from those of the message you are
replying to, and not trimming the overhead from that message wastes
still more of your readers' time.  Please change your habits, or you
are likely to find that people stop reading your mail.

            regards, tom lane