Обсуждение: database corruption help
Hi, my wal archiving broke and postgresql filled up the local disk with transaction logs, which i foolishly deleted in a moment of madness, after resetting the transaction log a few of my tables are damaged but repairable. However the system tables also seemed to have suffered. My main problem is duplicate rows which violated the primary key and if i do a reindex system i get similar errors, for example:
ERROR: could not create unique index "pg_class_oid_index"
a quick inspection of the pg_class table doesn't show any duplicates, is there anyway i can find out which row(s) are duplicated and remove them without a full db restore?
also doing something like this doesn't return anything which worked for my other tables
select oid from pg_class where oid in(select oid from pg_class group by oid having count(oid)>1 )
Many thanks
--
Got needs? Get Goblin'! - http://www.pricegoblin.co.uk/
"John Lister" <john.lister-ps@kickstone.com> writes: > ERROR: could not create unique index "pg_class_oid_index" > a quick inspection of the pg_class table doesn't show any duplicates, is there anyway i can find out which row(s) are duplicatedand remove them without a full db restore? > also doing something like this doesn't return anything which worked for my other tables > select oid from pg_class where oid in(select oid from pg_class group by oid having count(oid)>1 ) Try doing it with enable_indexscan = off and enable_bitmapscan = off. The system might be relying on the old, bogus index to do the group by oid. regards, tom lane
Cheers tom, that did it - i've removed the duplicates and seeing what else is broken. . > "John Lister" <john.lister-ps@kickstone.com> writes: >> ERROR: could not create unique index "pg_class_oid_index" > >> a quick inspection of the pg_class table doesn't show any duplicates, is >> there anyway i can find out which row(s) are duplicated and remove them >> without a full db restore? > >> also doing something like this doesn't return anything which worked for >> my other tables > >> select oid from pg_class where oid in(select oid from pg_class group by >> oid having count(oid)>1 ) > > Try doing it with enable_indexscan = off and enable_bitmapscan = off. > The system might be relying on the old, bogus index to do the group by > oid. > > regards, tom lane >
John Lister <john.lister@kickstone.com> writes: > Although saying that.. > reindex now works, but doing a vacuum verbose complains that the index > is out of step with the table and i should reindex.. > would i be better shutting the db down, restarting in standalone mode > (and also using the -P option) before reindexing? Yeah, if you have reason not to trust the system indexes then -P is a good idea until they are fixed. Standalone mode per se isn't that important --- you could do this from a regular session with -P specified via PGOPTIONS. regards, tom lane
>Yeah, if you have reason not to trust the system indexes then -P is >a good idea until they are fixed. Standalone mode per se isn't that >important --- you could do this from a regular session with -P specified >via PGOPTIONS. still getting the same problem. > PGOPTIONS="-P" > psql backend >> reindex system backend; >> NOTICE: table "pg_class" was reindexed >> NOTICE: table "pg_statistic" was reindexed >> NOTICE: table "pg_ts_config" was reindexed ... >> vacuum full verbose pg_class >> INFO: vacuuming "pg_catalog.pg_class" >> INFO: "pg_class": found 7 removable, 1839 nonremovable row versions in 18812 pages >> DETAIL: 27 dead row versions cannot be removed yet. >> Nonremovable row versions range from 160 to 229 bytes long. >> There were 929452 unused item pointers. >> Total free space (including removable row versions) is 149628172 bytes. >> 18749 pages are or will become empty, including 0 at the end of the table. >> 0 pages containing 0 free bytes are potential move destinations. >> CPU 0.01s/0.07u sec elapsed 0.17 sec. >> INFO: index "pg_class_oid_index" now contains 1812 row versions in 7 pages >> DETAIL: 0 index row versions were removed. >> 0 index pages have been deleted, 0 are currently reusable. >> CPU 0.00s/0.01u sec elapsed 0.09 sec. >> WARNING: index "pg_class_oid_index" contains 1812 row versions, but table contains 1807 row versions >> HINT: Rebuild the index with REINDEX. >> INFO: index "pg_class_relname_nsp_index" now contains 1812 row versions in 23 pages >> DETAIL: 0 index row versions were removed. >> 0 index pages have been deleted, 0 are currently reusable. >> CPU 0.00s/0.01u sec elapsed 0.09 sec. >> WARNING: index "pg_class_relname_nsp_index" contains 1812 row versions, but table contains 1807 row versions >> HINT: Rebuild the index with REINDEX. Just noticed the 7 dead rows above, are they likely to be causing a problem? BTW this is postgresql 8.3.5 if that makes a difference... I'm guessing it is fairly important to get rid of these warnings.... Thanks
John Lister <john.lister-ps@kickstone.com> writes: > still getting the same problem. >>> PGOPTIONS="-P" >>> psql backend I think you need export PGOPTIONS="-P" to make that work. Whether it's related to your problem isn't clear though. regards, tom lane
Sorry, had exported it - bad cut/pasting... I even tried it in single-user mode passing -P directly but got the same result Also confused/concerned by the 7 dead rows in pg_class. as i've restarted the server all transaction should have finished so i'd like to reclaim these - especially as vacuum reports the table is using 80Mb+ Thanks >> still getting the same problem. >> > > >>>> PGOPTIONS="-P" >>>> psql backend >>>> > > I think you need export PGOPTIONS="-P" to make that work. Whether > it's related to your problem isn't clear though. > > regards, tom lane >
I seem to have more dead rows now.. doing a vacuum full on pg_class gives me INFO: vacuuming "pg_catalog.pg_class"INFO: "pg_class": found 37 removable, 1845 nonremovable row versions in 18905 pages DETAIL: 27 dead row versions cannot be removed yet. Nonremovable row versions range from 160 to 229 bytes long. There were 933834 unused item pointers. Total free space (including removable row versions) is 150368692 bytes. 18839 pages are or will become empty, including 0 at the end of the table. 0 pages containing 0 free bytes are potential move destinations. CPU 0.01s/0.05u sec elapsed 0.17 sec.INFO: index "pg_class_oid_index" now contains 1813 row versions in 7 pages DETAIL: 56 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.INFO: index "pg_class_relname_nsp_index" now contains 1818 row versions in 24 pages DETAIL: 63 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.WARNING: index "pg_class_relname_nsp_index" contains 1818 row versions, but table contains 1813 row versions HINT: Rebuild the index with REINDEX.INFO: analyzing "pg_catalog.pg_class"INFO: "pg_class": scanned 3000 of 18905 pages, containing 227 live rows and 6 dead rows; 227 rows in sample, 1430 estimated total rows Total query runtime: 4469 ms. I've restarted the server which should have got rid of any outstanding/long running transactions - which it though was the major cause of lingering dead rows. Any ideas how to fix this as i'd like to reduce this from 150+Mb which can't be good for performance.... Thanks
John Lister <john.lister-ps@kickstone.com> writes: > I seem to have more dead rows now.. > doing a vacuum full on pg_class gives me > INFO: vacuuming "pg_catalog.pg_class"INFO: "pg_class": found 37 > removable, 1845 nonremovable row versions in 18905 pages > DETAIL: 27 dead row versions cannot be removed yet. > Nonremovable row versions range from 160 to 229 bytes long. > There were 933834 unused item pointers. > Total free space (including removable row versions) is 150368692 bytes. > 18839 pages are or will become empty, including 0 at the end of the table. > 0 pages containing 0 free bytes are potential move destinations. Hmm, that last seems to indicate that do_shrinking has been reset. It looks like the cases where that happens are reported with elog(LOG) which means they probably only went to the postmaster log (maybe we should fix things so that vacuum verbose reports those). What have you got in the postmaster log? regards, tom lane
I'm running ubuntu and can see references to only one log file which i'm assuming is the postmaster log.. Anyway the only relevant bits are: GMT LOG: relation "pg_class" TID 15538/4: dead HOT-updated tuple --- cannot shrink relation 2009-02-12 21:06:40 GMT STATEMENT: VACUUM FULL VERBOSE ANALYZE pg_class 2009-02-12 21:06:40 GMT WARNING: index "pg_class_relname_nsp_index" contains 1818 row versions, but table contains 1813 row versions If that helps... Thanks > John Lister <john.lister-ps@kickstone.com> writes: > >> I seem to have more dead rows now.. >> doing a vacuum full on pg_class gives me >> > > >> INFO: vacuuming "pg_catalog.pg_class"INFO: "pg_class": found 37 >> removable, 1845 nonremovable row versions in 18905 pages >> DETAIL: 27 dead row versions cannot be removed yet. >> Nonremovable row versions range from 160 to 229 bytes long. >> There were 933834 unused item pointers. >> Total free space (including removable row versions) is 150368692 bytes. >> 18839 pages are or will become empty, including 0 at the end of the table. >> 0 pages containing 0 free bytes are potential move destinations. >> > > Hmm, that last seems to indicate that do_shrinking has been reset. > It looks like the cases where that happens are reported with elog(LOG) > which means they probably only went to the postmaster log (maybe we > should fix things so that vacuum verbose reports those). What have > you got in the postmaster log? > > regards, tom lane >