Re: ERROR: catalog is missing 9 attribute(s) for relid 10297

Поиск
Список
Период
Сортировка
От O'Shea, Brendan
Тема Re: ERROR: catalog is missing 9 attribute(s) for relid 10297
Дата
Msg-id F6331254DAFC0041BEAE36A51D295F92328BF1@MAVS2.kendall.corp.akamai.com
обсуждение исходный текст
Ответ на Re: ERROR: catalog is missing 9 attribute(s) for relid 10297  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
"Lane, Tom" <tgl@sss.pgh.pa.us> writes:>
> "O'Shea, Brendan" <boshea@akamai.com> writes:
> >> "Lane, Tom" <tgl@sss.pgh.pa.us> writes:
> >> Ugh.  Does it work if you do
> >> export PGOPTIONS="--ignore_system_indexes=1"
>
> > I tried that, but unfortunately pg_dump still fails to run
> and the error
> > message is identical to previous attempts.
>
> Huh.  So it's not index corruption then.  Table corruption is still a
> possibility but it seems likely that you'd be getting other errors
> during the seqscan that looks for the rows.  This suggests
> that the rows
> actually disappeared from pg_attribute, which is a bit hard to credit.
> The only mechanisms I can think of are that VACUUM decided they were
> dead or something physically truncated the table.  The latter would
> probably have zapped a lot of other rows though.
>

The auto vacuum daemon is running pretty frequently on the database
since one of our tables is very heavily updated, about 50-200 per
second, and the rows themselves are large, about 1k in size on average
and about 7000 rows in the table.  We also run a nightly 'vacuum full'
operation on this table since it can sometimes grow fairly large (around
50-100MB or more) if there are any long running transactions that last
20-30 minutes or more.  Not sure if this is related but thought it was
worth a mention.

> Have you checked to see if pg_roles is the only relation with this
> problem?  Try
> select c.relname from pg_class c left join pg_attribute a
> on a.attrelid = c.oid and a.attnum > 0
> group by c.oid,c.relname,c.relnatts having count(*) != c.relnatts;
>

This SQL produces the output:
"pg_roles"
"pg_group"
"pg_rules"
"pg_user"
"pg_shadow"

Querying each table produces the error:
"pg_roles"  => ERROR:  catalog is missing 9 attribute(s) for relid 10297
"pg_group"  => ERROR:  catalog is missing 3 attribute(s) for relid 10303
"pg_rules"  => ERROR:  invalid attribute number 0 for pg_rules
"pg_user"   => ERROR:  catalog is missing 8 attribute(s) for relid 10306
"pg_shadow" => ERROR:  catalog is missing 8 attribute(s) for relid 10300

> > Any suggestions for additional logging we might turn on to help
> > determine the cause of this issue?
>
> Maybe VACUUM VERBOSE on pg_attribute?  Although you'd have to
> get lucky
> enough to catch the time that it zapped the rows, if that's what the
> problem is.
>

So are you saying that on our newly installed system we could
periodically run "VACUUM VERBOSE pg_attribute", append the output to a
log file, and then after the catalog error starts happening again we
could go to this log file to look for potential clues?

Running "VACUUM VERBOSE pg_attribute" now produces:

INFO:  vacuuming "pg_catalog.pg_attribute"
INFO:  scanned index "pg_attribute_relid_attnam_index" to remove 30 row
versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.03 sec.
INFO:  scanned index "pg_attribute_relid_attnum_index" to remove 30 row
versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_attribute": removed 30 row versions in 1 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_attribute_relid_attnam_index" now contains 2670 row
versions in 48 pages
DETAIL:  30 index row versions were removed.
11 index pages have been deleted, 11 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_attribute_relid_attnum_index" now contains 2670 row
versions in 13 pages
DETAIL:  30 index row versions were removed.
2 index pages have been deleted, 2 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_attribute": found 30 removable, 2670 nonremovable row
versions in 62 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 965 unused item pointers.
20 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.04 sec.

Query returned successfully with no result in 500 ms.


Thanks for your help,
Brendan

В списке pgsql-general по дате отправления:

Предыдущее
От: Lew
Дата:
Сообщение: Re: Need efficient way to do comparison with NULL as an option
Следующее
От: ProfKheel
Дата:
Сообщение: Create Index (Hash) on a Large Table Taking Days...