error: could not find pg_class tuple for index 2662
От | daveg |
---|---|
Тема | error: could not find pg_class tuple for index 2662 |
Дата | |
Msg-id | 20110728002853.GA15578@sonic.net обсуждение исходный текст |
Ответы |
Re: error: could not find pg_class tuple for index 2662
|
Список | pgsql-hackers |
My client has been seeing regular instances of the following sort of problem: ...03:06:09.453 exec_simple_query, postgres.c:90003:06:12.042 XX000: could not find pg_class tuple for index 2662 at character1303:06:12.042 RelationReloadIndexInfo, relcache.c:174003:06:12.042 INSERT INTO zzz_k(k) SELECT ...03:06:12.04500000: statement: ABORT03:06:12.045 exec_simple_query, postgres.c:90003:06:12.045 00000: duration: 0.100 ms03:06:12.045exec_simple_query, postgres.c:112803:06:12.046 00000: statement: INSERT INTO temp_807 VALUES(...)03:06:12.046 exec_simple_query, postgres.c:90003:06:12.046 XX000: could not find pg_class tuple for index 2662at character 1303:06:12.046 RelationReloadIndexInfo, relcache.c:174003:06:12.046 INSERT INTO temp_807 VALUES (...)03:06:12.096 08P01: unexpected EOF on client connection03:06:12.096 SocketBackend, postgres.c:34803:06:12.096XX000: could not find pg_class tuple for index 266203:06:12.096 RelationReloadIndexInfo, relcache.c:174003:06:12.12100000: disconnection: session time: 0:06:08.537 user=ZZZ database=ZZZ_0103:06:12.121 log_disconnections,postgres.c:4339 The above happens regularly (but not completely predictably) corresponding with a daily cronjob that checks the catalogs for bloat and does vacuum full and/or reindex as needed. Since some of the applications make very heavy use of temp tables this will usually mean pg_class and pg_index get vacuum full and reindex. Sometimes queries will fail due to being unable to open a tables containing file. On investigation the file will be absent in both the catalogs and the filesystem so I don't know what table it refers to: 20:41:19.063 ERROR: could not open file "pg_tblspc/16401/PG_9.0_201008051/16413/1049145092": No such file or directory20:41:19.063 STATEMENT: insert into r_ar__30 select aid, mid, pid, sum(wdata) as wdata, ... --20:41:19.430 ERROR: could not open file "pg_tblspc/16401/PG_9.0_201008051/16413/1049145092": No such file or directory20:41:19.430 STATEMENT: SELECT nextval('j_id_seq') Finallly, I have seen a several instances of failure to read data by vacuum full itself: 03:05:45.699 00000: statement: vacuum full pg_catalog.pg_index;03:05:45.699 exec_simple_query, postgres.c:90003:05:46.142XX001: could not read block 65 in file "pg_tblspc/16401/PG_9.0_201008051/16416/1049146489": readonly 0 of 8192 bytes03:05:46.142 mdread, md.c:65603:05:46.142 vacuum full pg_catalog.pg_index; This occurs on postgresql 9.0.4. on 32 core 512GB Dell boxes. We have identical systems still running 8.4.8 that do not have this issue, so I'm assuming it is related to the vacuum full work done for 9.0. Oddly, we don't see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4, so it may be timing related. This seems possibly related to the issues in: Bizarre buildfarm failure on baiji: can't find pg_class_oid_index http://archives.postgresql.org/pgsql-hackers/2010-02/msg02038.phpBroken HOT chains in system catalogs http://archives.postgresql.org/pgsql-hackers/2011-04/msg00777.php As far as I can tell from the logs I have, once a session sees one of these errors any subsequent query will hit it again until the session exits. However, it does not seem to harm other sessions or leave any persistant damage (crossing fingers and hoping here). I'm ready to do any testing/investigation/instrumented builds etc that may be helpful in resolving this. Regards -dg -- David Gould daveg@sonic.net 510 536 1443 510 282 0869 If simplicity worked, the world would be overrun with insects.
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Tom LaneДата:
Сообщение: Ripping out pg_restore's attempts to parse SQL before sending it