Обсуждение: strange behavior (corruption?) of large production database
We have very strange behavior from an internal production database. There are multiple symptoms, all pointing to a problem with clusterwide tables. For example: postgres@csdfds1:~> psql -U postgres -p 5433 cyspec Welcome to psql 7.4.8, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help on internal slash commands \g or terminate with semicolon to execute query \q to quit cyspec=# select version(); version ------------------------------------------------------------------------------------- PostgreSQL 7.4.8 on i686-pc-linux-gnu,compiled by GCC gcc (GCC) 3.3.3 (SuSE Linux) (1 row) postgres@csdfds1:~> psql -l List of databases Name | Owner | Encoding ----------------+----------+----------- cyspec | postgres | SQL_ASCII temp_mike | postgres | SQL_ASCII temp_mike_new | postgres | SQL_ASCII temp_mike_orig | postgres | SQL_ASCII template0 | postgres | SQL_ASCII template1 | postgres | SQL_ASCII (6 rows) cyspec=# select * from pg_database; datname | datdba | encoding | datistemplate | datallowconn | datlastsysoid | datvacuumxid | datfrozenxid | datpath | datconfig | datacl ---------+--------+----------+---------------+--------------+---------------+--------------+--------------+---------+-----------+-------- (0 rows) cyspec=# \l List of databases Name | Owner | Encoding ------+-------+---------- (0 rows) No databases found. Additionally: cyspec=# select usename, usesysid from pg_shadow; usename | usesysid ----------+---------- postgres | 1 colxl | 102 colro | 101 l400509 | 105 (4 rows) cyspec=# \c - colprod You are now connected as new user "colprod". cyspec=> \c - colxl You are now connected as new user "colxl". cyspec=> \c - colprod You are now connected as new user "colprod". cyspec=> \c - zxcvvb FATAL: user "zxcvvb" does not exist Previous connection kept The "colprod" user has disappeared from pg_shadow (there was one previously, and it was never intentionally dropped), but I can still connect with that user. The current problem was actually initially found because pg_dump complained that the owner of the colprod schema didn't exist. One more thing: cyspec=# show wal_sync_method; wal_sync_method ----------------- fdatasync (1 row) That works, but SHOW ALL and "select * from pg_settings;" return lines and lines of nothing in psql. (I mean literally blank lines, not even "(0 rows)") After issuing \o /tmp/filename the output is all there, and looks normal. The oddness was first noticed about 3 days after a maintenance shutdown. As far as I have been told, during the maintenance window, there may have been OS level package upgrades, and there was a firmware upgrade done on the storage subsystem (NetApp). Any advice at what to look at/do would be appreciated. This database is somewhere around 1.1 TB in size, so dump and reload is not something we're anxious to do. Joe
Joe Conway <mail@joeconway.com> writes: > We have very strange behavior from an internal production database. > There are multiple symptoms, all pointing to a problem with clusterwide > tables. For example: "psql -l" really should produce the same results as doing "\l" in the template1 database. Does it? If so, the next thing to look at is probably whether the pg_class and pg_attribute entries for pg_database look the same in template1 and in cyspec. Similarly you could compare what pg_shadow looks like from different databases, and what the local system tables have as entries for it. regards, tom lane
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > >>We have very strange behavior from an internal production database. >>There are multiple symptoms, all pointing to a problem with clusterwide >>tables. For example: > > > "psql -l" really should produce the same results as doing "\l" in the > template1 database. Does it? Sorry -- on my last post "psql -l" was pointing to the wrong place. postgres@csdfds1:~> psql -p 5433 -l List of databases Name | Owner | Encoding ------+-------+---------- (0 rows) postgres@csdfds1:~> psql -U postgres -p 5433 template1 Welcome to psql 7.4.8, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help on internal slash commands \g or terminate with semicolon to execute query \q to quit template1=# \l List of databases Name | Owner | Encoding ------+-------+---------- (0 rows) So they agree in template1 and cyspec databases. > If so, the next thing to look at is probably whether the pg_class and pg_attribute entries for pg_database > look the same in template1 and in cyspec. Similarly you could compare > what pg_shadow looks like from different databases, and what the local > system tables have as entries for it. Did that - they look the same. We did a simple cat 1262 | less to view the contents of pg_database and indeed the data looks to be there. Similarly a simple cat of the pg_shadow relfilenode shows the colprod user. Is it possible that we have corrupted shared memory, and a database restart will fix the problem? We didn't want to restart until the forensics were done. Joe
Joe Conway <mail@joeconway.com> writes: > So they agree in template1 and cyspec databases. OK, in that case I'd wonder about whether you've suffered XID wraparound in pg_database and/or pg_shadow. The typical symptom of this is that entries are valid from the system's point of view but not visible to queries, and that seems to be what you have. If so, a restart will NOT fix it. You could try a VACUUM FREEZE on pg_database though. Before doing that, I'd suggest looking at the rows' xmin values (use pg_filedump or grovel through the bits by hand) to confirm the wraparound theory. regards, tom lane
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > >>So they agree in template1 and cyspec databases. > > OK, in that case I'd wonder about whether you've suffered XID wraparound > in pg_database and/or pg_shadow. The typical symptom of this is that > entries are valid from the system's point of view but not visible to > queries, and that seems to be what you have. If so, a restart will NOT > fix it. You could try a VACUUM FREEZE on pg_database though. Before > doing that, I'd suggest looking at the rows' xmin values (use > pg_filedump or grovel through the bits by hand) to confirm the > wraparound theory. > Talking to the maintainer of this cluster, it sounds like XID wraparound could be the problem. I thought they were running database wide vacuums at some regularity, but apparently they are only vacuuming specific production tables. Since this is a production machine, putting pg_filedump on it may be problematic -- if I grovel through the bits by hand, can you give me a hint about what to look for? Thanks, Joe
Joe Conway <mail@joeconway.com> writes: > Tom Lane wrote: >> You could try a VACUUM FREEZE on pg_database though. > Since this is a production machine, putting pg_filedump on it may be > problematic -- if I grovel through the bits by hand, can you give me a > hint about what to look for? How about you copy off the pg_database file to someplace where it's OK to run pg_filedump? Doing that by hand is mighty tedious. BTW, forget the "FREEZE" part, just VACUUM: http://archives.postgresql.org/pgsql-general/2005-11/msg00097.php regards, tom lane
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: >>Since this is a production machine, putting pg_filedump on it may be >>problematic -- if I grovel through the bits by hand, can you give me a >>hint about what to look for? > > How about you copy off the pg_database file to someplace where it's OK > to run pg_filedump? Doing that by hand is mighty tedious. > > BTW, forget the "FREEZE" part, just VACUUM: > http://archives.postgresql.org/pgsql-general/2005-11/msg00097.php > That was it, apparently. I tarred up the global directory so I can do posthoc analysis, but they were too anxious to wait any longer, so we ran vacuum on pg_database and pg_shadow and now everything appears normal again. I pointed them to: http://www.postgresql.org/docs/7.4/interactive/maintenance.html and advised that they read it carefully. Since this database has many large, but static tables (i.e. new data is loaded each day, but the tables are partitioned into year-month tables), I'm thinking we can run VACUUM FREEZE on the whole database once, and then run VACUUM FREEZE periodically on just the tables that have had rows added since the last time -- is that correct? Thanks again for the quick help. Joe
Joe Conway <mail@joeconway.com> writes: > Since this database has many large, but static tables (i.e. new data is > loaded each day, but the tables are partitioned into year-month tables), > I'm thinking we can run VACUUM FREEZE on the whole database once, and > then run VACUUM FREEZE periodically on just the tables that have had > rows added since the last time -- is that correct? That would work for the user tables, but the lesson to draw from this is not to forget about the system catalogs ... regards, tom lane
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > >>Since this database has many large, but static tables (i.e. new data is >>loaded each day, but the tables are partitioned into year-month tables), >>I'm thinking we can run VACUUM FREEZE on the whole database once, and >>then run VACUUM FREEZE periodically on just the tables that have had >>rows added since the last time -- is that correct? > > That would work for the user tables, but the lesson to draw from this is > not to forget about the system catalogs ... > Right, I was just thinking about that too. Important safety tip :-) Thanks again! Joe
Joe Conway <mail@joeconway.com> writes: > cyspec=# vacuum freeze pg_catalog.pg_class; > ERROR: failed to re-find parent key in "pg_class_relname_nsp_index" > It seems that we cannot vacuum pg_class, because vacuum itself fails. > Any suggestions on how to bootstrap the fixing of pg_class? REINDEX? regards, tom lane
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: >>cyspec=# vacuum freeze pg_catalog.pg_class; >>ERROR: failed to re-find parent key in "pg_class_relname_nsp_index" > >>It seems that we cannot vacuum pg_class, because vacuum itself fails. >>Any suggestions on how to bootstrap the fixing of pg_class? > > REINDEX? > As usual, sage advice. That worked -- thanks. Joe