Обсуждение: Error: missing chunk number ...
Hi, I've recently observed some strange behavior using postgresql 7.2.1 on a RH-7.3 system. I've got a service for fetching rss feeds from the web. These feeds are stored into a text column. Here's how the table looks like: Column | Type | Modifiers ------------------+--------------------------+----------- channel_id | integer | dataurl | character varying(1000) | ... ... latest_feed | text | feedidx | txtidx | Indexes: hellofeedidx (feedidx uses tsearch and indexes the faulty column, namely latest_feed). This is not on a production system. After running this service for a while (almost a week). I get messages like: # select * from rss_channels; ERROR: missing chunk number 1 for toast value 797979 If I try to delete the culprit row. I get: # delete from rss_channels where channel_id=116; ERROR: simple_heap_delete: tuple already updated by self I'm running a full vacuum daily: su - postgres -c 'vacuumdb --full --analyze --quiet --all' and a simple vacuum once every hour: su - postgres -c 'vacuumdb --analyze --quiet --all' Here's how I configure and run PG: ./configure --enable-multibyte=UNICODE --with-tcl --with-perl --enable-syslog su -l postgres -c 'export LANG=C; /usr/local/pgsql/bin/postmaster -N 200 -B 1000 -o "-S 2000" -S -D /usr/local/pgsql/data ' Note that I cannot do a pg_dump on the rss_channels table. Any idea of what might be the problem? Best wishes, Neophytos
Neophytos Demetriou <k2pts@cytanet.com.cy> writes: > # select * from rss_channels; > ERROR: missing chunk number 1 for toast value 797979 > If I try to delete the culprit row. I get: > # delete from rss_channels where channel_id=116; > ERROR: simple_heap_delete: tuple already updated by self Bizarre. Evidently there's something broken about the TOAST data for your table. I am thinking that the toast table's index might be corrupt, in which case you could probably recover by reindexing that index. But first it would be nice to see if we can figure out exactly what happened --- is this the result of a software bug, or a hardware glitch? Would you be willing to let someone poke around in your database, or perhaps if the DB is not too large, tar it all up to send to someone for analysis? regards, tom lane
> > >But first it would be nice to see if we can figure out exactly >what happened --- is this the result of a software bug, or a hardware >glitch? Would you be willing to let someone poke around in your >database, or perhaps if the DB is not too large, tar it all up to >send to someone for analysis? > Sure. Who should I send the tarball to (~19MB the whole data/ directory)? Btw, vacuumdb does not work either (it used to work though before this problem occured): ERROR: missing chunk number 1 for toast value 797979 vacuumdb: vacuum xxxxx-db failed Best wishes, Neophytos
> > >Bizarre. Evidently there's something broken about the TOAST data for >your table. I am thinking that the toast table's index might be >corrupt, in which case you could probably recover by reindexing that >index. > That's it! Doing the following fixed the problem: ./oid2name -d xxxxx-db -t rss_channels Oid of table rss_channels from database "xxxxx-db": _______________________________ 796690 = rss_channels backend> reindex index pg_toast_796690_idx; I still don't know what caused this but I'll wait a bit and see how it goes -- this was not the first time I had this problem. In the past (couple of weeks), I used to drop and recreate the table to resolve this. Thanks for the help. Best wishes, Neophytos
On Mon, 5 Aug 2002, Neophytos Demetriou wrote: > > > > > >Bizarre. Evidently there's something broken about the TOAST data for > >your table. I am thinking that the toast table's index might be > >corrupt, in which case you could probably recover by reindexing that > >index. > > > > That's it! Doing the following fixed the problem: > > ./oid2name -d xxxxx-db -t rss_channels > Oid of table rss_channels from database "xxxxx-db": > _______________________________ > 796690 = rss_channels > > backend> reindex index pg_toast_796690_idx; > > I still don't know what caused this but I'll wait a bit and see how it > goes -- this was not the first time I had this problem. In the past > (couple of weeks), I used to drop and recreate the table to resolve this. > If you can, run memtest86 on the machine for 24 hours. Probably a hardware memory problem. Gavin
> > >If you can, run memtest86 on the machine for 24 hours. Probably a hardware >memory problem. > After running memtest86 for two hours, it revealed nothing. I'm gonna try it again this evening and let it run until tomorrow morning. Best regards, Neophytos
Neophytos Demetriou <k2pts@cytanet.com.cy> writes: > Sure. Who should I send the tarball to (~19MB the whole data/ directory)? Me. Please shut down the postmaster before you tar up the data/ directory. regards, tom lane
> > >>Sure. Who should I send the tarball to (~19MB the whole data/ directory)? >> > >Me. Please shut down the postmaster before you tar up the data/ >directory. > Ok, sent. Note that it occured again after fixing it this morning. Best wishes, Neophytos
On Mon, 5 Aug 2002, Neophytos Demetriou wrote: > > > > > >If you can, run memtest86 on the machine for 24 hours. Probably a hardware > >memory problem. > > > > After running memtest86 for two hours, it revealed nothing. I'm gonna > try it again this evening and let it run until tomorrow morning. Also check your drive subsystem for bad blocks. man badblocks in linux, not sure what program for other Oses. Postgresql is good, but it can't make up for bad drives or memory. :-)