Re: BUG #5929: ERROR: found toasted toast chunk for toast value 260340218 in pg_toast_260339342
От | Tambet Matiisen |
---|---|
Тема | Re: BUG #5929: ERROR: found toasted toast chunk for toast value 260340218 in pg_toast_260339342 |
Дата | |
Msg-id | 4D810A97.3000602@gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #5929: ERROR: found toasted toast chunk for toast value 260340218 in pg_toast_260339342 ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>) |
Список | pgsql-bugs |
On 16.03.2011 17:09, Kevin Grittner wrote: > Tambet Matiisen<tambet.matiisen@gmail.com> wrote: > >> Pre-live database is restored from live database dump every night. > > How is that done? A single pg_dump of the entire live database > restored using psql? Are both database servers at the same > PostgreSQL version? Yes, I use pg_dump on live server and the result is rdiff-backupped into development server. Whole SQL dump is 12G without compression and the rdiff delta is about 10-20MB every day. Then I drop pre-live database on development server and recreate it using createdb and psql. For a while development server was running 8.4 and live server 8.1. Now both are 8.4, but this shouldn't matter, as I do backup and restore via SQL. > >> So far the errors have been in pre-live database, > > You're running pg_dump against a database you just restored from a > pg_dump image? Hmm, yeah. This sounds rather dumb, but haven't got to that yet. Development server contains some additional databases as well, that do not exist on live server. > >> Usually the next day error was gone. I mostly blamed badly timed >> backup and restore scripts, although this shouldn't result in >> errors. > > No it shouldn't -- if you're following any of the documented backup > and restore techniques. I have a suspicion that you're just doing a > file copy without stopping the live database or properly following > the documented PITR backup and recovery techniques. No, I don't do any advanced backup tricks. Just plain pg_dump and psql. > > This time the error is not in pre-live database and therefore it >> doesn't go away. > > If I understand you, this sounds like corruption in the live > database; nothing on the pre-live database is part of causing this > problem. This would be the case when I do filesystem level copy, but I do not. > >> The server is also running [...] Samba [...] > > I hope you're not trusting Samba too far. For a while we were using > it in backups across our WAN, and it mangled at least one file > almost every day. We had to take to running md5sum against both > ends for each file to ensure we didn't get garbage (until we > converted everything to use TCP communications, which have never > mangled anything for us). As I said, I'm using rdiff-backup to transfer pure SQL files. > >> Both fsync and full_page_writes are on. > > Good. Without those an OS or hardware crash can corrupt your > database. Actually they are commented out, but I suppose this means "on". > >> OK, I don't have UPS for this machine, but power has been stable. >> Current uptime is 32 days, which I bet is from the last kernel >> update. > > OK. A power outage wouldn't be too likely to matter if you have > fsync and full_page_writes on. That's a relief :). > >> Currently I blame either faulty memory or faulty software RAID >> driver. I can easily eliminate the memory cause by running >> memtest86 for few hours > > Is this ECC memory? If not, even a good test doesn't prove that a > RAM problem didn't cause the corruption. It's not ECC memory. > >> Now, off to buy UPS... > > Not a bad idea, but it doesn't sound like lack of that is likely to > have caused the corruption in your live database, based on the > settings you mentioned. (Assuming those settings are in use on the > live server.) Checked live server, it has also fsync=on and full_page_writes=on. But it shouldn't matter, because backup of live server doesn't give any errors. It is possible, that restore of pre-live database using psql lasts so long, that backup of the same database using pg_dump is already kicking in. But again, this shouldn't matter and it doesn't explain why the last error is in another database, that hasn't changed for months. Now I have to find time to run memtest. Tambet
В списке pgsql-bugs по дате отправления:
Предыдущее
От: Euler Taveira de OliveiraДата:
Сообщение: Re: BUG #5934: postgresql.conf: optional equal sign