Thread: URGENT - HELP WITH 7.3.4 Strangeness
I am in need of some help. We are seeing a lot of very strange behavior with our PostgreSQL databases. It appears that we are suddenly suffering from possible index corruption or other db problems.

Here is what is happening:

1. Last Sunday, ran a reindex on all databases across our 3 PostgreSQL servers. This was done using the reindexdb script from the contrib directory. The reindex was prompted by strangeness on db server 1.
2. Wednesday, our application suddenly starts having data issues. It issues selects to the db and does not display the rows. Nothing on the application has changed. This is happening across all 3 database servers. (The app runs on Windows and connects via a Linux "app" server, which uses the Linux ODBC driver from gborg.postgresql.org to connect to Postgres.)
3. Pulling queries from the db logs and running them via psql returns rows.
4. Wednesday night. Moved all dbs from the initial problem database server to a new db server. Moved another problem db to the new server. The new dbs run fine.
5. Wednesday night. Dumped and reloaded several problem dbs on problem servers 2 and 3.
6. Thursday was relatively quiet.
7. Friday, problems reappeared on servers 2 and 3.
8. Friday night. Shut down and powered off db servers 2 and 3. Ran fsck on all drives. Ran reindexdb across all dbs and ran vacuum analyze.
9. Saturday morning - life is good.
10. Sunday night, checked the dbs to verify things are good. DB server 2 is exhibiting problems, db server 3 is fine.
11. Sunday night - rebooted db server 2. Everything is fine.
12. Monday (today) - apparent index corruption on new db server 1. The row is in the db table, but a query using the index will not pull it. It is a small database, so reindexed it.

In summary:

The system runs fine after a reboot, but then appears to start having index issues. We are able to query the db and see the data if we force a seq scan, but an index scan returns random results. Nothing has changed application or db wise.

All servers:
Dell PowerEdge 2650
RedHat ES 2.1
Postgres 7.3.4 (with perl)
DBs on Dell PowerVaults using RAID 5

I really need some help solving this. I have never seen anything like this before. Does anyone have any ideas?

Thanks,

Chris
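The seq scan versus index scan symptom described above can be checked in a repeatable way. What follows is only a rough sketch, not something posted in the thread: it builds the psql statements for running one lookup under both plan types, and the table, column and value names are hypothetical placeholders. It assumes the standard planner settings `enable_seqscan` and `enable_indexscan`, which only discourage a plan type rather than forbid it, so `EXPLAIN` is included to confirm which plan was really used.

```python
# Sketch only: the table, column and value below are hypothetical placeholders.
# enable_seqscan / enable_indexscan are PostgreSQL planner settings; setting one
# to off only discourages that plan type, so EXPLAIN is included to confirm
# which plan was actually chosen.

def scan_comparison_sql(table, column, value_literal):
    """Build psql statements that run one lookup under both plan types."""
    query = f"SELECT count(*) FROM {table} WHERE {column} = {value_literal};"
    return [
        "SET enable_indexscan = off;",   # first pass: prefer a sequential scan
        "SET enable_seqscan = on;",
        "EXPLAIN " + query,
        query,
        "SET enable_indexscan = on;",    # second pass: prefer an index scan
        "SET enable_seqscan = off;",
        "EXPLAIN " + query,
        query,
        "SET enable_seqscan = on;",      # restore the default for the session
    ]


def scans_disagree(seq_count, idx_count):
    """True when the two runs returned different row counts."""
    return seq_count != idx_count


if __name__ == "__main__":
    print("\n".join(scan_comparison_sql("orders", "order_id", "12345")))
```

The printed statements can be pasted into a psql session. If the two counts differ for the same key, that points at the index rather than the table data, which matches the behavior reported here.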
Chris,

I feel your pain. It is odd that all the servers show the same problems, but it still smells like a memory problem. Was anything changed just prior to your problems? Had everything been running fine on this version of the OS and Postgres before this happened?

Another idea is that you are moving corrupted data, and once it gets used some, it starts messing things up, so the problems follow the data. Did you see any strangeness when moving the data?

Naomi
Chris,

> Here is what is happening:
>
> 1. Last Sunday, ran a reindex on all databases across our 3 postgresql
> servers. This was done using the reindexdb script from the contrib
> directory. The reindex was prompted due to strangeness on db server 1.

From going through the full text, I understand that the db server 1 database is also on all the other servers. If this is true, then db server 1 has some kind of hardware issue, which leads to database corruption, and the same data was then loaded onto the other servers.

Check the end posts of this thread:
http://archives.postgresql.org/pgsql-general/2003-11/msg01511.php

--
With Best Regards,
Vishal Kashyap
Actually, we have 3 distinct db servers, each with separate dbs on them. There are between 20 and 170 dbs per server.

I actually moved a db from db server 3 to the new db server and it worked fine on the new db server. However, when we went to move it back to either db server 2 or 3, the problem reappeared. I think that since it worked fine after the move to the new db server, corrupt data is ruled out as the cause.

Is there a good way to test for a memory issue without bringing the servers down? Of course, the db servers are all production, so we don't have large windows to work with.

Thanks,
Chris
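One way to back up the reasoning that the data itself survives a move is to checksum the dump file on the source server and again on the destination before reloading it. This is a minimal sketch under that assumption, not something from the thread: the file path is a hypothetical placeholder, and it only shows that the dump arrived byte-for-byte intact. It says nothing about memory or index problems on the server.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dumps_match(source_digest, destination_digest):
    """True when the digests taken on both servers are identical."""
    return source_digest == destination_digest


if __name__ == "__main__":
    # Hypothetical path; run on each server and compare the printed values.
    print(file_sha256("/tmp/problem_db.dump"))
```

If the digests match on both ends and the reloaded db still misbehaves only on servers 2 and 3, that is further evidence that the problem follows the hardware and not the data.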