Re: RESOLVED: Explained by known hardware failures, or keep looking?
| From | Kevin Grittner |
|---|---|
| Subject | Re: RESOLVED: Explained by known hardware failures, or keep looking? |
| Date | |
| Msg-id | 4678F2C7.EE98.0025.0@wicourts.gov |
| In response to | Re: Explained by known hardware failures, or keep looking? (Tom Lane <tgl@sss.pgh.pa.us>) |
| List | pgsql-admin |
Thanks, all. Just an FYI to wrap up the thread.

>>> On Mon, Jun 18, 2007 at 3:25 PM, in message <4713.1182198324@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> I'm suspicious that either the controller
>> didn't persist dirty pages in the June 14th failure
>
> That's what it looks like to me --- it's hard to tell if the hardware or
> the filesystem is at fault, but one way or another some pages that were
> supposedly securely down on disk were wiped to zeroes. You should
> probably review whether the hardware is correctly reporting write-complete.

The hardware tech found many problems with this box. I may just give it
a heavy update load and pull both plugs to see if it comes up clean now.

The following was done:

Replaced 2 failed drives
Controller firmware updated
SCSI micro code updated
Performed Yast Online updates
Connected second power supply

Our newer boxes have monitoring software which alerts us before a box
gets into this bad a state.

-Kevin