"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> Unlike other spinlocks, the io_in_progress spinlock is a per-bufpage
> spinlock, and ProcReleaseSpins() doesn't release it.
> If an error (in md.c in most cases) occurred while holding the spinlock,
> the spinlock would necessarily freeze.
Oooh, good point. Shouldn't this be fixed? If we don't fix it, then
a disk I/O error will translate to an installation-wide shutdown and
restart as soon as some backend tries to touch the locked page (as
indeed was happening to Michael). That seems a tad extreme.
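
For illustration only, here is a minimal sketch of the sort of fix that
would be needed, assuming the backend remembers which buffer's I/O lock
it holds so that error recovery can release it. Every name below
(DemoBuffer, demo_start_io, demo_abort_io, and so on) is hypothetical
and not taken from the actual buffer manager; a plain flag stands in for
the spinlock and longjmp stands in for elog(ERROR).

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared buffer header (not the real one). */
typedef struct DemoBuffer
{
    int  block;
    bool io_in_progress;    /* stands in for the per-buffer spinlock */
} DemoBuffer;

/* Buffer whose I/O lock this backend currently holds, if any. */
static DemoBuffer *demo_in_progress_buf = NULL;
static jmp_buf     demo_error_context;

static void
demo_start_io(DemoBuffer *buf)
{
    /* Assumption: a backend performs at most one buffer I/O at a time. */
    assert(demo_in_progress_buf == NULL);
    buf->io_in_progress = true;
    demo_in_progress_buf = buf;
}

static void
demo_end_io(DemoBuffer *buf)
{
    buf->io_in_progress = false;
    demo_in_progress_buf = NULL;
}

/* Error-recovery hook: release the lock that a failed I/O left behind. */
static void
demo_abort_io(void)
{
    if (demo_in_progress_buf != NULL)
        demo_end_io(demo_in_progress_buf);
}

/* Simulated read: a failure jumps out without unlocking, like elog(ERROR). */
static void
demo_read_block(DemoBuffer *buf, bool fail)
{
    demo_start_io(buf);
    if (fail)
        longjmp(demo_error_context, 1);
    demo_end_io(buf);
}

/* Returns true if the read succeeded; on error, optionally runs cleanup. */
static bool
demo_try_read(DemoBuffer *buf, bool fail, bool cleanup_on_error)
{
    if (setjmp(demo_error_context) != 0)
    {
        if (cleanup_on_error)
            demo_abort_io();
        return false;
    }
    demo_read_block(buf, fail);
    return true;
}
```

With cleanup_on_error false, a failed demo_try_read() leaves
io_in_progress set, which is the stuck state described above; with it
true, the flag is cleared before control returns to the caller.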
> Michael Simms says
> ERROR: cannot read block 641 of server
> occurred before the spinlock stuck abort.
> It is probably the original cause of the spinlock freeze.
I seem to have missed the message containing that bit of info,
but it certainly suggests that your diagnosis is correct.
> However, I don't understand the following status of his machine.
> /dev/sda1 30356106785018642307 43892061535609608 0 100%
Now that we know the root problem was disk driver flakiness, I think
we can write that off as Not Our Fault ;-)
regards, tom lane