Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jun 14, 2010 at 10:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That's a different question altogether ;-). I assume you're not
>> satisfied by the change Heikki committed a couple hours ago?
>> It will at least try to do something to recover.
> Yeah, I'm not satisfied by that. It's an improvement in the technical
> sense - it replaces an infinite retry that spins at top speed with a
> slower retry that won't flog your CPU quite so badly, but the chances
> that it will actually succeed in correcting the underlying problem
> seem infinitesimal.
I'm not sure about that. walreceiver will refetch from the start of the
current WAL page, so there's at least some chance of getting a good copy
when we didn't have one before.
However, I do agree that it's not helpful to loop forever. If we can
easily make it retry once and then PANIC, I'd be for that --- otherwise
I tend to agree that the best thing is just to PANIC immediately. There
are many many situations where a slave resync will be necessary; a
transmission error on the WAL data is just one more.
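
For illustration only, the "retry once and then give up" idea could be
sketched roughly as below. This is a minimal standalone C sketch, not
PostgreSQL code: PageResult, FetchPageFn, fetch_page_with_one_retry and the
FailFirstN stub are all hypothetical names, and the callback merely stands in
for "refetch from the start of the current WAL page and validate it". A real
caller would PANIC where this returns PAGE_GIVE_UP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes; these are not PostgreSQL identifiers. */
typedef enum { PAGE_OK, PAGE_GIVE_UP } PageResult;

/*
 * Hypothetical callback: refetch the current page from its start and
 * validate it. Returns true if a good copy was obtained.
 */
typedef bool (*FetchPageFn)(void *ctx);

/*
 * Try once, retry once, then give up instead of looping forever.
 * *attempts_out (if non-NULL) receives the number of attempts made.
 */
PageResult
fetch_page_with_one_retry(FetchPageFn fetch, void *ctx, int *attempts_out)
{
    const int max_attempts = 2;     /* initial try plus one retry */
    int       attempts = 0;

    assert(fetch != NULL);

    while (attempts < max_attempts)
    {
        attempts++;
        if (fetch(ctx))
        {
            if (attempts_out)
                *attempts_out = attempts;
            return PAGE_OK;
        }
    }

    if (attempts_out)
        *attempts_out = attempts;
    return PAGE_GIVE_UP;            /* caller would PANIC / force a resync */
}

/* Stub used only to exercise the sketch: fails the first N calls. */
typedef struct { int failures_left; int calls; } FailFirstN;

bool
fail_first_n(void *ctx)
{
    FailFirstN *s = (FailFirstN *) ctx;

    s->calls++;
    if (s->failures_left > 0)
    {
        s->failures_left--;
        return false;               /* simulated bad copy of the page */
    }
    return true;                    /* simulated good copy */
}
```

With the stub, a single transient failure is absorbed by the one retry, while
a persistent failure is reported after exactly two attempts rather than
spinning.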
regards, tom lane