Tony Wasson wrote:
> On 5/16/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> "Tony Wasson" <ajwasson@gmail.com> writes:
>> > When I saw the same error as you, the stats collector process was
>> > missing.
>>
>> The collector, or the buffer process? The reported message would be
>> emitted by the buffer process, after which it would immediately exit.
>> (The collector would go away too once it noticed EOF on its input.)
>> By and by the postmaster should start a fresh pair of processes.
>
>
> The stats collector was dead and would not respawn. Our options seemed
> limited to restarting postmaster or ignoring the error.
>
> Here was what the process list looked like:
>
> kangaroo:~ twasson$ ps waux | grep post
> pgsql     574  0.0 -0.0  460104    832  p0  S   Wed06AM 10:26.98 /usr/local/pgsql/bin/postmaster -D /Volumes/Vol0/pgsql-data
> pgsql     578  0.0 -5.2  460356 108620  p0  S   Wed06AM 27:43.68 postgres: writer process
> twasson 23844  0.0 -0.0   18172    688 std  S+  10:05AM  0:00.01 grep post

That is what I recalled, also, though I wasn't meticulous enough to hang
onto the process list.

>> IIRC, the postmaster's spawning is rate-limited to once a minute,
>> so if the new buffer were immediately dying with the same error,
>> that would explain your observation of once-a-minute messages.
>>
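
To make the once-a-minute behaviour concrete, here is a rough sketch of
that kind of rate limit. It is written in Python for brevity and is not
the postmaster's actual code; the RespawnThrottle class, its method name,
and the 60-second constant are all my own assumptions, taken only from
the "once a minute" recollection above.

```python
import time

RESPAWN_INTERVAL = 60.0  # seconds; assumed from the "once a minute" remark


class RespawnThrottle:
    """Hypothetical helper: allow a respawn attempt at most once per interval."""

    def __init__(self, interval=RESPAWN_INTERVAL, clock=time.monotonic):
        self.interval = interval
        self.clock = clock          # injectable so the logic can be tested
        self.last_attempt = None    # no attempt has been made yet

    def may_respawn(self):
        now = self.clock()
        if self.last_attempt is not None and now - self.last_attempt < self.interval:
            return False            # too soon since the last attempt
        self.last_attempt = now     # record this attempt, even if the child dies
        return True
```

If the freshly started process dies at once with the same error, a
throttle like this would yield exactly one attempt, and so one log
message, per interval.
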
>> This all still leaves us no closer to understanding *why* the recv()
>> is failing, though. What it does suggest is that the problem is a
>> hard, repeatable error when it does occur, which makes me loath to
>> put in the quick-fix "retry on EAGAIN" that I previously suggested.
>> If it is a hard error then that will just convert the problem into
>> a busy-loop that'll eat all your CPU cycles ... not much of an
>> improvement ...
>>
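
On the busy-loop worry: one possible middle ground would be a retry that
is bounded and backs off, so a hard error cannot spin forever. Below is a
minimal sketch of that idea in Python rather than in the backend's C. The
function name, the attempt limit, and the delay are hypothetical values
of mine, not anything from the PostgreSQL source.

```python
import socket
import time


def recv_with_bounded_retry(sock, bufsize, max_attempts=5, delay=0.01):
    """Try a non-blocking recv() a limited number of times.

    Returns the received bytes, or None if every attempt reported that
    no data was available (EAGAIN / EWOULDBLOCK). Any other OSError
    propagates to the caller as a hard error.
    """
    for attempt in range(max_attempts):
        try:
            return sock.recv(bufsize)
        except BlockingIOError:
            # Nothing to read right now; sleep briefly instead of spinning,
            # unless this was the final attempt.
            if attempt + 1 < max_attempts:
                time.sleep(delay)
    return None
```

With a cap like this, a persistent failure ends up back at the caller
after a few attempts, where it can be logged and handled as before,
rather than eating CPU indefinitely.
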