Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram

From Nikhil Sontakke
Subject Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram
Date
Msg-id a301bfd90908030720m564e2372j4ef5257be15cabaf@mail.gmail.com
In reply to Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram  (Magnus Hagander <magnus@hagander.net>)
Responses Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram  (Magnus Hagander <magnus@hagander.net>)
List pgsql-bugs
Hi,

>>>>
>>>> And this fact should lend credence to Alvaro's (as well as my)
>>>> suspicion that this is a Windows kernel issue.
>>>>
>>>> As a consequence, Magnus, I was wondering whether a loop similar to
>>>> the WRITE handling, which waits with a fixed timeout on each pass
>>>> (rather than making an INFINITE call to WaitForMultipleObjectsEx)
>>>> inside the pgwin32_waitforsinglesocket() function, would help for the
>>>> READ case too? I believe Teodor Sigaev raised a similar concern a
>>>> while back:
>>>>
>>>> http://www.nabble.com/-GENERAL--Stats-collector-frozen--td8569977i20.html
>>>
>>> Maybe. I'm unsure if it's enough to just try another
>>> WaitForSingleObjectEx() on it, or if we need to actually issue a
>>> WSARecv() on it as well. Maybe it would be enough to just change the
>>> INFINITE on line 318 of socket.c to a fixed value. That will loop down
>>> to WSARecv() which should exit with WSAEWOULDBLOCK which will cause us
>>> to do a short sleep and come back. But we'd have to change the limit
>>> of 5 somehow then, since in theory we should wait forever. Maybe that
>>> outer loop should just be a for(;;), what do you think?
>>>
>>
>> Yes, line 318 seems to be a much better location to me. If Windows and
>> its socket logic behave properly most of the time :), most of the
>> calls should come out within the first few tries, so changing 5 to an
>> infinite loop shouldn't hurt those normal use cases in theory.
>>
>> OTOH, I was wondering what if we kill the stats collector and on a
>> restart the socket communication resumes properly. Would that
>> conclusively mean that it is a flaw in our code?
>
> No, if we kill the stats collector that will destroy all sockets, and
> when the new one starts all the sockets it operates on are fresh and
> new. So it doesn't show that the flaw is in our code - but it also
> doesn't show that it's in the kernel or runtime libraries.
>

AFAICS in the code, the inherited pgStatSock socket FD remains the
same across the restart of the stats collector process...

Regards,
Nikhils
--
http://www.enterprisedb.com
