Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram

From: Nikhil Sontakke
Subject: Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram
Date:
Message-ID: a301bfd90908030737p6c1eba91u8bb413647ce18679@mail.gmail.com
In reply to: Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram (Magnus Hagander <magnus@hagander.net>)
Responses: Re: BUG #4958: Stats collector hung on WaitForMultipleObjectsEx while attempting to recv a datagram (Luke Koops <luke.koops@entrust.com>)
List: pgsql-bugs

Hi,

>>>>>
>>>>> Maybe. I'm unsure if it's enough to just try another
>>>>> WaitForSingleObjectEx() on it, or if we need to actually issue a
>>>>> WSARecv() on it as well. Maybe it would be enough to just change the
>>>>> INFINITE on line 318 of socket.c to a fixed value. That will loop down
>>>>> to WSARecv() which should exit with WSAEWOULDBLOCK which will cause us
>>>>> to do a short sleep and come back. But we'd have to change the limit
>>>>> of 5 somehow then, since in theory we should wait forever. Maybe that
>>>>> outer loop should just be a for(;;), what do you think?
>>>>>
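
The idea in the paragraph above can be sketched roughly as follows. This is not the code in socket.c: it is a POSIX analogy in which poll() with a finite timeout stands in for the bounded WaitForSingleObjectEx() call, and EAGAIN/EWOULDBLOCK from recv() stands in for WSAEWOULDBLOCK from WSARecv(). The names recv_retry_forever, set_nonblocking and WAIT_SLICE_MS are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fixed timeout that replaces an infinite wait. */
#define WAIT_SLICE_MS 100

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Receive one datagram from a non-blocking socket. We wait as long as it
 * takes, but no single wait blocks for more than WAIT_SLICE_MS, so a missed
 * readiness notification costs at most one slice instead of hanging forever.
 * Returns the number of bytes received, or -1 on a real error (errno set).
 */
ssize_t
recv_retry_forever(int sock, void *buf, size_t len)
{
    assert(buf != NULL);

    for (;;)                    /* no retry cap: in theory we wait forever */
    {
        ssize_t n = recv(sock, buf, len, 0);

        if (n >= 0)
            return n;           /* got a datagram */
        if (errno == EINTR)
            continue;           /* interrupted, just retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;          /* genuine failure */

        /* Nothing to read yet: bounded wait, then go back to recv(). */
        struct pollfd pfd = { .fd = sock, .events = POLLIN };

        (void) poll(&pfd, 1, WAIT_SLICE_MS);
    }
}
```

A caller would make the socket non-blocking once with set_nonblocking(sock) and then call recv_retry_forever(sock, buf, sizeof(buf)) for each datagram. The point of the design is that the result of the wait is never trusted on its own: every pass re-checks the socket with recv(), whether the wait reported readiness or simply timed out.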
>>>>
>>>> Yes, line 318 seems to be a much better location to me. If Windows and
>>>> its socket logic behave properly most of the time :), most of the
>>>> calls should come out within the first few tries, so changing 5 to an
>>>> infinite loop shouldn't hurt those normal use cases in theory.
>>>>
>>>> OTOH, I was wondering what if we kill the stats collector and on a
>>>> restart the socket communication resumes properly. Would that
>>>> conclusively mean that it is a flaw in our code?
>>>
>>> No, if we kill the stats collector that will destroy all sockets, and
>>> when the new one starts all the sockets it operates on are fresh and
>>> new. So it doesn't show that the flaw is in our code - but it also
>>> doesn't show that it's in the kernel or runtime libraries.
>>>
>>
>> AFAICS in the code, the inherited pgStatSock socket FD remains the
>> same across the restart of the stats collector process...
>
> Partially correct, I think.
>
> Each backend has its own handle on win32, since we use EXEC_BACKEND
> (this includes the "utility processes" like the stats collector). When
> we start the new one, we are going to use DuplicateHandle() in
> save_backend_variables(). This will therefore give it a new handle, but
> they are both pointing to the same kernel object. I don't know if
> WaitForMultipleObjectsEx() is going to see these as two different
> objects or not, but I think it does.
>
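
For readers who do not have the Win32 handle model in their head, here is a small POSIX analogy of "a new handle pointing to the same kernel object". It uses dup() rather than DuplicateHandle(), so it says nothing about how WaitForMultipleObjectsEx() treats duplicated handles; it only shows that two distinct descriptors can share one underlying object. The function name dup_shares_kernel_object is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Duplicate fd, set O_NONBLOCK through the duplicate, and check whether the
 * change is visible through the original descriptor.
 * Returns 1 if the flag is shared, 0 if not, -1 on error.
 * The original flags are restored before returning.
 */
int
dup_shares_kernel_object(int fd)
{
    int copy;
    int before;
    int result = -1;

    assert(fd >= 0);

    copy = dup(fd);             /* new descriptor number, same open file */
    if (copy < 0)
        return -1;

    before = fcntl(fd, F_GETFL, 0);
    if (before >= 0 && fcntl(copy, F_SETFL, before | O_NONBLOCK) == 0)
    {
        int after = fcntl(fd, F_GETFL, 0);

        if (after >= 0)
            result = (after & O_NONBLOCK) ? 1 : 0;

        /* Put the status flags back the way we found them. */
        (void) fcntl(fd, F_SETFL, before);
    }

    close(copy);                /* closing the copy leaves fd usable */
    return result;
}
```

On a POSIX system the file status flags live in the shared open file description, so the flag set through the copy shows up on the original even though the two descriptor numbers differ.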

Hmm, got it. Nothing like adding more confusion into the mix :)

Regards,
Nikhils
--
http://www.enterprisedb.com
