Re: [HACKERS] Hang in pldebugger after git commit : 98a64d0

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: [HACKERS] Hang in pldebugger after git commit : 98a64d0
Дата
Msg-id CAA4eK1LPK=ONozMbsvP+B7PGG1JarPprsL2MJ6tGMVH_KpHV2g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Hang in pldebugger after git commit : 98a64d0  (Michael Paquier <michael.paquier@gmail.com>)
Список pgsql-hackers
On Thu, Dec 15, 2016 at 7:48 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Dec 15, 2016 at 1:18 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Dec 14, 2016 at 5:35 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>>> I have just read the patch, and hardcoding the array position for a
>>>> socket event to 0 assumes that the socket event will be *always* the
>>>> first one in the list. but that cannot be true all the time, any code
>>>> that does not register the socket event as the first one in the list
>>>> will likely break.
>>>
>>> I think your point is very valid and even i had similar thought in my
>>> mind when implementing it but as i mentioned upthread that's how it is
>>> being used in the current code base. Please check a function call to
>>> ModifyWaitEvent() in secure_read().
>>
>> That's kind of irrelevant. A WaitEventSet is a general-purpose
>> primitive, and it needs to work in all situations, current and future,
>> where a reasonable developer would expect it to work.  Sure, pq_init()
>> puts the socket in FeBeWaitSet at position 0.  However, in
>> WaitLatchOrSocket, the socket event, if required, is added after the
>> latch and postmaster-death events.  And new code written using
>> CreateWaitEventSet() / AddWaitEventToSet() in the future could put a
>> socket at any offset in the event array, or could add multiple
>> sockets.  So Michael is right: you can't just stick a hack into
>> WaitEventSetWait that assumes the socket is at offset 0 even if that
>> happens to fix the problem we are facing right at this moment.
>>
>> (I am not sure whether Michael's proposed placement is correct, but
>> I'm almost 100% sure the placement in the submitted patch is not.)
>

Robert, you are right.  As submitted, the placement is incorrect for
the reasons mentioned by you.  The intention was to do it in
secure_read() using FeBeWaitSet where we are sure it is at 0th
position.  Also, I think this is primarily required for windows, so
what we could instead do is introduce a new API ResetWaitEvent() which
will reset socket events passed to it.  This API can have port
specific reset functions something similar to what we have for
ModifyWaitEvent and for now we can have only Win32 specific function
as we are not aware if it is required for any other port.  Then call
this from secure_read before "goto retry;"

> What would seem like the good place is really WaitEventAdjustWin32()
> that gets called in ModifyWaitEvent() by secure_read().
>

Actually, we can't use or do this change in ModifyWaitEvent(), because
it is mainly a Set call, it won't allow you to reset it mainly due to
below check in this API.  Here, what we need is reset.

ModifyWaitEvent()
{
..
if (events == event->events &&
(!(event->events & WL_LATCH_SET) || set->latch == latch))
return;
..
}

If you refer the code prior to 9.6, we always reset the socket at end
of WaitLatchOrSocket for windows and the same cleanup is missing in
new code which is causing this issue.

/* Clean up the event object we created for the socket */
if (sockevent != WSA_INVALID_EVENT)
{
WSAEventSelect(sock, NULL, 0);
WSACloseEvent(sockevent);
}

Upthread Andres mentioned that it isn't okay to reset the events as
was done previously and I believe he is right with the minor exception
that we need it for some of the events like FD_READ.  So that's why I
think it is better to fix the problem for FD_READ.

> And
> WaitEventAdjustWin32() is the place where WSAEventSelect() gets called
> to adjust the flags FD_READ and FD_WRITE depending on the events
> WL_SOCKET_READABLE or WL_SOCKET_WRITEABLE.
>
> Amit, I have a question here. As far as I can read, secure_read() does
> set WL_SOCKET_READABLE as an event to wait for, so FD_READ is
> correctly being set by WSAEventSelect(). What the patch proposed is
> doing is making sure that *only* WL_SOCKET_READABLE is set in the
> event flags. Why is that necessary?
>

Oh no, the intention of the patch is to just reset the
WL_SOCKET_READABLE event flag after WaitEventSetWait() in
secure_read(), so that next time it calls ModifyWaitEvent(), it could
call WSAEventSelect() to re-enable the event.

> I can buy that this addresses the
> problem as this has been visibly tested, but I am unclear about why
> that's the correct thing to do. The patch clearly needs comments to
> clarify its position. Actually, I think that what is surprising in the
> solution proposed is actually that the event flags are reset *after*
> the while loop in WaitEventSetWait().
>

As mentioned above, the current patch doesn't do it correctly.

> I could understand something
> happening inside the loop if WSAEnumNetworkEvents() updates things on
> the fly though...
>

No, nothing wrong in that, it is just that for some of the events
(probably the events that are level triggered) we need to reenable
them by using WSAEventSelect before reuse.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: [HACKERS] Performance degradation in Bitmapscan (commit75ae538bc3168bf44475240d4e0487ee2f3bb376)
Следующее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] Hang in pldebugger after git commit : 98a64d0