Discussion: FATAL: semctl(1672698088, 12, SETVAL, 0) failed
I encountered an error during a fast shutdown of 8.1.1 on Win2k:
FATAL: semctl(1672698088, 12, SETVAL, 0) failed: A blocking operation
was interrupted by a call to WSACancelBlockingCall.
A similar error on 8.1/win2003 was reported on pgsql-general (sorry, I can't
dig out the original post from our web archives):
From: Niederland
Date: Tues, Dec 13 2005 9:49 am
2005-12-12 20:30:00 FATAL: semctl(50884184, 15, SETVAL, 0) failed: A
non-blocking socket operation could not be completed immediately.
---
There are two problems here:
(1) Why a socket error?
In port/win32.h, we have
#undef EAGAIN
#undef EINTR
#define EINTR WSAEINTR
#define EAGAIN WSAEWOULDBLOCK
What's the rationale of doing so?
(2) What's happened here?
It may come from PGSemaphoreReset(), and win32 semop() looks like this:
    ret = WaitForMultipleObjectsEx(2, wh, FALSE,
            (sops[0].sem_flg & IPC_NOWAIT) ? 0 : INFINITE, TRUE);
    ...
    else if (ret == WAIT_OBJECT_0 + 1 || ret == WAIT_IO_COMPLETION)
    {
        pgwin32_dispatch_queued_signals();
        errno = EINTR;
    }
    else if (ret == WAIT_TIMEOUT)
        errno = EAGAIN;
So it seems the EINTR is caused by an incoming signal and the EAGAIN by a
timeout ... any ideas?
Regards,
Qingqing
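
A minimal, portable sketch of the chain described in the message above, for readers who want to see why a semaphore failure ends up with a socket-flavored message. It assumes the documented Winsock values for WSAEINTR (10004) and WSAEWOULDBLOCK (10035). Every `demo_`/`DEMO_` name is hypothetical, the wait outcomes are simplified stand-ins for the WaitForMultipleObjectsEx() return values, and `demo_error_text()` only imitates the system message lookup using the two messages quoted in the reports. It is not the PostgreSQL source.

```c
#include <assert.h>

/* Winsock error numbers (documented values). The DEMO_ prefix marks every
 * name in this sketch as hypothetical, not PostgreSQL's own. */
#define DEMO_WSAEINTR       10004
#define DEMO_WSAEWOULDBLOCK 10035

/* Mirrors the effect of the port/win32.h redefinitions quoted above. */
#define DEMO_EINTR  DEMO_WSAEINTR
#define DEMO_EAGAIN DEMO_WSAEWOULDBLOCK

/* Simplified outcomes of the wait inside the win32 semop() emulation. */
typedef enum
{
    DEMO_WAIT_ACQUIRED,     /* the semaphore handle was signaled */
    DEMO_WAIT_SIGNAL,       /* signal event fired or an APC completed */
    DEMO_WAIT_TIMEOUT       /* IPC_NOWAIT was set and the wait timed out */
} DemoWaitResult;

static int demo_errno = 0;  /* stand-in for errno */

/* Returns 0 on success, or -1 with demo_errno set, like semop(). */
static int
demo_semop_outcome(DemoWaitResult ret)
{
    switch (ret)
    {
        case DEMO_WAIT_ACQUIRED:
            return 0;
        case DEMO_WAIT_SIGNAL:
            demo_errno = DEMO_EINTR;    /* really the Winsock code 10004 */
            return -1;
        case DEMO_WAIT_TIMEOUT:
            demo_errno = DEMO_EAGAIN;   /* really the Winsock code 10035 */
            return -1;
    }
    demo_errno = -1;                    /* unexpected outcome */
    return -1;
}

/* Stand-in for the system message lookup used when the error is logged. */
static const char *
demo_error_text(int code)
{
    assert(code != 0);                  /* 0 means "no error" */
    switch (code)
    {
        case DEMO_WSAEINTR:
            return "A blocking operation was interrupted by a call to "
                   "WSACancelBlockingCall.";
        case DEMO_WSAEWOULDBLOCK:
            return "A non-blocking socket operation could not be "
                   "completed immediately.";
        default:
            return "unrecognized error";
    }
}
```

Calling `demo_semop_outcome(DEMO_WAIT_SIGNAL)` and then `demo_error_text(demo_errno)` yields the WSACancelBlockingCall text from the first report, and `DEMO_WAIT_TIMEOUT` yields the "non-blocking socket operation" text from the second, even though no socket is involved.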
Qingqing Zhou wrote:
> I encountered an error when I fast shutdown 8.1.1 on Win2k:
>
> FATAL: semctl(1672698088, 12, SETVAL, 0) failed: A blocking operation
> was interrupted by a call to WSACancelBlockingCall.
>
> A similar error on 8.1/win2003 was reported on pgsql-general (sorry, I can't
> dig out the
> original post from our web archives):
>
> From: Niederland
> Date: Tues, Dec 13 2005 9:49 am
>
> 2005-12-12 20:30:00 FATAL: semctl(50884184, 15, SETVAL, 0) failed: A
> non-blocking socket operation could not be completed immediately.
>
> ---
>
> There are two problems here:
>
> (1) Why a socket error?
> In port/win32.h, we have
>
> #undef EAGAIN
> #undef EINTR
> #define EINTR WSAEINTR
> #define EAGAIN WSAEWOULDBLOCK
>
> What's the rationale of doing so?
We did this so that our code could refer to EINTR/EAGAIN without
port-specific tests.
> (2) What's happened here?
> It may come from PGSemaphoreReset(), and win32 semop() looks like this:
>
> ret = WaitForMultipleObjectsEx(2, wh, FALSE, (sops[0].sem_flg &
> IPC_NOWAIT) ? 0 : INFINITE, TRUE);
> ...
> else if (ret == WAIT_OBJECT_0 + 1 || ret == WAIT_IO_COMPLETION)
> {
> pgwin32_dispatch_queued_signals();
> errno = EINTR;
> }
> else if (ret == WAIT_TIMEOUT)
> errno = EAGAIN;
>
> So it seems the EINTR is caused by an incoming signal, the EAGAIN is caused
> by a TIMEOUT ... any ideas?
I looked at the documentation for the function:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/waitformultipleobjectsex.asp
and it isn't clear which failure values it can return. We certainly
could loop on WSAEINTR. Can you test it?
--
Bruce Momjian http://candle.pha.pa.us
SRA OSS, Inc. http://www.sraoss.com
+ If your life is a hard drive, Christ can be your backup. +
"Bruce Momjian" <pgman@candle.pha.pa.us> wrote
> > In port/win32.h, we have
> >
> > #undef EAGAIN
> > #undef EINTR
> > #define EINTR WSAEINTR
> > #define EAGAIN WSAEWOULDBLOCK
> >
> > What's the rationale of doing so?
>
> We did this so that our code could refer to EINTR/EAGAIN without
> port-specific tests.
>
AFAICS, by doing so, EINTR/EAGAIN will be translated into
WSAEINTR/WSAEWOULDBLOCK throughout *all* the backend code. That seems
inappropriate for code that does not involve any socket stuff ... I think we
need a fix here.
> > (2) What's happened here?
> > It may come from PGSemaphoreReset(), and win32 semop() looks like this:
> >
> > ret = WaitForMultipleObjectsEx(2, wh, FALSE, (sops[0].sem_flg &
> > IPC_NOWAIT) ? 0 : INFINITE, TRUE);
> > ...
> > else if (ret == WAIT_OBJECT_0 + 1 || ret == WAIT_IO_COMPLETION)
> > {
> > pgwin32_dispatch_queued_signals();
> > errno = EINTR;
> > }
> > else if (ret == WAIT_TIMEOUT)
> > errno = EAGAIN;
> >
> > So it seems the EINTR is caused by an incoming signal, the EAGAIN is caused
> > by a TIMEOUT ... any ideas?
>
> I looked at the documentation for the function:
>
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/waitformultipleobjectsex.asp
>
> and it isn't clear what return failure values it has. We certainly
> could loop on WSAEINTR. Can you test it?
>
Yeah, looking at other code that uses semop(), we could add a loop to the
win32 semctl():
/* Quickly lock/unlock the semaphore (if we can) */
+ do
+ {
+     errStatus = semop(semId, &sops, 1);
+ } while (errStatus < 0 && errno == EINTR);
- if (semop(semId, &sops, 1) < 0)
+ if (errStatus < 0)
return -1;
But:
(1) The EINTR problem happens rather rarely, so testing it is difficult;
(2) I would rather not make the above change before we understand what
happened here, especially since we have also seen an EAGAIN reported.
Regards,
Qingqing
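
For reference, the retry-on-EINTR idea discussed in the message above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the proposed patch: `DemoSem`, `demo_semop()` and `demo_semop_retry()` are hypothetical names, and `demo_semop()` is a fake that reports EINTR a fixed number of times before succeeding, so the loop can be exercised without a real semaphore or a real signal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical fake semaphore: fails with EINTR a fixed number of times. */
typedef struct
{
    int interrupts_left;    /* how many more calls will report EINTR */
    int calls;              /* total number of demo_semop() calls so far */
} DemoSem;

/* Fake semop(): returns -1 with errno = EINTR while interrupts remain. */
static int
demo_semop(DemoSem *sem)
{
    assert(sem != NULL);
    sem->calls++;
    if (sem->interrupts_left > 0)
    {
        sem->interrupts_left--;
        errno = EINTR;
        return -1;
    }
    return 0;
}

/* Retry only while the failure is EINTR; any other error is returned. */
static int
demo_semop_retry(DemoSem *sem)
{
    int status;

    do
    {
        status = demo_semop(sem);
    } while (status < 0 && errno == EINTR);

    return status;
}
```

With `DemoSem sem = {3, 0};`, a call to `demo_semop_retry(&sem)` keeps calling the fake until the interruptions are used up and then returns its success value. Note that a loop of this shape retries only on EINTR, so it would not by itself explain or hide the EAGAIN case reported on pgsql-general.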
Qingqing Zhou wrote:
>
> "Bruce Momjian" <pgman@candle.pha.pa.us> wrote
> > > In port/win32.h, we have
> > >
> > > #undef EAGAIN
> > > #undef EINTR
> > > #define EINTR WSAEINTR
> > > #define EAGAIN WSAEWOULDBLOCK
> > >
> > > What's the rationale of doing so?
> >
> > We did this so that our code could refer to EINTR/EAGAIN without
> > port-specific tests.
> >
>
> AFAICS, by doing so, the EINTR/EAGAIN will be translated into
> WSAINTR/WSAEWOULDBLOCK through *all* the backend code. That's seems not
> appropriate for the code not involving any socket stuff ... I think we need
> a fix here.
Uh, how do we handle it now? I thought we did just that.
> > http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/waitformultipleobjectsex.asp
> >
> > and it isn't clear what return failure values it has. We certainly
> > could loop on WSAEINTR. Can you test it?
> >
>
> Yeah, looking at other code of using semop(), we could plug in a loop in the
> win32 semctl():
>
> /* Quickly lock/unlock the semaphore (if we can) */
> + do
> + {
> + errStatus = semop(semId, &sops, 1);
> + } while (errStatus < 0 && errno == EINTR);
>
> if (semop(semId, &sops, 1) < 0)
> return -1;
>
> But:
> (1) The EINTR problem happens rather rare, so testing it is difficult;
> (2) I would rather not doing the above changes before we understand what's
> happened here, especially when we have seen a EAGAIN reported here.
OK, so how do we find the answer?
--
Bruce Momjian http://candle.pha.pa.us
SRA OSS, Inc. http://www.sraoss.com
+ If your life is a hard drive, Christ can be your backup. +
On Tue, 28 Feb 2006, Bruce Momjian wrote:
>
> Uh, how do we handle it now? I thought we did just that.
>
> OK, so how do we find the answer?
>
For both problems, I am uncertain (or I've sent a patch already :-(). Call
more artillery support here ...
Regards,
Qingqing
Thread added to TODO.detail for Win32:
o Check WSACancelBlockingCall() for interrupts (win32intr)
---------------------------------------------------------------------------
Qingqing Zhou wrote:
>
>
> On Tue, 28 Feb 2006, Bruce Momjian wrote:
>
> >
> > Uh, how do we handle it now? I thought we did just that.
> >
> > OK, so how do we find the answer?
> >
>
> For both problems, I am uncertain (or I've sent a patch already :-(). Call
> more artillery support here ...
>
> Regards,
> Qingqing
--
Bruce Momjian http://candle.pha.pa.us
SRA OSS, Inc. http://www.sraoss.com
+ If your life is a hard drive, Christ can be your backup. +