Обсуждение: RE: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )

Поиск
Список
Период
Сортировка

RE: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )

От
Horak Daniel
Дата:
> In any case, when one backend quits and another one is 
> started, the new
> one will re-use the semaphore no longer used by the defunct backend.

I have tested my solution a bit more and I have to say that reusing a
semaphore by a new backend works OK. But it is not possible for a newly
created backend to use a semaphore allocated by postmaster (it freezes on
test if the semaphore with given key already exists - done with
semId=semget(semKey, 0, 0) in function IpcSemaphoreCreate() in
storage/ipc/ipc.c ). Why it is, I don't know, but it seems that my solution
uses the ipc library in the right way. There are no longer any error
messages from the ipc library when running the server. And I can't say that
the ipc library is a 100% correct implementation of SysV IPC, it is probably
(sure ;-) )caused by the Windows internals.
        Dan


Re: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )

От
Tom Lane
Дата:
Horak Daniel <horak@mmp.plzen-city.cz> writes:
> I have tested my solution a bit more and I have to say that reusing a
> semaphore by a new backend works OK. But it is not possible for a newly
> created backend to use a semaphore allocated by postmaster (it freezes on
> test if the semaphore with given key already exists - done with
> semId=semget(semKey, 0, 0) in function IpcSemaphoreCreate() in
> storage/ipc/ipc.c ). Why it is, I don't know, but it seems that my solution
> uses the ipc library in the right way.

It seems that you have found a bug in the cygipc library.  I suggest
reporting it to the author of same...
        regards, tom lane


Re: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )

От
Bruce Momjian
Дата:
[Charset iso-8859-1 unsupported, filtering to ASCII...]
> > In any case, when one backend quits and another one is 
> > started, the new
> > one will re-use the semaphore no longer used by the defunct backend.
> 
> I have tested my solution a bit more and I have to say that reusing a
> semaphore by a new backend works OK. But it is not possible for a newly
> created backend to use a semaphore allocated by postmaster (it freezes on
> test if the semaphore with given key already exists - done with
> semId=semget(semKey, 0, 0) in function IpcSemaphoreCreate() in
> storage/ipc/ipc.c ). Why it is, I don't know, but it seems that my solution
> uses the ipc library in the right way. There are no longer any error
> messages from the ipc library when running the server. And I can't say that
> the ipc library is a 100% correct implementation of SysV IPC, it is probably
> (sure ;-) )caused by the Windows internals.

Seems we may have to use the patch, or make some other patch for NT-only
that works around this NT bug.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )

От
Bruce Momjian
Дата:
> Horak Daniel <horak@mmp.plzen-city.cz> writes:
> > I have tested my solution a bit more and I have to say that reusing a
> > semaphore by a new backend works OK. But it is not possible for a newly
> > created backend to use a semaphore allocated by postmaster (it freezes on
> > test if the semaphore with given key already exists - done with
> > semId=semget(semKey, 0, 0) in function IpcSemaphoreCreate() in
> > storage/ipc/ipc.c ). Why it is, I don't know, but it seems that my solution
> > uses the ipc library in the right way.
> 
> It seems that you have found a bug in the cygipc library.  I suggest
> reporting it to the author of same...

Yes, but can we expect all NT sites to get the patch before using
PostgreSQL?  Is there a workaround we can implement?

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )

От
Tom Lane
Дата:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
>> storage/ipc/ipc.c ). Why it is, I don't know, but it seems that my solution
>> uses the ipc library in the right way. There are no longer any error
>> messages from the ipc library when running the server. And I can't say that
>> the ipc library is a 100% correct implementation of SysV IPC, it is probably
>> (sure ;-) )caused by the Windows internals.

> Seems we may have to use the patch, or make some other patch for NT-only
> that works around this NT bug.

I don't have a problem with installing an NT patch (lord knows there
are plenty of #ifdef __CYGWIN32__'s in the code already).  But I have
a problem with *this* patch because I don't believe we understand what
it is doing, and therefore I have no confidence in it.  The extent of
our understanding so far is that one backend can create a semaphore that
can be used by a later backend, but the postmaster cannot create a
semaphore that can be used by a later backend.  I don't really believe
that; I think there is something else going on.  Until we understand
what the something else is, I don't think we have a trustworthy
solution.

The real reason I feel itchy about this is that I know that interprocess
synchronization is a very tricky area, so I'm not confident that the
limited amount of testing Dan can do by himself proves that things are
solid.  As the old saw goes, "testing cannot prove the absence of bugs".
I want to have both clean test results *and* an understanding of what
we are doing before I will feel comfortable.

Looking again at the code, it occurs to me that a backend exiting
normally will probably leave its semaphore set nonzero, which could
(given a buggy IPC library) have something to do with whether another
process can attach to the sema or not.  The postmaster code is *trying*
to create the semas with nonzero starting values, but I see that the
backend code takes the additional step of doing    semun.val = IpcSemaphoreDefaultStartValue;    semctl(semId, semNum,
SETVAL,semun);
 
whereas the postmaster code doesn't.  Maybe the create call isn't
initializing the semaphores the way it's told to?  It'd be worth
trying adding a step like this to the postmaster preallocation.

In any case, I'd really like us to get some feedback from the author of
cygipc about this issue.  I don't mind working around a bug once we
understand exactly what the bug is --- but in this particular area,
I think guessing our way to a workaround isn't good enough.
        regards, tom lane


Re: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )

От
Bruce Momjian
Дата:
Got it.


> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> >> storage/ipc/ipc.c ). Why it is, I don't know, but it seems that my solution
> >> uses the ipc library in the right way. There are no longer any error
> >> messages from the ipc library when running the server. And I can't say that
> >> the ipc library is a 100% correct implementation of SysV IPC, it is probably
> >> (sure ;-) )caused by the Windows internals.
> 
> > Seems we may have to use the patch, or make some other patch for NT-only
> > that works around this NT bug.
> 
> I don't have a problem with installing an NT patch (lord knows there
> are plenty of #ifdef __CYGWIN32__'s in the code already).  But I have
> a problem with *this* patch because I don't believe we understand what
> it is doing, and therefore I have no confidence in it.  The extent of
> our understanding so far is that one backend can create a semaphore that
> can be used by a later backend, but the postmaster cannot create a
> semaphore that can be used by a later backend.  I don't really believe
> that; I think there is something else going on.  Until we understand
> what the something else is, I don't think we have a trustworthy
> solution.
> 
> The real reason I feel itchy about this is that I know that interprocess
> synchronization is a very tricky area, so I'm not confident that the
> limited amount of testing Dan can do by himself proves that things are
> solid.  As the old saw goes, "testing cannot prove the absence of bugs".
> I want to have both clean test results *and* an understanding of what
> we are doing before I will feel comfortable.
> 
> Looking again at the code, it occurs to me that a backend exiting
> normally will probably leave its semaphore set nonzero, which could
> (given a buggy IPC library) have something to do with whether another
> process can attach to the sema or not.  The postmaster code is *trying*
> to create the semas with nonzero starting values, but I see that the
> backend code takes the additional step of doing
>         semun.val = IpcSemaphoreDefaultStartValue;
>         semctl(semId, semNum, SETVAL, semun);
> whereas the postmaster code doesn't.  Maybe the create call isn't
> initializing the semaphores the way it's told to?  It'd be worth
> trying adding a step like this to the postmaster preallocation.
> 
> In any case, I'd really like us to get some feedback from the author of
> cygipc about this issue.  I don't mind working around a bug once we
> understand exactly what the bug is --- but in this particular area,
> I think guessing our way to a workaround isn't good enough.
> 
>             regards, tom lane
> 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


RE: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )

От
"Hiroshi Inoue"
Дата:
> -----Original Message-----
> From: owner-pgsql-hackers@postgreSQL.org
> [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Horak Daniel
> Sent: Tuesday, August 17, 1999 9:06 PM
> To: 'Tom Lane'
> Cc: 'pgsql-hackers@postgreSQL.org'
> Subject: RE: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )
>
>
> > In any case, when one backend quits and another one is
> > started, the new
> > one will re-use the semaphore no longer used by the defunct backend.
>
> I have tested my solution a bit more and I have to say that reusing a
> semaphore by a new backend works OK. But it is not possible for a newly
> created backend to use a semaphore allocated by postmaster (it freezes on
> test if the semaphore with given key already exists - done with
> semId=semget(semKey, 0, 0) in function IpcSemaphoreCreate() in
> storage/ipc/ipc.c ). Why it is, I don't know, but it seems that
> my solution
> uses the ipc library in the right way. There are no longer any error
> messages from the ipc library when running the server. And I
> can't say that
> the ipc library is a 100% correct implementation of SysV IPC, it
> is probably
> (sure ;-) )caused by the Windows internals.
>

Yutaka Tanida [yutaka@marin.or.jp] and I have examined IPC
library.

We found that postmaster doesn't call exec() after fork() since v6.4.

The value of static/extern variables which cygipc library holds may
be different from their initial values when postmaster fork()s child
backend processes.

I made the following patch for cygipc library on trial.
This patch was effective for Yutaka's test case.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

*** sem.c.orig    Tue Dec 01 00:16:25 1998
--- sem.c    Tue Aug 17 13:22:06 1999
***************
*** 58,63 ****
--- 58,78 ---- static int          GFirstSem     = 0;        /*PCPC*/ static int          GFdSem        ;
/*PCPC*/

+ static pid_t    GProcessId = 0;
+
+ static void    init_globals(void)
+ {
+     pid_t pid;
+
+     if (pid=getpid(), pid != GProcessId)
+     {
+         GFirstSem = 0;
+         used_sems = used_semids = max_semid = 0;
+         sem_seq = 0;
+         GProcessId = pid;
+     }
+ }
+ /************************************************************************/ /* Demande d'acces a la zone partagee de
gestiondes semaphores        */ /************************************************************************/
 
***************
*** 77,82 ****
--- 92,98 ---- {     int LRet ;

+     init_globals();     if( GFirstSem == 0 )     {     if( IsGSemSemExist() )
*** shm.c.orig    Tue Dec 01 01:04:57 1998
--- shm.c    Tue Aug 17 13:22:27 1999
***************
*** 59,64 ****
--- 59,81 ---- static int          GFirstShm     = 0;        /*PCPC*/ static int          GFdShm        ;
/*PCPC*/

+ /*****************************************/
+ /*    Initialization of static variables   */
+ /*****************************************/
+ static pid_t GProcessId = 0;
+ static void init_globals(void)
+ {
+     pid_t pid;
+
+     if (pid=getpid(), pid != GProcessId)
+     {
+         GFirstShm = 0;
+         shm_rss = shm_swp = max_shmid = 0;
+         shm_seq = 0;
+         GProcessId = pid;
+     }
+ }
+ /************************************************************************/ /* Demande d'acces a la zone partagee de
gestiondes shm        */ /************************************************************************/
 
***************
*** 82,87 ****
--- 99,105 ---- {  int LRet ;

+  init_globals();  if( GFirstShm == 0 )  {   if( IsGSemShmExist() )
*** msg.c.orig    Tue Dec 01 00:16:09 1998
--- msg.c    Tue Aug 17 13:20:04 1999
***************
*** 57,62 ****
--- 57,77 ---- static int          GFirstMsg     = 0;        /*PCPC*/ static int          GFdMsg        ;
/*PCPC*/

+ /*****************************************/
+ /*    Initialization of static variables   */
+ /*****************************************/
+ static pid_t GProcessId = 0;
+ static void init_globals(void)
+ {
+     pid_t pid;
+
+     if (pid=getpid(), pid != GProcessId)
+     {
+         GFirstMsg = 0;
+         msgbytes = msghdrs = msg_seq = used_queues = max_msqid = 0;
+         GProcessId = pid;
+     }
+ } /************************************************************************/ /* Demande d'acces a la zone partagee de
gestiondes semaphores        */ /************************************************************************/
 
***************
*** 79,84 ****
--- 94,100 ---- {  int LRet ;

+  init_globals();  if( GFirstMsg == 0 )  {   if( IsGSemMsgExist() )



Re: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )

От
Bruce Momjian
Дата:
I have added this to the end of the README.NT file.



[Charset iso-8859-1 unsupported, filtering to ASCII...]
> > -----Original Message-----
> > From: owner-pgsql-hackers@postgreSQL.org
> > [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Horak Daniel
> > Sent: Tuesday, August 17, 1999 9:06 PM
> > To: 'Tom Lane'
> > Cc: 'pgsql-hackers@postgreSQL.org'
> > Subject: RE: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )
> >
> >
> > > In any case, when one backend quits and another one is
> > > started, the new
> > > one will re-use the semaphore no longer used by the defunct backend.
> >
> > I have tested my solution a bit more and I have to say that reusing a
> > semaphore by a new backend works OK. But it is not possible for a newly
> > created backend to use a semaphore allocated by postmaster (it freezes on
> > test if the semaphore with given key already exists - done with
> > semId=semget(semKey, 0, 0) in function IpcSemaphoreCreate() in
> > storage/ipc/ipc.c ). Why it is, I don't know, but it seems that
> > my solution
> > uses the ipc library in the right way. There are no longer any error
> > messages from the ipc library when running the server. And I
> > can't say that
> > the ipc library is a 100% correct implementation of SysV IPC, it
> > is probably
> > (sure ;-) )caused by the Windows internals.
> >
> 
> Yutaka Tanida [yutaka@marin.or.jp] and I have examined IPC
> library.
> 
> We found that postmaster doesn't call exec() after fork() since v6.4.
> 
> The value of static/extern variables which cygipc library holds may
> be different from their initial values when postmaster fork()s child
> backend processes.
> 
> I made the following patch for cygipc library on trial.
> This patch was effective for Yutaka's test case.
> 
> Regards.
> 
> Hiroshi Inoue
> Inoue@tpf.co.jp
> 
> *** sem.c.orig    Tue Dec 01 00:16:25 1998
> --- sem.c    Tue Aug 17 13:22:06 1999
> ***************
> *** 58,63 ****
> --- 58,78 ----
>   static int          GFirstSem     = 0;        /*PCPC*/
>   static int          GFdSem        ;        /*PCPC*/
> 
> + static pid_t    GProcessId = 0;
> +
> + static void    init_globals(void)
> + {
> +     pid_t pid;
> +
> +     if (pid=getpid(), pid != GProcessId)
> +     {
> +         GFirstSem = 0;
> +         used_sems = used_semids = max_semid = 0;
> +         sem_seq = 0;
> +         GProcessId = pid;
> +     }
> + }
> +
>   /************************************************************************/
>   /* Demande d'acces a la zone partagee de gestion des semaphores        */
>   /************************************************************************/
> ***************
> *** 77,82 ****
> --- 92,98 ----
>   {
>       int LRet ;
> 
> +     init_globals();
>       if( GFirstSem == 0 )
>       {
>       if( IsGSemSemExist() )
> *** shm.c.orig    Tue Dec 01 01:04:57 1998
> --- shm.c    Tue Aug 17 13:22:27 1999
> ***************
> *** 59,64 ****
> --- 59,81 ----
>   static int          GFirstShm     = 0;        /*PCPC*/
>   static int          GFdShm        ;        /*PCPC*/
> 
> + /*****************************************/
> + /*    Initialization of static variables   */
> + /*****************************************/
> + static pid_t GProcessId = 0;
> + static void init_globals(void)
> + {
> +     pid_t pid;
> +
> +     if (pid=getpid(), pid != GProcessId)
> +     {
> +         GFirstShm = 0;
> +         shm_rss = shm_swp = max_shmid = 0;
> +         shm_seq = 0;
> +         GProcessId = pid;
> +     }
> + }
> +
>   /************************************************************************/
>   /* Demande d'acces a la zone partagee de gestion des shm        */
>   /************************************************************************/
> ***************
> *** 82,87 ****
> --- 99,105 ----
>   {
>    int LRet ;
> 
> +  init_globals();
>    if( GFirstShm == 0 )
>    {
>     if( IsGSemShmExist() )
> *** msg.c.orig    Tue Dec 01 00:16:09 1998
> --- msg.c    Tue Aug 17 13:20:04 1999
> ***************
> *** 57,62 ****
> --- 57,77 ----
>   static int          GFirstMsg     = 0;        /*PCPC*/
>   static int          GFdMsg        ;        /*PCPC*/
> 
> + /*****************************************/
> + /*    Initialization of static variables   */
> + /*****************************************/
> + static pid_t GProcessId = 0;
> + static void init_globals(void)
> + {
> +     pid_t pid;
> +
> +     if (pid=getpid(), pid != GProcessId)
> +     {
> +         GFirstMsg = 0;
> +         msgbytes = msghdrs = msg_seq = used_queues = max_msqid = 0;
> +         GProcessId = pid;
> +     }
> + }
>   /************************************************************************/
>   /* Demande d'acces a la zone partagee de gestion des semaphores        */
>   /************************************************************************/
> ***************
> *** 79,84 ****
> --- 94,100 ----
>   {
>    int LRet ;
> 
> +  init_globals();
>    if( GFirstMsg == 0 )
>    {
>     if( IsGSemMsgExist() )
> 
> 
> 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026