Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.
Дата
Msg-id 200810271936.m9RJaGW09544@momjian.us
обсуждение исходный текст
Ответы Re: Re: [BUGS] libpq does not manage SSL callbacks properly when other libraries are involved.  (Magnus Hagander <magnus@hagander.net>)
Список pgsql-hackers
Russell Smith wrote:
> Alvaro Herrera wrote:
> > PoolSnoopy wrote:
> >
> >> ***PUSH***
> >>
> >> this bug is really some annoyance if you use automatic build environments.
> >> I'm using phpunit to run tests and as soon as postgres is involved the php
> >> cli environment segfaults at the end. this can be worked around by disabling
> >> ssl but it would be great if the underlying bug got fixed.
> >>
> >
> > This is PHP's bug, isn't it?  Why are you complaining here
> No, this is a problem with the callback/exit functions used by
> PostgreSQL.  We setup callback functions when we use SSL, if somebody
> else uses SSL we can create a problem.
>
> I thought my original report was detailed enough to explain where the
> problem is coming from.  Excerpt from original report;
>
> This is part of a comment from the php bug comment history;
>
> *[12 Nov 2007 2:45pm UTC] sam at zoy dot org*
>
> Hello, I did read the sources and studied them, and I can confirm
> that it is a matter of callback jumping to an invalid address.
>
> libpq's init_ssl_system() installs callbacks by calling
> CRYPTO_set_id_callback() and CRYPTO_set_locking_callback(). This
> function is called each time initialize_SSL() is called (for instance
> through the PHP pg_connect() function) and does not keep a reference
> counter, so libpq's destroy_SSL() has no way to know that it should
> call a destroy_ssl_system() function, and there is no such function
> anyway. So the callbacks are never removed.
>
> But then, upon cleanup, PHP calls zend_shutdown() which properly
> unloads pgsql.so and therefore the unused libpq.
>
> Finally, the zend_shutdown procedure calls zm_shutdown_curl()
> which in turn calls curl_global_cleanup() which leads to an
> ERR_free_strings() call and eventually a CRYPTO_lock() call.
> CRYPTO_lock() checks whether there are any callbacks to call,
> finds one (the one installed by libpg), calls it, and crashes
> because libpq was unloaded and hence the callback is no longer
> in mapped memory.
>
> --
>
> Basically postgresql doesn't cancel the callbacks to itself when the pg
> connection is shut down.  So if the libpq library is unloaded before
> other libraries that use SSL you get a crash as described above.  PHP
> has suggested the fix is to keep a reference counter in libpq so knows
> when to remove the callbacks.
>
> This is a complicated bug, but without real evidence there is no way to
> go to back to PHP and say it's their fault.  Their analysis is
> relatively comprehensive compared to the feedback that's been posted
> here so far.  I'm not sure how best to setup an environment to replicate
> the bug in a way I can debug it.  And even if I get to the point of
> nailing it down, I'll just be back asking questions about how you would
> fix it because I know very little about SSL.
>
> All that said, a quick poke in the source of PostgreSQL says that
> fe-secure.c sets callbacks using CRYPTO_set_xx_callback(...).  These are
> only set in the threaded version it appears.  Which is pretty much
> default in all the installations I encounter.
>
> My google research indicated we need to call
> CRYPTO_set_xx_callback(NULL) when we exit.  but that's not done.  One
> idea for a fix is to add a counter to the initialize_ssl function and
> when destory_ssl is called, decrement the counter.  If it reaches 0 then
> call CRYPT_set_xx_callback(NULL) to remove the callbacks.  This is a
> windows SSL thread that crashes iexplore and testifies to the same
> problem http://www.mail-archive.com/openssl-users@openssl.org/msg53869.html

Sorry for the delay in addressing this bug report.

Your analysis of this problem is right on target.  When the SSL
callbacks were implemented for threaded libpq, there was never any
thought on the effect of unloading libpq while the callbacks were still
registered.

The attached patch unregisters the callback on the close of the last
libpq connection.  Fortunately we require PQfinish() even if the
connection request failed, meaning there should be proper accounting of
the number of open connections with the method used in this patch.

We do leak some memory for every load/unload of libpq, but the leaks
extend beyond the SSL code to the rest of libpq so I didn't attempt to
address that in this patch (and no one has complained about it).

I also could have implemented a function to unload the SSL callbacks.
It would have to have been called before libpq was unloaded, but I
considered it inconvenient and unlikely to be adopted by applications
using libpq in the short-term.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/interfaces/libpq/fe-secure.c
===================================================================
RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-secure.c,v
retrieving revision 1.105
diff -c -c -r1.105 fe-secure.c
*** src/interfaces/libpq/fe-secure.c    16 May 2008 18:30:53 -0000    1.105
--- src/interfaces/libpq/fe-secure.c    27 Oct 2008 19:30:32 -0000
***************
*** 148,153 ****
--- 148,154 ----
  static int    verify_cb(int ok, X509_STORE_CTX *ctx);
  static int    client_cert_cb(SSL *, X509 **, EVP_PKEY **);
  static int    init_ssl_system(PGconn *conn);
+ static void    destroy_ssl_system(void);
  static int    initialize_SSL(PGconn *);
  static void destroy_SSL(void);
  static PostgresPollingStatusType open_client_SSL(PGconn *);
***************
*** 160,165 ****
--- 161,171 ----
  static bool pq_initssllib = true;

  static SSL_CTX *SSL_context = NULL;
+
+ #ifdef ENABLE_THREAD_SAFETY
+ static int ssl_open_connections = 0;
+ #endif
+
  #endif

  /*
***************
*** 836,860 ****
      if (pthread_mutex_lock(&init_mutex))
          return -1;

!     if (pq_initssllib && pq_lockarray == NULL)
      {
!         int            i;
!
!         CRYPTO_set_id_callback(pq_threadidcallback);
!
!         pq_lockarray = malloc(sizeof(pthread_mutex_t) * CRYPTO_num_locks());
!         if (!pq_lockarray)
          {
!             pthread_mutex_unlock(&init_mutex);
!             return -1;
          }
!         for (i = 0; i < CRYPTO_num_locks(); i++)
          {
!             if (pthread_mutex_init(&pq_lockarray[i], NULL))
!                 return -1;
          }
-
-         CRYPTO_set_locking_callback(pq_lockingcallback);
      }
  #endif
      if (!SSL_context)
--- 842,876 ----
      if (pthread_mutex_lock(&init_mutex))
          return -1;

!     if (pq_initssllib)
      {
!         if (pq_lockarray == NULL)
          {
!             int i;
!
!             pq_lockarray = malloc(sizeof(pthread_mutex_t) * CRYPTO_num_locks());
!             if (!pq_lockarray)
!             {
!                 pthread_mutex_unlock(&init_mutex);
!                 return -1;
!             }
!             for (i = 0; i < CRYPTO_num_locks(); i++)
!             {
!                 if (pthread_mutex_init(&pq_lockarray[i], NULL))
!                 {
!                     free(pq_lockarray);
!                     pq_lockarray = NULL;
!                     pthread_mutex_unlock(&init_mutex);
!                     return -1;
!                 }
!             }
          }
!
!         if (ssl_open_connections++ == 0)
          {
!             CRYPTO_set_id_callback(pq_threadidcallback);
!             CRYPTO_set_locking_callback(pq_lockingcallback);
          }
      }
  #endif
      if (!SSL_context)
***************
*** 889,894 ****
--- 905,970 ----
  }

  /*
+  *    This function is needed because if the libpq library is unloaded
+  *    from the application, the callback functions will no longer exist when
+  *    SSL used by other parts of the system.  For this reason,
+  *    we unregister the SSL callback functions when the last libpq
+  *    connection is closed.
+  */
+ static void
+ destroy_ssl_system(void)
+ {
+ #ifdef ENABLE_THREAD_SAFETY
+ #ifndef WIN32
+     static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
+ #else
+     static pthread_mutex_t init_mutex = NULL;
+     static long mutex_initlock = 0;
+
+     if (init_mutex == NULL)
+     {
+         while (InterlockedExchange(&mutex_initlock, 1) == 1)
+              /* loop, another thread own the lock */ ;
+         if (init_mutex == NULL)
+         {
+             if (pthread_mutex_init(&init_mutex, NULL))
+                 return -1;
+         }
+         InterlockedExchange(&mutex_initlock, 0);
+     }
+ #endif
+     if (pthread_mutex_lock(&init_mutex))
+         return;
+
+     if (pq_initssllib)
+     {
+         /*
+          *    We never free pq_lockarray, which means we leak memory on
+          *    repeated loading/unloading of this library.
+          */
+
+         if (ssl_open_connections > 0)
+             --ssl_open_connections;
+
+         if (ssl_open_connections == 0)
+         {
+             /*
+              *    We need to unregister the SSL callbacks on last connection
+              *    close because the libpq shared library might be unloaded,
+              *    and once it is, callbacks must be removed to prevent them
+              *    from being called by other SSL code.
+              */
+             CRYPTO_set_locking_callback(NULL);
+             CRYPTO_set_id_callback(NULL);
+         }
+     }
+
+     pthread_mutex_unlock(&init_mutex);
+ #endif
+     return;
+ }
+
+ /*
   *    Initialize global SSL context.
   */
  static int
***************
*** 958,963 ****
--- 1034,1040 ----
  static void
  destroy_SSL(void)
  {
+     destroy_ssl_system();
      if (SSL_context)
      {
          SSL_CTX_free(SSL_context);

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Hot Standby: Code Snapshot (v2b)
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Full socket send buffer prevents cancel, timeout