On 2013-09-13 14:33:25 -0400, Stephen Frost wrote:
> * Stephen Frost (sfrost@snowman.net) wrote:
> > * Andres Freund (andres@2ndquadrant.com) wrote:
> > > Hm. close_SSL() first does pqsecure_destroy() which will unset the
> > > callbacks, and the count and then goes on to do X509_free() and
> > > ENGINE_finish(), ENGINE_free() if either is used.
> > >
> > > It's not implausible that one of those actually needs locking. I doubt
> > > engines play a role here, but, without having looked at the testcase,
> > > X509_free() might be a possibility.
> >
> > Unfortunately, while I can still easily get the deadlock to happen when
> > the hooks are reset, the hooks don't appear to ever get called when
> > ssl_open_connections is set to zero. You have a good point about the
> > additional SSL calls after the hooks are unloaded though, I wonder if
> > holding the ssl_config_mutex lock over all of close_SSL might be more
> > sensible..
>
> I went ahead and moved the locks to be around all of close_SSL() and
> haven't been able to reproduce the deadlock, so perhaps those calls are
> the issue and what's happening is that another thread is dropping or
> adding the hooks in a common place while the X509_free, etc, are trying
> to figure out if they should be calling the locking functions or not,
> but there's a race because there's no higher-level locking happening
> around those.
>
> Attached is a patch to move those and which doesn't deadlock for me.

It seems slightly cleaner to just move the pqsecure_destroy() call to
the end of that function, guarded by a boolean that records whether the
SSL object was actually torn down. But if you think otherwise, I won't
protest...
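To make the ordering concrete, here is a rough, untested-against-libpq
sketch of what I mean. It is not a patch: the mock_* functions, struct
mock_conn and the destroy_needed flag are made-up stand-ins for the real
libpq/OpenSSL calls, and the call log only exists so the ordering can be
checked. The ENGINE_finish()/ENGINE_free() step would sit next to the
X509_free() one in the same way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in libpq or OpenSSL. */
enum call { CALL_SSL_FREE, CALL_X509_FREE, CALL_SECURE_DESTROY };

#define MAX_CALLS 8
static enum call call_log[MAX_CALLS];   /* records the order of teardown calls */
static size_t call_count = 0;

static void record(enum call c)
{
    assert(call_count < MAX_CALLS);
    call_log[call_count++] = c;
}

struct mock_conn
{
    bool has_ssl;   /* stands in for conn->ssl != NULL */
    bool has_peer;  /* stands in for conn->peer != NULL */
};

static void mock_SSL_free(struct mock_conn *conn)
{
    conn->has_ssl = false;
    record(CALL_SSL_FREE);
}

static void mock_X509_free(struct mock_conn *conn)
{
    conn->has_peer = false;
    record(CALL_X509_FREE);
}

/* Stand-in for pqsecure_destroy(): drops the callbacks and the count. */
static void mock_pqsecure_destroy(void)
{
    record(CALL_SECURE_DESTROY);
}

static void close_SSL_sketch(struct mock_conn *conn)
{
    bool destroy_needed = false;

    if (conn->has_ssl)
    {
        mock_SSL_free(conn);
        /* Remember that the callbacks must go, but do not drop them yet. */
        destroy_needed = true;
    }

    /* Still runs while the locking callbacks are installed. */
    if (conn->has_peer)
        mock_X509_free(conn);

    /* Only now unset the callbacks, after every other SSL call is done. */
    if (destroy_needed)
        mock_pqsecure_destroy();
}
```

The point is only that the later free calls happen before the callbacks
are removed, rather than after as they do today.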
Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services