Обсуждение: Patch to allow C extension modules to initialize/finish

Поиск
Список
Период
Сортировка

Patch to allow C extension modules to initialize/finish

От
"Ralf S. Engelschall"
Дата:
PostgreSQL provides a way to load C extension modules with its internal
FMGR. Unfortunately there is no portable way for an extension module to
initialize (directly after the pg_dlopen() of the DSO) and to finish
(directly before the pg_dlclose() of the DSO). This way it is mostly
impossible to write a more complex extension module in a portable way.

The only to me known workarounds are either to call an own
initialization function at the start of _EVERY_ exported function
manually (works, but is ugly and especially doesn't work for the
finishing function!) or to leverage some platform specific hacks like
the implicitly called _init and _fini functions (is what the ODBC
extension module currently does, but is horribly platform specific and
not portable).

Hence I propose the patch below (applies to PostgreSQL 8.1.4) which
mimics the dlopen(3) and dlclose(3) behaviour of some Unix platforms
and resolves and calls _PG_init and _PG_fini functions of an extension
module right after/before the pg_dlopen/pg_dlclose calls in the FMGR.
This is both a fully portable solution and fully backward compatible to
existing and forthcoming extension modules (except they really would
have _PG_init and _PG_fini functions already defined).
                                      Ralf S. Engelschall                                      rse@engelschall.com
                               www.engelschall.com
 

Index: src/backend/utils/fmgr/dfmgr.c
--- src/backend/utils/fmgr/dfmgr.c.orig    2005-10-15 04:49:32 +0200
+++ src/backend/utils/fmgr/dfmgr.c    2006-08-02 20:48:48 +0200
@@ -60,6 +60,10 @@static char *expand_dynamic_library_name(const char *name);static char
*substitute_libpath_macro(constchar *name);
 

+/* types for PostgreSQL-specific DSO init/fini functions */
+typedef void (*PG_init_t)(void);
+typedef void (*PG_fini_t)(void);
+/* * Load the specified dynamic-link library file, and look for a function * named funcname in it.  (funcname can be
NULLto just load the file.)
 
@@ -82,6 +86,7 @@    char       *load_error;    struct stat stat_buf;    char       *fullname;
+    PG_init_t *PG_init;
    fullname = expand_dynamic_library_name(filename);    if (!fullname)
@@ -146,6 +151,13 @@                            fullname, load_error)));        }

+        /* optionally give the DSO a chance to initialize by calling a
+           PostgreSQL-specific (and this way portable) "_PG_init" function
+           similar to what dlopen(3) implicitly does with "_init" on some
+           Unix platforms. */
+        if ((PG_init = (PG_init_t *)pg_dlsym(file_scanner->handle, "_PG_init")) != NULL)
+                (*PG_init)();
+        /* OK to link it into list */        if (file_list == NULL)            file_list = file_scanner;
@@ -192,6 +204,7 @@               *nxt;    struct stat stat_buf;    char       *fullname;
+    PG_fini_t *PG_fini;
    fullname = expand_dynamic_library_name(filename);    if (!fullname)
@@ -224,6 +237,14 @@            else                file_list = nxt;
clear_external_function_hash(file_scanner->handle);
+
+            /* optionally give the DSO a chance to finish by calling
+               a PostgreSQL-specific (and this way portable) "_PG_fini"
+               function similar to what dlopen(3) implicitly does with
+               "_fini" on some Unix platforms. */
+            if ((PG_fini = (PG_init_t *)pg_dlsym(file_scanner->handle, "_PG_fini")) != NULL)
+                (*PG_fini)();
+            pg_dlclose(file_scanner->handle);            free((char *) file_scanner);            /* prv does not
change*/
 



Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
tomas@tuxteam.de
Дата:
On Wed, Aug 02, 2006 at 09:04:11PM +0200, Ralf S. Engelschall wrote:
> PostgreSQL provides a way to load C extension modules with its internal
> FMGR. Unfortunately there is no portable way for an extension module to
> initialize (directly after the pg_dlopen() of the DSO) and to finish
> (directly before the pg_dlclose() of the DSO). [...]

Cool, but...

[...]

> +
> +            /* optionally give the DSO a chance to finish by calling
> +               a PostgreSQL-specific (and this way portable) "_PG_fini"
> +               function similar to what dlopen(3) implicitly does with
> +               "_fini" on some Unix platforms. */
> +            if ((PG_fini = (PG_init_t *)pg_dlsym(file_scanner->handle, "_PG_fini")) != NULL)
             ^^^^^^^^^
 
> +                (*PG_fini)();
> +
>              pg_dlclose(file_scanner->handle);
>              free((char *) file_scanner);
>              /* prv does not change */

shouldn't that be PG_fini_t?

(yeah, those nitpickers, especially those who are mostly silent
bystanders ;)

But I'd support the idea myself.

Thanks
-- tomas

Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
Tom Lane
Дата:
"Ralf S. Engelschall" <rse@engelschall.com> writes:
> Hence I propose the patch below (applies to PostgreSQL 8.1.4) which
> mimics the dlopen(3) and dlclose(3) behaviour of some Unix platforms
> and resolves and calls _PG_init and _PG_fini functions of an extension
> module right after/before the pg_dlopen/pg_dlclose calls in the FMGR.

This seems like a reasonably good idea, and we have got uses for at
least the "init" case in most or all of our PLs.  It's nominally too
late for 8.2 feature freeze, but I said just a couple days ago that
we shouldn't take a very hard line on that.  Does anyone object to
considering this for 8.2?

One question I have is whether it really works as expected in all cases.
In particular what if the library is "preloaded" into the postmaster?
Both plpgsql and plperl seem to think they might need to make a
distinction between things to do at library load time and things to do
per-backend ... and yet, neither of them *actually* have anything they
need to do per-backend.

Also, what about Windows?  I assume that DSOs don't remain attached
across the pseudo-fork/exec, will that mess anything up?
        regards, tom lane


Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
David Fetter
Дата:
On Thu, Aug 03, 2006 at 05:30:48PM -0400, Tom Lane wrote:
> "Ralf S. Engelschall" <rse@engelschall.com> writes:
> > Hence I propose the patch below (applies to PostgreSQL 8.1.4)
> > which mimics the dlopen(3) and dlclose(3) behaviour of some Unix
> > platforms and resolves and calls _PG_init and _PG_fini functions
> > of an extension module right after/before the pg_dlopen/pg_dlclose
> > calls in the FMGR.
> 
> This seems like a reasonably good idea, and we have got uses for at
> least the "init" case in most or all of our PLs.  It's nominally too
> late for 8.2 feature freeze, but I said just a couple days ago that
> we shouldn't take a very hard line on that.  Does anyone object to
> considering this for 8.2?

Nope :)

> One question I have is whether it really works as expected in all
> cases.  In particular what if the library is "preloaded" into the
> postmaster?  Both plpgsql and plperl seem to think they might need
> to make a distinction between things to do at library load time and
> things to do per-backend ... and yet, neither of them *actually*
> have anything they need to do per-backend.

I'm not sure quite what you mean here, but PL/PerlU functions can
use() modules, and those are called per-backend, i.e. when the
function is invoked.  There's also some possibility that something
might go into %_SHARED.

Cheers,
D
-- 
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666                             Skype: davidfetter

Remember to vote!


Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
Tom Lane
Дата:
David Fetter <david@fetter.org> writes:
> On Thu, Aug 03, 2006 at 05:30:48PM -0400, Tom Lane wrote:
>> One question I have is whether it really works as expected in all
>> cases.  In particular what if the library is "preloaded" into the
>> postmaster?

> I'm not sure quite what you mean here, but PL/PerlU functions can
> use() modules, and those are called per-backend, i.e. when the
> function is invoked.  There's also some possibility that something
> might go into %_SHARED.

Well, the point is that you could have a scenario where the PG_init
function is executed in the postmaster, the process image is duplicated
via fork(), and then in a specific backend a LOAD command is executed
causing the PG_fini function to be called.  Is it likely that anything
would get confused by PG_init and PG_fini getting called by different
processes?

Also, if we do this we probably ought to remove the special-purpose
hack for preload_libraries to specify an init function --- it should
just happen by default.  Any objections to simplifying that?
        regards, tom lane


Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
Andrew Dunstan
Дата:

Tom Lane wrote:

>"Ralf S. Engelschall" <rse@engelschall.com> writes:
>  
>
>>Hence I propose the patch below (applies to PostgreSQL 8.1.4) which
>>mimics the dlopen(3) and dlclose(3) behaviour of some Unix platforms
>>and resolves and calls _PG_init and _PG_fini functions of an extension
>>module right after/before the pg_dlopen/pg_dlclose calls in the FMGR.
>>    
>>
>
>This seems like a reasonably good idea, and we have got uses for at
>least the "init" case in most or all of our PLs.  It's nominally too
>late for 8.2 feature freeze, but I said just a couple days ago that
>we shouldn't take a very hard line on that.  Does anyone object to
>considering this for 8.2?
>  
>

I don't. We've been porous in the past and I think we should be prepared 
to be a bit lenient again, especially since this release is not hugely 
feature rich.

>One question I have is whether it really works as expected in all cases.
>In particular what if the library is "preloaded" into the postmaster?
>Both plpgsql and plperl seem to think they might need to make a
>distinction between things to do at library load time and things to do
>per-backend ... and yet, neither of them *actually* have anything they
>need to do per-backend.
>  
>


I have longterm plans for plperl concerning preloading perl modules, 
which might involve the preloaded lib. At the moment it's just a thought 
in my head, though.

>Also, what about Windows?  I assume that DSOs don't remain attached
>across the pseudo-fork/exec, will that mess anything up?
>
>
>  
>

Good question.

cheers

andrew


Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
Joe Conway
Дата:
Tom Lane wrote:
> 
> Also, if we do this we probably ought to remove the special-purpose
> hack for preload_libraries to specify an init function --- it should
> just happen by default.  Any objections to simplifying that?
> 

The original idea of using the init function with preload_libraries was 
to eliminate library startup that was expensive and only needed once. 
Specifically in the case of libR (and presumably other libraries as 
well), the init time was much greater than the actual library load time. 
If it is removed from preload_libraries, then we'll pay that price for 
every backend startup, no?

Joe




Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
Tom Lane
Дата:
Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>> Also, if we do this we probably ought to remove the special-purpose
>> hack for preload_libraries to specify an init function --- it should
>> just happen by default.  Any objections to simplifying that?

> The original idea of using the init function with preload_libraries was 
> to eliminate library startup that was expensive and only needed once. 
> Specifically in the case of libR (and presumably other libraries as 
> well), the init time was much greater than the actual library load time. 
> If it is removed from preload_libraries, then we'll pay that price for 
> every backend startup, no?

No, my thought is that you'd rename PL/R's init function to PG_init, and
then it'd get called automagically without needing to assume that the DBA
remembers to specify it in preload_libraries.  If there's a reason *not*
to do that then it'd be a strike against this whole proposal, methinks.
        regards, tom lane


Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
Joe Conway
Дата:
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>>Tom Lane wrote:
>>
>>>Also, if we do this we probably ought to remove the special-purpose
>>>hack for preload_libraries to specify an init function --- it should
>>>just happen by default.  Any objections to simplifying that?
> 
>>The original idea of using the init function with preload_libraries was 
>>to eliminate library startup that was expensive and only needed once. 
>>Specifically in the case of libR (and presumably other libraries as 
>>well), the init time was much greater than the actual library load time. 
>>If it is removed from preload_libraries, then we'll pay that price for 
>>every backend startup, no?
> 
> No, my thought is that you'd rename PL/R's init function to PG_init, and
> then it'd get called automagically without needing to assume that the DBA
> remembers to specify it in preload_libraries.  If there's a reason *not*
> to do that then it'd be a strike against this whole proposal, methinks.

Oh, well that sounds perfect to me. At least in the case of a procedural 
language handler you can easily initialize and cache anything that must 
be done per-backend anyway. It's the "expensive but must be done at 
least once stuff" that was a problem. As long as that happens, I'm 
happy. And if we eliminate a dba dependency, so much the better.

Joe


Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
"Ralf S. Engelschall"
Дата:
On Thu, Aug 03, 2006, tomas@tuxteam.de wrote:

> On Wed, Aug 02, 2006 at 09:04:11PM +0200, Ralf S. Engelschall wrote:
> > PostgreSQL provides a way to load C extension modules with its internal
> > FMGR. Unfortunately there is no portable way for an extension module to
> > initialize (directly after the pg_dlopen() of the DSO) and to finish
> > (directly before the pg_dlclose() of the DSO). [...]
>
> Cool, but...
> [...]
>
> > +
> > +            /* optionally give the DSO a chance to finish by calling
> > +               a PostgreSQL-specific (and this way portable) "_PG_fini"
> > +               function similar to what dlopen(3) implicitly does with
> > +               "_fini" on some Unix platforms. */
> > +            if ((PG_fini = (PG_init_t *)pg_dlsym(file_scanner->handle, "_PG_fini")) != NULL)
>                                         ^^^^^^^^^
> > +                (*PG_fini)();
> > +
> >              pg_dlclose(file_scanner->handle);
> >              free((char *) file_scanner);
> >              /* prv does not change */
>
> shouldn't that be PG_fini_t?

Ops, good catch. Yes, "PG_fini_t", of course.
                                      Ralf S. Engelschall                                      rse@engelschall.com
                               www.engelschall.com
 



Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
Gregory Stark
Дата:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> No, my thought is that you'd rename PL/R's init function to PG_init, and
> then it'd get called automagically without needing to assume that the DBA
> remembers to specify it in preload_libraries.  If there's a reason *not*
> to do that then it'd be a strike against this whole proposal, methinks.

If I understand the question correctly it hinges on whether you want to do all
the initialization pre-fork or post-fork? I'm pretty sure you have to allow
for both possibilities.

I know when I was using mod_perl heavily we wanted to load as many perl
modules and code pre-fork as possible. The more we loaded pre-fork the more
memory was shared across processes and the more processes we could run on a
box without suffering from memory pressure.

On the other side the classic case of something that cannot be set up pre-fork
is actually database connections :) So, for example if for someone wanted to
have a persistent Oracle connection they could not open it pre-fork at library
load time but they might want to open it immediately after the fork rather
than when it's first used.


--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
Tom Lane
Дата:
Gregory Stark <gsstark@mit.edu> writes:
> So, for example if for someone wanted to
> have a persistent Oracle connection they could not open it pre-fork at library
> load time but they might want to open it immediately after the fork rather
> than when it's first used.

Uh ... why?  Seems like all you're accomplishing there is to expend
cycles that might be wasted, if the particular session never uses
the feature.

In any case, the PG_init proposal neither adds nor takes away ability
to do stuff immediately post-fork, so I think that's an orthogonal
consideration.
        regards, tom lane


Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
Greg Stark
Дата:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <gsstark@mit.edu> writes:
> > So, for example if for someone wanted to
> > have a persistent Oracle connection they could not open it pre-fork at library
> > load time but they might want to open it immediately after the fork rather
> > than when it's first used.
> 
> Uh ... why?  Seems like all you're accomplishing there is to expend
> cycles that might be wasted, if the particular session never uses
> the feature.

Well as long as we're talking hypothetically I could come up with plenty.
Maybe you know every session will be using the connection and you can't deal
with the latency creating the connection may create in your critical path. Or
maybe you don't want to deal with connection errors from the depths of your
call tree where it'll cause a user-visible error. Or perhaps you're using some
library that expects to be handed a connection from some initialization
routine and it's just not convenient to or possible to invoke that on an
on-demand basis.

> In any case, the PG_init proposal neither adds nor takes away ability
> to do stuff immediately post-fork, so I think that's an orthogonal
> consideration.

So is the only question whether there's a need to do stuff pre-fork?

-- 
greg



Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
Tom Lane
Дата:
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> In any case, the PG_init proposal neither adds nor takes away ability
>> to do stuff immediately post-fork, so I think that's an orthogonal
>> consideration.

> So is the only question whether there's a need to do stuff pre-fork?

That's not a question, that's a well-established fact --- pl/R certainly
needs it, and any other library that has expensive setup work that can
propagate through a fork does too.

I think adding a hook to allow a postmaster-preloaded library to execute
some work immediately post-fork is a separate consideration.  Feel free
to propose it if you want, but I don't see what it's got to do with the
patch on the table.
        regards, tom lane


Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
Tom Lane
Дата:
"Ralf S. Engelschall" <rse@engelschall.com> writes:
> Hence I propose the patch below (applies to PostgreSQL 8.1.4) which
> mimics the dlopen(3) and dlclose(3) behaviour of some Unix platforms
> and resolves and calls _PG_init and _PG_fini functions of an extension
> module right after/before the pg_dlopen/pg_dlclose calls in the FMGR.

Patch applied, with consequent changes to simplify preload_libraries
feature in favor of using _PG_init().
        regards, tom lane


Re: [BUGS] Patch to allow C extension modules to initialize/finish

От
"Ralf S. Engelschall"
Дата:
On Tue, Aug 08, 2006, Tom Lane wrote:

> "Ralf S. Engelschall" <rse@engelschall.com> writes:
> > Hence I propose the patch below (applies to PostgreSQL 8.1.4) which
> > mimics the dlopen(3) and dlclose(3) behaviour of some Unix platforms
> > and resolves and calls _PG_init and _PG_fini functions of an extension
> > module right after/before the pg_dlopen/pg_dlclose calls in the FMGR.
>
> Patch applied, with consequent changes to simplify preload_libraries
> feature in favor of using _PG_init().

Thanks.                                      Ralf S. Engelschall
rse@engelschall.com                                     www.engelschall.com