Обсуждение: Patch to allow C extension modules to initialize/finish
PostgreSQL provides a way to load C extension modules with its internal FMGR. Unfortunately there is no portable way for an extension module to initialize (directly after the pg_dlopen() of the DSO) and to finish (directly before the pg_dlclose() of the DSO). This way it is mostly impossible to write a more complex extension module in a portable way. The only to me known workarounds are either to call an own initialization function at the start of _EVERY_ exported function manually (works, but is ugly and especially doesn't work for the finishing function!) or to leverage some platform specific hacks like the implicitly called _init and _fini functions (is what the ODBC extension module currently does, but is horribly platform specific and not portable). Hence I propose the patch below (applies to PostgreSQL 8.1.4) which mimics the dlopen(3) and dlclose(3) behaviour of some Unix platforms and resolves and calls _PG_init and _PG_fini functions of an extension module right after/before the pg_dlopen/pg_dlclose calls in the FMGR. This is both a fully portable solution and fully backward compatible to existing and forthcoming extension modules (except they really would have _PG_init and _PG_fini functions already defined). Ralf S. Engelschall rse@engelschall.com www.engelschall.com Index: src/backend/utils/fmgr/dfmgr.c --- src/backend/utils/fmgr/dfmgr.c.orig 2005-10-15 04:49:32 +0200 +++ src/backend/utils/fmgr/dfmgr.c 2006-08-02 20:48:48 +0200 @@ -60,6 +60,10 @@static char *expand_dynamic_library_name(const char *name);static char *substitute_libpath_macro(constchar *name); +/* types for PostgreSQL-specific DSO init/fini functions */ +typedef void (*PG_init_t)(void); +typedef void (*PG_fini_t)(void); +/* * Load the specified dynamic-link library file, and look for a function * named funcname in it. (funcname can be NULLto just load the file.) @@ -82,6 +86,7 @@ char *load_error; struct stat stat_buf; char *fullname; + PG_init_t *PG_init; fullname = expand_dynamic_library_name(filename); if (!fullname) @@ -146,6 +151,13 @@ fullname, load_error))); } + /* optionally give the DSO a chance to initialize by calling a + PostgreSQL-specific (and this way portable) "_PG_init" function + similar to what dlopen(3) implicitly does with "_init" on some + Unix platforms. */ + if ((PG_init = (PG_init_t *)pg_dlsym(file_scanner->handle, "_PG_init")) != NULL) + (*PG_init)(); + /* OK to link it into list */ if (file_list == NULL) file_list = file_scanner; @@ -192,6 +204,7 @@ *nxt; struct stat stat_buf; char *fullname; + PG_fini_t *PG_fini; fullname = expand_dynamic_library_name(filename); if (!fullname) @@ -224,6 +237,14 @@ else file_list = nxt; clear_external_function_hash(file_scanner->handle); + + /* optionally give the DSO a chance to finish by calling + a PostgreSQL-specific (and this way portable) "_PG_fini" + function similar to what dlopen(3) implicitly does with + "_fini" on some Unix platforms. */ + if ((PG_fini = (PG_init_t *)pg_dlsym(file_scanner->handle, "_PG_fini")) != NULL) + (*PG_fini)(); + pg_dlclose(file_scanner->handle); free((char *) file_scanner); /* prv does not change*/
On Wed, Aug 02, 2006 at 09:04:11PM +0200, Ralf S. Engelschall wrote: > PostgreSQL provides a way to load C extension modules with its internal > FMGR. Unfortunately there is no portable way for an extension module to > initialize (directly after the pg_dlopen() of the DSO) and to finish > (directly before the pg_dlclose() of the DSO). [...] Cool, but... [...] > + > + /* optionally give the DSO a chance to finish by calling > + a PostgreSQL-specific (and this way portable) "_PG_fini" > + function similar to what dlopen(3) implicitly does with > + "_fini" on some Unix platforms. */ > + if ((PG_fini = (PG_init_t *)pg_dlsym(file_scanner->handle, "_PG_fini")) != NULL) ^^^^^^^^^ > + (*PG_fini)(); > + > pg_dlclose(file_scanner->handle); > free((char *) file_scanner); > /* prv does not change */ shouldn't that be PG_fini_t? (yeah, those nitpickers, especially those who are mostly silent bystanders ;) But I'd support the idea myself. Thanks -- tomas
"Ralf S. Engelschall" <rse@engelschall.com> writes: > Hence I propose the patch below (applies to PostgreSQL 8.1.4) which > mimics the dlopen(3) and dlclose(3) behaviour of some Unix platforms > and resolves and calls _PG_init and _PG_fini functions of an extension > module right after/before the pg_dlopen/pg_dlclose calls in the FMGR. This seems like a reasonably good idea, and we have got uses for at least the "init" case in most or all of our PLs. It's nominally too late for 8.2 feature freeze, but I said just a couple days ago that we shouldn't take a very hard line on that. Does anyone object to considering this for 8.2? One question I have is whether it really works as expected in all cases. In particular what if the library is "preloaded" into the postmaster? Both plpgsql and plperl seem to think they might need to make a distinction between things to do at library load time and things to do per-backend ... and yet, neither of them *actually* have anything they need to do per-backend. Also, what about Windows? I assume that DSOs don't remain attached across the pseudo-fork/exec, will that mess anything up? regards, tom lane
On Thu, Aug 03, 2006 at 05:30:48PM -0400, Tom Lane wrote: > "Ralf S. Engelschall" <rse@engelschall.com> writes: > > Hence I propose the patch below (applies to PostgreSQL 8.1.4) > > which mimics the dlopen(3) and dlclose(3) behaviour of some Unix > > platforms and resolves and calls _PG_init and _PG_fini functions > > of an extension module right after/before the pg_dlopen/pg_dlclose > > calls in the FMGR. > > This seems like a reasonably good idea, and we have got uses for at > least the "init" case in most or all of our PLs. It's nominally too > late for 8.2 feature freeze, but I said just a couple days ago that > we shouldn't take a very hard line on that. Does anyone object to > considering this for 8.2? Nope :) > One question I have is whether it really works as expected in all > cases. In particular what if the library is "preloaded" into the > postmaster? Both plpgsql and plperl seem to think they might need > to make a distinction between things to do at library load time and > things to do per-backend ... and yet, neither of them *actually* > have anything they need to do per-backend. I'm not sure quite what you mean here, but PL/PerlU functions can use() modules, and those are called per-backend, i.e. when the function is invoked. There's also some possibility that something might go into %_SHARED. Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote!
David Fetter <david@fetter.org> writes: > On Thu, Aug 03, 2006 at 05:30:48PM -0400, Tom Lane wrote: >> One question I have is whether it really works as expected in all >> cases. In particular what if the library is "preloaded" into the >> postmaster? > I'm not sure quite what you mean here, but PL/PerlU functions can > use() modules, and those are called per-backend, i.e. when the > function is invoked. There's also some possibility that something > might go into %_SHARED. Well, the point is that you could have a scenario where the PG_init function is executed in the postmaster, the process image is duplicated via fork(), and then in a specific backend a LOAD command is executed causing the PG_fini function to be called. Is it likely that anything would get confused by PG_init and PG_fini getting called by different processes? Also, if we do this we probably ought to remove the special-purpose hack for preload_libraries to specify an init function --- it should just happen by default. Any objections to simplifying that? regards, tom lane
Tom Lane wrote: >"Ralf S. Engelschall" <rse@engelschall.com> writes: > > >>Hence I propose the patch below (applies to PostgreSQL 8.1.4) which >>mimics the dlopen(3) and dlclose(3) behaviour of some Unix platforms >>and resolves and calls _PG_init and _PG_fini functions of an extension >>module right after/before the pg_dlopen/pg_dlclose calls in the FMGR. >> >> > >This seems like a reasonably good idea, and we have got uses for at >least the "init" case in most or all of our PLs. It's nominally too >late for 8.2 feature freeze, but I said just a couple days ago that >we shouldn't take a very hard line on that. Does anyone object to >considering this for 8.2? > > I don't. We've been porous in the past and I think we should be prepared to be a bit lenient again, especially since this release is not hugely feature rich. >One question I have is whether it really works as expected in all cases. >In particular what if the library is "preloaded" into the postmaster? >Both plpgsql and plperl seem to think they might need to make a >distinction between things to do at library load time and things to do >per-backend ... and yet, neither of them *actually* have anything they >need to do per-backend. > > I have longterm plans for plperl concerning preloading perl modules, which might involve the preloaded lib. At the moment it's just a thought in my head, though. >Also, what about Windows? I assume that DSOs don't remain attached >across the pseudo-fork/exec, will that mess anything up? > > > > Good question. cheers andrew
Tom Lane wrote: > > Also, if we do this we probably ought to remove the special-purpose > hack for preload_libraries to specify an init function --- it should > just happen by default. Any objections to simplifying that? > The original idea of using the init function with preload_libraries was to eliminate library startup that was expensive and only needed once. Specifically in the case of libR (and presumably other libraries as well), the init time was much greater than the actual library load time. If it is removed from preload_libraries, then we'll pay that price for every backend startup, no? Joe
Joe Conway <mail@joeconway.com> writes: > Tom Lane wrote: >> Also, if we do this we probably ought to remove the special-purpose >> hack for preload_libraries to specify an init function --- it should >> just happen by default. Any objections to simplifying that? > The original idea of using the init function with preload_libraries was > to eliminate library startup that was expensive and only needed once. > Specifically in the case of libR (and presumably other libraries as > well), the init time was much greater than the actual library load time. > If it is removed from preload_libraries, then we'll pay that price for > every backend startup, no? No, my thought is that you'd rename PL/R's init function to PG_init, and then it'd get called automagically without needing to assume that the DBA remembers to specify it in preload_libraries. If there's a reason *not* to do that then it'd be a strike against this whole proposal, methinks. regards, tom lane
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: >>Tom Lane wrote: >> >>>Also, if we do this we probably ought to remove the special-purpose >>>hack for preload_libraries to specify an init function --- it should >>>just happen by default. Any objections to simplifying that? > >>The original idea of using the init function with preload_libraries was >>to eliminate library startup that was expensive and only needed once. >>Specifically in the case of libR (and presumably other libraries as >>well), the init time was much greater than the actual library load time. >>If it is removed from preload_libraries, then we'll pay that price for >>every backend startup, no? > > No, my thought is that you'd rename PL/R's init function to PG_init, and > then it'd get called automagically without needing to assume that the DBA > remembers to specify it in preload_libraries. If there's a reason *not* > to do that then it'd be a strike against this whole proposal, methinks. Oh, well that sounds perfect to me. At least in the case of a procedural language handler you can easily initialize and cache anything that must be done per-backend anyway. It's the "expensive but must be done at least once stuff" that was a problem. As long as that happens, I'm happy. And if we eliminate a dba dependency, so much the better. Joe
On Thu, Aug 03, 2006, tomas@tuxteam.de wrote: > On Wed, Aug 02, 2006 at 09:04:11PM +0200, Ralf S. Engelschall wrote: > > PostgreSQL provides a way to load C extension modules with its internal > > FMGR. Unfortunately there is no portable way for an extension module to > > initialize (directly after the pg_dlopen() of the DSO) and to finish > > (directly before the pg_dlclose() of the DSO). [...] > > Cool, but... > [...] > > > + > > + /* optionally give the DSO a chance to finish by calling > > + a PostgreSQL-specific (and this way portable) "_PG_fini" > > + function similar to what dlopen(3) implicitly does with > > + "_fini" on some Unix platforms. */ > > + if ((PG_fini = (PG_init_t *)pg_dlsym(file_scanner->handle, "_PG_fini")) != NULL) > ^^^^^^^^^ > > + (*PG_fini)(); > > + > > pg_dlclose(file_scanner->handle); > > free((char *) file_scanner); > > /* prv does not change */ > > shouldn't that be PG_fini_t? Ops, good catch. Yes, "PG_fini_t", of course. Ralf S. Engelschall rse@engelschall.com www.engelschall.com
Tom Lane <tgl@sss.pgh.pa.us> writes: > No, my thought is that you'd rename PL/R's init function to PG_init, and > then it'd get called automagically without needing to assume that the DBA > remembers to specify it in preload_libraries. If there's a reason *not* > to do that then it'd be a strike against this whole proposal, methinks. If I understand the question correctly it hinges on whether you want to do all the initialization pre-fork or post-fork? I'm pretty sure you have to allow for both possibilities. I know when I was using mod_perl heavily we wanted to load as many perl modules and code pre-fork as possible. The more we loaded pre-fork the more memory was shared across processes and the more processes we could run on a box without suffering from memory pressure. On the other side the classic case of something that cannot be set up pre-fork is actually database connections :) So, for example if for someone wanted to have a persistent Oracle connection they could not open it pre-fork at library load time but they might want to open it immediately after the fork rather than when it's first used. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark <gsstark@mit.edu> writes: > So, for example if for someone wanted to > have a persistent Oracle connection they could not open it pre-fork at library > load time but they might want to open it immediately after the fork rather > than when it's first used. Uh ... why? Seems like all you're accomplishing there is to expend cycles that might be wasted, if the particular session never uses the feature. In any case, the PG_init proposal neither adds nor takes away ability to do stuff immediately post-fork, so I think that's an orthogonal consideration. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Gregory Stark <gsstark@mit.edu> writes: > > So, for example if for someone wanted to > > have a persistent Oracle connection they could not open it pre-fork at library > > load time but they might want to open it immediately after the fork rather > > than when it's first used. > > Uh ... why? Seems like all you're accomplishing there is to expend > cycles that might be wasted, if the particular session never uses > the feature. Well as long as we're talking hypothetically I could come up with plenty. Maybe you know every session will be using the connection and you can't deal with the latency creating the connection may create in your critical path. Or maybe you don't want to deal with connection errors from the depths of your call tree where it'll cause a user-visible error. Or perhaps you're using some library that expects to be handed a connection from some initialization routine and it's just not convenient to or possible to invoke that on an on-demand basis. > In any case, the PG_init proposal neither adds nor takes away ability > to do stuff immediately post-fork, so I think that's an orthogonal > consideration. So is the only question whether there's a need to do stuff pre-fork? -- greg
Greg Stark <gsstark@mit.edu> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> In any case, the PG_init proposal neither adds nor takes away ability >> to do stuff immediately post-fork, so I think that's an orthogonal >> consideration. > So is the only question whether there's a need to do stuff pre-fork? That's not a question, that's a well-established fact --- pl/R certainly needs it, and any other library that has expensive setup work that can propagate through a fork does too. I think adding a hook to allow a postmaster-preloaded library to execute some work immediately post-fork is a separate consideration. Feel free to propose it if you want, but I don't see what it's got to do with the patch on the table. regards, tom lane
"Ralf S. Engelschall" <rse@engelschall.com> writes: > Hence I propose the patch below (applies to PostgreSQL 8.1.4) which > mimics the dlopen(3) and dlclose(3) behaviour of some Unix platforms > and resolves and calls _PG_init and _PG_fini functions of an extension > module right after/before the pg_dlopen/pg_dlclose calls in the FMGR. Patch applied, with consequent changes to simplify preload_libraries feature in favor of using _PG_init(). regards, tom lane
On Tue, Aug 08, 2006, Tom Lane wrote: > "Ralf S. Engelschall" <rse@engelschall.com> writes: > > Hence I propose the patch below (applies to PostgreSQL 8.1.4) which > > mimics the dlopen(3) and dlclose(3) behaviour of some Unix platforms > > and resolves and calls _PG_init and _PG_fini functions of an extension > > module right after/before the pg_dlopen/pg_dlclose calls in the FMGR. > > Patch applied, with consequent changes to simplify preload_libraries > feature in favor of using _PG_init(). Thanks. Ralf S. Engelschall rse@engelschall.com www.engelschall.com