Обсуждение: PL/Python initialization cleanup

Поиск

Список

Период

Сортировка

PL/Python initialization cleanup

От

Peter Eisentraut

Дата:

31 декабря 2025 г., 11:47:32

As I was working through steps to make PL/Python more thread-safe, I 
noticed that the initialization code of PL/Python is pretty messy.  I 
think some of this has grown while both Python 2 and 3 were supported, 
because they required different initialization steps, and we had some 
defenses against accidentally running both at the same time.  But that 
is over, and right now a lot of this doesn't make sense anymore.  For 
example, the function PLy_init_interp() said "Initialize the Python 
interpreter ..." but it didn't actually do this, and PLy_init_plpy() 
said "initialize plpy module" but it didn't do that either (or at least 
they used the term "initialize" in non-standard ways).

Here are some patches to clean this up.  After this change, all the 
global initialization is called directly from _PG_init(), and the plpy 
module initialization is all called from its registered initialization 
function PyInit_plpy().  (For the thread-safety job, the plpy module 
initialization will need to be rewritten using a different API.  That's 
why I'm keen to have it clearly separated.)  I also tried to add more 
comments and make existing comments more precise.  There was also some 
apparently obsolete or redundant code that could be deleted.

Surely, all of this will need some more rounds of careful scrutiny, but 
I think the overall code arrangement is correct and an improvement.

Вложения

Re: PL/Python initialization cleanup

От

Chao Li

Дата:

01 января, 02:34:53


> On Dec 31, 2025, at 16:47, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> As I was working through steps to make PL/Python more thread-safe, I noticed that the initialization code of
PL/Pythonis pretty messy.  I think some of this has grown while both Python 2 and 3 were supported, because they
requireddifferent initialization steps, and we had some defenses against accidentally running both at the same time.
Butthat is over, and right now a lot of this doesn't make sense anymore.  For example, the function PLy_init_interp()
said"Initialize the Python interpreter ..." but it didn't actually do this, and PLy_init_plpy() said "initialize plpy
module"but it didn't do that either (or at least they used the term "initialize" in non-standard ways). 
>
> Here are some patches to clean this up.  After this change, all the global initialization is called directly from
_PG_init(),and the plpy module initialization is all called from its registered initialization function PyInit_plpy().
(Forthe thread-safety job, the plpy module initialization will need to be rewritten using a different API.  That's why
I'mkeen to have it clearly separated.)  I also tried to add more comments and make existing comments more precise.
Therewas also some apparently obsolete or redundant code that could be deleted. 
>
> Surely, all of this will need some more rounds of careful scrutiny, but I think the overall code arrangement is
correctand an improvement. 
>
<v1-0001-plpython-Remove-commented-out-code.patch><v1-0002-plpython-Clean-up-PyModule_AddObject-uses.patch><v1-0003-plpython-Remove-duplicate-PyModule_Create.patch><v1-0004-plpython-Streamline-initialization.patch>

I just did an eyeball review. Overall looks good to me. The cleanup, as explained in the patch email, makes sense to
me.Only a nit comment on 0002: 

1 - 0002
```
+    if (PyModule_AddObject(mod, modname, exc) < 0)
+    {
+        Py_XDECREF(exc);
+        PLy_elog(ERROR, "could not add exceptions %s", name);
+    }
```

Plural “exceptions” is a little confusing. What about “could not add exception object”?

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: PL/Python initialization cleanup

От

"Matheus Alcantara"

Дата:

05 января, 18:08:40

On Wed Dec 31, 2025 at 5:47 AM -03, Peter Eisentraut wrote:
> As I was working through steps to make PL/Python more thread-safe, I
> noticed that the initialization code of PL/Python is pretty messy.  I
> think some of this has grown while both Python 2 and 3 were supported,
> because they required different initialization steps, and we had some
> defenses against accidentally running both at the same time.  But that
> is over, and right now a lot of this doesn't make sense anymore.  For
> example, the function PLy_init_interp() said "Initialize the Python
> interpreter ..." but it didn't actually do this, and PLy_init_plpy()
> said "initialize plpy module" but it didn't do that either (or at least
> they used the term "initialize" in non-standard ways).
>
> Here are some patches to clean this up.  After this change, all the
> global initialization is called directly from _PG_init(), and the plpy
> module initialization is all called from its registered initialization
> function PyInit_plpy().  (For the thread-safety job, the plpy module
> initialization will need to be rewritten using a different API.  That's
> why I'm keen to have it clearly separated.)  I also tried to add more
> comments and make existing comments more precise.  There was also some
> apparently obsolete or redundant code that could be deleted.
>
> Surely, all of this will need some more rounds of careful scrutiny, but
> I think the overall code arrangement is correct and an improvement.

0001, 0003 and 0004 looks good to me, just a small comment on 0002:

-    /*
-     * PyModule_AddObject does not add a refcount to the object, for some odd
-     * reason; we must do that.
-     */
-    Py_INCREF(exc);
-    PyModule_AddObject(mod, modname, exc);
-
     /*
      * The caller will also store a pointer to the exception object in some
-     * permanent variable, so add another ref to account for that.  This is
-     * probably excessively paranoid, but let's be sure.
+     * permanent variable, so add another ref to account for that.
      */
     Py_INCREF(exc);

The later code comment say "so add another ref to account for that", but
you've removed the previous Py_INCREF() call. The returned object
PyErr_NewException() already have a refcount increased for usage? If
that's not the case I think that the "add another ref..." don't seems
correct because IIUC we are increasing the ref count for the first time,
so there is no "another" refcount being increased.

--
Matheus Alcantara
EDB: https://www.enterprisedb.com

Re: PL/Python initialization cleanup

От

Peter Eisentraut

Дата:

12 января, 15:24:50

On 05.01.26 16:08, Matheus Alcantara wrote:
> On Wed Dec 31, 2025 at 5:47 AM -03, Peter Eisentraut wrote:
>> As I was working through steps to make PL/Python more thread-safe, I
>> noticed that the initialization code of PL/Python is pretty messy.  I
>> think some of this has grown while both Python 2 and 3 were supported,
>> because they required different initialization steps, and we had some
>> defenses against accidentally running both at the same time.  But that
>> is over, and right now a lot of this doesn't make sense anymore.  For
>> example, the function PLy_init_interp() said "Initialize the Python
>> interpreter ..." but it didn't actually do this, and PLy_init_plpy()
>> said "initialize plpy module" but it didn't do that either (or at least
>> they used the term "initialize" in non-standard ways).
>>
>> Here are some patches to clean this up.  After this change, all the
>> global initialization is called directly from _PG_init(), and the plpy
>> module initialization is all called from its registered initialization
>> function PyInit_plpy().  (For the thread-safety job, the plpy module
>> initialization will need to be rewritten using a different API.  That's
>> why I'm keen to have it clearly separated.)  I also tried to add more
>> comments and make existing comments more precise.  There was also some
>> apparently obsolete or redundant code that could be deleted.
>>
>> Surely, all of this will need some more rounds of careful scrutiny, but
>> I think the overall code arrangement is correct and an improvement.
> 
> 0001, 0003 and 0004 looks good to me, just a small comment on 0002:
> 
> -    /*
> -     * PyModule_AddObject does not add a refcount to the object, for some odd
> -     * reason; we must do that.
> -     */
> -    Py_INCREF(exc);
> -    PyModule_AddObject(mod, modname, exc);
> -
>       /*
>        * The caller will also store a pointer to the exception object in some
> -     * permanent variable, so add another ref to account for that.  This is
> -     * probably excessively paranoid, but let's be sure.
> +     * permanent variable, so add another ref to account for that.
>        */
>       Py_INCREF(exc);
> 
> The later code comment say "so add another ref to account for that", but
> you've removed the previous Py_INCREF() call. The returned object
> PyErr_NewException() already have a refcount increased for usage? If
> that's not the case I think that the "add another ref..." don't seems
> correct because IIUC we are increasing the ref count for the first time,
> so there is no "another" refcount being increased.

The reference created by PyErr_NewException() is "stolen" by 
PyModule_AddObject(), so we need to create another one for returning the 
object from the function and storing it in the permanent variable.  I 
have updated the comment in this new patch version.  But I think the 
actual code is correct.

Вложения

Re: PL/Python initialization cleanup

От

Peter Eisentraut

Дата:

12 января, 15:25:32

On 01.01.26 00:34, Chao Li wrote:
> 
> 
>> On Dec 31, 2025, at 16:47, Peter Eisentraut <peter@eisentraut.org> wrote:
>>
>> As I was working through steps to make PL/Python more thread-safe, I noticed that the initialization code of
PL/Pythonis pretty messy.  I think some of this has grown while both Python 2 and 3 were supported, because they
requireddifferent initialization steps, and we had some defenses against accidentally running both at the same time.
Butthat is over, and right now a lot of this doesn't make sense anymore.  For example, the function PLy_init_interp()
said"Initialize the Python interpreter ..." but it didn't actually do this, and PLy_init_plpy() said "initialize plpy
module"but it didn't do that either (or at least they used the term "initialize" in non-standard ways).
 
>>
>> Here are some patches to clean this up.  After this change, all the global initialization is called directly from
_PG_init(),and the plpy module initialization is all called from its registered initialization function PyInit_plpy().
(Forthe thread-safety job, the plpy module initialization will need to be rewritten using a different API.  That's why
I'mkeen to have it clearly separated.)  I also tried to add more comments and make existing comments more precise.
Therewas also some apparently obsolete or redundant code that could be deleted.
 
>>
>> Surely, all of this will need some more rounds of careful scrutiny, but I think the overall code arrangement is
correctand an improvement.
 
>>
<v1-0001-plpython-Remove-commented-out-code.patch><v1-0002-plpython-Clean-up-PyModule_AddObject-uses.patch><v1-0003-plpython-Remove-duplicate-PyModule_Create.patch><v1-0004-plpython-Streamline-initialization.patch>
> 
> I just did an eyeball review. Overall looks good to me. The cleanup, as explained in the patch email, makes sense to
me.Only a nit comment on 0002:
 
> 
> 1 - 0002
> ```
> +    if (PyModule_AddObject(mod, modname, exc) < 0)
> +    {
> +        Py_XDECREF(exc);
> +        PLy_elog(ERROR, "could not add exceptions %s", name);
> +    }
> ```
> 
> Plural “exceptions” is a little confusing. What about “could not add exception object”?

Thanks, I have fixed this in the v2 patch (sent in a separate message).

Re: PL/Python initialization cleanup

От

Matheus Alcantara

Дата:

12 января, 21:07:21

On 12/01/26 09:24, Peter Eisentraut wrote:
>> 0001, 0003 and 0004 looks good to me, just a small comment on 0002:
>>
>> -    /*
>> -     * PyModule_AddObject does not add a refcount to the object, 
>> for some odd
>> -     * reason; we must do that.
>> -     */
>> -    Py_INCREF(exc);
>> -    PyModule_AddObject(mod, modname, exc);
>> -
>>       /*
>>        * The caller will also store a pointer to the exception 
>> object in some
>> -     * permanent variable, so add another ref to account for that.  
>> This is
>> -     * probably excessively paranoid, but let's be sure.
>> +     * permanent variable, so add another ref to account for that.
>>        */
>>       Py_INCREF(exc);
>>
>> The later code comment say "so add another ref to account for that", 
>> but
>> you've removed the previous Py_INCREF() call. The returned object
>> PyErr_NewException() already have a refcount increased for usage? If
>> that's not the case I think that the "add another ref..." don't seems
>> correct because IIUC we are increasing the ref count for the first 
>> time,
>> so there is no "another" refcount being increased.
> 
> The reference created by PyErr_NewException() is "stolen" by 
> PyModule_AddObject(), so we need to create another one for returning 
> the object from the function and storing it in the permanent 
> variable.  I have updated the comment in this new patch version.  But 
> I think the actual code is correct.

Thanks, it looks more clear IMHO now. Agree that the code is correct.


--
Matheus Alcantara
EDB: https://www.enterprisedb.com

RE: PL/Python initialization cleanup

От

li carol

Дата:

13 января, 09:15:54

> On 12/01/26 09:24, Peter Eisentraut wrote:
> >> 0001, 0003 and 0004 looks good to me, just a small comment on 0002:
> >>
> >> -    /*
> >> -     * PyModule_AddObject does not add a refcount to the object, for
> >> some odd
> >> -     * reason; we must do that.
> >> -     */
> >> -    Py_INCREF(exc);
> >> -    PyModule_AddObject(mod, modname, exc);
> >> -
> >>       /*
> >>        * The caller will also store a pointer to the exception object
> >> in some
> >> -     * permanent variable, so add another ref to account for that.
> >> This is
> >> -     * probably excessively paranoid, but let's be sure.
> >> +     * permanent variable, so add another ref to account for that.
> >>        */
> >>       Py_INCREF(exc);
> >>
> >> The later code comment say "so add another ref to account for that",
> >> but you've removed the previous Py_INCREF() call. The returned object
> >> PyErr_NewException() already have a refcount increased for usage? If
> >> that's not the case I think that the "add another ref..." don't seems
> >> correct because IIUC we are increasing the ref count for the first
> >> time, so there is no "another" refcount being increased.
> >
> > The reference created by PyErr_NewException() is "stolen" by
> > PyModule_AddObject(), so we need to create another one for returning
> > the object from the function and storing it in the permanent variable.
> > I have updated the comment in this new patch version.  But I think the
> > actual code is correct.
> 
> Thanks, it looks more clear IMHO now. Agree that the code is correct.
> 
> 
> --
> Matheus Alcantara
> EDB: https://www.enterprisedb.com

Hi Peter,
I have applied and reviewed the v2 patch set. The cleanup of the initialization flow is a very good improvement and
makesthe logic much easier to follow.
 
In particular, the updated comments in PLy_create_exception regarding the reference counting logic are very helpful and
clarifythe previous ambiguity.
 
I have one minor nitpick in plpy_main.c regarding consistency. In _PG_init(), the import and dictionary insertion of
the"plpy" module is currently split into two checks with the same error message. To make it more consistent with how
the__main__ module is handled earlier in the same function, we could potentially streamline it like this:
 
    /*
     * Import plpy.
     */
    plpy_mod = PyImport_ImportModule("plpy");
    if (plpy_mod == NULL || PyDict_SetItemString(main_dict, "plpy", plpy_mod) < 0)
        PLy_elog(ERROR, "could not import \"plpy\" module");

This is just a small suggestion for style consistency; the existing logic in v2 is perfectly correct.

Regards,
Yuan Li(carol)

Re: PL/Python initialization cleanup

От

Kirill Reshke

Дата:

15 января, 00:03:07

On Mon, 12 Jan 2026 at 17:25, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 01.01.26 00:34, Chao Li wrote:
> >
> >
> >> On Dec 31, 2025, at 16:47, Peter Eisentraut <peter@eisentraut.org> wrote:
> >>
> >> As I was working through steps to make PL/Python more thread-safe, I noticed that the initialization code of
PL/Pythonis pretty messy.  I think some of this has grown while both Python 2 and 3 were supported, because they
requireddifferent initialization steps, and we had some defenses against accidentally running both at the same time.
Butthat is over, and right now a lot of this doesn't make sense anymore.  For example, the function PLy_init_interp()
said"Initialize the Python interpreter ..." but it didn't actually do this, and PLy_init_plpy() said "initialize plpy
module"but it didn't do that either (or at least they used the term "initialize" in non-standard ways). 
> >>
> >> Here are some patches to clean this up.  After this change, all the global initialization is called directly from
_PG_init(),and the plpy module initialization is all called from its registered initialization function PyInit_plpy().
(Forthe thread-safety job, the plpy module initialization will need to be rewritten using a different API.  That's why
I'mkeen to have it clearly separated.)  I also tried to add more comments and make existing comments more precise.
Therewas also some apparently obsolete or redundant code that could be deleted. 
> >>
> >> Surely, all of this will need some more rounds of careful scrutiny, but I think the overall code arrangement is
correctand an improvement. 
> >>
<v1-0001-plpython-Remove-commented-out-code.patch><v1-0002-plpython-Clean-up-PyModule_AddObject-uses.patch><v1-0003-plpython-Remove-duplicate-PyModule_Create.patch><v1-0004-plpython-Streamline-initialization.patch>
> >
> > I just did an eyeball review. Overall looks good to me. The cleanup, as explained in the patch email, makes sense
tome. Only a nit comment on 0002: 
> >
> > 1 - 0002
> > ```
> > +     if (PyModule_AddObject(mod, modname, exc) < 0)
> > +     {
> > +             Py_XDECREF(exc);
> > +             PLy_elog(ERROR, "could not add exceptions %s", name);
> > +     }
> > ```
> >
> > Plural “exceptions” is a little confusing. What about “could not add exception object”?
>
> Thanks, I have fixed this in the v2 patch (sent in a separate message).
>


hi!

0001, 0002, 0003, are ready, LGTM.


For 0004, do we need main_dict at all? it is only used inside _PG_init
and then its value assigned to PLy_interp_globals...

--
Best regards,
Kirill Reshke

Re: PL/Python initialization cleanup

От

Peter Eisentraut

Дата:

15 января, 14:21:56

On 13.01.26 07:15, li carol wrote:
> I have applied and reviewed the v2 patch set. The cleanup of the initialization flow is a very good improvement and
makesthe logic much easier to follow.
 
> In particular, the updated comments in PLy_create_exception regarding the reference counting logic are very helpful
andclarify the previous ambiguity.
 
> I have one minor nitpick in plpy_main.c regarding consistency. In _PG_init(), the import and dictionary insertion of
the"plpy" module is currently split into two checks with the same error message. To make it more consistent with how
the__main__ module is handled earlier in the same function, we could potentially streamline it like this:
 
>     /*
>      * Import plpy.
>      */
>     plpy_mod = PyImport_ImportModule("plpy");
>     if (plpy_mod == NULL || PyDict_SetItemString(main_dict, "plpy", plpy_mod) < 0)
>         PLy_elog(ERROR, "could not import \"plpy\" module");
> 
> This is just a small suggestion for style consistency; the existing logic in v2 is perfectly correct.

Thank you for this suggestion.  I agree that having the same error 
message for two different situations was not good.  I changed the error 
checking of PyDict_SetItemString() like you suggested here, but also 
gave it a separate PLy_elog() call (without a message, since it's pretty 
much "can't happen").

I committed the patches with this change.

Re: PL/Python initialization cleanup

От

Peter Eisentraut

Дата:

15 января, 14:24:51

On 14.01.26 22:03, Kirill Reshke wrote:
> For 0004, do we need main_dict at all? it is only used inside _PG_init
> and then its value assigned to PLy_interp_globals...

This was merely a stylistic, defensive choice.  I prefer to keep 
main_dict in a local variable while it is being assembled, and only put 
it into a global variable once it's done.  This reduces the chance of 
accidentally having a partially constructed object globally visible. 
Another point is that main_dict is a borrowed reference, but when we put 
it into the global variable we need to add another reference.  Having 
two separate variables makes this step more explicit.

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: PL/Python initialization cleanup

Вложения

Вложения