Discussion: Fix memleaks and error handling in jsonb_plpython
Unfortunately, contrib/jsonb_plpython still contains a lot of problems in error handling that can lead to memory leaks:
- not all Python function calls are checked for success
- PG exceptions are not caught in all places where Python references need to be released
But it seems that these errors can happen only in the OOM case.

Attached is a patch with the fix. A back-patch to PG11 is needed.

-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments
On Fri, Mar 01, 2019 at 05:24:39AM +0300, Nikita Glukhov wrote:
> Unfortunately, contrib/jsonb_plpython still contain a lot of problems in error
> handling that can lead to memory leaks:
> - not all Python function calls are checked for the success
> - not in all places PG exceptions are caught to release Python references
> But it seems that this errors can happen only in OOM case.
>
> Attached patch with the fix. Back-patch for PG11 is needed.

That looks right to me. Here are some comments.

One thing to be really careful of when using PG_TRY/PG_CATCH blocks is that variables modified in the try block and then referenced in the catch block need to be marked as volatile. If you don't do that, the value when reaching the catch part is indeterminate. With your patch, the result variable used in two places of PLyObject_FromJsonbContainer() is not marked as volatile. Similarly, it seems that "items" in PLyMapping_ToJsonbValue() and "seq" in PLySequence_ToJsonbValue() should be volatile because they get changed in the try block and referenced afterwards.

Another issue: in ltree_plpython we don't check the return state of PyList_SetItem(), which we should complain about, I think.

-- 
Michael
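PG_TRY/PG_CATCH is built on sigsetjmp/longjmp, so the volatile requirement Michael describes is really the C setjmp rule: a local variable modified after setjmp() and read after longjmp() has an indeterminate value unless it is volatile. A minimal standalone sketch of the pattern, with plain setjmp() standing in for PG_TRY and a hypothetical can_fail() standing in for a function that can elog(ERROR):

```c
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf env;

/* Stands in for a PostgreSQL call that can throw via elog(ERROR) */
static void
can_fail(void)
{
    longjmp(env, 1);
}

/* Returns 1 if the "catch" arm saw the pointer assigned in the "try" arm */
int
try_catch_demo(void)
{
    /*
     * The pointer itself must be volatile: it is modified after setjmp()
     * and read after longjmp().  Without the qualifier the compiler may
     * cache it in a register, and the catch arm could see a stale value.
     */
    void *volatile allocated = NULL;

    if (setjmp(env) == 0)       /* PG_TRY */
    {
        allocated = malloc(16);
        can_fail();             /* "throws": control jumps to the else arm */
    }
    else                        /* PG_CATCH */
    {
        int         saw_it = (allocated != NULL);

        free(allocated);        /* release, as Py_XDECREF would */
        return saw_it;
    }
    return 0;                   /* not reached in this demo */
}
```

Calling try_catch_demo() returns 1: the catch arm reliably observes the assignment made in the try arm because the pointer is volatile-qualified.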
Attachments
On 05.03.2019 6:45, Michael Paquier wrote:
> On Fri, Mar 01, 2019 at 05:24:39AM +0300, Nikita Glukhov wrote:
>> Unfortunately, contrib/jsonb_plpython still contain a lot of problems in
>> error handling that can lead to memory leaks:
>> - not all Python function calls are checked for the success
>> - not in all places PG exceptions are caught to release Python references
>> But it seems that this errors can happen only in OOM case.
>>
>> Attached patch with the fix. Back-patch for PG11 is needed.
>
> That looks right to me. Here are some comments. One thing to be really
> careful of when using PG_TRY/PG_CATCH blocks is that variables modified in
> the try block and then referenced in the catch block need to be marked as
> volatile. If you don't do that, the value when reaching the catch part is
> indeterminate. With your patch the result variable used in two places of
> PLyObject_FromJsonbContainer() is not marked as volatile. Similarly, it
> seems that "items" in PLyMapping_ToJsonbValue() and "seq" in
> PLySequence_ToJsonbValue() should be volatile because they get changed in
> the try block, and referenced afterwards.
I know about these volatility issues, but maybe I incorrectly understand what should be marked as volatile for pointer variables: the pointer itself and/or the memory referenced by it. I thought that only the pointer needs to be marked, and there is also a message [1] clearly describing what needs to be marked. Previously in PLyMapping_ToJsonbValue() the whole contents of the PyObject was marked as volatile, not the pointer itself, which is not modified in PG_TRY:
-	/* We need it volatile, since we use it after longjmp */
-	volatile PyObject *items_v = NULL;
So I removed the volatile qualifier here. The variable 'result' is also not modified in PG_TRY, so it is also non-volatile. I marked only the 'key' variable in PLyObject_FromJsonbContainer() as volatile, because it really is modified in the loop inside PG_TRY(), and the PLyObject_FromJsonbValue(&v2) call after its assignment can throw a PG exception:

+	PyObject   *volatile key = NULL;
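The distinction drawn here, qualifying the pointer versus the pointee, comes straight from C declarator syntax. A small standalone sketch of the two declarations (no PostgreSQL or Python involved):

```c
/*
 * "volatile T *p": pointer to volatile data -- the pointee is volatile,
 * but p itself may live in a register and be clobbered by longjmp().
 * This is what the removed items_v declaration said.
 *
 * "T *volatile p": volatile pointer -- p itself must be re-read from
 * memory, so its latest value survives a longjmp().  This matches
 * "PyObject *volatile key" in the patch.
 */
int
pointer_volatility_demo(void)
{
    int         x = 1;
    int         y = 2;
    volatile int *pointee_volatile = &x;    /* like: volatile PyObject *items_v */
    int *volatile pointer_volatile = &y;    /* like: PyObject *volatile key */

    return *pointee_volatile + *pointer_volatile;
}
```

Only the second form gives the guarantee PG_TRY/PG_CATCH needs, since it is the pointer variable itself whose value must be current after the longjmp back into the catch block.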
Also, I have an idea to introduce a global list of Python objects that need to be dereferenced in PG_CATCH inside PLy_exec_function() in the case of an exception. Then typical code would look like this:

	PyObject   *list = PLy_RegisterObject(PyList_New());

	if (!list)
		return NULL;

	... code that can throw a PG exception, PG_TRY/PG_CATCH is not needed ...

	return PLy_UnregisterObject(list);	/* returns list */
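PLy_RegisterObject() and PLy_UnregisterObject() do not exist in plpython; they are the hypothetical API proposed above. As a rough standalone model of the idea, here is a linked list of registered pointers plus one cleanup hook that releases whatever is still registered, with plain malloc/free standing in for Python objects and Py_DECREF:

```c
#include <stdlib.h>

/* Hypothetical sketch of the proposed registration list */
typedef struct RegNode
{
    void       *obj;
    struct RegNode *next;
} RegNode;

static RegNode *reg_head = NULL;

/* Register an object for cleanup-on-error; returns it for chaining */
static void *
reg_register(void *obj)
{
    if (obj != NULL)
    {
        RegNode    *n = malloc(sizeof(RegNode));

        n->obj = obj;
        n->next = reg_head;
        reg_head = n;
    }
    return obj;
}

/* Hand ownership back to the caller on the success path */
static void *
reg_unregister(void *obj)
{
    RegNode   **p;

    for (p = &reg_head; *p != NULL; p = &(*p)->next)
    {
        if ((*p)->obj == obj)
        {
            RegNode    *dead = *p;

            *p = dead->next;
            free(dead);
            break;
        }
    }
    return obj;
}

/* The PG_CATCH counterpart: release everything still registered */
static int
reg_cleanup(void (*release) (void *))
{
    int         n = 0;

    while (reg_head != NULL)
    {
        RegNode    *dead = reg_head;

        reg_head = dead->next;
        release(dead->obj);
        free(dead);
        n++;
    }
    return n;
}

/* Two objects registered, one handed back; an "error" then cleans up one */
int
leaked_on_error(void)
{
    void       *a = reg_register(malloc(8));
    void       *b = reg_register(malloc(8));

    free(reg_unregister(a));    /* success path: caller took ownership */
    (void) b;                   /* simulate an error before unregistering b */
    return reg_cleanup(free);   /* catch path releases only b */
}
```

The appeal of the scheme is visible even in this toy: the error path needs a single cleanup call instead of a PG_TRY/PG_CATCH around every allocation.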
> Another issue: in ltree_plpython we don't check the return state of
> PyList_SetItem(), which we should complain about I think.
Yes, PyList_SetItem() and PyString_FromStringAndSize() should be checked, but CPython's PyList_SetItem() really should not fail because list storage is preallocated:
int
PyList_SetItem(PyObject *op, Py_ssize_t i, PyObject *newitem)
{
    PyObject **p;
    if (!PyList_Check(op)) {
        Py_XDECREF(newitem);
        PyErr_BadInternalCall();
        return -1;
    }
    if (!valid_index(i, Py_SIZE(op))) {
        Py_XDECREF(newitem);
        PyErr_SetString(PyExc_IndexError,
                        "list assignment index out of range");
        return -1;
    }
    p = ((PyListObject *) op)->ob_item + i;
    Py_XSETREF(*p, newitem);
    return 0;
}
[1] https://www.postgresql.org/message-id/31436.1483415248%40sss.pgh.pa.us
On Tue, Mar 05, 2019 at 02:10:01PM +0300, Nikita Glukhov wrote:
> I known about this volatility issues, but maybe I incorrectly understand what
> should be marked as volatile for pointer variables: the pointer itself and/or
> the memory referenced by it. I thought that only pointer needs to be marked,
> and also there is message [1] clearly describing what needs to be marked.

Yeah, sorry for bringing some confusion.

> Previously in PLyMapping_ToJsonbValue() the whole contents of PyObject was
> marked as volatile, not the pointer itself which is not modified in PG_TRY:
>
> -	/* We need it volatile, since we use it after longjmp */
> -	volatile PyObject *items_v = NULL;
>
> So, I removed volatile qualifier here.

Okay, this one looks correct to me. Well, the whole variable has been removed.

> Variable 'result' is also not modified in PG_TRY, it is also non-volatile.

Fine here as well.

> I marked only 'key' variable in PLyObject_FromJsonbContainer() as volatile,
> because it is really modified in the loop inside PG_TRY(), and
> PLyObject_FromJsonbValue(&v2) call after its assignment can throw PG
> exception:
>
> +	PyObject *volatile key = NULL;

One thing that you are missing here is that key can become NULL when reaching the catch block, so Py_XDECREF() should be called on it only when the value is not NULL. And actually, looking closer, you don't need to have that volatile variable at all, no? Why not just declare it as a PyObject in the while loop? Also here, key and val can be NULL, so we had better only call Py_XDECREF() when they are not. On top of that, potential errors from PyDict_SetItem() should not be simply ignored, so the loop should only break when the key or the value is NULL, but not when PyDict_SetItem() has a problem.

> Also I have idea to introduce a global list of Python objects that need to be
> dereferenced in PG_CATCH inside PLy_exec_function() in the case of exception.
> Then typical code will be look like that:

Perhaps we could do that, but let's not juggle with the code more than necessary for a bug fix.

> Yes, PyList_SetItem() and PyString_FromStringAndSize() should be checked,
> but CPython's PyList_SetItem() really should not fail because list storage
> is preallocated:

Hm. We could add an elog() here for safety, I think. That's not a big deal either.

Another thing is that you cannot just return within a try block with what is added in PLyObject_FromJsonbContainer, or the error stack is not reset properly. So they should be replaced by breaks.

-- 
Michael
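The point about return inside a try block is easiest to see in a stripped-down model: PG_TRY pushes a jump target onto an error-context stack and PG_END_TRY pops it, so an early return skips the pop and leaves a dangling entry. A hypothetical standalone sketch (not the real elog.h machinery):

```c
#include <setjmp.h>

/* Model of the PG_TRY error-context stack: PG_TRY pushes, PG_END_TRY pops */
static jmp_buf *except_stack[8];
static int  except_depth = 0;

/* Correct pattern: fall (or break) out so control reaches the pop */
int
balanced_try(void)
{
    jmp_buf     buf;

    except_stack[except_depth++] = &buf;    /* PG_TRY */
    if (setjmp(buf) == 0)
    {
        /* body; falling through (a "break" from a loop) reaches the pop */
    }
    except_depth--;                         /* PG_END_TRY */
    return except_depth;                    /* 0: stack is balanced again */
}

/* Buggy pattern: returning inside the try block skips the pop */
int
leaky_try(void)
{
    jmp_buf     buf;

    except_stack[except_depth++] = &buf;    /* PG_TRY */
    if (setjmp(buf) == 0)
        return except_depth;                /* 1: entry left dangling */
    except_depth--;                         /* never reached */
    return 0;
}
```

After leaky_try() the stack still holds a pointer to a jmp_buf whose stack frame is gone, so the next "exception" would longjmp into garbage. That is why the returns in PLyObject_FromJsonbContainer had to become breaks.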
Attachments
On Wed, Mar 06, 2019 at 11:04:23AM +0900, Michael Paquier wrote:
> Another thing is that you cannot just return within a try block with
> what is added in PLyObject_FromJsonbContainer, or the error stack is
> not reset properly. So they should be replaced by breaks.

So, I have been poking at this stuff, and I am finishing with the attached. The origin of the issue comes from PLyObject_ToJsonbValue() and PLyObject_FromJsonbValue(), which could run into problems when working with PyObjects they may allocate. So this has resulted in more refactoring of the code than I expected at first.

I also decided not to keep the additional errors which had been added in the previous version of the patch. From my understanding of the code, these cannot actually happen, so replacing them with assertions is enough in my opinion. While on it, I also noticed that hstore_plpython does not actually need a volatile pointer for plpython_to_hstore().

Also, as all those problems are really unlikely to happen in real-life cases, improving this code only on HEAD looks enough to me.

-- 
Michael
Attachments
Michael Paquier <michael@paquier.xyz> writes:
> On Wed, Mar 06, 2019 at 11:04:23AM +0900, Michael Paquier wrote:
>> Another thing is that you cannot just return within a try block with
>> what is added in PLyObject_FromJsonbContainer, or the error stack is
>> not reset properly. So they should be replaced by breaks.

> So, I have been poking at this stuff, and I am finishing with the
> attached.

This patch had bit-rotted due to somebody else fooling with the volatile-qualifiers situation. I fixed it up, tweaked a couple of things, and pushed it.

> Also, as all those
> problems are really unlikely going to happen in real-life cases,
> improving this code only on HEAD looks enough to me.

Yeah, I concur.

			regards, tom lane
On Sat, Apr 06, 2019 at 05:56:24PM -0400, Tom Lane wrote:
> This patch had bit-rotted due to somebody else fooling with the
> volatile-qualifiers situation. I fixed it up, tweaked a couple of
> things, and pushed it.

Thanks, Tom!

-- 
Michael