Re: BUG #19026: ResourceOwnerForget can't find owner for invalid plancache

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BUG #19026: ResourceOwnerForget can't find owner for invalid plancache
Дата
Msg-id 1117844.1755704498@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: BUG #19026: ResourceOwnerForget can't find owner for invalid plancache  (Dilip Kumar <dilipbalaut@gmail.com>)
Ответы Re: BUG #19026: ResourceOwnerForget can't find owner for invalid plancache
Список pgsql-bugs
Dilip Kumar <dilipbalaut@gmail.com> writes:
> That's a valid point.  Thanks.  I think we can add the same test case
> to what is given in this example, maybe in plpgsql.sql or maybe some
> other file which is more relevant.  So we can continue with the v1
> patch only you have sent, maybe you can add a one liner comment to it.

I do not like either of these proposed patches.  Dilip's suffers from
an extremely myopic idea of which error reports could trigger the
problem, while Kirill's is abusing (IMV) the purpose of an error
context callback.  Those exist to add detail to an error report, not
to clean up state outside the error system, and elog.c doesn't exactly
guarantee that they will be invoked.

The reason this broke at 0313c5dc6 is that that enabled
SQLFunctionCaches to be re-used for the life of the associated
FmgrInfo, and when we are talking about an opclass support procedure,
that FmgrInfo is in the relcache so it is likely to last for the
life of the session.  So the presented test case causes us to error
out of execution of the SQL function during the first INSERT, but
its SQLFunctionCache still exists and has fcache->cplan != NULL,
even though error cleanup would've released the reference count
already.  When we come back to this point in the second INSERT,
init_execution_state is fooled into trying to release the
already-released cplan.

In practice, fcache->cplan will never be not-null after successful
completion of a SQL function, so one idea is to simply clear it
unconditionally as soon as we know we're starting a fresh execution,
more or less as in alternative-1 attached.  However that leaves me
a bit unsatisfied, because it doesn't protect against the case of
erroring out of a set-returning function: if we come in and see
eslist != NULL, we'll pick right back up attempting to execute
plans that probably aren't there anymore.  I think that that case
is unreachable today because we don't allow any opclass support
functions to be SRFs, and AFAIK there are no other cases where an
FmgrInfo would be re-used after a failed query.  Still, I'm inclined
to go with something more like alternative-2, which feels a little
more future-proof.

            regards, tom lane

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 359aafea681..08cf9dce2af 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -612,7 +612,14 @@ init_sql_fcache(FunctionCallInfo fcinfo, bool lazyEvalOK)
     fcache->lazyEvalOK = lazyEvalOK;
     fcache->lazyEval = false;

-    /* Also reset data about where we are in the function. */
+    /*
+     * Also reset data about where we are in the function.  Notice we just
+     * clear cplan without doing ReleaseCachedPlan.  The only way that cplan
+     * could be non-NULL here is if we errored out of a previous execution of
+     * this SQLFunctionCache, in which case error abort would have released
+     * the plan reference, so we mustn't do so again.
+     */
+    fcache->cplan = NULL;
     fcache->eslist = NULL;
     fcache->next_query_index = 0;
     fcache->error_query_index = 0;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 359aafea681..92a7be09dd3 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -143,6 +143,7 @@ typedef struct SQLFunctionCache
 {
     SQLFunctionHashEntry *func; /* associated SQLFunctionHashEntry */

+    bool        active;            /* are we executing this cache entry? */
     bool        lazyEvalOK;        /* true if lazyEval is safe */
     bool        shutdown_reg;    /* true if registered shutdown callback */
     bool        lazyEval;        /* true if using lazyEval for result query */
@@ -556,6 +557,22 @@ init_sql_fcache(FunctionCallInfo fcinfo, bool lazyEvalOK)
         finfo->fn_extra = fcache;
     }

+    /*
+     * If the SQLFunctionCache is marked as active, we must have errored out
+     * of a prior execution.  Reset state.  (It might seem that we could also
+     * reach this during recursive invocation of a SQL function, but we won't
+     * because that case won't involve re-use of the same FmgrInfo.)
+     *
+     * In particular, we must clear fcache->cplan without doing
+     * ReleaseCachedPlan, because error cleanup from the prior execution would
+     * have taken care of releasing that plan.
+     */
+    if (fcache->active)
+    {
+        fcache->cplan = NULL;
+        fcache->eslist = NULL;
+    }
+
     /*
      * If we are resuming execution of a set-returning function, just keep
      * using the same cache.  We do not ask funccache.c to re-validate the
@@ -1597,6 +1614,9 @@ fmgr_sql(PG_FUNCTION_ARGS)
      */
     fcache = init_sql_fcache(fcinfo, lazyEvalOK);

+    /* Mark fcache as active */
+    fcache->active = true;
+
     /* Remember info that we might need later to construct tuplestore */
     fcache->tscontext = tscontext;
     fcache->randomAccess = randomAccess;
@@ -1853,6 +1873,9 @@ fmgr_sql(PG_FUNCTION_ARGS)
     if (es == NULL)
         fcache->eslist = NULL;

+    /* Mark fcache as inactive */
+    fcache->active = false;
+
     error_context_stack = sqlerrcontext.previous;

     return result;

В списке pgsql-bugs по дате отправления: