Re: Plan invalidation design

Поиск
Список
Период
Сортировка
От Russell Smith
Тема Re: Plan invalidation design
Дата
Msg-id 45D8483C.2090000@pws.com.au
обсуждение исходный текст
Ответ на Plan invalidation design  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Plan invalidation design
Список pgsql-hackers
Tom Lane wrote:
> I'm starting to think about the long-wanted plan invalidation mechanism.
> Here's a sketch --- anyone see any problems?
>
> * Create a new module, say src/backend/utils/cache/plancache.c, that we
> will put in charge of all long-lived plans --- or at least those cached by
> PREPARE, plpgsql, and RI triggers.  I'm unsure whether we should make all
> SPI plans work this way or not; it's possible that doing so would change
> SPI's API behavior enough to break user-written code.  Any thoughts on
> that?
>
> * plancache.c will have two basic functions:
>
> 1. Given a query's raw parse tree (that is, the raw output of gram.y),
> analyze and plan the query.  Store both the parse tree and plan in a
> backend-local cache table, and return a handle for the table entry as well
> as the plan tree.
>
> 2. Given a handle for a previously stored query, check to see if the plan
> is still up to date; if not, regenerate it from the raw parse tree (note
> this could result in failure, eg if a column used by the query has been
> dropped).  Then return the plan tree.
>   
What do we do in the case of failure?  Die in the same way we do now 
when you can't use the plan that's been made?
> We probably want to return a direct pointer to the cached plan tree
> instead of making a copy.  This should be safe, because the executor now
> treats plan trees as read-only, but it does mean that when plan
> invalidation occurs the cached plan tree might still be in use.  We'll
> probably need to have a notion of a reference count: so the two functions
> above would increment the plan's refcount and there would be a third
> "ReleasePlanCache" function to call when done using a plan (and, hence,
> these references would need to be supported by the ResourceManager
> mechanism).
>   
excuse my ignorance here, but under what circumstances is a plan in use 
for a single backend at the same time as it's invalidated.
What potential failures does this introduce?  If you are using the old 
plan, and the new plan fails as mentioned above.  Where are we then?
> Note that the source object for caching is a raw parse tree.  This should
> work since we already require that gram.y not look into the database
> during its processing; therefore, the raw tree need never be invalidated.
> It'd be conceptually simpler if we passed in a query string instead, but
> I don't think that works for PREPARE, because it might be embedded in a
> multi-command string.  (We do probably want to pass in the original query
> string too, if available, because it's needed for syntax error reporting.)
> nodes/copyfuncs.c will need some expansion, as I don't believe it has
> coverage for all raw-parse-tree node types.
>   
If the syntax has become invalid, that is because the columns in the 
query, or tables have changed.  Is this information not available in the 
plan tree?
What other circumstances could you have a syntax error from a query that 
has been successfully planned and parsed?
I've read this paragraph 3 times now and am still quite unclear about 
the requirements for the original query to be stored.  Is the plan cache 
going to replace the syntax check which I thought would have been done 
in gram.y. 
> Invalidation will be detected by having plancache.c watch for relcache
> invalidation events, using the existing inval.c callback mechanism.
> On any relcache inval, traverse the plan cache looking for plans that
> mention the invalidated relation in their rangetables, and mark them as
> needing to be regenerated before next use.  (If they currently have
> refcount zero, we could delete the plan part of the cache entry
> immediately.)
>
> Relcache inval casts a fairly wide net; for example, adding or dropping an
> index will invalidate all plans using the index's table whether or not
> they used that particular index, and I believe that VACUUM will also
> result in a relcache inval due to updating the table's pg_class row.
> I think this is a good thing though --- for instance, after adding an
> index it seems a good idea to replan to see if the new index is useful,
> and replanning after a VACUUM is useful if the table has changed size
> enough to warrant a different plan.  OTOH this might mean that plans on a
> high-update-traffic table never survive very long because of autovacuum's
> efforts.  If that proves to be a problem in practice we can look at ways
> to dial down the number of replans, but for the moment I think it's more
> important to be sure we *can* replan at need than to find ways to avoid
> replans.
>
> Note that I'm currently intending to detect only relcache invals, not
> changes to functions or operators used in the plan.  (Relcache inval will
> cover view redefinitions, though.)  We could extend it to handle that
> later, but it looks like a lot more mechanism and overhead for not a lot
> of gain.  AFAICS there are only three cases where there'd be a benefit:
> * if you redefine an immutable function, any places where its result has
>   been pre-computed by constant-folding wouldn't get updated without inval.
>   
If you replan and immutable function, aren't you possibly messing up a 
functional index that is using the old function.  Hey, if you change an 
immutable function that has an index, you are in trouble already.
So the implication of a immutable function change are much more wide 
ranging that plan invalidation problems.

> * if you have a SQL function that's been inlined into a plan, a change
>   in the function definition wouldn't get reflected into the plan without
>   inval.
> * if you alter a function and change its volatility property, that might
>   possibly affect the shape of plans that use the function (for instance
>   some optimization transformation might now be allowed or not).
>   
Replanning pl/pgsql with CREATE TEMP TABLE would be a good use here.  
You loose the preplanning benefits, but we remove the ongoing problem 
where people report that their temp-table isn't working.

Even function alterations to pl/pgsql should a replan.  But of more 
interest is being able to use the old function for currently running 
transactions when the function is changed.  Last time I tried to edit a 
pl/pgsql function while it was being used by a transaction, the 
transaction failed because the function definition changed.  I'm not 
100% sure of the use case here as I'm writing this email at too late an 
hour.
> To my memory none of these problems have been complained of from the
> field.  Making the cache module able to detect function-related
> invalidations would be a bit of work --- for example, if a function has
> been inlined, there is no recognizable reference to it at all in the plan
> tree, so we'd have to modify the planner to track such things and report
> them somehow.  (The corresponding problem for views doesn't exist, because
> there is still a rangetable entry for a view after it's been expanded.)
> So I think this is a "maybe do someday" part, not something to do in the
> first release.
>
> One interesting point is that to avoid race conditions, the function that
> checks for is-plan-update-required will have to acquire locks on the
> tables mentioned in the plan before it can be sure there's not a pending
> invalidation event on them.  This doesn't seem like a problem, though it
> might mean we want to refactor the executor API a bit to avoid duplicate
> effort.
>   
Is the race condition here any more likely to happen than the failure of 
a re plan when something has changed from underneath the original query?
My very brief thought gives me the impression that they are the same 
thing, however they may not be.
> Comments?
>   
Again, as a person who has only a limited understand of the code, I'm 
happy to be wrong about anything I have written.

Regards

Russell Smith


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Gregory Stark
Дата:
Сообщение: Short varlena header bit-packing options
Следующее
От: "Pavel Stehule"
Дата:
Сообщение: Re: return varchar from C function