Discussion: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)


More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

From: David Rowley
Date:
Hi,

In [1] I suggested an idea to make improvements to the planner around the Equivalence Class code. Later, in [2], Tom raised concerns that this adds too many planning cycles for a perhaps not common enough situation.  I don't want to discuss that particular patch here; I want to discuss, more generally, the dilemma of adding more smarts to the planner so that it can generate a more optimal plan in order to save on execution time.

In the case of the Equivalence Class Filters code, I quoted an example where pushing these filters down into the joined relation caused a significant performance improvement to a query. Now, I understand Tom's concerns about slowing down the planner: in cases where the query is short-running, or the optimisations don't apply, we could cause the query to perform worse overall (including planning time). Nobody wants that, but on the other hand, if spending 5-10 extra microseconds during planning equates to 6 hours shaved off execution time, then nobody would begrudge those extra 5-10 microseconds of planning.

What I'd like to discuss here is what was touched on in that other thread: ways to get around this problem.

A number of ideas were suggested on the other thread about how we might go about solving this problem. In [3] Simon talked about perhaps enabling extra optimisations when the planner sees that the plan will cost more than some given threshold. That's perhaps an option, but may not work well for optimisations which must take place very early in planning, for example [4].
Another idea which came up was from Evgeniy [5], which was more of a request not to do it this way, but never-the-less, the idea was basically to add lots of GUCs to enable/disable each extra planner feature.

Another option which I've thought about previously is a planner_strength GUC, where various additional optimisations are enabled at predefined strength levels, so that databases which tend to spend a great deal more execution time than planning time can turn this up a bit to see if that helps change the ratio.  This idea is far from perfect though, as who's to say that planner feature X should "kick in" before planner feature Y? I've also often thought that it might be nice if the planner did not modify the Parse object, so that it has the option to throw away what it's done so far and start planning all over again with the "planner_strength" knob turned up to the maximum, if the cost indicates that the query is going to take a long time to execute.
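Purely to illustrate the knob (not the mechanism behind it), here is a minimal, extension-style sketch of what such a GUC declaration might look like; the name, range and meaning of the levels are all hypothetical:

#include "postgres.h"
#include "fmgr.h"
#include "utils/guc.h"

PG_MODULE_MAGIC;

/* 0 = skip all optional optimisations ... 3 = try everything (hypothetical) */
static int planner_strength = 1;

void
_PG_init(void)
{
    DefineCustomIntVariable("planner_strength",
                            "Effort the planner spends on optional optimisations.",
                            NULL,
                            &planner_strength,
                            1,      /* boot value */
                            0,      /* minimum */
                            3,      /* maximum */
                            PGC_USERSET,
                            0,      /* flags */
                            NULL, NULL, NULL);
}

Individual optimisations would then test planner_strength against their assigned level, which is exactly where the "who kicks in first" problem above bites.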

In reality we already have some planner features which are possible candidates for non-essential optimisations. For example, join removals likely don't apply in all that many cases, but when they do, this feature is a great win. So by having some sort of ability to enable/disable planner features, we also stand to actually speed the planner up for fast, simple queries.

I do strongly believe that we need to come up with something to solve this problem. I already summarised my thoughts on the other thread.

I wrote:
> I believe that with parallel query on the horizon for 9.6 that we're now
> aiming to support bigger OLAP-type databases than ever before. So if we
> ignore patches like this one then it appears that we have some conflicting
> goals in the community as it seems that we're willing to add the brawn, but
> we're not willing to add the brain. If this is the case then it's a shame,
> as I think we can have both. So I very much agree on the fact that we must
> find a way to maintain support and high performance of small OLTP databases
> too.

So here I'd very much like to kick off discussion on an acceptable way to solve this problem, in a realistic way which we're all happy with.

Comments are of course welcome.

[1] http://www.postgresql.org/message-id/CAKJS1f9FK_X_5HKcPcSeimy16Owe3EmPmmGsGWLcKkj_rW9s6A@mail.gmail.com
[2] http://www.postgresql.org/message-id/30810.1449335261@sss.pgh.pa.us
[3] http://www.postgresql.org/message-id/CANP8+jLRpRN4ynMsRkOqhYi-Dw5JrODMOt05qejhrAyrsExVmg@mail.gmail.com
[4] http://www.postgresql.org/message-id/CAKJS1f_UZ_MXtpot6EPXsgHSujoUCrKuXYHLH06h072rDXsCzw@mail.gmail.com
[5] http://www.postgresql.org/message-id/2F30BA8B-DAB9-4907-9E4E-102D242566E3@gmail.com

Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

From: Benedikt Grundmann
Date:
On Wed, Dec 30, 2015 at 7:16 AM, David Rowley <david.rowley@2ndquadrant.com> wrote:

A number of ideas were suggested on the other thread about how we might go about solving this problem. In [3] Simon talked about perhaps enabling extra optimisations when the planner sees that the plan will cost more than some given threshold. That's perhaps an option, but may not work well for optimisations which must take place very early in planning, for example [4].

A small tweak on [3] to deal with [4]: if the returned plan cost is quite high (say you estimate minutes or more of execution), you could just restart planning from scratch with all costly planning enabled, because even in the worst case (that is, the additional options don't find a better plan) the total planning cost won't matter much in the grand scheme of things.
 

Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

From: Tomas Vondra
Date:
On 12/30/2015 08:16 AM, David Rowley wrote:
>
> I do strongly believe that we need to come up with something to
> solve this problem. I already summarised my thoughts on the other
> thread.

One approach that I don't see mentioned on any of the threads is plan 
caching, which allows reusing the plan across many query executions, 
hopefully amortizing the planning costs.

I'm sure implementing such caching is non-trivial and there are cases 
where it may not help, but perhaps it's not entirely futile (AFAIK it's 
used by some databases exactly to address the higher planning costs).

I imagine a single GUC enabling or disabling this (possibly not just 
globally but per session, user or database).

We already have some form of plan caching, although only for prepared 
statements within a single session - maybe that could be a good starting 
point? For example what if we only enabled those "expensive" 
optimizations for prepared statements, which are assumed to be executed 
multiple times? Of course, this may not be entirely true (say, PL/pgSQL 
uses prepared statements all the time).

Of course, the annoying consequence of this would be that the planning 
may get somewhat unpredictable - the plan will depend on whether the 
query was planned directly or as a prepared statement, or whether plan 
caching is enabled. However, the same mostly applies to solutions 
proposed in the other threads so far.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

From: Evgeniy Shishkin
Date:
> On 30 Dec 2015, at 10:16, David Rowley <david.rowley@2ndquadrant.com> wrote:
>
> [...]
>
> A number of ideas were suggested on the other thread about how we might go about solving this problem. In [3] Simon talked about perhaps enabling extra optimisations when the planner sees that the plan will cost more than some given threshold. That's perhaps an option, but may not work well for optimisations which must take place very early in planning, for example [4].
> Another idea which came up was from Evgeniy [5], which was more of a request not to do it this way, but never-the-less, the idea was basically to add lots of GUCs to enable/disable each extra planner feature.
>

Well, my idea was to track planning/execution cost in something like pg_stat_statements.
That way we can track actual time, not estimated cost as Simon proposed.

This table could be combined with Tomas's proposal of plan caching.






Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

From: David Rowley
Date:
On 30 December 2015 at 21:12, Benedikt Grundmann <bgrundmann@janestreet.com> wrote:
On Wed, Dec 30, 2015 at 7:16 AM, David Rowley <david.rowley@2ndquadrant.com> wrote:

A number of ideas were suggested on the other thread about how we might go about solving this problem. In [3] Simon talked about perhaps enabling extra optimisations when the planner sees that the plan will cost more than some given threshold. That's perhaps an option, but may not work well for optimisations which must take place very early in planning, for example [4].

A small tweak on [3] to deal with [4]: if the returned plan cost is quite high (say you estimate minutes or more of execution), you could just restart planning from scratch with all costly planning enabled, because even in the worst case (that is, the additional options don't find a better plan) the total planning cost won't matter much in the grand scheme of things.

I do personally quite like this idea. Quite likely the extra logic could be added to the planner() function so that it calls standard_planner() again in the event that the cost exceeds some specified threshold. I think the planner might need a little bit of work before replanning on the same parse is OK, as there are places where the planner makes changes to this object which cause things not to work well during the replan. So I think if we went down this route, then the first steps should be to find alternative ways to do things so that the parse is never edited, and to set a new standard that the parse cannot be changed within the planner any more.
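To make that concrete, here is a minimal sketch of the shape such logic might take, written as a planner_hook. The enable_expensive_planning flag and replan_cost_threshold GUC are made up for illustration, and the Query is copied up front precisely because today's planner scribbles on the tree it is given:

#include "postgres.h"
#include "nodes/plannodes.h"
#include "optimizer/planner.h"

/* hypothetical knobs; nothing like these exists today */
static bool   enable_expensive_planning = false;
static double replan_cost_threshold = 100000.0;

static PlannedStmt *
replanning_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
{
    /* copy first: standard_planner() modifies the Query tree it is given */
    Query       *parse_copy = copyObject(parse);
    PlannedStmt *result;

    enable_expensive_planning = false;
    result = standard_planner(parse, cursorOptions, boundParams);

    if (result->planTree->total_cost > replan_cost_threshold)
    {
        /* looks expensive: replan the pristine copy with the extra smarts on */
        enable_expensive_planning = true;
        result = standard_planner(parse_copy, cursorOptions, boundParams);
    }

    return result;
}

/* installed from an extension's _PG_init() with: planner_hook = replanning_planner; */

Of course the extra copyObject() on every one-shot query is itself a cost, which is why not scribbling on the parse in the first place would be the better long-term fix.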

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

From: David Rowley
Date:
On 31 December 2015 at 01:24, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On 12/30/2015 08:16 AM, David Rowley wrote:

I do strongly believe that we need to come up with something to
solve this problem. I already summarised my thoughts on the other
thread.

One approach that I don't see mentioned on any of the threads is plan caching, which allows reusing the plan across many query executions, hopefully amortizing the planning costs.

I'm sure implementing such caching is non-trivial and there are cases where it may not help, but perhaps it's not entirely futile (AFAIK it's used by some databases exactly to address the higher planning costs).

I imagine a single GUC enabling or disabling this (possibly not just globally but per session, user or database).

We already have some form of plan caching, although only for prepared statements within a single session - maybe that could be a good starting point? For example what if we only enabled those "expensive" optimizations for prepared statements, which are assumed to be executed multiple times? Of course, this may not be entirely true (say, PL/pgSQL uses prepared statements all the time).

Of course, the annoying consequence of this would be that the planning may get somewhat unpredictable - the plan will depend on whether the query was planned directly or as a prepared statement, or whether plan caching is enabled. However, the same mostly applies to solutions proposed in the other threads so far.

Personally I'd like to see automatic plan caching occur in PostgreSQL. There is a bit of a problem with it in regards to a query such as: select * from t where mySkewedColumn = 1; where the value 1 appears 99% of the time. Initially we may plan to seqscan, whereas with other values we'd likely prefer to index scan.

I imagine that my unique joins patch could be expanded to test baserestrictinfos to see if they contain quals with a unique index. This knowledge could later permit plan caching to occur on queries which are safe from having any skew in the results. It might sound rather restrictive, but likely this would cover 99% of UPDATE and DELETE operations in an OLTP workload, and likely a good deal of the SELECTs too.

The safety of caching could be analyzed during planning, and a flag could be set somewhere, perhaps in PlannedStmt, to mark whether the plan is safe to cache. The planner() function could initially hash the query string and check if any cached plan exists with that hash; if not, plan the statement, then check if the "safeToCache" flag is set, and if so, stick that plan into a hash table. Also, plans with no baserestrictinfos could be "safeToCache" too.
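As a rough sketch of that flow (the safeToCache field, the availability of the query text at this point, and all memory-management and invalidation details are assumptions, not existing PostgreSQL API):

#include "postgres.h"
#include "access/hash.h"
#include "nodes/plannodes.h"
#include "optimizer/planner.h"
#include "utils/hsearch.h"

typedef struct CachedPlanEntry
{
    uint32       queryHash;   /* key: hash of the query string */
    PlannedStmt *plan;        /* plan previously judged safe to cache */
} CachedPlanEntry;

static HTAB *plan_cache = NULL;

static PlannedStmt *
caching_planner(const char *queryString, Query *parse,
                int cursorOptions, ParamListInfo boundParams)
{
    uint32           key;
    bool             found;
    CachedPlanEntry *entry;
    PlannedStmt     *plan;

    if (plan_cache == NULL)
    {
        HASHCTL ctl;

        memset(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(uint32);
        ctl.entrysize = sizeof(CachedPlanEntry);
        plan_cache = hash_create("backend plan cache", 128, &ctl,
                                 HASH_ELEM | HASH_BLOBS);
    }

    key = DatumGetUInt32(hash_any((const unsigned char *) queryString,
                                  strlen(queryString)));

    entry = hash_search(plan_cache, &key, HASH_FIND, &found);
    if (found)
        return entry->plan;                 /* skip planning entirely */

    plan = standard_planner(parse, cursorOptions, boundParams);

    if (plan->safeToCache)                  /* hypothetical flag set by the planner */
    {
        entry = hash_search(plan_cache, &key, HASH_ENTER, &found);
        entry->plan = plan;                 /* plan must live in a long-lived context */
    }

    return plan;
}

Invalidation is the other hard part: anything that would invalidate a prepared statement's plan today (DDL, statistics changes) would have to flush or rebuild these entries as well.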


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

From: David Rowley
Date:
On 31 December 2015 at 04:39, Evgeniy Shishkin <itparanoia@gmail.com> wrote:

> On 30 Dec 2015, at 10:16, David Rowley <david.rowley@2ndquadrant.com> wrote:
>
> A number of ideas were suggested on the other thread about how we might go about solving this problem. In [3] Simon talked about perhaps enabling extra optimisations when the planner sees that the plan will cost more than some given threshold. That's perhaps an option, but may not work well for optimisations which must take place very early in planning, for example [4].
> Another idea which came up was from Evgeniy [5], which was more of a request not to do it this way, but never-the-less, the idea was basically to add lots of GUCs to enable/disable each extra planner feature.
>

Well, my idea was to track planning/execution cost in something like pg_stat_statements.
That way we can track actual time, not estimated cost like Simon proposed.


I like this idea also, but I see the use for this as more of a guide which could be used by the DBA to tweak some sort of planner_effort GUC. I don't think the data in pg_stat_statements could be used by the planner directly, as that's an extension rather than core code.
 
 
--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

From: Tomas Vondra
Date:

On 12/30/2015 10:30 PM, David Rowley wrote:
> On 31 December 2015 at 01:24, Tomas Vondra <tomas.vondra@2ndquadrant.com
> <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>
> Personally I'd like to see automatic plan caching occur in
> PostgreSQL. There is a bit of a problem with it in regards to a query
> such as: select * from t where mySkewedColumn = 1; where the value 1
> appears 99% of the time. Initially we may plan to seqscan, where with
> other values we'd likely prefer to index scan. I imagine with my
> unique joins patch, that it could be expanded to test
> baserestrictinfos to see if they contain quals with a unique index.
> This knowledge could later permit plan caching to occur on queries
> which are safe from having any skew in the results. It might sound
> rather restrictive, but likely this would cover 99% of UPDATE and
> DELETE operations in an OLTP workload, and likely a good deal of the
> SELECTs too. The safety of caching could be analyzed during planning,
> and a flag could be set somewhere, perhaps in PlannedStmt to mark if
> the plan is safe to cache. The planner() function could initially
> hash the query string and check if any cached plan exists with that
> hash, if not, plan the statement, and then check if the "safeToCache"
> flag is set, and if so, stick that plan into a hash table. Also plans
> with no baserestrictinfos could be "safeToCache" too.

Yeah, that's what I meant by "non-trivial". I don't know a good approach 
to this problem, or if such a "good" approach even exists, but I'd say 
being able to decide whether to rebuild a plan in such cases is a 
"must-have" feature. Otherwise we could easily loose any gains from the 
more thorough optimization because of poor plans.

In other words, we'd have to come up with a way to decide whether to use 
the same plan as before, or try building another plan (for the same 
query with different parameter values). I can think of two approaches:

(1) Try to measure how "different" are the parameter values used in the new query, compared to the existing plan(s). This probably means difference in terms of probability / frequencies etc.
 

(2) Compute the cost of the existing plan for the new parameters. I.e. don't perform the whole optimization, just the costing for the single plan. If the costs are "close" then use the existing plan.
 

Of course, none of this is trivial and still may fail for some cases.
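For approach (2), the final comparison itself is simple; the hard part is re-running cost estimation for the existing plan shape against the new parameter values. A tiny sketch of just the decision, with a made-up structure and an arbitrary 10% threshold:

#include "postgres.h"
#include "nodes/plannodes.h"

typedef struct CachedPlanInfo
{
    PlannedStmt *plan;           /* previously built plan */
    Cost         original_cost;  /* total cost estimated when it was built */
} CachedPlanInfo;

/* cost_for_new_params must come from re-costing the cached plan's shape */
static bool
cached_plan_still_acceptable(const CachedPlanInfo *info, Cost cost_for_new_params)
{
    return cost_for_new_params <= info->original_cost * 1.10;
}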

I wonder what the other databases do, or if there are papers about this 
topic (I'd guess there are, but I haven't looked too much).

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

From: Noah Misch
Date:
On Wed, Dec 30, 2015 at 08:16:28PM +1300, David Rowley wrote:
> On [1] I suggested an idea to make improvements to the planner around the
> Equivalence Class code. Later in [2] Tom raised concerns with this adding
> too many planning cycles for a perhaps not common enough situation.  I
> don't want to discuss that particular patch here, I want to discuss more
> generally about the dilemma about adding more smarts to the planner to
> allow it to generate a more optimal plan in order to save on execution time.

It's an important topic.

> A number of ideas were suggested on the other thread about how we might go
> about solving this problem. In [3] Simon talked about perhaps enabling
> extra optimisations when the planner sees that the plan will cost more than
> some given threshold. That's perhaps an option, but may not work well for
> optimisations which must take place very early in planning, for example [4].

Yeah.  A slew of steps precede us assembling a notion of total cost.  I bet
most controversial proposed steps will happen in that early period.  We'd need
a rough cost earlier than we get it today, and I don't know what achieving
that would look like.  Simon, did you have specifics in view?

> Another idea which came up was from Evgeniy [5], which was more of a
> request not to do it this way, but never-the-less, the idea was basically
> to add lots of GUCs to enable/disable each extra planner feature.

This and subsequent ideas each envision some kind of optimization step
blacklist.  Suppose it's a bitmap with one bit per optional optimizer step.
Each idea chooses blacklist membership differently, but the planner acts on
the blacklist about the same way.  I paraphrase the ideas in those terms
below, and I offer a couple more.  For this idea, the blacklist is simple:

1. User defines the blacklist fully.

It's essentially handing the hard part back to the user.  While I sympathize
with allergic reactions to this, I like it far better than rejecting
optimizations based on thinking the average case wants them disabled.
work_mem likewise isn't the ideal knob for any user, but it has a simple
implementation and beats "1MB per hash table is okay for everyone."
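For what it's worth, a minimal sketch of the bitmap-of-optional-steps idea, driven by a single user-settable blacklist GUC; the step names, the GUC, and the function called in the usage comment are all hypothetical:

#include "postgres.h"

typedef enum OptionalPlannerStep
{
    OPT_STEP_JOIN_REMOVAL = 1 << 0,
    OPT_STEP_EC_FILTERS   = 1 << 1,   /* e.g. the Equivalence Class Filters patch */
    OPT_STEP_UNIQUE_JOINS = 1 << 2
} OptionalPlannerStep;

/* bitmask assembled by parsing e.g. planner_step_blacklist = 'ec_filters' */
static int planner_step_blacklist = 0;

static inline bool
planner_step_enabled(OptionalPlannerStep step)
{
    return (planner_step_blacklist & step) == 0;
}

/*
 * Usage at the relevant spot in the planner:
 *
 *     if (planner_step_enabled(OPT_STEP_EC_FILTERS))
 *         add_equivalence_class_filters(root, rel);
 */

Ideas (2) through (5) below would then only change how planner_step_blacklist gets populated; the planner-side test stays the same.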

> Another option which I've thought about previously was a planner_strength
> GUC, at which various additional optimisations are enabled at various
> predefined strength levels, so that databases which tend to spend a great
> deal more execution time compared to planning time can turn this up a bit
> to see if that helps change that ratio a bit.  This idea is far from

2. User selects from predefined blacklists like "maximum", "default", etc.

> perfect though, as who's to say that planner feature X should "kick in"
> before planner feature Y? I've also often thought that it might be nice to

Yep.  It will be essentially impossible to build an argument to move a planner
feature from one strength to another.  If the feature's committer has
personally experienced the problem in the field, the optimization will end up
active at default planner strength.

> have it so the planner does not modify the Parse object, so that the
> planner has the option to throw away what it's done so far and start
> planning all over again with the "planner_strength" knob turned up to the
> maximum, if the cost happened to indicate that the query was going to take
> a long time to execute.

3. System first plans with a specific predefined blacklist that omits speculative (low probability, high payoff) steps. If that yields a high-cost plan, it repeats planning with an empty blacklist.
 

I agree that the planner's modification of the Query tree is a substantial
roadblock.  The design needs to work for one-shot plans, and we're unlikely to
recoup the cost of copying every one-shot Query tree.

> So here I'd very much like to kick off discussion on an acceptable way to
> solve this problem, in a realistic way which we're all happy with.

It's bad to add 10us per plan times 3000/s/client, but the same waste once per
minute per client is no cause for concern.  I advise accepting speculative
planner optimizations, enabled by default when similar in cost to comparable
existing steps.  At the same time, I encourage developing automation to cap
the waste when a client elicits millions of planner runs that don't benefit
from a certain optimization.  Specific lines of inquiry:

4. Derive a conservative blacklist for each rewritten Query when first planning it, and use that blacklist for subsequent plans.
 

Some prepared statements use a custom plan every time.  (Constraint exclusion
is one driver of such cases.)  Many facets of a Query's planning problem don't
depend on parameter values, so a particular optimization will apply to all the
custom plans or to none of them.  Let the first plan build a blacklist of
provably-irrelevant optimizations, which the plan cache stores and furnishes
to later runs.  The first plan after an invalidation recomputes the blacklist.

5. Add a facility that profiles a workload to generate blacklist data.

Think along the lines of gcc -fprofile-generate/-fprofile-use.  Start a
profile; run your application load test; the server collects data sufficient
to match each query with its optimal blacklist.  Issue "ALTER USER appuser SET
optimizer_profile = 'profname'" to adopt profile-driven blacklisting.  This is
the automated equivalent of (1).


I recommend starting with (1), full user control over the blacklist via GUC.
Specifically, I recommend one GUC expressing the list rather than one Boolean
GUC per optional step.  The core work of enumerating optional steps and making
the planner obey a blacklist applies toward all five ideas.  A GUC to set that
blacklist is cheap, and it would be worth having as an escape hatch even if we
add sophistication later.

(2) is a dead end for the reason you gave.  (3) has potential, but each of
(3), (4) and (5) is a big project with great uncertainty.  They shouldn't
block introducing optimizer steps.

Thanks,
nm