Обсуждение: How can I make PostgreSQL to select another quey plan?

Поиск
Список
Период
Сортировка

How can I make PostgreSQL to select another quey plan?

От
Yangyang
Дата:
HI, dear all
I'm trying to force the database to select another query plan for a
query, rather than the optimal one.
I changed query tuning related parameters, but the query plan stays
the same. Although the costs for nodes changed correspondingly,
What parameters can I modify to force the DB to select another plan?
Any hints will be helpful.
Thanks!
-Yangyang

Re: How can I make PostgreSQL to select another quey plan?

От
Sean Davis
Дата:


On Wed, Nov 30, 2011 at 10:51 AM, Yangyang <yangyangbm@gmail.com> wrote:
HI, dear all
I'm trying to force the database to select another query plan for a
query, rather than the optimal one.
I changed query tuning related parameters, but the query plan stays
the same. Although the costs for nodes changed correspondingly,
What parameters can I modify to force the DB to select another plan?
Any hints will be helpful.

What is the database structure, query, and plan?  What do you want to have changed?  

Sean 

Re: How can I make PostgreSQL to select another quey plan?

От
Yangyang
Дата:
Thank you so much for the reply, Sean.
1. I created tables and populated data with DBT-3.
2. I'm using TPC-H Q4. I may also test more TPC-H queries.
3. The plan is like:

Sort  (cost=1219522013.16..1219522013.17 rows=1 width=16)
   Sort Key: orders.o_orderpriority
   ->  HashAggregate  (cost=1219521913.15..1219522013.15 rows=1 width=16)
         ->  Seq Scan on orders  (cost=0.00..1219521627.24 rows=57181 width=16)
               Filter: ((o_orderdate >= '1993-07-01'::date) AND
(o_orderdate < '1993-10-01 00:00:00'::timestamp without time zone) AND
(subplan))
               SubPlan
                 ->  Index Scan using lineitem_pkey on lineitem
(cost=0.00..28502.87 rows=93 width=112)
                       Index Cond: (l_orderkey = $0)
                       Filter: (l_commitdate < l_receiptdate)
(9 rows)

4. My question is, is it possible for me to modify some parameters,
therefore the query plan is changed? I have tried some in the
postgres.conf file, but no matter which one I change, the query plan
is the same.
5. My assumption is, if one type of resource is treated more valuable,
it would affect how the DB evaluate costs and thus may choose a
different query plan.

Thank you so much for your time and help.
-Yangyang
On Wed, Nov 30, 2011 at 10:57 AM, Sean Davis <sdavis2@mail.nih.gov> wrote:
>
>
> On Wed, Nov 30, 2011 at 10:51 AM, Yangyang <yangyangbm@gmail.com> wrote:
>>
>> HI, dear all
>> I'm trying to force the database to select another query plan for a
>> query, rather than the optimal one.
>> I changed query tuning related parameters, but the query plan stays
>> the same. Although the costs for nodes changed correspondingly,
>> What parameters can I modify to force the DB to select another plan?
>> Any hints will be helpful.
>
>
> What is the database structure, query, and plan?  What do you want to have
> changed?
>
> Sean

Re: How can I make PostgreSQL to select another quey plan?

От
Tom Lane
Дата:
Yangyang <yangyangbm@gmail.com> writes:
> I'm trying to force the database to select another query plan for a
> query, rather than the optimal one.
> I changed query tuning related parameters, but the query plan stays
> the same. Although the costs for nodes changed correspondingly,
> What parameters can I modify to force the DB to select another plan?

There are the enable_xxx flags, or if the problem is a bad join order
you could reduce the join_collapse_limit setting to 1 and use JOIN
syntax to specify the join order.  If you'd like it to not use a
particular index you could transiently drop that index:

    begin;
    drop index ...;
    explain ...;
    rollback;

I recently rewrote the manual's discussion of using explain, so
you might care to read this:
http://developer.postgresql.org/pgdocs/postgres/using-explain.html
although it does refer to a couple of features that don't exist in
production releases yet.

            regards, tom lane

Re: How can I make PostgreSQL to select another quey plan?

От
Yangyang
Дата:
Thanks, Tom. This helps.
If I don't change the database structures (like index), and only
modify "Planner cost constants", is it possible to make DB change the
query plan? I've tried, but looks like only chaning these constants
don't affect the query plan. Or maybe I'm wrong.
I appreciate your time and help.
-Yangyang

On Wed, Nov 30, 2011 at 11:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yangyang <yangyangbm@gmail.com> writes:
>> I'm trying to force the database to select another query plan for a
>> query, rather than the optimal one.
>> I changed query tuning related parameters, but the query plan stays
>> the same. Although the costs for nodes changed correspondingly,
>> What parameters can I modify to force the DB to select another plan?
>
> There are the enable_xxx flags, or if the problem is a bad join order
> you could reduce the join_collapse_limit setting to 1 and use JOIN
> syntax to specify the join order.  If you'd like it to not use a
> particular index you could transiently drop that index:
>
>        begin;
>        drop index ...;
>        explain ...;
>        rollback;
>
> I recently rewrote the manual's discussion of using explain, so
> you might care to read this:
> http://developer.postgresql.org/pgdocs/postgres/using-explain.html
> although it does refer to a couple of features that don't exist in
> production releases yet.
>
>                        regards, tom lane

Re: How can I make PostgreSQL to select another quey plan?

От
Tom Lane
Дата:
Yangyang <yangyangbm@gmail.com> writes:
> If I don't change the database structures (like index), and only
> modify "Planner cost constants", is it possible to make DB change the
> query plan?

Yes, but usually these don't have huge impacts on estimated costs,
so you will only be able to get it to switch to plans that (it thinks)
are relatively close in cost anyway.

A bigger question is what sort of plan you think would be better, and
whether the planner is even capable of building that plan from the
given query.  You didn't show the particular query you're concerned
about, but I gather from the EXPLAIN output that it involves
sub-selects, and those can act as optimization fences ...

            regards, tom lane

Re: How can I make PostgreSQL to select another quey plan?

От
Yangyang
Дата:
Thank you for the advice, Tom.
My proposal is like this:
Suppose for the same query, Plan A consumes 10 units of I/O and 50
units of CPU. Plan B consumes 20 units of I/O and 40 units of CPU.
If A has less total cost than B, A will be selected as best plan.
If the database has less CPU available, I would prefer it to select
Plan B, which consumes less units of CPU.
I tried to limit the CPU to DB and modify cpu_tupe_cost, but no good.
I'm wondering if I need a particular query to do so.
The query I used is :
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date('1998-12-01') - interval '90 days'
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;
Hope this time I made it clear.
I apologize for any inconvenience caused by my post.

On Wed, Nov 30, 2011 at 11:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yangyang <yangyangbm@gmail.com> writes:
>> If I don't change the database structures (like index), and only
>> modify "Planner cost constants", is it possible to make DB change the
>> query plan?
>
> Yes, but usually these don't have huge impacts on estimated costs,
> so you will only be able to get it to switch to plans that (it thinks)
> are relatively close in cost anyway.
>
> A bigger question is what sort of plan you think would be better, and
> whether the planner is even capable of building that plan from the
> given query.  You didn't show the particular query you're concerned
> about, but I gather from the EXPLAIN output that it involves
> sub-selects, and those can act as optimization fences ...
>
>                        regards, tom lane

Re: How can I make PostgreSQL to select another quey plan?

От
Tom Lane
Дата:
Yangyang <yangyangbm@gmail.com> writes:
> My proposal is like this:
> Suppose for the same query, Plan A consumes 10 units of I/O and 50
> units of CPU. Plan B consumes 20 units of I/O and 40 units of CPU.
> If A has less total cost than B, A will be selected as best plan.
> If the database has less CPU available, I would prefer it to select
> Plan B, which consumes less units of CPU.

No doubt, but the question I was asking is whether you have a clear idea
of what Plan B is and whether the Postgres query planner can even
generate that plan from your query.

> The query I used is :
> select
> l_returnflag,
> l_linestatus,
> sum(l_quantity) as sum_qty,
> sum(l_extendedprice) as sum_base_price,
> sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
> sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
> avg(l_quantity) as avg_qty,
> avg(l_extendedprice) as avg_price,
> avg(l_discount) as avg_disc,
> count(*) as count_order
> from
> lineitem
> where
> l_shipdate <= date('1998-12-01') - interval '90 days'
> group by
> l_returnflag,
> l_linestatus
> order by
> l_returnflag,
> l_linestatus;

Hmm ... after looking up the TPC-H spec, that seems to correspond to Q1,
whereas earlier you said you were using Q4.  Since your plan shows a
sub-select, and Q1 contains no sub-select but Q4 does, I'm going to
suppose you meant Q4.

Now the next question is why you're getting a subplan at all.  I'd
expect Postgres 8.4 and up to recognize Q4's WHERE EXISTS(SELECT ...)
as a semi-join, rather than implementing it as an unoptimized sub-select
which is what your plan is showing.  If I'm right in deducing that you
are testing an ancient version of Postgres, then the first step to
improvement is to get onto a modern version.  The unoptimized sub-select
framework does not provide very much room at all for alternative plans
--- it's basically always going to amount to a nestloop join.  Which is
probably why you're not getting anywhere by fooling with the planner
cost settings.

            regards, tom lane