Обсуждение: factors determining the execution plan

Поиск
Список
Период
Сортировка

factors determining the execution plan

От
Jenny Zhang
Дата:
The osdl-dbt3 test starts with building and vacuuming the database.
The execution plans were taken after the vacuuming.

I did two tests with the same database parameters:
1. run two osdl-dbt3 runs on one box without rebooting the stystem.
Though the execution plans are the same, the costs are different.  The
system status are different for the two runs, for example, some page
cache are not released after the first one.  Does that make the cost
different?

2. run two osdl-dbt3 runs on two boxes.  The hardware of the two boxes
are the same.  And each run starts from scratch (building linux kernel,
pgsql ect, and reboot).  To my surprise, not only the cost are different
between the two runs, the execution plan changed for Query 9.
The execution plans can be found at:
http://khack.osdl.org/stp/277780/results/plan/power_query9.txt
http://khack.osdl.org/stp/277779/results/plan/power_query9.txt

My test leads me to the following questions:
What are the factors which determine the execution plan and cost?
Does PostgreSQL check the system resource dynamically?
If Inserts and Updates happened but vacuuming is not executed, dose the
execution plan change?

Thanks,
--
Jenny Zhang
Open Source Development Lab Inc
12725 SW Millikan Way
Suite 400
Beaverton, OR 97005
(503)626-2455 ext 31


how to get accurate values in pg_statistic

От
Jenny Zhang
Дата:
Hi,

Last week I reported getting different execution plans for the same
query against the same database.  I did further investigation.  After
building the database, I did:
vacuumdb -z DBT3
psql DBT3 -c "analyze supplier"
psql DBT3 -c "analyze part"
psql DBT3 -c "analyze partsupp"
psql DBT3 -c "analyze customer"
psql DBT3 -c "analyze orders"
psql DBT3 -c "analyze lineitem"
psql DBT3 -c "analyze nation"
psql DBT3 -c "analyze region"

Then I check the data in pg_class and pg_statistic and get the execution
plans.

I did this twice (say run_8 and run_9).  I found that while the pg_class
are the same for run_8 and run_9, the pg_statistic output are
different.  I posted the query, the execution plan, pg_class and
pg_statistic output at:
http://www.osdl.org/archive/jenny/

The difference in pg_statistic results in a different execution plan for
query 18.sql.  In fact, I updated the pg_statistic table for run_9 with
the values got from run_8, then it gives me the run_8 execution plan,
and the cost is very close(within 1).  This makes me to believe that the
optimizer picks the plan based on the values in pg_statistic and
pg_class.

But why the pg_statistic value changes each time I build the database?
Did I do something wrong?

Thanks for your help,
Jenny



Re: how to get accurate values in pg_statistic

От
Stephan Szabo
Дата:
On 22 Aug 2003, Jenny Zhang wrote:

> Last week I reported getting different execution plans for the same
> query against the same database.  I did further investigation.  After
> building the database, I did:
> vacuumdb -z DBT3
> psql DBT3 -c "analyze supplier"
> psql DBT3 -c "analyze part"
> psql DBT3 -c "analyze partsupp"
> psql DBT3 -c "analyze customer"
> psql DBT3 -c "analyze orders"
> psql DBT3 -c "analyze lineitem"
> psql DBT3 -c "analyze nation"
> psql DBT3 -c "analyze region"
>
> Then I check the data in pg_class and pg_statistic and get the execution
> plans.
>
> I did this twice (say run_8 and run_9).  I found that while the pg_class
> are the same for run_8 and run_9, the pg_statistic output are
> different.  I posted the query, the execution plan, pg_class and
> pg_statistic output at:
> http://www.osdl.org/archive/jenny/
>
> The difference in pg_statistic results in a different execution plan for
> query 18.sql.  In fact, I updated the pg_statistic table for run_9 with
> the values got from run_8, then it gives me the run_8 execution plan,
> and the cost is very close(within 1).  This makes me to believe that the
> optimizer picks the plan based on the values in pg_statistic and
> pg_class.
>
> But why the pg_statistic value changes each time I build the database?
> Did I do something wrong?

Analyze only does a statistical sampling so it'd be expected that you can
get different results from analyze on the same data.  There are ways to
affect the sample size and some other analyze parameters if you're
interested in look at it (default_statistics_target and ALTER TABLE ALTER
COLUMN SET STATISTICS).