Обсуждение: Same database, different query plans

Поиск
Список
Период
Сортировка

Same database, different query plans

От
WireSpot
Дата:
I have the exact same database on two different systems, both using
8.0.3. I've installed the database from the same dump. Here's the
query I'm trying on each:

EXPLAIN ANALYZE SELECT
answers.*,i18n.field1,i18n.field2,i18n.field3,i18n.field4,i18n.field5
FROM answers LEFT OUTER JOIN i18n on answers.i18n_id = i18n.id and
i18n.languages_id = 2 WHERE question_id = 2938 ORDER BY display_order;

Here's the result from one of them:

-------------------------------------------
 Sort  (cost=30.46..30.47 rows=2 width=125) (actual time=0.110..0.111
rows=1 loops=1)
   Sort Key: answers.display_order
   ->  Nested Loop Left Join  (cost=0.00..30.45 rows=2 width=125)
(actual time=0.081..0.088 rows=1 loops=1)
         ->  Index Scan using question_id_answers_key on answers
(cost=0.00..9.07 rows=2 width=38) (actual time=0.042..0.047 rows=1
loops=1)
               Index Cond: (question_id = 2938)
         ->  Index Scan using i18n_pkey on i18n  (cost=0.00..10.68
rows=1 width=91) (actual time=0.019..0.019 rows=0 loops=1)
               Index Cond: ("outer".i18n_id = i18n.id)
               Filter: (languages_id = 2)
 Total runtime: 0.306 ms
(9 rows)
-------------------------------------------

And the other:

-------------------------------------------
 Sort  (cost=1025.08..1025.14 rows=22 width=223) (actual
time=397.154..397.155 rows=1 loops=1)
   Sort Key: answers.display_order
   ->  Merge Left Join  (cost=1023.34..1024.59 rows=22 width=223)
(actual time=396.695..396.700 rows=1 loops=1)
         Merge Cond: ("outer".i18n_id = "inner".id)
         ->  Sort  (cost=71.81..71.86 rows=22 width=63) (actual
time=0.346..0.349 rows=1 loops=1)
               Sort Key: answers.i18n_id
               ->  Index Scan using question_id_answers_key on answers
 (cost=0.00..71.31 rows=22 width=63) (actual time=0.320..0.327 rows=1
loops=1)
                     Index Cond: (question_id = 2938)
         ->  Sort  (cost=951.53..952.00 rows=187 width=164) (actual
time=375.092..385.246 rows=5651 loops=1)
               Sort Key: i18n.id
               ->  Seq Scan on i18n  (cost=0.00..944.48 rows=187
width=164) (actual time=0.127..132.919 rows=10940 loops=1)
                     Filter: (languages_id = 2)
 Total runtime: 398.751 ms
(13 rows)
-------------------------------------------

The actual SELECT results (ie. non EXPLAIN) are identical in both
cases. The indexes and so on are identical. I've done a reindexing and
vacuuming on both of them just to be sure.

As you can see, there's quite a bit of a difference between 0.3 ms and
398 ms, and it shows. I haven't touched the query planning options.
Why the different planning and what can I do to fix the misguided one?

Re: Same database, different query plans

От
Michael Fuhr
Дата:
On Thu, Aug 18, 2005 at 12:03:59PM +0300, WireSpot wrote:
> The actual SELECT results (ie. non EXPLAIN) are identical in both
> cases. The indexes and so on are identical. I've done a reindexing and
> vacuuming on both of them just to be sure.
>
> As you can see, there's quite a bit of a difference between 0.3 ms and
> 398 ms, and it shows. I haven't touched the query planning options.
> Why the different planning and what can I do to fix the misguided one?

Have you run ANALYZE or VACUUM ANALYZE in both databases to update
the planner's statistics?  If you have and get the same results,
then it might be interesting to see the output of the following on
both systems:

SET enable_mergejoin TO off;
SET enable_nestloop TO on;
EXPLAIN ANALYZE SELECT ...

SET enable_mergejoin TO on;
SET enable_nestloop TO off;
EXPLAIN ANALYZE SELECT ...

--
Michael Fuhr

Re: Same database, different query plans

От
Дата:
I think I can answer this cause I recently had a similar problem. There is a voodoo setting in postgresql called
"analyzetarget". It controls how much statistic information is kept per table. This information affects the query
planner.If it makes a bad guess based on insufficient statistics data, it will absolutely kill performance (BTW, the
documentationnever makes it explicit). Increase default_analyze_target (I think that's what it's called, look up the
docs)at least tenfold, restart postgresql, and run analyze again. 

BTW, the default postgresql settings are WAY too conservative. I am now looking into tuning and there are a lot of
thingsthat need to be turned up. 

hope this helps,

Eugene


>
> From: Michael Fuhr <mike@fuhr.org>
> Date: 2005/08/18 Thu AM 10:05:14 EST
> To: WireSpot <wirespot@gmail.com>
> CC: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Same database, different query plans
>
> On Thu, Aug 18, 2005 at 12:03:59PM +0300, WireSpot wrote:
> > The actual SELECT results (ie. non EXPLAIN) are identical in both
> > cases. The indexes and so on are identical. I've done a reindexing and
> > vacuuming on both of them just to be sure.
> >
> > As you can see, there's quite a bit of a difference between 0.3 ms and
> > 398 ms, and it shows. I haven't touched the query planning options.
> > Why the different planning and what can I do to fix the misguided one?
>
> Have you run ANALYZE or VACUUM ANALYZE in both databases to update
> the planner's statistics?  If you have and get the same results,
> then it might be interesting to see the output of the following on
> both systems:
>
> SET enable_mergejoin TO off;
> SET enable_nestloop TO on;
> EXPLAIN ANALYZE SELECT ...
>
> SET enable_mergejoin TO on;
> SET enable_nestloop TO off;
> EXPLAIN ANALYZE SELECT ...
>
> --
> Michael Fuhr
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>


Re: Same database, different query plans

От
Steve Crawford
Дата:
On Thursday 18 August 2005 10:16 am, eugene1@sympatico.ca wrote:
> BTW, the default postgresql settings are WAY too conservative.

The archives are full of discussion on this issue. PG runs on several
operating systems and is installed on machines ranging from slow
single-cpu multi-service machines with well under 100MB RAM to fast
multi-cpu machines chock-full of memory and fast drive arrays
dedicated to PG only.

The default configuration allows you to get PG running out-of-the box
on almost any machine. The end-user is responsible for adjusting the
settings to match the particulars the system on which it is
installed.

> I am now looking into tuning and there are a lot of things that need
> to be turned up.

Google on "postgresql performance tuning" and you will find lots of
useful articles on the first page of returns. Chapter 16 of the
manual describes the various settings (but is not terribly helpful in
recommending the appropriate setting). The stats collector can
provide you with useful input for determining settings.

When you run out of ideas from those sources, subscribe to the
"Performance" mailing list.

Cheers,
Steve