Re: Plan for relatively simple query seems to be very inefficient

От: Tom Lane
Тема: Re: Plan for relatively simple query seems to be very inefficient
Дата: ,
Msg-id: 5999.1112820690@sss.pgh.pa.us
(см: обсуждение, исходный текст)
Ответ на: Re: Plan for relatively simple query seems to be very inefficient  (Arjen van der Meijden)
Ответы: Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)  (Tom Lane)
Список: pgsql-performance

Скрыть дерево обсуждения

Plan for relatively simple query seems to be very inefficient  (Arjen van der Meijden, )
 Re: Plan for relatively simple query seems to be very inefficient  (Steve Atkins, )
  Re: Plan for relatively simple query seems to be very inefficient  (Arjen van der Meijden, )
 Re: Plan for relatively simple query seems to be very inefficient  (Tom Lane, )
  Re: Plan for relatively simple query seems to be very inefficient  (Arjen van der Meijden, )
   Re: Plan for relatively simple query seems to be very inefficient  (Tom Lane, )
    Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)  (Tom Lane, )
     Re: [HACKERS] Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)  ("Jim C. Nasby", )
      Re: [HACKERS] Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)  (Tom Lane, )
       Re: [HACKERS] Recognizing range constraints (was Re: Plan  (John A Meinel, )
        Re: [HACKERS] Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)  (Tom Lane, )
       Re: [HACKERS] Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)  ("Jim C. Nasby", )
     Re: Recognizing range constraints (was Re: Plan for  (Simon Riggs, )
     Re: Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)  (Bruno Wolff III, )
      Re: Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)  (Tom Lane, )
       Re: Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)  (Mischa, )
        Re: Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)  (Tom Lane, )
         Re: Recognizing range constraints (was Re: Plan for relatively simple query seems to be very inefficient)  (, )
 Re: Plan for relatively simple query seems to be very inefficient  ("Dave Held", )
  Re: Plan for relatively simple query seems to be very inefficient  (Tom Lane, )
  Re: Plan for relatively simple query seems to be very inefficient  (Tom Lane, )
 Re: Plan for relatively simple query seems to be very inefficient  (Mischa, )

Arjen van der Meijden <> writes:
> On 6-4-2005 19:42, Tom Lane wrote:
>> Wrong index ... what you probably could use here is an index on
>> data_main.range, so that the query could run with postcodes as the
>> outer side.  I get such a plan by default with empty tables:
>> but I'm not sure if the planner would prefer it with the tables loaded
>> up.  (It might not be the right thing anyway ... but seems worth
>> trying.)

> No it didn't prefer it.

Planner error ... because it doesn't have any good way to estimate the
number of matching rows, it thinks that way is a bit more expensive than
data_main as the outside, but in reality it seems a good deal cheaper:


arjen=# set enable_seqscan TO 1;
SET
arjen=# explain analyze
arjen-# SELECT COUNT(*) FROM data_main AS dm, postcodes AS p WHERE dm.range BETWEEN p.range_from AND p.range_till;
                                                          QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=332586.85..332586.85 rows=1 width=0) (actual time=143999.678..143999.683 rows=1 loops=1)
   ->  Nested Loop  (cost=3.76..328945.96 rows=1456356 width=0) (actual time=0.211..143549.461 rows=82688 loops=1)
         Join Filter: (("outer".range >= "inner".range_from) AND ("outer".range <= "inner".range_till))
         ->  Seq Scan on data_main dm  (cost=0.00..1262.20 rows=81920 width=2) (actual time=0.059..663.065 rows=81920
loops=1)
         ->  Materialize  (cost=3.76..5.36 rows=160 width=4) (actual time=0.004..0.695 rows=160 loops=81920)
               ->  Seq Scan on postcodes p  (cost=0.00..3.60 rows=160 width=4) (actual time=0.028..1.589 rows=160
loops=1)
 Total runtime: 144000.415 ms
(7 rows)

arjen=# set enable_seqscan TO 0;
SET
arjen=# explain analyze
arjen-# SELECT COUNT(*) FROM data_main AS dm, postcodes AS p WHERE dm.range BETWEEN p.range_from AND p.range_till;
                                                                 QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=100336307.18..100336307.18 rows=1 width=0) (actual time=2367.097..2367.102 rows=1 loops=1)
   ->  Nested Loop  (cost=100000000.00..100332666.28 rows=1456356 width=0) (actual time=0.279..1918.890 rows=82688
loops=1)
         ->  Seq Scan on postcodes p  (cost=100000000.00..100000003.60 rows=160 width=4) (actual time=0.060..1.381
rows=160loops=1) 
         ->  Index Scan using dm_range on data_main dm  (cost=0.00..1942.60 rows=9103 width=2) (actual
time=0.034..7.511rows=517 loops=160) 
               Index Cond: ((dm.range >= "outer".range_from) AND (dm.range <= "outer".range_till))
 Total runtime: 2368.056 ms
(6 rows)

(this machine is slower than yours, plus I have profiling enabled still...)

            regards, tom lane


В списке pgsql-performance по дате сообщения:

От: Christopher Kings-Lynne
Дата:
Сообщение: Re: COPY Hacks (WAS: RE: Postgresql vs SQLserver for this
От: Greg Stark
Дата:
Сообщение: Re: How to improve db performance with $7K?