Hash or merge join instead of inner loop

Поиск
Список
Период
Сортировка
От Jim C. Nasby
Тема Hash or merge join instead of inner loop
Дата
Msg-id 20030609204009.GJ40542@flake.decibel.org
обсуждение исходный текст
Ответы Re: Hash or merge join instead of inner loop  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
I have a query that's cauing pgsql choose either a hash or merge join
depending on how I mess with the stats variables, but it won't choose an
nested loop, even though it's the fastest.

The estimate for the nested loop index scans always seems to be way high
on the high end. Note that it's 0-3 in one case and 0-2 in the other,
but the actual time is very low in both cases. Why is this? I haven't
been able to make much of a difference by changing the optimizer
variables.

This is on a solaris machine, if that matters. Tinput_data, locality,
and postal code have 1300, 28000 and 43000 rows, respectively, and
locality and postal code are very narrow tables (full definition below).

usps=# explain analyze   SELECT key, pc.locality_id, l.state_code::varchar FROM Tinput_data i, postal_code pc, locality
lWHERE i.zip = pc.postal_code AND l.locality_id = pc.locality_id; 
                                                                                QUERY PLAN


---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=940.20..1417.94 rows=1380 width=36) (actual time=1727.30..2363.91 rows=1380 loops=1)
   Merge Cond: ("outer".locality_id = "inner".locality_id)
   ->  Index Scan using locality_pkey on locality l  (cost=0.00..455.99 rows=27789 width=10) (actual time=0.62..495.39
rows=27632loops=1) 
   ->  Sort  (cost=940.20..940.55 rows=1380 width=26) (actual time=1725.53..1726.71 rows=1380 loops=1)
         Sort Key: pc.locality_id
         ->  Merge Join  (cost=42.00..933.00 rows=1380 width=26) (actual time=56.27..1684.67 rows=1380 loops=1)
               Merge Cond: ("outer".postal_code = "inner".zip)
               ->  Index Scan using postal_code_postal_code_key on postal_code pc  (cost=0.00..869.31 rows=42704
width=13)(actual time=10.05..1396.11 rows=42418 loops=1) 
               ->  Sort  (cost=42.00..42.34 rows=1380 width=13) (actual time=39.63..40.97 rows=1380 loops=1)
                     Sort Key: i.zip
                     ->  Seq Scan on tinput_data i  (cost=0.00..34.80 rows=1380 width=13) (actual time=0.02..12.13
rows=1380loops=1) 
 Total runtime: 2367.50 msec
(12 rows)

usps=# set enable_mergejoin=0;
SET
usps=# set enable_hashjoin=0;
SET
usps=# explain analyze   SELECT key, pc.locality_id, l.state_code::varchar FROM Tinput_data i, postal_code pc, locality
lWHERE i.zip = pc.postal_code AND l.locality_id = pc.locality_id; 
                                                                        QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..6991.66 rows=1380 width=36) (actual time=0.22..231.00 rows=1380 loops=1)
   ->  Nested Loop  (cost=0.00..4203.23 rows=1380 width=26) (actual time=0.14..132.70 rows=1380 loops=1)
         ->  Seq Scan on tinput_data i  (cost=0.00..34.80 rows=1380 width=13) (actual time=0.02..17.41 rows=1380
loops=1)
         ->  Index Scan using postal_code_postal_code_key on postal_code pc  (cost=0.00..3.01 rows=1 width=13) (actual
time=0.06..0.06rows=1 loops=1380) 
               Index Cond: ("outer".zip = pc.postal_code)
   ->  Index Scan using locality_pkey on locality l  (cost=0.00..2.01 rows=1 width=10) (actual time=0.05..0.05 rows=1
loops=1380)
         Index Cond: (l.locality_id = "outer".locality_id)
 Total runtime: 233.60 msec
(8 rows)

            Table "pg_temp_1.tinput_data"
      Column      |         Type          | Modifiers
------------------+-----------------------+-----------
 key              | integer               | not null
 firm             | character varying(40) |
 address          | integer               |
 address_v        | character varying(10) |
 odd_even         | character(1)          |
 street_name      | character varying(40) |
 street_metaphone | character varying(4)  |
 apartment        | integer               |
 apartment_v      | character varying(10) |
 apartment_label  | character varying(5)  |
 city             | character varying(40) |
 city_metaphone   | character varying(4)  |
 state            | character varying(40) |
 zip              | character varying(5)  |
Indexes: tinput_data_pkey primary key btree ("key")

usps=# \d postal_code
                                            Table "public.postal_code"
     Column     |         Type          |                                Modifiers
----------------+-----------------------+-------------------------------------------------------------------------
 postal_code_id | integer               | not null default nextval('public.postal_code_postal_code_id_seq'::text)
 postal_code    | character varying(10) | not null
 locality_id    | integer               | not null
Indexes: postal_code_pkey primary key btree (postal_code_id),
         postal_code_postal_code_key unique btree (postal_code)

usps=# \d locality
                                         Table "public.locality"
   Column    |         Type          |                             Modifiers
-------------+-----------------------+-------------------------------------------------------------------
 locality_id | integer               | not null default nextval('public.locality_locality_id_seq'::text)
 locality    | character varying(10) | not null
 state_code  | character(2)          | not null
Indexes: locality_pkey primary key btree (locality_id)
Foreign Key constraints: $1 FOREIGN KEY (state_code) REFERENCES state(state_code) ON UPDATE NO ACTION ON DELETE NO
ACTION

--
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

В списке pgsql-performance по дате отправления:

Предыдущее
От: Yusuf
Дата:
Сообщение: [Fwd: Re: Enabling and disabling run time configuration parameters.]
Следующее
От: Howard Oblowitz
Дата:
Сообщение: FW: [ADMIN] Shared_buffers and kernel parameters, tuning