No hash join across partitioned tables?

От: Kris Jurka
Тема: No hash join across partitioned tables?
Дата: ,
Msg-id: Pine.BSO.4.64.0904161836540.11937@leary.csoft.net
(см: обсуждение, исходный текст)
Ответы: Re: No hash join across partitioned tables?  (Tom Lane)
Список: pgsql-performance

Скрыть дерево обсуждения

No hash join across partitioned tables?  (Kris Jurka, )
 Re: No hash join across partitioned tables?  (Tom Lane, )
  Re: No hash join across partitioned tables?  (Kris Jurka, )
   Re: No hash join across partitioned tables?  (Kris Jurka, )
    Re: No hash join across partitioned tables?  (Tom Lane, )
     Re: No hash join across partitioned tables?  (Kris Jurka, )
    Re: No hash join across partitioned tables?  (Tom Lane, )
     Re: No hash join across partitioned tables?  (Bruce Momjian, )
      Re: No hash join across partitioned tables?  (Tom Lane, )
       Re: No hash join across partitioned tables?  (Robert Haas, )
        Re: No hash join across partitioned tables?  (Tom Lane, )
         Re: No hash join across partitioned tables?  (Grzegorz Jaśkiewicz, )
         Re: No hash join across partitioned tables?  (Robert Haas, )
          Re: No hash join across partitioned tables?  (Tom Lane, )
           Re: No hash join across partitioned tables?  (Bruce Momjian, )
            Re: No hash join across partitioned tables?  (Tom Lane, )
             Re: No hash join across partitioned tables?  (Bruce Momjian, )
              Re: No hash join across partitioned tables?  (Tom Lane, )
               Re: No hash join across partitioned tables?  (Robert Haas, )
                Re: No hash join across partitioned tables?  (Bruce Momjian, )
          Re: No hash join across partitioned tables?  (Alvaro Herrera, )
           Re: No hash join across partitioned tables?  (Tom Lane, )
            Re: No hash join across partitioned tables?  (Samuel Gendler, )
             Re: No hash join across partitioned tables?  (Alvaro Herrera, )
              Re: No hash join across partitioned tables?  (Samuel Gendler, )
               Re: No hash join across partitioned tables?  (Alvaro Herrera, )
            Re: No hash join across partitioned tables?  (Robert Haas, )
 Re: No hash join across partitioned tables?  (Tom Lane, )
  Re: No hash join across partitioned tables?  (Kris Jurka, )

PG (8.3.7) doesn't seem to want to do a hash join across two partitioned
tables.  I have two partition hierarchies: impounds (with different
impound sources) and liens (with vehicle liens from different companies).
Trying to match those up gives:

EXPLAIN SELECT COUNT(*)
FROM impounds i
     JOIN liens l ON (i.vin = l.vin);

  Aggregate  (cost=11164042.66..11164042.67 rows=1 width=0)
    ->  Nested Loop  (cost=0.27..3420012.94 rows=3097611886 width=0)
          Join Filter: ((i.vin)::text = (l.vin)::text)
          ->  Append  (cost=0.00..1072.77 rows=33577 width=21)
                ->  Seq Scan on impounds i  (cost=0.00..11.40 rows=140 width=21)
                ->  Seq Scan on impounds_s1 i  (cost=0.00..926.87 rows=29587 width=18)
                ->  Seq Scan on impounds_s2 i  (cost=0.00..99.96 rows=3296 width=18)
                ->  Seq Scan on impounds_s3 i  (cost=0.00..23.14 rows=414 width=18)
                ->  Seq Scan on impounds_s4 i  (cost=0.00..11.40 rows=140 width=21)
          ->  Append  (cost=0.27..101.64 rows=15 width=21)
                ->  Bitmap Heap Scan on liens l  (cost=0.27..5.60 rows=2 width=21)
                      Recheck Cond: ((l.vin)::text = (i.vin)::text)
                      ->  Bitmap Index Scan on liens_pk  (cost=0.00..0.27 rows=2 width=0)
                            Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using liens_s1_pk on liens_s1 l  (cost=0.00..7.02 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using liens_s2_pk on liens_s2 l  (cost=0.00..3.47 rows=1 width=21)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using newliens_s3_pk on liens_s3 l  (cost=0.00..7.52 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using newliens_s4_pk on liens_s4 l  (cost=0.00..7.67 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using newliens_s5_pk on liens_s5 l  (cost=0.00..7.62 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using newliens_s6_pk on liens_s6 l  (cost=0.00..7.61 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using newliens_s7_pk on liens_s7 l  (cost=0.00..7.50 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using newliens_s8_pk on liens_s8 l  (cost=0.00..7.36 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using newliens_s9_pk on liens_s9 l  (cost=0.00..7.43 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using newliens_s10_pk on liens_s10 l  (cost=0.00..7.79 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using newliens_s11_pk on liens_s11 l  (cost=0.00..8.07 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using newliens_s12_pk on liens_s12 l  (cost=0.00..8.45 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)
                ->  Index Scan using newliens_s13_pk on liens_s13 l  (cost=0.00..8.53 rows=1 width=18)
                      Index Cond: ((l.vin)::text = (i.vin)::text)


This takes quite a while as it's got to do tons of index probes which
results it tons of random IO.  I killed this after five minutes of
running.

But if I do:

CREATE TABLE i1 AS SELECT * FROM impounds;
CREATE TABLE l1 AS SELECT * FROM liens;

I get a reasonable plan, which runs in about 15 seconds, from:

EXPLAIN SELECT COUNT(*)
FROM i1 i
         JOIN l1 l ON (i.vin = l.vin);

  Aggregate  (cost=749054.78..749054.79 rows=1 width=0)
    ->  Hash Join  (cost=1444.18..748971.43 rows=33338 width=0)
          Hash Cond: ((l.vin)::text = (i.vin)::text)
          ->  Seq Scan on l1 l  (cost=0.00..332068.96 rows=18449996
width=18)
          ->  Hash  (cost=1027.97..1027.97 rows=33297 width=18)
                ->  Seq Scan on i1 i  (cost=0.00..1027.97 rows=33297
width=18)


I've tried to force the hash join plan on the partitioned tables via:

set enable_nestloop to off;

This results in a merge join plan which needs to do a giant sort, again
killed after five minutes.

  Aggregate  (cost=58285765.20..58285765.21 rows=1 width=0)
    ->  Merge Join  (cost=4077389.31..50541735.48 rows=3097611886 width=0)
          Merge Cond: ((i.vin)::text = (l.vin)::text)
          ->  Sort  (cost=4286.45..4370.39 rows=33577 width=21)
                Sort Key: i.vin
                ->  Append  (cost=0.00..1072.77 rows=33577 width=21)
                      ->  Seq Scan on impounds i  (cost=0.00..11.40 rows=140 width=21)
                      ->  [Seq Scans on other partitions]
          ->  Materialize  (cost=4073102.86..4303737.81 rows=18450796 width=21)
                ->  Sort  (cost=4073102.86..4119229.85 rows=18450796 width=21)
                      Sort Key: l.vin
                      ->  Append  (cost=0.00..332797.96 rows=18450796 width=21)
                            ->  Seq Scan on liens l  (cost=0.00..14.00 rows=400 width=21)
                            ->  [Seq Scans on other partitions]


Disabling mergejoin pushes it back to a nestloop join.  Why can't it hash
join these two together?

Kris Jurka


В списке pgsql-performance по дате сообщения:

От: Vlad Arkhipov
Дата:
Сообщение: Optimizer's issue
От: Tom Lane
Дата:
Сообщение: Re: No hash join across partitioned tables?