Re: [HACKERS] advanced partition matching algorithm forpartition-wise join

Поиск
Список
Период
Сортировка
От amul sul
Тема Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
Дата
Msg-id CAAJ_b94tJTix3kR8uBjin-ruJ-7ojn-gAWJQRicbLqAttQTe1g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] advanced partition matching algorithm forpartition-wise join  (Etsuro Fujita <etsuro.fujita@gmail.com>)
Ответы Re: [HACKERS] advanced partition matching algorithm forpartition-wise join  (Etsuro Fujita <etsuro.fujita@gmail.com>)
Список pgsql-hackers
Hi Fujita San,


Please find my comments inline below:

On Wed, Aug 28, 2019 at 3:52 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
On Fri, Aug 16, 2019 at 10:25 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>  
[... skipped ..] 

About the attached:

* The attached patch modified try_partitionwise_join() so that we call
partition_bounds_equal() then partition_bounds_merge() (if necessary)
to create the partition bounds for the join rel.  We don't support for
merging the partition bounds for the hash-partitioning case, so this
makes code to handle the hash-partitioning case in
partition_bounds_merge() completely unnecessary, so I removed that
entirely.

Yes, that make sense. 

On thinking further, a thought hits to me why we can't join two hash partitioned
table which has the same modulus and partition key specification, but some of
the partitions are missing from either partitioned table.

For e.g. here is a smaller case where foo has two partitions and bar has only one.

CREATE TABLE foo(a int) PARTITION BY HASH(a);
CREATE TABLE foo_p0 PARTITION OF foo FOR VALUES WITH(modulus 2, remainder 0);
CREATE TABLE foo_p1 PARTITION OF foo FOR VALUES WITH(modulus 2, remainder 1);

CREATE TABLE bar(a int) PARTITION BY HASH(a);  <-- missing partitions for REMAINDER 1
CREATE TABLE bar_p0 PARTITION OF bar FOR VALUES WITH(modulus 2, remainder 0);


Explain:
postgres=# explain select * from foo p1, bar p2 where p1.a = p2.a;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Merge Join  (cost=590.35..1578.47 rows=65025 width=8)
   Merge Cond: (p2.a = p1.a)
   ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
         Sort Key: p2.a
         ->  Seq Scan on bar_p0 p2  (cost=0.00..35.50 rows=2550 width=4)
   ->  Sort  (cost=410.57..423.32 rows=5100 width=4)
         Sort Key: p1.a
         ->  Append  (cost=0.00..96.50 rows=5100 width=4)
               ->  Seq Scan on foo_p0 p1  (cost=0.00..35.50 rows=2550 width=4)
               ->  Seq Scan on foo_p1 p1_1  (cost=0.00..35.50 rows=2550 width=4)
(10 rows)


The partitions-wise join will be performed only if we fill the partition hole that
exists for the joining table i.e. adding partitions to bar table.

postgres=# CREATE TABLE bar_p1 PARTITION OF bar FOR VALUES WITH(modulus 2, remainder 1);
CREATE TABLE
postgres=# explain select * from foo p1, bar p2 where p1.a = p2.a;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Append  (cost=359.57..2045.11 rows=65024 width=8)
   ->  Merge Join  (cost=359.57..860.00 rows=32512 width=8)
         Merge Cond: (p1.a = p2.a)
         ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
               Sort Key: p1.a
               ->  Seq Scan on foo_p0 p1  (cost=0.00..35.50 rows=2550 width=4)
         ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
               Sort Key: p2.a
               ->  Seq Scan on bar_p0 p2  (cost=0.00..35.50 rows=2550 width=4)
   ->  Merge Join  (cost=359.57..860.00 rows=32512 width=8)
         Merge Cond: (p1_1.a = p2_1.a)
         ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
               Sort Key: p1_1.a
               ->  Seq Scan on foo_p1 p1_1  (cost=0.00..35.50 rows=2550 width=4)
         ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
               Sort Key: p2_1.a
               ->  Seq Scan on bar_p1 p2_1  (cost=0.00..35.50 rows=2550 width=4)
(17 rows)


It would have been nice if we could support this case, as we do allow partitions
hole for the other partition scheme, but there wouldn't be much objection if
we don't want to add this support for now since there will be a lesser chance
that hash partitioned table has the hole, IMO.


* I removed this assertion in partition_bounds_merge(), because I
think this is covered by two assertions above this.

+   Assert((*outer_parts == NIL || *inner_parts != NIL) &&
+          (*inner_parts == NIL || *outer_parts != NIL));

* (I forgot to mention this in a previous email, but) I removed this
bit of generate_matching_part_pairs(), because we already have the
same check in try_partitionwise_join(), so this bit would be redundant
IIUC.
 
Looks good.



* I added more comments.

Thanks.


If there are no objections, I'll merge the attached with the base patch in [1].

The proposed enhancement in the patch is too good and the patch is pretty much
reasonable to merge into the main patch.

Here are the few cosmetic fixes for this path I think is needed. Feel free to
ignore if some of them do not make sense or too obvious.

Note: left side number represents code line number of the patch.

118 +       }
119 +
120 +       /*
121 +        * Try to create the partition bounds along with join pairs.
122 +        */
123 +       if (boundinfo == NULL)
124 +       {

Can we add this block as else part of previous if condition checking equal partitions bound?

133 +           Assert(list_length(parts1) == list_length(parts2));
134 +           if (boundinfo == NULL)
135 +           {
136 +               joinrel->nparts = 0;
137 +               return;
138 +           }
139 +           nparts = list_length(parts1);

Can we move the assert check below at line#139 in the patch i.e. after if block.

And the question is do we need to do that assert check since partition_bounds_merge()
does that just before returning, thoughts?

204 +       RelOptInfo *child_rel1 = merged ? (RelOptInfo *) lfirst(lcr1) : rel1->part_rels[cnt_parts];
205 +       RelOptInfo *child_rel2 = merged ? (RelOptInfo *) lfirst(lcr2) : rel2->part_rels[cnt_parts];

How about using lfirst_node instead of lfirst & casting explicitly?

Also, these lines crossing 80 column length which I think we need to fix. How about
doing the assignment as follow, just after the variable declaration part:

        if (merged)
        {  
            child_rel1 = lfirst_node(lcr1, RelOptInfo);
            child_rel2 = lfirst_node(lcr2, RelOptInfo);
            lcr1 = lnext(parts1, lcr1);
            lcr2 = lnext(parts2, lcr2);
        }  
        else
        {  
            child_rel1 = rel1->part_rels[cnt_parts];
            child_rel2 =  rel2->part_rels[cnt_parts]
        }  

        rel1_empty = (child_rel1 == NULL || IS_DUMMY_REL(child_rel1));
        rel2_empty = (child_rel2 == NULL || IS_DUMMY_REL(child_rel2));


266 + * get_matching_part_pairs
267 + *     Generate join pairs of partitions for the two inputs for the join rel

Can we rewrite this description as  " Generate join pairs of partitions for the
join rel from the two inputs." OR  "Generate join pairs of partitions for the
two inputs"


310 +       Assert(bms_num_members(child_relids1) == bms_num_members(rel1->relids));
311 +       /*

335 +       Assert(bms_num_members(child_relids2) == bms_num_members(rel2->relids));
336 +       /*

Need newline after assert statements.

Regards,
Amul

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: moonjelly vs tsearch
Следующее
От: Andrey Borodin
Дата:
Сообщение: Re: Bug in GiST paring heap comparator