Обсуждение: Fixing some ancient errors in hash join costing

Поиск

Список

Период

Сортировка

Fixing some ancient errors in hash join costing

От

Tom Lane

Дата:

28 декабря 2025 г., 00:31:37

I spent some time digging into bug #19363 [1], and figured out the
reason why the planner is failing to reject a horribly bad plan.
Even without any stats, it should be able to figure out that building
a hash table estimated at 10 billion rows is less good than building
one estimated at 1000 rows ... but it didn't.  The cause is

(1) estimate_hash_bucket_stats is defined to return zero to *mcv_freq
if it cannot obtain a value for the frequency of the most common
value.

(2) final_cost_hashjoin neglected this specification, and would
blindly adopt zero for the innermcvfreq.

(3) This results in calculating zero for the largest hash bucket size,
and thus the code that intends to disable hashjoin when that bucket
size exceeds get_hash_memory_limit() is turned into a no-op.

This is the exact opposite of what we want, I think.  The intent in
the planner has always been to avoid hashing unless we are pretty
confident that the inner relation's hash key is well-distributed.
Turning off the disable-for-giant-hash-table check when we have
no stats is the polar opposite of sane.

So I said to myself "this is a one-liner fix, we just have to
disregard any mcv_freq reported as zero".  And that did fix the
case shown in the bug report, but it also broke a bunch of
regression test cases.  Upon closer investigation, there is also
an old oversight within estimate_hash_bucket_stats itself.  It
returns zero for mcv_freq if there's no STATISTIC_KIND_MCV entry,
but that neglects the fact that ANALYZE does not make an MCV entry
if it doesn't find any data values that are noticeably more common
than any others.  So the correct behavior really should be to
assume the column is unique and set the mcv_freq to 1 / rows.
In the attached draft patch I made it do this if there's no MCV
stats entry but there is a STATISTIC_KIND_HISTOGRAM entry.
If there's neither, we are probably dealing with a weird datatype
that doesn't have meaningful scalar stats, so I'm hesitant to
just apply the 1 / rows rule blindly.

Even after that, there were more changes in regression test outputs
than I'd expected.  Poking into it further, the first diff in join.out
is precisely a case like the bug report's, where we have no stats
about a potentially large self-join.  The repaired code decides that a
hash join is too risky, so it disables it and we select a merge join
instead, as desired.  However, none of the other places that visibly
change plans are triggering that disable logic.  Instead what is
happening is that the improved mcv_freq estimate is getting used
within estimate_hash_bucket_stats to refine its bucket-size result:

    /*
     * Adjust estimated bucketsize upward to account for skewed distribution.
     */
    if (avgfreq > 0.0 && *mcv_freq > avgfreq)
        estfract *= *mcv_freq / avgfreq;

This code does nothing if *mcv_freq is zero, but if we have a
more-than-zero estimate it can increase the bucket size fraction,
and that is indeed happening in the other places in the core
regression tests where we see plans change.  AFAICT these changes
are all perfectly sane and not anything to worry about.

I also had to stick an additional ANALYZE step into join_hash.sql
to keep plans from changing there.

I remain a bit confused by the change in postgres_fdw.out though.
It's deciding to push an ORDER BY down to the remote side when
it didn't before, which seems like an improvement; but I fail to
see how a marginal change in hash join costing would lead to that.
Perhaps that is worth looking into more closely.

            regards, tom lane

[1] https://www.postgresql.org/message-id/19363-8dd32fc7600a1153%40postgresql.org

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 48e3185b227..a01aedc84c2 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -4216,16 +4216,13 @@ RESET enable_sort;
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND
date(c5)= '1970-01-17'::date) ORDER BY c1; 
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st3(10, 20);
-
      QUERY PLAN
                      

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
- Sort
+
                   QUERY PLAN
                                                 

+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-   Sort Key: t1.c1
-   ->  Foreign Scan
-         Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-         Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
-         Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE
((r1."C1" < 20)) AND EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND ((date(r3.c5) =
'1970-01-17'::date))AND ((r3.c3 = r1.c3))) 
-(7 rows)
+   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
+   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" <
20))AND EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND ((date(r3.c5) = '1970-01-17'::date)) AND
((r3.c3= r1.c3))) ORDER BY r1."C 1" ASC NULLS LAST 
+(4 rows)

 EXECUTE st3(10, 20);
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index a39cc793b4d..822c696572b 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -4360,7 +4360,7 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
                                  path->jpath.jointype))
     {
         innerbucketsize = 1.0 / virtualbuckets;
-        innermcvfreq = 0.0;
+        innermcvfreq = 1.0 / inner_path_rows_total;
     }
     else
     {
@@ -4428,7 +4428,8 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,

             if (innerbucketsize > thisbucketsize)
                 innerbucketsize = thisbucketsize;
-            if (innermcvfreq > thismcvfreq)
+            /* Disregard zero for MCV freq, it means we have no data */
+            if (innermcvfreq > thismcvfreq && thismcvfreq > 0.0)
                 innermcvfreq = thismcvfreq;
         }
     }
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c760b19db55..c0dc26b4647 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -4414,6 +4414,18 @@ estimate_hash_bucket_stats(PlannerInfo *root, Node *hashkey, double nbuckets,
                 *mcv_freq = sslot.numbers[0];
             free_attstatsslot(&sslot);
         }
+        else if (get_attstatsslot(&sslot, vardata.statsTuple,
+                                  STATISTIC_KIND_HISTOGRAM, InvalidOid,
+                                  0))
+        {
+            /*
+             * If there are no recorded MCVs, but we do have a histogram, then
+             * assume that ANALYZE determined that the column is unique.
+             */
+            if (vardata.rel && vardata.rel->rows > 0)
+                *mcv_freq = 1.0 / vardata.rel->rows;
+            free_attstatsslot(&sslot);
+        }
     }

     /* Get number of distinct values */
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index edde9e99893..bf11d48bb3a 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3303,10 +3303,13 @@ where not exists (
 );
                        QUERY PLAN
 ---------------------------------------------------------
- Hash Anti Join
-   Hash Cond: (t1.c1 = t2.c2)
-   ->  Seq Scan on tt4x t1
-   ->  Hash
+ Merge Anti Join
+   Merge Cond: (t1.c1 = t2.c2)
+   ->  Sort
+         Sort Key: t1.c1
+         ->  Seq Scan on tt4x t1
+   ->  Sort
+         Sort Key: t2.c2
          ->  Merge Right Join
                Merge Cond: (t5.c1 = t3.c2)
                ->  Merge Join
@@ -3327,7 +3330,7 @@ where not exists (
                            ->  Sort
                                  Sort Key: t3.c1
                                  ->  Seq Scan on tt4x t3
-(24 rows)
+(27 rows)

 --
 -- regression test for problems of the sort depicted in bug #3494
@@ -9313,19 +9316,19 @@ select * from fkest f1
   join fkest f2 on (f1.x = f2.x and f1.x10 = f2.x10b and f1.x100 = f2.x100)
   join fkest f3 on f1.x = f3.x
   where f1.x100 = 2;
-                     QUERY PLAN
------------------------------------------------------
+                        QUERY PLAN
+-----------------------------------------------------------
  Hash Join
-   Hash Cond: ((f2.x = f1.x) AND (f2.x10b = f1.x10))
+   Hash Cond: (f1.x = f3.x)
    ->  Hash Join
-         Hash Cond: (f3.x = f2.x)
-         ->  Seq Scan on fkest f3
+         Hash Cond: ((f2.x = f1.x) AND (f2.x10b = f1.x10))
+         ->  Seq Scan on fkest f2
+               Filter: (x100 = 2)
          ->  Hash
-               ->  Seq Scan on fkest f2
+               ->  Seq Scan on fkest f1
                      Filter: (x100 = 2)
    ->  Hash
-         ->  Seq Scan on fkest f1
-               Filter: (x100 = 2)
+         ->  Seq Scan on fkest f3
 (11 rows)

 rollback;
diff --git a/src/test/regress/expected/join_hash.out b/src/test/regress/expected/join_hash.out
index a45e1450040..4749f6ed70d 100644
--- a/src/test/regress/expected/join_hash.out
+++ b/src/test/regress/expected/join_hash.out
@@ -559,6 +559,7 @@ create table join_foo as select generate_series(1, 3) as id, 'xxxxx'::text as t;
 alter table join_foo set (parallel_workers = 0);
 create table join_bar as select generate_series(1, 10000) as id, 'xxxxx'::text as t;
 alter table join_bar set (parallel_workers = 2);
+analyze join_foo, join_bar;
 -- multi-batch with rescan, parallel-oblivious
 savepoint settings;
 set enable_parallel_hash = off;
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 17d27ef3d46..559b87ec66e 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -158,20 +158,20 @@ SELECT t1, t2 FROM prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b WHERE t1.b = 0 ORDER
 --------------------------------------------------
  Sort
    Sort Key: t1.a, t2.b
-   ->  Hash Right Join
-         Hash Cond: (t2.b = t1.a)
+   ->  Hash Left Join
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_p1 t2_1
-               ->  Seq Scan on prt2_p2 t2_2
-               ->  Seq Scan on prt2_p3 t2_3
+               ->  Seq Scan on prt1_p1 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_p2 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_p3 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_p1 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_p2 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_p3 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_p1 t2_1
+                     ->  Seq Scan on prt2_p2 t2_2
+                     ->  Seq Scan on prt2_p3 t2_3
 (16 rows)

 SELECT t1, t2 FROM prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b WHERE t1.b = 0 ORDER BY t1.a, t2.b;
@@ -297,23 +297,23 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.a <
 -- Currently we can't do partitioned join if nullable-side partitions are pruned
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1 WHERE a < 450) t1 LEFT JOIN (SELECT * FROM prt2 WHERE b > 250)
t2ON t1.a = t2.b WHERE t1.b = 0 ORDER BY t1.a, t2.b; 
-                        QUERY PLAN
------------------------------------------------------------
+                     QUERY PLAN
+-----------------------------------------------------
  Sort
    Sort Key: prt1.a, prt2.b
-   ->  Hash Right Join
-         Hash Cond: (prt2.b = prt1.a)
+   ->  Hash Left Join
+         Hash Cond: (prt1.a = prt2.b)
          ->  Append
-               ->  Seq Scan on prt2_p2 prt2_1
-                     Filter: (b > 250)
-               ->  Seq Scan on prt2_p3 prt2_2
-                     Filter: (b > 250)
+               ->  Seq Scan on prt1_p1 prt1_1
+                     Filter: ((a < 450) AND (b = 0))
+               ->  Seq Scan on prt1_p2 prt1_2
+                     Filter: ((a < 450) AND (b = 0))
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_p1 prt1_1
-                           Filter: ((a < 450) AND (b = 0))
-                     ->  Seq Scan on prt1_p2 prt1_2
-                           Filter: ((a < 450) AND (b = 0))
+                     ->  Seq Scan on prt2_p2 prt2_1
+                           Filter: (b > 250)
+                     ->  Seq Scan on prt2_p3 prt2_2
+                           Filter: (b > 250)
 (15 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1 WHERE a < 450) t1 LEFT JOIN (SELECT * FROM prt2 WHERE b > 250)
t2ON t1.a = t2.b WHERE t1.b = 0 ORDER BY t1.a, t2.b; 
@@ -778,23 +778,23 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_e t1, prt2_e t2 WHERE (t1.a + t1.b)/2 =
    Sort Key: t1.a, t2.b
    ->  Append
          ->  Hash Join
-               Hash Cond: (((t2_1.b + t2_1.a) / 2) = ((t1_1.a + t1_1.b) / 2))
-               ->  Seq Scan on prt2_e_p1 t2_1
+               Hash Cond: (((t1_1.a + t1_1.b) / 2) = ((t2_1.b + t2_1.a) / 2))
+               ->  Seq Scan on prt1_e_p1 t1_1
+                     Filter: (c = 0)
                ->  Hash
-                     ->  Seq Scan on prt1_e_p1 t1_1
-                           Filter: (c = 0)
+                     ->  Seq Scan on prt2_e_p1 t2_1
          ->  Hash Join
-               Hash Cond: (((t2_2.b + t2_2.a) / 2) = ((t1_2.a + t1_2.b) / 2))
-               ->  Seq Scan on prt2_e_p2 t2_2
+               Hash Cond: (((t1_2.a + t1_2.b) / 2) = ((t2_2.b + t2_2.a) / 2))
+               ->  Seq Scan on prt1_e_p2 t1_2
+                     Filter: (c = 0)
                ->  Hash
-                     ->  Seq Scan on prt1_e_p2 t1_2
-                           Filter: (c = 0)
+                     ->  Seq Scan on prt2_e_p2 t2_2
          ->  Hash Join
-               Hash Cond: (((t2_3.b + t2_3.a) / 2) = ((t1_3.a + t1_3.b) / 2))
-               ->  Seq Scan on prt2_e_p3 t2_3
+               Hash Cond: (((t1_3.a + t1_3.b) / 2) = ((t2_3.b + t2_3.a) / 2))
+               ->  Seq Scan on prt1_e_p3 t1_3
+                     Filter: (c = 0)
                ->  Hash
-                     ->  Seq Scan on prt1_e_p3 t1_3
-                           Filter: (c = 0)
+                     ->  Seq Scan on prt2_e_p3 t2_3
 (21 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_e t1, prt2_e t2 WHERE (t1.a + t1.b)/2 = (t2.b + t2.a)/2 AND t1.c = 0 ORDER BY
t1.a,t2.b; 
@@ -864,26 +864,26 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2
  Sort
    Sort Key: t1.a, t2.b, ((t3.a + t3.b))
    ->  Append
-         ->  Hash Right Join
-               Hash Cond: (((t3_1.a + t3_1.b) / 2) = t1_1.a)
-               ->  Seq Scan on prt1_e_p1 t3_1
+         ->  Hash Left Join
+               Hash Cond: (t1_1.a = ((t3_1.a + t3_1.b) / 2))
+               ->  Hash Right Join
+                     Hash Cond: (t2_1.b = t1_1.a)
+                     ->  Seq Scan on prt2_p1 t2_1
+                     ->  Hash
+                           ->  Seq Scan on prt1_p1 t1_1
+                                 Filter: (b = 0)
                ->  Hash
-                     ->  Hash Right Join
-                           Hash Cond: (t2_1.b = t1_1.a)
-                           ->  Seq Scan on prt2_p1 t2_1
-                           ->  Hash
-                                 ->  Seq Scan on prt1_p1 t1_1
-                                       Filter: (b = 0)
-         ->  Hash Right Join
-               Hash Cond: (((t3_2.a + t3_2.b) / 2) = t1_2.a)
-               ->  Seq Scan on prt1_e_p2 t3_2
+                     ->  Seq Scan on prt1_e_p1 t3_1
+         ->  Hash Left Join
+               Hash Cond: (t1_2.a = ((t3_2.a + t3_2.b) / 2))
+               ->  Hash Right Join
+                     Hash Cond: (t2_2.b = t1_2.a)
+                     ->  Seq Scan on prt2_p2 t2_2
+                     ->  Hash
+                           ->  Seq Scan on prt1_p2 t1_2
+                                 Filter: (b = 0)
                ->  Hash
-                     ->  Hash Right Join
-                           Hash Cond: (t2_2.b = t1_2.a)
-                           ->  Seq Scan on prt2_p2 t2_2
-                           ->  Hash
-                                 ->  Seq Scan on prt1_p2 t1_2
-                                       Filter: (b = 0)
+                     ->  Seq Scan on prt1_e_p2 t3_2
          ->  Hash Right Join
                Hash Cond: (((t3_3.a + t3_3.b) / 2) = t1_3.a)
                ->  Seq Scan on prt1_e_p3 t3_3
@@ -921,21 +921,21 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2
    Sort Key: t1.a, t2.b, ((t3.a + t3.b))
    ->  Append
          ->  Nested Loop Left Join
-               ->  Hash Right Join
-                     Hash Cond: (t1_1.a = ((t3_1.a + t3_1.b) / 2))
-                     ->  Seq Scan on prt1_p1 t1_1
+               ->  Hash Left Join
+                     Hash Cond: (((t3_1.a + t3_1.b) / 2) = t1_1.a)
+                     ->  Seq Scan on prt1_e_p1 t3_1
+                           Filter: (c = 0)
                      ->  Hash
-                           ->  Seq Scan on prt1_e_p1 t3_1
-                                 Filter: (c = 0)
+                           ->  Seq Scan on prt1_p1 t1_1
                ->  Index Scan using iprt2_p1_b on prt2_p1 t2_1
                      Index Cond: (b = t1_1.a)
          ->  Nested Loop Left Join
-               ->  Hash Right Join
-                     Hash Cond: (t1_2.a = ((t3_2.a + t3_2.b) / 2))
-                     ->  Seq Scan on prt1_p2 t1_2
+               ->  Hash Left Join
+                     Hash Cond: (((t3_2.a + t3_2.b) / 2) = t1_2.a)
+                     ->  Seq Scan on prt1_e_p2 t3_2
+                           Filter: (c = 0)
                      ->  Hash
-                           ->  Seq Scan on prt1_e_p2 t3_2
-                                 Filter: (c = 0)
+                           ->  Seq Scan on prt1_p2 t1_2
                ->  Index Scan using iprt2_p2_b on prt2_p2 t2_2
                      Index Cond: (b = t1_2.a)
          ->  Nested Loop Left Join
@@ -1080,14 +1080,14 @@ SELECT COUNT(*) FROM prt1 FULL JOIN prt2 p2(b,a,c) USING(a,b) FULL JOIN prt2 p3(
 -- make sure these go to null as expected
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.phv, t2.b, t2.phv, t3.a + t3.b, t3.phv FROM ((SELECT 50 phv, * FROM prt1 WHERE prt1.b = 0) t1 FULL
JOIN(SELECT 75 phv, * FROM prt2 WHERE prt2.a = 0) t2 ON (t1.a = t2.b)) FULL JOIN (SELECT 50 phv, * FROM prt1_e WHERE
prt1_e.c= 0) t3 ON (t1.a = (t3.a + t3.b)/2) WHERE t1.a = t1.phv OR t2.b = t2.phv OR (t3.a + t3.b)/2 = t3.phv ORDER BY
t1.a,t2.b, t3.a + t3.b; 
-                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
+                                          QUERY PLAN
+----------------------------------------------------------------------------------------------
  Sort
    Sort Key: prt1.a, prt2.b, ((prt1_e.a + prt1_e.b))
-   ->  Append
-         ->  Hash Full Join
-               Hash Cond: (prt1_1.a = ((prt1_e_1.a + prt1_e_1.b) / 2))
-               Filter: ((prt1_1.a = (50)) OR (prt2_1.b = (75)) OR (((prt1_e_1.a + prt1_e_1.b) / 2) = (50)))
+   ->  Hash Full Join
+         Hash Cond: (prt1.a = ((prt1_e.a + prt1_e.b) / 2))
+         Filter: ((prt1.a = (50)) OR (prt2.b = (75)) OR (((prt1_e.a + prt1_e.b) / 2) = (50)))
+         ->  Append
                ->  Hash Full Join
                      Hash Cond: (prt1_1.a = prt2_1.b)
                      ->  Seq Scan on prt1_p1 prt1_1
@@ -1095,12 +1095,6 @@ SELECT t1.a, t1.phv, t2.b, t2.phv, t3.a + t3.b, t3.phv FROM ((SELECT 50 phv, * F
                      ->  Hash
                            ->  Seq Scan on prt2_p1 prt2_1
                                  Filter: (a = 0)
-               ->  Hash
-                     ->  Seq Scan on prt1_e_p1 prt1_e_1
-                           Filter: (c = 0)
-         ->  Hash Full Join
-               Hash Cond: (prt1_2.a = ((prt1_e_2.a + prt1_e_2.b) / 2))
-               Filter: ((prt1_2.a = (50)) OR (prt2_2.b = (75)) OR (((prt1_e_2.a + prt1_e_2.b) / 2) = (50)))
                ->  Hash Full Join
                      Hash Cond: (prt1_2.a = prt2_2.b)
                      ->  Seq Scan on prt1_p2 prt1_2
@@ -1108,12 +1102,6 @@ SELECT t1.a, t1.phv, t2.b, t2.phv, t3.a + t3.b, t3.phv FROM ((SELECT 50 phv, * F
                      ->  Hash
                            ->  Seq Scan on prt2_p2 prt2_2
                                  Filter: (a = 0)
-               ->  Hash
-                     ->  Seq Scan on prt1_e_p2 prt1_e_2
-                           Filter: (c = 0)
-         ->  Hash Full Join
-               Hash Cond: (prt1_3.a = ((prt1_e_3.a + prt1_e_3.b) / 2))
-               Filter: ((prt1_3.a = (50)) OR (prt2_3.b = (75)) OR (((prt1_e_3.a + prt1_e_3.b) / 2) = (50)))
                ->  Hash Full Join
                      Hash Cond: (prt1_3.a = prt2_3.b)
                      ->  Seq Scan on prt1_p3 prt1_3
@@ -1121,10 +1109,15 @@ SELECT t1.a, t1.phv, t2.b, t2.phv, t3.a + t3.b, t3.phv FROM ((SELECT 50 phv, * F
                      ->  Hash
                            ->  Seq Scan on prt2_p3 prt2_3
                                  Filter: (a = 0)
-               ->  Hash
+         ->  Hash
+               ->  Append
+                     ->  Seq Scan on prt1_e_p1 prt1_e_1
+                           Filter: (c = 0)
+                     ->  Seq Scan on prt1_e_p2 prt1_e_2
+                           Filter: (c = 0)
                      ->  Seq Scan on prt1_e_p3 prt1_e_3
                            Filter: (c = 0)
-(42 rows)
+(35 rows)

 SELECT t1.a, t1.phv, t2.b, t2.phv, t3.a + t3.b, t3.phv FROM ((SELECT 50 phv, * FROM prt1 WHERE prt1.b = 0) t1 FULL
JOIN(SELECT 75 phv, * FROM prt2 WHERE prt2.a = 0) t2 ON (t1.a = t2.b)) FULL JOIN (SELECT 50 phv, * FROM prt1_e WHERE
prt1_e.c= 0) t3 ON (t1.a = (t3.a + t3.b)/2) WHERE t1.a = t1.phv OR t2.b = t2.phv OR (t3.a + t3.b)/2 = t3.phv ORDER BY
t1.a,t2.b, t3.a + t3.b; 
  a  | phv | b  | phv | ?column? | phv
@@ -1146,11 +1139,11 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1, prt1_e t2 WHER
                ->  Sort
                      Sort Key: t1_5.b
                      ->  Hash Join
-                           Hash Cond: (((t2_1.a + t2_1.b) / 2) = t1_5.b)
-                           ->  Seq Scan on prt1_e_p1 t2_1
+                           Hash Cond: (t1_5.b = ((t2_1.a + t2_1.b) / 2))
+                           ->  Seq Scan on prt2_p1 t1_5
+                                 Filter: (a = 0)
                            ->  Hash
-                                 ->  Seq Scan on prt2_p1 t1_5
-                                       Filter: (a = 0)
+                                 ->  Seq Scan on prt1_e_p1 t2_1
          ->  Index Scan using iprt1_p1_a on prt1_p1 t1_2
                Index Cond: (a = ((t2_1.a + t2_1.b) / 2))
                Filter: (b = 0)
@@ -1160,11 +1153,11 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1, prt1_e t2 WHER
                ->  Sort
                      Sort Key: t1_6.b
                      ->  Hash Join
-                           Hash Cond: (((t2_2.a + t2_2.b) / 2) = t1_6.b)
-                           ->  Seq Scan on prt1_e_p2 t2_2
+                           Hash Cond: (t1_6.b = ((t2_2.a + t2_2.b) / 2))
+                           ->  Seq Scan on prt2_p2 t1_6
+                                 Filter: (a = 0)
                            ->  Hash
-                                 ->  Seq Scan on prt2_p2 t1_6
-                                       Filter: (a = 0)
+                                 ->  Seq Scan on prt1_e_p2 t2_2
          ->  Index Scan using iprt1_p2_a on prt1_p2 t1_3
                Index Cond: (a = ((t2_2.a + t2_2.b) / 2))
                Filter: (b = 0)
@@ -1944,12 +1937,12 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1 LEFT JOIN prt2_l t2 ON t1.a = t2.b
  Sort
    Sort Key: t1.a, t2.b
    ->  Append
-         ->  Hash Right Join
-               Hash Cond: ((t2_1.b = t1_1.a) AND ((t2_1.c)::text = (t1_1.c)::text))
-               ->  Seq Scan on prt2_l_p1 t2_1
+         ->  Hash Left Join
+               Hash Cond: ((t1_1.a = t2_1.b) AND ((t1_1.c)::text = (t2_1.c)::text))
+               ->  Seq Scan on prt1_l_p1 t1_1
+                     Filter: (b = 0)
                ->  Hash
-                     ->  Seq Scan on prt1_l_p1 t1_1
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_l_p1 t2_1
          ->  Hash Right Join
                Hash Cond: ((t2_2.b = t1_2.a) AND ((t2_2.c)::text = (t1_2.c)::text))
                ->  Seq Scan on prt2_l_p2_p1 t2_2
@@ -2968,26 +2961,26 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 LEFT JOIN prt2_adv t2 ON (t1.a =
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.b, t1.c, t2.a, t2.c FROM prt2_adv t1 LEFT JOIN prt1_adv t2 ON (t1.b = t2.a) WHERE t1.a = 0 ORDER BY t1.b,
t2.a;
-                       QUERY PLAN
----------------------------------------------------------
+                      QUERY PLAN
+------------------------------------------------------
  Sort
    Sort Key: t1.b, t2.a
-   ->  Hash Right Join
-         Hash Cond: (t2.a = t1.b)
+   ->  Hash Left Join
+         Hash Cond: (t1.b = t2.a)
          ->  Append
-               ->  Seq Scan on prt1_adv_p1 t2_1
-               ->  Seq Scan on prt1_adv_p2 t2_2
-               ->  Seq Scan on prt1_adv_p3 t2_3
+               ->  Seq Scan on prt2_adv_p1 t1_1
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_p2 t1_2
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_p3 t1_3
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_extra t1_4
+                     Filter: (a = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt2_adv_p1 t1_1
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_p2 t1_2
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_p3 t1_3
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_extra t1_4
-                           Filter: (a = 0)
+                     ->  Seq Scan on prt1_adv_p1 t2_1
+                     ->  Seq Scan on prt1_adv_p2 t2_2
+                     ->  Seq Scan on prt1_adv_p3 t2_3
 (18 rows)

 -- anti join
@@ -3031,26 +3024,26 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt2_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt1_adv t2 WHERE t1.b = t2.a) AND t1.a = 0 ORDER BY
t1.b;
-                       QUERY PLAN
----------------------------------------------------------
+                      QUERY PLAN
+------------------------------------------------------
  Sort
    Sort Key: t1.b
-   ->  Hash Right Anti Join
-         Hash Cond: (t2.a = t1.b)
+   ->  Hash Anti Join
+         Hash Cond: (t1.b = t2.a)
          ->  Append
-               ->  Seq Scan on prt1_adv_p1 t2_1
-               ->  Seq Scan on prt1_adv_p2 t2_2
-               ->  Seq Scan on prt1_adv_p3 t2_3
+               ->  Seq Scan on prt2_adv_p1 t1_1
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_p2 t1_2
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_p3 t1_3
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_extra t1_4
+                     Filter: (a = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt2_adv_p1 t1_1
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_p2 t1_2
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_p3 t1_3
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_extra t1_4
-                           Filter: (a = 0)
+                     ->  Seq Scan on prt1_adv_p1 t2_1
+                     ->  Seq Scan on prt1_adv_p2 t2_2
+                     ->  Seq Scan on prt1_adv_p3 t2_3
 (18 rows)

 -- full join; currently we can't do partitioned join if there are no matched
@@ -3146,97 +3139,97 @@ ANALYZE prt2_adv;
 -- inner join
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a = t2.b) WHERE t1.b = 0 ORDER BY t1.a,
t2.b;
-                      QUERY PLAN
-------------------------------------------------------
+                       QUERY PLAN
+--------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Hash Join
-         Hash Cond: (t2.b = t1.a)
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3_1 t2_3
-               ->  Seq Scan on prt2_adv_p3_2 t2_4
+               ->  Seq Scan on prt1_adv_p1 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p2 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p2 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3_1 t2_3
+                     ->  Seq Scan on prt2_adv_p3_2 t2_4
 (17 rows)

 -- semi join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
-                      QUERY PLAN
-------------------------------------------------------
+                       QUERY PLAN
+--------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Right Semi Join
-         Hash Cond: (t2.b = t1.a)
+   ->  Hash Semi Join
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3_1 t2_3
-               ->  Seq Scan on prt2_adv_p3_2 t2_4
+               ->  Seq Scan on prt1_adv_p1 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p2 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p2 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3_1 t2_3
+                     ->  Seq Scan on prt2_adv_p3_2 t2_4
 (17 rows)

 -- left join
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 LEFT JOIN prt2_adv t2 ON (t1.a = t2.b) WHERE t1.b = 0 ORDER BY t1.a,
t2.b;
-                      QUERY PLAN
-------------------------------------------------------
+                       QUERY PLAN
+--------------------------------------------------------
  Sort
    Sort Key: t1.a, t2.b
-   ->  Hash Right Join
-         Hash Cond: (t2.b = t1.a)
+   ->  Hash Left Join
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3_1 t2_3
-               ->  Seq Scan on prt2_adv_p3_2 t2_4
+               ->  Seq Scan on prt1_adv_p1 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p2 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p2 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3_1 t2_3
+                     ->  Seq Scan on prt2_adv_p3_2 t2_4
 (17 rows)

 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY
t1.a;
-                      QUERY PLAN
-------------------------------------------------------
+                       QUERY PLAN
+--------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Right Anti Join
-         Hash Cond: (t2.b = t1.a)
+   ->  Hash Anti Join
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3_1 t2_3
-               ->  Seq Scan on prt2_adv_p3_2 t2_4
+               ->  Seq Scan on prt1_adv_p1 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p2 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p2 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3_1 t2_3
+                     ->  Seq Scan on prt2_adv_p3_2 t2_4
 (17 rows)

 -- full join
@@ -3326,19 +3319,19 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a =
  Sort
    Sort Key: t1.a
    ->  Hash Join
-         Hash Cond: (t2.b = t1.a)
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3 t2_3
+               ->  Seq Scan on prt1_adv_p2 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p1 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p2 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p1 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3 t2_3
 (16 rows)

 ALTER TABLE prt2_adv DETACH PARTITION prt2_adv_p3;
@@ -3354,19 +3347,19 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a =
  Sort
    Sort Key: t1.a
    ->  Hash Join
-         Hash Cond: (t2.b = t1.a)
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3 t2_3
+               ->  Seq Scan on prt1_adv_p2 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p1 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p2 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p1 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3 t2_3
 (16 rows)

 DROP TABLE prt1_adv_p3;
@@ -5018,11 +5011,11 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2
                ->  Hash
                      ->  Seq Scan on beta_neg_p1 t2_1
          ->  Hash Join
-               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.b = t1_2.b))
-               ->  Seq Scan on beta_neg_p2 t2_2
+               Hash Cond: ((t1_2.a = t2_2.a) AND (t1_2.b = t2_2.b))
+               ->  Seq Scan on alpha_neg_p2 t1_2
+                     Filter: ((b >= 125) AND (b < 225))
                ->  Hash
-                     ->  Seq Scan on alpha_neg_p2 t1_2
-                           Filter: ((b >= 125) AND (b < 225))
+                     ->  Seq Scan on beta_neg_p2 t2_2
          ->  Hash Join
                Hash Cond: ((t2_4.a = t1_4.a) AND (t2_4.b = t1_4.b))
                ->  Append
@@ -5141,25 +5134,28 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.c = t2

 EXPLAIN (COSTS OFF)
 SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c) WHERE ((t1.b >=
100AND t1.b < 110) OR (t1.b >= 200 AND t1.b < 210)) AND ((t2.b >= 100 AND t2.b < 110) OR (t2.b >= 200 AND t2.b < 210))
ANDt1.c IN ('0004', '0009') ORDER BY t1.a, t1.b; 
-                                                           QUERY PLAN
         

---------------------------------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN
               

+--------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Sort Key: t1.a, t1.b
    ->  Append
-         ->  Hash Join
-               Hash Cond: ((t1_1.a = t2_1.a) AND (t1_1.b = t2_1.b) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on alpha_neg_p1 t1_1
-                     Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b
<210)))) 
-               ->  Hash
+         ->  Merge Join
+               Merge Cond: ((t1_1.a = t2_1.a) AND (t1_1.b = t2_1.b) AND (t1_1.c = t2_1.c))
+               ->  Sort
+                     Sort Key: t1_1.a, t1_1.b, t1_1.c
+                     ->  Seq Scan on alpha_neg_p1 t1_1
+                           Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200)
AND(b < 210)))) 
+               ->  Sort
+                     Sort Key: t2_1.a, t2_1.b, t2_1.c
                      ->  Seq Scan on beta_neg_p1 t2_1
                            Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))
          ->  Hash Join
-               Hash Cond: ((t1_2.a = t2_2.a) AND (t1_2.b = t2_2.b) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on alpha_neg_p2 t1_2
-                     Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b
<210)))) 
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.b = t1_2.b) AND (t2_2.c = t1_2.c))
+               ->  Seq Scan on beta_neg_p2 t2_2
+                     Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))
                ->  Hash
-                     ->  Seq Scan on beta_neg_p2 t2_2
-                           Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))
+                     ->  Seq Scan on alpha_neg_p2 t1_2
+                           Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200)
AND(b < 210)))) 
          ->  Nested Loop
                Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.b = t2_3.b) AND (t1_3.c = t2_3.c))
                ->  Seq Scan on alpha_pos_p2 t1_3
@@ -5172,7 +5168,7 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2
                      Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b
<210)))) 
                ->  Seq Scan on beta_pos_p3 t2_4
                      Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))
-(29 rows)
+(32 rows)

 SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c) WHERE ((t1.b >=
100AND t1.b < 110) OR (t1.b >= 200 AND t1.b < 210)) AND ((t2.b >= 100 AND t2.b < 110) OR (t2.b >= 200 AND t2.b < 210))
ANDt1.c IN ('0004', '0009') ORDER BY t1.a, t1.b; 
  a  |  b  |  c   | a  |  b  |  c
diff --git a/src/test/regress/sql/join_hash.sql b/src/test/regress/sql/join_hash.sql
index 6b0688ab0a6..49d3fd61856 100644
--- a/src/test/regress/sql/join_hash.sql
+++ b/src/test/regress/sql/join_hash.sql
@@ -314,6 +314,7 @@ create table join_foo as select generate_series(1, 3) as id, 'xxxxx'::text as t;
 alter table join_foo set (parallel_workers = 0);
 create table join_bar as select generate_series(1, 10000) as id, 'xxxxx'::text as t;
 alter table join_bar set (parallel_workers = 2);
+analyze join_foo, join_bar;

 -- multi-batch with rescan, parallel-oblivious
 savepoint settings;

Re: Fixing some ancient errors in hash join costing

От

Tom Lane

Дата:

28 декабря 2025 г., 23:15:56

I wrote:
> I remain a bit confused by the change in postgres_fdw.out though.
> It's deciding to push an ORDER BY down to the remote side when
> it didn't before, which seems like an improvement; but I fail to
> see how a marginal change in hash join costing would lead to that.
> Perhaps that is worth looking into more closely.

I dug into that bit today, and concluded that it's a "nothing to
see here" case.  We are comparing the costs of doing a sort step
locally vs remotely --- but if the remote server is identically
configured, which it surely is in this test, then cost_sort()
will produce the same answers on both sides, and we are comparing
path costs that are the same to within roundoff error and
cost-quantization effects.  My patch does move the underlying
semijoin's cost just a hair, and that results in changes in what
add_path does with the locally-sorted versus remotely-sorted paths,
but really there's no reason to prefer one over the other.

I was amused to notice that the postgres_fdw.out change made in my
patch reverts one made in aa86129e1 (which also affected semijoin
costing).  So we've had trouble before with that test case being
fundamentally unstable.  I wonder if we shouldn't do something to try
to stabilize it?  I see that the test immediately before this one
forces the matter by turning off enable_sort (which'd affect only
the local side not the remote).  That's a hack all right but maybe
we should extend it to this test.

            regards, tom lane

Re: Fixing some ancient errors in hash join costing

От

Tom Lane

Дата:

29 декабря 2025 г., 01:56:40

I wrote:
> I was amused to notice that the postgres_fdw.out change made in my
> patch reverts one made in aa86129e1 (which also affected semijoin
> costing).  So we've had trouble before with that test case being
> fundamentally unstable.  I wonder if we shouldn't do something to try
> to stabilize it?  I see that the test immediately before this one
> forces the matter by turning off enable_sort (which'd affect only
> the local side not the remote).  That's a hack all right but maybe
> we should extend it to this test.

Here's a v2 patchset that splits that out as a separate change for
clarity's sake.  I also spent a bit of effort on commit log messages,
including researching the git history.

            regards, tom lane

From 70de171b9e1802a63ed524b7412c239bd4f6e05e Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 28 Dec 2025 17:00:01 -0500
Subject: [PATCH v2 1/2] Further stabilize a postgres_fdw test case.

This patch causes one postgres_fdw test case to revert to the plan
it used before aa86129e1, i.e., using a remote sort in preference to
local sort.  That decision is actually a coin-flip because cost_sort()
will give the same answer on both sides, so that the plan choice comes
down to little more than roundoff error.  In consequence, the test
output can change as a result of even minor changes in nearby costs,
as we saw in aa86129e1 (compare also b690e5fac and 4b14e1871).

b690e5fac's solution to stabilizing the adjacent test case was to
disable sorting locally, and here we extend that to the currently-
problematic case.  Without this, an upcoming patch would cause this
plan choice to change back in this same way, for even less apparent
reason.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2551253.1766952956@sss.pgh.pa.us
---
 .../postgres_fdw/expected/postgres_fdw.out    | 22 +++++++++----------
 contrib/postgres_fdw/sql/postgres_fdw.sql     | 12 ++++++++--
 2 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 48e3185b227..6066510c7c0 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -4178,9 +4178,12 @@ EXECUTE st1(101, 101);
  00101 | 00101
 (1 row)

-SET enable_hashjoin TO off;
+-- These next tests require choosing between remote and local sort, which is
+-- a coin flip so long as cost_sort() gives the same results on both sides.
+-- To stabilize the expected plans, disable sorting locally.
 SET enable_sort TO off;
 -- subquery using stable function (can't be sent to remote)
+SET enable_hashjoin TO off;  -- this one needs even more help to be stable
 PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND
date(c4)= '1970-01-17'::date) ORDER BY c1; 
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st2(10, 20);
                                                             QUERY PLAN
           
@@ -4212,20 +4215,16 @@ EXECUTE st2(101, 121);
 (1 row)

 RESET enable_hashjoin;
-RESET enable_sort;
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND
date(c5)= '1970-01-17'::date) ORDER BY c1; 
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st3(10, 20);
-
      QUERY PLAN
                      

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
- Sort
+
                   QUERY PLAN
                                                 

+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Foreign Scan
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-   Sort Key: t1.c1
-   ->  Foreign Scan
-         Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-         Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
-         Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE
((r1."C1" < 20)) AND EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND ((date(r3.c5) =
'1970-01-17'::date))AND ((r3.c3 = r1.c3))) 
-(7 rows)
+   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
+   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" <
20))AND EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND ((date(r3.c5) = '1970-01-17'::date)) AND
((r3.c3= r1.c3))) ORDER BY r1."C 1" ASC NULLS LAST 
+(4 rows)

 EXECUTE st3(10, 20);
  c1 | c2 |  c3   |              c4              |            c5            | c6 |     c7     | c8
@@ -4238,6 +4237,7 @@ EXECUTE st3(20, 30);
 ----+----+----+----+----+----+----+----
 (0 rows)

+RESET enable_sort;
 -- custom plan should be chosen initially
 PREPARE st4(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 = $1;
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st4(1);
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 9a8f9e28135..4f7ab2ed0ac 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -1165,20 +1165,28 @@ PREPARE st1(int, int) AS SELECT t1.c3, t2.c3 FROM ft1 t1, ft2 t2 WHERE t1.c1 = $
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st1(1, 2);
 EXECUTE st1(1, 1);
 EXECUTE st1(101, 101);
-SET enable_hashjoin TO off;
+
+-- These next tests require choosing between remote and local sort, which is
+-- a coin flip so long as cost_sort() gives the same results on both sides.
+-- To stabilize the expected plans, disable sorting locally.
 SET enable_sort TO off;
+
 -- subquery using stable function (can't be sent to remote)
+SET enable_hashjoin TO off;  -- this one needs even more help to be stable
 PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND
date(c4)= '1970-01-17'::date) ORDER BY c1; 
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st2(10, 20);
 EXECUTE st2(10, 20);
 EXECUTE st2(101, 121);
 RESET enable_hashjoin;
-RESET enable_sort;
+
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND
date(c5)= '1970-01-17'::date) ORDER BY c1; 
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st3(10, 20);
 EXECUTE st3(10, 20);
 EXECUTE st3(20, 30);
+
+RESET enable_sort;
+
 -- custom plan should be chosen initially
 PREPARE st4(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 = $1;
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st4(1);
--
2.43.7

From 9ef24c4beca49c56186df6600388100aea6a8b19 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 28 Dec 2025 17:52:25 -0500
Subject: [PATCH v2 2/2] Ensure sanity of hash-join costing when there are no
 MCV statistics.

estimate_hash_bucket_stats is defined to return zero to *mcv_freq if
it cannot obtain a value for the frequency of the most common value.
Its sole caller final_cost_hashjoin ignored this provision and would
blindly believe the zero value, resulting in computing zero for the
largest bucket size.  In consequence, the safety check that intended
to prevent the largest bucket from exceeding get_hash_memory_limit()
was ineffective, allowing very silly plans to be chosen if statistics
were missing.

After fixing final_cost_hashjoin to disregard zero results for
mcv_freq, a second problem appeared: some cases that should use hash
joins failed to.  This is because estimate_hash_bucket_stats was
unaware of the fact that ANALYZE won't store MCV statistics if it
doesn't find any multiply-occurring values.  Thus the lack of an MCV
stats entry doesn't necessarily mean that we know nothing; we may
well know that the column is unique.  The former coding returned zero
for *mcv_freq in this case, which was pretty close to correct, but now
final_cost_hashjoin doesn't believe it and disables the hash join.
So check to see if there is a HISTOGRAM stats entry; if so, ANALYZE
has in fact run for this column and must have found it to be unique.
In that case report the MCV frequency as 1 / rows, instead of claiming
ignorance.

Reporting a more accurate *mcv_freq in this case can also affect the
bucket-size skew adjustment further down in estimate_hash_bucket_stats,
causing hash-join cost estimates to change slightly.  This affects
some plan choices in the core regression tests.  The first diff in
join.out corresponds to a case where we have no stats and should not
risk a hash join, but the remaining changes are caused by producing
a better bucket-size estimate for unique join columns.  Those are all
harmless changes so far as I can tell.

The existing behavior was introduced in commit 4867d7f62 in v11.
It appears from the commit log that disabling the bucket-size safety
check in the absence of statistics was intentional; but we've now seen
a case where the ensuing behavior is bad enough to make that seem like
a poor decision.  In any case the lack of other problems with that
safety check after several years helps to justify enforcing it more
strictly.  However, we won't risk back-patching this, in case any
applications are depending on the existing behavior.

Bug: #19363
Reported-by: Jinhui Lai <jinhui.lai@qq.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2380165.1766871097@sss.pgh.pa.us
Discussion: https://postgr.es/m/19363-8dd32fc7600a1153@postgresql.org
---
 src/backend/optimizer/path/costsize.c        |   5 +-
 src/backend/utils/adt/selfuncs.c             |  12 +
 src/test/regress/expected/join.out           |  29 +-
 src/test/regress/expected/join_hash.out      |   1 +
 src/test/regress/expected/partition_join.out | 432 +++++++++----------
 src/test/regress/sql/join_hash.sql           |   1 +
 6 files changed, 247 insertions(+), 233 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index a39cc793b4d..54931cd6e2a 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -4360,7 +4360,7 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
                                  path->jpath.jointype))
     {
         innerbucketsize = 1.0 / virtualbuckets;
-        innermcvfreq = 0.0;
+        innermcvfreq = 1.0 / inner_path_rows_total;
     }
     else
     {
@@ -4428,7 +4428,8 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,

             if (innerbucketsize > thisbucketsize)
                 innerbucketsize = thisbucketsize;
-            if (innermcvfreq > thismcvfreq)
+            /* Disregard zero for MCV freq, it means we have no data */
+            if (thismcvfreq > 0.0 && innermcvfreq > thismcvfreq)
                 innermcvfreq = thismcvfreq;
         }
     }
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c760b19db55..c0dc26b4647 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -4414,6 +4414,18 @@ estimate_hash_bucket_stats(PlannerInfo *root, Node *hashkey, double nbuckets,
                 *mcv_freq = sslot.numbers[0];
             free_attstatsslot(&sslot);
         }
+        else if (get_attstatsslot(&sslot, vardata.statsTuple,
+                                  STATISTIC_KIND_HISTOGRAM, InvalidOid,
+                                  0))
+        {
+            /*
+             * If there are no recorded MCVs, but we do have a histogram, then
+             * assume that ANALYZE determined that the column is unique.
+             */
+            if (vardata.rel && vardata.rel->rows > 0)
+                *mcv_freq = 1.0 / vardata.rel->rows;
+            free_attstatsslot(&sslot);
+        }
     }

     /* Get number of distinct values */
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index edde9e99893..bf11d48bb3a 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3303,10 +3303,13 @@ where not exists (
 );
                        QUERY PLAN
 ---------------------------------------------------------
- Hash Anti Join
-   Hash Cond: (t1.c1 = t2.c2)
-   ->  Seq Scan on tt4x t1
-   ->  Hash
+ Merge Anti Join
+   Merge Cond: (t1.c1 = t2.c2)
+   ->  Sort
+         Sort Key: t1.c1
+         ->  Seq Scan on tt4x t1
+   ->  Sort
+         Sort Key: t2.c2
          ->  Merge Right Join
                Merge Cond: (t5.c1 = t3.c2)
                ->  Merge Join
@@ -3327,7 +3330,7 @@ where not exists (
                            ->  Sort
                                  Sort Key: t3.c1
                                  ->  Seq Scan on tt4x t3
-(24 rows)
+(27 rows)

 --
 -- regression test for problems of the sort depicted in bug #3494
@@ -9313,19 +9316,19 @@ select * from fkest f1
   join fkest f2 on (f1.x = f2.x and f1.x10 = f2.x10b and f1.x100 = f2.x100)
   join fkest f3 on f1.x = f3.x
   where f1.x100 = 2;
-                     QUERY PLAN
------------------------------------------------------
+                        QUERY PLAN
+-----------------------------------------------------------
  Hash Join
-   Hash Cond: ((f2.x = f1.x) AND (f2.x10b = f1.x10))
+   Hash Cond: (f1.x = f3.x)
    ->  Hash Join
-         Hash Cond: (f3.x = f2.x)
-         ->  Seq Scan on fkest f3
+         Hash Cond: ((f2.x = f1.x) AND (f2.x10b = f1.x10))
+         ->  Seq Scan on fkest f2
+               Filter: (x100 = 2)
          ->  Hash
-               ->  Seq Scan on fkest f2
+               ->  Seq Scan on fkest f1
                      Filter: (x100 = 2)
    ->  Hash
-         ->  Seq Scan on fkest f1
-               Filter: (x100 = 2)
+         ->  Seq Scan on fkest f3
 (11 rows)

 rollback;
diff --git a/src/test/regress/expected/join_hash.out b/src/test/regress/expected/join_hash.out
index a45e1450040..4749f6ed70d 100644
--- a/src/test/regress/expected/join_hash.out
+++ b/src/test/regress/expected/join_hash.out
@@ -559,6 +559,7 @@ create table join_foo as select generate_series(1, 3) as id, 'xxxxx'::text as t;
 alter table join_foo set (parallel_workers = 0);
 create table join_bar as select generate_series(1, 10000) as id, 'xxxxx'::text as t;
 alter table join_bar set (parallel_workers = 2);
+analyze join_foo, join_bar;
 -- multi-batch with rescan, parallel-oblivious
 savepoint settings;
 set enable_parallel_hash = off;
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 17d27ef3d46..559b87ec66e 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -158,20 +158,20 @@ SELECT t1, t2 FROM prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b WHERE t1.b = 0 ORDER
 --------------------------------------------------
  Sort
    Sort Key: t1.a, t2.b
-   ->  Hash Right Join
-         Hash Cond: (t2.b = t1.a)
+   ->  Hash Left Join
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_p1 t2_1
-               ->  Seq Scan on prt2_p2 t2_2
-               ->  Seq Scan on prt2_p3 t2_3
+               ->  Seq Scan on prt1_p1 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_p2 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_p3 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_p1 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_p2 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_p3 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_p1 t2_1
+                     ->  Seq Scan on prt2_p2 t2_2
+                     ->  Seq Scan on prt2_p3 t2_3
 (16 rows)

 SELECT t1, t2 FROM prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b WHERE t1.b = 0 ORDER BY t1.a, t2.b;
@@ -297,23 +297,23 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.a <
 -- Currently we can't do partitioned join if nullable-side partitions are pruned
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1 WHERE a < 450) t1 LEFT JOIN (SELECT * FROM prt2 WHERE b > 250)
t2ON t1.a = t2.b WHERE t1.b = 0 ORDER BY t1.a, t2.b; 
-                        QUERY PLAN
------------------------------------------------------------
+                     QUERY PLAN
+-----------------------------------------------------
  Sort
    Sort Key: prt1.a, prt2.b
-   ->  Hash Right Join
-         Hash Cond: (prt2.b = prt1.a)
+   ->  Hash Left Join
+         Hash Cond: (prt1.a = prt2.b)
          ->  Append
-               ->  Seq Scan on prt2_p2 prt2_1
-                     Filter: (b > 250)
-               ->  Seq Scan on prt2_p3 prt2_2
-                     Filter: (b > 250)
+               ->  Seq Scan on prt1_p1 prt1_1
+                     Filter: ((a < 450) AND (b = 0))
+               ->  Seq Scan on prt1_p2 prt1_2
+                     Filter: ((a < 450) AND (b = 0))
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_p1 prt1_1
-                           Filter: ((a < 450) AND (b = 0))
-                     ->  Seq Scan on prt1_p2 prt1_2
-                           Filter: ((a < 450) AND (b = 0))
+                     ->  Seq Scan on prt2_p2 prt2_1
+                           Filter: (b > 250)
+                     ->  Seq Scan on prt2_p3 prt2_2
+                           Filter: (b > 250)
 (15 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1 WHERE a < 450) t1 LEFT JOIN (SELECT * FROM prt2 WHERE b > 250)
t2ON t1.a = t2.b WHERE t1.b = 0 ORDER BY t1.a, t2.b; 
@@ -778,23 +778,23 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_e t1, prt2_e t2 WHERE (t1.a + t1.b)/2 =
    Sort Key: t1.a, t2.b
    ->  Append
          ->  Hash Join
-               Hash Cond: (((t2_1.b + t2_1.a) / 2) = ((t1_1.a + t1_1.b) / 2))
-               ->  Seq Scan on prt2_e_p1 t2_1
+               Hash Cond: (((t1_1.a + t1_1.b) / 2) = ((t2_1.b + t2_1.a) / 2))
+               ->  Seq Scan on prt1_e_p1 t1_1
+                     Filter: (c = 0)
                ->  Hash
-                     ->  Seq Scan on prt1_e_p1 t1_1
-                           Filter: (c = 0)
+                     ->  Seq Scan on prt2_e_p1 t2_1
          ->  Hash Join
-               Hash Cond: (((t2_2.b + t2_2.a) / 2) = ((t1_2.a + t1_2.b) / 2))
-               ->  Seq Scan on prt2_e_p2 t2_2
+               Hash Cond: (((t1_2.a + t1_2.b) / 2) = ((t2_2.b + t2_2.a) / 2))
+               ->  Seq Scan on prt1_e_p2 t1_2
+                     Filter: (c = 0)
                ->  Hash
-                     ->  Seq Scan on prt1_e_p2 t1_2
-                           Filter: (c = 0)
+                     ->  Seq Scan on prt2_e_p2 t2_2
          ->  Hash Join
-               Hash Cond: (((t2_3.b + t2_3.a) / 2) = ((t1_3.a + t1_3.b) / 2))
-               ->  Seq Scan on prt2_e_p3 t2_3
+               Hash Cond: (((t1_3.a + t1_3.b) / 2) = ((t2_3.b + t2_3.a) / 2))
+               ->  Seq Scan on prt1_e_p3 t1_3
+                     Filter: (c = 0)
                ->  Hash
-                     ->  Seq Scan on prt1_e_p3 t1_3
-                           Filter: (c = 0)
+                     ->  Seq Scan on prt2_e_p3 t2_3
 (21 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_e t1, prt2_e t2 WHERE (t1.a + t1.b)/2 = (t2.b + t2.a)/2 AND t1.c = 0 ORDER BY
t1.a,t2.b; 
@@ -864,26 +864,26 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2
  Sort
    Sort Key: t1.a, t2.b, ((t3.a + t3.b))
    ->  Append
-         ->  Hash Right Join
-               Hash Cond: (((t3_1.a + t3_1.b) / 2) = t1_1.a)
-               ->  Seq Scan on prt1_e_p1 t3_1
+         ->  Hash Left Join
+               Hash Cond: (t1_1.a = ((t3_1.a + t3_1.b) / 2))
+               ->  Hash Right Join
+                     Hash Cond: (t2_1.b = t1_1.a)
+                     ->  Seq Scan on prt2_p1 t2_1
+                     ->  Hash
+                           ->  Seq Scan on prt1_p1 t1_1
+                                 Filter: (b = 0)
                ->  Hash
-                     ->  Hash Right Join
-                           Hash Cond: (t2_1.b = t1_1.a)
-                           ->  Seq Scan on prt2_p1 t2_1
-                           ->  Hash
-                                 ->  Seq Scan on prt1_p1 t1_1
-                                       Filter: (b = 0)
-         ->  Hash Right Join
-               Hash Cond: (((t3_2.a + t3_2.b) / 2) = t1_2.a)
-               ->  Seq Scan on prt1_e_p2 t3_2
+                     ->  Seq Scan on prt1_e_p1 t3_1
+         ->  Hash Left Join
+               Hash Cond: (t1_2.a = ((t3_2.a + t3_2.b) / 2))
+               ->  Hash Right Join
+                     Hash Cond: (t2_2.b = t1_2.a)
+                     ->  Seq Scan on prt2_p2 t2_2
+                     ->  Hash
+                           ->  Seq Scan on prt1_p2 t1_2
+                                 Filter: (b = 0)
                ->  Hash
-                     ->  Hash Right Join
-                           Hash Cond: (t2_2.b = t1_2.a)
-                           ->  Seq Scan on prt2_p2 t2_2
-                           ->  Hash
-                                 ->  Seq Scan on prt1_p2 t1_2
-                                       Filter: (b = 0)
+                     ->  Seq Scan on prt1_e_p2 t3_2
          ->  Hash Right Join
                Hash Cond: (((t3_3.a + t3_3.b) / 2) = t1_3.a)
                ->  Seq Scan on prt1_e_p3 t3_3
@@ -921,21 +921,21 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2
    Sort Key: t1.a, t2.b, ((t3.a + t3.b))
    ->  Append
          ->  Nested Loop Left Join
-               ->  Hash Right Join
-                     Hash Cond: (t1_1.a = ((t3_1.a + t3_1.b) / 2))
-                     ->  Seq Scan on prt1_p1 t1_1
+               ->  Hash Left Join
+                     Hash Cond: (((t3_1.a + t3_1.b) / 2) = t1_1.a)
+                     ->  Seq Scan on prt1_e_p1 t3_1
+                           Filter: (c = 0)
                      ->  Hash
-                           ->  Seq Scan on prt1_e_p1 t3_1
-                                 Filter: (c = 0)
+                           ->  Seq Scan on prt1_p1 t1_1
                ->  Index Scan using iprt2_p1_b on prt2_p1 t2_1
                      Index Cond: (b = t1_1.a)
          ->  Nested Loop Left Join
-               ->  Hash Right Join
-                     Hash Cond: (t1_2.a = ((t3_2.a + t3_2.b) / 2))
-                     ->  Seq Scan on prt1_p2 t1_2
+               ->  Hash Left Join
+                     Hash Cond: (((t3_2.a + t3_2.b) / 2) = t1_2.a)
+                     ->  Seq Scan on prt1_e_p2 t3_2
+                           Filter: (c = 0)
                      ->  Hash
-                           ->  Seq Scan on prt1_e_p2 t3_2
-                                 Filter: (c = 0)
+                           ->  Seq Scan on prt1_p2 t1_2
                ->  Index Scan using iprt2_p2_b on prt2_p2 t2_2
                      Index Cond: (b = t1_2.a)
          ->  Nested Loop Left Join
@@ -1080,14 +1080,14 @@ SELECT COUNT(*) FROM prt1 FULL JOIN prt2 p2(b,a,c) USING(a,b) FULL JOIN prt2 p3(
 -- make sure these go to null as expected
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.phv, t2.b, t2.phv, t3.a + t3.b, t3.phv FROM ((SELECT 50 phv, * FROM prt1 WHERE prt1.b = 0) t1 FULL
JOIN(SELECT 75 phv, * FROM prt2 WHERE prt2.a = 0) t2 ON (t1.a = t2.b)) FULL JOIN (SELECT 50 phv, * FROM prt1_e WHERE
prt1_e.c= 0) t3 ON (t1.a = (t3.a + t3.b)/2) WHERE t1.a = t1.phv OR t2.b = t2.phv OR (t3.a + t3.b)/2 = t3.phv ORDER BY
t1.a,t2.b, t3.a + t3.b; 
-                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
+                                          QUERY PLAN
+----------------------------------------------------------------------------------------------
  Sort
    Sort Key: prt1.a, prt2.b, ((prt1_e.a + prt1_e.b))
-   ->  Append
-         ->  Hash Full Join
-               Hash Cond: (prt1_1.a = ((prt1_e_1.a + prt1_e_1.b) / 2))
-               Filter: ((prt1_1.a = (50)) OR (prt2_1.b = (75)) OR (((prt1_e_1.a + prt1_e_1.b) / 2) = (50)))
+   ->  Hash Full Join
+         Hash Cond: (prt1.a = ((prt1_e.a + prt1_e.b) / 2))
+         Filter: ((prt1.a = (50)) OR (prt2.b = (75)) OR (((prt1_e.a + prt1_e.b) / 2) = (50)))
+         ->  Append
                ->  Hash Full Join
                      Hash Cond: (prt1_1.a = prt2_1.b)
                      ->  Seq Scan on prt1_p1 prt1_1
@@ -1095,12 +1095,6 @@ SELECT t1.a, t1.phv, t2.b, t2.phv, t3.a + t3.b, t3.phv FROM ((SELECT 50 phv, * F
                      ->  Hash
                            ->  Seq Scan on prt2_p1 prt2_1
                                  Filter: (a = 0)
-               ->  Hash
-                     ->  Seq Scan on prt1_e_p1 prt1_e_1
-                           Filter: (c = 0)
-         ->  Hash Full Join
-               Hash Cond: (prt1_2.a = ((prt1_e_2.a + prt1_e_2.b) / 2))
-               Filter: ((prt1_2.a = (50)) OR (prt2_2.b = (75)) OR (((prt1_e_2.a + prt1_e_2.b) / 2) = (50)))
                ->  Hash Full Join
                      Hash Cond: (prt1_2.a = prt2_2.b)
                      ->  Seq Scan on prt1_p2 prt1_2
@@ -1108,12 +1102,6 @@ SELECT t1.a, t1.phv, t2.b, t2.phv, t3.a + t3.b, t3.phv FROM ((SELECT 50 phv, * F
                      ->  Hash
                            ->  Seq Scan on prt2_p2 prt2_2
                                  Filter: (a = 0)
-               ->  Hash
-                     ->  Seq Scan on prt1_e_p2 prt1_e_2
-                           Filter: (c = 0)
-         ->  Hash Full Join
-               Hash Cond: (prt1_3.a = ((prt1_e_3.a + prt1_e_3.b) / 2))
-               Filter: ((prt1_3.a = (50)) OR (prt2_3.b = (75)) OR (((prt1_e_3.a + prt1_e_3.b) / 2) = (50)))
                ->  Hash Full Join
                      Hash Cond: (prt1_3.a = prt2_3.b)
                      ->  Seq Scan on prt1_p3 prt1_3
@@ -1121,10 +1109,15 @@ SELECT t1.a, t1.phv, t2.b, t2.phv, t3.a + t3.b, t3.phv FROM ((SELECT 50 phv, * F
                      ->  Hash
                            ->  Seq Scan on prt2_p3 prt2_3
                                  Filter: (a = 0)
-               ->  Hash
+         ->  Hash
+               ->  Append
+                     ->  Seq Scan on prt1_e_p1 prt1_e_1
+                           Filter: (c = 0)
+                     ->  Seq Scan on prt1_e_p2 prt1_e_2
+                           Filter: (c = 0)
                      ->  Seq Scan on prt1_e_p3 prt1_e_3
                            Filter: (c = 0)
-(42 rows)
+(35 rows)

 SELECT t1.a, t1.phv, t2.b, t2.phv, t3.a + t3.b, t3.phv FROM ((SELECT 50 phv, * FROM prt1 WHERE prt1.b = 0) t1 FULL
JOIN(SELECT 75 phv, * FROM prt2 WHERE prt2.a = 0) t2 ON (t1.a = t2.b)) FULL JOIN (SELECT 50 phv, * FROM prt1_e WHERE
prt1_e.c= 0) t3 ON (t1.a = (t3.a + t3.b)/2) WHERE t1.a = t1.phv OR t2.b = t2.phv OR (t3.a + t3.b)/2 = t3.phv ORDER BY
t1.a,t2.b, t3.a + t3.b; 
  a  | phv | b  | phv | ?column? | phv
@@ -1146,11 +1139,11 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1, prt1_e t2 WHER
                ->  Sort
                      Sort Key: t1_5.b
                      ->  Hash Join
-                           Hash Cond: (((t2_1.a + t2_1.b) / 2) = t1_5.b)
-                           ->  Seq Scan on prt1_e_p1 t2_1
+                           Hash Cond: (t1_5.b = ((t2_1.a + t2_1.b) / 2))
+                           ->  Seq Scan on prt2_p1 t1_5
+                                 Filter: (a = 0)
                            ->  Hash
-                                 ->  Seq Scan on prt2_p1 t1_5
-                                       Filter: (a = 0)
+                                 ->  Seq Scan on prt1_e_p1 t2_1
          ->  Index Scan using iprt1_p1_a on prt1_p1 t1_2
                Index Cond: (a = ((t2_1.a + t2_1.b) / 2))
                Filter: (b = 0)
@@ -1160,11 +1153,11 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1, prt1_e t2 WHER
                ->  Sort
                      Sort Key: t1_6.b
                      ->  Hash Join
-                           Hash Cond: (((t2_2.a + t2_2.b) / 2) = t1_6.b)
-                           ->  Seq Scan on prt1_e_p2 t2_2
+                           Hash Cond: (t1_6.b = ((t2_2.a + t2_2.b) / 2))
+                           ->  Seq Scan on prt2_p2 t1_6
+                                 Filter: (a = 0)
                            ->  Hash
-                                 ->  Seq Scan on prt2_p2 t1_6
-                                       Filter: (a = 0)
+                                 ->  Seq Scan on prt1_e_p2 t2_2
          ->  Index Scan using iprt1_p2_a on prt1_p2 t1_3
                Index Cond: (a = ((t2_2.a + t2_2.b) / 2))
                Filter: (b = 0)
@@ -1944,12 +1937,12 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1 LEFT JOIN prt2_l t2 ON t1.a = t2.b
  Sort
    Sort Key: t1.a, t2.b
    ->  Append
-         ->  Hash Right Join
-               Hash Cond: ((t2_1.b = t1_1.a) AND ((t2_1.c)::text = (t1_1.c)::text))
-               ->  Seq Scan on prt2_l_p1 t2_1
+         ->  Hash Left Join
+               Hash Cond: ((t1_1.a = t2_1.b) AND ((t1_1.c)::text = (t2_1.c)::text))
+               ->  Seq Scan on prt1_l_p1 t1_1
+                     Filter: (b = 0)
                ->  Hash
-                     ->  Seq Scan on prt1_l_p1 t1_1
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_l_p1 t2_1
          ->  Hash Right Join
                Hash Cond: ((t2_2.b = t1_2.a) AND ((t2_2.c)::text = (t1_2.c)::text))
                ->  Seq Scan on prt2_l_p2_p1 t2_2
@@ -2968,26 +2961,26 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 LEFT JOIN prt2_adv t2 ON (t1.a =
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.b, t1.c, t2.a, t2.c FROM prt2_adv t1 LEFT JOIN prt1_adv t2 ON (t1.b = t2.a) WHERE t1.a = 0 ORDER BY t1.b,
t2.a;
-                       QUERY PLAN
----------------------------------------------------------
+                      QUERY PLAN
+------------------------------------------------------
  Sort
    Sort Key: t1.b, t2.a
-   ->  Hash Right Join
-         Hash Cond: (t2.a = t1.b)
+   ->  Hash Left Join
+         Hash Cond: (t1.b = t2.a)
          ->  Append
-               ->  Seq Scan on prt1_adv_p1 t2_1
-               ->  Seq Scan on prt1_adv_p2 t2_2
-               ->  Seq Scan on prt1_adv_p3 t2_3
+               ->  Seq Scan on prt2_adv_p1 t1_1
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_p2 t1_2
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_p3 t1_3
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_extra t1_4
+                     Filter: (a = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt2_adv_p1 t1_1
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_p2 t1_2
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_p3 t1_3
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_extra t1_4
-                           Filter: (a = 0)
+                     ->  Seq Scan on prt1_adv_p1 t2_1
+                     ->  Seq Scan on prt1_adv_p2 t2_2
+                     ->  Seq Scan on prt1_adv_p3 t2_3
 (18 rows)

 -- anti join
@@ -3031,26 +3024,26 @@ SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t
 -- partitions on the nullable side
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt2_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt1_adv t2 WHERE t1.b = t2.a) AND t1.a = 0 ORDER BY
t1.b;
-                       QUERY PLAN
----------------------------------------------------------
+                      QUERY PLAN
+------------------------------------------------------
  Sort
    Sort Key: t1.b
-   ->  Hash Right Anti Join
-         Hash Cond: (t2.a = t1.b)
+   ->  Hash Anti Join
+         Hash Cond: (t1.b = t2.a)
          ->  Append
-               ->  Seq Scan on prt1_adv_p1 t2_1
-               ->  Seq Scan on prt1_adv_p2 t2_2
-               ->  Seq Scan on prt1_adv_p3 t2_3
+               ->  Seq Scan on prt2_adv_p1 t1_1
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_p2 t1_2
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_p3 t1_3
+                     Filter: (a = 0)
+               ->  Seq Scan on prt2_adv_extra t1_4
+                     Filter: (a = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt2_adv_p1 t1_1
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_p2 t1_2
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_p3 t1_3
-                           Filter: (a = 0)
-                     ->  Seq Scan on prt2_adv_extra t1_4
-                           Filter: (a = 0)
+                     ->  Seq Scan on prt1_adv_p1 t2_1
+                     ->  Seq Scan on prt1_adv_p2 t2_2
+                     ->  Seq Scan on prt1_adv_p3 t2_3
 (18 rows)

 -- full join; currently we can't do partitioned join if there are no matched
@@ -3146,97 +3139,97 @@ ANALYZE prt2_adv;
 -- inner join
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a = t2.b) WHERE t1.b = 0 ORDER BY t1.a,
t2.b;
-                      QUERY PLAN
-------------------------------------------------------
+                       QUERY PLAN
+--------------------------------------------------------
  Sort
    Sort Key: t1.a
    ->  Hash Join
-         Hash Cond: (t2.b = t1.a)
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3_1 t2_3
-               ->  Seq Scan on prt2_adv_p3_2 t2_4
+               ->  Seq Scan on prt1_adv_p1 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p2 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p2 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3_1 t2_3
+                     ->  Seq Scan on prt2_adv_p3_2 t2_4
 (17 rows)

 -- semi join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
-                      QUERY PLAN
-------------------------------------------------------
+                       QUERY PLAN
+--------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Right Semi Join
-         Hash Cond: (t2.b = t1.a)
+   ->  Hash Semi Join
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3_1 t2_3
-               ->  Seq Scan on prt2_adv_p3_2 t2_4
+               ->  Seq Scan on prt1_adv_p1 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p2 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p2 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3_1 t2_3
+                     ->  Seq Scan on prt2_adv_p3_2 t2_4
 (17 rows)

 -- left join
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 LEFT JOIN prt2_adv t2 ON (t1.a = t2.b) WHERE t1.b = 0 ORDER BY t1.a,
t2.b;
-                      QUERY PLAN
-------------------------------------------------------
+                       QUERY PLAN
+--------------------------------------------------------
  Sort
    Sort Key: t1.a, t2.b
-   ->  Hash Right Join
-         Hash Cond: (t2.b = t1.a)
+   ->  Hash Left Join
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3_1 t2_3
-               ->  Seq Scan on prt2_adv_p3_2 t2_4
+               ->  Seq Scan on prt1_adv_p1 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p2 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p2 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3_1 t2_3
+                     ->  Seq Scan on prt2_adv_p3_2 t2_4
 (17 rows)

 -- anti join
 EXPLAIN (COSTS OFF)
 SELECT t1.* FROM prt1_adv t1 WHERE NOT EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY
t1.a;
-                      QUERY PLAN
-------------------------------------------------------
+                       QUERY PLAN
+--------------------------------------------------------
  Sort
    Sort Key: t1.a
-   ->  Hash Right Anti Join
-         Hash Cond: (t2.b = t1.a)
+   ->  Hash Anti Join
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3_1 t2_3
-               ->  Seq Scan on prt2_adv_p3_2 t2_4
+               ->  Seq Scan on prt1_adv_p1 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p2 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p1 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p2 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3_1 t2_3
+                     ->  Seq Scan on prt2_adv_p3_2 t2_4
 (17 rows)

 -- full join
@@ -3326,19 +3319,19 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a =
  Sort
    Sort Key: t1.a
    ->  Hash Join
-         Hash Cond: (t2.b = t1.a)
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3 t2_3
+               ->  Seq Scan on prt1_adv_p2 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p1 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p2 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p1 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3 t2_3
 (16 rows)

 ALTER TABLE prt2_adv DETACH PARTITION prt2_adv_p3;
@@ -3354,19 +3347,19 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_adv t1 INNER JOIN prt2_adv t2 ON (t1.a =
  Sort
    Sort Key: t1.a
    ->  Hash Join
-         Hash Cond: (t2.b = t1.a)
+         Hash Cond: (t1.a = t2.b)
          ->  Append
-               ->  Seq Scan on prt2_adv_p1 t2_1
-               ->  Seq Scan on prt2_adv_p2 t2_2
-               ->  Seq Scan on prt2_adv_p3 t2_3
+               ->  Seq Scan on prt1_adv_p2 t1_1
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p3 t1_2
+                     Filter: (b = 0)
+               ->  Seq Scan on prt1_adv_p1 t1_3
+                     Filter: (b = 0)
          ->  Hash
                ->  Append
-                     ->  Seq Scan on prt1_adv_p2 t1_1
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p3 t1_2
-                           Filter: (b = 0)
-                     ->  Seq Scan on prt1_adv_p1 t1_3
-                           Filter: (b = 0)
+                     ->  Seq Scan on prt2_adv_p1 t2_1
+                     ->  Seq Scan on prt2_adv_p2 t2_2
+                     ->  Seq Scan on prt2_adv_p3 t2_3
 (16 rows)

 DROP TABLE prt1_adv_p3;
@@ -5018,11 +5011,11 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2
                ->  Hash
                      ->  Seq Scan on beta_neg_p1 t2_1
          ->  Hash Join
-               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.b = t1_2.b))
-               ->  Seq Scan on beta_neg_p2 t2_2
+               Hash Cond: ((t1_2.a = t2_2.a) AND (t1_2.b = t2_2.b))
+               ->  Seq Scan on alpha_neg_p2 t1_2
+                     Filter: ((b >= 125) AND (b < 225))
                ->  Hash
-                     ->  Seq Scan on alpha_neg_p2 t1_2
-                           Filter: ((b >= 125) AND (b < 225))
+                     ->  Seq Scan on beta_neg_p2 t2_2
          ->  Hash Join
                Hash Cond: ((t2_4.a = t1_4.a) AND (t2_4.b = t1_4.b))
                ->  Append
@@ -5141,25 +5134,28 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.c = t2

 EXPLAIN (COSTS OFF)
 SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c) WHERE ((t1.b >=
100AND t1.b < 110) OR (t1.b >= 200 AND t1.b < 210)) AND ((t2.b >= 100 AND t2.b < 110) OR (t2.b >= 200 AND t2.b < 210))
ANDt1.c IN ('0004', '0009') ORDER BY t1.a, t1.b; 
-                                                           QUERY PLAN
         

---------------------------------------------------------------------------------------------------------------------------------
+                                                              QUERY PLAN
               

+--------------------------------------------------------------------------------------------------------------------------------------
  Sort
    Sort Key: t1.a, t1.b
    ->  Append
-         ->  Hash Join
-               Hash Cond: ((t1_1.a = t2_1.a) AND (t1_1.b = t2_1.b) AND (t1_1.c = t2_1.c))
-               ->  Seq Scan on alpha_neg_p1 t1_1
-                     Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b
<210)))) 
-               ->  Hash
+         ->  Merge Join
+               Merge Cond: ((t1_1.a = t2_1.a) AND (t1_1.b = t2_1.b) AND (t1_1.c = t2_1.c))
+               ->  Sort
+                     Sort Key: t1_1.a, t1_1.b, t1_1.c
+                     ->  Seq Scan on alpha_neg_p1 t1_1
+                           Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200)
AND(b < 210)))) 
+               ->  Sort
+                     Sort Key: t2_1.a, t2_1.b, t2_1.c
                      ->  Seq Scan on beta_neg_p1 t2_1
                            Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))
          ->  Hash Join
-               Hash Cond: ((t1_2.a = t2_2.a) AND (t1_2.b = t2_2.b) AND (t1_2.c = t2_2.c))
-               ->  Seq Scan on alpha_neg_p2 t1_2
-                     Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b
<210)))) 
+               Hash Cond: ((t2_2.a = t1_2.a) AND (t2_2.b = t1_2.b) AND (t2_2.c = t1_2.c))
+               ->  Seq Scan on beta_neg_p2 t2_2
+                     Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))
                ->  Hash
-                     ->  Seq Scan on beta_neg_p2 t2_2
-                           Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))
+                     ->  Seq Scan on alpha_neg_p2 t1_2
+                           Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200)
AND(b < 210)))) 
          ->  Nested Loop
                Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.b = t2_3.b) AND (t1_3.c = t2_3.c))
                ->  Seq Scan on alpha_pos_p2 t1_3
@@ -5172,7 +5168,7 @@ SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2
                      Filter: ((c = ANY ('{0004,0009}'::text[])) AND (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b
<210)))) 
                ->  Seq Scan on beta_pos_p3 t2_4
                      Filter: (((b >= 100) AND (b < 110)) OR ((b >= 200) AND (b < 210)))
-(29 rows)
+(32 rows)

 SELECT t1.*, t2.* FROM alpha t1 INNER JOIN beta t2 ON (t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c) WHERE ((t1.b >=
100AND t1.b < 110) OR (t1.b >= 200 AND t1.b < 210)) AND ((t2.b >= 100 AND t2.b < 110) OR (t2.b >= 200 AND t2.b < 210))
ANDt1.c IN ('0004', '0009') ORDER BY t1.a, t1.b; 
  a  |  b  |  c   | a  |  b  |  c
diff --git a/src/test/regress/sql/join_hash.sql b/src/test/regress/sql/join_hash.sql
index 6b0688ab0a6..49d3fd61856 100644
--- a/src/test/regress/sql/join_hash.sql
+++ b/src/test/regress/sql/join_hash.sql
@@ -314,6 +314,7 @@ create table join_foo as select generate_series(1, 3) as id, 'xxxxx'::text as t;
 alter table join_foo set (parallel_workers = 0);
 create table join_bar as select generate_series(1, 10000) as id, 'xxxxx'::text as t;
 alter table join_bar set (parallel_workers = 2);
+analyze join_foo, join_bar;

 -- multi-batch with rescan, parallel-oblivious
 savepoint settings;
--
2.43.7

Re: Fixing some ancient errors in hash join costing

От

Chao Li

Дата:

29 декабря 2025 г., 03:39:38


> On Dec 29, 2025, at 06:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
>> I was amused to notice that the postgres_fdw.out change made in my
>> patch reverts one made in aa86129e1 (which also affected semijoin
>> costing).  So we've had trouble before with that test case being
>> fundamentally unstable.  I wonder if we shouldn't do something to try
>> to stabilize it?  I see that the test immediately before this one
>> forces the matter by turning off enable_sort (which'd affect only
>> the local side not the remote).  That's a hack all right but maybe
>> we should extend it to this test.
>
> Here's a v2 patchset that splits that out as a separate change for
> clarity's sake.  I also spent a bit of effort on commit log messages,
> including researching the git history.
>
> regards, tom lane
>

Just a few small comments on v2:

1 - 0001
```
-SET enable_hashjoin TO off;
+-- These next tests require choosing between remote and local sort, which is
+-- a coin flip so long as cost_sort() gives the same results on both sides.
+-- To stabilize the expected plans, disable sorting locally.
SET enable_sort TO off;
-- subquery using stable function (can't be sent to remote)
+SET enable_hashjoin TO off;  -- this one needs even more help to be stable
```

This is not a concern, just curious why switch the setting order of enable_hashjoin and enable_sort?

2 - 0002
```
+        else if (get_attstatsslot(&sslot, vardata.statsTuple,
+                                  STATISTIC_KIND_HISTOGRAM, InvalidOid,
+                                  0))
+        {
+            /*
+             * If there are no recorded MCVs, but we do have a histogram, then
+             * assume that ANALYZE determined that the column is unique.
+             */
+            if (vardata.rel && vardata.rel->rows > 0)
+                *mcv_freq = 1.0 / vardata.rel->rows;
+            free_attstatsslot(&sslot);
+        }
```

As flag 0 is passed to get_attstatsslot(), free_attstatsslot() is not needed. The function comment of get_attstatsslot
statesabout that: 
```
* Passing flags=0 can be useful to quickly check if the requested slot type
* exists. In this case no arrays are extracted, so free_attstatsslot need
* not be called.
```

3 - In funciton estimate_hash_bucket_stats(), when mcv_freq is initialized:
```
/* Look up the frequency of the most common value, if available */
*mcv_freq = 0.0;
```

Maybe worth adding a short comment like “0.0 doesn’t mean zero frequency, instead 0.0 means no data or unknown
frequency”,which might help code readers to quickly understand the logic. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: Fixing some ancient errors in hash join costing

От

Tom Lane

Дата:

29 декабря 2025 г., 06:29:16

Chao Li <li.evan.chao@gmail.com> writes:
> Just a few small comments on v2:

> This is not a concern, just curious why switch the setting order of enable_hashjoin and enable_sort?

The need to disable hashjoin only applies to one of the two tests, so
it seemed to me that this way is more nicely nested.  Judgment call
of course.

> As flag 0 is passed to get_attstatsslot(), free_attstatsslot() is not needed.

True.  I wrote it like that so people wouldn't wonder if I'd forgotten
free_attstatsslot(), but if other call sites passing flags == 0 don't
use it then it'd be better to be consistent.  (I didn't check that.)

> Maybe worth adding a short comment like “0.0 doesn’t mean zero frequency, instead 0.0 means no data or unknown
frequency”,which might help code readers to quickly understand the logic. 

Doesn't the function's header comment cover that adequately?

            regards, tom lane

Re: Fixing some ancient errors in hash join costing

От

Chao Li

Дата:

29 декабря 2025 г., 09:19:05


> On Dec 29, 2025, at 11:29, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Chao Li <li.evan.chao@gmail.com> writes:
>> Just a few small comments on v2:
>
>> This is not a concern, just curious why switch the setting order of enable_hashjoin and enable_sort?
>
> The need to disable hashjoin only applies to one of the two tests, so
> it seemed to me that this way is more nicely nested.  Judgment call
> of course.

Make sense. Making enable_hashjoin closer to the test that depends on it and immediately resetting it after the test,
whichis clearer. 


>
>> As flag 0 is passed to get_attstatsslot(), free_attstatsslot() is not needed.
>
> True.  I wrote it like that so people wouldn't wonder if I'd forgotten
> free_attstatsslot(), but if other call sites passing flags == 0 don't
> use it then it'd be better to be consistent.  (I didn't check that.)

I searched over the source tree, looks like not a direct reference. The only usages of flag 0 is in eqjoinsel(), the
codesnippet is: 
```
    /*
     * There is no use in fetching one side's MCVs if we lack MCVs for the
     * other side, so do a quick check to verify that both stats exist.
     */
    get_mcv_stats = (HeapTupleIsValid(vardata1.statsTuple) &&
                     HeapTupleIsValid(vardata2.statsTuple) &&
                     get_attstatsslot(&sslot1, vardata1.statsTuple,
                                      STATISTIC_KIND_MCV, InvalidOid,
                                      0) &&
                     get_attstatsslot(&sslot2, vardata2.statsTuple,
                                      STATISTIC_KIND_MCV, InvalidOid,
                                      0));

    if (HeapTupleIsValid(vardata1.statsTuple))
    {
        /* note we allow use of nullfrac regardless of security check */
        stats1 = (Form_pg_statistic) GETSTRUCT(vardata1.statsTuple);
        if (get_mcv_stats &&
            statistic_proc_security_check(&vardata1, opfuncoid))
            have_mcvs1 = get_attstatsslot(&sslot1, vardata1.statsTuple,
                                          STATISTIC_KIND_MCV, InvalidOid,
                                          ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS);
    }

    if (HeapTupleIsValid(vardata2.statsTuple))
    {
        /* note we allow use of nullfrac regardless of security check */
        stats2 = (Form_pg_statistic) GETSTRUCT(vardata2.statsTuple);
        if (get_mcv_stats &&
            statistic_proc_security_check(&vardata2, opfuncoid))
            have_mcvs2 = get_attstatsslot(&sslot2, vardata2.statsTuple,
                                          STATISTIC_KIND_MCV, InvalidOid,
                                          ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS);
    }
```

Here sslot1 and sslot2 are called with flag 0 first, then when called again with non-0 flags, sslot1 and sslot2 are not
free-edfirst, which implies that free_attstatsslot() is not need to be called. 

If a reader considers free_attstatsslot() is necessary in this patch, then he might concern memory leaks in
eqjoinsel().


>
>> Maybe worth adding a short comment like “0.0 doesn’t mean zero frequency, instead 0.0 means no data or unknown
frequency”,which might help code readers to quickly understand the logic. 
>
> Doesn't the function's header comment cover that adequately?

Yep, fair point. The function comment does explain that.

My thought was just that at the point of initialization, the nearby comment reads as if we’re about to fetch a value,
whereasthe assignment is really initializing with “unknown so far”.  A short tweak like “Initialize MCV frequency to
unknown”might help make that intent obvious locally, but I’m fine either way. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Re: Fixing some ancient errors in hash join costing

От

Tom Lane

Дата:

29 декабря 2025 г., 21:08:39

Chao Li <li.evan.chao@gmail.com> writes:
> On Dec 29, 2025, at 11:29, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Chao Li <li.evan.chao@gmail.com> writes:
>>> As flag 0 is passed to get_attstatsslot(), free_attstatsslot() is not needed.

>> True.  I wrote it like that so people wouldn't wonder if I'd forgotten
>> free_attstatsslot(), but if other call sites passing flags == 0 don't
>> use it then it'd be better to be consistent.  (I didn't check that.)

> I searched over the source tree, looks like not a direct reference. The only usages of flag 0 is in eqjoinsel(), the
codesnippet is: 

Yeah, that's not a direct precedent since there is a
free_attstatsslot() further down; nonetheless it is relying on the
call with flags==0 to not do anything that'd need freeing, since
there's later calls that may overwrite the sslot structs.  But
get_attstatsslot's API spec is clear enough that you don't need
free_attstatsslot() in this usage, so I removed it.

> My thought was just that at the point of initialization, the nearby comment reads as if we’re about to fetch a value,
whereasthe assignment is really initializing with “unknown so far”.  A short tweak like “Initialize MCV frequency to
unknown”might help make that intent obvious locally, but I’m fine either way. 

Fair enough, I modified it along those lines.

Pushed, thanks for reviewing.

            regards, tom lane

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Fixing some ancient errors in hash join costing

Fixing some ancient errors in hash join costing

Re: Fixing some ancient errors in hash join costing

Re: Fixing some ancient errors in hash join costing

Re: Fixing some ancient errors in hash join costing

Re: Fixing some ancient errors in hash join costing

Re: Fixing some ancient errors in hash join costing

Re: Fixing some ancient errors in hash join costing