Re: Partition-wise join for join between (declaratively) partitioned tables

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: Partition-wise join for join between (declaratively) partitioned tables
Дата
Msg-id CAFjFpRdsWgECmvx=QrRV2_OUQqjQP-UX0bV9FtTpTm67Gs9vbw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Partition-wise join for join between (declaratively) partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Ответы Re: Partition-wise join for join between (declaratively) partitioned tables  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Partition-wise join for join between (declaratively) partitioned tables  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Re: Partition-wise join for join between (declaratively) partitioned tables  (Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>)
Re: Partition-wise join for join between (declaratively) partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Список pgsql-hackers
Hi All,

PFA the patch to support partition-wise joins for partitioned tables. The patch
is based on the declarative parition support patches provided by Amit Langote
on 26th August 2016. The previous patch added support to assess whether two
tables can be joined using partition-wise join technique, but did not have
complete support to create plans which used partition-wise technique. This
patch implements three important pieces for supporting partition-wise join

1. Logic to assess whether join between two partitioned tables can be executed
using partition-wise join technique.
2. Construct RelOptInfo's representating join between matching partitions of
the joining relations and add join paths to those RelOptInfo's
3. Add append paths to the RelOptInfo representing the join between partitioned
tables. Rest of the planner code chooses the optimal path for join.

make_join_rel() now calls try_partition_wise_join(), which executes all of the
steps listed above. If the joining partitioned relations are deemed fit for
partition-wise join, we create one RelOptInfo (if not already present)
representing a join between every pair of partitions to be joined. Since the
join between parents is deemed legal, the join between the partitions is also
legal, hence legality of the join is not checked again. RelOptInfo representing
the join between partitions is constructed by translating the relevant members
of RelOptInfo of the parent join relation. Similarly SpecialJoinInfo,
restrictlist (for given join order) are constructed by translating those for
the parent join.

make_join_rel() is split into two portions, a. that deals with constructing
restrictlist and RelOptInfo for join relation b. that creates paths for the
join. The second portion is separated into a function
populate_joinrel_with_paths(), which is reused in try_partition_wise_join() to
create paths for join between matching partitions.

set_append_rel_pathlist() generates paths for child relations, marks the empty
children as dummy relations and creates append paths by collecting paths with
similar properties (parameterization and pathkeys) from non-empty children. It
then adds append paths to the parent relation. This patch divides
set_append_rel_pathlist() into two parts a. marking empty child relations as
dummy and generating paths for non-empty children. b. collecting children paths
into append paths for parent. Part b is separate into a function
add_paths_to_append_rel() which is reused for collecting paths from
partition-wise join child relations to construct append paths for join between
partitioned tables.

For an N-way join between partitioned tables, make_join_rel() is called as many
times as the number of valid join orders exist. For each such call, we will add
paths to join between partitions for corresponding join order between those
partitions. We can generate the append paths for parent joinrel only after all
such join orders have been considered. Hence before setting cheapest path forx
parent join relation, we set the cheapest path for each join relation between
partitions, followed by creating append paths for the parent joinrel. This
method needs some readjustment for multi-level partitions (TODO item 2 below).

A GUC enable_partition_wise_join is added to enable or disable partition-wise
join technique. I think the GUC is useful similar to other join related GUCs
like enable_hashjoin.

parameterized paths: While creating parameterized paths for child relations of
a partitioned tables, we do not have an idea as to whether we will be able to
use partition-wise join technique or not. Also we do not know the child
partition of the other partitioned table, to which a given partition would
join. Hence we do not create paths parameterized by child partitions of other
partitioned relations. But path for child of a partitioned relation
parameterized by other parent relation, can be considered to be parameterised
by any child relation of the other partitioned relation by replacing the parent
parameters by corresponding child parameters. This condition is used to
eliminate parameterized paths while creating merge and hash joins, to decide
the resultant parameterization of a join between child partitions and to create
nested loop paths with inner path parameterized by outer relation where inner
and outer relations are child partitions. While creating such nest loop join
paths we translate the path parameterized by other parent partitioned relation,
to that parameterized by the required child.

Functions like select_outer_pathkeys_for_merge(), make_sort_from_pathkeys(),
find_ec_member_for_tle() which did not expect to be called for a child
relation, are now used for child partition relations for joins. These functions
are adjusted for that usage.

Testing:
I have added partition_join.sql testcase to test partition-wise join feature.
That file has extensive tests for list, range, multi-level partitioning schemes
and various kinds of joins including nested loop join with inner relation
parameterized by outer relationThat file has extensive tests for list, range,
multi-level partitioning schemes and various kinds of joins including nested
loop join with inner relation parameterized by outer relation.

make check passes clean.

TODOs:

1. Instead of storing partitioning information in RelOptInfo of each of the
partitioned relations (base and join relations), we can keep a list of
canonical partition schemes in PlannerInfo. Every RelOptInfo gets a pointer to
the member of list representing the partitioning scheme of corresponding
relation. RelOptInfo's of all similarly partitioned relations get the same
pointer thus making it easy to match the partitioning schemes by comparing the
pointers. While we are supporting only exact partition matching scheme now,
it's possible to extend this method to match compatible partitioning schemes by
maintaining a list of compatible partitioning schemes.

Right now, I have moved some partition related structures from partition.c to
partition.h. These structures are still being reviewed and might change when
Amit Langote improves his patches. Having canonical partitioning scheme in
PlannerInfo may not require moving those structures out. So, that code is still
under development. A related change is renaming RangeBound structure in Amit
Langote's patches to PartitionRangeBound to avoid name conflict with
rangetypes.h. That change too should vanish once we decide where to keep that
structure and its final name.

2. Multi-level partitioned tables: For some reason path created for joining
partitions are not being picked up as the cheapest paths. I think, we need to
finalize the lower level paths before moving upwards in the partition
hierarchy. But I am yet to investigate the issue here. RelOptInfo::parent_relid
should point to top parents rather than immediate parents.

3. Testing: need more tests for testing partition-wise join with foreign tables
as partitions. More tests for parameterized joins for multi-level partitioned
joins.

4. Remove bms_to_char(): I have added this function to print Relids in the
debugger. I have found it very useful to quickly examine Relids in debugger,
which otherwise wasn't so easy. If others find it useful too, I can create a
separate patch to be considered for a separate commit.

5. In add_paths_to_append_rel() to find the possible set of outer relations for
generating parameterized paths for a given join. This code needs to be adjusted
to eliminate the parent relations possible set of outer relations for a join
between child partitions.

6. Add support to reparameterize more types of paths for child relations. I
will add this once we finalize the method to reparameterize a parent path for
child partition.

7. The patch adds make_joinrel() (name needs to be changed because of its
similariy with make_join_rel()) to construct an empty RelOptInfo for a join
between partitions. The function copies code doing the same from
build_join_rel(). build_join_rel() too can use this function, if we decide to
retain it.

8. Few small TODOs related to code reorganization, proper function,
variable naming etc. are in the patch. pg_indent run.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Craig Ringer
Дата:
Сообщение: Re: ICU integration
Следующее
От: Magnus Hagander
Дата:
Сообщение: Re: CF app and patch series