Re: [PoC] Reducing planning time when tables have many partitions

Поиск
Список
Период
Сортировка
От Yuya Watari
Тема Re: [PoC] Reducing planning time when tables have many partitions
Дата
Msg-id CAJ2pMkZt6r94NUYm-F77FYahjgnMrY4CLHGAD7HxYZxGVwCaow@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PoC] Reducing planning time when tables have many partitions  (jian he <jian.universality@gmail.com>)
Список pgsql-hackers
Hello,

Thank you for reviewing these patches.

On Thu, May 2, 2024 at 11:35 PM jian he <jian.universality@gmail.com> wrote:
> on v24-0001:
> +/*
> + * add_eq_member - build a new EquivalenceMember and add it to an EC
> + */
> +static EquivalenceMember *
> +add_eq_member(EquivalenceClass *ec, Expr *expr, Relids relids,
> +  JoinDomain *jdomain, Oid datatype)
> +{
> + EquivalenceMember *em = make_eq_member(ec, expr, relids, jdomain,
> +   NULL, datatype);
> +
> + ec->ec_members = lappend(ec->ec_members, em);
> + return em;
> +}
> +
> this part seems so weird to me.
> add_eq_member function was added very very long ago,
> why do we create a function with the same function name?
>
> also I didn't see deletion of original add_eq_member function
> (https://git.postgresql.org/cgit/postgresql.git/tree/src/backend/optimizer/path/equivclass.c#n516)
> in v24-0001.

Actually, this patch does not recreate the add_eq_member() function
but splits it into two functions: add_eq_member() and
make_eq_member().

The reason why planning takes so long time in the current
implementation is that EquivalenceClasses have a large number of child
EquivalenceMembers, and the linear search for them is time-consuming.
To solve this problem, the patch makes EquivalenceClasses have only
parent members. There are few parent members, so we can speed up the
search. In the patch, the child members are introduced when needed.

The add_eq_member() function originally created EquivalenceMembers and
added them to ec_members. In the patch, this function is split into
the following two functions.

1. make_eq_member
Creates a new (parent or child) EquivalenceMember and returns it
without adding it to ec_members.
2. add_eq_member
Creates a new parent (not child) EquivalenceMember and adds it to
ec_members. Internally calls make_eq_member.

When we create parent members, we simply call add_eq_member(). This is
the same as the current implementation. When we create child members,
we have to do something different. Look at the code below. The
add_child_rel_equivalences() function creates child members. The patch
creates child EquivalenceMembers by the make_eq_member() function and
stores them in RelOptInfo (child_rel->eclass_child_members) instead of
their parent EquivalenceClass->ec_members. When we need child
EquivalenceMembers, we get them via RelOptInfos.

=====
void
add_child_rel_equivalences(PlannerInfo *root,
                           AppendRelInfo *appinfo,
                           RelOptInfo *parent_rel,
                           RelOptInfo *child_rel)
{
    ...
    i = -1;
    while ((i = bms_next_member(parent_rel->eclass_indexes, i)) >= 0)
    {
        ...
        foreach(lc, cur_ec->ec_members)
        {
            ...
            if (bms_is_subset(cur_em->em_relids, top_parent_relids) &&
                !bms_is_empty(cur_em->em_relids))
            {
                /* OK, generate transformed child version */
                ...
                child_em = make_eq_member(cur_ec, child_expr, new_relids,
                                          cur_em->em_jdomain,
                                          cur_em, cur_em->em_datatype);
                child_rel->eclass_child_members =
lappend(child_rel->eclass_child_members,
                                                          child_em);
                ...
            }
        }
    }
}
=====

I didn't change the name of add_eq_member, but it might be better to
change it to something like add_parent_eq_member(). Alternatively,
creating a new function named add_child_eq_member() that adds child
members to RelOptInfo can be a solution. I will consider these changes
in the next version.

> Obviously, now I cannot compile it correctly.
> What am I missing?

Thank you for pointing this out. This is due to a conflict with a
recent commit [1]. This commit introduces a new function named
add_setop_child_rel_equivalences(), which is quoted below. This
function creates a new child EquivalenceMember by calling
add_eq_member(). We have to adjust this function to make my patch
work, but it is not so easy. I'm sorry it will take some time to solve
this conflict, but I will post a new version when it is fixed.

=====
/*
 * add_setop_child_rel_equivalences
 *      Add equivalence members for each non-resjunk target in 'child_tlist'
 *      to the EquivalenceClass in the corresponding setop_pathkey's pk_eclass.
 *
 * 'root' is the PlannerInfo belonging to the top-level set operation.
 * 'child_rel' is the RelOptInfo of the child relation we're adding
 * EquivalenceMembers for.
 * 'child_tlist' is the target list for the setop child relation.  The target
 * list expressions are what we add as EquivalenceMembers.
 * 'setop_pathkeys' is a list of PathKeys which must contain an entry for each
 * non-resjunk target in 'child_tlist'.
 */
void
add_setop_child_rel_equivalences(PlannerInfo *root, RelOptInfo *child_rel,
                                 List *child_tlist, List *setop_pathkeys)
{
    ListCell   *lc;
    ListCell   *lc2 = list_head(setop_pathkeys);

    foreach(lc, child_tlist)
    {
        TargetEntry *tle = lfirst_node(TargetEntry, lc);
        EquivalenceMember *parent_em;
        PathKey    *pk;

        if (tle->resjunk)
            continue;

        if (lc2 == NULL)
            elog(ERROR, "too few pathkeys for set operation");

        pk = lfirst_node(PathKey, lc2);
        parent_em = linitial(pk->pk_eclass->ec_members);

        /*
         * We can safely pass the parent member as the first member in the
         * ec_members list as this is added first in generate_union_paths,
         * likewise, the JoinDomain can be that of the initial member of the
         * Pathkey's EquivalenceClass.
         */
        add_eq_member(pk->pk_eclass,
                      tle->expr,
                      child_rel->relids,
                      parent_em->em_jdomain,
                      parent_em,
                      exprType((Node *) tle->expr));

        lc2 = lnext(setop_pathkeys, lc2);
    }

    /*
     * transformSetOperationStmt() ensures that the targetlist never contains
     * any resjunk columns, so all eclasses that exist in 'root' must have
     * received a new member in the loop above.  Add them to the child_rel's
     * eclass_indexes.
     */
    child_rel->eclass_indexes = bms_add_range(child_rel->eclass_indexes, 0,

list_length(root->eq_classes) - 1);
}
=====

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=66c0185a3d14bbbf51d0fc9d267093ffec735231

--
Best regards,
Yuya Watari



В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: PostgreSQL 17 Beta 1 release announcement draft
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: First draft of PG 17 release notes