Re: Multi-Column List Partitioning

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: Multi-Column List Partitioning
Дата
Msg-id CA+HiwqEFDvmgeAkDtxmEygRqS2FHDtPPOq107wwmc-w5EzxvWA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Multi-Column List Partitioning  (Amit Langote <amitlangote09@gmail.com>)
Ответы Re: Multi-Column List Partitioning  (Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>)
Список pgsql-hackers
On Wed, Sep 1, 2021 at 2:31 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Tue, Aug 31, 2021 at 8:02 PM Nitin Jadhav
> <nitinjadhavpostgres@gmail.com> wrote:
> > The attached patch also fixes the above comments.
>
> I noticed that multi-column list partitions containing NULLs don't
> work correctly with partition pruning yet.
>
> create table p0 (a int, b text, c bool) partition by list (a, b, c);
> create table p01 partition of p0 for values in ((1, 1, true), (NULL, 1, false));
> create table p02 partition of p0 for values in ((1, NULL, false));
> explain select * from p0 where a is null;
>                        QUERY PLAN
> --------------------------------------------------------
>  Seq Scan on p01 p0  (cost=0.00..22.50 rows=6 width=37)
>    Filter: (a IS NULL)
> (2 rows)
>
> I guess that may be due to the following newly added code being incomplete:
>
> +/*
> + * get_partition_bound_null_index
> + *
> + * Returns the partition index of the partition bound which accepts NULL.
> + */
> +int
> +get_partition_bound_null_index(PartitionBoundInfo boundinfo)
> +{
> +   int i = 0;
> +   int j = 0;
> +
> +   if (!boundinfo->isnulls)
> +       return -1;
>
> -           if (!val->constisnull)
> -               count++;
> +   for (i = 0; i < boundinfo->ndatums; i++)
> +   {
> +       //TODO: Handle for multi-column cases
> +       for (j = 0; j < 1; j++)
> +       {
> +           if (boundinfo->isnulls[i][j])
> +               return boundinfo->indexes[i];
>         }
>     }
>
> +   return -1;
> +}
>
> Maybe this function needs to return a "bitmapset" of indexes, because
> multiple partitions can now contain NULL values.
>
> Some other issues I noticed and suggestions for improvement:
>
> +/*
> + * checkForDuplicates
> + *
> + * Returns TRUE if the list bound element is already present in the list of
> + * list bounds, FALSE otherwise.
> + */
> +static bool
> +checkForDuplicates(List *source, List *searchElem)
>
> This function name may be too generic.  Given that it is specific to
> implementing list bound de-duplication, maybe the following signature
> is more appropriate:
>
> static bool
> checkListBoundDuplicated(List *list_bounds, List *new_bound)
>
> Also, better if the function comment mentions those parameter names, like:
>
> "Returns TRUE if the list bound element 'new_bound' is already present
> in the target list 'list_bounds', FALSE otherwise."
>
> +/*
> + * transformPartitionListBounds
> + *
> + * Converts the expressions of list partition bounds from the raw grammar
> + * representation.
>
> A sentence about the result format would be helpful, like:
>
> The result is a List of Lists of Const nodes to account for the
> partition key possibly containing more than one column.
>
> +   int             i = 0;
> +   int             j = 0;
>
> Better to initialize such loop counters closer to the loop.
>
> +           colname[i] = (char *) palloc0(NAMEDATALEN * sizeof(char));
> +           colname[i] = get_attname(RelationGetRelid(parent),
> +                                    key->partattrs[i], false);
>
> The palloc in the 1st statement is wasteful, because the 2nd statement
> overwrites its pointer by the pointer to the string palloc'd by
> get_attname().
>
> +           ListCell   *cell2 = NULL;
>
> No need to explicitly initialize the loop variable.
>
> +           RowExpr     *rowexpr = NULL;
> +
> +           if (!IsA(expr, RowExpr))
> +               ereport(ERROR,
> +                       (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
> +                       errmsg("Invalid list bound specification"),
> +                       parser_errposition(pstate, exprLocation((Node
> *) spec))));
> +
> +           rowexpr = (RowExpr *) expr;
>
> It's okay to assign rowexpr at the top here instead of the dummy
> NULL-initialization and write the condition as:
>
>     if (!IsA(rowexpr, RowExpr))
>
> +       if (isDuplicate)
> +           continue;
> +
> +       result = lappend(result, values);
>
> I can see you copied this style from the existing code, but how about
> writing this simply as:
>
>     if (!isDuplicate)
>         result = lappend(result, values);
>
> -/* One value coming from some (index'th) list partition */
> +/* One bound of a list partition */
>  typedef struct PartitionListValue
>  {
>     int         index;
> -   Datum       value;
> +   Datum      *values;
> +   bool       *isnulls;
>  } PartitionListValue;
>
> Given that this is a locally-defined struct, I wonder if it makes
> sense to rename the struct while we're at it.  Call it, say,
> PartitionListBound?
>
> Also, please keep part of the existing comment that says that the
> bound belongs to index'th partition.
>
> Will send more comments in a bit...

+ * partition_bound_accepts_nulls
+ *
+ * Returns TRUE if partition bound has NULL value, FALSE otherwise.
  */

I suggest slight rewording, as follows:

"Returns TRUE if any of the partition bounds contains a NULL value,
FALSE otherwise."

-   PartitionListValue *all_values;
+   PartitionListValue **all_values;
...
-   all_values = (PartitionListValue *)
-       palloc(ndatums * sizeof(PartitionListValue));
+   ndatums = get_list_datum_count(boundspecs, nparts);
+   all_values = (PartitionListValue **)
+       palloc(ndatums * sizeof(PartitionListValue *));

I don't see the need to redefine all_values's pointer type.  No need
to palloc PartitionListValue repeatedly for every datum as done
further down as follows:

+           all_values[j] = (PartitionListValue *)
palloc(sizeof(PartitionListValue));

You do need the following two though:

+           all_values[j]->values = (Datum *) palloc0(key->partnatts *
sizeof(Datum));
+           all_values[j]->isnulls = (bool *) palloc0(key->partnatts *
sizeof(bool));

If you change the above the way I suggest, you'd also need to revert
the following change:

-   qsort_arg(all_values, ndatums, sizeof(PartitionListValue),
+   qsort_arg(all_values, ndatums, sizeof(PartitionListValue *),
              qsort_partition_list_value_cmp, (void *) key);

+       int         orig_index = all_values[i]->index;
+       boundinfo->datums[i] = (Datum *) palloc(key->partnatts * sizeof(Datum));

Missing a newline between these two statements.

BTW, I noticed that the boundDatums variable is no longer used in
create_list_bounds.  I traced back its origin and found that a recent
commit 53d86957e98 introduced it to implement an idea to reduce the
finer-grained pallocs that were being done in create_list_bounds().  I
don't think that this patch needs to throw away that work.  You can
make it work as the attached delta patch that applies on top of v3.
Please check.

@@ -915,7 +949,7 @@ partition_bounds_equal(int partnatts, int16
*parttyplen, bool *parttypbyval,
    if (b1->nindexes != b2->nindexes)
        return false;

-   if (b1->null_index != b2->null_index)
+   if (get_partition_bound_null_index(b1) !=
get_partition_bound_null_index(b2))

As mentioned in the last message, this bit in partition_bounds_equal()
needs to be comparing "bitmapsets" of null bound indexes, that is
after fixing get_partition_bound_null_index() as previously mentioned.

But...

@@ -988,7 +1022,22 @@ partition_bounds_equal(int partnatts, int16
*parttyplen, bool *parttypbyval,
                 * context.  datumIsEqual() should be simple enough to be
                 * safe.
                 */
-               if (!datumIsEqual(b1->datums[i][j], b2->datums[i][j],
+               if (b1->isnulls)
+                   b1_isnull = b1->isnulls[i][j];
+               if (b2->isnulls)
+                   b2_isnull = b2->isnulls[i][j];
+
+               /*
+                * If any of the partition bound has NULL value, then check
+                * equality for the NULL value instead of comparing the datums
+                * as it does not contain valid value in case of NULL.
+                */
+               if (b1_isnull || b2_isnull)
+               {
+                   if (b1_isnull != b2_isnull)
+                       return false;
+               }

...if you have this in the main loop, I don't think we need the above
code stanza which appears to implement a short-cut for this long-form
logic.

+               (key->strategy != PARTITION_STRATEGY_LIST ||
+                !src->isnulls[i][j]))

I think it's better to write this condition as follows just like the
accompanying condition involving src->kind:

    (src->nulls == NULL || !src->isnulls[i][j])

(Skipped looking at merge_list_bounds() and related changes for now as
I see a lot of TODOs remain to be done.)

In check_new_partition_bound():

+                       Datum      *values = (Datum *)
palloc0(key->partnatts * sizeof(Datum));
+                       bool       *isnulls = (bool *)
palloc0(key->partnatts * sizeof(bool));

Doesn't seem like a bad idea to declare these as:

    Datum    values[PARTITION_MAX_KEYS];
    bool        isnulls[PARTITION_MAX_KEYS];


I looked at get_qual_for_list_multi_column() and immediately thought
that it may be a bad idea.  I think it's better to integrate the logic
for multi-column case into the existing function even if that makes
the function appear more complex.  Having two functions with the same
goal and mostly the same code is not a good idea mainly because it
becomes a maintenance burden.

I have attempted a rewrite such that get_qual_for_list() now handles
both the single-column and multi-column cases.  Changes included in
the delta patch.  The patch updates some outputs of the newly added
tests for multi-column list partitions, because the new code emits the
IS NOT NULL tests a bit differently than
get_qual_for_list_mutli_column() would.  Notably, the old approach
would emit IS NOT NULL for every non-NULL datum matched to a given
column, not just once for the column.  However, the patch makes a few
other tests fail, mainly because I had to fix
partition_bound_accepts_nulls() to handle the multi-column case,
though didn't bother to update all callers of it to also handle the
multi-column case correctly.  I guess that's a TODO you're going to
deal with at some point anyway. :)

I still have more than half of v3 left to look at, so will continue
looking.   In the meantime, please check the changes I suggested,
including the delta patch, and let me know your thoughts.

--
Amit Langote
EDB: http://www.enterprisedb.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Luzanov
Дата:
Сообщение: Re: psql: \dl+ to list large objects privileges
Следующее
От: Robert Haas
Дата:
Сообщение: Re: replay of CREATE TABLESPACE eats data at wal_level=minimal