Re: Multi-Column List Partitioning

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: Multi-Column List Partitioning
Дата
Msg-id CA+HiwqE69D6C4kUZ_nmmYXcGvU7EDtrotLD716aed83-KVgwew@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Multi-Column List Partitioning  (Nitin Jadhav <nitinjadhavpostgres@gmail.com>)
Ответы Re: Multi-Column List Partitioning  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-hackers
Hi Nitin,

On Tue, Aug 31, 2021 at 8:02 PM Nitin Jadhav
<nitinjadhavpostgres@gmail.com> wrote:
> The attached patch also fixes the above comments.

I noticed that multi-column list partitions containing NULLs don't
work correctly with partition pruning yet.

create table p0 (a int, b text, c bool) partition by list (a, b, c);
create table p01 partition of p0 for values in ((1, 1, true), (NULL, 1, false));
create table p02 partition of p0 for values in ((1, NULL, false));
explain select * from p0 where a is null;
                       QUERY PLAN
--------------------------------------------------------
 Seq Scan on p01 p0  (cost=0.00..22.50 rows=6 width=37)
   Filter: (a IS NULL)
(2 rows)

I guess that may be due to the following newly added code being incomplete:

+/*
+ * get_partition_bound_null_index
+ *
+ * Returns the partition index of the partition bound which accepts NULL.
+ */
+int
+get_partition_bound_null_index(PartitionBoundInfo boundinfo)
+{
+   int i = 0;
+   int j = 0;
+
+   if (!boundinfo->isnulls)
+       return -1;

-           if (!val->constisnull)
-               count++;
+   for (i = 0; i < boundinfo->ndatums; i++)
+   {
+       //TODO: Handle for multi-column cases
+       for (j = 0; j < 1; j++)
+       {
+           if (boundinfo->isnulls[i][j])
+               return boundinfo->indexes[i];
        }
    }

+   return -1;
+}

Maybe this function needs to return a "bitmapset" of indexes, because
multiple partitions can now contain NULL values.

Some other issues I noticed and suggestions for improvement:

+/*
+ * checkForDuplicates
+ *
+ * Returns TRUE if the list bound element is already present in the list of
+ * list bounds, FALSE otherwise.
+ */
+static bool
+checkForDuplicates(List *source, List *searchElem)

This function name may be too generic.  Given that it is specific to
implementing list bound de-duplication, maybe the following signature
is more appropriate:

static bool
checkListBoundDuplicated(List *list_bounds, List *new_bound)

Also, better if the function comment mentions those parameter names, like:

"Returns TRUE if the list bound element 'new_bound' is already present
in the target list 'list_bounds', FALSE otherwise."

+/*
+ * transformPartitionListBounds
+ *
+ * Converts the expressions of list partition bounds from the raw grammar
+ * representation.

A sentence about the result format would be helpful, like:

The result is a List of Lists of Const nodes to account for the
partition key possibly containing more than one column.

+   int             i = 0;
+   int             j = 0;

Better to initialize such loop counters closer to the loop.

+           colname[i] = (char *) palloc0(NAMEDATALEN * sizeof(char));
+           colname[i] = get_attname(RelationGetRelid(parent),
+                                    key->partattrs[i], false);

The palloc in the 1st statement is wasteful, because the 2nd statement
overwrites its pointer by the pointer to the string palloc'd by
get_attname().

+           ListCell   *cell2 = NULL;

No need to explicitly initialize the loop variable.

+           RowExpr     *rowexpr = NULL;
+
+           if (!IsA(expr, RowExpr))
+               ereport(ERROR,
+                       (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+                       errmsg("Invalid list bound specification"),
+                       parser_errposition(pstate, exprLocation((Node
*) spec))));
+
+           rowexpr = (RowExpr *) expr;

It's okay to assign rowexpr at the top here instead of the dummy
NULL-initialization and write the condition as:

    if (!IsA(rowexpr, RowExpr))

+       if (isDuplicate)
+           continue;
+
+       result = lappend(result, values);

I can see you copied this style from the existing code, but how about
writing this simply as:

    if (!isDuplicate)
        result = lappend(result, values);

-/* One value coming from some (index'th) list partition */
+/* One bound of a list partition */
 typedef struct PartitionListValue
 {
    int         index;
-   Datum       value;
+   Datum      *values;
+   bool       *isnulls;
 } PartitionListValue;

Given that this is a locally-defined struct, I wonder if it makes
sense to rename the struct while we're at it.  Call it, say,
PartitionListBound?

Also, please keep part of the existing comment that says that the
bound belongs to index'th partition.

Will send more comments in a bit...

-- 
Amit Langote
EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: Allow escape in application_name (was: [postgres_fdw] add local pid to fallback_application_name)
Следующее
От: Fujii Masao
Дата:
Сообщение: Re: Possible missing segments in archiving on standby