Re: BUG #16840: Rows not found in table partitioned by hash when not all partitions exists

Поиск
Список
Период
Сортировка
От Michał Albrycht
Тема Re: BUG #16840: Rows not found in table partitioned by hash when not all partitions exists
Дата
Msg-id CACsoHGCxAJje+s6mZuDmHHmfVk+iUzgJk0+LNWvMMz8vnZS0og@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #16840: Rows not found in table partitioned by hash when not all partitions exists  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
Thanks for quick bug confirmation and patch. I agree that the way code is presented here is unlikely to happen on production, but I was experimenting with a custom, dummy hash function which would guarantee that all rows with volume_id=1 would go to partition 1, volume_id=2 to partition 2 and so on. 

CREATE OR REPLACE FUNCTION partition_custom_bigint_hash(value INT8, seed INT8)
RETURNS INT8 AS $$
-- this number is UINT64CONST(0x49a0f4dd15e5a8e3) from https://github.com/postgres/postgres/blob/REL_13_STABLE/src/include/common/hashfn.h#L83
SELECT value - 5305509591434766563;
$$ LANGUAGE SQL IMMUTABLE;

CREATE OPERATOR CLASS partition_custom_bigint_hash_op
FOR TYPE int8
USING hash AS
OPERATOR 1 =,
FUNCTION 2 partition_custom_bigint_hash(int8, int8);

Then adding that operator class to table definition:
CREATE TABLE dir (
id SERIAL,
volume_id BIGINT,
path TEXT
) PARTITION BY HASH (volume_id partition_custom_bigint_hash_op);
Now I'm able to create a partition only when it's needed and I know which partition should be created for a given volume_id (partition_number = volume_id % number_of_partitions).
Michał Albrycht

śr., 27 sty 2021 o 23:54 Tom Lane <tgl@sss.pgh.pa.us> napisał(a):
I wrote:
> Hmm, seems to be a case of faulty partition exclusion, because the
> plan isn't scanning anything:

Here's a proposed patch for this.  The core of the problem is confusion
around the number of entries in the PartitionBoundInfoData.indexes array.
Each of the three types of partitioning has a different rule for that,
despite which we were expecting assorted code to know what to do, and
some places got it wrong for hash --- even hash-specific code :-(

I propose here to solve that by explicitly storing the number of entries
in PartitionBoundInfoData, and thereby removing the need for partition-
strategy-independent code to know anything about the rules.  I think
we can get away with that in the back branches by adding "nindexes"
at the end of the struct.  This could break extensions that are
manufacturing their own PartitionBoundInfoData structs, but it seems
unlikely that there are any.

Most of the patch just straightforwardly sets or uses the new field.
Notably, partition_bounds_equal() and partition_bounds_copy() get
significantly simpler and safer.  The actual bug fix is in
get_matching_hash_bounds() and perform_pruning_combine_step(), where
"all partitions" needs to be 0 .. nindexes-1 not 0 .. ndatums-1.
(The reason your example fails is that the OR clause should produce
"all partitions potentially match", but because of this bug, it's
producing a bitmask that doesn't include the partition we need.)

                        regards, tom lane

В списке pgsql-bugs по дате отправления:

Предыдущее
От: PG Bug reporting form
Дата:
Сообщение: BUG #16841: psql -- \d tablename , displays "Error : column c.relhasoids does not exit"
Следующее
От: Sergei Kornilov
Дата:
Сообщение: Re: BUG #16841: psql -- \d tablename , displays "Error : column c.relhasoids does not exit"