Re: [HACKERS] [POC] hash partitioning

Поиск
Список
Период
Сортировка
От amul sul
Тема Re: [HACKERS] [POC] hash partitioning
Дата
Msg-id CAAJ_b97dGjD3ns2DhVMyLFN2EZaEmb22mq9LbEb_3RbeYvA7NQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] [POC] hash partitioning  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Ответы Re: [HACKERS] [POC] hash partitioning  (Jesper Pedersen <jesper.pedersen@redhat.com>)
Список pgsql-hackers
On Thu, Sep 28, 2017 at 11:24 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2017/09/27 22:41, Jesper Pedersen wrote:
>> On 09/27/2017 03:05 AM, amul sul wrote:
>>>>> Attached rebased patch, thanks.
>>>>>
>>>>>
>>>> While reading through the patch I thought it would be better to keep
>>>> MODULUS and REMAINDER in caps, if CREATE TABLE was in caps too in order to
>>>> highlight that these are "keywords" for hash partition.
>>>>
>>>> Also updated some of the documentation.
>>>>
>>>>
>>> Thanks a lot for the patch, included in the attached version.
>>>
>>
>> Thank you.
>>
>> Based on [1] I have moved the patch to "Ready for Committer".
>
> Thanks a lot Amul for working on this.  Like Jesper said, the patch looks
> pretty good overall.  I was looking at the latest version with intent to
> study certain things about hash partitioning the way patch implements it,
> during which I noticed some things.
>

Thanks Amit for looking at the patch.

> +      The modulus must be a positive integer, and the remainder must a
>
> must be a
>

Fixed in the attached version.

> +      suppose you have a hash-partitioned table with 8 children, each of
> which
> +      has modulus 8, but find it necessary to increase the number of
> partitions
> +      to 16.
>

Fixed in the attached version.

> Might it be a good idea to say 8 "partitions" instead of "children" in the
> first sentence?
>
> +      each modulus-8 partition until none remain.  While this may still
> involve
> +      a large amount of data movement at each step, it is still better than
> +      having to create a whole new table and move all the data at once.
> +     </para>
> +
>

Fixed in the attached version.

> I read the paragraph that ends with the above text and started wondering
> if the example to redistribute data in hash partitions by detaching and
> attaching with new modulus/remainder could be illustrated with an example?
> Maybe in the examples section of the ALTER TABLE page?
>

I think hint in the documentation is more than enough. There is N number of
ways of data redistribution, the document is not meant to explain all of those.

> +      Since hash operator class provide only equality, not ordering,
> collation
>
> Either "Since hash operator classes provide" or "Since hash operator class
> provides"
>

Fixed in the attached version.

> Other than the above points, patch looks good.
>
>
> By the way, I noticed a couple of things about hash partition constraints:
>
> 1. In get_qual_for_hash(), using
> get_fn_expr_rettype(&key->partsupfunc[i]), which returns InvalidOid for
> the lack of fn_expr being set to non-NULL value, causes funcrettype of the
> FuncExpr being generated for hashing partition key columns to be set to
> InvalidOid, which I think is wrong.  That is, the following if condition
> in get_fn_expr_rettype() is always satisfied:
>
>     if (!flinfo || !flinfo->fn_expr)
>         return InvalidOid;
>
> I think we could use get_func_rettype(&key->partsupfunc[i].fn_oid)
> instead.  Attached patch
> hash-v21-set-funcexpr-funcrettype-correctly.patch, which applies on top
> v21 of your patch.
>

Thanks for the patch, included in the attached version.

> 2. It seems that the reason constraint exclusion doesn't work with hash
> partitions as implemented by the patch is that predtest.c:
> operator_predicate_proof() returns false even without looking into the
> hash partition constraint, which is of the following form:
>
> satisfies_hash_partition(<mod>, <rem>, <key1-exthash>,..)
>
> beccause the above constraint expression doesn't translate into a a binary
> opclause (an OpExpr), which operator_predicate_proof() knows how to work
> with.  So, false is returned at the beginning of that function by the
> following code:
>
>     if (!is_opclause(predicate))
>         return false;
>
> For example,
>
> create table p (a int) partition by hash (a);
> create table p0 partition of p for values with (modulus 4, remainder 0);
> create table p1 partition of p for values with (modulus 4, remainder 1);
> \d+ p0
> <...>
> Partition constraint: satisfies_hash_partition(4, 0, hashint4extended(a,
> '8816678312871386367'::bigint))
>
> -- both p0 and p1 scanned
> explain select * from p where satisfies_hash_partition(4, 0,
> hashint4extended(a, '8816678312871386367'::bigint));
>                                              QUERY PLAN
>
> ----------------------------------------------------------------------------------------------------
>  Append  (cost=0.00..96.50 rows=1700 width=4)
>    ->  Seq Scan on p0  (cost=0.00..48.25 rows=850 width=4)
>          Filter: satisfies_hash_partition(4, 0, hashint4extended(a,
> '8816678312871386367'::bigint))
>    ->  Seq Scan on p1  (cost=0.00..48.25 rows=850 width=4)
>          Filter: satisfies_hash_partition(4, 0, hashint4extended(a,
> '8816678312871386367'::bigint))
> (5 rows)
>
> -- both p0 and p1 scanned
> explain select * from p where satisfies_hash_partition(4, 1,
> hashint4extended(a, '8816678312871386367'::bigint));
>                                              QUERY PLAN
>
> ----------------------------------------------------------------------------------------------------
>  Append  (cost=0.00..96.50 rows=1700 width=4)
>    ->  Seq Scan on p0  (cost=0.00..48.25 rows=850 width=4)
>          Filter: satisfies_hash_partition(4, 1, hashint4extended(a,
> '8816678312871386367'::bigint))
>    ->  Seq Scan on p1  (cost=0.00..48.25 rows=850 width=4)
>          Filter: satisfies_hash_partition(4, 1, hashint4extended(a,
> '8816678312871386367'::bigint))
> (5 rows)
>
>
> I looked into how satisfies_hash_partition() works and came up with an
> idea that I think will make constraint exclusion work.  What if we emitted
> the hash partition constraint in the following form instead:
>
> hash_partition_mod(hash_partition_hash(key1-exthash, key2-exthash),
>                    <mod>) = <rem>
>
> With that form, constraint exclusion seems to work as illustrated below:
>
> \d+ p0
> <...>
> Partition constraint:
> (hash_partition_modulus(hash_partition_hash(hashint4extended(a,
> '8816678312871386367'::bigint)), 4) = 0)
>
> -- note only p0 is scanned
> explain select * from p where
> hash_partition_modulus(hash_partition_hash(hashint4extended(a,
> '8816678312871386367'::bigint)), 4) = 0;
>                      QUERY PLAN
>
>
--------------------------------------------------------------------------------------------------------------------------
>  Append  (cost=0.00..61.00 rows=13 width=4)
>    ->  Seq Scan on p0  (cost=0.00..61.00 rows=13 width=4)
>          Filter:
> (hash_partition_modulus(hash_partition_hash(hashint4extended(a,
> '8816678312871386367'::bigint)), 4) = 0)
> (3 rows)
>
> -- note only p1 is scanned
> explain select * from p where
> hash_partition_modulus(hash_partition_hash(hashint4extended(a,
> '8816678312871386367'::bigint)), 4) = 1;
>                                                         QUERY PLAN
>
>
--------------------------------------------------------------------------------------------------------------------------
>  Append  (cost=0.00..61.00 rows=13 width=4)
>    ->  Seq Scan on p1  (cost=0.00..61.00 rows=13 width=4)
>          Filter:
> (hash_partition_modulus(hash_partition_hash(hashint4extended(a,
> '8816678312871386367'::bigint)), 4) = 1)
> (3 rows)
>
> I tried to implement that in the attached
> hash-v21-hash-part-constraint.patch, which applies on top v21 of your
> patch (actually on top of
> hash-v21-set-funcexpr-funcrettype-correctly.patch, which I think should be
> applied anyway as it fixes a bug of the original patch).
>
> What do you think?  Eventually, the new partition-pruning method [1] will
> make using constraint exclusion obsolete, but it might be a good idea to
> have it working if we can.
>

It does not really do the partition pruning via constraint exclusion and I don't
think anyone is going to use the remainder in the where condition to fetch
data and hash partitioning is not meant for that.

But I am sure that we could solve this problem using your and Beena's work
toward faster partition pruning[1] and Runtime Partition Pruning[2].

Will think on this changes if it is required for the pruning feature.

Regards,
Amul

1] https://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp
2] https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Bapat
Дата:
Сообщение: Re: [HACKERS] Partition-wise aggregation/grouping
Следующее
От: Amit Khandekar
Дата:
Сообщение: Re: [HACKERS] Parallel Append implementation