Re: [HACKERS] [POC] hash partitioning

Поиск
Список
Период
Сортировка
От Yugo Nagata
Тема Re: [HACKERS] [POC] hash partitioning
Дата
Msg-id 20170623134115.86f01d0c.nagata@sraoss.co.jp
обсуждение исходный текст
Ответ на Re: [HACKERS] [POC] hash partitioning  (amul sul <sulamul@gmail.com>)
Ответы Re: [HACKERS] [POC] hash partitioning  (Yugo Nagata <nagata@sraoss.co.jp>)
Re: [HACKERS] [POC] hash partitioning  (amul sul <sulamul@gmail.com>)
Список pgsql-hackers
On Tue, 6 Jun 2017 13:03:58 +0530
amul sul <sulamul@gmail.com> wrote:


> Updated patch attached.

I looked into the latest patch (v13) and have some comments
althogh they might be trivial.

First, I couldn't apply this patch to the latest HEAD due to
a documentation fix and pgintend updates. It needes rebase.

$ git apply /tmp/0002-hash-partitioning_another_design-v13.patch  
error: patch failed: doc/src/sgml/ref/create_table.sgml:87
error: doc/src/sgml/ref/create_table.sgml: patch does not apply
error: patch failed: src/backend/catalog/partition.c:76
error: src/backend/catalog/partition.c: patch does not apply
error: patch failed: src/backend/commands/tablecmds.c:13371
error: src/backend/commands/tablecmds.c: patch does not apply

      <varlistentry>
+       <term>Hash Partitioning</term>
+
+       <listitem>
+        <para>
+         The table is partitioned by specifying modulus and remainder for each
+         partition. Each partition holds rows for which the hash value of
+         partition keys when divided by specified modulus produces specified
+         remainder. For more clarification on modulus and remainder please refer
+         <xref linkend="sql-createtable-partition">.
+        </para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>       <term>Range Partitioning</term>

I think this section should be inserted after List Partitioning section because
the order of the descriptions is Range, List, then Hash in other places of
the documentation. At least, 


-    <firstterm>partition bounds</firstterm>.  Currently supported
-    partitioning methods include range and list, where each partition is
-    assigned a range of keys and a list of keys, respectively.
+    <firstterm>partition bounds</firstterm>.  The currently supported
+    partitioning methods are list, range, and hash.   </para>

Also in this hunk. I think "The currently supported partitioning methods are
range, list, and hash." is better. We don't need to change the order of
the original description.
      <listitem>       <para>
-        Declarative partitioning only supports list and range partitioning,
-        whereas table inheritance allows data to be divided in a manner of
-        the user's choosing.  (Note, however, that if constraint exclusion is
-        unable to prune partitions effectively, query performance will be very
-        poor.)
+        Declarative partitioning only supports hash, list and range
+        partitioning, whereas table inheritance allows data to be divided in a
+        manner of the user's choosing.  (Note, however, that if constraint
+        exclusion is unable to prune partitions effectively, query performance
+        will be very poor.)

Similarly, I think "Declarative partitioning only supports range, list and hash
partitioning," is better.


+
+  <para>
+   Create a hash partitioned table:
+<programlisting>
+CREATE TABLE orders (
+    order_id     bigint not null,
+    cust_id      bigint not null,
+    status       text
+) PARTITION BY HASH (order_id);
+</programlisting></para>
+

This paragraph should be inserted between "Create a list partitioned table:"
paragraph and "Ceate partition of a range partitioned table:" paragraph
as well as range and list.

        *strategy = PARTITION_STRATEGY_LIST;    else if (pg_strcasecmp(partspec->strategy, "range") == 0)
*strategy= PARTITION_STRATEGY_RANGE;
 
+    else if (pg_strcasecmp(partspec->strategy, "hash") == 0)
+        *strategy = PARTITION_STRATEGY_HASH;    else        ereport(ERROR,

In the most of codes, the order is hash, range, then list, but only
in transformPartitionSpec(), the order is list, range, then hash,
as above. Maybe it is better to be uniform.


+            {
+                if (strategy == PARTITION_STRATEGY_HASH)
+                    ereport(ERROR,
+                            (errcode(ERRCODE_UNDEFINED_OBJECT),
+                             errmsg("data type %s has no default hash operator class",
+                                    format_type_be(atttype)),
+                             errhint("You must specify a hash operator class or define a default hash operator class
forthe data type.")));
 
+                else
+                    ereport(ERROR,
+                            (errcode(ERRCODE_UNDEFINED_OBJECT),
+                             errmsg("data type %s has no default btree operator class",
+                                    format_type_be(atttype)),
+                             errhint("You must specify a btree operator class or define a default btree operator class
forthe data type.")));
 
+
+
                                               atttype,
-                                               "btree",
-                                               BTREE_AM_OID);
+                                               am_oid == HASH_AM_OID ? "hash" : "btree",
+                                               am_oid);

How about writing this part as following to reduce code redundancy?

+    Oid            am_oid;
+    char       *am_name;

<snip> 

+        if (strategy == PARTITION_STRATEGY_HASH)
+        {
+            am_oid = HASH_AM_OID;
+            am_name = pstrdup("hash");
+        }
+        else
+        {
+            am_oid = BTREE_AM_OID;
+            am_name = pstrdup("btree");
+        }
+        if (!pelem->opclass)        {
-            partopclass[attn] = GetDefaultOpClass(atttype, BTREE_AM_OID);
+            partopclass[attn] = GetDefaultOpClass(atttype, am_oid);            if (!OidIsValid(partopclass[attn]))
          ereport(ERROR,                        (errcode(ERRCODE_UNDEFINED_OBJECT),
 
-                   errmsg("data type %s has no default btree operator class",
-                          format_type_be(atttype)),
-                         errhint("You must specify a btree operator class or define a default btree operator class for
thedata type.")));
 
+                         errmsg("data type %s has no default %s operator class",
+                                format_type_be(atttype), am_name),
+                         errhint("You must specify a %s operator class or define a default %s operator class for the
datatype.",
 
+                                 am_name, am_name)));
+        }        else            partopclass[attn] = ResolveOpClass(pelem->opclass,
          atttype,
 
-                                               "btree",
-                                               BTREE_AM_OID);
+                                               am_name,
+                                               am_oid);


There is meaningless indentation change.

@@ -2021,7 +2370,8 @@ get_partition_for_tuple(PartitionDispatch *pd,                   /* bsearch in
partdesc->boundinfo*/                   cur_offset = partition_bound_bsearch(key,
                partdesc->boundinfo,
 
-                                                        values, false, &equal);
+                                                     values, false, &equal);
+                   /*                    * Offset returned is such that the bound at offset is


Fixing the comment of pg_get_partkeydef() is missing.
* pg_get_partkeydef** Returns the partition key specification, ie, the following:** PARTITION BY { RANGE | LIST }
(columnopt_collation opt_opclass [, ...])*/
 
Datum
pg_get_partkeydef(PG_FUNCTION_ARGS)
{

Regards,

> 
> Regards,
> Amul Sul


-- 
Yugo Nagata <nagata@sraoss.co.jp>



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Mahendranath Gurram
Дата:
Сообщение: Re: [HACKERS] Regarding Postgres Dynamic Shared Memory (DSA)
Следующее
От: amul sul
Дата:
Сообщение: Re: [HACKERS] Multi column range partition table