Re: Declarative partitioning

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: Declarative partitioning
Дата
Msg-id 5731BA97.9090108@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: Declarative partitioning  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Ответы Re: Declarative partitioning  (Sameer Thakur-2 <Sameer.Thakur@nttdata.com>)
Список pgsql-hackers
Hi Ashutosh,

On 2016/05/09 20:21, Ashutosh Bapat wrote:
> Hi Amit,
> I am trying multi-column/expression partitions.

Thanks for the tests.

> create table t1_multi_col (a int, b int) partition by range (a, b);
> create table t1_mc_p1 partition of t1_multi_col for values start (1, 200)
> end (100, 300);
> create table t1_mc_p2 partition of t1_multi_col for values start (200, 1)
> end (300, 100);
> insert into t1_multi_col values (1, 250);
> insert into t1_multi_col values (250, 1);
> insert into t1_multi_col values (100, 100);
> select tableoid::regclass, * from t1_multi_col;
>  tableoid |  a  |  b
> ----------+-----+-----
>  t1_mc_p1 |   1 | 250
>  t1_mc_p1 | 100 | 100
>  t1_mc_p2 | 250 |   1
> The row (100, 100) landed in t1_mc_p1 which has partition bounds as (1,
> 200) and (100, 300) which should not accept a row with b = 100. It looks
> like the binary search got confused with the reversed order of ranges
> (should that be allowed?)

It's useful to think of multi-column key as defining ranges of composite
values (tuples) instead of thinking in terms of ranges of values of
individual columns.  That's how a row's partition key is compared against
individual partition bounds until a suitable partition is found (with
binary search that is), which uses record comparison logic as shown below:

postgres=# select (1, 200) <= (100, 100) AND (100, 100) < (100, 300);?column?
----------t
(1 row)

Which means the row (100, 100) belongs in the partition with the start
bound (1, 200) and the end bound (100, 300).  Just like in composite value
case, comparison stops at some leading column that returns != 0 result.
So, in comparison (1, 200) <= (100, 100), the second column plays no role.

> Symantec of multiple columns for ranges (may be list as well) looks

Note that list partition key does not support multiple columns.

> confusing. The current scheme doesn't allow overlapping range for one of
> the partitioning keys even if the combined range is non-overlapping.
> create table t1_multi_col (a int, b int) partition by range (a, b);
> create table t1_mc_p1 partition of t1_multi_col for values start (1, 100)
> end (100, 200);
> create table t1_mc_p2 partition of t1_multi_col for values start (1, 200)
> end (100, 300);
> ERROR:  new partition's range overlaps with that of partition "t1_mc_p1" of
> "t1_multi_col"
> HINT:  Please specify a range that does not overlap with any existing
> partition's range.
> create table t1_mc_p2 partition of t1_multi_col for values start (1, 300)
> end (100, 400);
> ERROR:  new partition's range overlaps with that of partition "t1_mc_p1" of
> "t1_multi_col"
> HINT:  Please specify a range that does not overlap with any existing
> partition's range.

Ranges [ (1, 100), (100, 200) ) and [ (1, 200), (100, 300) ) do overlap:

postgres=# select (1, 100) <= (1, 200) AND (1, 200) < (100, 200);?column?
----------t
(1 row)

That is, (1, 200) is both the start element of the 2nd partition's range
and is contained in the first partition's range as illustrated above.

> That should be better realised using subpartitioning on b. The question is,
> if one column's value is enough to identify partition (since they can not
> contain overlapping values for that column), why do we need mutliple
> columns/expressions as partition keys? IIUC, all the other column does is
> to disallow certain range of values for that column, which can better be
> done by a CHECK constraint. It looks like Oracle looks at combined range
> and not just one column.

A more familiar example I have seen around the web illustrating
multi-column range partitioning is for something like (year, month, day)
triple.  Consider the following example:

create table parted(year int, month int, day int) partition by range (year, month, day);

create table part201605week1 partition of parted for values start (2016, 5, 1) end (2016, 5, 8);
create table part201605week2 partition of parted for values start (2016, 5, 8) end (2016, 5, 15);
create table part201605week3 partition of parted for values start (2016, 5, 15) end (2016, 5, 22);
create table part201605week4 partition of parted for values start (2016, 5, 22) end (2016, 5, 29);
create table part201605week5 partition of parted for values start (2016, 5, 29) end (2016, 5, 31) inclusive;

create table part201606week1 partition of parted for values start (2016, 6, 1) end (2016, 6, 8);
create table part201606week2 partition of parted for values start (2016, 6, 8) end (2016, 6, 15);
create table part201606week3 partition of parted for values start (2016, 6, 15) end (2016, 6, 22);
create table part201606week4 partition of parted for values start (2016, 6, 2) end (2016, 6, 29);
create table part201606week4 partition of parted for values start (2016, 6, 22) end (2016, 6, 29);
create table part201606week5 partition of parted for values start (2016, 6, 29) end (2016, 6, 30) inclusive;

explain (costs off) select * from parted where day between 4 and 10;                 QUERY PLAN
----------------------------------------------Append  ->  Seq Scan on parted        Filter: ((day >= 4) AND (day <=
10)) ->  Seq Scan on part201605week1        Filter: ((day >= 4) AND (day <= 10))  ->  Seq Scan on part201605week2
Filter: ((day >= 4) AND (day <= 10))  ->  Seq Scan on part201606week1        Filter: ((day >= 4) AND (day <= 10))  ->
SeqScan on part201606week2        Filter: ((day >= 4) AND (day <= 10))
 
(11 rows)

explain (costs off) select * from parted where day between 4 and 10 and
month = 5;                         QUERY PLAN
--------------------------------------------------------------Append  ->  Seq Scan on parted        Filter: ((day >= 4)
AND(day <= 10) AND (month = 5))  ->  Seq Scan on part201605week1        Filter: ((day >= 4) AND (day <= 10) AND (month
=5))  ->  Seq Scan on part201605week2        Filter: ((day >= 4) AND (day <= 10) AND (month = 5))
 
(7 rows)

explain (costs off) select * from parted where month = 6;           QUERY PLAN
-----------------------------------Append  ->  Seq Scan on parted        Filter: (month = 6)  ->  Seq Scan on
part201606week1       Filter: (month = 6)  ->  Seq Scan on part201606week2        Filter: (month = 6)  ->  Seq Scan on
part201606week3       Filter: (month = 6)  ->  Seq Scan on part201606week4        Filter: (month = 6)  ->  Seq Scan on
part201606week5       Filter: (month = 6)
 
(13 rows)

Thanks,
Amit





В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)
Следующее
От: Vladimir Gordiychuk
Дата:
Сообщение: Re: Stopping logical replication protocol