Re: [HACKERS] Multi column range partition table

Поиск
Список
Период
Сортировка
От Dean Rasheed
Тема Re: [HACKERS] Multi column range partition table
Дата
Msg-id CAEZATCXy_Xfd4U70iQ8L33UY9dw=AiwEnYEmAfOFq_542TREnA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Multi column range partition table  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Ответы Re: [HACKERS] Multi column range partition table  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Список pgsql-hackers
On 23 June 2017 at 08:01, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> The way we have designed our syntax, we don't have a way to tell that
> p3 comes after p2 and they have no gap between those. But I don't
> think that's your question. What you are struggling with is a way to
> specify a lower bound (10, +infinity) so that anything with i1 > 10
> would go to partition 3.
>

I think actually there is a fundamental problem here, which arises
because UNBOUNDED has 2 different meanings depending on context, and
thus it is not possible in general to specify the start of one range
to be equal to the end of the previous range, as is necessary to get
contiguous non-overlapping ranges.

Note that this isn't just a problem for floating point datatypes
either, it also applies to other types such as strings. For example,
given a partition over (text, int) types defined with the following
values:
 FROM ('a', UNBOUNDED) TO ('b', UNBOUNDED)

which is equivalent to
 FROM ('a', -INFINITY) TO ('b', +INFINITY)

where should the next range start?

Even if you were to find a way to specify "the next string after 'b'",
it wouldn't exactly be pretty. The problem is that the above partition
corresponds to "all the strings starting with 'a', plus the string
'b', which is pretty ugly. A neater way to define the pair of ranges
in this case would be:
 FROM ('a', -INFINITY) TO ('b', -INFINITY) FROM ('b', -INFINITY) TO ('c', -INFINITY)

since then all strings starting with 'a' would fall into the first
partition and all the strings starting with 'b' would fall into the
second one.

Currently, when there are 2 partition columns, the partition
constraint is defined as
 (a is not null) and (b is not null) and (a > al or (a = al and b >= bl)) and (a < au or (a = au and b < bu))

if the upper bound bu were allowed to be -INFINITY (something that
should probably be forbidden unless the previous column's upper bound
were finite), then this would simplify to
 (a is not null) and (b is not null) and (a > al or (a = al and b >= bl)) and (a < au)

and in the example above, where al is -INFINITY, it would further simplify to
 (a is not null) and (b is not null) and (a >= al) and (a < au)

There would also be a similar simplification possible if the lower
bound of a partition column were allowed to be +INFINITY.

So, I think that having UNBOUNDED represent both -INFINITY and
+INFINITY depending on context is a design flaw, and that we need to
allow both -INFINITY and +INFINITY as upper and lower bounds (provided
they are preceded by a column with a finite bound). I think that, in
general, that's the only way to allow contiguous non-overlapping
partitions to be defined on multiple columns.

Regards,
Dean



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: [HACKERS] Setting pd_lower in GIN metapage
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: [HACKERS] Broken O(n^2) avoidance in wal segment recycling.