Обсуждение: Partitioning by month causing an error?

Поиск
Список
Период
Сортировка

Partitioning by month causing an error?

От
Wells Oliver
Дата:
I have a partition per each month, like this:

CREATE TABLE mmd_yr2019_3 PARTITION OF mmd FOR VALUES FROM ('2019-03-01') TO ('2019-03-31');
CREATE TABLE mmd_yr2019_4 PARTITION OF mmd FOR VALUES FROM ('2019-04-01') TO ('2019-04-30');

This breaks for dates at the end of the month. What's the better way to write the bounds? 3/1 to 4/1 and then 4/1 to 5/1? Does that cause some overlap issue?

Thanks.

--

Re: Partitioning by month causing an error?

От
Ron
Дата:
On 12/20/19 6:49 PM, Wells Oliver wrote:
> I have a partition per each month, like this:
>
> CREATE TABLE mmd_yr2019_3 PARTITION OF mmd FOR VALUES FROM ('2019-03-01') 
> TO ('2019-03-31');
> CREATE TABLE mmd_yr2019_4 PARTITION OF mmd FOR VALUES FROM ('2019-04-01') 
> TO ('2019-04-30');
>
> This breaks for dates at the end of the month. What's the better way to 
> write the bounds? 3/1 to 4/1 and then 4/1 to 5/1? Does that cause some 
> overlap issue?

What is the mmd data type?  I ask that because '2019-03-31' is really 
'2019-03-31 00:00:00.000', so there's nowhere for '2019-03-31 00:00:00.001' 
through '2019-03-31 23:59:59.999' to go.  (And don't forget timezones.)


-- 
Angular momentum makes the world go 'round.



Re: Partitioning by month causing an error?

От
Wells Oliver
Дата:
mmd has a PARTITION BY RANGE on a date column.

On Fri, Dec 20, 2019 at 6:24 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 12/20/19 6:49 PM, Wells Oliver wrote:
> I have a partition per each month, like this:
>
> CREATE TABLE mmd_yr2019_3 PARTITION OF mmd FOR VALUES FROM ('2019-03-01')
> TO ('2019-03-31');
> CREATE TABLE mmd_yr2019_4 PARTITION OF mmd FOR VALUES FROM ('2019-04-01')
> TO ('2019-04-30');
>
> This breaks for dates at the end of the month. What's the better way to
> write the bounds? 3/1 to 4/1 and then 4/1 to 5/1? Does that cause some
> overlap issue?

What is the mmd data type?  I ask that because '2019-03-31' is really
'2019-03-31 00:00:00.000', so there's nowhere for '2019-03-31 00:00:00.001'
through '2019-03-31 23:59:59.999' to go.  (And don't forget timezones.)


--
Angular momentum makes the world go 'round.




--

Re: Partitioning by month causing an error?

От
Ron
Дата:
What's the:

- Pg version number
- exact error message
- definition of mmd
- example statement that fails

On 12/20/19 8:32 PM, Wells Oliver wrote:
mmd has a PARTITION BY RANGE on a date column.

On Fri, Dec 20, 2019 at 6:24 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 12/20/19 6:49 PM, Wells Oliver wrote:
> I have a partition per each month, like this:
>
> CREATE TABLE mmd_yr2019_3 PARTITION OF mmd FOR VALUES FROM ('2019-03-01')
> TO ('2019-03-31');
> CREATE TABLE mmd_yr2019_4 PARTITION OF mmd FOR VALUES FROM ('2019-04-01')
> TO ('2019-04-30');
>
> This breaks for dates at the end of the month. What's the better way to
> write the bounds? 3/1 to 4/1 and then 4/1 to 5/1? Does that cause some
> overlap issue?

What is the mmd data type?  I ask that because '2019-03-31' is really
'2019-03-31 00:00:00.000', so there's nowhere for '2019-03-31 00:00:00.001'
through '2019-03-31 23:59:59.999' to go.  (And don't forget timezones.)


--
Angular momentum makes the world go 'round.

Re: Partitioning by month causing an error?

От
Tom Lane
Дата:
Wells Oliver <wells.oliver@gmail.com> writes:
> I have a partition per each month, like this:
> CREATE TABLE mmd_yr2019_3 PARTITION OF mmd FOR VALUES FROM ('2019-03-01')
> TO ('2019-03-31');
> CREATE TABLE mmd_yr2019_4 PARTITION OF mmd FOR VALUES FROM ('2019-04-01')
> TO ('2019-04-30');

> This breaks for dates at the end of the month. What's the better way to
> write the bounds? 3/1 to 4/1 and then 4/1 to 5/1? Does that cause some
> overlap issue?

The rule is that a range partition from A to B covers values A <= X < B.
So you should be using first-of-the-month dates for all of these
values.  As you have it, there's gaps in the allowable values,
totally aside from the PITA factor of having to identify the last
day of each month accurately.

            regards, tom lane



Re: Partitioning by month causing an error?

От
Ron
Дата:
On 12/20/19 8:59 PM, Tom Lane wrote:
Wells Oliver <wells.oliver@gmail.com> writes:
I have a partition per each month, like this:
CREATE TABLE mmd_yr2019_3 PARTITION OF mmd FOR VALUES FROM ('2019-03-01')
TO ('2019-03-31');
CREATE TABLE mmd_yr2019_4 PARTITION OF mmd FOR VALUES FROM ('2019-04-01')
TO ('2019-04-30');
This breaks for dates at the end of the month. What's the better way to
write the bounds? 3/1 to 4/1 and then 4/1 to 5/1? Does that cause some
overlap issue?
The rule is that a range partition from A to B covers values A <= X < B.
So you should be using first-of-the-month dates for all of these
values.  As you have it, there's gaps in the allowable values,
totally aside from the PITA factor of having to identify the last
day of each month accurately.

If I were to ask to count from 1 to 100, you'd count from 1 to 100, not 1 to 99.  Who decided to arbitrarily break a grammatical rule we (native English speakers, and I bet everyone else, too) all learn as children, causing all sorts of needless confusion and breakage?

--
Angular momentum makes the world go 'round.

Re: Partitioning by month causing an error?

От
Wells Oliver
Дата:
This is really interesting to me: why? Running select '2019-04-30'::date between '2019-04-01'::date and '2019-04-30'::date; yields true, and generally the lower and upper bounds are inclusive in any kind of between check for languages I am familiar with.

Appreciate the answer, just curious. Thanks as always Tom.

On Fri, Dec 20, 2019 at 6:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Wells Oliver <wells.oliver@gmail.com> writes:
> I have a partition per each month, like this:
> CREATE TABLE mmd_yr2019_3 PARTITION OF mmd FOR VALUES FROM ('2019-03-01')
> TO ('2019-03-31');
> CREATE TABLE mmd_yr2019_4 PARTITION OF mmd FOR VALUES FROM ('2019-04-01')
> TO ('2019-04-30');

> This breaks for dates at the end of the month. What's the better way to
> write the bounds? 3/1 to 4/1 and then 4/1 to 5/1? Does that cause some
> overlap issue?

The rule is that a range partition from A to B covers values A <= X < B.
So you should be using first-of-the-month dates for all of these
values.  As you have it, there's gaps in the allowable values,
totally aside from the PITA factor of having to identify the last
day of each month accurately.

                        regards, tom lane


--

Re: Partitioning by month causing an error?

От
Tom Lane
Дата:
Ron <ronljohnsonjr@gmail.com> writes:
> On 12/20/19 8:59 PM, Tom Lane wrote:
>> The rule is that a range partition from A to B covers values A <= X < B.

> If I were to ask to count *from* 1 *to* 100, you'd count from 1 to 100, 
> *not* 1 to 99.  Who decided to arbitrarily break a grammatical rule we 
> (native English speakers, and I bet everyone else, too) all learn as 
> children, causing all sorts of needless confusion and breakage?

This rule breaks fewer things than a rule of A <= X <= B would.  In
particular, for more-or-less-continuous values like float8s or timestamps,
you really really don't want to have to write down
the-last-possible-value-before-the-start-of-the-next-partition.

            regards, tom lane



Re: Partitioning by month causing an error?

От
Tom Lane
Дата:
Wells Oliver <wells.oliver@gmail.com> writes:
> This is really interesting to me: why? Running select '2019-04-30'::date
> between '2019-04-01'::date and '2019-04-30'::date; yields true, and
> generally the lower and upper bounds are inclusive in any kind of between
> check for languages I am familiar with.

It's possible to do it like that for discrete types like dates.  It's
not possible for continuous types.  Exercise: try to write a partitioning
rule for a "numeric" column that classifies every value into one and only
one partition, if the partitioning behavior is like A <= X <= B.

(No fair restricting the precision of the numeric.)

            regards, tom lane



Re: Partitioning by month causing an error?

От
Ron
Дата:
On 12/20/19 10:10 PM, Tom Lane wrote:
Ron <ronljohnsonjr@gmail.com> writes:
On 12/20/19 8:59 PM, Tom Lane wrote:
The rule is that a range partition from A to B covers values A <= X < B.
If I were to ask to count *from* 1 *to* 100, you'd count from 1 to 100, 
*not* 1 to 99.  Who decided to arbitrarily break a grammatical rule we 
(native English speakers, and I bet everyone else, too) all learn as 
children, causing all sorts of needless confusion and breakage?
This rule breaks fewer things than a rule of A <= X <= B would.  In
particular, for more-or-less-continuous values like float8s or timestamps,
you really really don't want to have to write down
the-last-possible-value-before-the-start-of-the-next-partition.

Why not?  For 20+ years (not on Postgres) I've been specifying partition limits where the upper limits are the edge of timestamps (octaword integers counting the number of 100ns ticks since November 17, 1858), and have never had a problem:
IN ETC_TRAN_DETAIL_201909_UA WITH LIMIT OF '2019-09-30 23:59:59.99',
IN ETC_TRAN_DETAIL_201910_UA WITH LIMIT OF '2019-10-31 23:59:59.99',
IN ETC_TRAN_DETAIL_201911_UA WITH LIMIT OF '2019-11-30 23:59:59.99',
IN ETC_TRAN_DETAIL_201912_UA WITH LIMIT OF '2019-12-31 23:59:59.99',
IN ETC_TRAN_DETAIL_202001_UA WITH LIMIT OF '2020-01-31 23:59:59.99',
IN ETC_TRAN_DETAIL_202002_UA WITH LIMIT OF '2020-02-28 23:59:59.99',
etc, etc, etc

Of course, I've also never partitioned a table on a float, thinking that's utterly daft.


--
Angular momentum makes the world go 'round.

Re: Partitioning by month causing an error?

От
Tom Lane
Дата:
Ron <ronljohnsonjr@gmail.com> writes:
> On 12/20/19 10:10 PM, Tom Lane wrote:
>> This rule breaks fewer things than a rule of A <= X <= B would.

> Why not?  For 20+ years (not on Postgres) I've been specifying partition 
> limits where the upper limits are the edge of timestamps (octaword integers 
> counting the number of 100ns ticks since November 17, 1858), and have 
> *never* had a problem:
> IN ETC_TRAN_DETAIL_201909_UA WITH LIMIT OF '2019-09-30 23:59:59.99',
> IN ETC_TRAN_DETAIL_201910_UA WITH LIMIT OF '2019-10-31 23:59:59.99',

And what happens with '2019-10-31 23:59:59.9999'?  I don't know about
your other database, but in Postgres that value is legal and it's larger
than '2019-10-31 23:59:59.99', but still less than '2019-11-01 00:00:00'.

The main point here is that even if that works, it requires a whole
lot more assumptions about the behavior of the datatype (specifically,
its precision limit) than the other way.  As soon as you look at cases
where the precision limit isn't clearly defined, it just breaks.

> Of course, I've also never partitioned a table on a float, thinking that's 
> utterly daft.

[ shrug... ]  Works fine as long as you have a non-daft partitioning rule.

            regards, tom lane



Re: Partitioning by month causing an error?

От
Ron
Дата:
On 12/20/19 10:56 PM, Tom Lane wrote:
Ron <ronljohnsonjr@gmail.com> writes:
On 12/20/19 10:10 PM, Tom Lane wrote:
This rule breaks fewer things than a rule of A <= X <= B would.
Why not?  For 20+ years (not on Postgres) I've been specifying partition 
limits where the upper limits are the edge of timestamps (octaword integers 
counting the number of 100ns ticks since November 17, 1858), and have 
*never* had a problem:
IN ETC_TRAN_DETAIL_201909_UA WITH LIMIT OF '2019-09-30 23:59:59.99',
IN ETC_TRAN_DETAIL_201910_UA WITH LIMIT OF '2019-10-31 23:59:59.99',
And what happens with '2019-10-31 23:59:59.9999'? 

The datatype only extends to centiseconds.  Naturally, if it extended to tenths of milliseconds, then I'd have added another two nines to the range limits.

 I don't know about
your other database, but in Postgres that value is legal and it's larger
than '2019-10-31 23:59:59.99', but still less than '2019-11-01 00:00:00'.

The main point here is that even if that works, it requires a whole
lot more assumptions about the behavior of the datatype (specifically,
its precision limit) than the other way.  As soon as you look at cases
where the precision limit isn't clearly defined, it just breaks.

It works perfectly on discreet datatypes.  Honestly... experience has demonstrated it.

Not so much on floats, but I think that's daft anyway.

Of course, I've also never partitioned a table on a float, thinking that's 
utterly daft.
[ shrug... ]  Works fine as long as you have a non-daft partitioning rule.
		regards, tom lane

--
Angular momentum makes the world go 'round.

Re: Partitioning by month causing an error?

От
Gaetano Mendola
Дата:
On Sat, Dec 21, 2019 at 1:49 AM Wells Oliver <wells.oliver@gmail.com> wrote:
>
> I have a partition per each month, like this:
>
> CREATE TABLE mmd_yr2019_3 PARTITION OF mmd FOR VALUES FROM ('2019-03-01') TO ('2019-03-31');
> CREATE TABLE mmd_yr2019_4 PARTITION OF mmd FOR VALUES FROM ('2019-04-01') TO ('2019-04-30');
>
> This breaks for dates at the end of the month. What's the better way to write the bounds? 3/1 to 4/1 and then 4/1 to
5/1?Does that cause some overlap issue?
 

This should have been:

CREATE TABLE mmd_yr2019_3 PARTITION OF mmd FOR VALUES FROM
('2019-03-01') TO ('2019-04-01');
CREATE TABLE mmd_yr2019_4 PARTITION OF mmd FOR VALUES FROM
('2019-04-01') TO ('2019-04-30');


-- 
cpp-today.blogspot.com