Обсуждение: Question related to partitioning with pg_partman

Поиск
Список
Период
Сортировка

Question related to partitioning with pg_partman

От
sud
Дата:

Starting a new thread...

Something interesting and not sure if its expected behaviour as below. We are also confused a bit here.

In the below example we created two partitioned tables on timestamptz type columns with different time zones and the child partitions are created appropriately with boundaries as one mid night to next mid night of a day and so on. But when we change the time zone and query the  data dictionary views again, it shows the start and end of the partition boundary as not midnights but different times of the day's values. 

So I was wondering if this can cause us any unforeseen issues in the long run while creating the partitions though partman and persisting the data into the tables from the end users then querying those and having queries properly partitioned pruned? 
or 
should we always set the local timezone as UTC always before running or calling the pg_partman/pg_cron process which creates the partitions? Mainly in a database which serves global users sitting across multiple timezones. And same thing while inserting data into the table, we should use UTC timezone conversion function. Can you please confirm. 

And while checking the timezone using the "show timezone" function it shows the local timezone, so is there any way to see postgres DB the server timezone?

*******Example********

SET SESSION TIME ZONE 'UTC';
CREATE TABLE test_timestamp (
ts TIMESTAMP,
tstz TIMESTAMPTZ) PARTITION BY RANGE (tstz);

SELECT partman.create_parent(
   p_parent_table := 'public.test_timestamp',
   p_control := 'tstz',
   p_type := 'native',
   p_interval := '1 day',
   p_premake := 4,
   p_start_partition => '2024-03-07 00:00:00'
);

UPDATE partman.part_config SET infinite_time_partitions = 'true' WHERE parent_table = 'public.test_timestamp';

with recursive inh as (
   select i.inhrelid, null::text as parent
   from pg_catalog.pg_inherits i
     join pg_catalog.pg_class cl on i.inhparent = cl.oid
     join pg_catalog.pg_namespace nsp on cl.relnamespace = nsp.oid
   where nsp.nspname = 'public'          
     and cl.relname = 'test_timestamp2'  
   union all
   select i.inhrelid, (i.inhparent::regclass)::text
   from inh
   join pg_catalog.pg_inherits i on (inh.inhrelid = i.inhparent)
)
select c.relname as partition_name,
        pg_get_expr(c.relpartbound, c.oid, true) as partition_expression
from inh
   join pg_catalog.pg_class c on inh.inhrelid = c.oid
   join pg_catalog.pg_namespace n on c.relnamespace = n.oid
   left join pg_partitioned_table p on p.partrelid = c.oid
order by n.nspname, c.relname;

test_timestamp_default DEFAULT
test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00+00') TO ('2024-03-08 00:00:00+00')
test_timestamp_p2024_03_08 FOR VALUES FROM ('2024-03-08 00:00:00+00') TO ('2024-03-09 00:00:00+00')
test_timestamp_p2024_03_09 FOR VALUES FROM ('2024-03-09 00:00:00+00') TO ('2024-03-10 00:00:00+00')
test_timestamp_p2024_03_10 FOR VALUES FROM ('2024-03-10 00:00:00+00') TO ('2024-03-11 00:00:00+00')
test_timestamp_p2024_03_11 FOR VALUES FROM ('2024-03-11 00:00:00+00') TO ('2024-03-12 00:00:00+00')

SET SESSION TIME ZONE 'EST';

test_timestamp_default DEFAULT
test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-06 19:00:00-05') TO ('2024-03-07 19:00:00-05')
test_timestamp_p2024_03_08 FOR VALUES FROM ('2024-03-07 19:00:00-05') TO ('2024-03-08 19:00:00-05')
test_timestamp_p2024_03_09 FOR VALUES FROM ('2024-03-08 19:00:00-05') TO ('2024-03-09 19:00:00-05')
test_timestamp_p2024_03_10 FOR VALUES FROM ('2024-03-09 19:00:00-05') TO ('2024-03-10 19:00:00-05')
test_timestamp_p2024_03_11 FOR VALUES FROM ('2024-03-10 19:00:00-05') TO ('2024-03-11 19:00:00-05')

***********************

SET SESSION TIME ZONE 'EST';

CREATE TABLE test_timestamp2 (
ts TIMESTAMP,
tstz TIMESTAMPTZ) PARTITION BY RANGE (tstz);

SELECT partman.create_parent(
   p_parent_table := 'public.test_timestamp2',
   p_control := 'tstz',
   p_type := 'native',
   p_interval := '1 day',
   p_premake := 4,
   p_start_partition => '2024-03-07 00:00:00'
);

UPDATE partman.part_config SET infinite_time_partitions = 'true' WHERE parent_table = 'public.test_timestamp2';

with recursive inh as (
   select i.inhrelid, null::text as parent
   from pg_catalog.pg_inherits i
     join pg_catalog.pg_class cl on i.inhparent = cl.oid
     join pg_catalog.pg_namespace nsp on cl.relnamespace = nsp.oid
   where nsp.nspname = 'public'          
     and cl.relname = 'test_timestamp2'  
   union all
   select i.inhrelid, (i.inhparent::regclass)::text
   from inh
   join pg_catalog.pg_inherits i on (inh.inhrelid = i.inhparent)
)
select c.relname as partition_name,
        pg_get_expr(c.relpartbound, c.oid, true) as partition_expression
from inh
   join pg_catalog.pg_class c on inh.inhrelid = c.oid
   join pg_catalog.pg_namespace n on c.relnamespace = n.oid
   left join pg_partitioned_table p on p.partrelid = c.oid
order by n.nspname, c.relname;

test_timestamp2_default DEFAULT
test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00-05') TO ('2024-03-08 00:00:00-05')
test_timestamp2_p2024_03_08 FOR VALUES FROM ('2024-03-08 00:00:00-05') TO ('2024-03-09 00:00:00-05')
test_timestamp2_p2024_03_09 FOR VALUES FROM ('2024-03-09 00:00:00-05') TO ('2024-03-10 00:00:00-05')
test_timestamp2_p2024_03_10 FOR VALUES FROM ('2024-03-10 00:00:00-05') TO ('2024-03-11 00:00:00-05')
test_timestamp2_p2024_03_11 FOR VALUES FROM ('2024-03-11 00:00:00-05') TO ('2024-03-12 00:00:00-05')


SET SESSION TIME ZONE 'UTC';

test_timestamp2_default DEFAULT
test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 05:00:00+00') TO ('2024-03-08 05:00:00+00')
test_timestamp2_p2024_03_08 FOR VALUES FROM ('2024-03-08 05:00:00+00') TO ('2024-03-09 05:00:00+00')
test_timestamp2_p2024_03_09 FOR VALUES FROM ('2024-03-09 05:00:00+00') TO ('2024-03-10 05:00:00+00')
test_timestamp2_p2024_03_10 FOR VALUES FROM ('2024-03-10 05:00:00+00') TO ('2024-03-11 05:00:00+00')
test_timestamp2_p2024_03_11 FOR VALUES FROM ('2024-03-11 05:00:00+00') TO ('2024-03-12 05:00:00+00')

Re: Question related to partitioning with pg_partman

От
sud
Дата:
Can somebody help me to understand the behaviour?

Re: Question related to partitioning with pg_partman

От
Adrian Klaver
Дата:
On 3/8/24 00:23, sud wrote:
> 
> Starting a new thread...
> 
> Something interesting and not sure if its expected behaviour as below. 
> We are also confused a bit here.
> 
> In the below example we created two partitioned tables on timestamptz 
> type columns with different time zones and the child partitions are 
> created appropriately with boundaries as one mid night to next mid night 
> of a day and so on. But when we change the time zone and query the  data 
> dictionary views again, it shows the start and end of the partition 
> boundary as not midnights but different times of the day's values.
> 
> So I was wondering if this can cause us any unforeseen issues in the 
> long run while creating the partitions though partman and persisting the 
> data into the tables from the end users then querying those and having 
> queries properly partitioned pruned?
> or
> should we always set the local timezone as UTC always before running or 
> calling the pg_partman/pg_cron process which creates the partitions? 
> Mainly in a database which serves global users sitting across multiple 
> timezones. And same thing while inserting data into the table, we should 
> use UTC timezone conversion function. Can you please confirm.

'2024-03-07 00:00:00+00' and '2024-03-06 19:00:00-05' are the same time 
as is '2024-03-07 00:00:00-05' and '2024-03-07 05:00:00+00'.

Still I would think for sanity sake you would want to stick with UTC.

> 
> And while checking the timezone using the "show timezone" function it 
> shows the local timezone, so is there any way to see postgres DB the 
> server timezone?

show timezone is the currently set server timezone.

select reset_val from pg_settings where name = 'TimeZone';

would show you what the value would be reset to, e.g it's 'default 
value. For more information do:

select * from pg_settings where name = 'TimeZone';

to see where the 'default' is set.


-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Question related to partitioning with pg_partman

От
sud
Дата:

On Sat, Mar 9, 2024 at 3:41 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 3/8/24 00:23, sud wrote:
>
> Starting a new thread...
>
> Something interesting and not sure if its expected behaviour as below.
> We are also confused a bit here.
>
> In the below example we created two partitioned tables on timestamptz
> type columns with different time zones and the child partitions are
> created appropriately with boundaries as one mid night to next mid night
> of a day and so on. But when we change the time zone and query the  data
> dictionary views again, it shows the start and end of the partition
> boundary as not midnights but different times of the day's values.
>
> So I was wondering if this can cause us any unforeseen issues in the
> long run while creating the partitions though partman and persisting the
> data into the tables from the end users then querying those and having
> queries properly partitioned pruned?
> or
> should we always set the local timezone as UTC always before running or
> calling the pg_partman/pg_cron process which creates the partitions?
> Mainly in a database which serves global users sitting across multiple
> timezones. And same thing while inserting data into the table, we should
> use UTC timezone conversion function. Can you please confirm.

'2024-03-07 00:00:00+00' and '2024-03-06 19:00:00-05' are the same time
as is '2024-03-07 00:00:00-05' and '2024-03-07 05:00:00+00'.

Still I would think for sanity sake you would want to stick with UTC.



Thank you so much Adrian.

In my example in the first post, I see, if someone connected to a RDS Postgres database and run the create partition command using pg_partman by setting the timezone as "UTC", the 7th march partition looks to be spanned from "7th march midnight" to "8th march midnight", when queried the partition_experession from the data dictionary view. Which is correct.

And same information if someone querying by setting the timezone as EST is showing spanning from "6th march 7PM" to "7th March 7PM". And this can cause sometimes the partition may shift to other days all together. Similar differences happen if creating the partitions using EST timezone initially and then querying the data dictionary from UTC timezone.

So my question was, if in these types of scenarios, we should follow a standard approach of setting the timezone as UTC in such a type of global user use case, while the system can persist data from multiple users sitting across different time zones? So that the boundary(start and end time) of each of the range partitions will be set as consistent in one timezone across all the partitioned tables?

And even while inserting the data , should we set the timezone to first UTC and do the data load ?

******* Partition created by pg_partman by setting timezone as UTC ***************

UTC
Partition_name                         Partition_expression
test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00+00') TO ('2024-03-08 00:00:00+00')

when queried the partition_expression using EST ..

EST
Partition_name                         Partition_expression
test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-06 19:00:00-05') TO ('2024-03-07 19:00:00-05')


******* Partition created by pg_partman by setting timezone as EST ***************

EST
Partition_name                         Partition_expression
test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00-05') TO ('2024-03-08 00:00:00-05')

when queried the partition_expression using UTC ..

UTC
Partition_name                         Partition_expression
test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 05:00:00+00') TO ('2024-03-08 05:00:00+00')

*******

Also i see both the "setting" and "reset_val" is showing as local timezone only. If we set the timezone to a different value than the local timezone then it gets updated on the "setting". 

Regards
Sud
 

Re: Question related to partitioning with pg_partman

От
Adrian Klaver
Дата:
On 3/10/24 05:12, sud wrote:
> 

> 
>     '2024-03-07 00:00:00+00' and '2024-03-06 19:00:00-05' are the same time
>     as is '2024-03-07 00:00:00-05' and '2024-03-07 05:00:00+00'.
> 
>     Still I would think for sanity sake you would want to stick with UTC.
> 
> 
> 
> Thank you so much Adrian.
> 
> In my example in the first post, I see, if someone connected to a RDS 
> Postgres database and run the create partition command using pg_partman 
> by setting the timezone as "UTC", the 7th march partition looks to be 
> spanned from "7th march midnight" to "8th march midnight", when queried 
> the partition_experession from the data dictionary view. Which is correct.
> 
> And same information if someone querying by setting the timezone as EST 
> is showing spanning from "6th march 7PM" to "7th March 7PM". And this 
> can cause sometimes the partition may shift to other days all together. 
> Similar differences happen if creating the partitions using EST timezone 
> initially and then querying the data dictionary from UTC timezone.

The above is at odds with your example below which has the correct values:

2024-03-07 00:00:00+00 = 2024-03-06 19:00:00-05

> 
> So my question was, if in these types of scenarios, we should follow a 
> standard approach of setting the timezone as UTC in such a type of 
> global user use case, while the system can persist data from multiple 
> users sitting across different time zones? So that the boundary(start 
> and end time) of each of the range partitions will be set as consistent 
> in one timezone across all the partitioned tables?

You need to first determine what your time frames are going to be?

1) Midnight to Midnight in UTC will be consistent when viewed in UTC. It 
will not be when viewed in other time zone +/- the offset from UTC.

2) Or Midnight to Midnight in the users time zone, in which case the UTC 
values will differ.

You have to decide which of the above is your goal. The bottom line is 
by definition the local wall clock time will not equal UTC, GMT 
excepted. This comes down to what the purpose of the partitions are? In 
other words how do you want to organize the data?

> 
> And even while inserting the data , should we set the timezone to first 
> UTC and do the data load ?



> 
> ******* Partition created by pg_partman by setting timezone as UTC 
> ***************
> 
> *UTC*
> *Partition_name                         Partition_expression*
> test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00+00') TO 
> ('2024-03-08 00:00:00+00')
> 
> when queried the partition_expression using EST ..
> 
> *EST*
> *Partition_name                         Partition_expression*
> test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-06 19:00:00-05') TO 
> ('2024-03-07 19:00:00-05')
> 
> 
> ******* Partition created by pg_partman by setting timezone as EST 
> ***************
> 
> *EST*
> *Partition_name                         Partition_expression*
> test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00-05') 
> TO ('2024-03-08 00:00:00-05')
> 
> when queried the partition_expression using UTC ..
> 
> *UTC*
> *Partition_name                         Partition_expression*
> test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 05:00:00+00') 
> TO ('2024-03-08 05:00:00+00')
> 
> *******
> 
> Also i see both the "setting" and "reset_val" is showing as local 
> timezone only. If we set the timezone to a different value than the 
> local timezone then it gets updated on the "setting".
> 
> Regards
> Sud

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Question related to partitioning with pg_partman

От
sud
Дата:

On Sun, Mar 10, 2024 at 10:32 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 3/10/24 05:12, sud wrote:
>
> In my example in the first post, I see, if someone connected to a RDS
> Postgres database and run the create partition command using pg_partman
> by setting the timezone as "UTC", the 7th march partition looks to be
> spanned from "7th march midnight" to "8th march midnight", when queried
> the partition_experession from the data dictionary view. Which is correct.
>
> And same information if someone querying by setting the timezone as EST
> is showing spanning from "6th march 7PM" to "7th March 7PM". And this
> can cause sometimes the partition may shift to other days all together.
> Similar differences happen if creating the partitions using EST timezone
> initially and then querying the data dictionary from UTC timezone.

The above is at odds with your example below which has the correct values:

2024-03-07 00:00:00+00 = 2024-03-06 19:00:00-05

>
> So my question was, if in these types of scenarios, we should follow a
> standard approach of setting the timezone as UTC in such a type of
> global user use case, while the system can persist data from multiple
> users sitting across different time zones? So that the boundary(start
> and end time) of each of the range partitions will be set as consistent
> in one timezone across all the partitioned tables?

You need to first determine what your time frames are going to be?

1) Midnight to Midnight in UTC will be consistent when viewed in UTC. It
will not be when viewed in other time zone +/- the offset from UTC.

2) Or Midnight to Midnight in the users time zone, in which case the UTC
values will differ.

You have to decide which of the above is your goal. The bottom line is
by definition the local wall clock time will not equal UTC, GMT
excepted. This comes down to what the purpose of the partitions are? In
other words how do you want to organize the data?

>
> And even while inserting the data , should we set the timezone to first
> UTC and do the data load ?


>
> ******* Partition created by pg_partman by setting timezone as UTC
> ***************
>
> *UTC*
> *Partition_name                         Partition_expression*
> test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00+00') TO
> ('2024-03-08 00:00:00+00')
>
> when queried the partition_expression using EST ..
>
> *EST*
> *Partition_name                         Partition_expression*
> test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-06 19:00:00-05') TO
> ('2024-03-07 19:00:00-05')
>
>
> ******* Partition created by pg_partman by setting timezone as EST
> ***************
>
> *EST*
> *Partition_name                         Partition_expression*
> test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 00:00:00-05')
> TO ('2024-03-08 00:00:00-05')
>
> when queried the partition_expression using UTC ..
>
> *UTC*
> *Partition_name                         Partition_expression*
> test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07 05:00:00+00')
> TO ('2024-03-08 05:00:00+00')
>
> *******
>
> Also i see both the "setting" and "reset_val" is showing as local
> timezone only. If we set the timezone to a different value than the
> local timezone then it gets updated on the "setting".


Our requirement is to have the transaction table partitioned by range daily on the transaction_date column(i.e one midnight to next midnight transaction data in one partition). Transaction date column will be of timestamptz data type. And this application/database might be consuming data from users across multiple time zones in future. These tables will be queried based on the date range (minimum being ~1 transaction day) and also will be purged one day partition.

So for above I understand , it might not be possible to keep the users data restricted to one day partition in the table considering the users will perform transactions across multiple timezones, but we are thinking of restricting the database with UTC timezone irrespective of the users. And thus during creating the table partitions , we need to ensure the UTC timezone is set , such that the upper and lower boundary for the daily range partitions remains consistent for all. Correct me if my understanding is wrong.


 

Re: Question related to partitioning with pg_partman

От
Adrian Klaver
Дата:
On 3/10/24 10:51, sud wrote:
> 
> On Sun, Mar 10, 2024 at 10:32 PM Adrian Klaver 
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     On 3/10/24 05:12, sud wrote:
>      >
>      > In my example in the first post, I see, if someone connected to a
>     RDS
>      > Postgres database and run the create partition command using
>     pg_partman
>      > by setting the timezone as "UTC", the 7th march partition looks
>     to be
>      > spanned from "7th march midnight" to "8th march midnight", when
>     queried
>      > the partition_experession from the data dictionary view. Which is
>     correct.
>      >
>      > And same information if someone querying by setting the timezone
>     as EST
>      > is showing spanning from "6th march 7PM" to "7th March 7PM". And
>     this
>      > can cause sometimes the partition may shift to other days all
>     together.
>      > Similar differences happen if creating the partitions using EST
>     timezone
>      > initially and then querying the data dictionary from UTC timezone.
> 
>     The above is at odds with your example below which has the correct
>     values:
> 
>     2024-03-07 00:00:00+00 = 2024-03-06 19:00:00-05
> 
>      >
>      > So my question was, if in these types of scenarios, we should
>     follow a
>      > standard approach of setting the timezone as UTC in such a type of
>      > global user use case, while the system can persist data from
>     multiple
>      > users sitting across different time zones? So that the
>     boundary(start
>      > and end time) of each of the range partitions will be set as
>     consistent
>      > in one timezone across all the partitioned tables?
> 
>     You need to first determine what your time frames are going to be?
> 
>     1) Midnight to Midnight in UTC will be consistent when viewed in
>     UTC. It
>     will not be when viewed in other time zone +/- the offset from UTC.
> 
>     2) Or Midnight to Midnight in the users time zone, in which case the
>     UTC
>     values will differ.
> 
>     You have to decide which of the above is your goal. The bottom line is
>     by definition the local wall clock time will not equal UTC, GMT
>     excepted. This comes down to what the purpose of the partitions are? In
>     other words how do you want to organize the data?
> 
>      >
>      > And even while inserting the data , should we set the timezone to
>     first
>      > UTC and do the data load ?
> 
> 
>      >
>      > ******* Partition created by pg_partman by setting timezone as UTC
>      > ***************
>      >
>      > *UTC*
>      > *Partition_name                         Partition_expression*
>      > test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-07
>     00:00:00+00') TO
>      > ('2024-03-08 00:00:00+00')
>      >
>      > when queried the partition_expression using EST ..
>      >
>      > *EST*
>      > *Partition_name                         Partition_expression*
>      > test_timestamp_p2024_03_07 FOR VALUES FROM ('2024-03-06
>     19:00:00-05') TO
>      > ('2024-03-07 19:00:00-05')
>      >
>      >
>      > ******* Partition created by pg_partman by setting timezone as EST
>      > ***************
>      >
>      > *EST*
>      > *Partition_name                         Partition_expression*
>      > test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07
>     00:00:00-05')
>      > TO ('2024-03-08 00:00:00-05')
>      >
>      > when queried the partition_expression using UTC ..
>      >
>      > *UTC*
>      > *Partition_name                         Partition_expression*
>      > test_timestamp2_p2024_03_07 FOR VALUES FROM ('2024-03-07
>     05:00:00+00')
>      > TO ('2024-03-08 05:00:00+00')
>      >
>      > *******
>      >
>      > Also i see both the "setting" and "reset_val" is showing as local
>      > timezone only. If we set the timezone to a different value than the
>      > local timezone then it gets updated on the "setting".
> 
> 
> Our requirement is to have the transaction table partitioned by 
> range daily on the transaction_date column(i.e one midnight to next 
> midnight transaction data in one partition). Transaction date column 
> will be of timestamptz data type. And this application/database might be 
> consuming data from users across multiple time zones in future. These 
> tables will be queried based on the date range (minimum being ~1 
> transaction day) and also will be purged one day partition.
> 
> So for above I understand , it might not be possible to keep the users 
> data restricted to one day partition in the table considering the users 
> will perform transactions across multiple timezones, but we are thinking 
> of restricting the database with UTC timezone irrespective of the users. 
> And thus during creating the table partitions , we need to ensure the 
> UTC timezone is set , such that the upper and lower boundary for the 
> daily range partitions remains consistent for all. Correct me if my 
> understanding is wrong.

1) The partition will be across one day(24 hours) it is just the times 
may confuse people. Per you example 2024-03-07 00:00:00+00  is the same 
time as 2024-03-06 19:00:00-05 for EST. The issue is that the +00 and 
-05 maybe ignored. Also it depends on the clients being consistent in 
using timestamptz.

2) You still have not answered what the datetime range(not date range) 
is that will be queried. If you have the partitions Midnight to Midnight 
UTC and the clients are querying Midnight to Midnight local time the 
query will not match the partitions.

> 
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: Question related to partitioning with pg_partman

От
sud
Дата:

On Sun, Mar 10, 2024 at 11:31 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
1) The partition will be across one day(24 hours) it is just the times
may confuse people. Per you example 2024-03-07 00:00:00+00  is the same
time as 2024-03-06 19:00:00-05 for EST. The issue is that the +00 and
-05 maybe ignored. Also it depends on the clients being consistent in
using timestamptz.

2) You still have not answered what the datetime range(not date range)
is that will be queried. If you have the partitions Midnight to Midnight
UTC and the clients are querying Midnight to Midnight local time the
query will not match the partitions.


 My apology if not able to clearly put the details. Actually, the query will always happen on a day basis i.e they can query from one day to 15 days transactions. But as you rightly pointed , the partitions can only span from midnight to midnight in one timezone, and thus users who queries the data from another time zone will mostly scan two partitions (even if they just queries one days transaction data in their own timezone). And I don't see an easy solution for this , which will help all users across all time zones to scan only a single partition in the database, when they queries data for a single transaction date.

And thus my question was, is it necessary to have the creation of partitions to happen on UTC time zone only? and then whatever transaction data inserted by the users from respective time zones will be stored in the database as is and will be queried based on the user timezone (it may span across multiple partitions though for a single user transaction date).

 

Re: Question related to partitioning with pg_partman

От
Adrian Klaver
Дата:
On 3/10/24 11:34, sud wrote:
> 
> On Sun, Mar 10, 2024 at 11:31 PM Adrian Klaver 
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
> 
>     1) The partition will be across one day(24 hours) it is just the times
>     may confuse people. Per you example 2024-03-07 00:00:00+00  is the same
>     time as 2024-03-06 19:00:00-05 for EST. The issue is that the +00 and
>     -05 maybe ignored. Also it depends on the clients being consistent in
>     using timestamptz.
> 
>     2) You still have not answered what the datetime range(not date range)
>     is that will be queried. If you have the partitions Midnight to
>     Midnight
>     UTC and the clients are querying Midnight to Midnight local time the
>     query will not match the partitions.
> 
> 
>   My apology if not able to clearly put the details. Actually, the query 
> will always happen on a day basis i.e they can query from one day to 15 
> days transactions. But as you rightly pointed , the partitions can only 
> span from midnight to midnight in one timezone, and thus users who 
> queries the data from another time zone will mostly scan two partitions 
> (even if they just queries one days transaction data in their own 
> timezone). And I don't see an easy solution for this , which will help 
> all users across all time zones to scan only a single partition in the 
> database, when they queries data for a single transaction date.
> 
> And thus my question was, is it necessary to have the creation of 
> partitions to happen on UTC time zone only? and then whatever 
> transaction data inserted by the users from respective time zones will 
> be stored in the database as is and will be queried based on the user 
> timezone (it may span across multiple partitions though for a single 
> user transaction date).


This is going to depend on many things.

1) Partitions are not free they have overhead, which is fine if the 
cost(overhead) is less then the benefits. For details on that see:

https://www.postgresql.org/docs/current/ddl-partitioning.html

and partition parts of

https://www.postgresql.org/docs/current/sql-createtable.html

As part of this there is the consideration of whether daily partitions 
are really what you want?

2) What you hope to get out of the partitioning?

a) If it is confining queries to the partition boundaries then you have 
already stated that is not going to happen.

b) If it is for data pruning purposes, then you have something to 
consider on both ends. Creating/dropping partitions with Midnight to 
Midnight UTC means you will need to consider whether they cover the 
range of datetimes that your users are interested in. In other words 
creating a partition ahead that covers local times that resolve to a UTC 
time in the 'future'. On the back end not dropping a partition until it 
has gone out of scope for everybody.

To answer 1 & 2 you are probably going to need to create a test setup 
and verify how the expected queries are actually going to work with your 
partition scheme.



-- 
Adrian Klaver
adrian.klaver@aklaver.com