Обсуждение: Partitioned tables constraint_exclusion
I'm not sure if this is a bug, missing feature, misunderstanding on my part?
I checked the TODO list and couldn't find anything on it.
I currently have a 750 million row table, indexes are > 10 GB, so trying
to partition it.
The basic -
constraint_exclusion + exact match = OK
constraint_exclusion + ( var + var )::case = Not OK
Weslee
I tried to break it down to a simple case -
(kid_200601 should never show up in the plan)
mytest=# create table master ( var_text text not null, var_ts timestamp
with time zone not null, unique ( var_ts ) );
NOTICE:  CREATE TABLE / UNIQUE will create implicit index
"master_var_ts_key" for table "master"
CREATE TABLE
mytest=# create table kid_200601 ( check ( var_ts >= '2006-01-01
00:00:00' AND var_ts < '2006-02-01 00:00:00' ) ) inherits ( master ) ;
CREATE TABLE
mytest=# create table kid_200602 ( check ( var_ts >= '2006-02-01
00:00:00' AND var_ts < '2006-03-01 00:00:00' ) ) inherits ( master ) ;
CREATE TABLE
mytest=# create table kid_200603 ( check ( var_ts >= '2006-03-01
00:00:00' AND var_ts < '2006-04-01 00:00:00' ) ) inherits ( master ) ;
CREATE TABLE
mytest=# explain select count(*) from master where var_ts > '2006-02-22
00:00:00' ;                                         QUERY PLAN
-----------------------------------------------------------------------------------------------Aggregate
(cost=71.94..71.95rows=1 width=0)  ->  Append  (cost=7.09..69.18 rows=1101 width=0)        ->  Bitmap Heap Scan on
master (cost=7.09..21.68 rows=367 width=0)              Recheck Cond: (var_ts > '2006-02-22
 
00:00:00+00'::timestamp with time zone)              ->  Bitmap Index Scan on master_var_ts_key
(cost=0.00..7.00 rows=367 width=0)                    Index Cond: (var_ts > '2006-02-22
00:00:00+00'::timestamp with time zone)        ->  Seq Scan on kid_200602 master  (cost=0.00..23.75 rows=367
width=0)              Filter: (var_ts > '2006-02-22 00:00:00+00'::timestamp
with time zone)        ->  Seq Scan on kid_200603 master  (cost=0.00..23.75 rows=367
width=0)              Filter: (var_ts > '2006-02-22 00:00:00+00'::timestamp
with time zone)
(10 rows)
mytest=# select now() ;             now
-------------------------------2007-03-26 16:02:29.360435+00
(1 row)
mytest=# explain select count(*) from master where var_ts > ( now() - '1
month'::interval )::timestamptz ;                                         QUERY PLAN
----------------------------------------------------------------------------------------------Aggregate
(cost=114.94..114.95rows=1 width=0)  ->  Append  (cost=7.10..111.27 rows=1468 width=0)        ->  Bitmap Heap Scan on
master (cost=7.10..23.52 rows=367 width=0)              Recheck Cond: (var_ts > (now() - '1 mon'::interval))
 ->  Bitmap Index Scan on master_var_ts_key
 
(cost=0.00..7.01 rows=367 width=0)                    Index Cond: (var_ts > (now() - '1 mon'::interval))        ->  Seq
Scanon kid_200601 master  (cost=0.00..29.25 rows=367
 
width=0)              Filter: (var_ts > (now() - '1 mon'::interval))        ->  Seq Scan on kid_200602 master
(cost=0.00..29.25rows=367
 
width=0)              Filter: (var_ts > (now() - '1 mon'::interval))        ->  Seq Scan on kid_200603 master
(cost=0.00..29.25rows=367
 
width=0)              Filter: (var_ts > (now() - '1 mon'::interval))
(12 rows)
mytest=# show constraint_exclusion ;constraint_exclusion
----------------------on
(1 row)
mytest=# explain select count(*) from master where var_ts > (
'2007-03-26 16:03:27.370627+00'::timestamptz - '1 month'::interval
)::timestamptz ;                                                        QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=105.77..105.78 rows=1 width=0)  ->  Append  (cost=7.10..102.10 rows=1468 width=0)        ->  Bitmap Heap Scan on
master (cost=7.10..22.60 rows=367 width=0)              Recheck Cond: (var_ts > ('2007-03-26
 
16:03:27.370627+00'::timestamp with time zone - '1 mon'::interval))              ->  Bitmap Index Scan on
master_var_ts_key
(cost=0.00..7.01 rows=367 width=0)                    Index Cond: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone - '1 mon'::interval))        ->  Seq Scan on kid_200601 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone - '1 mon'::interval))        ->  Seq Scan on kid_200602 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone - '1 mon'::interval))        ->  Seq Scan on kid_200603 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone - '1 mon'::interval))
(12 rows)
mytest=#
			
		On Mon, 2007-03-26 at 09:38 -0700, Weslee Bilodeau wrote: > mytest=# explain select count(*) from master where var_ts > ( > '2007-03-26 16:03:27.370627+00'::timestamptz - '1 month'::interval > )::timestamptz ; If you're able to supply a constant value, why not subtract 1 month before you submit the query? AFAIK timestamptz arithmetic depends upon the current timezone which is a STABLE value and so won't currently work with partitioning. Having partitioning work with STABLE functions should be a TODO item if it isn't already, but that requires some thought to implement and won't happen for 8.3. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote:
> On Mon, 2007-03-26 at 09:38 -0700, Weslee Bilodeau wrote:
> 
>> mytest=# explain select count(*) from master where var_ts > (
>> '2007-03-26 16:03:27.370627+00'::timestamptz - '1 month'::interval
>> )::timestamptz ;
> 
> If you're able to supply a constant value, why not subtract 1 month
> before you submit the query?
> 
> AFAIK timestamptz arithmetic depends upon the current timezone which is
> a STABLE value and so won't currently work with partitioning.
> 
> Having partitioning work with STABLE functions should be a TODO item if
> it isn't already, but that requires some thought to implement and won't
> happen for 8.3.
> 
Mainly its because the value comes from a reporting system that has
minimal brains, it passes values it gets from the user directly into a
query.
IE, they enter '1 month', which I use to populate the interval value,
"ts > ( NOW() - $VALUE )"
But, in the example I did a "timestamp - interval", the exact date, not
NOW() - Still didn't work.
I'm guessing anything that has to think, math, etc is not valid for
constrain_exclusion?
Its not in the docs anywhere, so trying to isolate what can and can't be
done.
Weslee
mytest=# explain select count(*) from master where var_ts > (
'2007-03-26 16:03:27.370627+00'::timestamptz + '1 second'::interval
)::timestamptz ;                                                         QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=105.77..105.78 rows=1 width=0)  ->  Append  (cost=7.10..102.10 rows=1468 width=0)        ->  Bitmap Heap Scan on
master (cost=7.10..22.60 rows=367 width=0)              Recheck Cond: (var_ts > ('2007-03-26
 
16:03:27.370627+00'::timestamp with time zone + '00:00:01'::interval))              ->  Bitmap Index Scan on
master_var_ts_key
(cost=0.00..7.01 rows=367 width=0)                    Index Cond: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone + '00:00:01'::interval))        ->  Seq Scan on kid_200601 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone + '00:00:01'::interval))        ->  Seq Scan on kid_200602 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone + '00:00:01'::interval))        ->  Seq Scan on kid_200603 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone + '00:00:01'::interval))
(12 rows)
mytest=# explain select count(*) from master where var_ts > (
'2007-03-26 16:03:27.370627+00' ) ;                                             QUERY PLAN
------------------------------------------------------------------------------------------------------Aggregate
(cost=22.60..22.61rows=1 width=0)  ->  Append  (cost=7.09..21.68 rows=367 width=0)        ->  Bitmap Heap Scan on
master (cost=7.09..21.68 rows=367 width=0)              Recheck Cond: (var_ts > '2007-03-26
 
16:03:27.370627+00'::timestamp with time zone)              ->  Bitmap Index Scan on master_var_ts_key
(cost=0.00..7.00 rows=367 width=0)                    Index Cond: (var_ts > '2007-03-26
16:03:27.370627+00'::timestamp with time zone)
(6 rows)
mytest=#
			
		Weslee Bilodeau wrote: > Mainly its because the value comes from a reporting system that has > minimal brains, it passes values it gets from the user directly into a > query. > > IE, they enter '1 month', which I use to populate the interval value, > "ts > ( NOW() - $VALUE )" > > But, in the example I did a "timestamp - interval", the exact date, not > NOW() - Still didn't work. > > I'm guessing anything that has to think, math, etc is not valid for > constrain_exclusion? > > Its not in the docs anywhere, so trying to isolate what can and can't be > done. This works - CREATE FUNCTION now_interval( interval ) RETURNS timestamptz IMMUTABLE STRICT AS $$ SELECT NOW() - $1 $$ LANGUAGE SQL ; SELECT count(*) FROM master WHERE var_ts > now_interval( '1 month' ); This doesn't work - SELECT count(*) FROM master WHERE var_ts > ( NOW() - '1 month'::interval ); This works for me, as the reporting system I know doesn't change timezones, and function cache doesn't last longer then the current select? But, its basically the exact same logic in both cases? Weslee
See Simon's reply... timestamptz math is *not* IMMUTABLE, because sessions are free to change their timezone at any time. I bet you can get some invalid results using that function with a clever test case. On Mar 26, 2007, at 3:48 PM, Weslee Bilodeau wrote: > Weslee Bilodeau wrote: >> Mainly its because the value comes from a reporting system that has >> minimal brains, it passes values it gets from the user directly >> into a >> query. >> >> IE, they enter '1 month', which I use to populate the interval value, >> "ts > ( NOW() - $VALUE )" >> >> But, in the example I did a "timestamp - interval", the exact >> date, not >> NOW() - Still didn't work. >> >> I'm guessing anything that has to think, math, etc is not valid for >> constrain_exclusion? >> >> Its not in the docs anywhere, so trying to isolate what can and >> can't be >> done. > > This works - > > CREATE FUNCTION now_interval( interval ) RETURNS timestamptz IMMUTABLE > STRICT AS $$ SELECT NOW() - $1 $$ LANGUAGE SQL ; > > SELECT count(*) FROM master WHERE var_ts > now_interval( '1 month' ); > > This doesn't work - > > SELECT count(*) FROM master WHERE var_ts > ( NOW() - '1 > month'::interval ); > > > This works for me, as the reporting system I know doesn't change > timezones, and function cache doesn't last longer then the current > select? > > > But, its basically the exact same logic in both cases? > > Weslee > > > ---------------------------(end of > broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that > your > message can get through to the mailing list cleanly > -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Jim Nasby wrote: > See Simon's reply... timestamptz math is *not* IMMUTABLE, because > sessions are free to change their timezone at any time. I bet you can > get some invalid results using that function with a clever test case. > I'm pretty sure it could easily be broken. But to make it easier for me, I know that the reporting system connects, runs the query, and disconnects. So I'm so far safe using my current system. If the system had persistent connections and changed timezones a lot, it might however cause problems. Its been the only way that I could get it to be smart enough to not use the tables outside its range. With the tables growing 2+ million rows a day, approaching 1 billion rows, its helps performance a lot. This works at least until the ongoing discussion of partitioned tables hopefully improves things in this area. > On Mar 26, 2007, at 3:48 PM, Weslee Bilodeau wrote: >> This works - >> >> CREATE FUNCTION now_interval( interval ) RETURNS timestamptz IMMUTABLE >> STRICT AS $$ SELECT NOW() - $1 $$ LANGUAGE SQL ; >> >> SELECT count(*) FROM master WHERE var_ts > now_interval( '1 month' ); >> >> This doesn't work - >> >> SELECT count(*) FROM master WHERE var_ts > ( NOW() - '1 >> month'::interval ); >> >> >> This works for me, as the reporting system I know doesn't change >> timezones, and function cache doesn't last longer then the current >> select? >> >> >> But, its basically the exact same logic in both cases? >> >> Weslee >> >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 1: if posting/reading through Usenet, please send an appropriate >> subscribe-nomail command to majordomo@postgresql.org so that your >> message can get through to the mailing list cleanly >> > > -- > Jim Nasby jim@nasby.net > EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) > > > Weslee