Обсуждение: Partitioned tables constraint_exclusion

Поиск
Список
Период
Сортировка

Partitioned tables constraint_exclusion

От
Weslee Bilodeau
Дата:
I'm not sure if this is a bug, missing feature, misunderstanding on my part?

I checked the TODO list and couldn't find anything on it.

I currently have a 750 million row table, indexes are > 10 GB, so trying
to partition it.

The basic -

constraint_exclusion + exact match = OK
constraint_exclusion + ( var + var )::case = Not OK


Weslee


I tried to break it down to a simple case -
(kid_200601 should never show up in the plan)

mytest=# create table master ( var_text text not null, var_ts timestamp
with time zone not null, unique ( var_ts ) );
NOTICE:  CREATE TABLE / UNIQUE will create implicit index
"master_var_ts_key" for table "master"
CREATE TABLE
mytest=# create table kid_200601 ( check ( var_ts >= '2006-01-01
00:00:00' AND var_ts < '2006-02-01 00:00:00' ) ) inherits ( master ) ;
CREATE TABLE
mytest=# create table kid_200602 ( check ( var_ts >= '2006-02-01
00:00:00' AND var_ts < '2006-03-01 00:00:00' ) ) inherits ( master ) ;
CREATE TABLE
mytest=# create table kid_200603 ( check ( var_ts >= '2006-03-01
00:00:00' AND var_ts < '2006-04-01 00:00:00' ) ) inherits ( master ) ;
CREATE TABLE
mytest=# explain select count(*) from master where var_ts > '2006-02-22
00:00:00' ;                                         QUERY PLAN
-----------------------------------------------------------------------------------------------Aggregate
(cost=71.94..71.95rows=1 width=0)  ->  Append  (cost=7.09..69.18 rows=1101 width=0)        ->  Bitmap Heap Scan on
master (cost=7.09..21.68 rows=367 width=0)              Recheck Cond: (var_ts > '2006-02-22
 
00:00:00+00'::timestamp with time zone)              ->  Bitmap Index Scan on master_var_ts_key
(cost=0.00..7.00 rows=367 width=0)                    Index Cond: (var_ts > '2006-02-22
00:00:00+00'::timestamp with time zone)        ->  Seq Scan on kid_200602 master  (cost=0.00..23.75 rows=367
width=0)              Filter: (var_ts > '2006-02-22 00:00:00+00'::timestamp
with time zone)        ->  Seq Scan on kid_200603 master  (cost=0.00..23.75 rows=367
width=0)              Filter: (var_ts > '2006-02-22 00:00:00+00'::timestamp
with time zone)
(10 rows)

mytest=# select now() ;             now
-------------------------------2007-03-26 16:02:29.360435+00
(1 row)

mytest=# explain select count(*) from master where var_ts > ( now() - '1
month'::interval )::timestamptz ;                                         QUERY PLAN
----------------------------------------------------------------------------------------------Aggregate
(cost=114.94..114.95rows=1 width=0)  ->  Append  (cost=7.10..111.27 rows=1468 width=0)        ->  Bitmap Heap Scan on
master (cost=7.10..23.52 rows=367 width=0)              Recheck Cond: (var_ts > (now() - '1 mon'::interval))
 ->  Bitmap Index Scan on master_var_ts_key
 
(cost=0.00..7.01 rows=367 width=0)                    Index Cond: (var_ts > (now() - '1 mon'::interval))        ->  Seq
Scanon kid_200601 master  (cost=0.00..29.25 rows=367
 
width=0)              Filter: (var_ts > (now() - '1 mon'::interval))        ->  Seq Scan on kid_200602 master
(cost=0.00..29.25rows=367
 
width=0)              Filter: (var_ts > (now() - '1 mon'::interval))        ->  Seq Scan on kid_200603 master
(cost=0.00..29.25rows=367
 
width=0)              Filter: (var_ts > (now() - '1 mon'::interval))
(12 rows)

mytest=# show constraint_exclusion ;constraint_exclusion
----------------------on
(1 row)

mytest=# explain select count(*) from master where var_ts > (
'2007-03-26 16:03:27.370627+00'::timestamptz - '1 month'::interval
)::timestamptz ;                                                        QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=105.77..105.78 rows=1 width=0)  ->  Append  (cost=7.10..102.10 rows=1468 width=0)        ->  Bitmap Heap Scan on
master (cost=7.10..22.60 rows=367 width=0)              Recheck Cond: (var_ts > ('2007-03-26
 
16:03:27.370627+00'::timestamp with time zone - '1 mon'::interval))              ->  Bitmap Index Scan on
master_var_ts_key
(cost=0.00..7.01 rows=367 width=0)                    Index Cond: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone - '1 mon'::interval))        ->  Seq Scan on kid_200601 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone - '1 mon'::interval))        ->  Seq Scan on kid_200602 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone - '1 mon'::interval))        ->  Seq Scan on kid_200603 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone - '1 mon'::interval))
(12 rows)

mytest=#


Re: Partitioned tables constraint_exclusion

От
"Simon Riggs"
Дата:
On Mon, 2007-03-26 at 09:38 -0700, Weslee Bilodeau wrote:

> mytest=# explain select count(*) from master where var_ts > (
> '2007-03-26 16:03:27.370627+00'::timestamptz - '1 month'::interval
> )::timestamptz ;

If you're able to supply a constant value, why not subtract 1 month
before you submit the query?

AFAIK timestamptz arithmetic depends upon the current timezone which is
a STABLE value and so won't currently work with partitioning.

Having partitioning work with STABLE functions should be a TODO item if
it isn't already, but that requires some thought to implement and won't
happen for 8.3.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com




Re: Partitioned tables constraint_exclusion

От
Weslee Bilodeau
Дата:
Simon Riggs wrote:
> On Mon, 2007-03-26 at 09:38 -0700, Weslee Bilodeau wrote:
> 
>> mytest=# explain select count(*) from master where var_ts > (
>> '2007-03-26 16:03:27.370627+00'::timestamptz - '1 month'::interval
>> )::timestamptz ;
> 
> If you're able to supply a constant value, why not subtract 1 month
> before you submit the query?
> 
> AFAIK timestamptz arithmetic depends upon the current timezone which is
> a STABLE value and so won't currently work with partitioning.
> 
> Having partitioning work with STABLE functions should be a TODO item if
> it isn't already, but that requires some thought to implement and won't
> happen for 8.3.
> 

Mainly its because the value comes from a reporting system that has
minimal brains, it passes values it gets from the user directly into a
query.

IE, they enter '1 month', which I use to populate the interval value,
"ts > ( NOW() - $VALUE )"

But, in the example I did a "timestamp - interval", the exact date, not
NOW() - Still didn't work.

I'm guessing anything that has to think, math, etc is not valid for
constrain_exclusion?

Its not in the docs anywhere, so trying to isolate what can and can't be
done.

Weslee

mytest=# explain select count(*) from master where var_ts > (
'2007-03-26 16:03:27.370627+00'::timestamptz + '1 second'::interval
)::timestamptz ;                                                         QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=105.77..105.78 rows=1 width=0)  ->  Append  (cost=7.10..102.10 rows=1468 width=0)        ->  Bitmap Heap Scan on
master (cost=7.10..22.60 rows=367 width=0)              Recheck Cond: (var_ts > ('2007-03-26
 
16:03:27.370627+00'::timestamp with time zone + '00:00:01'::interval))              ->  Bitmap Index Scan on
master_var_ts_key
(cost=0.00..7.01 rows=367 width=0)                    Index Cond: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone + '00:00:01'::interval))        ->  Seq Scan on kid_200601 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone + '00:00:01'::interval))        ->  Seq Scan on kid_200602 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone + '00:00:01'::interval))        ->  Seq Scan on kid_200603 master
(cost=0.00..26.50rows=367
 
width=0)              Filter: (var_ts > ('2007-03-26
16:03:27.370627+00'::timestamp with time zone + '00:00:01'::interval))
(12 rows)

mytest=# explain select count(*) from master where var_ts > (
'2007-03-26 16:03:27.370627+00' ) ;                                             QUERY PLAN
------------------------------------------------------------------------------------------------------Aggregate
(cost=22.60..22.61rows=1 width=0)  ->  Append  (cost=7.09..21.68 rows=367 width=0)        ->  Bitmap Heap Scan on
master (cost=7.09..21.68 rows=367 width=0)              Recheck Cond: (var_ts > '2007-03-26
 
16:03:27.370627+00'::timestamp with time zone)              ->  Bitmap Index Scan on master_var_ts_key
(cost=0.00..7.00 rows=367 width=0)                    Index Cond: (var_ts > '2007-03-26
16:03:27.370627+00'::timestamp with time zone)
(6 rows)

mytest=#


Re: Partitioned tables constraint_exclusion

От
Weslee Bilodeau
Дата:
Weslee Bilodeau wrote:
> Mainly its because the value comes from a reporting system that has
> minimal brains, it passes values it gets from the user directly into a
> query.
> 
> IE, they enter '1 month', which I use to populate the interval value,
> "ts > ( NOW() - $VALUE )"
> 
> But, in the example I did a "timestamp - interval", the exact date, not
> NOW() - Still didn't work.
> 
> I'm guessing anything that has to think, math, etc is not valid for
> constrain_exclusion?
> 
> Its not in the docs anywhere, so trying to isolate what can and can't be
> done.

This works -

CREATE FUNCTION now_interval( interval ) RETURNS timestamptz IMMUTABLE
STRICT AS $$ SELECT NOW() - $1 $$ LANGUAGE SQL ;

SELECT count(*) FROM master WHERE var_ts > now_interval( '1 month' );

This doesn't work -

SELECT count(*) FROM master WHERE var_ts > ( NOW() - '1 month'::interval );


This works for me, as the reporting system I know doesn't change
timezones, and function cache doesn't last longer then the current select?


But, its basically the exact same logic in both cases?

Weslee



Re: Partitioned tables constraint_exclusion

От
Jim Nasby
Дата:
See Simon's reply... timestamptz math is *not* IMMUTABLE, because  
sessions are free to change their timezone at any time. I bet you can  
get some invalid results using that function with a clever test case.

On Mar 26, 2007, at 3:48 PM, Weslee Bilodeau wrote:

> Weslee Bilodeau wrote:
>> Mainly its because the value comes from a reporting system that has
>> minimal brains, it passes values it gets from the user directly  
>> into a
>> query.
>>
>> IE, they enter '1 month', which I use to populate the interval value,
>> "ts > ( NOW() - $VALUE )"
>>
>> But, in the example I did a "timestamp - interval", the exact  
>> date, not
>> NOW() - Still didn't work.
>>
>> I'm guessing anything that has to think, math, etc is not valid for
>> constrain_exclusion?
>>
>> Its not in the docs anywhere, so trying to isolate what can and  
>> can't be
>> done.
>
> This works -
>
> CREATE FUNCTION now_interval( interval ) RETURNS timestamptz IMMUTABLE
> STRICT AS $$ SELECT NOW() - $1 $$ LANGUAGE SQL ;
>
> SELECT count(*) FROM master WHERE var_ts > now_interval( '1 month' );
>
> This doesn't work -
>
> SELECT count(*) FROM master WHERE var_ts > ( NOW() - '1  
> month'::interval );
>
>
> This works for me, as the reporting system I know doesn't change
> timezones, and function cache doesn't last longer then the current  
> select?
>
>
> But, its basically the exact same logic in both cases?
>
> Weslee
>
>
> ---------------------------(end of  
> broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that  
> your
>        message can get through to the mailing list cleanly
>

--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)




Re: Partitioned tables constraint_exclusion

От
Weslee Bilodeau
Дата:
Jim Nasby wrote:
> See Simon's reply... timestamptz math is *not* IMMUTABLE, because
> sessions are free to change their timezone at any time. I bet you can
> get some invalid results using that function with a clever test case.
> 

I'm pretty sure it could easily be broken.
But to make it easier for me, I know that the reporting system connects,
runs the query, and disconnects.

So I'm so far safe using my current system.

If the system had persistent connections and changed timezones a lot, it
might however cause problems.

Its been the only way that I could get it to be smart enough to not use
the tables outside its range.

With the tables growing 2+ million rows a day, approaching 1 billion
rows, its helps performance a lot.

This works at least until the ongoing discussion of partitioned tables
hopefully improves things in this area.

> On Mar 26, 2007, at 3:48 PM, Weslee Bilodeau wrote:
>> This works -
>>
>> CREATE FUNCTION now_interval( interval ) RETURNS timestamptz IMMUTABLE
>> STRICT AS $$ SELECT NOW() - $1 $$ LANGUAGE SQL ;
>>
>> SELECT count(*) FROM master WHERE var_ts > now_interval( '1 month' );
>>
>> This doesn't work -
>>
>> SELECT count(*) FROM master WHERE var_ts > ( NOW() - '1
>> month'::interval );
>>
>>
>> This works for me, as the reporting system I know doesn't change
>> timezones, and function cache doesn't last longer then the current
>> select?
>>
>>
>> But, its basically the exact same logic in both cases?
>>
>> Weslee
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 1: if posting/reading through Usenet, please send an appropriate
>>        subscribe-nomail command to majordomo@postgresql.org so that your
>>        message can get through to the mailing list cleanly
>>
> 
> -- 
> Jim Nasby                                            jim@nasby.net
> EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
> 
> 
> 

Weslee