Обсуждение: BUG #15334: Partition elimination not working as expected when usingenum as partition key

Поиск
Список
Период
Сортировка

BUG #15334: Partition elimination not working as expected when usingenum as partition key

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      15334
Logged by:          Damir Ciganović-Janković
Email address:      damir.ciganovic.jankovic@gmail.com
PostgreSQL version: 10.2
Operating system:   Linux (Centos), Windows
Description:

Our tables are partitioned by partition key RANGE (resolution, timestamp).

Timestamp is of type 'timestamp without time zone' and resolution is hour
own enum type which contains values 'HOUR' , 'DAY' and 'MONTH':
CREATE TYPE resolution AS ENUM (
  'HOUR',
  'DAY',
  'MONTH'
);

Our table "my_report" has timestamp, resolution and count columns.

This is the query I am executing:
select "my_report"."count" from my_report where ("my_report"."resolution" =
'HOUR'::resolution and "my_report"."timestamp" >= timestamp '2018-08-16
07:00:00' and "my_report"."timestamp" < timestamp '2018-08-16 10:00:00.0')
limit 10; 
(NOTE: I simplified the query so real query and execution plan are
different, but I think you will understand me)

I noticed that when doing this query via psql cmd I get this execution plan

+++++
...
->  Append  (cost=0.00..159.76 rows=1 width=24) (actual time=0.585..0.585
rows=0 loops=1)
    ->  Seq Scan on my_report_hour_20180816  (cost=0.00..159.76 rows=1
width=24) (actual time=0.584..0.584 rows=0 loops=1)
        Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
zone) AND (resolution = 'HOUR'::resolution))
        Rows Removed by Filter: 4672
...
++++++

This is the correct behaviour, we picked the partition where our timestamps
are in range, and also resolution is HOUR.
Now, when executing this same query via our client (Java), but passing the
resolution as a parameter (and not hardcoding HOUR value in my query like
first time). I would expect that execution plan will be the same, but that
is not the case:

++++++
...
->  Append  (cost=0.00..372.24 rows=3 width=24) (actual time=1.117..1.117
rows=0 loops=1)
    ->  Seq Scan on my_report_hour_20180816  (cost=0.00..183.12 rows=1
width=24) (actual time=0.589..0.589 rows=0 loops=1)
        Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
zone) AND (resolution = ('HOUR'::cstring)::resolution))
        Rows Removed by Filter: 4672
   ->  Seq Scan on my_report_day_201808  (cost=0.00..94.56 rows=1 width=24)
(actual time=0.265..0.265 rows=0 loops=1)
        Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
zone) AND (resolution = ('HOUR'::cstring)::resolution))
        Rows Removed by Filter: 2336
    ->  Seq Scan on my_report_month_201808  (cost=0.00..94.56 rows=1
width=24) (actual time=0.261..0.261 rows=0 loops=1)
        Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
zone) AND (resolution = ('HOUR'::cstring)::resolution))
        Rows Removed by Filter: 2336
...
++++++

As we can see here, our execution will do seq scan on all my_report
partitions for that period (resolutions: day, month and hour) even though we
put HOUR as our query parameter. One thing that I observed is that insead of
(resolution = 'HOUR'::resolution)) like we got in the first example, filters
look like this (resolution = ('HOUR'::cstring)::resolution)).

It seems to me that PostgreSQL is not removing the (::cstring) part before
partition elimination so that it is forced to go through all of them
bassically ignoring the resolution value.

I have a workaround now by putting exact value in the query (I wrote exact
string `"my_report"."resolution" = 'DAY'::resolution`) into my query, and
not passing the value as parameter, but this looks to me as a bug. 

Thanks in advance


Re: BUG #15334: Partition elimination not working as expected whenusing enum as partition key

От
Amit Langote
Дата:
Hi.

On 2018/08/16 23:05, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      15334
> Logged by:          Damir Ciganović-Janković
> Email address:      damir.ciganovic.jankovic@gmail.com
> PostgreSQL version: 10.2
> Operating system:   Linux (Centos), Windows
> Description:        
> 
> Our tables are partitioned by partition key RANGE (resolution, timestamp).
> 
> Timestamp is of type 'timestamp without time zone' and resolution is hour
> own enum type which contains values 'HOUR' , 'DAY' and 'MONTH':
> CREATE TYPE resolution AS ENUM (
>   'HOUR',
>   'DAY',
>   'MONTH'
> );
> 
> Our table "my_report" has timestamp, resolution and count columns.
> 
> This is the query I am executing:
> select "my_report"."count" from my_report where ("my_report"."resolution" =
> 'HOUR'::resolution and "my_report"."timestamp" >= timestamp '2018-08-16
> 07:00:00' and "my_report"."timestamp" < timestamp '2018-08-16 10:00:00.0')
> limit 10; 
> (NOTE: I simplified the query so real query and execution plan are
> different, but I think you will understand me)
> 
> I noticed that when doing this query via psql cmd I get this execution plan
> 
> +++++
> ...
> ->  Append  (cost=0.00..159.76 rows=1 width=24) (actual time=0.585..0.585
> rows=0 loops=1)
>     ->  Seq Scan on my_report_hour_20180816  (cost=0.00..159.76 rows=1
> width=24) (actual time=0.584..0.584 rows=0 loops=1)
>         Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
> time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
> zone) AND (resolution = 'HOUR'::resolution))
>         Rows Removed by Filter: 4672
> ...
> ++++++
> 
> This is the correct behaviour, we picked the partition where our timestamps
> are in range, and also resolution is HOUR.
> Now, when executing this same query via our client (Java), but passing the
> resolution as a parameter (and not hardcoding HOUR value in my query like
> first time). I would expect that execution plan will be the same, but that
> is not the case:
> 
> ++++++
> ...
> ->  Append  (cost=0.00..372.24 rows=3 width=24) (actual time=1.117..1.117
> rows=0 loops=1)
>     ->  Seq Scan on my_report_hour_20180816  (cost=0.00..183.12 rows=1
> width=24) (actual time=0.589..0.589 rows=0 loops=1)
>         Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
> time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
> zone) AND (resolution = ('HOUR'::cstring)::resolution))
>         Rows Removed by Filter: 4672
>    ->  Seq Scan on my_report_day_201808  (cost=0.00..94.56 rows=1 width=24)
> (actual time=0.265..0.265 rows=0 loops=1)
>         Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
> time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
> zone) AND (resolution = ('HOUR'::cstring)::resolution))
>         Rows Removed by Filter: 2336
>     ->  Seq Scan on my_report_month_201808  (cost=0.00..94.56 rows=1
> width=24) (actual time=0.261..0.261 rows=0 loops=1)
>         Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
> time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
> zone) AND (resolution = ('HOUR'::cstring)::resolution))
>         Rows Removed by Filter: 2336
> ...
> ++++++
> 
> As we can see here, our execution will do seq scan on all my_report
> partitions for that period (resolutions: day, month and hour) even though we
> put HOUR as our query parameter. One thing that I observed is that insead of
> (resolution = 'HOUR'::resolution)) like we got in the first example, filters
> look like this (resolution = ('HOUR'::cstring)::resolution)).

It's strange that the expression "('HOUR'::cstring)::resolution" got
through without getting the following error

ERROR:  cannot cast type cstring to resolution
LINE 1: select ('HOUR'::cstring)::resolution;

For example,

select ('HOUR'::cstring)::resolution;
ERROR:  cannot cast type cstring to resolution
LINE 1: select ('HOUR'::cstring)::resolution;

> It seems to me that PostgreSQL is not removing the (::cstring) part before
> partition elimination so that it is forced to go through all of them
> bassically ignoring the resolution value.
> 
> I have a workaround now by putting exact value in the query (I wrote exact
> string `"my_report"."resolution" = 'DAY'::resolution`) into my query, and
> not passing the value as parameter, but this looks to me as a bug. 

You listed 10.2 as the PG version being used, which indeed had a bug that
caused partition pruning to not work correctly with enum type partition
key, which got fixed in 10.5 [1].

I'm mystified as to why it worked at all if you're are using 10.2 (you
said it worked correctly when running the command using psql with the
hard-coded value of resolution.)  So maybe, it's not the bug of 10.2 that
got fixed in 10.5 is what's causing you the trouble, but then it's also
not clear what the problem really is.

Thanks,
Amit


[1] "Fix handling of partition keys whose data type uses a polymorphic
btree operator class, such as arrays"

https://www.postgresql.org/docs/devel/static/release-10-5.html



Re: BUG #15334: Partition elimination not working as expected whenusing enum as partition key

От
Damir Ciganović-Janković
Дата:
> Hi.
> On 2018/08/16 23:05, PG Bug reporting form wrote:
> > The following bug has been logged on the website:
> > 
> > Bug reference:      15334
> > Logged by:          Damir Ciganović-Janković
> > Email address:      damir(dot)ciganovic(dot)jankovic(at)gmail(dot)com
> > PostgreSQL version: 10.2
> > Operating system:   Linux (Centos), Windows
> > Description:        
> > 
> > Our tables are partitioned by partition key RANGE (resolution, timestamp).
> > 
> > Timestamp is of type 'timestamp without time zone' and resolution is hour
> > own enum type which contains values 'HOUR' , 'DAY' and 'MONTH':
> > CREATE TYPE resolution AS ENUM (
> >   'HOUR',
> >   'DAY',
> >   'MONTH'
> > );
> > 
> > Our table "my_report" has timestamp, resolution and count columns.
> > 
> > This is the query I am executing:
> > select "my_report"."count" from my_report where ("my_report"."resolution" =
> > 'HOUR'::resolution and "my_report"."timestamp" >= timestamp '2018-08-16
> > 07:00:00' and "my_report"."timestamp" < timestamp '2018-08-16 10:00:00.0')
> > limit 10; 
> > (NOTE: I simplified the query so real query and execution plan are
> > different, but I think you will understand me)
> > 
> > I noticed that when doing this query via psql cmd I get this execution plan
> > 
> > +++++
> > ...
> > ->  Append  (cost=0.00..159.76 rows=1 width=24) (actual time=0.585..0.585
> > rows=0 loops=1)
> >     ->  Seq Scan on my_report_hour_20180816  (cost=0.00..159.76 rows=1
> > width=24) (actual time=0.584..0.584 rows=0 loops=1)
> >         Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
> > time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
> > zone) AND (resolution = 'HOUR'::resolution))
> >         Rows Removed by Filter: 4672
> > ...
> > ++++++
> > 
> > This is the correct behaviour, we picked the partition where our timestamps
> > are in range, and also resolution is HOUR.
> > Now, when executing this same query via our client (Java), but passing the
> > resolution as a parameter (and not hardcoding HOUR value in my query like
> > first time). I would expect that execution plan will be the same, but that
> > is not the case:
> > 
> > ++++++
> > ...
> > ->  Append  (cost=0.00..372.24 rows=3 width=24) (actual time=1.117..1.117
> > rows=0 loops=1)
> >     ->  Seq Scan on my_report_hour_20180816  (cost=0.00..183.12 rows=1
> > width=24) (actual time=0.589..0.589 rows=0 loops=1)
> >         Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
> > time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
> > zone) AND (resolution = ('HOUR'::cstring)::resolution))
> >         Rows Removed by Filter: 4672
> >    ->  Seq Scan on my_report_day_201808  (cost=0.00..94.56 rows=1 width=24)
> > (actual time=0.265..0.265 rows=0 loops=1)
> >         Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
> > time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
> > zone) AND (resolution = ('HOUR'::cstring)::resolution))
> >         Rows Removed by Filter: 2336
> >     ->  Seq Scan on my_report_month_201808  (cost=0.00..94.56 rows=1
> > width=24) (actual time=0.261..0.261 rows=0 loops=1)
> >         Filter: (("timestamp" >= '2018-08-16 07:00:00'::timestamp without
> > time zone) AND ("timestamp" < '2018-08-16 10:00:00'::timestamp without time
> > zone) AND (resolution = ('HOUR'::cstring)::resolution))
> >         Rows Removed by Filter: 2336
> > ...
> > ++++++
> > 
> > As we can see here, our execution will do seq scan on all my_report
> > partitions for that period (resolutions: day, month and hour) even though we
> > put HOUR as our query parameter. One thing that I observed is that insead of
> > (resolution = 'HOUR'::resolution)) like we got in the first example, filters
> > look like this (resolution = ('HOUR'::cstring)::resolution)).
> It's strange that the expression "('HOUR'::cstring)::resolution" got
> through without getting the following error
> ERROR:  cannot cast type cstring to resolution
> LINE 1: select ('HOUR'::cstring)::resolution;
> For example,
> select ('HOUR'::cstring)::resolution;
> ERROR:  cannot cast type cstring to resolution
> LINE 1: select ('HOUR'::cstring)::resolution;
> > It seems to me that PostgreSQL is not removing the (::cstring) part before
> > partition elimination so that it is forced to go through all of them
> > bassically ignoring the resolution value.
> > 
> > I have a workaround now by putting exact value in the query (I wrote exact
> > string `"my_report"."resolution" = 'DAY'::resolution`) into my query, and
> > not passing the value as parameter, but this looks to me as a bug. 
> You listed 10.2 as the PG version being used, which indeed had a bug that
> caused partition pruning to not work correctly with enum type partition
> key, which got fixed in 10.5 [1].
> I'm mystified as to why it worked at all if you're are using 10.2 (you
> said it worked correctly when running the command using psql with the
> hard-coded value of resolution.)  So maybe, it's not the bug of 10.2 that
> got fixed in 10.5 is what's causing you the trouble, but then it's also
> not clear what the problem really is.
> Thanks,
> Amit
> [1] "Fix handling of partition keys whose data type uses a polymorphic
> btree operator class, such as arrays"

I know about the issue and the fix, but this is a different bug, it seems to me. I will try to explain the issue to be more clear.
This is the enum, table and its partitions:
+++++++++++++++++++++++++++++
create type ab as enum ('A', 'B');
create table test (key ab not null) partition by list (key);
create table test_a partition of test for values in ('A');
create table test_b partition of test for values in ('B');

We are using Java, so this is the code:
try (Connection connection = DriverManager.getConnection(...);
    PreparedStatement stmt = connection.prepareStatement("EXPLAIN SELECT * FROM test WHERE key = ?::ab")) {

    stmt.setString(1, "A");
    try (ResultSet rs = stmt.executeQuery()) {
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
    }

Simple select with enum value as a parameter. Note that we didn't put "::cstring" in query. This is the result of the EXPLAIN SELECT from the code:
Append  (cost=0.00..109.25 rows=26 width=4)
  ->  Seq Scan on test_a  (cost=0.00..54.63 rows=13 width=4)
        Filter: (key = ('A'::cstring)::ab)
  ->  Seq Scan on test_b  (cost=0.00..54.63 rows=13 width=4)
        Filter: (key = ('A'::cstring)::ab)
I have put wireshark snoop in attachment where we can see that our java client is not the one adding the "::cstring" part into the code.

I hope this helps

Вложения

Re: BUG #15334: Partition elimination not working as expected whenusing enum as partition key

От
Amit Langote
Дата:
On 2018/08/17 16:17, Damir Ciganović-Janković wrote:
> I know about the issue and the fix, but this is a different bug, it seems
> to me. I will try to explain the issue to be more clear.
> This is the enum, table and its partitions:
> +++++++++++++++++++++++++++++
> create type ab as enum ('A', 'B');
> create table test (key ab not null) partition by list (key);
> create table test_a partition of test for values in ('A');
> create table test_b partition of test for values in ('B');
> +++++++++++++++++++++++++++++
> 
> We are using Java, so this is the code:
> +++++++++++++++++++++++++++++
> try (Connection connection = DriverManager.getConnection(...);
>     PreparedStatement stmt = connection.prepareStatement("EXPLAIN SELECT *
> FROM test WHERE key = ?::ab")) {
> 
>     stmt.setString(1, "A");
>     try (ResultSet rs = stmt.executeQuery()) {
>         while (rs.next()) {
>             System.out.println(rs.getString(1));
>         }
>     }
> }
> +++++++++++++++++++++++++++++
> Simple select with enum value as a parameter. Note that we didn't put
> "::cstring" in query. This is the result of the EXPLAIN SELECT from the
> code:
> +++++++++++++++++++++++++++++
> Append  (cost=0.00..109.25 rows=26 width=4)
>   ->  Seq Scan on test_a  (cost=0.00..54.63 rows=13 width=4)
>         Filter: (key = ('A'::cstring)::ab)
>   ->  Seq Scan on test_b  (cost=0.00..54.63 rows=13 width=4)
>         Filter: (key = ('A'::cstring)::ab)
> +++++++++++++++++++++++++++++
> I have put wireshark snoop in attachment where we can see that our java
> client is not the one adding the "::cstring" part into the code.

It's the Postgres EXPLAIN's code for expression deparsing that puts the
'::cstring' there.  I have to wonder why it couldn't just skip adding that
and print it as simply 'A'::ab.  However, I'm not sure if answer to that
question is related to why partition pruning doesn't occur.  Partition
pruning not occurring may however have to do with the fact that
PreparedStatement is being used meaning the planner doesn't get a chance
to do perform the pruning, but then one would see "Filter: (key = $1)" in
the EXPLAIN output if that's the case.  Sorry, I'm out of clues.


By the way, just to reconfirm if your Java application is connecting to
the same server as psql, what plan do you get when you try the same query
via psql connecting to the same server as the Java application?

I get this:

$ psql
Timing is on.
Line style is unicode.
psql (10.2)
Type "help" for help.

create type ab as enum ('A', 'B');
create table test (key ab not null) partition by list (key);
create table test_a partition of test for values in ('A');
create table test_b partition of test for values in ('B');

EXPLAIN SELECT * FROM test WHERE key = 'A'::ab;
                          QUERY PLAN
──────────────────────────────────────────────────────────────
 Append  (cost=0.00..83.75 rows=26 width=4)
   ->  Seq Scan on test_a  (cost=0.00..41.88 rows=13 width=4)
         Filter: (key = 'A'::ab)
   ->  Seq Scan on test_b  (cost=0.00..41.88 rows=13 width=4)
         Filter: (key = 'A'::ab)
(5 rows)

With 10.5, partition for 'B' is pruned.

$ psql
Timing is on.
Line style is unicode.
psql (10.5)
Type "help" for help.

EXPLAIN SELECT * FROM test WHERE key = 'A'::ab;
                          QUERY PLAN
──────────────────────────────────────────────────────────────
 Append  (cost=0.00..41.88 rows=13 width=4)
   ->  Seq Scan on test_a  (cost=0.00..41.88 rows=13 width=4)
         Filter: (key = 'A'::ab)
(3 rows)

Thanks,
Amit



Re: BUG #15334: Partition elimination not working as expected whenusing enum as partition key

От
Damir Ciganović-Janković
Дата:
> It's the Postgres EXPLAIN's code for expression deparsing that puts the
> '::cstring' there.  I have to wonder why it couldn't just skip adding that
> and print it as simply 'A'::ab.  However, I'm not sure if answer to that
> question is related to why partition pruning doesn't occur.  Partition
> pruning not occurring may however have to do with the fact that
> PreparedStatement is being used meaning the planner doesn't get a chance
> to do perform the pruning, but then one would see "Filter: (key = $1)" in
> the EXPLAIN output if that's the case.  Sorry, I'm out of clues.
> By the way, just to reconfirm if your Java application is connecting to
> the same server as psql, what plan do you get when you try the same query
> via psql connecting to the same server as the Java application?
> I get this:
> $ psql
> Timing is on.
> Line style is unicode.
> psql (10.2)
> Type "help" for help.
> create type ab as enum ('A', 'B');
> create table test (key ab not null) partition by list (key);
> create table test_a partition of test for values in ('A');
> create table test_b partition of test for values in ('B');
> EXPLAIN SELECT * FROM test WHERE key = 'A'::ab;
>                           QUERY PLAN
> ──────────────────────────────────────────────────────────────
>  Append  (cost=0.00..83.75 rows=26 width=4)
>    ->  Seq Scan on test_a  (cost=0.00..41.88 rows=13 width=4)
>          Filter: (key = 'A'::ab)
>    ->  Seq Scan on test_b  (cost=0.00..41.88 rows=13 width=4)
>          Filter: (key = 'A'::ab)
> (5 rows)
> With 10.5, partition for 'B' is pruned.
> $ psql
> Timing is on.
> Line style is unicode.
> psql (10.5)
> Type "help" for help.
> EXPLAIN SELECT * FROM test WHERE key = 'A'::ab;
>                           QUERY PLAN
> ──────────────────────────────────────────────────────────────
>  Append  (cost=0.00..41.88 rows=13 width=4)
>    ->  Seq Scan on test_a  (cost=0.00..41.88 rows=13 width=4)
>         Filter: (key = 'A'::ab)
> (3 rows)

I'm sorry, I didn't included this part for 10.2: 
alter table test_a add constraint test_a_check check(key='A');
alter table test_b add constraint test_b_check check(key='B');

(we will update soon :-) )

The results were same. But we managed to find out how to reproduced this in psql. 
This is the output from version 10.5 to eliminate possible fixes in release 10.3-10.5

+++++++++++++++++++++++
+++++++++++++++++++++++
+++++++++++++++++++++++ 
example=# select version();
                          version
------------------------------------------------------------
 PostgreSQL 10.5, compiled by Visual C++ build 1800, 64-bit
(1 row)

example=# -- Prepare schema.
example=# create type ab as enum ('A', 'B');
CREATE TYPE
example=# create table test (key ab not null) partition by list (key);
CREATE TABLE
example=# create table test_a partition of test for values in ('A');
CREATE TABLE
example=# create table test_b partition of test for values in ('B');
CREATE TABLE
example=# -- Insert data.
example=# insert into test (key) values ('A'), ('B');
INSERT 0 2
example=# -- Inline value into query.
example=# explain analyze select * from test where key = 'A'::ab;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..41.88 rows=13 width=4) (actual time=0.013..0.013 rows=1 loops=1)
   ->  Seq Scan on test_a  (cost=0.00..41.88 rows=13 width=4) (actual time=0.012..0.013 rows=1 loops=1)
         Filter: (key = 'A'::ab)
 Planning time: 0.153 ms
 Execution time: 0.025 ms
(5 rows)

example=# -- Use untyped prepared statment.
example=# prepare untyped_stmt as select * from test where key = $1::ab;
PREPARE
example=# explain analyze execute untyped_stmt('A');
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..41.88 rows=13 width=4) (actual time=0.011..0.012 rows=1 loops=1)
   ->  Seq Scan on test_a  (cost=0.00..41.88 rows=13 width=4) (actual time=0.011..0.011 rows=1 loops=1)
         Filter: (key = 'A'::ab)
 Planning time: 0.201 ms
 Execution time: 0.024 ms
(5 rows)

example=# -- Use typed prepared statment.
example=# prepare typed_stmt(text) as select * from test where key = $1::ab;
PREPARE
example=# explain analyze execute typed_stmt('A');
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..109.25 rows=26 width=4) (actual time=0.015..0.022 rows=1 loops=1)
   ->  Seq Scan on test_a  (cost=0.00..54.63 rows=13 width=4) (actual time=0.015..0.015 rows=1 loops=1)
         Filter: (key = ('A'::cstring)::ab)
   ->  Seq Scan on test_b  (cost=0.00..54.63 rows=13 width=4) (actual time=0.006..0.006 rows=0 loops=1)
         Filter: (key = ('A'::cstring)::ab)
         Rows Removed by Filter: 1
 Planning time: 0.155 ms
 Execution time: 0.042 ms
(8 rows)
+++++++++++++++++++++++
+++++++++++++++++++++++
+++++++++++++++++++++++ 

When we prepared statement with parameter not defined, everything seems fine, 
but when we prepared statement with type of parameter defined it happened again.
We can also see from the plan that rows were actually filtered while executing the second statement. 

Hope this will be easier to track it down now,
Damir

Re: BUG #15334: Partition elimination not working as expected when using enum as partition key

От
Andrew Gierth
Дата:
>>>>> "Damir" == Damir Ciganović-Janković <damir.ciganovic.jankovic@gmail.com> writes:

 >> It's the Postgres EXPLAIN's code for expression deparsing that puts
 >> the '::cstring' there. I have to wonder why it couldn't just skip
 >> adding that and print it as simply 'A'::ab. However, I'm not sure if
 >> answer to that question is related to why partition pruning doesn't
 >> occur.

I think what's happening there is this: the parameter value needs a cast
from text to "ab"; this cast is an IO cast, so it becomes
enum_in(textout('A'),oid), and simplify_function can process
textout('A') as a constant (since textout is immutable) but not
enum_in('A',oid) (because enum_in is only stable, since it depends on
the enum label definitions). So the simplified expression becomes
('A'::cstring)::ab even though that's not actually a valid cast in
normal circumstances.

I haven't checked the partition code, but my guess is that since that's
not a constant (due to the stable cast), it can't be used for pruning in
pg10 (in pg11 it seems to get pruned at execution time).

From JDBC, I believe there's an option to make setString pass parameters
as being unknown-type (oid 0) rather than as varchars. Maybe that would
be a possible workaround?

--
Andrew (irc:RhodiumToad)


Re: BUG #15334: Partition elimination not working as expected whenusing enum as partition key

От
Amit Langote
Дата:
On Fri, Aug 17, 2018 at 20:21 Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>>>>> "Damir" == Damir Ciganović-Janković <damir.ciganovic.jankovic@gmail.com> writes:

 >> It's the Postgres EXPLAIN's code for expression deparsing that puts
 >> the '::cstring' there. I have to wonder why it couldn't just skip
 >> adding that and print it as simply 'A'::ab. However, I'm not sure if
 >> answer to that question is related to why partition pruning doesn't
 >> occur.

I think what's happening there is this: the parameter value needs a cast
from text to "ab"; this cast is an IO cast, so it becomes
enum_in(textout('A'),oid), and simplify_function can process
textout('A') as a constant (since textout is immutable) but not
enum_in('A',oid) (because enum_in is only stable, since it depends on
the enum label definitions). So the simplified expression becomes
('A'::cstring)::ab even though that's not actually a valid cast in
normal circumstances.

Ah, I see.

I haven't checked the partition code, but my guess is that since that's
not a constant (due to the stable cast), it can't be used for pruning in
pg10 (in pg11 it seems to get pruned at execution time).

Yeah, that indeed seems to be the problem.

Thanks,
Amit

Re: BUG #15334: Partition elimination not working as expected whenusing enum as partition key

От
Amit Langote
Дата:
On 2018/08/17 19:10, Damir Ciganović-Janković wrote:
> I'm sorry, I didn't included this part for 10.2:
> alter table test_a add constraint test_a_check check(key='A');
> alter table test_b add constraint test_b_check check(key='B');
> 
> Same workaround as I did in #15042
> https://www.postgresql.org/message-id/2b20c220-50cb-3fae-da57-92e8cb3675dd%40lab.ntt.co.jp
> :-)
> (we will update soon :-) )

Ah, there it is. :-)

> The results were same. But we managed to find out how to reproduced this in
> psql.
> This is the output from version 10.5 to eliminate possible fixes in release
> 10.3-10.5

[ ... ]

> example=# -- Use typed prepared statment.
> example=# prepare typed_stmt(text) as select * from test where key = $1::ab;
> PREPARE
> example=# explain analyze execute typed_stmt('A');
>                                                QUERY PLAN
> --------------------------------------------------------------------------------------------------------
>  Append  (cost=0.00..109.25 rows=26 width=4) (actual time=0.015..0.022
> rows=1 loops=1)
>    ->  Seq Scan on test_a  (cost=0.00..54.63 rows=13 width=4) (actual
> time=0.015..0.015 rows=1 loops=1)
>          Filter: (key = ('A'::cstring)::ab)
>    ->  Seq Scan on test_b  (cost=0.00..54.63 rows=13 width=4) (actual
> time=0.006..0.006 rows=0 loops=1)
>          Filter: (key = ('A'::cstring)::ab)
>          Rows Removed by Filter: 1
>  Planning time: 0.155 ms
>  Execution time: 0.042 ms
> (8 rows)
> +++++++++++++++++++++++
> +++++++++++++++++++++++
> +++++++++++++++++++++++
> 
> When we prepared statement with parameter not defined, everything seems
> fine,
> but when we prepared statement with type of parameter defined it happened
> again.
> We can also see from the plan that rows were actually filtered while
> executing the second statement.
> 
> Hope this will be easier to track it down now,

Thanks for that explanation.  I was wrong in my previous email in saying
that having the extra cast (::cstring) isn't related to partition pruning
failing.  It IS related, as Andrew pointed out in his reply.  Having the
"::cstring" in the expression ('MONTH'::cstring::resolution) makes the
expression non-immutable, which means the planner won't use it to perform
pruning.  But, as you might be able to see, that's not really a bug of
partition pruning code, it's just how things work.

Andrew also mentioned a workaround in his reply to use setString such that
JDBC doesn't pass 'MONTH' as of type 'text' and instead as of unknown
type.  I could't find the documentation of Postgres JDBC's setString to
see how one can do that, but maybe you can try it out.

Thanks,
Amit



Re: BUG #15334: Partition elimination not working as expected when using enum as partition key

От
Andrew Gierth
Дата:
>>>>> "Amit" == Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

 Amit> Andrew also mentioned a workaround in his reply to use setString
 Amit> such that JDBC doesn't pass 'MONTH' as of type 'text' and instead
 Amit> as of unknown type. I could't find the documentation of Postgres
 Amit> JDBC's setString to see how one can do that, but maybe you can
 Amit> try it out.

Disclaimer: I don't do java stuff myself, all I know about this is from
reading the code.

First, there's a connection parameter stringtype=unspecified that
affects all setString calls on the connection. Second, there may be a
way to use setObject instead; setObject(1, str, Types.DISTINCT) looks
like it might work (or maybe Types.OTHER) - unless I'm completely
misreading the pgjdbc code, that should end up sending the right thing
to the server.

-- 
Andrew (irc:RhodiumToad)


Re: BUG #15334: Partition elimination not working as expected whenusing enum as partition key

От
Damir Ciganović-Janković
Дата:
Hey Amit and Andrew, thank you for helping me :-) Your explanations makes sense.
We have the way to get around it currently and are eagerly waiting for the 11 release :-) 

Once again, thank you for your hard work :-)
Damir

On Mon, Aug 20, 2018 at 8:06 AM Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>>>>> "Amit" == Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

 Amit> Andrew also mentioned a workaround in his reply to use setString
 Amit> such that JDBC doesn't pass 'MONTH' as of type 'text' and instead
 Amit> as of unknown type. I could't find the documentation of Postgres
 Amit> JDBC's setString to see how one can do that, but maybe you can
 Amit> try it out.

Disclaimer: I don't do java stuff myself, all I know about this is from
reading the code.

First, there's a connection parameter stringtype=unspecified that
affects all setString calls on the connection. Second, there may be a
way to use setObject instead; setObject(1, str, Types.DISTINCT) looks
like it might work (or maybe Types.OTHER) - unless I'm completely
misreading the pgjdbc code, that should end up sending the right thing
to the server.

--
Andrew (irc:RhodiumToad)