Обсуждение: missing optimization - column <> column

Поиск

Список

Период

Сортировка

missing optimization - column <> column

От

Pavel Stehule

Дата:

05 декабря 2016 г., 15:19:04

I found some crazy queries in one customer application. These queries are stupid, but it was surprise for me so there are not some simple optimization

create table foo(a int);
insert into foo select generate_series(1,100000);
analyze foo;
explain select * from foo where a <> a;

It does full scan of foo, although it should be replaced by false in planner time.

Same issue is a expression a = a .. can be replaced by true

Regards

Pavel

Re: missing optimization - column <> column

От

Tom Lane

Дата:

05 декабря 2016 г., 15:24:28

Pavel Stehule <pavel.stehule@gmail.com> writes:
> I found some crazy queries in one customer application. These queries are
> stupid, but it was surprise for me so there are not some simple optimization

> create table foo(a int);
> insert into foo select generate_series(1,100000);
> analyze foo;
> explain select * from foo where a <> a;

> It does full scan of foo, although it should be replaced by false in
> planner time.

> Same issue is a expression a = a .. can be replaced by true

Wrong; those expressions yield NULL for NULL input.  You could perhaps
optimize them slightly into some form of is-null test, but it hardly
seems worth the planner cycles to check for.

If you write something like "1 <> 1", it will be folded.
        regards, tom lane

Re: missing optimization - column <> column

От

Stephen Frost

Дата:

05 декабря 2016 г., 15:25:02

Pavel,

* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> I found some crazy queries in one customer application. These queries are
> stupid, but it was surprise for me so there are not some simple optimization
>
> create table foo(a int);
> insert into foo select generate_series(1,100000);
> analyze foo;
> explain select * from foo where a <> a;
>
> It does full scan of foo, although it should be replaced by false in
> planner time.

a <> a could go to NULL.  Obviously, that'll be false for such a simple
case, but it might not work out that way in a more complicated WHERE
clause.

> Same issue is a expression a = a .. can be replaced by true

a = a can't be replaced unless you know that 'a' can't be NULL.

In short, fix the application.

Thanks!

Stephen

Re: missing optimization - column <> column

От

Pavel Stehule

Дата:

05 декабря 2016 г., 15:29:33

2016-12-05 16:24 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:
> I found some crazy queries in one customer application. These queries are
> stupid, but it was surprise for me so there are not some simple optimization

> create table foo(a int);
> insert into foo select generate_series(1,100000);
> analyze foo;
> explain select * from foo where a <> a;

> It does full scan of foo, although it should be replaced by false in
> planner time.

> Same issue is a expression a = a .. can be replaced by true

Wrong; those expressions yield NULL for NULL input. You could perhaps
optimize them slightly into some form of is-null test, but it hardly
seems worth the planner cycles to check for.

understand

If you write something like "1 <> 1", it will be folded.

it works, but a <> a not

Regards

Pavel

regards, tom lane

Re: missing optimization - column <> column

От

Pavel Stehule

Дата:

05 декабря 2016 г., 15:36:01

2016-12-05 16:23 GMT+01:00 Stephen Frost <sfrost@snowman.net>:

Pavel,

* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> I found some crazy queries in one customer application. These queries are
> stupid, but it was surprise for me so there are not some simple optimization
>
> create table foo(a int);
> insert into foo select generate_series(1,100000);
> analyze foo;
> explain select * from foo where a <> a;
>
> It does full scan of foo, although it should be replaced by false in
> planner time.

a <> a could go to NULL. Obviously, that'll be false for such a simple
case, but it might not work out that way in a more complicated WHERE
clause.

it should be false everywhere

I don't defend a design of the application - it is exactly wrong. But sometimes, it can be generated by some tool or it can be a human error. And bad performance can be a big problems on systems, where you cannot to deploy fix simply.

It is hard to say what should be good design - because these queries are slow, I know so these queries are wrong, but these queries does significant IO utilization - and I cannot to fix the application, because I am not a author and the fix will not be available in next week.

Regards

Pavel

> Same issue is a expression a = a .. can be replaced by true

a = a can't be replaced unless you know that 'a' can't be NULL.

In short, fix the application.

Thanks!

Stephen

Re: missing optimization - column <> column

От

Stephen Frost

Дата:

05 декабря 2016 г., 15:42:26

Pavel,

* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> 2016-12-05 16:23 GMT+01:00 Stephen Frost <sfrost@snowman.net>:
> > * Pavel Stehule (pavel.stehule@gmail.com) wrote:
> > > I found some crazy queries in one customer application. These queries are
> > > stupid, but it was surprise for me so there are not some simple
> > optimization
> > >
> > > create table foo(a int);
> > > insert into foo select generate_series(1,100000);
> > > analyze foo;
> > > explain select * from foo where a <> a;
> > >
> > > It does full scan of foo, although it should be replaced by false in
> > > planner time.
> >
> > a <> a could go to NULL.  Obviously, that'll be false for such a simple
> > case, but it might not work out that way in a more complicated WHERE
> > clause.
> >
>
> it should be false everywhere

No, it's NULL, not false, if 'a' is NULL:

=# SELECT 1 WHERE (NULL <> NULL) IS NULL;?column?
----------       1
(1 row)

=*# SELECT 1 WHERE (FALSE) IS NULL;?column?
----------
(0 rows)

You can not make the assumption that 'a <> a' is always false.

Thanks!

Stephen

Re: missing optimization - column <> column

От

Pavel Stehule

Дата:

05 декабря 2016 г., 15:47:48

2016-12-05 16:41 GMT+01:00 Stephen Frost <sfrost@snowman.net>:

Pavel,

* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> 2016-12-05 16:23 GMT+01:00 Stephen Frost <sfrost@snowman.net>:
> > * Pavel Stehule (pavel.stehule@gmail.com) wrote:
> > > I found some crazy queries in one customer application. These queries are
> > > stupid, but it was surprise for me so there are not some simple
> > optimization
> > >
> > > create table foo(a int);
> > > insert into foo select generate_series(1,100000);
> > > analyze foo;
> > > explain select * from foo where a <> a;
> > >
> > > It does full scan of foo, although it should be replaced by false in
> > > planner time.
> >
> > a <> a could go to NULL. Obviously, that'll be false for such a simple
> > case, but it might not work out that way in a more complicated WHERE
> > clause.
> >
>
> it should be false everywhere

No, it's NULL, not false, if 'a' is NULL:

=# SELECT 1 WHERE (NULL <> NULL) IS NULL;
?column?
----------
1
(1 row)

=*# SELECT 1 WHERE (FALSE) IS NULL;
?column?
----------
(0 rows)

You can not make the assumption that 'a <> a' is always false.

ok - when the expression is tested on NULL, then a <> a should not be reduced. But when there are not this test, then I can replace this expression by false.

Regards

Pavel

Thanks!

Stephen

Re: missing optimization - column <> column

От

Pantelis Theodosiou

Дата:

05 декабря 2016 г., 16:04:54

On Mon, Dec 5, 2016 at 3:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:
> I found some crazy queries in one customer application. These queries are
> stupid, but it was surprise for me so there are not some simple optimization

> create table foo(a int);
> insert into foo select generate_series(1,100000);
> analyze foo;
> explain select * from foo where a <> a;

> It does full scan of foo, although it should be replaced by false in
> planner time.

> Same issue is a expression a = a .. can be replaced by true

Wrong; those expressions yield NULL for NULL input. You could perhaps
optimize them slightly into some form of is-null test, but it hardly
seems worth the planner cycles to check for.

If you write something like "1 <> 1", it will be folded.

regards, tom lane

Would it be worth replacing the condition with the equivalent?

I mean would that help optimizing better some queries when it knows that a is (not) nullable or when "a" is more complicated expression?

a <> a : (a IS NULL) AND NULL

a = a : (a IS NOT NULL) OR NULL

Pantelis Theodosiou

Re: missing optimization - column <> column

От

Corey Huinker

Дата:

05 декабря 2016 г., 19:02:47

Would it be worth replacing the condition with the equivalent?
I mean would that help optimizing better some queries when it knows that a is (not) nullable or when "a" is more complicated expression?

a <> a : (a IS NULL) AND NULL
a = a : (a IS NOT NULL) OR NULL

I think you're looking for

a IS DISTINCT FROM a

And that will work for cases where a might be null.

I have no opinion about whether adding such a test to the planner is worth it.

Re: missing optimization - column <> column

От

Pantelis Theodosiou

Дата:

05 декабря 2016 г., 20:59:44

On Mon, Dec 5, 2016 at 7:02 PM, Corey Huinker <corey.huinker@gmail.com> wrote:

Would it be worth replacing the condition with the equivalent?
I mean would that help optimizing better some queries when it knows that a is (not) nullable or when "a" is more complicated expression?

a <> a : (a IS NULL) AND NULL
a = a : (a IS NOT NULL) OR NULL

I think you're looking for

a IS DISTINCT FROM a

And that will work for cases where a might be null.

I have no opinion about whether adding such a test to the planner is worth it.

No, (a IS DISTINCT FROM a) will evaluate to FALSE when a is NULL. The other conditions (a <> a) , ((a IS NULL) AND NULL) will evaluate to NULL.

Re: missing optimization - column <> column

От

Serge Rielau

Дата:

05 декабря 2016 г., 20:59:59

Actually there are lots of things that can be done with this sort of theorem proving.

And NULL is a plenty good answer for a filter, just not for a check constraint.

Amongst them INSERT through UNION ALL for symmetric views which can be handy for FDW partitioned tables.

One such implementation an be found here:

https://www.google.com/patents/US6728952 (apparently expired)

Cheers

Serge

Salesforce.com

On Dec 5, 2016, at 7:28 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2016-12-05 16:24 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I found some crazy queries in one customer application. These queries are
> stupid, but it was surprise for me so there are not some simple optimization

> create table foo(a int);
> insert into foo select generate_series(1,100000);
> analyze foo;
> explain select * from foo where a <> a;

> It does full scan of foo, although it should be replaced by false in
> planner time.

> Same issue is a expression a = a .. can be replaced by true

Wrong; those expressions yield NULL for NULL input. You could perhaps
optimize them slightly into some form of is-null test, but it hardly
seems worth the planner cycles to check for.

understand

If you write something like "1 <> 1", it will be folded.

it works, but a <> a not

Regards

Pavel

regards, tom lane

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: missing optimization - column <> column