On Fri, Jul 21, 2017 at 10:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> But the bigger picture is that doing something that helps to any
> useful extent would require a really substantial amount of new,
> datatype- and operator-specific knowledge that doesn't exist in the
> system today. And as Craig noted, applying that knowledge would
> be expensive, even in cases where it failed to help.
>
> So, color me skeptical ...

I agree, but with a caveat. If somebody felt like doing all of that
work, and either made it cheap enough to justify enabling it by
default or added a controlling GUC, it'd be fine with me. We've
talked before about having knobs to adjust how hard the optimizer
tries to optimize things, and this would be a good candidate for such
a thing. The bigger issue from my perspective is that I really doubt
that anybody wants to put the effort into doing something like this in
a principled way.

Another very similar (but possibly easier) case is:
select * from pgbench_accounts where aid = 1.0;

This will use a sequential scan rather than an index scan, because the
query optimizer doesn't know that the only integer for which =(int4,
numeric) will return true is 1. Therefore it has to scan the whole
table one row at a time and check, for each one, whether the =
operator returns true. It can't cast the constant to an integer
because the user might have written 1.1 rather than 1.0, in which case
the cast would fail; but the query should return 0 rows, not ERROR.

You can imagine fixing this by having some kind of datatype-specific
knowledge that would replace "aid = 1.0" with "aid = 1" and "aid =
1.1" with "false"; it would also have to know that "aid = 9999999999"
should be changed to "false" because 9999999999 isn't representable as
int4.
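
The substitutions described above could be sketched roughly like this
(Python used purely for illustration; the function name, the
None-means-provably-false return convention, and the structure are all
hypothetical, not PostgreSQL code):

```python
from decimal import Decimal

INT4_MIN, INT4_MAX = -2**31, 2**31 - 1  # representable range of int4

def fold_int4_eq_numeric(const_text):
    """Fold the predicate 'int4col = <numeric constant>'.

    Returns an int suitable for a plain int4 equality comparison, or
    None when no int4 value can possibly equal the constant (i.e. the
    predicate can be replaced with constant false).
    """
    c = Decimal(const_text)
    if c != c.to_integral_value():
        return None   # e.g. aid = 1.1  ->  false
    i = int(c)
    if not (INT4_MIN <= i <= INT4_MAX):
        return None   # e.g. aid = 9999999999  ->  false (out of int4 range)
    return i          # e.g. aid = 1.0  ->  aid = 1, which can use the index
```

The point of the sketch is just that each branch needs knowledge specific
to the =(int4, numeric) operator pair; nothing like this generalizes for
free to other datatype combinations.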

I have, however, decided not to volunteer to be the one who works on
that project.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company