On Fri, Jul 1, 2016 at 9:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Jul 1, 2016 at 9:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Maybe, but neither UNION nor UNION ALL would duplicate the semantics
>>> of OR, so there's some handwaving here that I missed.
>
>> SELECT * FROM foo WHERE a = 5 OR a = 4
>> isn't equivalent to
>> SELECT * FROM foo WHERE a = 5
>> UNION
>> SELECT * FROM foo WHERE a = 4
>> ?
>
> It probably is, but you're assuming that "a" appears in the list of
> columns being unioned. If you make that just "SELECT b FROM ..."
> then the latter form gets rid of duplicate b values where the first
> doesn't. On the other hand, UNION ALL might introduce duplicates
> not present in the OR query's result.
Yeah. Also, even if you could parse out those cases, it's major
optimization fence. Consider if you have an ORDER BY clause here:
SELECT FROM foo WHERE a OR b ORDER BY c;
... by pushing inside a union, you're going to be in trouble in real
world cases. That's just a mess and it would add a lot of runtime
analysis of the alternative paths. It's hard for me to believe
rewriting is easier and simpler than rewriting 'false OR x' to 'x'. I
also thing that constant folding strategies are going to render much
more sensible output to EXPLAIN.
FYI, The query is something along the lines of
SELECT * FROM foo
WHERE ('a' = 'a' AND EXISTS ...) OR ('a' = 'b' AND EXISTS ...) OR ('a' = 'c' AND EXISTS ...)
...where the left side of the equality is a parameterized 'filter
mode' flag. That way the query can introduce filtering behaviors
without doing dynamic acrobatics.
merlin