Обсуждение: Execute query with EXCEPT, INTERSECT as anti-join, join?

Поиск
Список
Период
Сортировка

Execute query with EXCEPT, INTERSECT as anti-join, join?

От
"Colin 't Hart"
Дата:
Hi,

I can't get Postgresql to execute a query with EXCEPT (or INTERSECT)
as an anti-join (or join).

Is this even possible?

If not currently possible, is this something we would like to have?

Cheers,

Colin



Re: Execute query with EXCEPT, INTERSECT as anti-join, join?

От
Tom Lane
Дата:
"Colin 't Hart" <colinthart@gmail.com> writes:
> I can't get Postgresql to execute a query with EXCEPT (or INTERSECT)
> as an anti-join (or join).

> Is this even possible?

No, and it probably won't ever be, since the semantics aren't the same.
EXCEPT/INTERSECT imply duplicate elimination.
        regards, tom lane



Re: Execute query with EXCEPT, INTERSECT as anti-join, join?

От
"Colin 't Hart"
Дата:
On 11 November 2013 14:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Colin 't Hart" <colinthart@gmail.com> writes:
>> I can't get Postgresql to execute a query with EXCEPT (or INTERSECT)
>> as an anti-join (or join).
>
>> Is this even possible?
>
> No, and it probably won't ever be, since the semantics aren't the same.
> EXCEPT/INTERSECT imply duplicate elimination.

Can't we just use DISTINCT for that?

Given a query

<query_1> EXCEPT <query_2>

isn't it always possible to rewrite this as

select distinct * from (<query_1>) q1 where not exists (select 1 from
(<query_2>) q2 where q1.col1 = q2.col1 and q1.col2 = c2.col2 and ...
and q1.colN = q2.colN)

?


Regards,

Colin



Re: Execute query with EXCEPT, INTERSECT as anti-join, join?

От
Tom Lane
Дата:
"Colin 't Hart" <colin@sharpheart.org> writes:
> On 11 November 2013 14:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> No, and it probably won't ever be, since the semantics aren't the same.
>> EXCEPT/INTERSECT imply duplicate elimination.

> Can't we just use DISTINCT for that?

If you have to do a DISTINCT it's not clear to me that you're going to get
much win.

(The bigger picture here is that pretty much zero optimization effort has
been spent on EXCEPT/INTERSECT, because they're just not used that much
compared to other places where we could put that effort.)
        regards, tom lane



Re: Execute query with EXCEPT, INTERSECT as anti-join, join?

От
"Colin 't Hart"
Дата:
On 11 November 2013 15:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Colin 't Hart" <colin@sharpheart.org> writes:
>> On 11 November 2013 14:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> No, and it probably won't ever be, since the semantics aren't the same.
>>> EXCEPT/INTERSECT imply duplicate elimination.
>
>> Can't we just use DISTINCT for that?
>
> If you have to do a DISTINCT it's not clear to me that you're going to get
> much win.
>
> (The bigger picture here is that pretty much zero optimization effort has
> been spent on EXCEPT/INTERSECT, because they're just not used that much
> compared to other places where we could put that effort.)

I'm asking because I just encountered several cases where the
anti-join was *much* faster. In each case <query_1>'s result was
relatively small compared to <query_2>'s result or the "related rows"
from <query_2> were a much smaller set than the whole result of
<query_2>. In these cases, when the executor new how the two halves of
the query were related -- and that's the crux here: by writing EXCEPT
the executor couldn't determine how the two halves of the query were
related -- the anti-join was about 1000 times faster.

I think it's similar to the NOT IN which most DBMSes solved about 20
years ago but before that everyone used to rewrite by hand as NOT
EXISTS: sometimes we want to write query as EXCEPT because it's
clearer but execute it as an anti-join with DISTINCT.

Would these be difficult to build in? While I know a lot about how
DBMS engines work I've not hacked at PG internals. I'd be more than
willing to look at it, but could use some pointers as to where to
start. In particular, does PG rewrite queries in any way? Is it
possible to simply "rewrite the query" and then pass to the optimizer
to see if it would result in a better plan?

if I can improve EXCEPT, the same could also be applied to INTERSECT
as a "DISTINCT join".

Thanks & regards,

Colin



Re: Execute query with EXCEPT, INTERSECT as anti-join, join?

От
Tom Lane
Дата:
"Colin 't Hart" <colinthart@gmail.com> writes:
> Would these be difficult to build in?

Well, you'd have to worry about the ALL cases, as well as how to determine
whether you're actually getting a win (which would probably be rather
tough, really, as the choice would have to be made before we've fired up
any of the planner machinery that supports statistical estimation :-().
The code that plans this is in src/backend/optimizer/prep/prepunion.c.
        regards, tom lane