Re: [HACKERS] Performance improvement for joins where outer side is unique

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [HACKERS] Performance improvement for joins where outer side is unique
Дата
Msg-id 1545.1485535460@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [HACKERS] Performance improvement for joins where outer side is unique  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: [HACKERS] Performance improvement for joins where outer side is unique  (Robert Haas <robertmhaas@gmail.com>)
Re: [HACKERS] Performance improvement for joins where outer side is unique  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-hackers
I wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
>> hmm. I'm having trouble understanding why this is a problem for Unique
>> joins, but not for join removal?

> Ah, you know what, that's just mistaken.  I was thinking that we
> short-circuited the join on the strength of the hash (or merge) quals
> only, but actually we check all the joinquals first.  As long as the
> uniqueness proof uses only joinquals and not conditions that will end up
> as otherquals, it's fine.

Actually, after thinking about that some more, it seems to me that there
is a performance (not correctness) issue here: suppose that we have
something like
select ... from t1 left join t2 on t1.x = t2.x and t1.y < t2.y

If there's a unique index on t2.x, we'll be able to mark the join
inner-unique.  However, short-circuiting would only occur after
finding a row that passes both joinquals.  If the y condition is
true for only a few rows, this would pretty nearly disable the
optimization.  Ideally we would short-circuit after testing the x
condition only, but there's no provision for that.

This might not be a huge problem for outer joins.  My sense of typical
SQL style is that the joinquals (ON conditions) are likely to be
exactly what you need to prove inner uniqueness, while random other
conditions will be pushed-down from WHERE and hence will be otherquals.
But I'm afraid it is quite a big deal for inner joins, where we dump
all available conditions into the joinquals.  We might need to rethink
that choice.

At least for merge and hash joins, it's tempting to think about a
short-circuit test being made after testing just the merge/hash quals.
But we'd have to prove uniqueness using only the merge/hash quals,
so the planning cost might be unacceptably high --- particularly for
merge joins which often don't use all available mergeable quals.
In the end I think we probably want to keep the short-circuit in the
same place where it is for SEMI/ANTI cases (which have to have it
exactly there for semantic correctness).

I'm afraid though that we may have to do something about the
irrelevant-joinquals issue in order for this to be of much real-world
use for inner joins.
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superusercheck
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: [HACKERS] pg_ls_dir & friends still have a hard-coded superusercheck