Re: HashJoin w/option to unique-ify inner rel

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: HashJoin w/option to unique-ify inner rel
Дата
Msg-id 603c8f070904161728p60da50a0o4833fbe9ba1a4f94@mail.gmail.com
обсуждение исходный текст
Ответ на Re: HashJoin w/option to unique-ify inner rel  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Thu, Apr 16, 2009 at 7:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Upon further review, it appears that a big part of this problem is
>> that cost_hashjoin() doesn't understand that it needs cost semi-joins
>> differently from inner or left joins.
>
> Yeah, I have a note to look into that before 8.4 final.  The same is
> true for nestloops: stopping after hitting one match to the current
> outer can make a big difference, and that's not reflected in costsize.c
> yet.  I'm not sure whether cost_mergejoin needs to care.

Cool.  It's worth noting that this is also a problem for anti-joins,
which can also stop after hitting one match.

For merge join, it looks like you might save something if there are
duplicates in both the inner and outer lists.  If the outer side
contains 1,1,1,1,1,2 and the inner side contains 1,1,1,2,2, then an
INNER JOIN or LEFT JOIN will rescan the first three tuples of the
inner side three times, whereas a SEMI or ANTI JOIN will scan the
first tuple three times and then the next two only once.  But I'm not
sure if we can estimate this accurately enough to matter.

...Robert


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: HashJoin w/option to unique-ify inner rel
Следующее
От: Robert Haas
Дата:
Сообщение: Re: HashJoin w/option to unique-ify inner rel