Re: [HACKERS] Range Merge Join v1

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: [HACKERS] Range Merge Join v1
Дата
Msg-id CAMp0ubdY_6k11V_u4AHDPOySNbuTmM=kasNj90v5jyM3fa-nSA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Range Merge Join v1  (Jeff Davis <pgsql@j-davis.com>)
Ответы Re: [HACKERS] Range Merge Join v1  (Michael Paquier <michael.paquier@gmail.com>)
Список pgsql-hackers
On Thu, Aug 31, 2017 at 1:52 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> Updated patch attached. Changelog:
>
> * Rebased
> * Changed MJCompare to return an enum as suggested, but it has 4
> possible values rather than 3.
> * Added support for joining on contains or contained by (@> or <@) and
> updated tests.

The current patch does not work well with multiple keys, and I believe
it's important to solve because it will make it usable for
multi-dimension spatial joins.

The problem is this: the algorithm for a single key demands that the
input is sorted by (lower(r) NULLS FIRST, upper(r) NULLS LAST). That's
easy, because that's also the sort order for the range operator class,
so everything just works.

For multiple keys, the order of the input is:  lower(r1) NULLS FIRST, lower(r2) NULLS FIRST,  upper(r1) NULLS LAST,
upper(r2)NULLS LAST
 

But that can't match up with any opclass, because an opclass can only
order one attribute at a time. In this case, the lower bound of r2 is
more significant than the upper bound of r1.

It's easy enough to adapt the execution to make this work as long as
the input is properly sorted. The challenge is making the optimizer
choose the sort keys properly.

I have tried a few approaches. The current approach that I'm using is:
* have a new, special range operator family with no operator classes
* in check_mergejoinable(), detect the && operator and set the
mergeopfamilies to contain only the special operator family
* in try_mergejoin_path(), it will convert the pathkeys using that
operator class into pathkeys over a "lower" expression over the same
EC; and then add on additional pathkeys for the "upper" expressions
onto the end of the pathkey list

Any comments or alternative suggestions welcome. This will probably
take a few days at least, so I put the patch in "waiting on author"
state.

Regards,    Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Chris Travers
Дата:
Сообщение: Re: [HACKERS] Add Roman numeral conversion to to_number
Следующее
От: Dmitriy Sarafannikov
Дата:
Сообщение: [HACKERS] Improving DISTINCT with LooseScan node