Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Дата
Msg-id CA+TgmobOADxTbsCt-j+dDVefWGK1WxY4p8AVDp1Pz48_TX4XTA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Fri, Mar 13, 2015 at 8:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Mar 13, 2015 at 2:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I don't object to the concept, but I think that is a pretty bad place
>>> to put the hook call: add_paths_to_joinrel is typically called multiple
>>> (perhaps *many*) times per joinrel and thus this placement would force
>>> any user of the hook to do a lot of repetitive work.
>
>> Interesting point.  I guess the question is whether a some or all
>> callers are going to actually *want* a separate call for each
>> invocation of add_paths_to_joinrel(), or whether they'll be happy to
>> operate on the otherwise-complete path list.
>
> Hmm.  You're right, it's certainly possible that some users would like to
> operate on each possible pair of input relations, rather than considering
> the joinrel "as a whole".  Maybe we need two hooks, one like your patch
> and one like I suggested.

Let me attempt to summarize subsequent discussion on this thread by
saying the hook location that you proposed (just before set_cheapest)
has not elicited any enthusiasm from anyone else.  In a nutshell, the
problem is that a single callback for a large join problem is just
fine if there are no special joins involved, but in any other
scenario, nobody knows how to use a hook at that location for anything
useful.  To push down a join to the remote server, you've got to
figure out how to emit an SQL query for it.  To execute it with a
custom join strategy, you've got to know which of those joins should
have inner join semantics vs. left join semantics.  A hook/callback in
make_join_rel() or in add_paths_to_joinrel() makes that relatively
straightforward. Otherwise, it's not clear what to do, short of
copy-and-pasting join_search_one_level().  If you have a suggestion,
I'd like to hear it.

If not, I'm going to press forward with the idea of putting the
relevant logic in either add_paths_to_joinrel(), as previously
proposed, or perhaps up oe level in make_one_rel().  Either way, if
you don't need to be called multiple times per joinrel, you can stash
a flag inside whatever you hang off of the joinrel's fdw_private and
return immediately on every call after the first.  I think that's
cheap enough that we shouldn't get too stressed about it: for FDWs, we
only call the hook at all if everything in the joinrel uses the same
FDW, so it won't get called at all except for joinrels where it's
likely to win big; for custom joins, multiple calls are quite likely
to be useful and necessary, and if the hook burns too much CPU time
for the query performance you get out of it, that's the custom-join
provider's fault, not ours.  The current patch takes this approach one
step further and attempts FDW pushdown only once per joinrel.  It does
that because, while postgres_fdw DOES need the jointype and a valid
innerrel/outerrel breakdown to figure out what query to generate, it
does NOT every possible breakdown; rather, the first one is as good as
any other. But this might not be true for a non-PostgreSQL remote
database.  So I think it's better to call the hook every time and let
the hook return without doing anything if it wants.

I'm still not totally sure whether make_one_rel() is better than
add_paths_to_joinrel().  The current patch attempts to split the
difference by doing FDW pushdown from make_one_rel() and custom joins
from add_paths_to_joinrel().  I dunno why; if possible, those two
things should happen in the same place.  Doing it in make_one_rel()
makes for fewer arguments and fewer repetitive calls, but that's not
much good if you would have had a use for the extra arguments that
aren't computed until we get down to add_paths_to_joinrel().  I'm not
sure whether that's the case or not.  The latest version of the
postgres_fdw patch doesn't seem to mind not having extra_lateral_rels,
but I'm wondering if that's busted.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: Turning off HOT/Cleanup sometimes
Следующее
От: Kevin Grittner
Дата:
Сообщение: Re: Freeze avoidance of very large table.