Re: [HACKERS] postgres_fdw bug in 9.6
От | Etsuro Fujita |
---|---|
Тема | Re: [HACKERS] postgres_fdw bug in 9.6 |
Дата | |
Msg-id | b06b113e-73a1-fa5e-91b6-28bda2551a08@lab.ntt.co.jp обсуждение исходный текст |
Ответ на | Re: [HACKERS] postgres_fdw bug in 9.6 (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>) |
Ответы |
Re: [HACKERS] postgres_fdw bug in 9.6
(Tom Lane <tgl@sss.pgh.pa.us>)
|
Список | pgsql-hackers |
On 2016/12/13 23:13, Ashutosh Bapat wrote: > On Tue, Dec 13, 2016 at 4:15 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I believe there are probably more problems here, or at least if there >> aren't, it's not clear why not. Because of GetExistingLocalJoinPath's >> lack of curiosity about what's underneath the join pathnode it picks, >> it seems to me that it's possible for it to return a path tree that >> *isn't* all local joins. If we're looking at, say, a hash path for >> a 4-way join, whose left input is a hash path for a 3-way join, whose >> left input is a 2-way foreign join, what's stopping that from being >> returned as a "local" path tree? > Right, the so called "local join tree" can contain ForeignPath > representing joins between foreign relations. AFAIU what protects us > from getting into problem is that postgresRecheckForeignScan calls > ExecProcNode() on the outer plan. So, even if there are ForeignPaths > in fdw_outerpath tree, corresponding plans would use a local join plan > to apply EPQ recheck. The code in GetExistingLocalJoinPath() avoids > the redirection at least for the inner or outer ForeignPaths. I think Ashutosh is right. > Any > comment claiming that path tree under fdw_outerpath is entirely > "local" should be removed, if it survives the fix. +1 >> Likewise, it seems like the code is trying to reject any custom-path join >> types, or at least this barely-intelligible comment seems to imply that: >> >> * Just skip anything else. We don't know if corresponding >> * plan would build the output row from whole-row references >> * of base relations and execute the EPQ checks. >> >> But this coding fails to notice any such join type that's below the >> level of the immediate two join inputs. I wrote the first version of GetExistingLocalJoinPath (and postgresRecheckForeignScan), to verify that the changes to core for pushing down foreign/custom joins in 9.5 would work well for EPQ rechecks. And I rejected custom paths in that version because I intended to use that function not only for foreign joins but custom joins. (IIRC, for 9.6, I proposed to put GetExistingLocalJoinPath in a more common area such as src/backend/optimizer/path/joinpath.c, but that was ignored...) >> I kind of wonder why this infrastructure exists at all; it's not the way >> I'd have foreseen handling EPQ for remote joins. However, while "throw >> it away and start again" might be good advice going forward, I suppose >> it won't be very popular for applying to 9.6. >> >> One way that we could make things better is to rely on the knowledge >> that EPQ isn't asked to evaluate joins for more than one row per input >> relation, and therefore using merge or hash join technology is really >> overkill. We could make a tree of simple nestloop joins, which aren't >> going to care about sort order, if we could identify the correct join >> clauses to apply. At least some of that could be laid off on the FDW, >> which if it's gotten this far at all, ought to know what join clauses >> need to be enforced by the foreign join. So I'm thinking a little bit >> in terms of "just collect the foreign scans for all the base rels >> included in the join and then build a cross-product nestloop join tree, >> applying all the join clauses at the top". This would have the signal >> value that it's guaranteed to succeed and so can be left for later, >> rather than having to expensively redo it at each level of join planning. > I am not able to understand how this strategy of applying all join > clauses on top of cross product would work if there are OUTER joins > involved. Wouldn't nullable sides change the result? I'm not able to understand that, either. >> (Hm, that does sound a lot like "throw it away and start again", doesn't >> it. But what we've got here is busted enough that I'm not sure there >> are good alternatives. Maybe for 9.6 we could just rewrite >> GetExistingLocalJoinPath, and limp along doing a lot of redundant >> computation during planning.) An alternative I was thinking was (1) to create an equivalent nestloop join path for each foreign/custom join path and store it in fdw_outerpath as-is, except when that join path implements a full join, in which case a merge or hash join path is created, and (2) to apply postgresRecheckForeignScan to the outer subplan created from the fdw_outerpath when executing EPQ rechecks. So, I was thinking to provide a helper function that creates the equivalent local join path for a given foreign/custom join path in #1. > A possible short-term fix may be this: instead of picking any random > path to stick into fdw_outerpath, we choose a path which covers the > pathkeys of ForeignPath. If postgres_fdw was able to come up with a > ForeignPath with those pathkeys, there is high chance that there > exists a local path with those pathkeys. Since we are yet to call > add_path() with the ForeignPath being built, that local path should > exist in the path list. Stick that path in fdw_outerpath. If such a > path doesn't exist, return NULL from GetExistingLocalJoinPath(). > postgres_fdw wouldn't push down the join in that case and we will > loose some optimization, but that might be better than throwing an > error. Seems reasonable. Best regards, Etsuro Fujita
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Alexander LawДата:
Сообщение: [HACKERS] Re: [COMMITTERS] pgsql: Build HTML documentation using XSLTstylesheets by default
Следующее
От: Heikki LinnakangasДата:
Сообщение: Re: pg_authid.rolpassword format (was Re: [HACKERS] Passwordidentifiers, protocol aging and SCRAM protocol)