Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Дата
Msg-id CAFjFpRcp2Emv8u+jun2_Q=r5wdCigqHNNR8wehnBEK3gD6FU1Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Thu, Feb 4, 2016 at 2:42 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Feb 3, 2016 at 12:08 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> The patch implements your algorithm to deparse a query as described in
> previous mail. The logic is largely coded in deparseFromExprForRel() and
> foreign_join_ok(). The later one pulls up the clauses from joining relations
> and first one deparses the FROM clause recursively.

Cool!

+               /* Add outer relation. */
+               appendStringInfo(buf, "(%s", join_sql_o.data);
+
+               /* Add join type */
+               appendStringInfo(buf, " %s JOIN ",
get_jointype_name(fpinfo->jointype));
+
+               /* Add inner relation */
+               appendStringInfo(buf, "%s", join_sql_i.data);
+
+               /* Append ON clause; ON (TRUE) in case empty join clause list */
+               appendStringInfoString(buf, " ON ");

Uh, couldn't that all be done as a single appendStringInfo?


Done.
 
It seems a little tortured the way you're passing "relations" all the
way down the callstack from deparseSelectStmtForRel, and at each level
it might be NULL.  If you made a rule that the caller MUST pass a
StringInfo, then you could get rid of some conditional logic in
deparseFromExprForRel.   By the way, deparseSelectSql()'s header
comment could use an update to mention this additional argument.
Generally, it's helpful to say in each relevant function header
comment something like "May be NULL" or "Must not be NULL" in cases
like this to clarify the API contract.

Done.

How about building this string when we construct fpinfo? We will waste some cycles for base relations but we will have lesser arguments in deparsing routines. I have attached patch recursive_relations.patch implementing this idea. The patch can be applied on top of the attached patches.
 

Similarly, I would be inclined to continue to require that
deparseTargetList() have retrieved_attrs != NULL.  If the caller
doesn't want the value, it can pass a dummy variable and ignore the
return value.  This is of course slightly more cycles, but I think
it's unlikely to matter, and making the code simpler would be good.

Done.
 

+ * Function is the entry point to deparse routines while constructing
+ * ForeignScan plan or estimating cost and size for ForeignPath. It is called
+ * recursively to build SELECT statements for joining relations of a
pushed down
+ * foreign join.

"This function is the entrypoint to the routines, either when
constructing ForeignScan plan or when estimating" etc.

I have removed this comment altogether. The second sentence in the comment no more holds true as we are not calling this function recursively any more. The first statement too doesn't add much value, The thing that is says, was true even before join pushdown and at that time that sentence wasn't there. The opening comment says what that function does.
 

+ * tuple descriptor for the corresponding foreign scan. For a base relation,
+ * which is not part of a pushed down join, fpinfo->attrs_used can be used to
+ * construct SELECT clause, thus the function doesn't need tlist. Hence when
+ * tlist passed, the function assumes that it's constructing the SELECT
+ * statement to be part of a pushed down foreign join.

I thought you got rid of that assumption.  I think it should be gotten
rid of, and then the comment can go too.  If we're keeping the comment
for some reason, should be "doesn't need the tlist" and when "when the
tlist is passed".

Done. tlist will be used only for join relations. For base relations fpinfo->attrs_used will be used.
 

+ * 1, since those are the attribute numbers are in the corresponding scan.

Extra "are".  Should be: "Those are the attribute numbers in the
corresponding scan."

I don't have that comment in the patch anymore. Probably got removed as part of the other refactoring.
 

Would it be nuts to set fdw_scan_tlist in all cases?  Would the code
come out simpler than what we have now?

deparesTargetList has an optimization when whole-row reference appears. It doesn't include whole-row reference and instead includes all the attributes, so that whole-row reference can be constructed at the time of projection. We will have to mimic similar logic while creating fdw_scan_tlist for base relations. Otherwise, we will fetch larger row from the foreign table. I am still working on this part. Mostly will post it with the next patch.

In set_foreignscan_references(), we have
 1109     if (fscan->fdw_scan_tlist != NIL || fscan->scan.scanrelid == 0)
If we are to set fdw_scan_tlist for base relation, it would stamp the Vars with INDEX_VAR which would be undesirable. May be we should just change that condition to if (fscan->scan.scanrelid == 0). What do you think?

Attaching patches
pg_fdw_core_v8.patch: core changes
pg_fdw_join_v8.patch: postgres_fdw changes for join pushdown
pg_join_pd_v8.patch: combined patch for ease of testing.
recursive_relations.patch: for building relation description while constructing fpinfo.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Support for N synchronous standby servers - take 2
Следующее
От: Ashutosh Bapat
Дата:
Сообщение: Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)