Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

Поиск

Список

Период

Сортировка

От	Ashutosh Bapat
Тема	Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Дата	2 февраля 2016 г. 19:21:59
Msg-id	CAFjFpReTBA+RdDBV3pJ=0_SQ9QbNgPm_AyF79UV652bawHrQ0Q@mail.gmail.com обсуждение исходный текст
Ответ на	Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs) (Robert Haas <robertmhaas@gmail.com>)
Ответы	Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs) (Robert Haas <robertmhaas@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

On Tue, Feb 2, 2016 at 10:13 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Feb 1, 2016 at 6:48 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> More when I have a bit more time to look at this...

Why does deparseSelectStmtForRel change the order of the existing
arguments? I have no issue with adding new arguments as required, but
rearranging the existing argument order doesn't serve any useful
purpose that is immediately apparent.

deparseSelectStmtForRel has two sets of arguments, input and output. They are separated in the declaration all inputs come first, followed by all outputs. The inputs were ordered according to their appearance in SELECT statement, so I added tlist before remote_conds. I should have added relations, which is an output argument, at the end, but I accidentally added it between existing output arguments. Anyway, I will go ahead and just add the new arguments after the existing ones.

There's also no real need for
the rel -> foreignrel renaming.

That was an unintentional change during merge. Sorry for that. Reverted it.

+ /*
+ * If we have constructed the SELECT clause from the targetlist, construct
+ * the retrieved attributes list as continuously increasing list of
+ * integers.
+ */
+ if (retrieved_attrs && tlist)
+ {
+ int i;
+ *retrieved_attrs = NIL;
+ for (i = 1; i <= list_length(tlist); i++)
+ *retrieved_attrs = lappend_int(*retrieved_attrs, i);
+ }

This is really wonky. First, you pass retrieved_attrs to
deparseSelectSqlForBaseRel, but then you have this code which blows it
away and replaces it if tlist != NIL. So I guess this will run always
for a join relation, and for a base relation only sometimes. But
that's certainly not at all clear. I think you need to find some way
of rearranging this so that it's more obvious what is going on here.

I have pushed retrieved_attrs as an argument to deparseExplicitTargetList() and deparseSelectSqlForJoinRel() to keep the things consistent.

I suggest not renaming the existing deparseTargetList() and instead
coming up with a different name for the new thing you need, maybe
deparseExplicitTargetList().

Done.

How about adding another sentence to the header comment for
appendConditions() saying something like "This is used for both WHERE
clauses and for JOIN .. ON"?

Done.

+ * The targetlist is list of TargetEntry's which in turn contains Var nodes.

contain.

Done.

+/*
+ * Output join name for given join type */

Formatting.

Done.

This patch, overall, is badly in need of a pgindent run.

Sorry, I haven't run pgindent on the attached patches. But will do that next.

+ /*
+ * XXX
+ * Since the query is being built in recursive manner from bottom up,
+ * the FOR UPDATE/SHARE clause referring the base relations can
not be added
+ * at the top level. They need to be added to the subqueries corresponding
+ * to the base relations. This has an undesirable effect of locking more
+ * rows than specified by user, as it locks even those rows from base
+ * relations which are not part of the final join result. To avoid this
+ * undesirable effect, we need to build the join query without the
+ * subqueries, which for now, seems hard.
+ */

This is a second good reason that we should actually do that work
instead of complaining that it's too hard. The first one is that the
queries that come out of this right now are too long and hard to read.
I actually don't see why this is all that hard. Deparsing the target
list is simple enough; you just need to emit tab.col using varno to
determine what tab looks like and varattno to determine what col looks
like. The trickier part is emitting the FROM clause. But this doesn't
seem all that hard either. Suppose that when we construct the fpinfo
(PgFdwRelationInfo *) for a relation, we include in it a FROM clause
appropriate to that rel. So, for a baserel, it's something like "foo
r4" where 4 is foo's RTI. For a joinrel, do this:

1. Emit the FROM clause constructed for the outer relation,
surrounding it with parentheses if the outer relation is a joinrel.
2. Emit " JOIN ", " LEFT JOIN ", " RIGHT JOIN ", or " FULL JOIN "
according to the join type.
3. Emit the FROM clause constructed for the inner relation,
surrounding it with parentheses if the inner relation is a joinrel.
4. Emit " ON ".
5. Emit the joinqual.

This will produce nice things like (foo r3 JOIN bar r4 ON r3.x = r4.x)
JOIN baz r2 ON r3.y = r2.y

Then, you'd also need to stuff the conditions into the
PgFdwRelationInfo so that those could be added to paths constructed at
higher levels. But that's not too hard either. Basically you'd end
up with mostly the same stuff you have now, but the PgFdwRelationInfo
would store a join clause and a set of deparsed quals to be included
in the FROM and WHERE clauses respectively. And then you'd pull the
information from the inner and outer sides to build up what you need
at the joinrel level.

I was thinking on the similar lines except rN aliases. I think there will be problem for queries like
postgres=# explain verbose select * from lt left join (select bar.a, foo.b from bar left join foo on (bar.a = foo.a) where bar.b + foo.b < 10) q on (lt.b = q.b);
                                   QUERY PLAN
--------------------------------------------------------------------------------
Hash Right Join (cost=318.03..872.45 rows=43 width=16)
   Output: lt.a, lt.b, bar.a, foo.b
   Hash Cond: (foo.b = lt.b)
   -> Merge Join (cost=317.01..839.07 rows=8513 width=8)
         Output: bar.a, foo.b
         Merge Cond: (bar.a = foo.a)
         Join Filter: ((bar.b + foo.b) < 10)
         -> Sort (cost=158.51..164.16 rows=2260 width=8)
               Output: bar.a, bar.b
               Sort Key: bar.a
               -> Seq Scan on public.bar (cost=0.00..32.60 rows=2260 width=8)
                     Output: bar.a, bar.b
         -> Sort (cost=158.51..164.16 rows=2260 width=8)
               Output: foo.b, foo.a
               Sort Key: foo.a
               -> Seq Scan on public.foo (cost=0.00..32.60 rows=2260 width=8)
                     Output: foo.b, foo.a
   -> Hash (cost=1.01..1.01 rows=1 width=8)
         Output: lt.a, lt.b
         -> Seq Scan on public.lt (cost=0.00..1.01 rows=1 width=8)
               Output: lt.a, lt.b
(21 rows)

The subquery q is pulled up, so there won't be trace of q in the join tree except may be a useless RTE for the subquery. There will be RelOptInfo representing join between lt, bar and foo and a RelOptInfo for join between bar and foo. The join filter bar.b + foo.b < 10 needs to be evaluated before joining (bar, foo) with lt and should go with bar left join foo. But the syntax doesn't support something like "bar left join foo on (bar.a = foo.a) where bar.b + foo.b". So we will have to construct a SELECT statement for this join and add to the FROM clause with a subquery alias and then refer the columns of foo and bar with that subquery alias.

Further during the process of qual placement, quals that can be evaluated at the level of given relation in the join tree are attached to that relation if they can be pushed down. Thus if we see a qual attached to a given relation, AFAIU, we can not say whether it needs to be evaluated there (similar to above query) or planner pushed it down for optimization, and thus for every join relation with quals we will need to build subqueries with aliases.

I am still looking at how we can make this work.

This would actually be faster than what you've got right now, because
right now you're recursing down the whole join tree all over again at
each new join level, maybe not the best plan.

+ if (foreignrel->reloptkind == RELOPT_JOINREL)
+ return;
+
/* Add any necessary FOR UPDATE/SHARE. */
deparseLockingClause(&context);

Done.

Generally, I think in these kinds of cases it is better to reverse the
test and get rid of the return statement, like this:

if (foreignrel->reloptkind != RELOPT_JOINREL)
deparseLockingClause(&context);

The way you wrote it, somebody who wants to add more code to the end
of the function will probably have to make that change anyhow.
Really, your intention here was to skip that code for joins, not to

skip the rest of the function for baserels.

Yes, you are right. Actually before deparseLockingClause change was committed, all of that code was here, so this way worked. But I forgot to change it while merging. Thanks for pointing out.

@@ -1424,9 +1875,7 @@ deparseVar(Var *node, deparse_expr_cxt *context)
printRemoteParam(pindex, node->vartype, node->vartypmod, context);
}
else
- {
printRemotePlaceholder(node->vartype, node->vartypmod, context);
- }
}
}

Useless hunk.

Reverted the change.

+ * Constructing queries representating SEMI and ANTI joins is hard, hence

Typo.

Done. Thanks for pointing out the typos.

Attached patches
pg_fdw_core_v6.patch: core changes
pg_fdw_join_v6.patch: postgres_fdw changes for join pushdown
pg_join_pd_v6.patch: combined patch for ease of testing.

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Ashutosh Bapat
Дата: 02 февраля 2016 г., 19:18:16
Сообщение: Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

Следующее

От: Robert Haas
Дата: 02 февраля 2016 г., 19:24:03
Сообщение: Re: pgbench - allow backslash-continuations in custom scripts

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

Вложения

Предыдущее

Следующее