Re: Push down more full joins in postgres_fdw

Поиск
Список
Период
Сортировка
От Etsuro Fujita
Тема Re: Push down more full joins in postgres_fdw
Дата
Msg-id 3e5989a9-c322-cc0d-1476-cd0707675d10@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: Push down more full joins in postgres_fdw  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Ответы Re: Push down more full joins in postgres_fdw  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Re: Push down more full joins in postgres_fdw  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Список pgsql-hackers
On 2016/09/06 22:07, Ashutosh Bapat wrote:
> On Fri, Sep 2, 2016 at 3:55 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp <mailto:fujita.etsuro@lab.ntt.co.jp>> wrote:

>     On 2016/08/22 15:49, Ashutosh Bapat wrote:
>         1. deparsePlaceHolderVar looks odd - each of the deparse*
>         function is
>         named as deparse + <name of the parser node the string would parse
>         into>. PlaceHolderVar is not a parser node, so no string is
>         going to be
>         parsed as a PlaceHolderVar. May be you want to name it as
>         deparseExerFromPlaceholderVar or something like that.

>     The name "deparseExerFromPlaceholderVar" seems long to me.  How
>     about "deparsePlaceHolderExpr"?

> There isn't any node with name PlaceHolderExpr.

I'll rename it to "deparseExerInPlaceholderVar", then.

>         3. I think registerAlias stuff should happen really at the time of
>         creating paths and should be stored in fpinfo. Without that it
>         will be
>         computed every time we deparse the query. This means every time
>         we try
>         to EXPLAIN the query at various levels in the join tree and once
>         for the
>         query to be sent to the foreign server.

>     Hmm.  I think the overhead in calculating aliases would be
>     negligible compared to the overhead in explaining each remote query
>     for costing or sending the remote query for execution.  So, I
>     created aliases in the same way as remote params created and stored
>     into params_list in deparse_expr_cxt.  I'm not sure it's worth
>     complicating the code.

> We defer remote parameter creation till deparse time since the the
> parameter numbers are dependent upon the sequence in which we deparse
> the query. Creating them at the time of path creation and storing them
> in fpinfo is not possible because a. those present in the joining
> relations will conflict with each other and need some kind of
> serialization at the time of deparsing b. those defer for differently
> parameterized paths or paths with different pathkeys. We don't defer it
> because it's very light on performance.
>
> That's not true with the alias information. As long as we detect which
> relations need subqueries, their RTIs are enough to create unique
> aliases e.g. if a relation involves RTIs 1, 2, 3 corresponding subquery
> can have alias r123 without conflicting with any other alias.

Hmm.  But another concern about the approach of generating an subselect 
alias for a path, if needed, at the path creation time would be that the 
path might be rejected by add_path, which would result in totally 
wasting cycles for generating the subselect alias for the path.

> However minimum overhead it might have, we will deparse the query every
> time we create a path involving that kind of relation i.e. for different
> pathkeys, different parameterization and different joins in which that
> relation participates in. This number can be significant. We want to
> avoid this overhead if we can.

Exactly.  As I said above, I think the overhead would be negligible 
compared to the overhead in explaining each remote query for costing or 
the overhead in sending the final remote query for execution, though.

>         5. The blocks related to inner and outer relations in
>         deparseFromExprForRel() look same. We should probably separate
>         that code
>         out into a function and call it at two places.

>     Done.

> I see you have created function deparseOperandRelation() for the
> same. I guess, this should be renamed as deparseRangeTblRef() since it
> will create RangeTblRef node when parsed back.

OK, if there no opinions of others, I'll rename it to the name proposed 
by you, "deparseRangeTblRef".

>         6.
>         !     if (is_placeholder)
>         !         errcontext("placeholder expression at position %d in
>         select list",
>         !                    errpos->cur_attno);
>         A user wouldn't know what a placeholder expression is, there is
>         no such
>         term in the documentation. We have to device a better way to
>         provide an
>         error context for an expression in general.

>     Though I proposed that, I don't think that it's that important to
>     let users know that the expression is from a PHV.  How about just
>     saying "expression", not "placeholder expression", so that we have
>     the message "expression at position %d in select list" in the context?

> Hmm, that's better than placeholder expression, but not as explanatory
> as it should be since we won't be printing the "select list" in the
> error context and user won't have a clue as to what error context is
> about.

I don't think so.  Consider an example of the conversion error message, 
which is from the regression test:

SELECT  ft1.c1,  ft2.c2, ft1 FROM ft1, ft2 WHERE ft1.c1 = ft2.c1 AND 
ft1.c1 = 1;
ERROR:  invalid input syntax for integer: "foo"
CONTEXT:  whole-row reference to foreign table "ft1"

As shown in the example, the error message is displayed under a remote 
query for execution.  So, ISTM it's reasonable to print something like 
"expression at position %d in select list" in the context if an 
expression in a PHV.

> But I don't have any good suggestions right now. May be we should
> print the whole expression? But that can be very long, I don't know.

ISTM that it's a bit too expensive to print the whole expression in the 
error context.

> This patch tries to do two things at a time 1. support join pushdown for
> FULL join when the joining relations have remote conditions 2. better
> support for fetching placeholder vars, whole row references and some
> system columns. To make reviews easy, I think we should split the patch
> into two 1. supporting subqueries to be deparsed with support for one of
> the above (I will suggest FULL join support) 2. support for the other.

OK, will try.

Best regards,
Etsuro Fujita





В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: Bug in two-phase transaction recovery
Следующее
От: Etsuro Fujita
Дата:
Сообщение: Re: Push down more UPDATEs/DELETEs in postgres_fdw