Re: Problems with plan estimates in postgres_fdw

Поиск
Список
Период
Сортировка
От Etsuro Fujita
Тема Re: Problems with plan estimates in postgres_fdw
Дата
Msg-id 5C6BBD7A.1030702@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: Problems with plan estimates in postgres_fdw  (Antonin Houska <ah@cybertec.at>)
Ответы Re: Problems with plan estimates in postgres_fdw  (Antonin Houska <ah@cybertec.at>)
Список pgsql-hackers
(2019/02/18 23:21), Antonin Houska wrote:
> Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp>  wrote:
>> (2019/02/15 21:46), Antonin Houska wrote:
>>> ok, I understand now. I assume that the patch
>>>
>>> https://www.postgresql.org/message-id/5C66A056.60007%40lab.ntt.co.jp
>>>
>>> obsoletes the code snippet we discussed above.
>>
>> Sorry, I don't understand this.  Could you elaborate on that?
>
> I thought that the assignments
>
> +             startup_cost += outerrel->reltarget->cost.startup;
>
> and
>
> +             run_cost += outerrel->reltarget->cost.per_tuple * input_rows;
>
> in the patch you posted in
> https://www.postgresql.org/message-id/5C66A056.60007%40lab.ntt.co.jp do
> replace
>
> +            startup_cost += foreignrel->reltarget->cost.startup;
>
> and
>
> +            run_cost += foreignrel->reltarget->cost.per_tuple * rows;
>
> respectively, which you originally included in the following part of
> 0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch:
>
> @@ -2829,6 +2872,22 @@ estimate_path_cost_size(PlannerInfo *root,
>               run_cost += foreignrel->reltarget->cost.per_tuple * rows;
>           }
>
> +        /*
> +         * If this is an UPPERREL_ORDERED step on the final scan/join
> +         * relation, the costs obtained from the cache wouldn't yet contain
> +         * the eval cost for the final scan/join target, which would have been
> +         * updated by apply_scanjoin_target_to_paths(); add the eval cost now.
> +         */
> +        if (fpextra&&  !IS_UPPER_REL(foreignrel))
> +        {
> +            /* The costs should have been obtained from the cache. */
> +            Assert(fpinfo->rel_startup_cost>= 0&&
> +                   fpinfo->rel_total_cost>= 0);
> +
> +            startup_cost += foreignrel->reltarget->cost.startup;
> +            run_cost += foreignrel->reltarget->cost.per_tuple * rows;
> +        }
> +

Maybe, my explanation in that thread was not enough, but the changes 
proposed by the patch posted there wouldn't obsolete the above.  Let me 
explain using a foreign-table variant of the example shown in the 
comments for make_group_input_target():

     SELECT a+b, sum(c+d) FROM foreign_table GROUP BY a+b;

When called from postgresGetForeignPaths(), the reltarget for the base 
relation foreign_table would be {a, b, c, d}, for the same reason as the 
LIMIT example in [1], and the foreign_table's rel_startup_cost and 
rel_total_cost would be calculated based on this reltarget in that 
callback routine.  (The tlist eval costs would be zeroes, though).  BUT: 
before called from postgresGetForeignUpperPaths() with the 
UPPERREL_GROUP_AGG stage, the reltarget would be replaced with {a+b, c, 
d}, which is the final scan target as explained in the comments for 
make_group_input_target(), and the eval costs of the new reltarget 
wouldn't be zeros, so the costs of scanning the foreign_table would need 
to be adjusted to include the eval costs.  That's why I propose the 
assignments in estimate_path_cost_size() shown above.

On the other hand, the code bit added by 
0001-postgres_fdw-Perform-UPPERREL_ORDERED-step-remotely-v3.patch 
handles the case where a post-scan/join processing step other than 
grouping/aggregation (ie, the final sort or LIMIT/OFFSET) is performed 
directly on a base or join relation, as in the LIMIT example.  So, the 
two changes are handling different cases, hence both changes would be 
required.

(I think it might be possible that we can merge the two changes into one 
by some refactoring of estimate_path_cost_size() or something else, but 
I haven't considered that hard yet.  Should we do that?)

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/5C669DFD.30002%40lab.ntt.co.jp



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: Copy function for logical replication slots
Следующее
От: Simon Riggs
Дата:
Сообщение: Re: Speed up transaction completion faster after many relations areaccessed in a transaction