Re: Wrong cost estimation for foreign tables join withuse_remote_estimate disabled

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: Wrong cost estimation for foreign tables join withuse_remote_estimate disabled
Дата
Msg-id CAFjFpRcFVSg7_F27SwVWHnckd6iXFVKLO7qnnFK5vzrA1W1k-Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Wrong cost estimation for foreign tables join withuse_remote_estimate disabled  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Список pgsql-hackers
Sorry here's preview link [1]

[1]
https://ashutoshpg.blogspot.com/b/post-preview?token=TCTIKGQBAAA.2iKpIUItkwZLkXiujvs0zad-DtDdKbwIdRFCGbac9--XbqcA-xnCdz4wmbD4hIaEHuyg5Xrz8eZq8ZNmw83yfQ.HXi__guM-7SzdIWi27QkjA&postId=4019325618679658571&type=POST

On Fri, Jun 22, 2018 at 4:00 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> On Fri, Jun 22, 2018 at 11:56 AM, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>>
>>
>> On 21.06.2018 20:08, Tom Lane wrote:
>>>
>>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes:
>>>>
>>>> The following very simple test reduce the problem with wrong cost
>>>> estimation:
>>>> create foreign table t1_fdw(x integer, y integer) server pg_fdw options
>>>> (table_name 't1', use_remote_estimate 'false');
>>>> create foreign table t2_fdw(x integer) server pg_fdw options (table_name
>>>> 't2', use_remote_estimate 'false');
>>>> It is possible to force Postgres to use correct plan by setting
>>>> "fdw_startup_cost" to some very large value (100000000 for example).
>>>> ...
>>>> Also correct plan is used when use_remote_estimate is true.
>>>
>>> If you are unhappy about the results with use_remote_estimate off, don't
>>> run it that way.  The optimizer does not have a crystal ball.
>>
>>
>> As I wrote, use_remote_estimate can not be used because in this case query
>> compilation time is unacceptable (10 seconds, while time of query execution
>> itself is ~200msec).
>> So the problem can be addressed in two ways:
>>
>> 1. Try to reduce time of remote estimation. I wonder why postgres_fdw sends
>> so much queries to remote server. For join of two tables there are 7
>> queries.
>> I suspect that for ~20 joined tables in the original query number of calls
>> is more than hundred,  so on wonder that it takes so much time.
>> 2. Try to make optimizer make better estimation of join cost based on local
>> statistic (please notice that ANALYZE is explicitly called for all foreign
>> tables and number of rows in the result was correctly calculated).
>>
>
> I think estimate_path_cost_size() is too pessimistic about how many
> times the join conditions are evaluated (Sorry, I have written that
> code when I was worked on join pushdown for postgres_fdw.)
>
>             /* Estimate of number of rows in cross product */
>             nrows = fpinfo_i->rows * fpinfo_o->rows;
>
> and somewhere down in the code
>            run_cost += nrows * join_cost.per_tuple;
>
> It assumes that the join conditions are run on the cross-product of
> the joining tables. In reality that never happens for large tables. In
> such cases the optimizer will choose either hash or merge join, which
> will apply join conditions only on a small portion of cross-product.
> But the reason it was written that way was the local server can not
> estimate the fraction of cross product on which the join conditions
> will be applied. May be we could assume that the join conditions will
> be applied to only 1% of the cross product, i.e. run_cost +=
> clamp_rows(nrows/100) * join_cost.per_tuple. With this change I think
> the cost of remote plan will be less than local plan.
>
> Here's a preview of blog, I am planning to publish soon, about this
> issue at [1]. It has a bit more details.
>
> [1]
https://www.blogger.com/blogger.g?blogID=5253679863234367862#editor/target=post;postID=4019325618679658571;onPublishedMenu=allposts;onClosedMenu=allposts;postNum=0;src=postname
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Bapat
Дата:
Сообщение: Re: Wrong cost estimation for foreign tables join withuse_remote_estimate disabled
Следующее
От: Konstantin Knizhnik
Дата:
Сообщение: Re: Wrong cost estimation for foreign tables join withuse_remote_estimate disabled