use_remote_estimate usage for join pushdown in postgres_fdw

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема use_remote_estimate usage for join pushdown in postgres_fdw
Дата
Msg-id CAFjFpRepSC2e3mZ1uYSopJD6R19fOZ0dNNf9Z=gnyKSB6wGk5g@mail.gmail.com
обсуждение исходный текст
Ответы Re: use_remote_estimate usage for join pushdown in postgres_fdw  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Hi All,
postgres_fdw documentation says following about use_remote_estimate (http://www.postgresql.org/docs/devel/static/postgres-fdw.html)
--
use_remote_estimate
This option, which can be specified for a foreign table or a foreign server, controls whether postgres_fdw issues remote EXPLAIN commands to obtain cost estimates. A setting for a foreign table overrides any setting for its server, but only for that table. The default is false.
--

I am trying to see, how should we use this option in the context of join pushdown and for
that matter any pushdown involving more than one table.

I came up with following arguments
1. Foreign base relations derive their use_remote_estimate setting either from the server setting or the per table setting. A join between two foreign relations should derive its use_remote_estimate setting from the joining relations (recursively). This means that we will use EXPLAIN to estimate costs of join if "all" the involved base foreign relations have use_remote_estimate true (either they derive it from the server level setting or table level setting).

2. Similar to 1, but use EXPLAIN to estimate costs if "any" of the involved base foreign relations have use_remote_estimate is true.

3. Since join between two foreign relations is not a table level phenomenon, but a server level phenomenon, we should use server level setting. This means that we will use EXPLAIN output to estimate costs of join if the foreign server has use_remote_estimate true, irrespective of the setting for individual foreign relations involved in that join.

Unfortunately the documentation and comments in code do not say much about the intention (i.e. why and how is this setting expected to be used) of this setting in the context or server.

The intention behind server level setting is more confusing. It does not override table level setting, so it is not intended to be used for a prohibitive reason like e.g. server doesn't support EXPLAIN the way it will be interpreted locally. It seems to act more like a default in case table level setting is absent. User may set table level use_remote_estimate to true, if cost of EXPLAIN is very small compared to that of table scan (with or without indexes) or adding conditional clauses to the query alters the costs heavily that the cost of EXPLAIN itself is justified. But I can be wrong about these intentions.

If we go by the above intention behind table level setting, 2nd argument makes more sense as the table for which use_remote_estimate is true, can change the cost of join heavily because of the clauses in the join and it's better to get it from the foreign server than guessing it locally.

Comments/suggestions are welcome.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Error with index on unlogged table
Следующее
От: Etsuro Fujita
Дата:
Сообщение: Re: Remaining 9.5 open items