Re: SQL/MED - file_fdw

Поиск

Список

Период

Сортировка

От	Shigeru HANADA
Тема	Re: SQL/MED - file_fdw
Дата	7 февраля 2011 г. 14:30:43
Msg-id	20110208003029.8736.6989961C@metrosystems.co.jp обсуждение исходный текст
Ответ на	Re: SQL/MED - file_fdw (Itagaki Takahiro <itagaki.takahiro@gmail.com>)
Ответы	Re: SQL/MED - file_fdw (Itagaki Takahiro <itagaki.takahiro@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

On Mon, 7 Feb 2011 21:00:53 +0900
Itagaki Takahiro <itagaki.takahiro@gmail.com> wrote:
> On Mon, Feb 7, 2011 at 16:01, Shigeru HANADA <hanada@metrosystems.co.jp> wrote:
> > This patch is based on latest FDW API patches which are posted in
> > another thread "SQL/MED FDW API", and copy_export-20110104.patch which
> > was posted by Itagaki-san.
>
> I have questions about estimate_costs().
>
> * What value does baserel->tuples have?
>   Foreign tables are never analyzed for now. Is the number correct?

No, baserel->tuples is always 0, and baserel->pages is 0 too.
In addition, width and rows are set to 100 and 1, as default values.
I think ANALYZE support is needed to make PostgreSQL's standard
optimization for foreign scans.  At present, estimation for foreign
tables would be awful.

> * Your previous measurement showed it has much more startup_cost.
>   When you removed ReScan, it took long time but planner didn't choose
>   materialized plans. It might come from lower startup costs.

I tested joining file_fdw tables, accounts and branches, which are
initialized with "pgbench -i -s 10" and exported to csv files.

At first, I tried adding random_page_cost (4.0) to startup_cost as
cost to open file (though it's groundless), but materialized was not
chosen.  After updating pg_class.reltuples to correct value, Hash-join
was choosen for same query.  With disabling Hash-join, finally
materialized was choosen.

ISTM that choosing simple nested loop would come from wrong
estimation of loop count, but not from startup cost.  IMHO, supporting
analyze (PG-style statistics) is necessary to make PostgreSQL to
generate sane plan.

> * Why do you use lstat() in it?
>   Even if the file is a symlink, we will read the linked file in the
>   succeeding copy. So, I think it should be stat() rather than lstat().

Good catch!  Fixed version is attached.

Regards,
--
Shigeru Hanada

Вложения

file_fdw-20110208.patch.gz

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Cédric Villemain
Дата: 07 февраля 2011 г., 14:22:48
Сообщение: Re: Spread checkpoint sync

Следующее

От: Greg Smith
Дата: 07 февраля 2011 г., 14:44:18
Сообщение: Re: Spread checkpoint sync

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: SQL/MED - file_fdw

Вложения

Предыдущее

Следующее