Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

From: Kouhei Kaigai
Subject: Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Msg-id: 9A28C8860F777E439AA12E8AEA7694F8010CDAD4@BPXM15GP.gisp.nec.co.jp
In response to: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Responses: Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)  (Shigeru HANADA <shigeru.hanada@gmail.com>)
List: pgsql-hackers
> On 2015/04/09 10:48, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> * merge_fpinfo()
> >>> It seems to me fpinfo->rows should be joinrel->rows, and
> >>> fpinfo->width also should be joinrel->width.
> >>> No need to have special intelligence here, isn't it?
> >>
> >>
> >> Oops. They are vestiges of my struggle, which disabled the SELECT clause
> >> optimization (omitting unused columns).  Now width and rows are inherited from
> >> joinrel.  Besides that, fdw_startup_cost and fdw_tuple_cost seemed wrong, so I
> >> fixed them to use a simple sum, not an average.
> >>
> > fpinfo->fdw_startup_cost represents the cost of opening a connection to the
> > remote PostgreSQL server, doesn't it?
> >
> > postgres_fdw.c:1757 says as follows:
> >
> >    /*
> >     * Add some additional cost factors to account for connection overhead
> >     * (fdw_startup_cost), transferring data across the network
> >     * (fdw_tuple_cost per retrieved row), and local manipulation of the data
> >     * (cpu_tuple_cost per retrieved row).
> >     */
> >
> > If so, does a ForeignScan that involves 100 underlying relations take 100
> > times as many heavy network operations on startup? Probably not.
> > I think an average is better than a sum, and the max of them would reflect
> > the cost more accurately.
> 
> In my current opinion, no. Though I remember that I've written such comments
> before :P.
> 
> Connection establishment occurs only once, on the very first access to the server,
> so in use cases with long-lived sessions (via psql, connection pooling, etc.),
> taking connection overhead into account *every time* seems too pessimistic.
> 
> Instead, for practical cases, fdw_startup_cost should account for the overhead of
> constructing the query and getting its first response (ideally excluding the
> retrieval of actual data).  These overheads are on the order of milliseconds.  I'm
> not sure what value is appropriate for the default, but 100 seems not so bad.
> 
> Anyway, fdw_startup_cost is a per-server setting, just like fdw_tuple_cost, and it
> should not be modified according to the width of the result, so using
> fpinfo_o->fdw_startup_cost would be OK.
>
Indeed, I forgot the connection cache mechanism. As long as we define
fdw_startup_cost as you mentioned, it seems to me your logic is heuristically
reasonable.
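
To make the difference between the two strategies concrete, here is a minimal sketch. The helper names are hypothetical and are not taken from the patch or from postgres_fdw.c; the value 100 is the default mentioned above, used only as an example. It compares charging the startup cost once per underlying relation (the sum) with charging the per-server value once for the whole join.

```c
#include <assert.h>

/*
 * Hypothetical helper: startup cost when every underlying relation of the
 * join is charged separately (the "sum" strategy).
 */
static double
startup_cost_summed(double per_server_startup_cost, int nrels)
{
    assert(nrels > 0);
    return per_server_startup_cost * nrels;
}

/*
 * Hypothetical helper: startup cost when the join simply reuses the
 * per-server setting, whatever the number of underlying relations.
 */
static double
startup_cost_inherited(double per_server_startup_cost, int nrels)
{
    assert(nrels > 0);
    (void) nrels;               /* deliberately not part of the result */
    return per_server_startup_cost;
}
```

With a per-server value of 100 and 100 underlying relations, the summed variant charges 10000 while the inherited variant stays at 100, which is the behaviour argued for above.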

> > Also, fdw_tuple_cost represents the cost of data transfer over the network.
> > I think a weighted average is the best strategy, like:
> >  fpinfo->fdw_tuple_cost =
> >    (fpinfo_o->width / (fpinfo_o->width + fpinfo_i->width)) * fpinfo_o->fdw_tuple_cost +
> >    (fpinfo_i->width / (fpinfo_o->width + fpinfo_i->width)) * fpinfo_i->fdw_tuple_cost;
> >
> > That's just my suggestion. Please apply the best way you thought.
> 
> I can't agree with that strategy, because 1) width 0 causes per-tuple cost 0, and
> 2) fdw_tuple_cost never varies within a foreign server.  Using
> fpinfo_o->fdw_tuple_cost (it must be identical to fpinfo_i->fdw_tuple_cost) seems
> reasonable.  Thoughts?
>
OK, you are right.
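
For reference, a minimal sketch of the two points above. SketchFpInfo and both functions are hypothetical, simplified stand-ins and not the actual merge_fpinfo() from the patch; widths are taken as double here only to keep the arithmetic visible. The weighted average gives a side with width 0 no weight at all, degenerates to 0/0 when both widths are 0, and otherwise just returns the shared per-server value, so inheriting the outer side's costs loses nothing.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the per-relation planning info. */
typedef struct SketchFpInfo
{
    double rows;
    double width;
    double fdw_startup_cost;
    double fdw_tuple_cost;
} SketchFpInfo;

/* The weighted-average proposal, written out with floating-point widths. */
static double
tuple_cost_weighted(double width_o, double cost_o,
                    double width_i, double cost_i)
{
    double total = width_o + width_i;   /* 0 when both widths are 0 */

    return (width_o / total) * cost_o + (width_i / total) * cost_i;
}

/*
 * The agreed-upon approach: rows and width come from the join relation,
 * and the cost factors are inherited from the outer side.  Assumes both
 * sides belong to the same foreign server, so their settings match.
 */
static SketchFpInfo
sketch_merge_fpinfo(const SketchFpInfo *outer, const SketchFpInfo *inner,
                    double joinrel_rows, double joinrel_width)
{
    SketchFpInfo joined;

    assert(outer->fdw_startup_cost == inner->fdw_startup_cost);
    assert(outer->fdw_tuple_cost == inner->fdw_tuple_cost);

    joined.rows = joinrel_rows;
    joined.width = joinrel_width;
    joined.fdw_startup_cost = outer->fdw_startup_cost;
    joined.fdw_tuple_cost = outer->fdw_tuple_cost;
    return joined;
}
```

The assertions in sketch_merge_fpinfo() encode the same-server assumption; a join across servers would not be pushed down in the first place.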

I think it is time to hand the patch review over to the committers.
So, let me mark it "ready for committers".

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
