Re: [POC] FETCH limited by bytes.

Поиск
Список
Период
Сортировка
От Corey Huinker
Тема Re: [POC] FETCH limited by bytes.
Дата
Msg-id CADkLM=fwPC+r8p_LAe12OkgsUbyAQtq77SZ2vEv7LLMQ_mcccg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [POC] FETCH limited by bytes.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Ответы Re: [POC] FETCH limited by bytes.  (Michael Paquier <michael.paquier@gmail.com>)
Список pgsql-hackers
On Fri, Sep 4, 2015 at 2:45 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello,

> > > @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node,
> > int eflags)
...
> > > +     def = get_option(table->options, "fetch_size");
>
> > I don't think it's a good idea to make such checks at runtime - and
> > either way it's somethign that should be reported back using an
> > ereport(), not an elog.
>
> > Also, it seems somewhat wrong to determine this at execution
> > time. Shouldn't this rather be done when creating the foreign scan node?
> > And be a part of the scan state?
>
> I agree, that was my original plan, but I wasn't familiar enough with the
> FDW architecture to know where the table struct and the scan struct were
> both exposed in the same function.
>
> What I submitted incorporated some of Kyotaro's feedback (and working
> patch) to my original incomplete patch.

Sorry, it certainly shouldn't be a good place to do such thing. I
easily selected the place in order to avoid adding new similar
member in multiple existing structs (PgFdwRelationInfo and
PgFdwScanState).

Having a new member fetch_size is added in PgFdwRelationInfo and
PgFdwScanState, I think postgresGetForeignRelSize is the best
place to do that, from the point that it collects basic
information needed to calculate scan costs.

# Fetch sizes of foreign join would be the future issue..

> typedef struct PgFdwRelationInfo
> {
  ...
+ int    fetch_size;   /* fetch size for this remote table */

====================
> postgreGetForeignRelSize()
> {
  ...
>     fpinfo->table = GetForeignTable(foreigntableid);
>     fpinfo->server = GetForeignServer(fpinfo->table->serverid);
>
+     def = get_option(table->options, "fetch_size");
+     ..
+     fpinfo->fetch_size = strtod(defGetString...

Also it is doable in postgresGetForeignPlan and doing there
removes redundant copy of fetch_size from PgFdwRelation to
PgFdwScanState but theoretical basis would be weak.

regards,

> > On 2015-02-27 13:50:22 -0500, Corey Huinker wrote:
> > > +static DefElem*
> > > +get_option(List *options, char *optname)
> > > +{
> > > +     ListCell *lc;
> > > +
> > > +     foreach(lc, options)
> > > +     {
> > > +             DefElem *def = (DefElem *) lfirst(lc);
> > > +
> > > +             if (strcmp(def->defname, optname) == 0)
> > > +                     return def;
> > > +     }
> > > +     return NULL;
> > > +}
> >
> >
> > >       /*
> > >        * Do nothing in EXPLAIN (no ANALYZE) case.  node->fdw_state stays
> > NULL.
> > > @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node,
> > int eflags)
> > >       server = GetForeignServer(table->serverid);
> > >       user = GetUserMapping(userid, server->serverid);
> > >
> > > +     /* Reading table options */
> > > +     fsstate->fetch_size = -1;
> > > +
> > > +     def = get_option(table->options, "fetch_size");
> > > +     if (!def)
> > > +             def = get_option(server->options, "fetch_size");
> > > +
> > > +     if (def)
> > > +     {
> > > +             fsstate->fetch_size = strtod(defGetString(def), NULL);
> > > +             if (fsstate->fetch_size < 0)
> > > +                     elog(ERROR, "invalid fetch size for foreign table
> > \"%s\"",
> > > +                              get_rel_name(table->relid));
> > > +     }
> > > +     else
> > > +             fsstate->fetch_size = 100;
> >
> > I don't think it's a good idea to make such checks at runtime - and
> > either way it's somethign that should be reported back using an
> > ereport(), not an elog.
>
> > Also, it seems somewhat wrong to determine this at execution
> > time. Shouldn't this rather be done when creating the foreign scan node?
> > And be a part of the scan state?
>
> I agree, that was my original plan, but I wasn't familiar enough with the
> FDW architecture to know where the table struct and the scan struct were
> both exposed in the same function.
>
> What I submitted incorporated some of Kyotaro's feedback (and working
> patch) to my original incomplete patch.
>
> > Have you thought about how this option should cooperate with join
> > pushdown once implemented?
> >
>
> I hadn't until now, but I think the only sensible thing would be to
> disregard table-specific settings once a second foreign table is detected,
> and instead consider only the server-level setting.
>
> I suppose one could argue that if ALL the tables in the join had the same
> table-level setting, we should go with that, but I think that would be
> complicated, expensive, and generally a good argument for changing the
> server setting instead.

--
Kyotaro Horiguchi
NTT Open Source Software Center


Ok, with some guidance from RhodiumToad (thanks!) I was able to get the proper RelOptInfo->Plan->Scan handoff.

What I don't know how to do is show that the proper fetch sizes are being used on the remote server with the resources available in the regression test. Suggestions welcome.

This patch works for my original added test-cases, and works for me connecting to a redshift cluster that we have, the queries show up in the console like this:
     FETCH 101 FROM c1
     FETCH 30 FROM c1
     FETCH 50 FROM c1

The (redacted) source of that test is as follows:


begin;
create extension if not exists postgres_fdw;

create server redshift foreign data wrapper postgres_fdw
options (host 'REDACTED', port '5439', dbname 'REDACTED', fetch_size '101');

select * from pg_foreign_server;

create user mapping for public server redshift
options ( user 'REDACTED', password 'REDACTED');

select * from pg_user_mappings;

create foreign table test_table ( date date, tval text )
server redshift
options (table_name 'REDACTED');

select count(*) from ( select tval from test_table where date = 'REDACTED' ) x;

alter server redshift options ( set fetch_size '30' );

select count(*) from ( select tval from test_table where date = 'REDACTED' ) x;

alter foreign table test_table options ( fetch_size '50' );

select count(*) from ( select tval from test_table where date = 'REDACTED' ) x;

rollback;


Attached is the patch / diff against current master.
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: row_security GUC, BYPASSRLS
Следующее
От: Noah Misch
Дата:
Сообщение: Re: row_security GUC, BYPASSRLS