Re: sql/med review - problems with patching

From: Pavel Stehule
Subject: Re: sql/med review - problems with patching
Msg-id: AANLkTilfzj08sgvGWDd4HhjDeo5A0ErdixPP3_H44oyQ@mail.gmail.com
In reply to: Re: sql/med review - problems with patching  (David Fetter <david@fetter.org>)
List: pgsql-hackers
2010/7/20 David Fetter <david@fetter.org>:
> On Tue, Jul 20, 2010 at 11:40:18AM +0200, Pavel Stehule wrote:
>> 2010/7/20 Itagaki Takahiro <itagaki.takahiro@gmail.com>:
>> > 2010/7/14 Pavel Stehule <pavel.stehule@gmail.com>:
>> >> please, can you refresh the patch?
>> >
>> > Updated patch attached. The latest version is always in the git
>> > repo: http://repo.or.cz/w/pgsql-fdw.git (branch: fdw). I'm
>> > developing the patch on postgres' git repo, so the regression test
>> > for dblink might fail because of an out-of-sync issue between CVS
>> > and git.
>> >
>> >> Looking at the documentation, I miss a tutorial for foreign
>> >> tables; there is only a reference. I miss a paragraph that
>> >> cleanly and simply states what is possible now and what isn't.
>> >> The enhancement of dblink isn't documented.
>> >
>> > Sure. I'll start writing documentation once we agree on the
>> > design of the FDW.
>> >
>> >> In the function pgIterate(ForeignScanState *scanstate) you
>> >> iterate over the pg result. I think using a cursor and
>> >> fetching multiple rows would be preferable.
>> >
>> > Sure, but I think it will be improved once libpq supports a
>> > protocol-level cursor. The libpq improvement would benefit many
>> > more applications, including postgresql_fdw.
>> >
>>
>> is there a time frame for this task - or a ToDo item? At minimum it
>> has to be documented, because it can be an issue on larger result
>> sets - speed and memory usage. I am afraid about speed for queries like
>>
>> select * from large_dblink_tab limit 100;
>
> The general issue of passing qualifiers to the remote data source is
> complex, especially when the DML for that data source is different
> from PostgreSQL's DML.
>
> Do you have some ideas as to how to solve this problem in general?  In
> this case?

Yes, I can. I expect that you can read from a foreign table row by
row, because the executor supports it - via ExecForeignScan().

dblink's exec is called inside this method and the result is stored.
A repeated call of this method then means repeated reading from the
tuplestore storage, so an outer LIMIT has no effect on the loaded
data - the full result is always fetched. I propose a slightly
different strategy: use a cursor and repeatedly fetch n rows. The
PL/pgSQL FOR statement uses batches of 50 rows; for foreign tables
1000 may be enough. With this strategy at most (n - 1) rows are
fetched uselessly, we don't need a local tuplestore, and we get the
first results early - which matters with a LIMIT clause.
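The payoff of the batched strategy can be illustrated with a small
simulation (Python here purely for illustration; `RemoteSource`,
`fetch`, and `foreign_scan` are hypothetical stand-ins for the remote
cursor and the executor loop, not names from the patch):

```python
from itertools import islice

class RemoteSource:
    """Stands in for a remote table behind a cursor; counts fetched rows."""
    def __init__(self, total_rows):
        self.total_rows = total_rows
        self.pos = 0
        self.rows_fetched = 0  # rows that actually crossed the "network"

    def fetch(self, n):
        """Like FETCH n FROM cur: return up to n rows, advancing the cursor."""
        batch = list(range(self.pos, min(self.pos + n, self.total_rows)))
        self.pos += len(batch)
        self.rows_fetched += len(batch)
        return batch

def foreign_scan(source, batch_size=1000):
    """Yield rows one by one, refilling from the cursor in batches."""
    while True:
        batch = source.fetch(batch_size)
        if not batch:
            return
        yield from batch

src = RemoteSource(total_rows=1_000_000)
# Outer "LIMIT 100": the consumer stops pulling after 100 rows.
first_100 = list(islice(foreign_scan(src), 100))
print(len(first_100), src.rows_fetched)  # prints: 100 1000
```

Only one batch of 1000 rows is transferred for a LIMIT 100 query,
instead of the full million-row result a tuplestore-based approach
would materialize.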

So:
ExecInitForeignScan()
--> open cursor

ExecForeignScan()
--> if there are fetched rows left, return one
--> if not, fetch the next 1000 rows

ExecEndScan()
--> close cursor
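On the wire, that lifecycle would correspond roughly to the following
SQL sent over the remote connection (the cursor name and batch size
here are illustrative, not from the patch):

```sql
-- ExecInitForeignScan: open a cursor on the remote side
DECLARE fdw_cur CURSOR FOR SELECT * FROM large_dblink_tab;

-- ExecForeignScan: refill the local batch whenever it runs out
FETCH 1000 FROM fdw_cur;

-- ExecEndScan: release the cursor
CLOSE fdw_cur;
```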

Regards

Pavel Stehule

The same mechanism is used in PL/pgSQL.



>
> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>

