Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
From | Pavel Stehule
---|---
Subject | Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION
Date |
Msg-id | 162867790911250232u75985e29mba6310358d4b2911@mail.gmail.com
In reply to | Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION (Jeff Davis <pgsql@j-davis.com>)
Responses | Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION (Jeff Davis <pgsql@j-davis.com>)
List | pgsql-hackers
2009/11/25 Jeff Davis <pgsql@j-davis.com>:
> On Wed, 2009-11-25 at 09:23 +0100, Pavel Stehule wrote:
>> > If SRFs use a tuplestore in that situation, it sounds like that should
>> > be fixed. Why do we need to provide alternate syntax involving COPY?
>>
>> It isn't a problem of SRF function design. It allows both modes - row and
>> tuplestore.
>
> select * from generate_series(1,1000000000) limit 1;
>
> That statement takes a long time, which indicates to me that it's
> materializing the result of the SRF. And there's no insert there.

This is a missing optimization. If I understand correctly, PostgreSQL waits
for the complete result set - so in this case materialization is necessary.
In your query pg does the materialization too early.

postgres=# select * from generate_series(1,100000) limit 1;
 generate_series
─────────────────
               1
(1 row)

Time: 59,540 ms

postgres=# select generate_series(1,100000) limit 1;
 generate_series
─────────────────
               1
(1 row)

Time: 1,107 ms

But usually we process all rows from an SRF function - so the problem with
LIMIT isn't significant. I am testing:

1.
postgres=# select count(*) from generate_series(1,1000000);
  count
─────────
 1000000
(1 row)

Time: 930,720 ms

2.
postgres=# select count(*) from (select generate_series(1,1000000)) x;
  count
─────────
 1000000
(1 row)

Time: 276,511 ms

2. is significantly faster than 1. (there is no SRF materialization)

postgres=# create table foo(a integer);
CREATE TABLE
postgres=# insert into foo select generate_series(1,1000000);
INSERT 0 1000000
Time: 4274,869 ms
postgres=# insert into foo select * from generate_series(1,1000000);
INSERT 0 1000000
Time: 4814,756 ms
postgres=# copy foo to '/tmp/xxx';
COPY 1000000
Time: 1033,078 ms
postgres=# set synchronous_commit to off;
SET
postgres=# copy foo from '/tmp/xxx';
COPY 1000000
Time: 2749,277 ms
postgres=# insert into foo select generate_series(1,1000000);
INSERT 0 1000000
Time: 3948,734 ms

generate_series is fast and simple - but still COPY is about 30% faster.

>> This is a problem of the INSERT statement, resp. the INSERT INTO
>> SELECT implementation.
>
> If "tmp" is a new table, and "zero" is a table with a million zeros in
> it, then:
> insert into tmp select 1/i from zero;
> fails instantly. That tells me that it's not materializing the result of
> the select; rather, it's feeding the rows in one at a time.

I think that materialization happens every time you use an SQL statement
without a cursor.

> Can you show me in more detail what you mean? I'm having difficulty
> understanding your short replies.

I think that COPY tab from/to fce() should be used for very large
import/export, where INSERT ... SELECT needs at minimum one
materialization.

p.s. I am sorry - I am not a native speaker - so I am speaking in short
replies.

Pavel

> Regards,
>     Jeff Davis
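For readers who want to repeat the comparison above, here is a consolidated sketch of the same benchmark as a single psql script. It only restates the statements already shown in the session; the database name and file path are placeholders, and timings will of course differ per machine.

```sql
-- Reproduces the INSERT ... SELECT vs. COPY comparison from the message.
-- Run in psql against a scratch database; \timing prints elapsed time.
\timing on

CREATE TABLE foo (a integer);

-- SRF feeding INSERT via the target list
INSERT INTO foo SELECT generate_series(1, 1000000);

-- SRF in the FROM clause (materialized via tuplestore)
INSERT INTO foo SELECT * FROM generate_series(1, 1000000);

-- Bulk export, then bulk import with COPY for comparison
COPY foo TO '/tmp/xxx';
SET synchronous_commit TO off;
COPY foo FROM '/tmp/xxx';

DROP TABLE foo;
```

The point of the comparison is that even against the fastest INSERT ... SELECT variant, the COPY path avoids one materialization of the row set, which is where the roughly 30% difference reported above comes from.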
In the pgsql-hackers list, by date:

Previous
From: Hannu Krosing
Date:
Message: Re: [PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION