Re: very slow when writing query to file
| От | boris pezzatti |
|---|---|
| Тема | Re: very slow when writing query to file |
| Дата | |
| Msg-id | 4EB0FD94.7010703@wsl.ch обсуждение исходный текст |
| Ответ на | Re: very slow when writing query to file (boris pezzatti <boris.pezzatti@wsl.ch>) |
| Ответы |
Re: very slow when writing query to file
|
| Список | pgadmin-support |
... or (I promise this is the last guess) each row is appended to the file in the the loop, and on some OS's there is a problem keeping open the file (so that each time the file must be opened again ... adding a lot of extra time). Maybe creating all the "virtual file" in a variable and storing all at once (or let say in chunks of 100 Mb) could solve the problem ...
On 11/02/2011 09:10 AM, boris pezzatti wrote:
Boris Pezzatti
or maybe the retrieved data in RAM are somehow lazy bound ... ?
On 11/02/2011 08:58 AM, boris pezzatti wrote:Thank you Fernando for reproducing this.
I suspect there must be some part of code in the* for each row * for each columnloops that result inefficient only on some machines or OS's (I'm using Archlinux).
In fact the extra time I and Fernando get can not only be attributed to adding commas, " and line feeds. The same file which I retrived in more than one hour (32Mb), could be saved with openoffice in 10 seconds, changing column separators, text delimiters and encoding.
Could it be possible that the use of e.g. error catching in the loops results in poor performance on some systems? (I'm not a c++ programmer ...)
On 11/01/2011 11:23 AM, Guillaume Lelarge wrote:On Mon, 2011-10-31 at 18:26 -0300, Fernando Hevia wrote:[...] I could reproduce the issue in a fresh Windows 7 install with no other apps running other than pgAdmin v1.14.0. >From what I could see, the execute-to-file function runs in 2 stages: 1. Rows are retrieved from DB server to RAM 2. Rows are written from RAM to fileThat's right.The delay clearly occurs in step 2.That's also right.While with small datasets (<2000) the writing to disk delay is barely perceivable, when the number of rows is incremented (>10k) it is quite distinctive how step 1 keeps completing in the expected time frame but step 2 takes much much longer. In any case it should be the other way around.Nope, step 2 does a lot of work.With 10k rows (2.3 MB) it took 3.5 seconds to retrieve data from DB and 40 seconds to write the file to an SATA 7200 disk with write-through cache. With 100k rows (23 MB) the DB retrieve went for 35 seconds while the file writing part took over 4 minutes (didn't wait for it to finish).I don't have the same numbers: * 1000 : 1s * 10000 : 1s * 200000 : 7s * 1000000 : 18sThe file is being written at an avg 60 KB per second, which is extremely slow.If the only thing pgAdmin does was writing, I would agree. But, actually, it does a lot more things: * for each row * for each column * adds the column separator, if needed * grabs one cell's value * quotes the value, if needed (which also means doubling the quote if it's within the value) * adds the line separator * converts it to the encoding, if needed * writes it to the file That could take some time. I searched if there were some parts that took much longer than others, but failed to find one.--
Swiss Federal Research Institute WSL
| Research unit Community Ecology Team Insubric Ecosystems via Belsoggiorno 22 CH-6500 Bellinzona Switzerland | phone direct ++41 91 821 52 32 phone ++41 91 821 52 30 fax ++41 91 821 52 39 boris.pezzatti@wsl.ch http://www.wsl.ch |
--
Boris PezzattiSwiss Federal Research Institute WSL
| Research unit Community Ecology Team Insubric Ecosystems via Belsoggiorno 22 CH-6500 Bellinzona Switzerland | phone direct ++41 91 821 52 32 phone ++41 91 821 52 30 fax ++41 91 821 52 39 boris.pezzatti@wsl.ch http://www.wsl.ch |
--
Boris PezzattiSwiss Federal Research Institute WSL
| Research unit Community Ecology Team Insubric Ecosystems via Belsoggiorno 22 CH-6500 Bellinzona Switzerland | phone direct ++41 91 821 52 32 phone ++41 91 821 52 30 fax ++41 91 821 52 39 boris.pezzatti@wsl.ch http://www.wsl.ch |
В списке pgadmin-support по дате отправления: