Обсуждение: Performance testing of COPY (SELECT) TO

Поиск
Список
Период
Сортировка

Performance testing of COPY (SELECT) TO

От
Böszörményi Zoltán
Дата:
Hi,

we have a large export here, I made an in-house benchmark
between Informix, plain PostgreSQL-8.1.4 and
8.2devel+COPY(SELECT) using the same data and query.
Find the results below for the two PostgreSQL versions.
With PostgreSQL 8.1.4, I used this:

begin;
select ... into temp myquery1;
copy myquery1 to stdout csv delimiter '|';
rollback;

With 8.2devel, I simple used
copy (select ...) to stdout csv delimiter '|';

# of clients:            1*        3**        10**
PostgreSQL        1:33        10:58        55:46
PostgreSQL 8.2        1:19        4:55        18:28

*  - average of 4 runs, the first was with cold caches after reboot
** - 1 run, average of cliens' runtimes

Performance between 8.1.4 and 8.2devel is interesting:
1 client: 15%
3 clients: 55.2%
10 clients: 66.9%

The same machine was used for testing.

Best regards,
Zoltán Böszörményi



Re: Performance testing of COPY (SELECT) TO

От
Tom Lane
Дата:
Böszörményi Zoltán <zboszor@dunaweb.hu> writes:
> With PostgreSQL 8.1.4, I used this:

> begin;
> select ... into temp myquery1;
> copy myquery1 to stdout csv delimiter '|';
> rollback;

The performance of this would doubtless vary a lot with the temp_buffers
setting.  Did you try different values?

It'd also be interesting to time the same way (with a temp table) in
devel.  I don't remember whether we did any performance work on the
COPY CSV data path in this cycle, or whether that was all present in
8.1.  In any case it'd be worth proving that the COPY SELECT patch isn't
degrading performance of the copy-a-relation case.
        regards, tom lane


Re: Performance testing of COPY (SELECT) TO

От
Böszörményi Zoltán
Дата:
> Böszörményi Zoltán <zboszor@dunaweb.hu> writes:
>> With PostgreSQL 8.1.4, I used this:
>
>> begin;
>> select ... into temp myquery1;
>> copy myquery1 to stdout csv delimiter '|';
>> rollback;
>
> The performance of this would doubtless vary a lot with the temp_buffers
> setting.  Did you try different values?

Yes, I did, but now checked back with 8.2CVS.
The previously quoted result was achieved with
temp_buffers = 1000 on both 8.1.4 and 8.2CVS.
On 8.2CVS with temp_buffers = 4096, the 10 client case kills
the machine with swapping, but the 3 client runtime with
COPY(SELECT) went down to 2:41. The SELECT INTO TEMP
case went down to 3:36.

> It'd also be interesting to time the same way (with a temp table) in
> devel.  I don't remember whether we did any performance work on the
> COPY CSV data path in this cycle, or whether that was all present in
> 8.1.  In any case it'd be worth proving that the COPY SELECT patch isn't
> degrading performance of the copy-a-relation case.

I will report back with that, say on Monday.

In the meantime, I documented the COPY (SELECT) case
and modified parser/analyze.c and tcop/utility.c so neither of them
calls anything from under another directory. I think it's cleaner now.
Also, I tried to implement more closely what printtup() does.
Please, review.

Best regards,
Zoltán Böszörményi

Вложения

Re: Performance testing of COPY (SELECT) TO

От
Böszörményi Zoltán
Дата:
> It'd also be interesting to time the same way (with a temp table) in
> devel.  I don't remember whether we did any performance work on the
> COPY CSV data path in this cycle, or whether that was all present in
> 8.1.  In any case it'd be worth proving that the COPY SELECT patch isn't
> degrading performance of the copy-a-relation case.

In the export, there is a largish table, that has both many columns and rows.

With COPY(SELECT) patch applied:

time psql -c "copy (select * from table) to 'file'" dbx
COPY 886046

real    0m13.253s
user    0m0.000s
sys     0m0.000s

time psql -c "copy table to 'file'" dbx
COPY 886046

real    0m13.234s
user    0m0.000s
sys     0m0.000s

time psql -c "copy table to stdout" dbx >file

real    0m15.155s
user    0m0.540s
sys     0m0.450s

time psql -c "copy (select * from table) to stdout" dbx >file

real    0m15.079s
user    0m0.540s
sys     0m0.590s

Surprisingly, without the COPY(SELECT) patch it's slower,
it's the lowest from five runs, e.g. with warm caches:

time psql -c "copy table to 'file'" dbx

real    0m20.464s
user    0m0.000s
sys     0m0.010s

time psql -c "copy table to stdout" dbx >file

real    0m25.753s
user    0m0.570s
sys     0m0.460s

With the original settings, temp_buffers = 1000 on 8.2CVS,
the one client case looks like this: first run 1:44, second run 1:12,
third run 1:04. It seems it's a bit faster both on startup and on
subsequent runs.

Best regards,
Zoltán Böszörményi



Re: Performance testing of COPY (SELECT) TO

От
Zoltan Boszormenyi
Дата:
Hi,

Böszörményi Zoltán írta:
>> Böszörményi Zoltán <zboszor@dunaweb.hu> writes:
>>     
>>> With PostgreSQL 8.1.4, I used this:
>>>       
>>> begin;
>>> select ... into temp myquery1;
>>> copy myquery1 to stdout csv delimiter '|';
>>> rollback;
>>>       
>> The performance of this would doubtless vary a lot with the temp_buffers
>> setting.  Did you try different values?
>>     
>
> Yes, I did, but now checked back with 8.2CVS.
> The previously quoted result was achieved with
> temp_buffers = 1000 on both 8.1.4 and 8.2CVS.
> On 8.2CVS with temp_buffers = 4096, the 10 client case kills
> the machine with swapping, but the 3 client runtime with
> COPY(SELECT) went down to 2:41. The SELECT INTO TEMP
> case went down to 3:36.
>
>   
>> It'd also be interesting to time the same way (with a temp table) in
>> devel.  I don't remember whether we did any performance work on the
>> COPY CSV data path in this cycle, or whether that was all present in
>> 8.1.  In any case it'd be worth proving that the COPY SELECT patch isn't
>> degrading performance of the copy-a-relation case.
>>     
>
> I will report back with that, say on Monday.
>   

It seems my previous mail hasn't reached
the hackers list, I answer here.

In the export, there is a largish table,
that has both many columns and rows.

With COPY(SELECT) patch applied:

time psql -c "copy (select * from table) to 'file'" dbx
COPY 886046

real    0m13.253s
user    0m0.000s
sys     0m0.000s

time psql -c "copy table to 'file'" dbx
COPY 886046

real    0m13.234s
user    0m0.000s
sys     0m0.000s

time psql -c "copy table to stdout" dbx >file

real    0m15.155s
user    0m0.540s
sys     0m0.450s

time psql -c "copy (select * from table) to stdout" dbx >file

real    0m15.079s
user    0m0.540s
sys     0m0.590s

Surprisingly, without the COPY(SELECT) patch it's slower,
this is the lowest from five runs, e.g. with warm caches:

time psql -c "copy table to 'file'" dbx

real    0m20.464s
user    0m0.000s
sys     0m0.010s

time psql -c "copy table to stdout" dbx >file

real    0m25.753s
user    0m0.570s
sys     0m0.460s

With the original settings, temp_buffers = 1000 on 8.2CVS,
the export runtime with one client looks like this:
first run 1:44, second run 1:12, third run 1:04.
It seems it's a bit faster both on startup and on
subsequent runs.

Best regards,
Zoltán Böszörményi




Re: Performance testing of COPY (SELECT) TO

От
Bruce Momjian
Дата:
Your patch has been added to the PostgreSQL unapplied patches list at:
http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------


B�sz�rm�nyi Zolt�n wrote:
> > B?sz?rm?nyi Zolt?n <zboszor@dunaweb.hu> writes:
> >> With PostgreSQL 8.1.4, I used this:
> >
> >> begin;
> >> select ... into temp myquery1;
> >> copy myquery1 to stdout csv delimiter '|';
> >> rollback;
> >
> > The performance of this would doubtless vary a lot with the temp_buffers
> > setting.  Did you try different values?
> 
> Yes, I did, but now checked back with 8.2CVS.
> The previously quoted result was achieved with
> temp_buffers = 1000 on both 8.1.4 and 8.2CVS.
> On 8.2CVS with temp_buffers = 4096, the 10 client case kills
> the machine with swapping, but the 3 client runtime with
> COPY(SELECT) went down to 2:41. The SELECT INTO TEMP
> case went down to 3:36.
> 
> > It'd also be interesting to time the same way (with a temp table) in
> > devel.  I don't remember whether we did any performance work on the
> > COPY CSV data path in this cycle, or whether that was all present in
> > 8.1.  In any case it'd be worth proving that the COPY SELECT patch isn't
> > degrading performance of the copy-a-relation case.
> 
> I will report back with that, say on Monday.
> 
> In the meantime, I documented the COPY (SELECT) case
> and modified parser/analyze.c and tcop/utility.c so neither of them
> calls anything from under another directory. I think it's cleaner now.
> Also, I tried to implement more closely what printtup() does.
> Please, review.
> 
> Best regards,
> Zolt?n B?sz?rm?nyi

[ Attachment, skipping... ]

> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster

--  Bruce Momjian   bruce@momjian.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Performance testing of COPY (SELECT) TO

От
Zoltan Boszormenyi
Дата:
Bruce Momjian írta:
> Your patch has been added to the PostgreSQL unapplied patches list at:
>
>     http://momjian.postgresql.org/cgi-bin/pgpatches
>
> It will be applied as soon as one of the PostgreSQL committers reviews
> and approves it.
>

Thanks. Would you please add this instead?
psql built-in \copy (select ...) now also work.

Best regards,
Zoltán Böszörményi


Вложения