Обсуждение: scaling multiple connections
I am getting a bit concerned about Postgres 7.1 performance with multiple connections. Postgres does not seem to scaling very well. Below there is a list of outputs from pgbench with different number of clients, you will see that postgres' performance in the benchmark drops with each new connection. Shouldn't the tps stay fairly constant? I am using pgbench because I saw this performance issue with a project I was developing. I decided to try doing operations in parallel, thinking that postgres would scale. What I found was the more machines I added to the task, the slower the processing was. Anyone have any ideas? Is this how it is supposed to be? My postmaster start line looks like: /usr/local/pgsql/bin/postmaster -A0 -N 24 -B 4096 -i -S -D/sqlvol/pgdev -o -F -fs -S 2048 The database is on a dedicated PCI-IDE/66 promise card, with a 5400rpm maxtor drive, not the best hardware, I grant you, but that should have little to do with the scaling aspect. I am running redhat linux 7.0, kernel 2.4.3. 512M ram, dual PIII 600mhz. [markw@snoopy pgbench]$ ./pgbench -v -c 1 -t 30 pgbench starting vacuum...end. starting full vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 number of clients: 1 number of transactions per client: 30 number of transactions actually processed: 30/30 tps = 218.165952(including connections establishing) tps = 245.062001(excluding connections establishing) [markw@snoopy pgbench]$ ./pgbench -v -c 2 -t 30 pgbench starting vacuum...end. starting full vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 number of clients: 2 number of transactions per client: 30 number of transactions actually processed: 60/60 tps = 200.861024(including connections establishing) tps = 221.175326(excluding connections establishing) [markw@snoopy pgbench]$ ./pgbench -v -c 3 -t 30 pgbench starting vacuum...end. starting full vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 number of clients: 3 number of transactions per client: 30 number of transactions actually processed: 90/90 tps = 144.053242(including connections establishing) tps = 154.083205(excluding connections establishing) [markw@snoopy pgbench]$ ./pgbench -v -c 4 -t 30 pgbench starting vacuum...end. starting full vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 number of clients: 4 number of transactions per client: 30 number of transactions actually processed: 120/120 tps = 129.709537(including connections establishing) tps = 137.852284(excluding connections establishing) [markw@snoopy pgbench]$ ./pgbench -v -c 5 -t 30 pgbench starting vacuum...end. starting full vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 number of clients: 5 number of transactions per client: 30 number of transactions actually processed: 150/150 tps = 103.569559(including connections establishing) tps = 108.535287(excluding connections establishing) ....... [markw@snoopy pgbench]$ ./pgbench -v -c 20 -t 30 pgbench starting vacuum...end. starting full vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 number of clients: 20 number of transactions per client: 30 number of transactions actually processed: 600/600 tps = 40.600209(including connections establishing) tps = 41.352773(excluding connections establishing)
mlw <markw@mohawksoft.com> writes:
> I am getting a bit concerned about Postgres 7.1 performance with
> multiple connections. Postgres does not seem to scaling very
> well. Below there is a list of outputs from pgbench with different
> number of clients, you will see that postgres' performance in the
> benchmark drops with each new connection. Shouldn't the tps stay
> fairly constant?
There was quite a long thread about this in pghackers back in Jan/Feb
(or so). You might want to review it. One thing I recall is that
you need a "scaling factor" well above 1 if you want meaningful results
--- at scale factor 1, all of the transactions want to update the same
row, so of course there's no parallelism and a lot of lock contention.
The default WAL tuning parameters (COMMIT_DELAY, WAL_SYNC_METHOD, and
friends) are probably not set optimally in 7.1. We are hoping to hear
about some real-world performance results so that we can tweak them in
future releases. I do not trust benchmarks as simplistic as pgbench for
doing that kind of tweaking, however.
regards, tom lane
Tom Lane wrote:
>
> mlw <markw@mohawksoft.com> writes:
> > I am getting a bit concerned about Postgres 7.1 performance with
> > multiple connections. Postgres does not seem to scaling very
> > well. Below there is a list of outputs from pgbench with different
> > number of clients, you will see that postgres' performance in the
> > benchmark drops with each new connection. Shouldn't the tps stay
> > fairly constant?
>
> There was quite a long thread about this in pghackers back in Jan/Feb
> (or so). You might want to review it. One thing I recall is that
> you need a "scaling factor" well above 1 if you want meaningful results
> --- at scale factor 1, all of the transactions want to update the same
> row, so of course there's no parallelism and a lot of lock contention.
>
> The default WAL tuning parameters (COMMIT_DELAY, WAL_SYNC_METHOD, and
> friends) are probably not set optimally in 7.1. We are hoping to hear
> about some real-world performance results so that we can tweak them in
> future releases. I do not trust benchmarks as simplistic as pgbench for
> doing that kind of tweaking, however.
>
I agree with you about the benchmarks, but it does behave similar to what I
have in my app, which is why I used it for an example.
If you are familiar with cddb (actually freedb.org) I am taking that data in
putting it into postgres. The steps are: (pseudo code)
select nextval('cdid_seq');
begin;
insert into titles (...) values (...);
for(i=0; i < tracks; i++)insert into tracks (...) values (...);
commit;
When running stand alone on my machine, it will hovers around 130 full CDs per
second. When I start two processes it drops to fewer than 100 inserts per
second. When I add another, it drops even more. The results I posted with
pgbench pretty much showed what I was seeing in my program.
I hacked the output of pgbench to get me tabbed delimited fields to chart, but
it is easier to look at, see the results below. This is the same build and same
startup scripts on the two different machines. I know this isn't exactly
scientific, but I have a few bells going off suggesting that postgres has some
SMP scaling issues.
My Dual PIII 600MHZ, 500M RAM, Linux 2.4.3 SMP
pg_xlog is pointed to a different drive than is base.
I/O Promise dual IDE/66, xlog on one drive, base on another.
count transaction time (excluding connection)
1 32 175.116
2 32 138.288
3 32 102.890
4 32 88.243
5 32 77.024
6 32 62.648
7 32 61.231
8 32 60.017
9 32 56.034
10 32 57.854
11 32 50.812
12 32 53.019
13 32 50.289
14 32 46.421
15 32 44.496
16 32 45.297
17 32 41.725
18 32 46.048
19 32 45.007
20 32 41.584
21 32 43.420
22 32 39.640
23 32 43.250
24 32 41.617
25 32 42.511
26 32 38.369
27 32 38.919
28 32 38.813
29 32 39.242
30 32 39.859
31 32 37.938
32 32 41.516
Single processor PII 450, 256M, Linux 2.2.16
pg_xlog pointing to different drive than base
I/O Adaptec 2940, Two seagate barracudas.
count transaction time (excluding connection)
1 32 154.539
2 32 143.609
3 32 144.608
4 32 141.718
5 32 128.759
6 32 154.388
7 32 144.097
8 32 149.828
9 32 143.092
10 32 146.548
11 32 141.613
12 32 139.692
13 32 137.425
14 32 137.227
15 32 134.669
16 32 128.277
17 32 127.440
18 32 121.224
19 32 121.915
20 32 120.740
21 32 118.562
22 32 116.271
23 32 113.883
24 32 113.558
25 32 109.293
26 32 108.782
27 32 108.796
28 32 105.684
29 32 103.614
30 32 102.232
31 32 100.514
32 32 99.339
At 08:39 AM 26-04-2001 -0400, mlw wrote: >I am getting a bit concerned about Postgres 7.1 performance with multiple >connections. Postgres does not seem to scaling very well. Below there is a list >of outputs from pgbench with different number of clients, you will see that > >My postmaster start line looks like: >/usr/local/pgsql/bin/postmaster -A0 -N 24 -B 4096 -i -S -D/sqlvol/pgdev -o -F >-fs -S 2048 Maybe it's the -fs in your start up line. I tried a similar start line as yours but without -fs and I get consistent tps values for pgbench. ./pgbench -v -c 1 -t 30 test starting vacuum...end. starting full vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 number of clients: 1 number of transactions per client: 30 number of transactions actually processed: 30/30 tps = 161.938949(including connections establishing) tps = 180.060140(excluding connections establishing) [lylyeoh@nimbus pgbench]$ ./pgbench -v -c 3 -t 30 test starting vacuum...end. starting full vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 number of clients: 3 number of transactions per client: 30 number of transactions actually processed: 90/90 tps = 172.909666(including connections establishing) tps = 189.845782(excluding connections establishing) [lylyeoh@nimbus pgbench]$ ./pgbench -v -c 4 -t 30 test starting vacuum...end. starting full vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 number of clients: 4 number of transactions per client: 30 number of transactions actually processed: 120/120 tps = 172.909417(including connections establishing) tps = 189.319538(excluding connections establishing) Tested machine is a Dell Poweredge 1300 uniprocessor PIII 500MHz with 128MB RAM, and a single 9GB HDD. With -fs there's a decrease, but not as marked as your case. So not sure if it's really the problem. Try that out. Cheerio, Link.