Обсуждение: old server, new server, same performance

Поиск
Список
Период
Сортировка

old server, new server, same performance

От
Piotr Legiecki
Дата:
Hi

I have a situation at my work which I simply don't understand and hope
that here I can find some explanations.

What is on the scene:
A - old 'server' PC AMD Athlon64 3000+, 2GB RAM, 1 ATA HDD 150GB, Debian
etch, postgresql 8.1.19
B - new server HP DL 360, 12GB RAM, Intel Xeon 8 cores CPU, fast SAS
(mirrored) HDDs, Debian 64 bit, lenny, backported postgresql 8.1.19
C - our Windows application based on Postgresql 8.1 (not newer)

and second role actors (for pgAdmin)
D - my old Windows XP computer, Athlon64 X2 3800+, with 100Mbit ethernet
E - new laptop with Ubuntu, 1000Mbit ethernet

The goal: migrate postgresql from A to B.

Simple and works fine (using pg_dump, psql -d dbname <bakcup_file).

So what is the problem? My simple 'benchmarks' I have done with pgAdmin
in spare time.

pgAdmin is the latest 1.8.2 on both D and E.
Using pgAdmin on my (D) computer I have run SELECT * from some_table;
and noted the execution time on both A and B servers:
- on A (the old one) about 120sec
- on B (the new monster) about 120sec (???)

(yes, there is almost no difference)

On the first test runs the postgresql configs on both servers were the
same, so I have started to optimize (according to postgresql wiki) the
postgresql on the new (B) server. The difference  with my simple select
* were close to 0.

So this is my first question. Why postgresql behaves so strangely?
Why there is no difference in database speed between those two machines?

I thought about hardware problem on B, but:
hdparm shows 140MB/sec on B and 60MB on A (and buffered reads are 8GB on
B and 800MB on A)
bonnie++ on B:
> Version 1.03d       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> malwa           24G 51269  71 49649  10 34974   6 48969  82 147840  13  1150   1
on A:
> Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> irys             4G 42961  93 41125  13 14414   3 20262  48 38487   5 167.0   0

Here the difference in writings is not so big (wonder why, the price
between those machines is huge) but in readings are noticeably better on B.

Ok, those were the tests done using my old Windows PC (D) computer. So I
have decided to do the same using my new laptop with Ubuntu (E).
The results were soooo strange that now I am completely confused.

The same SELECT:
- on A first (fresh) run 30sec, second (and so on) about 11sec (??!)
- on B first run 80sec, second (and so on) about 80sec also

What is going on here? About 8x faster on slower machine?

One more thing comes to my mind. The A server has iso-8859-2 locale and
database is set to latin2, the B server has utf8 locale, but database is
still latin2. Does it matter anyway?

So here I'm stuck and hope for help. Is there any bottleneck? How to
find it?

Regards
Piotr

Re: old server, new server, same performance

От
Scott Marlowe
Дата:
2010/5/14 Piotr Legiecki <piotrlg@sci.pam.szczecin.pl>:
> So what is the problem? My simple 'benchmarks' I have done with pgAdmin in
> spare time.
>
> pgAdmin is the latest 1.8.2 on both D and E.
> Using pgAdmin on my (D) computer I have run SELECT * from some_table; and
> noted the execution time on both A and B servers:

So, any chance you'll run it like I asked:

select count(*) from some_table;

?

Re: old server, new server, same performance

От
Scott Carey
Дата:
On May 14, 2010, at 3:52 PM, Scott Marlowe wrote:

> 2010/5/14 Piotr Legiecki <piotrlg@sci.pam.szczecin.pl>:
>> So what is the problem? My simple 'benchmarks' I have done with pgAdmin in
>> spare time.
>>
>> pgAdmin is the latest 1.8.2 on both D and E.
>> Using pgAdmin on my (D) computer I have run SELECT * from some_table; and
>> noted the execution time on both A and B servers:
>
> So, any chance you'll run it like I asked:
>
> select count(*) from some_table;
>
> ?

I agree that select * is a very bad test and probably the problem here.  Even if you do 'select * from foo' locally to
avoidthe network and pipe it to /dev/null, it is _significantly_ slower than count(*) because of all the data
serialization.

>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


Re: old server, new server, same performance

От
Piotr Legiecki
Дата:
Scott Marlowe pisze:
> 2010/5/14 Piotr Legiecki <piotrlg@sci.pam.szczecin.pl>:
>> So what is the problem? My simple 'benchmarks' I have done with pgAdmin in
>> spare time.
>>
>> pgAdmin is the latest 1.8.2 on both D and E.
>> Using pgAdmin on my (D) computer I have run SELECT * from some_table; and
>> noted the execution time on both A and B servers:
>
> So, any chance you'll run it like I asked:
>
> select count(*) from some_table;
>

Sorry, but it was computer-less weekend ;-)

So to answer all questions in one mail:
1. The database is autovacuumed, at first (the default debian setting)
every one minute, than I have set it to one hour.

2. select count(*) from some_table; runs in a fraction of a second on
the console on both servers (there are only 4000 records, the second
longer table has 50000 but it does not matter very much). From pg_admin
the results are:
- slow server (and the longest table in my db) 938ms (first run) and
about 40ms next ones
- fast server 110ms first run, about 30ms next ones.
Well, finally my new server deservers its name ;-) The later times as I
understand are just cache readings from postgresql itself?

3. The configs. As noted earlier, at first test they were the same,
later I started to optimize the faster server from the defaults to some
higher values, without any significant gain.
The package on slower server is just deb package from etch (4.0)
repository, the one on fast server (which is newer lenny - 5.0) is
compiled from deb source package on that server.
fast server: http://pgsql.privatepaste.com/edf2ec36c3
slow server: http://pgsql.privatepaste.com/bdc141f0be

4. Machine. The new server has 5 SAS disks (+ 1 spare), but I don't
remember how they are set up now (looks like mirror for system '/' and
RAID5 for rest - including DB). size of the DB is 405MB

So still I don't get this: select * from table; on old server takes 0,5
sec, on new one takes 6sec. Why there is so big difference? And it does
not matter how good or bad select is to measure performance, because I
don't measure  the performance, I measure the relative difference.
Somwhere there is a bottleneck.

Regards
P.

Re: old server, new server, same performance

От
Scott Marlowe
Дата:
On Mon, May 17, 2010 at 2:06 AM, Piotr Legiecki <piotrlg@ams.edu.pl> wrote:
> 2. select count(*) from some_table; runs in a fraction of a second on the
> console on both servers (there are only 4000 records, the second longer
> table has 50000 but it does not matter very much). From pg_admin the results
> are:
> - slow server (and the longest table in my db) 938ms (first run) and about
> 40ms next ones
> - fast server 110ms first run, about 30ms next ones.
> Well, finally my new server deservers its name ;-) The later times as I
> understand are just cache readings from postgresql itself?
SNIP
> So the server itself seems faster.
> So still I don't get this: select * from table; on old server takes 0,5 sec,
> on new one takes 6sec. Why there is so big difference? And it does not
> matter how good or bad select is to measure performance, because I don't
> measure  the performance, I measure the relative difference. Somwhere there
> is a bottleneck.

Yep, the network I'd say.  How fast are things like scp between the
various machines?

> 4. Machine. The new server has 5 SAS disks (+ 1 spare), but I don't remember
> how they are set up now (looks like mirror for system '/' and RAID5 for rest
> - including DB). size of the DB is 405MB

Get off of RAID-5 if possible.  A 3 Disk RAID-5 is the slowest
possible combination for RAID-5 and RAID-5 is generally the poorest
choice for a db server.

Re: old server, new server, same performance

От
Scott Marlowe
Дата:
On Mon, May 17, 2010 at 2:10 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Mon, May 17, 2010 at 2:06 AM, Piotr Legiecki <piotrlg@ams.edu.pl> wrote:
>> 2. select count(*) from some_table; runs in a fraction of a second on the
>> console on both servers (there are only 4000 records, the second longer
>> table has 50000 but it does not matter very much). From pg_admin the results
>> are:
>> - slow server (and the longest table in my db) 938ms (first run) and about
>> 40ms next ones
>> - fast server 110ms first run, about 30ms next ones.
>> Well, finally my new server deservers its name ;-) The later times as I
>> understand are just cache readings from postgresql itself?
> SNIP
>> So the server itself seems faster.
>> So still I don't get this: select * from table; on old server takes 0,5 sec,
>> on new one takes 6sec. Why there is so big difference? And it does not
>> matter how good or bad select is to measure performance, because I don't
>> measure  the performance, I measure the relative difference. Somwhere there
>> is a bottleneck.
>
> Yep, the network I'd say.  How fast are things like scp between the
> various machines?
>
>> 4. Machine. The new server has 5 SAS disks (+ 1 spare), but I don't remember
>> how they are set up now (looks like mirror for system '/' and RAID5 for rest
>> - including DB). size of the DB is 405MB
>
> Get off of RAID-5 if possible.  A 3 Disk RAID-5 is the slowest
> possible combination for RAID-5 and RAID-5 is generally the poorest
> choice for a db server.

I refer you to this classic post on the subject:
http://www.mail-archive.com/pgsql-general@postgresql.org/msg93043.html

Re: old server, new server, same performance

От
Scott Marlowe
Дата:
Whoops, wrong thread.

On Mon, May 17, 2010 at 2:52 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Mon, May 17, 2010 at 2:10 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> On Mon, May 17, 2010 at 2:06 AM, Piotr Legiecki <piotrlg@ams.edu.pl> wrote:
>>> 2. select count(*) from some_table; runs in a fraction of a second on the
>>> console on both servers (there are only 4000 records, the second longer
>>> table has 50000 but it does not matter very much). From pg_admin the results
>>> are:
>>> - slow server (and the longest table in my db) 938ms (first run) and about
>>> 40ms next ones
>>> - fast server 110ms first run, about 30ms next ones.
>>> Well, finally my new server deservers its name ;-) The later times as I
>>> understand are just cache readings from postgresql itself?
>> SNIP
>>> So the server itself seems faster.
>>> So still I don't get this: select * from table; on old server takes 0,5 sec,
>>> on new one takes 6sec. Why there is so big difference? And it does not
>>> matter how good or bad select is to measure performance, because I don't
>>> measure  the performance, I measure the relative difference. Somwhere there
>>> is a bottleneck.
>>
>> Yep, the network I'd say.  How fast are things like scp between the
>> various machines?
>>
>>> 4. Machine. The new server has 5 SAS disks (+ 1 spare), but I don't remember
>>> how they are set up now (looks like mirror for system '/' and RAID5 for rest
>>> - including DB). size of the DB is 405MB
>>
>> Get off of RAID-5 if possible.  A 3 Disk RAID-5 is the slowest
>> possible combination for RAID-5 and RAID-5 is generally the poorest
>> choice for a db server.
>
> I refer you to this classic post on the subject:
> http://www.mail-archive.com/pgsql-general@postgresql.org/msg93043.html

Re: old server, new server, same performance

От
Piotr Legiecki
Дата:
Scott Marlowe pisze:

>>> So still I don't get this: select * from table; on old server takes 0,5 sec,
>>> on new one takes 6sec. Why there is so big difference? And it does not
>>> matter how good or bad select is to measure performance, because I don't
>>> measure  the performance, I measure the relative difference. Somwhere there
>>> is a bottleneck.
>> Yep, the network I'd say.  How fast are things like scp between the
>> various machines?

Sure it is, but not in a way one could expect:
- scp from 1000Gbit laptop to old server 27MB/sec
- scp from the same laptop to new server 70MB/sec
Both servers have 1000Gbit connection. So it is still mysterious why old
server makes 9x faster select?
I don't claim that something is slow on new (or even older) server. Not
at all. the application works fine (still on older machine). I only
wonder about those differences.

>>> 4. Machine. The new server has 5 SAS disks (+ 1 spare), but I don't remember
>>> how they are set up now (looks like mirror for system '/' and RAID5 for rest
>>> - including DB). size of the DB is 405MB
>> Get off of RAID-5 if possible.  A 3 Disk RAID-5 is the slowest
>> possible combination for RAID-5 and RAID-5 is generally the poorest
>> choice for a db server.

Sure I know that RAID-5 is slower than mirror but anyway how much
slower? And for sure not as much as single ATA disk.

> I refer you to this classic post on the subject:
> http://www.mail-archive.com/pgsql-general@postgresql.org/msg93043.html

Well, this thread is about benchmarking databases (or even worse,
comparison between two RDBMS). I'm not benchmarking anything, just
compare one factor.

P.

Re: old server, new server, same performance

От
Scott Marlowe
Дата:
On Mon, May 17, 2010 at 3:52 AM, Piotr Legiecki <piotrlg@ams.edu.pl> wrote:
> Scott Marlowe pisze:
>
>>>> So still I don't get this: select * from table; on old server takes 0,5
>>>> sec,
>>>> on new one takes 6sec. Why there is so big difference? And it does not
>>>> matter how good or bad select is to measure performance, because I don't
>>>> measure  the performance, I measure the relative difference. Somwhere
>>>> there
>>>> is a bottleneck.
>>>
>>> Yep, the network I'd say.  How fast are things like scp between the
>>> various machines?
>
> Sure it is, but not in a way one could expect:
> - scp from 1000Gbit laptop to old server 27MB/sec
> - scp from the same laptop to new server 70MB/sec
> Both servers have 1000Gbit connection. So it is still mysterious why old
> server makes 9x faster select?
> I don't claim that something is slow on new (or even older) server. Not at
> all. the application works fine (still on older machine). I only wonder
> about those differences.

Is one connecting via SSL?  Is this a simple flat switched network, or
are these machines on different segments connected via routers?

>>>> 4. Machine. The new server has 5 SAS disks (+ 1 spare), but I don't
>>>> remember
>>>> how they are set up now (looks like mirror for system '/' and RAID5 for
>>>> rest
>>>> - including DB). size of the DB is 405MB
>>>
>>> Get off of RAID-5 if possible.  A 3 Disk RAID-5 is the slowest
>>> possible combination for RAID-5 and RAID-5 is generally the poorest
>>> choice for a db server.
>
> Sure I know that RAID-5 is slower than mirror but anyway how much slower?
> And for sure not as much as single ATA disk.

Actually, given the amount of read read / write write RAID5 does, it
can be slower than a single drive, by quite a bit.  A mirror set only
reads twice as fast, it writes the same speed as a single disk.
RAID-5 is antithetical to good db performance (unless you hardly ever
write).

>
>> I refer you to this classic post on the subject:
>> http://www.mail-archive.com/pgsql-general@postgresql.org/msg93043.html
>
> Well, this thread is about benchmarking databases (or even worse, comparison
> between two RDBMS). I'm not benchmarking anything, just compare one factor.

That was a mis-post...

Re: old server, new server, same performance

От
Piotr Legiecki
Дата:
Scott Marlowe pisze:

> Is one connecting via SSL?  Is this a simple flat switched network, or
> are these machines on different segments connected via routers?

SSL is disabled.
It is switched network, all tested computers are in the same segment.


Finally I have switched the production database from old server to new
one and strange things happened. The same query on new server I have
used before with 30sec results now runs about 9 sec (so the same time as
the old production server). Hm...

Is it possible that database under some load performs better because it
makes some something that database  not used is not doing?


P.