Thread: About “context-switching issue on Xeon” test case ?

About “context-switching issue on Xeon” test case ?

From
RD黄永卫
Date:

Hi,

Does anybody have the test case for the “context-switching issue on Xeon” from Tom Lane?

Best regards,

Ray Huang

Re: About “context-switching issue on Xeon” test case ?

From
Greg Smith
Date:
RD黄永卫 wrote:
>
> Anybody have the test case of “ context-switching issue on Xeon” from
> Tm lane ?
>

That takes me back:
http://archives.postgresql.org/pgsql-performance/2004-04/msg00280.php

That's a problem seen on 2004 era Xeon processors, and with PostgreSQL
7.4. I doubt it has much relevance nowadays, given a) that whole area of
the code was rewritten for PostgreSQL 8.1, and b) today's Xeons are
nothing like 2004's Xeons.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: [PERFORM] About “context-switching issue on Xeon” test case ?

From
Scott Marlowe
Date:
2010/4/9 Greg Smith <greg@2ndquadrant.com>:
> RD黄永卫 wrote:
>>
>> Anybody have the test case of “ context-switching issue on Xeon” from
>> Tm lane ?
>>
>
> That takes me back:
> http://archives.postgresql.org/pgsql-performance/2004-04/msg00280.php
>
> That's a problem seen on 2004 era Xeon processors, and with PostgreSQL
> 7.4. I doubt it has much relevance nowadays, given a) that whole area of
> the code was rewritten for PostgreSQL 8.1, and b) today's Xeons are
> nothing like 2004's Xeons.

It's important to appreciate that all the improvements in scalability for
Xeons, Opterons, and everything else have mostly just moved further to
the right the point on the graph where you start doing more context
switching than work and performance falls off, the same way that
(sometimes) throwing more cores at a problem can help.  For most
office-sized pgsql servers there's still a real possibility of the
machine getting slammed, and one indicator of that is that context
switches per second start to jump up and the machine gets sluggish.
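
One way to watch the indicator described above is to sample the kernel's cumulative context-switch counter, which Linux exposes as the "ctxt" line in /proc/stat. The following is a minimal sketch, not something posted in this thread; the function names and the 5-second default interval are assumptions chosen for illustration.

```python
import time


def parse_ctxt(proc_stat_text):
    """Return the cumulative context-switch count from /proc/stat contents."""
    for line in proc_stat_text.splitlines():
        fields = line.split()
        # The line of interest looks like: "ctxt 123456789"
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found in the given text")


def switches_per_second(before, after, elapsed_seconds):
    """Convert two cumulative counter readings into a per-second rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (after - before) / elapsed_seconds


def sample_rate(interval=5.0, path="/proc/stat"):
    """Read the counter twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        first = parse_ctxt(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_ctxt(f.read())
    return switches_per_second(first, second, interval)
```

Calling sample_rate() in a loop and logging the result gives the same figure that the "cs" column of vmstat reports, so a sudden jump can be lined up against database activity at that time.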

For 2 sockets Intel rules the roost.  I'd imagine AMD's much faster
bus architecture for >2 sockets would make them the winner, but I
haven't had a system like that to test, either Intel or AMD.

Re: [PERFORM] About “context-switching issue on Xeon” test case ?

From
Scott Marlowe
Date:
2010/4/9 Greg Smith <greg@2ndquadrant.com>:
> RD黄永卫 wrote:
>>
>> Anybody have the test case of “ context-switching issue on Xeon” from
>> Tm lane ?
>>
>
> That takes me back:
> http://archives.postgresql.org/pgsql-performance/2004-04/msg00280.php
>
> That's a problem seen on 2004 era Xeon processors, and with PostgreSQL
> 7.4. I doubt it has much relevance nowadays, given a) that whole area of
> the code was rewritten for PostgreSQL 8.1, and b) today's Xeons are
> nothing like 2004's Xeons.

Note that I found this comment by some guy named Greg from almost a
year ago that I thought was relevant.  (I'd recommend reading the
whole thread.)

http://archives.postgresql.org/pgsql-performance/2009-06/msg00097.php

Re: About “context-switching issue on Xeon” test case ?

From
Greg Smith
Date:
Scott Marlowe wrote:
> For 2 sockets Intel rules the roost.  I'd imagine AMD's much faster
> bus architecture for >2 sockets would make them the winner, but I
> haven't had a system like that to test, either Intel or AMD.
>

AMD has been getting such poor performance due to the RAM they've been
using (DDR2-800) that it really doesn't matter--Intel has been thrashing
them across the board continuously since the "Nehalem" processors became
available, which started in volume in 2009. Intel systems with 3
channels of DDR3-1066 or faster outperform any scale of AMD deployment
on DDR2, and nowadays even Intel's cheaper desktop processors have 2
channels of DDR3-1600 in them.

That's been the situation for almost 18 months now anyway. AMD's new
"Magny-Cours" Opterons have finally adopted DDR3-1333, and closed the
main performance gap with Intel again. Recently I've found the "Oracle
Calling Circle" benchmarking numbers that Anand runs seem to match what
I see in terms of CPU-bound PostgreSQL database workloads, and the latest
set at
http://it.anandtech.com/show/2978/amd-s-12-core-magny-cours-opteron-6174-vs-intel-s-6-core-xeon/8
show how the market now fits together. AMD had a clear lead when it was
Xeon E5450 vs. Opteron 2389; then Intel pulled way ahead with the X5570 and
later processors. Only this month did the Opteron 6174 finally become
competitive again. They're back to being only a little slower at two
sockets, instead of not even close. A 4 socket version of the latest
Opterons with DDR3 might even unseat Intel on some workloads; it's at
least possible again.

Anyway, returning to "context switching on Xeon", there were some
specific issues with the older PostgreSQL code that conflicted badly
with the Xeons of the time, and the test case Tom put together was good
at inflicting the issue. There certainly are still potential ways to
have the current processors and database code run into context switching
issues. I wouldn't expect that particular test case would be the best
way to go looking for them though, which is the reason I highlighted its
age and general obsolescence.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: [PERFORM] About “context-switching issue on Xeon” test case ?

From
Scott Marlowe
Date:
2010/4/10 Greg Smith <greg@2ndquadrant.com>:
> Scott Marlowe wrote:
>> For 2 sockets Intel rules the roost.  I'd imagine AMD's much faster
>> bus architecture for >2 sockets would make them the winner, but I
>> haven't had a system like that to test, either Intel or AMD.
>>
>
> AMD has been getting such poor performance due to the RAM they've been
> using (DDR2-800) that it really doesn't matter--Intel has been thrashing
> them across the board continuously since the "Nehalem" processors became
> available, which started in volume in 2009.

Considering the Nehalems are only available (or were at least) in
two-socket varieties, and that Opterons have two channels per socket,
wouldn't the aggregate performance of 8 sockets x 2 channels each beat
the 2 sockets x 3 channels each?
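
As a rough check on the arithmetic in that question, the sketch below compares theoretical peak memory bandwidth only. It assumes standard 64-bit (8-byte) memory channels and uses the DDR2-800 and DDR3-1066 speeds mentioned earlier in the thread; the function name is made up for illustration. It says nothing about measured database performance, since it ignores NUMA placement, interconnect limits, and latency.

```python
BYTES_PER_TRANSFER = 8  # assumption: one 64-bit channel moves 8 bytes per transfer


def peak_bandwidth_mb_s(transfers_mt_s, channels_per_socket, sockets):
    """Theoretical aggregate peak bandwidth in MB/s across all sockets."""
    return transfers_mt_s * BYTES_PER_TRANSFER * channels_per_socket * sockets


# 8 Opteron sockets, 2 channels of DDR2-800 each
opteron_8_socket = peak_bandwidth_mb_s(800, channels_per_socket=2, sockets=8)

# 2 Nehalem sockets, 3 channels of DDR3-1066 each
nehalem_2_socket = peak_bandwidth_mb_s(1066, channels_per_socket=3, sockets=2)

print(opteron_8_socket)  # 102400
print(nehalem_2_socket)  # 51168
```

On paper the 8-socket configuration has about twice the aggregate peak, but that total is only reachable when the load is spread evenly across all sockets, so it does not by itself settle which system runs a given workload faster.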

Re: [PERFORM] About “context-switching issue on Xeon” test case ?

From
RD黄永卫
Date:

Thank you for your reply!

> one of the indicators of that is that context switches per second will
> start to jump up and the machine gets sluggish

Here is the vmstat log from my database server:


2010-04-07 04:03:15 procs                        memory      swap          io      system         cpu
2010-04-07 04:03:15  r  b   swpd    free   buff   cache   si   so    bi    bo   in     cs us sy id wa
2010-04-07 14:04:27  3  0      0 2361272 272684 3096148    0    0     3  1445  973  14230  7  8 84  0
2010-04-07 14:05:27  2  0      0 2361092 272684 3096220    0    0     3  1804 1029  31852  8 10 81  1
2010-04-07 14:06:27  1  0      0 2362236 272684 3096564    0    0     3  1865 1135  19689  9  9 81  0
2010-04-07 14:07:27  1  0      0 2348400 272720 3101836    0    0     3  1582 1182 149461 15 17 67  0
2010-04-07 14:08:27  3  0      0 2392028 272840 3107600    0    0     3  3093 1275 203196 24 23 53  1
2010-04-07 14:09:27  3  1      0 2386224 272916 3107960    0    0     3  2486 1331 193299 26 22 52  0
2010-04-07 14:10:27 34  0      0 2332320 272980 3107944    0    0     3  1692 1082 214309 24 22 54  0
2010-04-07 14:11:27  1  0      0 2407432 273028 3108092    0    0     6  2770 1540  76643 29 13 57  1
2010-04-07 14:12:27  9  0      0 2358968 273104 3108388    0    0     7  2639 1466  10603 22  6 72  1
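
A timestamped log like this can be scanned mechanically for the spike. The sketch below is only an illustration: it parses a few of the rows above and flags samples whose "cs" value exceeds a threshold. The function names and the 100,000 switches/second threshold are arbitrary assumptions, not a recommended limit.

```python
# Four rows copied from the log above (date, time, then the 16 vmstat columns).
SAMPLE_LOG = """\
2010-04-07 04:03:15  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
2010-04-07 14:06:27  1  0      0 2362236 272684 3096564    0    0     3  1865 1135 19689  9  9 81  0
2010-04-07 14:07:27  1  0      0 2348400 272720 3101836    0    0     3  1582 1182 149461 15 17 67  0
2010-04-07 14:08:27  3  0      0 2392028 272840 3107600    0    0     3  3093 1275 203196 24 23 53  1
2010-04-07 14:12:27  9  0      0 2358968 273104 3108388    0    0     7  2639 1466 10603 22  6 72  1
"""

COLUMNS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
           "bi", "bo", "in", "cs", "us", "sy", "id", "wa"]


def parse_vmstat_log(text):
    """Turn timestamped vmstat rows into dicts; header rows are skipped."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) + 2:
            continue  # not "date time" plus 16 columns
        values = fields[2:]
        if not all(v.isdigit() for v in values):
            continue  # column-name header row
        row = dict(zip(COLUMNS, (int(v) for v in values)))
        row["timestamp"] = fields[0] + " " + fields[1]
        samples.append(row)
    return samples


def find_spikes(samples, cs_threshold=100_000):
    """Return the samples whose context-switch rate exceeds the threshold."""
    return [s for s in samples if s["cs"] > cs_threshold]


for spike in find_spikes(parse_vmstat_log(SAMPLE_LOG)):
    print(spike["timestamp"], spike["cs"], "us=%d sy=%d" % (spike["us"], spike["sy"]))
```

Run over the full log, this would mark 14:07:27 through 14:10:27, which is also where the "sy" (system CPU) percentage roughly doubles.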

 

My postgres version: 8.1.3;
My OS version: Linux version 2.4.21-47.Elsmp((Red Hat Linux 3.2.3-54)
My CPU:

processor       : 7
vendor_id       : GenuineIntel
cpu family      : 15
model           : 6
model name      : Intel(R) Xeon(TM) CPU 3.40GHz
stepping        : 8
cpu MHz         : 3400.262
cache size      : 1024 KB
physical id     : 1

 

 

I don't know what is causing the context-switching storm.
How should I investigate the real cause?
Could you please give me some advice?

Thank you!

 

-----Original Message-----
From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
Sent: April 10, 2010 13:05
To: Greg Smith
Cc: RD黄永卫; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] About context-switching issue on Xeon test case

 

2010/4/9 Greg Smith <greg@2ndquadrant.com>:
> RD黄永卫 wrote:
>>
>> Anybody have the test case of context-switching issue on Xeon from
>> Tm lane
>>
> That takes me back:
> http://archives.postgresql.org/pgsql-performance/2004-04/msg00280.php
> That's a problem seen on 2004 era Xeon processors, and with PostgreSQL
> 7.4. I doubt it has much relevance nowadays, given a) that whole area of
> the code was rewritten for PostgreSQL 8.1, and b) today's Xeons are
> nothing like 2004's Xeons.

It's important to appreciate that all improvements in scalability for
xeons, opterons, and everything else has mostly just moved further
along to the right on the graph where you start doing more context
switching than work, and the performance falls off.  The same way that
(sometimes) throwing more cores at a problem can help.  For most
office sized pgsql servers there's still a real possibility of having
a machine getting slammed and one of the indicators of that is that
context switches per second will start to jump up and the machine gets
sluggish.

For 2 sockets Intel rules the roost.  I'd imagine AMD's much faster
bus architecture for >2 sockets would make them the winner, but I
haven't had a system like that to test, either Intel or AMD.