Discussion: About “context-switching issue on Xeon” test case ?
Hi,
Does anybody have the test case for the “context-switching issue on Xeon” from Tom Lane?
Best regards,
Ray Huang
RD黄永卫 wrote:
>
> Anybody have the test case of “ context-switching issue on Xeon” from
> Tm lane ?
>

That takes me back:
http://archives.postgresql.org/pgsql-performance/2004-04/msg00280.php

That's a problem seen on 2004 era Xeon processors, and with PostgreSQL 7.4. I doubt it has much relevance nowadays, given a) that whole area of the code was rewritten for PostgreSQL 8.1, and b) today's Xeons are nothing like 2004's Xeons.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us
2010/4/9 Greg Smith <greg@2ndquadrant.com>:
> RD黄永卫 wrote:
>>
>> Anybody have the test case of “ context-switching issue on Xeon” from
>> Tm lane ?
>>
>
> That takes me back:
> http://archives.postgresql.org/pgsql-performance/2004-04/msg00280.php
>
> That's a problem seen on 2004 era Xeon processors, and with PostgreSQL
> 7.4. I doubt it has much relevance nowadays, given a) that whole area of
> the code was rewritten for PostgreSQL 8.1, and b) today's Xeons are
> nothing like 2004's Xeons.

It's important to appreciate that all the improvements in scalability for Xeons, Opterons, and everything else have mostly just moved further to the right the point on the graph where you start doing more context switching than work and performance falls off, the same way that (sometimes) throwing more cores at a problem can help. For most office sized pgsql servers there's still a real possibility of having a machine getting slammed, and one of the indicators of that is that context switches per second will start to jump up and the machine gets sluggish.

For 2 sockets Intel rules the roost. I'd imagine AMD's much faster bus architecture for >2 sockets would make them the winner, but I haven't had a system like that to test, either Intel or AMD.
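To watch the indicator mentioned above (context switches per second jumping up), one option on Linux is to sample the cumulative `ctxt` counter in `/proc/stat` twice and divide the difference by the elapsed time. The following is only a minimal sketch under that assumption; the function names and the 5-second default interval are illustrative and do not come from this thread.

```python
import time


def read_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat content."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The counter appears on a line of the form "ctxt <number>".
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def ctxt_rate(before, after, seconds):
    """Context switches per second between two cumulative readings."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    return (after - before) / seconds


def sample_rate(interval=5.0, path="/proc/stat"):
    """Read the live counter twice (Linux only) and return switches/sec.

    The interval and path defaults are assumptions made for illustration.
    """
    with open(path) as f:
        first = read_ctxt(f.read())
    start = time.monotonic()
    time.sleep(interval)
    with open(path) as f:
        second = read_ctxt(f.read())
    return ctxt_rate(first, second, time.monotonic() - start)
```

Calling `sample_rate()` in a loop and logging the result gives roughly the same number that the `cs` column of vmstat reports, which makes it easy to compare a quiet period against a period when the machine feels sluggish.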
2010/4/9 Greg Smith <greg@2ndquadrant.com>:
> RD黄永卫 wrote:
>>
>> Anybody have the test case of “ context-switching issue on Xeon” from
>> Tm lane ?
>>
>
> That takes me back:
> http://archives.postgresql.org/pgsql-performance/2004-04/msg00280.php
>
> That's a problem seen on 2004 era Xeon processors, and with PostgreSQL
> 7.4. I doubt it has much relevance nowadays, given a) that whole area of
> the code was rewritten for PostgreSQL 8.1, and b) today's Xeons are
> nothing like 2004's Xeons.

Note that I found this comment by some guy named Greg from almost a year ago that I thought was relevant. (I'd recommend reading the whole thread.)

http://archives.postgresql.org/pgsql-performance/2009-06/msg00097.php
Scott Marlowe wrote:
> For 2 sockets Intel rules the roost. I'd imagine AMD's much faster
> bus architecture for >2 sockets would make them the winner, but I
> haven't had a system like that to test, either Intel or AMD.
>

AMD has been getting such poor performance due to the RAM they've been using (DDR2-800) that it really doesn't matter--Intel has been thrashing them across the board continuously since the "Nehalem" processors became available, which started in volume in 2009. Intel systems with 3 channels of DDR3-1066 or faster outperform any scale of AMD deployment on DDR2, and nowadays even Intel's cheaper desktop processors have 2 channels of DDR3-1600 in them.

That's been the situation for almost 18 months now anyway. AMD's new "Magny-Cours" Opterons have finally adopted DDR3-1333, and closed the main performance gap with Intel again. Recently I've found that the "Oracle Calling Circle" benchmarking numbers that Anand runs seem to match what I see in terms of CPU-bound PostgreSQL database workloads, and the latest set at http://it.anandtech.com/show/2978/amd-s-12-core-magny-cours-opteron-6174-vs-intel-s-6-core-xeon/8 shows how the market now fits together. AMD had a clear lead when it was Xeon E5450 vs. Opteron 2389; Intel pulled way ahead with the X5570 and later processors. Only this month did the Opteron 6174 finally become competitive again. They're back to being only a little slower at two sockets, instead of not even close. A 4 socket version of the latest Opterons with DDR3 might even unseat Intel on some workloads; it's at least possible again.

Anyway, returning to "context switching on Xeon", there were some specific issues with the older PostgreSQL code that conflicted badly with the Xeons of the time, and the test case Tom put together was good at inflicting the issue. There certainly are still potential ways to have the current processors and database code run into context switching issues. I wouldn't expect that particular test case to be the best way to go looking for them though, which is the reason I highlighted its age and general obsolescence.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us
2010/4/10 Greg Smith <greg@2ndquadrant.com>:
> Scott Marlowe wrote:
>> For 2 sockets Intel rules the roost. I'd imagine AMD's much faster
>> bus architecture for >2 sockets would make them the winner, but I
>> haven't had a system like that to test, either Intel or AMD.
>>
>
> AMD has been getting such poor performance due to the RAM they've been
> using (DDR2-800) that it really doesn't matter--Intel has been thrashing
> them across the board continuously since the "Nehalem" processors became
> available, which started in volume in 2009.

Considering the nehalems are only available (or were at least) in two socket varieties, and that opterons have two channels per socket wouldn't the aggregate performance of 8 sockets x 2 channels each beat the 2 sockets / 3 channels each?
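The arithmetic behind that question can be sketched as a theoretical peak-bandwidth calculation. This is only an illustration: it assumes the standard 64-bit (8-byte) DDR2/DDR3 channel width, uses the DDR2-800 and DDR3-1066 speeds quoted earlier in the thread, and ignores sustained throughput, latency, and the cost of reaching memory attached to another socket, so it says nothing about measured database performance. The function name is hypothetical.

```python
# Assumption: a standard DDR2/DDR3 memory channel is 64 bits (8 bytes) wide.
BYTES_PER_TRANSFER = 8


def peak_gb_per_s(sockets, channels_per_socket, mega_transfers_per_s):
    """Theoretical aggregate peak memory bandwidth in GB/s (1 GB = 1000 MB)."""
    channels = sockets * channels_per_socket
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000.0


# 8 Opteron sockets x 2 channels of DDR2-800 (speeds as quoted in the thread).
opteron_8s = peak_gb_per_s(8, 2, 800)

# 2 Nehalem sockets x 3 channels of DDR3-1066.
nehalem_2s = peak_gb_per_s(2, 3, 1066)

print(f"8 x 2 x DDR2-800 : {opteron_8s:.1f} GB/s peak")
print(f"2 x 3 x DDR3-1066: {nehalem_2s:.1f} GB/s peak")
```

On paper the eight-socket configuration has about twice the aggregate peak, but each socket still only has its own two local channels, so whether that aggregate is usable depends on how well the workload spreads across sockets.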
Thank you for your reply!
“one of the indicators of that is that context switches per second will start to jump up and the machine gets sluggish”
--> Here are the indicators from my database server.
This is the vmstat log of my database server:
2010-04-07 04:03:15  procs memory                      swap  io      system      cpu
2010-04-07 04:03:15   r  b swpd    free   buff   cache si so bi   bo   in     cs us sy id wa
2010-04-07 14:04:27   3  0    0 2361272 272684 3096148  0  0  3 1445  973  14230  7  8 84  0
2010-04-07 14:05:27   2  0    0 2361092 272684 3096220  0  0  3 1804 1029  31852  8 10 81  1
2010-04-07 14:06:27   1  0    0 2362236 272684 3096564  0  0  3 1865 1135  19689  9  9 81  0
2010-04-07 14:07:27   1  0    0 2348400 272720 3101836  0  0  3 1582 1182 149461 15 17 67  0
2010-04-07 14:08:27   3  0    0 2392028 272840 3107600  0  0  3 3093 1275 203196 24 23 53  1
2010-04-07 14:09:27   3  1    0 2386224 272916 3107960  0  0  3 2486 1331 193299 26 22 52  0
2010-04-07 14:10:27  34  0    0 2332320 272980 3107944  0  0  3 1692 1082 214309 24 22 54  0
2010-04-07 14:11:27   1  0    0 2407432 273028 3108092  0  0  6 2770 1540  76643 29 13 57  1
2010-04-07 14:12:27   9  0    0 2358968 273104 3108388  0  0  7 2639 1466  10603 22  6 72  1
My postgres version: 8.1.3;
My OS version: Linux version 2.4.21-47.Elsmp (Red Hat Linux 3.2.3-54)
My CPU:
processor : 7
vendor_id : GenuineIntel
cpu family : 15
model : 6
model name : Intel(R) Xeon(TM) CPU 3.40GHz
stepping : 8
cpu MHz : 3400.262
cache size : 1024 KB
physical id : 1
I don't know what is causing the “context-switching” storm.
How should I investigate the real cause?
Could you please give me some advice?
Thank you!
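As a first step in narrowing down when the storm happens, the timestamped vmstat samples in this message can be scanned for intervals where the `cs` column crosses a threshold. The sketch below embeds those samples verbatim; the 100000 switches/sec threshold and the function names are arbitrary choices made for illustration, not a recommended limit.

```python
# vmstat samples copied from the message above (timestamp + 16 vmstat columns).
SAMPLES = """\
2010-04-07 14:04:27 3 0 0 2361272 272684 3096148 0 0 3 1445 973 14230 7 8 84 0
2010-04-07 14:05:27 2 0 0 2361092 272684 3096220 0 0 3 1804 1029 31852 8 10 81 1
2010-04-07 14:06:27 1 0 0 2362236 272684 3096564 0 0 3 1865 1135 19689 9 9 81 0
2010-04-07 14:07:27 1 0 0 2348400 272720 3101836 0 0 3 1582 1182 149461 15 17 67 0
2010-04-07 14:08:27 3 0 0 2392028 272840 3107600 0 0 3 3093 1275 203196 24 23 53 1
2010-04-07 14:09:27 3 1 0 2386224 272916 3107960 0 0 3 2486 1331 193299 26 22 52 0
2010-04-07 14:10:27 34 0 0 2332320 272980 3107944 0 0 3 1692 1082 214309 24 22 54 0
2010-04-07 14:11:27 1 0 0 2407432 273028 3108092 0 0 6 2770 1540 76643 29 13 57 1
2010-04-07 14:12:27 9 0 0 2358968 273104 3108388 0 0 7 2639 1466 10603 22 6 72 1
"""

FIELDS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
          "bi", "bo", "in", "cs", "us", "sy", "id", "wa"]


def parse_line(line):
    """Split one timestamped vmstat line into (timestamp, {column: int})."""
    parts = line.split()
    if len(parts) != 2 + len(FIELDS):
        raise ValueError(f"unexpected column count: {line!r}")
    stamp = parts[0] + " " + parts[1]
    values = dict(zip(FIELDS, (int(p) for p in parts[2:])))
    return stamp, values


def find_spikes(text, threshold=100000):
    """Return (timestamp, cs, sy) for samples whose cs exceeds the threshold.

    The default threshold is an arbitrary illustrative value.
    """
    spikes = []
    for line in text.splitlines():
        if not line.strip():
            continue
        stamp, v = parse_line(line)
        if v["cs"] > threshold:
            spikes.append((stamp, v["cs"], v["sy"]))
    return spikes


for stamp, cs, sy in find_spikes(SAMPLES):
    print(f"{stamp}  cs={cs}  sy={sy}%")
```

With this threshold the flagged window is 14:07:27 through 14:10:27, which is also where system CPU time (`sy`) roughly doubles; knowing the window makes it possible to check what the database was running at that time.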
-----Original Message-----
From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
Sent: April 10, 2010 13:05
To: Greg Smith
Cc: RD黄永卫; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] About “context-switching issue on Xeon” test case ?
2010/4/9 Greg Smith <greg@2ndquadrant.com>:
> RD黄永卫 wrote:
>>
>> Anybody have the test case of “ context-switching issue on Xeon” from
>> Tm lane ?
>>
>
> That takes me back:
> http://archives.postgresql.org/pgsql-performance/2004-04/msg00280.php
>
> That's a problem seen on 2004 era Xeon processors, and with PostgreSQL
> 7.4. I doubt it has much relevance nowadays, given a) that whole area of
> the code was rewritten for PostgreSQL 8.1, and b) today's Xeons are
> nothing like 2004's Xeons.
It's important to appreciate that all improvements in scalability for
xeons, opterons, and everything else has mostly just moved further
along to the right on the graph where you start doing more context
switching than work, and the performance falls off. The same way that
(sometimes) throwing more cores at a problem can help. For most
office sized pgsql servers there's still a real possibility of having
a machine getting slammed and one of the indicators of that is that
context switches per second will start to jump up and the machine gets
sluggish.
For 2 sockets Intel rules the roost. I'd imagine AMD's much faster
bus architecture for >2 sockets would make them the winner, but I
haven't had a system like that to test, either Intel or AMD.