Issue with Linux+Pentium SMP Context Switching

From Josh Berkus
Subject Issue with Linux+Pentium SMP Context Switching
Date
Msg-id 200312191030.13499.josh@agliodbs.com
Responses Re: Issue with Linux+Pentium SMP Context Switching  (Kurt Roeckx <Q@ping.be>)
Re: Issue with Linux+Pentium SMP Context Switching  (Manfred Spraul <manfred@colorfullife.com>)
Re: Issue with Linux+Pentium SMP Context Switching  (Shridhar Daithankar <shridhar_daithankar@myrealbox.com>)
List pgsql-hackers
Folks,

I brought up this issue a couple of weeks ago on the Performance list.  Since 
then, I've gotten e-mail confirmation from a few other users seeing this 
problem.  Here's the shape of the problem; we just don't know what causes it.
I've been trying to do some profiling, but since I only have production 
systems to work with it's been really slow -- I have to wait for weekly 
downtime for each test.  I'm hoping that someone with a greater knowledge 
of Linux kernel internals and a good test machine can help out.

Linux Versions Reported: RH and Gentoo reported, Kernels 2.4.18 to 2.4.22.
Not tested on other distros/kernels.  Kernels are SMP-enabled.

Hardware: Intel Pentium III and 4 dual-processor systems.  5 of the 6 reported
machines are made by Dell; the other is a home-build.  Demonstrated on both
hyper-threaded and non-hyperthreaded Xeons; cannot be reproduced on Athlons.

Description of the Problem: When a query is made against a table with millions
of rows that requires a seq scan, large hash join, per-row calculations or
other intensive operation, the system climbs to tens or hundreds of thousands
of context switches per second (contrast with, for example, 5000 cs/second on
Athlon MP).  This hurts performance significantly, possibly up to doubling
query execution time.

Initial debug logging of a test on one Xeon system demonstrating this issue
showed a very large number of unattributed semop() calls.  We are still
following up on this.
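
For anyone wanting to compare numbers, here is a minimal sketch of one way to sample the context-switch rate.  It assumes the rate is derived from the cumulative "ctxt" counter in /proc/stat on Linux; the function names are hypothetical and are not part of the original report or of any tool used for it.

```python
import time


def read_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat contents."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The kernel reports the total as a line of the form "ctxt <count>".
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switches_per_second(before, after, elapsed):
    """Convert two cumulative counts into a per-second rate."""
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return (after - before) / elapsed


def sample(interval=1.0, path="/proc/stat"):
    """Measure the system-wide context-switch rate over `interval` seconds."""
    with open(path) as f:
        first = read_ctxt(f.read())
    start = time.monotonic()
    time.sleep(interval)
    with open(path) as f:
        second = read_ctxt(f.read())
    return switches_per_second(first, second, time.monotonic() - start)
```

Calling sample() while one of the heavy queries is running, and again while the machine is idle, gives two figures that can be compared directly.  The counter is system-wide, so other activity on the machine is included in the result.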

In discussions with Linux kernel hackers online, they blame the way that 
PostgreSQL uses shared memory.   Whether or not they are correct, the effect 
of the issue is to harm PostgreSQL's performance and make us look bad on one 
of the major "enterprise" systems of choice: the multi-processor Xeon system.

Ideas, anyone?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

