Обсуждение: scalablility problem
Hi all, When I run multiple TPC-H queries (DBT3) on postgresql, I found the system is not scalable. My machine has 8GB memory, and 4 Xeon Dual Core processor ( 8 cores in total). OS kernel is linux 2.6.9. Postgresql is 7.3.18. I run multiple q2 queries simultaneously. The results are: 1 process takes 0.65 second to finish. 2 processes take 1.07 seconds. 4 processes take 4.93 seconds. 8 processes take 16.95 seconds. For 4-process case and 8-process case, queries takes even more time than they are executed serially one after another. Because the system has 8GB memory, which is much bigger than the DB size(SF=1), and I warmed the cache before I run the test, I do not think the problem was caused by disk I/O. I think it might be caused by some contentions. But I do not know postgresql much. May anybody give me some clue to find the reasons? Thanks! Xiaoning
> From: pgsql-performance-owner@postgresql.org > [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of > Xiaoning Ding > > > Hi all, > > When I run multiple TPC-H queries (DBT3) on postgresql, I > found the system > is not scalable. My machine has 8GB memory, and 4 Xeon Dual > Core processor > ( 8 cores in total). OS kernel is linux 2.6.9. Postgresql is > 7.3.18. Is there anyway you can upgrade to 8.2? There have been a lot of performance and scalability enhancements.
Xiaoning Ding <dingxn@cse.ohio-state.edu> writes: > When I run multiple TPC-H queries (DBT3) on postgresql, I found the system > is not scalable. My machine has 8GB memory, and 4 Xeon Dual Core processor > ( 8 cores in total). OS kernel is linux 2.6.9. Postgresql is 7.3.18. If you are not running PG 8.1 or later, it's really not worth your time to test this. Multiprocessor scalability was hardly even on the radar in 7.3 days. regards, tom lane
Thanks guys, I update PG to 8.2.3. The results are much better now. 1 process : 0.94 second 2 processes: 1.32 seconds 4 processes: 2.03 seconds 8 processes: 2.54 seconds Do you think they are good enough? BTW where can I found some info on what 8.2.3 did to improve scalability compared with pre 8.1 versions? Xiaoning Tom Lane wrote: > Xiaoning Ding <dingxn@cse.ohio-state.edu> writes: >> When I run multiple TPC-H queries (DBT3) on postgresql, I found the system >> is not scalable. My machine has 8GB memory, and 4 Xeon Dual Core processor >> ( 8 cores in total). OS kernel is linux 2.6.9. Postgresql is 7.3.18. > > If you are not running PG 8.1 or later, it's really not worth your time > to test this. Multiprocessor scalability was hardly even on the radar > in 7.3 days. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend > >
On Fri, 2007-03-30 at 15:25, Xiaoning Ding wrote: > Hi all, > > When I run multiple TPC-H queries (DBT3) on postgresql, I found the system > is not scalable. My machine has 8GB memory, and 4 Xeon Dual Core processor > ( 8 cores in total). OS kernel is linux 2.6.9. Postgresql is 7.3.18. I > run multiple > q2 queries simultaneously. The results are: > > 1 process takes 0.65 second to finish. > 2 processes take 1.07 seconds. > 4 processes take 4.93 seconds. > 8 processes take 16.95 seconds. > > For 4-process case and 8-process case, queries takes even more time than > they are executed serially one after another. Because the system has 8GB > memory, which is much bigger than the DB size(SF=1), and I warmed the cache > before I run the test, I do not think the problem was caused by disk I/O. You may be right, you may be wrong. What did top / vmstat have to say about IO wait / disk idle time? PostgreSQL has to commit transactions to disk. TPC-H does both business decision mostly read queries, as well as mixing in writes. If you have one hard drive, it may well be that activity is stacking up waiting on those writes. > I think it might be caused by some contentions. But I do not know postgresql > much. May anybody give me some clue to find the reasons? Others have mentioned your version of postgresql. 7.3 is quite old, as it came out at the end of 2002. Seeing as 7.3 is the standard pgsql version supported by RHEL3, and RHEL came with a 2.6.9 kernel, I'm gonna guess your OS is about that old too. pgsql 7.3 cannot take advantage of lots of shared memory, and has some issues scaling to lots of CPUs / processes. While RHEL won't be EOLed for a few more years (redhat promises 7 years I think) it's really not a great choice for getting started today. RHEL5 just released and RHEL4 is very stable. There are several things to look at to get better performance. 1: Late model PostgreSQL. Go with 8.2.3 or as a minimum 8.1.8 2: Late model Unix. 3: RAID controller with battery backed cache 4: Plenty of memory. 5: Lots of hard drives 6: 4 to 8 CPUs. Then, google postgresql performance tuning. There are three or four good tuning guides that pop up right at the top.
On Fri, 2007-03-30 at 16:38, Xiaoning Ding wrote: > Thanks guys, > > I update PG to 8.2.3. The results are much better now. > 1 process : 0.94 second > 2 processes: 1.32 seconds > 4 processes: 2.03 seconds > 8 processes: 2.54 seconds > > Do you think they are good enough? > BTW where can I found some info on what 8.2.3 did to improve > scalability compared with pre 8.1 versions? Very nice, eh? I'd say look through -hackers and -perform to see some of it, but as usual, the source code is the reference. You'd be surprised how well commented it is.
Quoth dingxn@cse.ohio-state.edu (Xiaoning Ding): > When I run multiple TPC-H queries (DBT3) on postgresql, I found the system > is not scalable. My machine has 8GB memory, and 4 Xeon Dual Core processor > ( 8 cores in total). OS kernel is linux 2.6.9. Postgresql is 7.3.18. I > I think it might be caused by some contentions. But I do not know postgresql > much. May anybody give me some clue to find the reasons? Two primary issues: 1. You're running a horrendously ancient version of PostgreSQL. The 7.3 series is Really Old. Newer versions have *enormous* improvements that are likely to be *enormously* relevant. Upgrade to 8.2. 2. There are known issues with the combination of Xeon processors and PAE memory addressing; that sort of hardware tends to be *way* less speedy than the specs would suggest. There have been "context switching" issues on this sort of hardware that are enormously worsened if you're running on a version of PostgreSQL that is 4 major releases out of date. -- wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','gmail.com'). http://cbbrowne.com/info/x.html I am not a number! I am a free man!
On 30.03.2007, at 19:18, Christopher Browne wrote: > 2. There are known issues with the combination of Xeon processors and > PAE memory addressing; that sort of hardware tends to be *way* less > speedy than the specs would suggest. That is not true as the current series of processors (Woodcrest and the like) are also called Xeon. You probably mean the Pentium IV era Xeons. cug
On Fri, Mar 30, 2007 at 10:00:30PM -0600, Guido Neitzer wrote: >On 30.03.2007, at 19:18, Christopher Browne wrote: >>2. There are known issues with the combination of Xeon processors and >>PAE memory addressing; that sort of hardware tends to be *way* less >>speedy than the specs would suggest. > >That is not true as the current series of processors (Woodcrest and >the like) are also called Xeon. You probably mean the Pentium IV era >Xeons. Well, the newer ones can address large amounts of memory directly, without using PAE, but the original comment was correct--PAE is slow regardless of what processor implements it. Mike Stone
Scott Marlowe wrote: > On Fri, 2007-03-30 at 15:25, Xiaoning Ding wrote: >> Hi all, >> >> When I run multiple TPC-H queries (DBT3) on postgresql, I found the system >> is not scalable. My machine has 8GB memory, and 4 Xeon Dual Core processor >> ( 8 cores in total). OS kernel is linux 2.6.9. Postgresql is 7.3.18. I >> run multiple >> q2 queries simultaneously. The results are: >> >> 1 process takes 0.65 second to finish. >> 2 processes take 1.07 seconds. >> 4 processes take 4.93 seconds. >> 8 processes take 16.95 seconds. >> >> For 4-process case and 8-process case, queries takes even more time than >> they are executed serially one after another. Because the system has 8GB >> memory, which is much bigger than the DB size(SF=1), and I warmed the cache >> before I run the test, I do not think the problem was caused by disk I/O. > > You may be right, you may be wrong. What did top / vmstat have to say > about IO wait / disk idle time? > > PostgreSQL has to commit transactions to disk. TPC-H does both business > decision mostly read queries, as well as mixing in writes. If you have > one hard drive, it may well be that activity is stacking up waiting on > those writes. Shouldn't writes be asynchronous in linux ? >> I think it might be caused by some contentions. But I do not know postgresql >> much. May anybody give me some clue to find the reasons? > > Others have mentioned your version of postgresql. 7.3 is quite old, as > it came out at the end of 2002. Seeing as 7.3 is the standard pgsql > version supported by RHEL3, and RHEL came with a 2.6.9 kernel, I'm gonna > guess your OS is about that old too. > > pgsql 7.3 cannot take advantage of lots of shared memory, and has some > issues scaling to lots of CPUs / processes. I use RHEL 4. I can not understand how the scalability related with shared memory? > While RHEL won't be EOLed for a few more years (redhat promises 7 years > I think) it's really not a great choice for getting started today. > RHEL5 just released and RHEL4 is very stable. > > There are several things to look at to get better performance. > > 1: Late model PostgreSQL. Go with 8.2.3 or as a minimum 8.1.8 > 2: Late model Unix. > 3: RAID controller with battery backed cache > 4: Plenty of memory. > 5: Lots of hard drives > 6: 4 to 8 CPUs. > > Then, google postgresql performance tuning. There are three or four good > tuning guides that pop up right at the top. > >
>> pgsql 7.3 cannot take advantage of lots of shared memory, and has some >> issues scaling to lots of CPUs / processes. > > I use RHEL 4. I can not understand how the scalability related with > shared memory? It isn't RHEL4 and shared memory. It is PostgreSQL and shared memory. Things have changed with PostgreSQL since 7.3 (7.3 is really god awful old) that allow it to more effectively access shared memory and thus provide better performance. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Christopher Browne wrote: > Quoth dingxn@cse.ohio-state.edu (Xiaoning Ding): >> When I run multiple TPC-H queries (DBT3) on postgresql, I found the system >> is not scalable. My machine has 8GB memory, and 4 Xeon Dual Core processor >> ( 8 cores in total). OS kernel is linux 2.6.9. Postgresql is 7.3.18. I > >> I think it might be caused by some contentions. But I do not know postgresql >> much. May anybody give me some clue to find the reasons? > > Two primary issues: > > 1. You're running a horrendously ancient version of PostgreSQL. The > 7.3 series is Really Old. Newer versions have *enormous* > improvements that are likely to be *enormously* relevant. > > Upgrade to 8.2. 8.2 is really much better. > > 2. There are known issues with the combination of Xeon processors and > PAE memory addressing; that sort of hardware tends to be *way* less > speedy than the specs would suggest. I think PAE slows each query process. It would not affect scalability. > There have been "context switching" issues on this sort of hardware > that are enormously worsened if you're running on a version of > PostgreSQL that is 4 major releases out of date. How does PG 8.2 address this issue? by setting processor affinity? Thanks! Xiaoning
Michael Stone wrote: > On Fri, Mar 30, 2007 at 10:00:30PM -0600, Guido Neitzer wrote: >> On 30.03.2007, at 19:18, Christopher Browne wrote: >>> 2. There are known issues with the combination of Xeon processors and >>> PAE memory addressing; that sort of hardware tends to be *way* less >>> speedy than the specs would suggest. >> >> That is not true as the current series of processors (Woodcrest and >> the like) are also called Xeon. You probably mean the Pentium IV era >> Xeons. > > Well, the newer ones can address large amounts of memory directly, > without using PAE, but the original comment was correct--PAE is slow > regardless of what processor implements it. Here is the information in /proc/cpuinfo processor : 6 vendor_id : GenuineIntel cpu family : 15 model : 4 model name : Intel(R) Xeon(TM) CPU 2.80GHz stepping : 8 cpu MHz : 2793.091 cache size : 2048 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 fpu : yes fpu_exception : yes cpuid level : 5 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est cid cx16 xtpr bogomips : 5586.08 clflush size : 64 cache_alignment : 128 address sizes : 36 bits physical, 48 bits virtual > > Mike Stone > > ---------------------------(end of broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate > >
"Joshua D. Drake" <jd@commandprompt.com> writes: >> I use RHEL 4. I can not understand how the scalability related with >> shared memory? > It isn't RHEL4 and shared memory. It is PostgreSQL and shared memory. > Things have changed with PostgreSQL since 7.3 (7.3 is really god awful > old) that allow it to more effectively access shared memory and thus > provide better performance. Some specifics: * bufmgr algorithms redesigned to allow larger number of shared buffers to be used effectively * bufmgr redesigned to not have a single lock for management of all shared buffers; likewise for lockmgr * lots of marginal tweaks such as paying attention to cache line alignment of "hot" shared data structures I'm probably forgetting some things but I think the bufmgr and lockmgr changes were the biggest improvements in this area. regards, tom lane
Tom Lane wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: >>> I use RHEL 4. I can not understand how the scalability related with >>> shared memory? > >> It isn't RHEL4 and shared memory. It is PostgreSQL and shared memory. >> Things have changed with PostgreSQL since 7.3 (7.3 is really god awful >> old) that allow it to more effectively access shared memory and thus >> provide better performance. > > Some specifics: > > * bufmgr algorithms redesigned to allow larger number of shared buffers > to be used effectively > > * bufmgr redesigned to not have a single lock for management of all > shared buffers; likewise for lockmgr > > * lots of marginal tweaks such as paying attention to cache line > alignment of "hot" shared data structures > > I'm probably forgetting some things but I think the bufmgr and lockmgr > changes were the biggest improvements in this area. > > regards, tom lane That is very helpful. Thanks! Xiaoning > >
Xiaoning Ding wrote: > Postgresql is 7.3.18. [...] > 1 process takes 0.65 second to finish. > I update PG to 8.2.3. The results are [...] now. > 1 process : 0.94 second You sure about your test environment? Anything else running at the same time, perhaps? I'm a bit surprised that 8.2.3 would be 40% slower than 7.3.18 even in the single process case.
I repeated the test again. It took 0.92 second under 8.2.3. I checked system load using top and ps. There is no other active processes. Xiaoning Ron Mayer wrote: > Xiaoning Ding wrote: >> Postgresql is 7.3.18. [...] >> 1 process takes 0.65 second to finish. > >> I update PG to 8.2.3. The results are [...] now. >> 1 process : 0.94 second > > You sure about your test environment? Anything else > running at the same time, perhaps? > > I'm a bit surprised that 8.2.3 would be 40% slower than 7.3.18 > even in the single process case. > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster > >
I may have missed this but have you tuned your postgresql configuration ? 8.2 tuning guidelines are significantly different than 7.3 Dave On 1-Apr-07, at 1:51 PM, Xiaoning Ding wrote: > I repeated the test again. It took 0.92 second under 8.2.3. > I checked system load using top and ps. There is no other > active processes. > > Xiaoning > > Ron Mayer wrote: >> Xiaoning Ding wrote: >>> Postgresql is 7.3.18. [...] >>> 1 process takes 0.65 second to finish. >>> I update PG to 8.2.3. The results are [...] now. >>> 1 process : 0.94 second >> You sure about your test environment? Anything else >> running at the same time, perhaps? >> I'm a bit surprised that 8.2.3 would be 40% slower than 7.3.18 >> even in the single process case. >> ---------------------------(end of >> broadcast)--------------------------- >> TIP 2: Don't 'kill -9' the postmaster > > > ---------------------------(end of > broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match