Discussion: cpu comparison
I have 2 small servers: a fairly new one with an X3450 (4-core with HT) cpu running at 2.67GHz, and an older one with an E5335 (4-core) cpu running at 2GHz.

I have been quite surprised how closely the E5335 compares to the X3450, but maybe I have tested it wrongly.

Here's the CPUINFO:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU E5335 @ 2.00GHz
stepping        : 7
cpu MHz         : 1995.036
cache size      : 4096 KB
physical id     : 3
siblings        : 1
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu tsc msr pae cx8 apic mtrr cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc pni vmx ssse3 cx16 lahf_lm
bogomips        : 4989.65
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
OS: CentOS 64bit
Postgres: 9.0.4 compiled

processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Xeon(R) CPU X3450 @ 2.67GHz
stepping        : 5
cpu MHz         : 2660.099
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm [8]
bogomips        : 5319.92
OS: CentOS 32bit
Postgres: 9.0.4 compiled

In my testing I have a 32bit CentOS on the X3450, but a 64bit CentOS on the E5335. Can this make such a big difference, or should they perform fairly close to the same speed? Both servers have 8GB of RAM, and the database I tested with is only 3.7GB.

I'm a bit surprised, as the X3450 has DDR3 while the E5335 has DDR2, and of course because of the clock speed difference alone I would think the X3450 should beat the E5335.

On Mon, Jul 18, 2011 at 01:48:20PM -0600, M. D. wrote:
> I have 2 small servers, one a fairly new server with a x3450 (4-core
> with HT) cpu running at 2.67GHz and an older E5335 (4-core) cpu
> running at 2GHz.
>
> I have been quite surprised how the E5335 compares very closely to
> the x3450, but maybe I have tested it wrongly.
>
> In my testing I have a 32bit CentOS on the x3450, but a 64bit CentOS
> on the E5335. Can this make such a big difference or should they
> perform fairly close to the same speed? Both servers have 8GB of
> RAM, and the database I tested with is only 3.7GB.
>
> I'm a bit surprised as the x3450 has DDR3, while the E5335 has DDR2,
> and of course because of the cycle speed difference alone I would
> think the X3450 should beat the E5335.

Yes, you have basically shown that running two different tests gives
different results -- or that an apple is not an orange. You need to
vary only 1 variable at a time for it to mean anything.

Regards,
Ken

On 18.7.2011 22:11, ktm@rice.edu wrote:
>> In my testing I have a 32bit CentOS on the x3450, but a 64bit CentOS
>> on the E5335. Can this make such a big difference or should they
>> perform fairly close to the same speed? Both servers have 8GB of
>> RAM, and the database I tested with is only 3.7GB.
>>
>> I'm a bit surprised as the x3450 has DDR3, while the E5335 has DDR2,
>> and of course because of the cycle speed difference alone I would
>> think the X3450 should beat the E5335.
>
> Yes, you have basically shown that running two different tests gives
> different results -- or that an apple is not an orange. You need to
> vary only 1 variable at a time for it to mean anything.

He just ran the same test on two different machines - I'm not sure what's wrong with it? Sure, it would be nice to compare 32bit to 32bit, but the OP probably can't do that and wonders if this is the cause. Why is that comparing apples and oranges?

According to http://www.cpubenchmark.net, the X3450 is about 2x as fast as the E5335 (5,298 vs. 2,575), although this is just a synthetic score.

I'm a bit confused by the E5335 cpuinfo output, because it says "cpu cores : 1" whereas I'd expect "4" here.

I do recall hyperthreading generally was not recommended for a DB, not sure if that changed recently. A quick search revealed this post

http://serverfault.com/questions/219791/hyperthreading-vs-sql-server-postgresql

stating that since Nehalem CPUs (and the X3450 is Nehalem) this should not be a problem anymore. Not sure if it's true; I guess it's worth testing, as it might slow down the X3450 box.

OP: We need more details about the tests you have run; without them we're just guessing. Have you collected some system stats (vmstat, iostat) during the test?

Tomas
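
One way to cross-check the "cpu cores : 1" puzzle is to count sockets and cores across all stanzas of /proc/cpuinfo rather than reading a single processor entry. The following is only a rough sketch: the function name `summarize_cpuinfo` is made up for illustration, and it assumes the usual Linux layout of blank-line-separated stanzas carrying `physical id` and `core id` fields.

```python
def summarize_cpuinfo(text):
    """Count logical CPUs, sockets and distinct cores in /proc/cpuinfo-style text.

    Assumes stanzas are separated by blank lines and that each stanza
    has 'physical id' and 'core id' fields (hypothetical helper, not a
    standard tool).
    """
    processors = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends one processor stanza.
            if current:
                processors.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        processors.append(current)

    # Distinct sockets, and distinct (socket, core) pairs; hyperthreads
    # share the same pair, so they are not counted twice as cores.
    sockets = {p.get("physical id") for p in processors}
    cores = {(p.get("physical id"), p.get("core id")) for p in processors}
    return {
        "logical": len(processors),
        "sockets": len(sockets),
        "cores": len(cores),
    }
```

On a Linux host this could be fed the contents of /proc/cpuinfo, for example `summarize_cpuinfo(open("/proc/cpuinfo").read())`. If the E5335 box reports a different `physical id` for each processor, that would explain "cpu cores : 1" per entry, but that is a guess until the full output is posted.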

On 7/18/11 12:48 PM, M. D. wrote:
> I have 2 small servers, one a fairly new server with a x3450 (4-core
> with HT) cpu running at 2.67GHz and an older E5335 (4-core) cpu running
> at 2GHz.
>
> I have been quite surprised how the E5335 compares very closely to the
> x3450, but maybe I have tested it wrongly.

What test? What were the results?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

On Mon, Jul 18, 2011 at 11:56:40PM +0200, Tomas Vondra wrote:
> On 18.7.2011 22:11, ktm@rice.edu wrote:
> > Yes, you have basically shown that running two different tests gives
> > different results -- or that an apple is not an orange. You need to
> > vary only 1 variable at a time for it to mean anything.
>
> He just ran the same test on two different machines - I'm not sure
> what's wrong with it? Sure, it would be nice to compare 32bit to 32bit,
> but the OP probably can't do that and wonders if this is the cause. Why
> is that comparing apples and oranges?

It is only that 32 vs. 64 bit, compiler and other things can easily make a factor of 2 change in the results. So it is not necessarily telling you much about the processor differences.

Regards,
Ken

M. D. wrote:
> I'm a bit surprised as the x3450 has DDR3, while the E5335 has DDR2,
> and of course because of the cycle speed difference alone I would
> think the X3450 should beat the E5335.

Try comparing them with stream-scaling to see what happens:

https://github.com/gregs1104/stream-scaling

You can't really test CPU performance in a simple way anymore; it varies depending on the number of processes running at once. This test is the best way I've found to show how that works. On a single thread, the X3450 may not be significantly better than the E5335. But what should happen is that total speed keeps going up as you add more threads on the newer system, while the old DDR2 model stays at about the same total.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD

On Mon, Jul 18, 2011 at 6:47 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Try comparing them with stream-scaling to see what happens:
>
> https://github.com/gregs1104/stream-scaling
>
> You can't really test CPU performance in a simple way anymore; it varies
> depending on the number of processes running at once. This test is the best
> way I've found to show how that works. On a single thread, the X3450 may
> not be significantly better than the E5335. But what should happen is that
> total speed keeps going up as you add more threads on the newer system,
> while the old DDR2 model stays at about the same total.

By way of example, we have a server with dual 6-core Opterons that runs on 667MHz memory, and it maxes out the stream test with 8 threads, getting no faster as you add threads. OTOH, our 4x12-core Opteron machines, with 1333MHz memory and something like 8 different channels to it, scale right up to 40 or more threads running the stream test.

On 07/18/2011 03:56 PM, Tomas Vondra wrote:
> He just ran the same test on two different machines - I'm not sure
> what's wrong with it? Sure, it would be nice to compare 32bit to 32bit,
> but the OP probably can't do that and wonders if this is the cause. Why
> is that comparing apples and oranges?
>
> OP: We need more details about the tests you have run; without them we're
> just guessing. Have you collected some system stats (vmstat, iostat)
> during the test?
>
> Tomas

Thank you. That was exactly my reason for posting.

I did some more serious testing, and it seems that what I was testing with did not give me proper results at all, or maybe it was because I had not tweaked the config file.

After more testing, I'm seeing the X3450 run more than 2x as fast as the E5335. This is just a simple test, but it's something that is run on a continuous basis in this application, so that's what I wanted to test with. Table item_change has around 2M rows.

Could someone please tell me whether it would help to partition the item_change table (it has a date column)? As far as I've seen, an application needs to change if a table is partitioned, right?

Here's the query I ran:

explain analyse
select item.item_id, item_plu.number, item.description,
  (select dept.name from dept where dept.dept_id = item.dept_id),
  (select subdept.name from subdept where subdept.subdept_id = item.subdept_id),
  (select sum(on_hand) from item_change where item_change.item_id = item.item_id),
  (select sum(on_order) from item_change where item_change.item_id = item.item_id),
  (select sum(total_cost) from item_change where item_change.item_id = item.item_id),
  (select price from item_price
    where item_price.item_id = item.item_id
      and item_price.zone_id = 'OUe1zXgADRnWemS1grOerQ'
      and item_price.price_type = 0
      and item_price.size_name = item.sell_size)
from item
join item_plu on item.item_id = item_plu.item_id and item_plu.seq_num = 0
where item.inactive_on is null;

E5335

                                 QUERY PLAN
--------------------------------------------------------------------------------
 Merge Join  (cost=0.27..56795323.05 rows=79821 width=95) (actual time=0.270..35769.722 rows=72273 loops=1)
   Merge Cond: (item.item_id = item_plu.item_id)
   ->  Index Scan using item_pkey on item  (cost=0.00..9599.57 rows=72249 width=86) (actual time=0.011..216.709 rows=72273 loops=1)
         Filter: (inactive_on IS NULL)
   ->  Index Scan using item_plu_pkey on item_plu  (cost=0.00..5551.89 rows=79821 width=32) (actual time=0.013..226.435 rows=80114 loops=1)
         Index Cond: (item_plu.seq_num = 0)
   SubPlan 1
     ->  Seq Scan on dept  (cost=0.00..5.16 rows=1 width=8) (actual time=0.003..0.007 rows=1 loops=72273)
           Filter: (dept_id = $0)
   SubPlan 2
     ->  Index Scan using subdept_pkey on subdept  (cost=0.00..5.27 rows=1 width=8) (actual time=0.009..0.011 rows=1 loops=72273)
           Index Cond: (subdept_id = $1)
   SubPlan 3
     ->  Aggregate  (cost=231.86..231.87 rows=1 width=6) (actual time=0.152..0.153 rows=1 loops=72273)
           ->  Index Scan using item_change_i2 on item_change  (cost=0.00..231.63 rows=91 width=6) (actual time=0.021..0.094 rows=28 loops=72273)
                 Index Cond: (item_id = $2)
   SubPlan 4
     ->  Aggregate  (cost=231.86..231.87 rows=1 width=5) (actual time=0.132..0.133 rows=1 loops=72273)
           ->  Index Scan using item_change_i2 on item_change  (cost=0.00..231.63 rows=91 width=5) (actual time=0.021..0.076 rows=28 loops=72273)
                 Index Cond: (item_id = $2)
   SubPlan 5
     ->  Aggregate  (cost=231.86..231.87 rows=1 width=8) (actual time=0.133..0.134 rows=1 loops=72273)
           ->  Index Scan using item_change_i2 on item_change  (cost=0.00..231.63 rows=91 width=8) (actual time=0.021..0.075 rows=28 loops=72273)
                 Index Cond: (item_id = $2)
   SubPlan 6
     ->  Index Scan using item_price_i3 on item_price  (cost=0.00..5.29 rows=1 width=7) (actual time=0.015..0.017 rows=1 loops=72273)
           Index Cond: (item_id = $2)
           Filter: ((zone_id = 'OUe1zXgADRnWemS1grOerQ'::bpchar) AND (price_type = 0) AND ((size_name)::text = ($3)::text))
 Total runtime: 35871.253 ms
(29 rows)

X3450

                                 QUERY PLAN
--------------------------------------------------------------------------------
 Merge Join  (cost=0.15..57610807.07 rows=80066 width=95) (actual time=0.141..14680.486 rows=72247 loops=1)
   Merge Cond: (item.item_id = item_plu.item_id)
   ->  Index Scan using item_pkey on item  (cost=0.00..10446.59 rows=72181 width=86) (actual time=0.005..79.796 rows=72247 loops=1)
         Filter: (inactive_on IS NULL)
   ->  Index Scan using item_plu_pkey on item_plu  (cost=0.00..5456.43 rows=80066 width=32) (actual time=0.012..75.303 rows=80085 loops=1)
         Index Cond: (item_plu.seq_num = 0)
   SubPlan 1
     ->  Seq Scan on dept  (cost=0.00..5.16 rows=1 width=8) (actual time=0.001..0.003 rows=1 loops=72247)
           Filter: (dept_id = $0)
   SubPlan 2
     ->  Index Scan using subdept_pkey on subdept  (cost=0.00..5.27 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=72247)
           Index Cond: (subdept_id = $1)
   SubPlan 3
     ->  Aggregate  (cost=234.53..234.54 rows=1 width=6) (actual time=0.060..0.060 rows=1 loops=72247)
           ->  Index Scan using item_change_i2 on item_change  (cost=0.00..234.29 rows=92 width=6) (actual time=0.018..0.041 rows=28 loops=72247)
                 Index Cond: (item_id = $2)
   SubPlan 4
     ->  Aggregate  (cost=234.53..234.54 rows=1 width=5) (actual time=0.053..0.053 rows=1 loops=72247)
           ->  Index Scan using item_change_i2 on item_change  (cost=0.00..234.29 rows=92 width=5) (actual time=0.018..0.034 rows=28 loops=72247)
                 Index Cond: (item_id = $2)
   SubPlan 5
     ->  Aggregate  (cost=234.53..234.54 rows=1 width=8) (actual time=0.053..0.053 rows=1 loops=72247)
           ->  Index Scan using item_change_i2 on item_change  (cost=0.00..234.29 rows=92 width=8) (actual time=0.018..0.034 rows=28 loops=72247)
                 Index Cond: (item_id = $2)
   SubPlan 6
     ->  Index Scan using item_price_i3 on item_price  (cost=0.00..5.29 rows=1 width=7) (actual time=0.012..0.013 rows=1 loops=72247)
           Index Cond: (item_id = $2)
           Filter: ((zone_id = 'OUe1zXgADRnWemS1grOerQ'::bpchar) AND (price_type = 0) AND ((size_name)::text = ($3)::text))
 Total runtime: 14695.559 ms
(29 rows)
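
To see where the time in these two plans goes, it may help to remember that EXPLAIN ANALYZE reports "actual time" as an average per loop, so each subplan's cost is roughly its upper time multiplied by its loops count. The sketch below does that arithmetic for the three Aggregate subplans (SubPlan 3, 4 and 5), using only numbers copied from the plans above; the variable and function names are made up for illustration.

```python
# (per-loop actual time in ms, loops) for SubPlan 3, 4 and 5, plus the
# reported total runtime, copied from the two EXPLAIN ANALYZE outputs.
plans = {
    "E5335": {
        "subplans": [(0.153, 72273), (0.133, 72273), (0.134, 72273)],
        "total_ms": 35871.253,
    },
    "X3450": {
        "subplans": [(0.060, 72247), (0.053, 72247), (0.053, 72247)],
        "total_ms": 14695.559,
    },
}


def aggregate_ms(plan):
    """Approximate total time in the aggregate subplans: per-loop time * loops."""
    return sum(per_loop * loops for per_loop, loops in plan["subplans"])


for host, plan in plans.items():
    agg = aggregate_ms(plan)
    share = agg / plan["total_ms"]
    print(f"{host}: aggregate subplans ~{agg:.0f} ms "
          f"of {plan['total_ms']:.0f} ms ({share:.0%})")

# Ratio of total runtimes between the two hosts.
speedup = plans["E5335"]["total_ms"] / plans["X3450"]["total_ms"]
print(f"X3450 vs E5335 on total runtime: {speedup:.2f}x")
```

By this arithmetic the three sum() subqueries over item_change account for most of the runtime on both machines, and the overall ratio comes out a little above 2.4x. The per-loop times are rounded to the microsecond in the plan output, so the products are estimates rather than exact figures.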
			
I'm just top posting this because this whole thread needs a reset before it goes any farther.

Start with a real description of these hosts: number and types of disks, filesystem configs, processors, memory, OS, etc.  If your db is small enough to fit into RAM, please show us the db config you are using which ensures that you are making best use of available RAM, etc.
Then we need to know what your test looks like - showing us a query and an explain plan without any info about the table structure, indexes, number of rows, and table usage patterns doesn't provide anywhere near enough info to diagnose inefficiency.
There are several documents linked right from the page for this mailing list that describe exactly how to go about providing enough info to get help from the list.  Please read through them, then update us with the necessary information, and I'm sure we'll be able to offer you some insight into what is going on.
And for the record, your app probably doesn't need to change to use table partitioning, at least when selecting data.  Depending upon how data is loaded, you may need to change how you do inserts.  But it is impossible to comment on whether partitioning might help you without knowing table structure, value distributions, query patterns, and number of rows in the table.  If you are always selecting over the whole range of data, partitioning isn't likely to buy you anything, for example.