Re: Hardware suggestions for maximum read performance

От: Mike McCann
Тема: Re: Hardware suggestions for maximum read performance
Дата: ,
Msg-id: 44A45613-5E99-4E61-975E-48F957B971FC@mbari.org
(см: обсуждение, исходный текст)
Ответ на: Re: Hardware suggestions for maximum read performance  (Jeff Janes)
Ответы: Re: Hardware suggestions for maximum read performance  (Jeff Janes)
Re: Hardware suggestions for maximum read performance  (Greg Smith)
Список: pgsql-performance

Скрыть дерево обсуждения

Hardware suggestions for maximum read performance  (Mike McCann, )
 Re: Hardware suggestions for maximum read performance  (Scott Marlowe, )
  Re: Hardware suggestions for maximum read performance  (Jeff Janes, )
   Re: Hardware suggestions for maximum read performance  (Mike McCann, )
    Re: Hardware suggestions for maximum read performance  (Jeff Janes, )
     Re: Hardware suggestions for maximum read performance  (Mike McCann, )
      Re: Hardware suggestions for maximum read performance  (Scott Marlowe, )
    Re: Hardware suggestions for maximum read performance  (Greg Smith, )
     Re: Hardware suggestions for maximum read performance  (Scott Marlowe, )
 Re: Hardware suggestions for maximum read performance  (Arjen van der Meijden, )
  Re: Hardware suggestions for maximum read performance  (Scott Marlowe, )
 Re: Hardware suggestions for maximum read performance  (Julien Cigar, )
 Re: Hardware suggestions for maximum read performance  ("Yuri Levinsky", )

On May 7, 2013, at 4:21 PM, Jeff Janes wrote:

On Thu, May 2, 2013 at 6:35 PM, Scott Marlowe <> wrote:
On Thu, May 2, 2013 at 5:11 PM, Mike McCann <> wrote:
> Hello,
>
> We are in the fortunate situation of having more money than time to help
> solve our PostgreSQL 9.1 performance problem.
>
> Our server hosts databases that are about 1 GB in size with the largest
> tables having order 10 million 20-byte indexed records. The data are loaded
> once and then read from a web app and other client programs.  Some of the
> queries execute ORDER BY on the results. There are typically less than a
> dozen read-only concurrent connections to any one database.

I wouldn't count on this being a problem that can be fixed merely by throwing money at it.

How many rows does any one of these queries need to access and then ORDER BY?

...

>
> HP ProLiant DL360p Gen 8
> Dual Intel Xeon 2.4GHz 4-core E5-2609 CPUs
> 64GB RAM
> 2x146GB 15K SAS hard drives
> 3x200GB SATA SLC SSDs
> + the usual accessories (optical drive, rail kit, dual power supplies)

If your DB is 1G, and will grow to 10G then the IO shouldn't be any
problem, as the whole db should be cached in memory.


But it can take a surprisingly long time to get it cached in the first place, from a cold start.

If that is the problem, pg_prewarm could help.  


Cheers,

Jeff

Thank you everyone for your suggestions.

It's clear that our current read performance was not limited by hardware.  An 'explain analyze' for a sample query is:

stoqs_march2013_s=# show work_mem;
 work_mem 
----------
 1MB
(1 row)

stoqs_march2013_s=# explain analyze select * from stoqs_measuredparameter order by datavalue;
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=541002.15..549456.68 rows=3381814 width=20) (actual time=6254.780..7244.074 rows=3381814 loops=1)
   Sort Key: datavalue
   Sort Method: external merge  Disk: 112424kB
   ->  Seq Scan on stoqs_measuredparameter  (cost=0.00..55359.14 rows=3381814 width=20) (actual time=0.011..354.385 rows=3381814 loops=1)
 Total runtime: 7425.854 ms
(5 rows)


Increasing work_mem to 355 MB improves the performance by a factor of 2:

stoqs_march2013_s=# set work_mem='355MB';
SET
stoqs_march2013_s=# explain analyze select * from stoqs_measuredparameter order by datavalue;
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=422106.15..430560.68 rows=3381814 width=20) (actual time=2503.078..2937.130 rows=3381814 loops=1)
   Sort Key: datavalue
   Sort Method: quicksort  Memory: 362509kB
   ->  Seq Scan on stoqs_measuredparameter  (cost=0.00..55359.14 rows=3381814 width=20) (actual time=0.016..335.745 rows=3381814 loops=1)
 Total runtime: 3094.601 ms
(5 rows)

I tried changing random_page_cost to from 4 to 1 and saw no change.

I'm wondering now what changes might get this query to run in less than one second.  If all the data is in memory, then will faster CPU and memory be the things that help?

We have an alternate (a bit more conventional) server configuration that we are considering:

HP ProLiant DL360p Gen 8
Dual Intel Xeon 3.3GHz 4-core E5-2643 CPUs
128GB PC3-12800 RAM
16x146GB 15K SAS hard drives
HP Smart Array P822/2GB FBWC controller + P420i w/ 2GB FBWC
+ the usual accessories (optical drive, rail kit, dual power supplies)


All suggestions welcomed!

-Mike

--
Mike McCann
Software Engineer
Monterey Bay Aquarium Research Institute
7700 Sandholdt Road
Moss Landing, CA 95039-9644
Voice: 831.775.1769  Fax: 831.775.1736 http://www.mbari.org


В списке pgsql-performance по дате сообщения:

От: Scott Marlowe
Дата:
Сообщение: Re: Hardware suggestions for maximum read performance
От: Sameer Thakur
Дата:
Сообщение: Predicate information in EXPLAIN Command