Discussion: Got that new server, now it's time for config!
Here we go again! Based on recommendations made here, I got my client to migrate off of our Windows 2003 Server x64 box to a new Linux box.

# CentOS 5.4
# Linux mdx_octo 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
# pgsql 8.3.10, 8 CPUs, 48GB RAM
# RAID 10, 4 disks

Below are the config values of this production server (those not listed are stubbed out). Sadly, in an attempt to improve the server's performance, someone wiped out all of the changes I had made to date, along with comments indicating previous values, reasons for the changes, etc.

This is a data warehouse production server, used for ETL: a 500 GB database, approx. 8,000 tables and growing, although the vast majority of them are the original import resource tables and are rarely accessed. The actual core data is about 200 tables, consisting of millions of rows. Data importing and content management is done via a 15,000-line TCL import script and application base (as this is ETL with fuzzy logic, not just COPY ... FROM ...).

So, we have the hardware, we have the O/S - but I think our config leaves much to be desired. Typically, our planner makes bad decisions, picking seq scan over index scan, where index scan has a better result. Can anyone see any obvious faults?
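(For anyone wanting to reproduce the seq-scan symptom: one quick way to see whether it is a costing problem rather than a missing index is to compare plans with enable_seqscan toggled for a single session. The table and column names below are made up for illustration, not from our schema:)

```sql
-- Plan the planner actually chooses
EXPLAIN ANALYZE SELECT * FROM core_facts WHERE entity_id = 12345;

-- Temporarily discourage seq scans (session-local, for diagnosis only)
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM core_facts WHERE entity_id = 12345;
RESET enable_seqscan;
```

If the second plan uses the index and is clearly faster, the problem is cost estimation (e.g. effective_cache_size, statistics) rather than the index itself.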
Carlo

autovacuum = on
autovacuum_analyze_scale_factor = 0.05
autovacuum_analyze_threshold = 1000
autovacuum_naptime = 1min
autovacuum_vacuum_cost_delay = 50
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_threshold = 1000
bgwriter_lru_maxpages = 100
checkpoint_segments = 128
checkpoint_warning = 290s
client_min_messages = debug1
datestyle = 'iso, mdy'
default_statistics_target = 250
default_text_search_config = 'pg_catalog.english'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
listen_addresses = '*'
log_destination = 'stderr'
log_error_verbosity = verbose
log_line_prefix = '%t '
log_min_error_statement = debug1
log_min_messages = debug1
logging_collector = on
maintenance_work_mem = 256MB
max_connections = 100
max_fsm_relations = 1000
max_locks_per_transaction = 128
port = 5432
shared_buffers = 4096MB
shared_preload_libraries = '$libdir/plugins/plugin_debugger.so'
track_counts = on
vacuum_cost_delay = 5
wal_buffers = 4MB
wal_sync_method = open_sync
work_mem = 64MB
On 3/22/10 4:36 PM, Carlo Stonebanks wrote:
> Here we go again!
>
> Can anyone see any obvious faults?
>
> Carlo

> maintenance_work_mem = 256MB

I'm not sure how large your individual tables are, but you might want to bump this value up to get faster vacuums.

> max_fsm_relations = 1000

I think this will definitely need to be increased.

> work_mem = 64MB

Most data warehousing loads I can think of will need more work_mem, but this depends on how large the data sets you are planning to sort are.
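(As a concrete sketch of those three suggestions, something like the following postgresql.conf fragment; the exact numbers are guesses that would need tuning against the actual table sizes, sort sizes, and connection count:)

```
maintenance_work_mem = 1GB    # faster VACUUM / index builds on the large core tables
max_fsm_relations = 20000     # must cover all ~8000 tables plus their indexes
work_mem = 256MB              # per sort/hash per backend -- with max_connections = 100,
                              # watch that concurrent sorts can't exhaust 48GB RAM
```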
Carlo Stonebanks wrote:
> So, we have the hardware, we have the O/S - but I think our config
> leaves much to be desired. Typically, our planner makes bad decisions,
> picking seq scan over index scan, where index scan has a better result.

You're not setting effective_cache_size, so I wouldn't expect it to ever choose an index scan given the size of your data set. The planner thinks that anything bigger than 128MB isn't likely to fit in RAM by default, which favors sequential scans. That parameter should probably be >24GB on your server, so it's off by more than two orders of magnitude.

> wal_sync_method = open_sync

This is a scary setting to be playing with on Linux when using ext3 filesystems due to general kernel bugginess in this area. See http://archives.postgresql.org/pgsql-hackers/2007-10/msg01310.php for an example. I wouldn't change this from the default in your position if using that filesystem.

I'd drastically increase effective_cache_size, put wal_sync_method back to the default, and then see how things go for a bit before tweaking anything else. Nothing else jumped out as bad in your configuration besides the extremely high logging levels; I haven't looked at it that carefully yet, though.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us
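(For concreteness, those two changes would look roughly like this in postgresql.conf; 36GB is just a sketch of "most of the 48GB RAM" and the right value depends on what else runs on the box:)

```
effective_cache_size = 36GB   # planner hint only, allocates nothing; >24GB per above
#wal_sync_method = open_sync  # commented out so the platform default is used again
```

Note that effective_cache_size only informs the planner's cost estimates, so it can be set generously without risking memory pressure.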
On Tue, Mar 23, 2010 at 12:12 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> Carlo Stonebanks wrote:
>> So, we have the hardware, we have the O/S - but I think our config leaves much to be desired. Typically, our planner makes bad decisions, picking seq scan over index scan, where index scan has a better result.
>
> You're not setting effective_cache_size, so I wouldn't expect it to ever choose an index scan given the size of your data set. The planner thinks that anything bigger than 128MB isn't likely to fit in RAM by default, which favors sequential scans. That parameter should probably be >24GB on your server, so it's off by more than two orders of magnitude.
+1
I'm curious why you've set:
log_min_error_statement = debug1
log_min_messages = debug1
client_min_messages = debug1
Although not directly addressing the problem of index scans not being used, this is going to cause a lot of message verbosity, possibly (depending on your rate) enough to clobber the disks more than you need to.
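(Dialing those back toward their defaults would look something like the fragment below; values shown are my recollection of the 8.3 defaults, so double-check against the documentation:)

```
client_min_messages = notice       # default
log_min_messages = notice          # default; 'warning' quiets the log further
log_min_error_statement = error    # default
log_error_verbosity = default      # 'verbose' adds source file/line to every entry
```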
-Scott M
>> wal_sync_method = open_sync
>
> This is a scary setting to be playing with on Linux when using ext3 filesystems due to general kernel bugginess in this area. See http://archives.postgresql.org/pgsql-hackers/2007-10/msg01310.php for an example. I wouldn't change this from the default in your position if using that filesystem.
>
> I'd drastically increase effective_cache_size, put wal_sync_method back to the default, and then see how things go for a bit before tweaking anything else. Nothing else jumped out as bad in your configuration besides the extremely high logging levels, haven't looked at it that carefully yet though.
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)