Re: PostgreSQL performance problem -> tuning
От | Richard Huxton |
---|---|
Тема | Re: PostgreSQL performance problem -> tuning |
Дата | |
Msg-id | 200308061030.36610.dev@archonet.com обсуждение исходный текст |
Ответ на | PostgreSQL performance problem -> tuning (Yaroslav Mazurak <yamazurak@Lviv.Bank.Gov.UA>) |
Ответы |
Re: PostgreSQL performance problem -> tuning
|
Список | pgsql-performance |
On Wednesday 06 August 2003 08:34, Yaroslav Mazurak wrote: > Hi All! > > > I have installed PostgreSQL 7.3.2 on FreeBSD 4.7, running on PC with > CPU Pentium II 400MHz and 384Mb RAM. Version 7.3.4 is just out - probably worth upgrading as soon as it's convenient. > Problem is that SQL statement (see below) is running too long. With > current WHERE clause 'SUBSTR(2, 2) IN ('NL', 'NM') return 25 records. > With 1 record, SELECT time is about 50 minutes and takes approx. 120Mb > RAM. With 25 records SELECT takes about 600Mb of memory and ends after > about 10 hours with error: "Memory exhausted in AllocSetAlloc(32)". [snip] > > Current postgresql.conf settings (some) are: > > === Cut === > max_connections = 8 > > shared_buffers = 8192 > max_fsm_relations = 256 > max_fsm_pages = 65536 > max_locks_per_transaction = 16 > wal_buffers = 256 > > sort_mem = 131072 This sort_mem value is *very* large - that's 131MB for *each sort* that gets done. I'd suggest trying something in the range 1,000-10,000. What's probably happening with the error above is that PG is allocating ridiculous amounts of memory, the machines going into swap and everything eventually grinds to a halt. > vacuum_mem = 16384 > > checkpoint_segments = 4 > checkpoint_timeout = 300 > commit_delay = 32000 > commit_siblings = 4 > fsync = false I'd turn fsync back on - unless you don't mind losing your data after a crash. > enable_seqscan = false Don't tinker with these in a live system, they're only really for testing/debugging. > effective_cache_size = 65536 So you typically get about 256MB cache usage in top/free? > === Cut === > > SELECT statement is: > > SELECT showcalc('B00204', dd, r020, t071) AS s04 > FROM v_file02wide > WHERE a011 = 3 > AND inrepdate(data) > AND SUBSTR(ncks, 2, 2) IN ('NL', 'NM') > AND r030 = 980; Hmm - mostly views and function calls, OK - I'll read on. > (cost=174200202474.99..174200202474.99 rows=1 width=143) -> Hash Join ^^^^^^^ This is a BIG cost estimate and you've got lots more like them. I'm guessing it's because of the sort_mem / enable_seqscan settings you have. The numbers don't make sense to me - it sounds like you've pushed the cost estimator into a very strange corner. > Function showcalc definition is: > > CREATE OR REPLACE FUNCTION showcalc(VARCHAR(10), VARCHAR(2), VARCHAR(4), > NUMERIC(16)) RETURNS NUMERIC(16) > LANGUAGE SQL AS ' > -- Parameters: code, dd, r020, t071 > SELECT COALESCE( > (SELECT sc.koef * $4 > FROM showing AS s NATURAL JOIN showcomp AS sc > WHERE s.kod LIKE $1 > AND NOT SUBSTR(acc_mask, 1, 1) LIKE ''['' > AND SUBSTR(acc_mask, 1, 4) LIKE $3 > AND SUBSTR(acc_mask, 5, 1) LIKE SUBSTR($2, 1, 1)), Obviously, you could use = for these 3 rather than LIKE ^^^ Same below too. > (SELECT SUM(sc.koef * COALESCE(showcalc(SUBSTR(acc_mask, 2, > LENGTH(acc_mask) - 2), $2, $3, $4), 0)) > FROM showing AS s NATURAL JOIN showcomp AS sc > WHERE s.kod LIKE $1 > AND SUBSTR(acc_mask, 1, 1) LIKE ''[''), > 0) AS showing; > '; > > View v_file02wide is: > > CREATE VIEW v_file02wide AS > SELECT id_a011 AS a011, data, obl.ko, obl.nazva AS oblast, b030, > banx.box AS ncks, banx.nazva AS bank, > epr.dd, r020, r030, a3, r030.nazva AS valuta, k041, > -- Sum equivalent in national currency > t071 * get_kurs(id_r030, data) AS t070, > t071 > FROM v_file02 AS vf02 > JOIN kod_obl AS obl USING(id_obl) > JOIN (dov_bank NATURAL JOIN dov_tvbv) AS banx > ON banx.id_v = vf02.id_v > AND data BETWEEN COALESCE(banx.dataa, data) > AND COALESCE(banx.datab, data) > JOIN ek_pok_r AS epr USING(id_dd) > JOIN kl_r020 USING(id_r020) > JOIN kl_r030 AS r030 USING(id_r030) > JOIN kl_k041 USING(id_k041); You might want to rewrite the view so it doesn't use explicit JOIN statements, i.e FROM a,b WHERE a.id=b.ref rather than FROM a JOIN b ON id=ref At the moment, this will force PG into making the joins in the order you write them (I think this is changed in v7.4) > Function inrepdate is: > > CREATE OR REPLACE FUNCTION inrepdate(DATE) RETURNS BOOL > LANGUAGE SQL AS ' > -- Returns true if given date is in repdate > SELECT (SELECT COUNT(*) FROM repdate > WHERE $1 BETWEEN COALESCE(data1, CURRENT_DATE) > AND COALESCE(data2, CURRENT_DATE)) > > > 0; You can probably replace this with: SELECT true FROM repdate WHERE $1 ... You'll need to look at where it's used though. [snip table sizes] > Table has indexes almost for all selected fields. That's not going to help you for the SUBSTR(...) stuff, although you could use functional indexes (see manuals/list archives for details). First thing is to get those two configuration settings somewhere sane, then we can tune properly. You might like the document at: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html -- Richard Huxton Archonet Ltd
В списке pgsql-performance по дате отправления: