Re: Need help in setting optimal configuration for a huge

Поиск
Список
Период
Сортировка
От Craig A. James
Тема Re: Need help in setting optimal configuration for a huge
Дата
Msg-id 435BBAF0.8080500@modgraph-usa.com
обсуждение исходный текст
Ответ на Need help in setting optimal configuration for a huge database.  (Kishore B <kishorebh@gmail.com>)
Ответы Re: Need help in setting optimal configuration for a huge  ("Steinar H. Gunderson" <sgunderson@bigfoot.com>)
Re: Need help in setting optimal configuration for a huge  (Tomasz Rybak <bogomips@post.pl>)
Re: Need help in setting optimal configuration for a huge  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Need help in setting optimal configuration for a huge database.  (Kishore B <kishorebh@gmail.com>)
Список pgsql-performance
> We are facing a* critical situation because of the performance of the
> **database** .* Even a basic query like select count(*) from
> bigger_table is taking about 4 minutes to return.

Several other replies have mentioned that COUNT() requires a full table scan, but this point can't be emphasized
enough:Don't do it!  People who are migrating from other environments (Oracle or MySQL) are used to COUNT(), MIN() and
MAX()returning almost instantaneously, certainly on indexed columns.  But for reasons that have something to do with
transactions,these operations are unbelievably slow in PostgreSQL.   

Here are the alternatives that I've learned.

COUNT() -- There is no good substitute.  What I do is create a new column, "ROW_NUM" with an auto-incrementing
sequence. Every time I insert a row, it gets a new value.  Unfortunately, this doesn't work if you ever delete a row.
Thealternative is a more complex pair of triggers, one for insert and one for delete, that maintains the count in a
separateone-row table.  It's a nuisance, but it's a lot faster than doing a full table scan for every COUNT(). 

MIN() and MAX() -- These are surprisingly slow, because they seem to do a full table scan EVEN ON AN INDEXED COLUMN!  I
don'tunderstand why, but happily there is an effective substitute: 

   select mycolumn from mytable order by mycolumn limit 1;  -- same as MIN()

   select mycolumn from mytable order by mycolumn desc limit 1;  -- same as MAX()

For a large table, MIN or MAX can take 5-10 minutes, where the above "select..." replacements can return in one
millisecond.

You should carefully examine your entire application for COUNT, MIN, and MAX, and get rid of them EVERYWHERE.  This may
bethe entire source of your problem.  It was in my case.   

This is, in my humble opinion, the only serious flaw in PostgreSQL.  I've been totally happy with it in every other
way,and once I understood these shortcomings, my application is runs faster than ever on PostgreSQL. 

Craig

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Steinar H. Gunderson"
Дата:
Сообщение: Re: Materializing a sequential scan
Следующее
От: "Steinar H. Gunderson"
Дата:
Сообщение: Re: Need help in setting optimal configuration for a huge