Re: [ SOLVED ] select count(*) very slow on an already

Поиск
Список
Период
Сортировка
От Bill Moran
Тема Re: [ SOLVED ] select count(*) very slow on an already
Дата
Msg-id 407EECEA.7010500@potentialtech.com
обсуждение исходный текст
Ответ на Re: [ SOLVED ] select count(*) very slow on an already  ("Shea,Dan [CIS]" <Dan.Shea@ec.gc.ca>)
Список pgsql-performance
Shea,Dan [CIS] wrote:
> Bill, if you had alot of updates and deletions and wanted to optimize your
> table, can you just issue the cluster command.
> Will the cluster command rewrite the table without the obsolete data that a
> vacuum flags or do you need to issue a vacuum first?

 From the reference docs:

"During the cluster operation, a temporary copy of the table is created that
contains the table data in the index order. Temporary copies of each index
on the table are created as well. Therefore, you need free space on disk at
least equal to the sum of the table size and the index sizes.

"CLUSTER preserves GRANT, inheritance, index, foreign key, and other ancillary
information about the table.

"Because the optimizer records statistics about the ordering of tables, it is
advisable to run ANALYZE on the newly clustered table. Otherwise, the optimizer
may make poor choices of query plans."

The primary reason CLUSTER exists is to allow you to physically reorder a table
based on a key.  This should provide a performance improvement if data with
the same key is accessed all at once.  (i.e. if you do "SELECT * FROM table WHERE
key=5" and it returns 100 rows, those 100 rows are guaranteed to be all on the
same part of the disk after CLUSTER, thus a performance improvement should result.)

Updates and inserts will add data in the next available space in a table with no
regard for any keys, and _may_ require running all over the disk to retrieve
the data in the previous example query.

I doubt if CLUSTER is an end-all optimization tool.  The specific reason I
suggested it was because the original poster was asking for an easier way to
drop/recreate a table (as prior experimentation had shown this to improve
performance)  I can't think of anything easier than "CLUSTER <tablename> ON
<keyname>"

Since CLUSTER recreates the table, it implicitly removes the dead tuples.
However, it's going to be a LOT slower than vacuum, so if dead tuples are the
main problem, vacuum is still the way to go.

--
Bill Moran
Potential Technologies
http://www.potentialtech.com


В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Toooo many context switches (maybe SLES8?)
Следующее
От: Josh Berkus
Дата:
Сообщение: Re: Toooo many context switches (maybe SLES8?)