Re: Need to run CLUSTER to keep performance

From: Tomáš Vondra
Subject: Re: Need to run CLUSTER to keep performance
Msg-id: 4732F0F6.2020509@fuzzy.cz
In reply to: Need to run CLUSTER to keep performance  (Rafael Martinez <r.m.guerrero@usit.uio.no>)
List: pgsql-performance
Performance problems with heavily modified tables (lots of UPDATEs or
DELETEs) are usually caused by insufficient vacuuming. VACUUM can run in
two main modes (plain or FULL), and the latter works in a much more
aggressive way (exclusive locking, etc.). Try to run VACUUM FULL VERBOSE
on the table and see if it helps.
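For example (just a sketch, using the table name from your post):

   VACUUM FULL VERBOSE hosts;

The VERBOSE output reports how many row versions were removed and how
many pages remain, so you can see how much dead space plain VACUUM has
been leaving behind.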

The usual long-term fix is to configure the autovacuum daemon properly -
with the default settings it may keep up with tables that are not
modified heavily, but not with the heavily modified ones. Do you have
the autovacuum daemon enabled? What are its settings? Try to make it a
bit more aggressive (this can be done at the table level).
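If I remember the 8.1 catalogs correctly, the per-table settings go into
the pg_autovacuum system catalog. A rough sketch - the threshold values
below are only examples, tune them to your workload:

   -- example values only; -1 means "use the server-wide default"
   INSERT INTO pg_autovacuum
       (vacrelid, enabled,
        vac_base_thresh, vac_scale_factor,
        anl_base_thresh, anl_scale_factor,
        vac_cost_delay, vac_cost_limit)
   VALUES
       ('public.hosts'::regclass, true,
        200, 0.05,
        100, 0.02,
        -1, -1);

With a scale factor of 0.05 the table gets vacuumed after roughly
200 + 5% of reltuples rows have been updated or deleted, which is much
more often than with the default settings.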

The stats from pg_stat_all_tables are nice, but I guess the stats that
matter are located in the pg_class catalog, the most interesting being
the reltuples and relpages columns - run

   SELECT relname, relpages, reltuples FROM pg_class WHERE relname LIKE 'hosts';

and observe the number of pages before and after the VACUUM FULL (or
CLUSTER). I guess the number of pages grows quite fast and the
autovacuum daemon is not able to reclaim the space - which is probably
why scanning 12 MB of data takes 2 seconds (way too much): the table is
actually much bigger than it needs to be, as it contains a lot of dead
rows.
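A quick way to get a feeling for the bloat is to compare relpages with
the number of pages the live rows would actually need - a rough estimate
only, assuming the ~314-byte row width reported by your EXPLAIN and the
default 8 kB block size:

   SELECT relname, relpages, reltuples,
          ceil(reltuples * 314 / 8192.0) AS pages_needed_estimate
   FROM pg_class
   WHERE relname = 'hosts';

If relpages is several times larger than the estimate, the table is
badly bloated and plain VACUUM / autovacuum is not keeping up.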

Tomas


> Hello
>
> This is a question about something we have seen sometimes in the last
> months. It happens with tables that get a large number of
> updates/selects compared with the number of inserts/deletes. These
> tables are small, both in size and in number of rows.
>
> The 'problem' is that performance decreases during the day and the only
> thing that helps is to run CLUSTER on the table with problems. VACUUM
> ANALYZE does not help.
>
> Some information that can help to find out why this happens:
>
> - PostgreSQL version: 8.1.9
>
> ------------------------------------------------------------------------------
> scanorama=# SELECT pg_size_pretty(pg_relation_size('hosts'));
>
>  pg_size_pretty
> ----------------
>  12 MB
> ------------------------------------------------------------------------------
> scanorama=# SELECT count(*) FROM hosts ;
>
>  count
> -------
>  16402
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..2771.56 rows=66756 width=314) (actual
> time=0.008..2013.415 rows=16402 loops=1)
>  Total runtime: 2048.486 ms
> ------------------------------------------------------------------------------
> scanorama=# VACUUM ANALYZE ;
> VACUUM
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..2718.57 rows=61357 width=314) (actual
> time=0.008..1676.283 rows=16402 loops=1)
>  Total runtime: 1700.826 ms
> ------------------------------------------------------------------------------
> scanorama=# CLUSTER hosts_pkey ON hosts ;
> CLUSTER
> ------------------------------------------------------------------------------
> scanorama=# EXPLAIN ANALYZE SELECT * FROM hosts;
>
>  Seq Scan on hosts  (cost=0.00..680.02 rows=16402 width=314) (actual
> time=0.008..31.205 rows=16402 loops=1)
>  Total runtime: 53.635 ms
> ------------------------------------------------------------------------------
> scanorama=# SELECT * from pg_stat_all_tables WHERE relname LIKE 'hosts';
>  relid         | 105805
>  schemaname    | public
>  relname       | hosts
>  seq_scan      | 1996430
>  seq_tup_read  | 32360280252
>  idx_scan      | 2736391
>  idx_tup_fetch | 3301856
>  n_tup_ins     | 948
>  n_tup_upd     | 1403325
>  n_tup_del     | 737
>
> The information from pg_stat_all_tables is from the last 20 days.
> ------------------------------------------------------------------------------
> INFO:  analyzing "public.hosts"
> INFO:  "hosts": scanned 2536 of 2536 pages, containing 16410 live rows
> and 57042 dead rows; 16410 rows in sample, 16410 estimated total rows
> INFO:  free space map contains 191299 pages in 786 relations
> DETAIL:  A total of 174560 page slots are in use (including overhead).
> 174560 page slots are required to track all free space.
> Current limits are:  2000000 page slots, 4000 relations, using 12131 KB.
> ------------------------------------------------------------------------------
>
> The tables with this 'problem' are not big, so CLUSTER finishes very
> fast and does not noticeably impact access despite the locking. But we
> wonder why this happens.
>
> Do you need more information?
>
> Thanks in advance.
> regards

