Обсуждение: [GENERAL] Table not cleaning up drom dead tuples

Поиск
Список
Период
Сортировка

[GENERAL] Table not cleaning up drom dead tuples

От
Антон Тарабрин
Дата:
Good day. It seems that we have some strange case of VACUUM malfunction and table bloating.

PostgreSQL 9.5.3
OS #uname -a
FreeBSD db-host 10.2-RELEASE-p18 FreeBSD 10.2-RELEASE-p18 #0: Sat May 28 08:53:43 UTC 2016 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

General info about our database:
https://gist.github.com/aCLr/dec78ab031749e517550ac11f8233f70

Information about problematic tables:
https://gist.github.com/tarabanton/edf7f8dc26eb7ec37a9cfa3424493871

As you can see, there 2 to 10 times dead tuples compared to actual row count.
We've tried VACUUM FULL and CLUSTER without any success.
There is no long term locks, idle in transaction requests or prepared transactions.

We are temporarily fixing this like that:
BEGIN;
CREATE TABLE _orders_temp (LIKE orders_y2017_m3 INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING COMMENTS);
INSERT INTO _orders_temp select * from orders_y2017_m3;
ALTER TABLE orders_y2017_m3 RENAME TO __orders_y2017_m3_to_drop;
ALTER TABLE __orders_y2017_m3_to_drop NO INHERIT orders;
ALTER TABLE _orders_temp RENAME TO orders_y2017_m3;
ALTER TABLE orders_y2017_m3 INHERIT orders;
COMMIT;

but bloat returns again and again
--
___________________________
С наилучшими пожеланиями,
Антон Тарабрин

With best regards,
Anton Tарабрин

Re: [GENERAL] Table not cleaning up drom dead tuples

От
Glyn Astill
Дата:

> From: Антон Тарабрин <tarabanton@gmail.com>
> To: pgsql-general@postgresql.org
> Sent: Tuesday, 14 March 2017, 12:09
> Subject: [GENERAL] Table not cleaning up drom dead tuples

> General info about  our database:

> https://gist.github.com/aCLr/dec78ab031749e517550ac11f8233f70
>
> Information about problematic tables:
> https://gist.github.com/tarabanton/edf7f8dc26eb7ec37a9cfa3424493871
>
> As you can see, there 2 to 10 times dead tuples compared to actual row count.
> We've tried VACUUM FULL and CLUSTER without any success.
> There is no long term locks, idle in transaction requests or prepared transactions.
>
> We are temporarily fixing this like that:
> BEGIN;
> CREATE TABLE _orders_temp (LIKE orders_y2017_m3 INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING
COMMENTS);
> INSERT INTO _orders_temp select * from orders_y2017_m3;
> ALTER TABLE orders_y2017_m3 RENAME TO __orders_y2017_m3_to_drop;
> ALTER TABLE __orders_y2017_m3_to_drop NO INHERIT orders;
> ALTER TABLE _orders_temp RENAME TO orders_y2017_m3;
> ALTER TABLE orders_y2017_m3 INHERIT orders;
> COMMIT;
>
> but bloat returns again and again

Some bloat is to be expected unless you've totally static data due to the postgres MVCC model.

Are you saying vacuum full and cluster aren't removing the bloat?  Sounds unlikely to me.


Issues only arise when you can't manage it; from what you've posted we can see autovacuum is running, but perhaps it's
notkeeping up with your workload, or your update patterns make it difficult to keep bloat down; we can see some
rollbackswhich I'm sure are part of your problem. 

You could try updating to the latest minor version of postgres as there are a few fixes to autovacuum in versions after
9.5.3,but I think analyzing your update patterns and/or tuning autovacuum will be your starting point. 


Re: [GENERAL] Table not cleaning up drom dead tuples

От
Антон Тарабрин
Дата:
Yep. VACUUM FULL not helping us on OLD table, that are not getting updated and not used in any requests. Bloat is still there 
This is production system, so now we are investigating why it's happening.

> Information about problematic tables:
At the link above is some results of VACUUM (ANALYZE, VERBOSE) from source (OLD) table.

вт, 14 мар. 2017 г. в 16:53, Glyn Astill <glynastill@yahoo.co.uk>:
> We are temporarily fixing this like that:
> BEGIN;
> CREATE TABLE _orders_temp (LIKE orders_y2017_m3 INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING COMMENTS);
> INSERT INTO _orders_temp select * from orders_y2017_m3;
> ALTER TABLE orders_y2017_m3 RENAME TO __orders_y2017_m3_to_drop;
> ALTER TABLE __orders_y2017_m3_to_drop NO INHERIT orders;
> ALTER TABLE _orders_temp RENAME TO orders_y2017_m3;
> ALTER TABLE orders_y2017_m3 INHERIT orders;
> COMMIT;
>
> but bloat returns again and again

Some bloat is to be expected unless you've totally static data due to the postgres MVCC model.

Are you saying vacuum full and cluster aren't removing the bloat?  Sounds unlikely to me.


Issues only arise when you can't manage it; from what you've posted we can see autovacuum is running, but perhaps it's not keeping up with your workload, or your update patterns make it difficult to keep bloat down; we can see some rollbacks which I'm sure are part of your problem.

You could try updating to the latest minor version of postgres as there are a few fixes to autovacuum in versions after 9.5.3, but I think analyzing your update patterns and/or tuning autovacuum will be your starting point.
--
___________________________
С наилучшими пожеланиями,
Антон Тарабрин

With best regards,
Anton Tarabrin

Re: [GENERAL] Table not cleaning up drom dead tuples

От
Glyn Astill
Дата:
> ________________________________

> From: Антон Тарабрин <tarabanton@gmail.com>
> To: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
> Sent: Tuesday, 14 March 2017, 14:05
> Subject: Re: [GENERAL] Table not cleaning up drom dead tuples
>
>
> Yep. VACUUM FULL not helping us on OLD table, that are not getting updated and not used in any requests. Bloat is
stillthere  
> This is production system, so now we are investigating why it's happening.
>
>
> > Information about problematic tables:
> > https://gist.github.com/tarabanton/edf7f8dc26eb7ec37a9cfa3424493871
> At the link above is some results of VACUUM (ANALYZE, VERBOSE) from source (OLD) table.
>

So what's the output of vacuum full? Or are you saying you can't sustain the exclusive lock vacuum full would require?

Plain vacuum can only reclaim free space at the end of the table, fragmented dead rows can only be marked available for
reuse.

Perhaps give us some idea of activity on your database/tables:


select * from pg_stat_user_tables where relname in ('__orders_y2017_m3_to_drop', '__orders_y2017_m2_to_drop');
select * from pg_stat_database;


Re: [GENERAL] Table not cleaning up drom dead tuples

От
Антон Тарабрин
Дата:
This tables is original ones, it doesn't have any activity now. We copied data to NEW tables and trying to solve root of the problem

<THIS_DB> - target database where broken tables are located

----- VACUUM FULL VERBOSE
<THIS_DB>=# VACUUM (FULL, VERBOSE) __orders_y2017_m2_to_drop;
INFO:  vacuuming "public.__orders_y2017_m2_to_drop"
INFO:  "__orders_y2017_m2_to_drop": found 0 removable, 3179076 nonremovable row versions in 551423 pages
DETAIL:  1778770 dead row versions cannot be removed yet.
CPU 30.92s/102.66u sec elapsed 184.69 sec.

<THIS_DB>=# VACUUM (FULL, VERBOSE) __orders_y2017_m3_to_drop;
INFO:  vacuuming "public.__orders_y2017_m3_to_drop"
INFO:  "__orders_y2017_m3_to_drop": found 0 removable, 9103104 nonremovable row versions in 1520371 pages
DETAIL:  8396820 dead row versions cannot be removed yet.
CPU 65.00s/284.03u sec elapsed 399.66 sec.


----- DB INFO
<THIS_DB>=# select * from pg_stat_user_tables where relname in ('__orders_y2017_m3_to_drop', '__orders_y2017_m2_to_drop');
   relid   | schemaname |          relname          | seq_scan | seq_tup_read | idx_scan  | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze |          last_vacuum          |        last_autovacuum        |         last_analyze          |       last_autoanalyze        | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count 
-----------+------------+---------------------------+----------+--------------+-----------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+--------------+------------------+---------------+-------------------
 179718008 | public     | __orders_y2017_m2_to_drop |     5615 |   7934041177 | 328044580 |    7979850698 |         0 |   3065776 |         0 |         25685 |    3082885 |    1759481 |                   0 | 2017-03-14 11:57:40.388527+00 | 2017-03-14 07:37:50.907757+00 | 2017-03-14 11:57:42.656628+00 | 2017-03-13 16:15:55.60846+00  |            5 |               96 |             4 |                15
 207347508 | public     | __orders_y2017_m3_to_drop |     1128 |    794959804 | 129799001 |    1292952066 |    706089 |   8377499 |         0 |        118035 |    8937540 |    8406385 |                   0 | 2017-03-14 11:57:58.026816+00 | 2017-03-14 10:09:08.597031+00 | 2017-03-14 11:57:59.117331+00 | 2017-03-14 04:11:11.370923+00 |            4 |               11 |             4 |                 7
(2 rows)

<THIS_DB>=# select * from pg_stat_database;
   datid   |      datname       | numbackends | xact_commit | xact_rollback | blks_read |   blks_hit   | tup_returned  | tup_fetched  | tup_inserted | tup_updated | tup_deleted | conflicts | temp_files | temp_bytes | deadlocks | blk_read_time | blk_write_time |          stats_reset          
-----------+--------------------+-------------+-------------+---------------+-----------+--------------+---------------+--------------+--------------+-------------+-------------+-----------+------------+------------+-----------+---------------+----------------+-------------------------------
   4906146 | <THIS_DB>            |          62 |    24781721 |       5888121 | 492125811 | 348274702788 | 1127846911908 | 250049066062 |    413981238 |   188610068 |      397036 |         0 |         53 | 7507001344 |         1 |             0 |              0 | 2017-03-06 02:33:26.466458+00
 113713583 | sentry             |           0 |      350030 |           342 |     11574 |     33444698 |      22519113 |     10577975 |         2438 |       27672 |           2 |         0 |          0 |          0 |         0 |             0 |              0 | 2017-03-06 02:33:24.156858+00
 148539615 | test               |           0 |           0 |             0 |         0 |            0 |             0 |            0 |            0 |           0 |           0 |         0 |          0 |          0 |         0 |             0 |              0 | 
 161510793 | ... |           0 |           0 |             0 |         0 |            0 |             0 |            0 |            0 |           0 |           0 |         0 |          0 |          0 |         0 |             0 |              0 | 
(8 rows)


вт, 14 мар. 2017 г. в 17:37, Glyn Astill <glynastill@yahoo.co.uk>:
So what's the output of vacuum full? Or are you saying you can't sustain the exclusive lock vacuum full would require?

Plain vacuum can only reclaim free space at the end of the table, fragmented dead rows can only be marked available for reuse.

Perhaps give us some idea of activity on your database/tables:


select * from pg_stat_user_tables where relname in ('__orders_y2017_m3_to_drop', '__orders_y2017_m2_to_drop');
select * from pg_stat_database;
--
___________________________
С наилучшими пожеланиями,
Антон Тарабрин

With best regards,
Anton Tarabrin

Re: [GENERAL] Table not cleaning up drom dead tuples

От
Glyn Astill
Дата:

> This tables is original ones, it doesn't have any activity now. We copied data to NEW tables and trying to solve root
ofthe problem 
>
> <THIS_DB> - target database where broken tables are located
>
>
> ----- VACUUM FULL VERBOSE
> <THIS_DB>=# VACUUM (FULL, VERBOSE) __orders_y2017_m2_to_drop;
> INFO:  vacuuming "public.__orders_y2017_m2_to_drop"
> INFO:  "__orders_y2017_m2_to_drop": found 0 removable, 3179076 nonremovable row versions in 551423 pages
> DETAIL:  1778770 dead row versions cannot be removed yet.
> CPU 30.92s/102.66u sec elapsed 184.69 sec.
>
> <THIS_DB>=# VACUUM (FULL, VERBOSE) __orders_y2017_m3_to_drop;
> INFO:  vacuuming "public.__orders_y2017_m3_to_drop"
> INFO:  "__orders_y2017_m3_to_drop": found 0 removable, 9103104 nonremovable row versions in 1520371 pages
>
> DETAIL:  8396820 dead row versions cannot be removed yet.
> CPU 65.00s/284.03u sec elapsed 399.66 sec.
>
>
> ----- DB INFO
> <THIS_DB>=# select * from pg_stat_user_tables where relname in ('__orders_y2017_m3_to_drop',
'__orders_y2017_m2_to_drop');
>    relid   | schemaname |          relname          | seq_scan | seq_tup_read | idx_scan  | idx_tup_fetch | n_tup_ins
|n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze |          last_vacuum
|       last_autovacuum        |         last_analyze          |       last_autoanalyze        | vacuum_count |
autovacuum_count| analyze_count | autoanalyze_count  
>
-----------+------------+---------------------------+----------+--------------+-----------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+--------------+------------------+---------------+-------------------
>  179718008 | public     | __orders_y2017_m2_to_drop |     5615 |   7934041177 | 328044580 |    7979850698 |         0
|  3065776 |         0 |         25685 |    3082885 |    1759481 |                   0 | 2017-03-14 11:57:40.388527+00
|2017-03-14 07:37:50.907757+00 | 2017-03-14 11:57:42.656628+00 | 2017-03-13 16:15:55.60846+00  |            5 |
     96 |             4 |                15 
>  207347508 | public     | __orders_y2017_m3_to_drop |     1128 |    794959804 | 129799001 |    1292952066 |    706089
|  8377499 |         0 |        118035 |    8937540 |    8406385 |                   0 | 2017-03-14 11:57:58.026816+00
|2017-03-14 10:09:08.597031+00 | 2017-03-14 11:57:59.117331+00 | 2017-03-14 04:11:11.370923+00 |            4 |
     11 |             4 |                 7 
> (2 rows)
>
> <THIS_DB>=# select * from pg_stat_database;
>    datid   |      datname       | numbackends | xact_commit | xact_rollback | blks_read |   blks_hit   | tup_returned
| tup_fetched  | tup_inserted | tup_updated | tup_deleted | conflicts | temp_files | temp_bytes | deadlocks |
blk_read_time| blk_write_time |          stats_reset  
>
-----------+--------------------+-------------+-------------+---------------+-----------+--------------+---------------+--------------+--------------+-------------+-------------+-----------+------------+------------+-----------+---------------+----------------+-------------------------------
>    4906146 | <THIS_DB>            |          62 |    24781721 |       5888121 | 492125811 | 348274702788 |
1127846911908| 250049066062 |    413981238 |   188610068 |      397036 |         0 |         53 | 7507001344 |
1|             0 |              0 | 2017-03-06 02:33:26.466458+00 
>
>  113713583 | sentry             |           0 |      350030 |           342 |     11574 |     33444698 |
22519113|     10577975 |         2438 |       27672 |           2 |         0 |          0 |          0 |         0 |
         0 |              0 | 2017-03-06 02:33:24.156858+00 
>
>  148539615 | test               |           0 |           0 |             0 |         0 |            0 |
0|            0 |            0 |           0 |           0 |         0 |          0 |          0 |         0 |
  0 |              0 |  
>  161510793 | ... |           0 |           0 |             0 |         0 |            0 |             0 |
0|            0 |           0 |           0 |         0 |          0 |          0 |         0 |             0 |
    0 |  
> (8 rows)

Quite a large quantity of rollbacks there.  In your initial email the longest running transaction was an autovacuum
taskwasn't it?  Are you sure there are no other long running transactions? 


Whats the output of:

select min(xact_start) from pg_stat_activity where state<>'idle';
select * from pg_stat_activity where state<>'idle' order by xact_start limit 1;


Re: [GENERAL] Table not cleaning up drom dead tuples

От
Антон Тарабрин
Дата:
We're, in general, pretty carefull with our DB, as it contains important data. 
Most rollback is issued by application (which processes all data inside transactions).

p.s.  Time is in UTC (GMT+0)

<THIS_DB>=# select min(xact_start) from pg_stat_activity where state<>'idle';
              min
-------------------------------
 2017-03-14 15:36:05.432139+00
(1 row)

<THIS_DB>=# select * from pg_stat_activity where state<>'idle' order by xact_start limit 1;
  datid  | datname |  pid  | usesysid | usename | application_name | client_addr | client_hostname | client_port |         backend_start         |          xact_start           |          query_start          |         state_change          | waiting | state  | backend_xid | backend_xmin |                      query
---------+---------+-------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+---------+--------+-------------+--------------+--------------------------------------------------
 4906146 | <THIS_DB> | 37235 |       10 | pgsql   |                  |             |                 |             | 2017-03-14 05:55:43.287128+00 | 2017-03-14 15:36:05.432139+00 | 2017-03-14 15:36:05.432139+00 | 2017-03-14 15:36:05.432141+00 | f       | active |             |   1621959045 | autovacuum: VACUUM public.stats_y2017_m3_d13_hk2
(1 row)

вт, 14 мар. 2017 г. в 18:15, Glyn Astill <glynastill@yahoo.co.uk>:
Quite a large quantity of rollbacks there.  In your initial email the longest running transaction was an autovacuum task wasn't it?  Are you sure there are no other long running transactions?


Whats the output of:

select min(xact_start) from pg_stat_activity where state<>'idle';
select * from pg_stat_activity where state<>'idle' order by xact_start limit 1;
--
___________________________
С наилучшими пожеланиями,
Антон Тарабрин

With best regards,
Anton Tarabrin

Re: [GENERAL] Table not cleaning up drom dead tuples

От
Glyn Astill
Дата:
> We're, in general, pretty carefull with our DB, as it contains important data.
> Most rollback is issued by application (which processes all data inside transactions).
>
> p.s.  Time is in UTC (GMT+0)
>
> <THIS_DB>=# select min(xact_start) from pg_stat_activity where state<>'idle';
>               min
> -------------------------------
>  2017-03-14 15:36:05.432139+00
> (1 row)
>
> <THIS_DB>=# select * from pg_stat_activity where state<>'idle' order by xact_start limit 1;
>   datid  | datname |  pid  | usesysid | usename | application_name | client_addr | client_hostname | client_port |
    backend_start         |          xact_start           |          query_start          |         state_change
 | waiting | state  | backend_xid | backend_xmin |                      query 
>
---------+---------+-------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+---------+--------+-------------+--------------+--------------------------------------------------
>  4906146 | <THIS_DB> | 37235 |       10 | pgsql   |                  |             |                 |             |
2017-03-1405:55:43.287128+00 | 2017-03-14 15:36:05.432139+00 | 2017-03-14 15:36:05.432139+00 | 2017-03-14
15:36:05.432141+00| f       | active |             |   1621959045 | autovacuum: VACUUM public.stats_y2017_m3_d13_hk2 
> (1 row)
>
>


And no prepared transactions you say?

select * from pg_prepared_xacts;

Perhaps someone else will chime in ...


Re: [GENERAL] Table not cleaning up drom dead tuples

От
Антон Тарабрин
Дата:
Nope, we've checked everything before deciding to write in mail list.
We have no idea why it's happening.

вт, 14 мар. 2017 г. в 19:10, Glyn Astill <glynastill@yahoo.co.uk>:
And no prepared transactions you say?

select * from pg_prepared_xacts;

Perhaps someone else will chime in ...
--
___________________________
С наилучшими пожеланиями,
Антон Тарабрин

With best regards,
Anton Tarabrin

Re: [GENERAL] Table not cleaning up drom dead tuples

От
Jeff Janes
Дата:
On Tue, Mar 14, 2017 at 5:09 AM, Антон Тарабрин <tarabanton@gmail.com> wrote:
Good day. It seems that we have some strange case of VACUUM malfunction and table bloating.

PostgreSQL 9.5.3

Are you using replication slots?

See this, fixed in 9.5.5:

commit de396a1cb34626619ddc6fb9dec6d12abee8b589
Author: Andres Freund <andres@anarazel.de>
Date:   Wed Aug 17 13:15:03 2016 -0700

    Properly re-initialize replication slot shared memory upon creation.

    Slot creation did not clear all fields upon creation. After start the
    memory is zeroed, but when a physical replication slot was created in
    the shared memory of a previously existing logical slot, catalog_xmin
    would not be cleared. That in turn would prevent vacuum from doing its
    duties.


Cheers,

Jeff

Re: [GENERAL] Table not cleaning up drom dead tuples

От
Антон Тарабрин
Дата:
That was exactly that one bug. Thanks a lot

вт, 14 марта 2017 г., 23:26 Jeff Janes <jeff.janes@gmail.com>:
On Tue, Mar 14, 2017 at 5:09 AM, Антон Тарабрин <tarabanton@gmail.com> wrote:
Good day. It seems that we have some strange case of VACUUM malfunction and table bloating.

PostgreSQL 9.5.3

Are you using replication slots?

See this, fixed in 9.5.5:

commit de396a1cb34626619ddc6fb9dec6d12abee8b589
Author: Andres Freund <andres@anarazel.de>
Date:   Wed Aug 17 13:15:03 2016 -0700

    Properly re-initialize replication slot shared memory upon creation.

    Slot creation did not clear all fields upon creation. After start the
    memory is zeroed, but when a physical replication slot was created in
    the shared memory of a previously existing logical slot, catalog_xmin
    would not be cleared. That in turn would prevent vacuum from doing its
    duties.


Cheers,

Jeff
--
___________________________
С наилучшими пожеланиями,
Антон Тарабрин

With best regards,
Anton Tarabrin