Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise

Поиск
Список
Период
Сортировка
От Joshua D. Drake
Тема Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise
Дата
Msg-id 0ec4c002-a219-942d-a686-0804c826c3e0@commandprompt.com
обсуждение исходный текст
Ответ на Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise  (Sokolov Yura <y.sokolov@postgrespro.ru>)
Ответы Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise  ("Joshua D. Drake" <jd@commandprompt.com>)
Список pgsql-hackers
Hello,

I changed the test to run for 6 hours at a time regardless of number of 
transactions. I also changed the du command to only look at the database 
(previously wal logs were included). This is the clearest indication of 
the problem I have been able to produce.

Again, this is with 128 clients and 500 warehouses. The first test is a 
clean test, everything dropped, vacuumed etc... Each subsequent test is 
just starting the test again to have breakpoints.


-------------------------------------+----------- autovacuum                          | on
autovacuum_analyze_scale_factor    | 0.1 autovacuum_analyze_threshold        | 50 autovacuum_freeze_max_age           |
200000000autovacuum_max_workers              | 12 autovacuum_multixact_freeze_max_age | 400000000 autovacuum_naptime
             | 10 autovacuum_vacuum_cost_delay        | 0 autovacuum_vacuum_cost_limit        | 5000
autovacuum_vacuum_scale_factor     | 0.1 autovacuum_vacuum_threshold         | 50 autovacuum_work_mem                 |
-1log_autovacuum_min_duration         | -1 max_wal_size                        | 640 checkpoint_timeout
  | 86400 checkpoint_completion_target        | 0.5
 

Starting base metric
50G     /srv/main/base

Test 1:
90G     /srv/main/base
TPS: 838

Test 2:
121G    /srv/main/base
TPS: 725

Test 3:
146G    /srv/main/base
TPS: 642

Test 4:
171G    /srv/main/base
TPS: 549

Test 5:
189G    /srv/main/base
TPS: 489

Test 6:
208G    /srv/main/base
TPS: 454

As you can see even with aggressive vacuuming, over a period of 36 hours 
life gets increasingly miserable.

The largest table is:

postgres=# select 
pg_size_pretty(pg_total_relation_size('bmsql_order_line')); pg_size_pretty
---------------- 148 GB
(1 row)

postgres=# \d bmsql_order_line             Table "public.bmsql_order_line"     Column     |            Type
|Modifiers
 
----------------+-----------------------------+----------- ol_w_id        | integer                     | not null
ol_d_id       | integer                     | not null ol_o_id        | integer                     | not null
ol_number     | integer                     | not null ol_i_id        | integer                     | not null
ol_delivery_d | timestamp without time zone | ol_amount      | numeric(6,2)                | ol_supply_w_id | integer
                 | ol_quantity    | integer                     | ol_dist_info   | character(24)               |
 
Indexes:    "bmsql_order_line_pkey" PRIMARY KEY, btree (ol_w_id, ol_d_id, 
ol_o_id, ol_number)
Foreign-key constraints:    "ol_order_fkey" FOREIGN KEY (ol_w_id, ol_d_id, ol_o_id) REFERENCES 
bmsql_oorder(o_w_id, o_d_id, o_id)    "ol_stock_fkey" FOREIGN KEY (ol_supply_w_id, ol_i_id) REFERENCES 
bmsql_stock(s_w_id, s_i_id)

With the PK being

postgres=# select pg_size_pretty(pg_relation_size('bmsql_order_line_pkey')); pg_size_pretty
---------------- 48 GB
(1 row)

I tried to see how much data we are dealing with here:

postgres=# select count(*) from bmsql_order_line;   count
----------- 910324839
(1 row)

Time: 503965.767 ms

And just to show that we were pushing to get these numbers:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle           2.38    0.00    2.20    1.98    0.00   93.44

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb            2027.40       239.99         0.05       1199          0
sda               0.80         0.00         0.01          0          0



So we have 910M rows, and it took 8.39941667 minutes to count them at 
240MB/s.

I know this is a lot of data and as I said previously, happy to let 
anyone look at it. However, we clearly have something deeper to look into.

Thanks in advance,

JD


-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
*****     Unless otherwise stated, opinions are my own.   *****



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] Improve perfomance for index search ANY(ARRAY[]) condition with single item
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: [HACKERS] Buildfarm failure and dubious coding in predicate.c