checkpoint spikes

Поиск
Список
Период
Сортировка
От Janning
Тема checkpoint spikes
Дата
Msg-id 201006091853.23782.ml@planwerk6.de
обсуждение исходный текст
Ответы Re: checkpoint spikes  (Janning <ml@planwerk6.de>)
Список pgsql-general
Hi,

we currently encounter an increasing load on our website. With the increasing
load we see some problems on our database. so we checked what happens and we
saw spikes in our load when checkpoints are about to finish.

Our configuration:

max_connections = 125
ssl = false
shared_buffers = 500MB
work_mem = 15MB
maintenance_work_mem = 250MB
synchronous_commit = off
full_page_writes = off

checkpoint_segments = 10
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9

random_page_cost = 2
effective_cache_size = 5000MB

autovacuum = off

(we put autovacuum to off because we suspected it).

all the other parameters are set to default (beside log parameters and so on).

Our machine has 12 GB RAM, i7-975 CPU and a SW-Raid-1 for datadir and another
one for the rest of the server (including postgresql logfiles). Disk are
"Barracuda 7200.11 SATA 3Gb/s 1.5-TB" and we are running debian lenny.

these are our checkpoint log statements:

2010-06-09 17:24:27 CEST [6375]: [1-1] LOG:  checkpoint starting: time
2010-06-09 17:28:09 CEST [6375]: [2-1] LOG:  checkpoint complete: wrote 2861
buffers (4.5%); 0 transaction log file(s) added, 0 removed, 1 recycled;
write=193.057 s, sync=29.259 s, total=222.353 s
2010-06-09 17:34:27 CEST [6375]: [3-1] LOG:  checkpoint starting: time
2010-06-09 17:39:09 CEST [6375]: [4-1] LOG:  checkpoint complete: wrote 3247
buffers (5.1%); 0 transaction log file(s) added, 0 removed, 2 recycled;
write=255.255 s, sync=26.911 s, total=282.177 s
2010-06-09 17:44:27 CEST [6375]: [5-1] LOG:  checkpoint starting: time
2010-06-09 17:49:41 CEST [6375]: [6-1] LOG:  checkpoint complete: wrote 2746
buffers (4.3%); 0 transaction log file(s) added, 0 removed, 2 recycled;
write=280.743 s, sync=33.392 s, total=314.147 s
2010-06-09 17:54:27 CEST [6375]: [7-1] LOG:  checkpoint starting: time
2010-06-09 17:58:59 CEST [6375]: [8-1] LOG:  checkpoint complete: wrote 3118
buffers (4.9%); 0 transaction log file(s) added, 0 removed, 1 recycled;
write=253.293 s, sync=18.585 s, total=271.892 s
2010-06-09 18:04:27 CEST [6375]: [9-1] LOG:  checkpoint starting: time
2010-06-09 18:08:46 CEST [6375]: [10-1] LOG:  checkpoint complete: wrote 2695
buffers (4.2%); 0 transaction log file(s) added, 0 removed, 2 recycled;
write=225.173 s, sync=33.789 s, total=258.972 s
2010-06-09 18:14:27 CEST [6375]: [11-1] LOG:  checkpoint starting: time
2010-06-09 18:18:30 CEST [6375]: [12-1] LOG:  checkpoint complete: wrote 2868
buffers (4.5%); 0 transaction log file(s) added, 0 removed, 2 recycled;
write=215.561 s, sync=27.701 s, total=243.271 s

What we saw is a rather long sync time. And exactly at this time, our
responses become slow and the server load increases.

this is from pg_stat_bgwriter:

 checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean |
maxwritten_clean | buffers_backend | buffers_alloc

-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------
              3495 |               0 |            9070242 |      10798927 |
14421 |         6412707 |     208340755


We tried to increase checkpoint_timeout to 20 minutes but it was getting
worse:

2010-06-09 18:34:27 CEST [6375]: [13-1] LOG:  checkpoint starting: time
2010-06-09 18:42:49 CEST [6375]: [14-1] LOG:  checkpoint complete: wrote 2956
buffers (4.6%); 0 transaction log file(s) added, 0 removed, 1 recycled;
write=448.265 s, sync=54.087 s, total=502.377 s

here we see a 54 sec sync time and a much higher load on sync time compared to
the 10 minutes checkpoint_timeout.

do you have any hints for us how to tune our configuration to avoid spikes?

kind regards
Janning








В списке pgsql-general по дате отправления:

Предыдущее
От: Vick Khera
Дата:
Сообщение: Re: vacuum process is hanging
Следующее
От: Scott Marlowe
Дата:
Сообщение: Re: vacuum process is hanging