Re: postgresql latency & bgwriter not doing its job
From | Fabien COELHO |
---|---|
Subject | Re: postgresql latency & bgwriter not doing its job |
Date | |
Msg-id | alpine.DEB.2.10.1408261026060.4394@sto |
In response to | Re: postgresql latency & bgwriter not doing its job (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: postgresql latency & bgwriter not doing its job
|
List | pgsql-hackers |
> What are the other settings here? checkpoint_segments,
> checkpoint_timeout, wal_buffers?

They are simply the defaults:

  checkpoint_segments = 3
  checkpoint_timeout = 5min
  wal_buffers = -1

I ran some tests with checkpoint_segments = 1: the problem is just more frequent but shorter. I also reduced wal_segsize down to 1MB, which made it even more frequent but much shorter, so the overall result was an improvement, with 3% to 5% of transactions lost instead of 10-14%, if I recall correctly. I have found no solution on this path.

> Could you show the output of log_checkpoints during that run? Checkpoint
> spreading only works halfway efficiently if all checkpoints are
> triggered by "time" and not by "xlog".

I run 500-second tests, so there could be at most 2 timeout-triggered checkpoints. Given the write load, it takes about 2 minutes to fill the three 16 MB WAL segments (8 kB * 50 tps (there is one page modified per transaction) * 120 s ~ 48 MB), so checkpoints are triggered by xlog.

The maths are consistent with the logs (not sure which proves which, though :-):

LOG: received SIGHUP, reloading configuration files
LOG: parameter "log_checkpoints" changed to "on"
LOG: checkpoint starting: xlog
LOG: checkpoint complete: wrote 5713 buffers (34.9%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=51.449 s, sync=4.857 s, total=56.485 s; sync files=12, longest=2.160 s, average=0.404 s
LOG: checkpoint starting: xlog
LOG: checkpoint complete: wrote 6235 buffers (38.1%); 0 transaction log file(s) added, 0 removed, 3 recycled; write=53.500 s, sync=5.102 s, total=58.670 s; sync files=8, longest=2.689 s, average=0.637 s
LOG: checkpoint starting: xlog
LOG: checkpoint complete: wrote 6250 buffers (38.1%); 0 transaction log file(s) added, 0 removed, 3 recycled; write=53.888 s, sync=4.504 s, total=58.495 s; sync files=8, longest=2.627 s, average=0.563 s
LOG: checkpoint starting: xlog
LOG: checkpoint complete: wrote 6148 buffers (37.5%); 0 transaction log file(s) added, 0 removed, 3 recycled; write=53.313 s, sync=6.437 s, total=59.834 s; sync files=8, longest=3.680 s, average=0.804 s
LOG: checkpoint starting: xlog
LOG: checkpoint complete: wrote 6240 buffers (38.1%); 0 transaction log file(s) added, 0 removed, 3 recycled; write=149.008 s, sync=5.448 s, total=154.566 s; sync files=9, longest=3.788 s, average=0.605 s

Note that my current effective solution is to act as if "checkpoint_timeout = 0.2s": it works fine if I do my own spreading.

--
Fabien.
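The fill-time estimate in the message above can be rechecked with a few lines of Python. This is only a rough sketch under the message's own assumptions (one 8 kB page logged per transaction, 50 tps, three 16 MB segments); the constant and function names are illustrative, and 1 MB is taken as 1024 kB.

```python
# Back-of-the-envelope check of the WAL fill time quoted in the message.
# Assumptions (from the message, not measured here): each transaction
# logs one 8 kB page, the load is 50 tps, and a checkpoint is triggered
# once 3 segments of 16 MB have been filled.
PAGE_KB = 8
TPS = 50
SEGMENT_MB = 16
CHECKPOINT_SEGMENTS = 3


def wal_kb_per_second(tps=TPS, page_kb=PAGE_KB):
    """WAL volume generated per second, in kB."""
    return tps * page_kb


def seconds_to_fill(segments=CHECKPOINT_SEGMENTS, segment_mb=SEGMENT_MB,
                    tps=TPS, page_kb=PAGE_KB):
    """Seconds needed to fill the given number of WAL segments."""
    total_kb = segments * segment_mb * 1024
    return total_kb / wal_kb_per_second(tps, page_kb)


if __name__ == "__main__":
    print(f"WAL rate: {wal_kb_per_second()} kB/s")
    print(f"Time to fill {CHECKPOINT_SEGMENTS} segments: {seconds_to_fill():.1f} s")
```

Under these assumptions the fill time comes out at roughly two minutes, well below the 5-minute checkpoint_timeout, which matches the claim that checkpoints are triggered by xlog rather than by time.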