Re: checkpoints after database start/immediate checkpoints
От | Robert Haas |
---|---|
Тема | Re: checkpoints after database start/immediate checkpoints |
Дата | |
Msg-id | CA+TgmoYH=oFhnjgG7AtxHX3vJqpssJ=WP7848jv7C3xN6RjCbQ@mail.gmail.com обсуждение исходный текст |
Ответ на | checkpoints after database start/immediate checkpoints (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: checkpoints after database start/immediate checkpoints
(Andres Freund <andres@anarazel.de>)
|
Список | pgsql-hackers |
On Mon, Feb 1, 2016 at 7:43 PM, Andres Freund <andres@anarazel.de> wrote: > Right now it takes checkpoint_timeout till we start a checkpoint, and > checkpoint_timeout + checkpoint_timeout * checkpoint_completion_target > till we complete the first checkpoint after shutdown/forced checkpoints. > > That means a) that such checkpoint will often be bigger/more heavyweight > than the following ones, not what you want after a restart/create > database/basebackup... b) reliable benchmarks measuring steady state > have to run for even longer. > > There's some logic to that behaviour though: With a target of 0 it'd be > weird to start a checkpoint directly after startup, and even otherwise > there'll be less to write at the beginning. > > As an example: > 2016-02-02 01:32:58.053 CET 21361 LOG: checkpoint complete: wrote 186041 buffers (71.0%); 1 transaction log file(s) added,0 removed, 0 recycled; sort=0.071 s, write=9.504 s, sync=0.000 s, total=9.532 s; sync files=0, longest=0.000 s, average=0.000s; distance=1460260 kB, estimate=1460260 kB > 2016-02-02 01:32:58.053 CET 21361 LOG: checkpoint starting: time > 2016-02-02 01:33:08.061 CET 21361 LOG: checkpoint complete: wrote 127249 buffers (48.5%); 0 transaction log file(s) added,0 removed, 89 recycled; sort=0.045 s, write=9.987 s, sync=0.000 s, total=10.007 s; sync files=0, longest=0.000 s, average=0.000s; distance=1187558 kB, estimate=1432989 kB > 2016-02-02 01:33:58.216 CET 21361 LOG: checkpoint complete: wrote 124649 buffers (47.5%); 0 transaction log file(s) added,0 removed, 69 recycled; sort=0.048 s, write=10.160 s, sync=0.000 s, total=10.176 s; sync files=0, longest=0.000 s,average=0.000 s; distance=1135086 kB, estimate=1337776 kB > 2016-02-02 01:33:58.216 CET 21361 LOG: checkpoint starting: time > 2016-02-02 01:34:08.060 CET 21361 LOG: checkpoint complete: wrote 123298 buffers (47.0%); 0 transaction log file(s) added,0 removed, 69 recycled; sort=0.049 s, write=9.838 s, sync=0.000 s, total=9.843 s; sync files=0, longest=0.000 s, average=0.000s; distance=1184077 kB, estimate=1322406 kB > 2016-02-02 01:34:08.060 CET 21361 LOG: checkpoint starting: time > 2016-02-02 01:34:18.052 CET 21361 LOG: checkpoint complete: wrote 118165 buffers (45.1%); 0 transaction log file(s) added,0 removed, 72 recycled; sort=0.046 s, write=9.987 s, sync=0.000 s, total=9.992 s; sync files=0, longest=0.000 s, average=0.000s; distance=1198018 kB, estimate=1309967 kB > 2016-02-02 01:34:18.053 CET 21361 LOG: checkpoint starting: time > 2016-02-02 01:34:28.089 CET 21361 LOG: checkpoint complete: wrote 120814 buffers (46.1%); 0 transaction log file(s) added,0 removed, 73 recycled; sort=0.051 s, write=10.020 s, sync=0.000 s, total=10.036 s; sync files=0, longest=0.000 s,average=0.000 s; distance=1203691 kB, estimate=1299339 kB > 2016-02-02 01:34:28.090 CET 21361 LOG: checkpoint starting: time > 2016-02-02 01:34:39.182 CET 21361 LOG: checkpoint complete: wrote 110411 buffers (42.1%); 0 transaction log file(s) added,0 removed, 74 recycled; sort=0.047 s, write=9.908 s, sync=0.000 s, total=11.092 s; sync files=0, longest=0.000 s, average=0.000s; distance=1073612 kB, estimate=1276767 kB > > We wrote roughly 1/3 more wal/buffers during the first checkpoint than > following ones (And yes, the above is with an impossibly low timeout, > but that doesn't change anything). You can make that much more extreme > with larget shared buffers and larger timeouts. > > > I wonder if this essentially point at checkpoint_timeout being wrongly > defined: Currently it means we'll try to finish a checkpoint > (1-checkpoint_completion_target) * timeout before the next one - but > perhaps it should instead be that we start checkpoint_timeout * _target > before the next timeout? Afaics that'd work more graceful in the face of > restarts and forced checkpoints. There's a certain appeal to that, but at the same time it seems pretty wonky. Right now, you can say that a checkpoint is triggered when the amount of WAL reaches X or the amount of time reaches Y, but with the alternative definition it's a bit harder to explain what's going on there. Maybe we should do it anyway, but... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления: