Re: checkpoints after database start/immediate checkpoints

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: checkpoints after database start/immediate checkpoints
Дата
Msg-id CA+TgmoYH=oFhnjgG7AtxHX3vJqpssJ=WP7848jv7C3xN6RjCbQ@mail.gmail.com
обсуждение исходный текст
Ответ на checkpoints after database start/immediate checkpoints  (Andres Freund <andres@anarazel.de>)
Ответы Re: checkpoints after database start/immediate checkpoints  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Mon, Feb 1, 2016 at 7:43 PM, Andres Freund <andres@anarazel.de> wrote:
> Right now it takes checkpoint_timeout till we start a checkpoint, and
> checkpoint_timeout + checkpoint_timeout * checkpoint_completion_target
> till we complete the first checkpoint after shutdown/forced checkpoints.
>
> That means a) that such checkpoint will often be bigger/more heavyweight
> than the following ones, not what you want after a restart/create
> database/basebackup... b) reliable benchmarks measuring steady state
> have to run for even longer.
>
> There's some logic to that behaviour though: With a target of 0 it'd be
> weird to start a checkpoint directly after startup, and even otherwise
> there'll be less to write at the beginning.
>
> As an example:
> 2016-02-02 01:32:58.053 CET 21361 LOG:  checkpoint complete: wrote 186041 buffers (71.0%); 1 transaction log file(s)
added,0 removed, 0 recycled; sort=0.071 s, write=9.504 s, sync=0.000 s, total=9.532 s; sync files=0, longest=0.000 s,
average=0.000s; distance=1460260 kB, estimate=1460260 kB 
> 2016-02-02 01:32:58.053 CET 21361 LOG:  checkpoint starting: time
> 2016-02-02 01:33:08.061 CET 21361 LOG:  checkpoint complete: wrote 127249 buffers (48.5%); 0 transaction log file(s)
added,0 removed, 89 recycled; sort=0.045 s, write=9.987 s, sync=0.000 s, total=10.007 s; sync files=0, longest=0.000 s,
average=0.000s; distance=1187558 kB, estimate=1432989 kB 
> 2016-02-02 01:33:58.216 CET 21361 LOG:  checkpoint complete: wrote 124649 buffers (47.5%); 0 transaction log file(s)
added,0 removed, 69 recycled; sort=0.048 s, write=10.160 s, sync=0.000 s, total=10.176 s; sync files=0, longest=0.000
s,average=0.000 s; distance=1135086 kB, estimate=1337776 kB 
> 2016-02-02 01:33:58.216 CET 21361 LOG:  checkpoint starting: time
> 2016-02-02 01:34:08.060 CET 21361 LOG:  checkpoint complete: wrote 123298 buffers (47.0%); 0 transaction log file(s)
added,0 removed, 69 recycled; sort=0.049 s, write=9.838 s, sync=0.000 s, total=9.843 s; sync files=0, longest=0.000 s,
average=0.000s; distance=1184077 kB, estimate=1322406 kB 
> 2016-02-02 01:34:08.060 CET 21361 LOG:  checkpoint starting: time
> 2016-02-02 01:34:18.052 CET 21361 LOG:  checkpoint complete: wrote 118165 buffers (45.1%); 0 transaction log file(s)
added,0 removed, 72 recycled; sort=0.046 s, write=9.987 s, sync=0.000 s, total=9.992 s; sync files=0, longest=0.000 s,
average=0.000s; distance=1198018 kB, estimate=1309967 kB 
> 2016-02-02 01:34:18.053 CET 21361 LOG:  checkpoint starting: time
> 2016-02-02 01:34:28.089 CET 21361 LOG:  checkpoint complete: wrote 120814 buffers (46.1%); 0 transaction log file(s)
added,0 removed, 73 recycled; sort=0.051 s, write=10.020 s, sync=0.000 s, total=10.036 s; sync files=0, longest=0.000
s,average=0.000 s; distance=1203691 kB, estimate=1299339 kB 
> 2016-02-02 01:34:28.090 CET 21361 LOG:  checkpoint starting: time
> 2016-02-02 01:34:39.182 CET 21361 LOG:  checkpoint complete: wrote 110411 buffers (42.1%); 0 transaction log file(s)
added,0 removed, 74 recycled; sort=0.047 s, write=9.908 s, sync=0.000 s, total=11.092 s; sync files=0, longest=0.000 s,
average=0.000s; distance=1073612 kB, estimate=1276767 kB 
>
> We wrote roughly 1/3 more wal/buffers during the first checkpoint than
> following ones (And yes, the above is with an impossibly low timeout,
> but that doesn't change anything). You can make that much more extreme
> with larget shared buffers and larger timeouts.
>
>
> I wonder if this essentially point at checkpoint_timeout being wrongly
> defined: Currently it means we'll try to finish a checkpoint
> (1-checkpoint_completion_target) * timeout before the next one - but
> perhaps it should instead be that we start checkpoint_timeout * _target
> before the next timeout? Afaics that'd work more graceful in the face of
> restarts and forced checkpoints.

There's a certain appeal to that, but at the same time it seems pretty
wonky.  Right now, you can say that a checkpoint is triggered when the
amount of WAL reaches X or the amount of time reaches Y, but with the
alternative definition it's a bit harder to explain what's going on
there.  Maybe we should do it anyway, but...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: extend pgbench expressions with functions
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Copy-pasto in the ExecForeignDelete documentation