Re: A few new options for CHECKPOINT

Поиск
Список
Период
Сортировка
От Bossart, Nathan
Тема Re: A few new options for CHECKPOINT
Дата
Msg-id A07F8329-4870-40B2-9991-5016491AA11A@amazon.com
обсуждение исходный текст
Ответ на Re: A few new options for CHECKPOINT  (Stephen Frost <sfrost@snowman.net>)
Ответы Re: A few new options for CHECKPOINT
Список pgsql-hackers
On 11/27/20, 10:58 AM, "Stephen Frost" <sfrost@snowman.net> wrote:
> If you'd like to show that I'm wrong, and it's entirely possible that I
> am, then retry the above with actual load on the system, and also
> actually look at how much outstanding WAL you end up with given the
> different scenarios which has to be replayed during crash recovery.

I did a little experiment to show the behavior I'm referring to.  I
used these settings:

        checkpoint_completion_target = 0.9
        checkpoint_timeout = 30s
        max_wal_size = 20GB
        WAL segment size is 64MB

I ran the following pgbench command for a few minutes before each
test:

        pgbench postgres -T 3600 -c 64 -j 64 -N

For the first test, I killed Postgres just before an automatic, non-
immediate checkpoint completed.

        2020-11-28 00:31:57 UTC::@:[51770]:LOG:  checkpoint complete...
        2020-11-28 00:32:00 UTC::@:[51770]:LOG:  checkpoint starting: time

        Killed Postgres at 00:32:26 UTC, 29 seconds after latest
        checkpoint completed.

        2020-11-28 00:32:42 UTC::@:[77256]:LOG:  redo starts at 3CF/FD6B8BD0
        2020-11-28 00:32:56 UTC::@:[77256]:LOG:  redo done at 3D0/C94D1D00

        Recovery took 14 seconds and replayed ~3.2 GB of WAL.

        postgres=> SELECT pg_wal_lsn_diff('3D0/C94D1D00', '3CF/FD6B8BD0');
         pg_wal_lsn_diff
        -----------------
              3420557616
        (1 row)

For the second test, I killed Postgres just after an automatic, non-
immediate checkpoint completed.

        2020-11-28 00:41:26 UTC::@:[77475]:LOG:  checkpoint complete...

        Killed Postgres at 00:41:26 UTC, just after latest checkpoint
        completed.

        2020-11-28 00:41:42 UTC::@:[8599]:LOG:  redo starts at 3D3/152EDD78
        2020-11-28 00:41:49 UTC::@:[8599]:LOG:  redo done at 3D3/78358A40

        Recovery took 7 seconds and replayed ~1.5 GB of WAL.

        postgres=> SELECT pg_wal_lsn_diff('3D3/78358A40', '3D3/152EDD78');
         pg_wal_lsn_diff
        -----------------
              1661381832
        (1 row)

Granted, I used a rather aggressive checkpoint_timeout, but I think
this demonstrates that waiting for a non-immediate checkpoint to
complete can lower the amount of WAL needed for recovery, even though
it might not lower it as much as waiting for an immediate checkpoint
would.

Nathan


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: [PoC] Non-volatile WAL buffer
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: POC: postgres_fdw insert batching