Discussion: pgsql: Online enabling and disabling of data checksums

pgsql: Online enabling and disabling of data checksums

From: Daniel Gustafsson
Date:
Online enabling and disabling of data checksums

This allows data checksums to be enabled, or disabled, in a running
cluster without restricting access to the cluster during processing.

Previously, data checksums could only be enabled during initdb, or
while the cluster was offline using pg_checksums. This commit
introduces functionality to enable, or disable, data checksums while
the cluster is running, regardless of how it was initialized.

A background worker launcher process is responsible for launching a
dynamic per-database background worker, which marks all buffers
dirty for all relations with storage so that their data checksums
are calculated on write.  Once all relations in all databases
have been processed, the data_checksums state will be set to on and
the cluster will at that point be identical to one which had data
checksums enabled during initialization or via offline processing.
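
The launcher/worker flow described above can be modelled roughly as
follows. This is a toy sketch in Python, not code from the patch: the
names Page, Cluster, flush, worker and launcher are hypothetical, pages
are flushed immediately to keep the model small, and zlib.crc32 stands
in for the real page checksum algorithm.

```python
from __future__ import annotations

import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    data: bytes
    checksum: Optional[int] = None  # None models a page written without a checksum
    dirty: bool = False


@dataclass
class Cluster:
    # database name -> relation name -> pages of that relation
    databases: dict[str, dict[str, list[Page]]]
    data_checksums: str = "off"


def flush(page: Page) -> None:
    """Write-out path: a checksum is computed only when a dirty page is written."""
    if page.dirty:
        page.checksum = zlib.crc32(page.data)  # stand-in for the real algorithm
        page.dirty = False


def worker(relations: dict[str, list[Page]]) -> None:
    """Per-database worker: dirty every page so that its next write adds a checksum."""
    for pages in relations.values():
        for page in pages:
            page.dirty = True
            flush(page)  # assumption: flushed right away to keep the model small


def launcher(cluster: Cluster) -> None:
    """Launcher: process each database in turn, then flip the cluster-wide state."""
    for relations in cluster.databases.values():
        worker(relations)
    cluster.data_checksums = "on"
```

The point of the model is the ordering: the state only becomes "on"
after every page of every relation in every database has been through
the write path once.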

When data checksums are being enabled, concurrent I/O operations
from backends other than the data checksums worker will write the
checksums but not verify them on reading.  Only when all backends
have absorbed the procsignalbarrier for setting data_checksums to
on will they also start verifying checksums on reading.  The same
process is repeated during disabling; all backends write checksums
but do not verify them until the barrier for setting the state to
off has been absorbed by all.  This in-progress state is used to
ensure there are no false negatives (or positives) due to reading
a checksum which is not in sync with the page.
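
As a rough illustration of the state handling described above, here is
a small Python model. It is a sketch under assumptions, not the
implementation: the state names INPROGRESS_ON and INPROGRESS_OFF, the
Backend class and the barrier, enable and disable helpers are all
hypothetical, and the barrier is modelled as a plain loop over
backends.

```python
from __future__ import annotations

from enum import Enum


class ChecksumState(Enum):
    OFF = "off"
    INPROGRESS_ON = "inprogress-on"    # hypothetical name for the enabling phase
    ON = "on"
    INPROGRESS_OFF = "inprogress-off"  # hypothetical name for the disabling phase


def writes_checksums(state: ChecksumState) -> bool:
    """Checksums are written in every state except fully off."""
    return state is not ChecksumState.OFF


def verifies_checksums(state: ChecksumState) -> bool:
    """Checksums are verified on read only once the state is fully on."""
    return state is ChecksumState.ON


class Backend:
    """Each backend holds its own local view of the cluster-wide state."""

    def __init__(self) -> None:
        self.state = ChecksumState.OFF


def barrier(backends: list[Backend], new_state: ChecksumState) -> None:
    """Model of a barrier: returns only after every backend has absorbed new_state."""
    for backend in backends:
        backend.state = new_state


def enable(backends: list[Backend]) -> None:
    barrier(backends, ChecksumState.INPROGRESS_ON)  # everyone writes, nobody verifies
    # ... existing pages are rewritten with checksums at this point ...
    barrier(backends, ChecksumState.ON)  # verification is safe from here on


def disable(backends: list[Backend]) -> None:
    barrier(backends, ChecksumState.INPROGRESS_OFF)  # keep writing, stop verifying
    barrier(backends, ChecksumState.OFF)  # writing can stop as well
```

While a barrier is being absorbed, backends can briefly be in two
adjacent states. In this model, whenever one of the two adjacent states
verifies checksums, both of them write checksums, so a verifying
backend never reads a page that another backend wrote without one.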

A new test module, test_checksums, is introduced with an extensive
set of tests covering both online and offline data checksum mode
changes.  The tests which run concurrent pgbench during online
processing are gated behind the PG_TEST_EXTRA flag due to being
very expensive to run.  Two levels of PG_TEST_EXTRA flags exist
to turn on a subset of the expensive tests, or the full suite of
multiple runs.
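
The two-level gating can be pictured as below. This is only a sketch:
the real tests are Perl TAP scripts, the flag names "checksum" and
"checksum_extended" are hypothetical placeholders, and the only fact
relied on is that PG_TEST_EXTRA holds a whitespace-separated list of
test names.

```python
import os

# Hypothetical flag names; the real ones are defined by the test module.
SUBSET_FLAG = "checksum"
FULL_FLAG = "checksum_extended"


def pgbench_run_level(environ=os.environ) -> str:
    """Return 'skip', 'subset' or 'full' depending on PG_TEST_EXTRA."""
    enabled = environ.get("PG_TEST_EXTRA", "").split()
    if FULL_FLAG in enabled:
        return "full"    # the full suite of multiple runs
    if SUBSET_FLAG in enabled:
        return "subset"  # a subset of the expensive tests
    return "skip"        # expensive tests are not run by default
```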

This work is based on an earlier version of this patch which was
reviewed by, among others, Heikki Linnakangas, Robert Haas, Andres
Freund, Tomas Vondra, Michael Banck and Andrey Borodin.  During
the work on this new version, Tomas Vondra has given invaluable
assistance with not only coding and reviewing but very in-depth
testing.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Magnus Hagander <magnus@hagander.net>
Co-authored-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com
Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/f19c0eccae9680f5785b11cdc58ef571998caec9

Modified Files
--------------
doc/src/sgml/config.sgml                           |    1 +
doc/src/sgml/func/func-admin.sgml                  |   78 +
doc/src/sgml/glossary.sgml                         |   24 +
doc/src/sgml/images/Makefile                       |    1 +
doc/src/sgml/images/datachecksums.gv               |   14 +
doc/src/sgml/images/datachecksums.svg              |   81 +
doc/src/sgml/monitoring.sgml                       |  228 ++-
doc/src/sgml/ref/pg_checksums.sgml                 |    6 +
doc/src/sgml/regress.sgml                          |   14 +
doc/src/sgml/wal.sgml                              |  126 +-
src/backend/access/rmgrdesc/xlogdesc.c             |   58 +-
src/backend/access/transam/xlog.c                  |  502 +++++-
src/backend/backup/basebackup.c                    |   31 +-
src/backend/bootstrap/bootstrap.c                  |    1 +
src/backend/catalog/system_views.sql               |   19 +
src/backend/commands/dbcommands.c                  |    7 +
src/backend/postmaster/Makefile                    |    1 +
src/backend/postmaster/auxprocess.c                |   19 +
src/backend/postmaster/bgworker.c                  |   10 +-
src/backend/postmaster/datachecksum_state.c        | 1612 ++++++++++++++++++++
src/backend/postmaster/meson.build                 |    1 +
src/backend/postmaster/postmaster.c                |    5 +
src/backend/replication/logical/decode.c           |   16 +
src/backend/storage/buffer/bufmgr.c                |    7 +
src/backend/storage/ipc/ipci.c                     |    3 +
src/backend/storage/ipc/procsignal.c               |    8 +
src/backend/storage/page/README                    |    4 +-
src/backend/storage/page/bufpage.c                 |   23 +-
src/backend/utils/activity/pgstat_backend.c        |    2 +
src/backend/utils/activity/pgstat_io.c             |    2 +
src/backend/utils/activity/wait_event_names.txt    |    3 +
src/backend/utils/adt/pgstatfuncs.c                |    8 +-
src/backend/utils/init/miscinit.c                  |    3 +-
src/backend/utils/init/postinit.c                  |   20 +-
src/backend/utils/misc/guc_parameters.dat          |    5 +-
src/backend/utils/misc/guc_tables.c                |    9 +-
src/backend/utils/misc/postgresql.conf.sample      |   10 +-
src/bin/pg_checksums/pg_checksums.c                |    4 +-
src/bin/pg_controldata/pg_controldata.c            |    2 +
src/bin/pg_upgrade/controldata.c                   |    9 +
src/bin/pg_waldump/t/001_basic.pl                  |    3 +-
src/include/access/rmgrlist.h                      |    1 +
src/include/access/xlog.h                          |   17 +-
src/include/access/xlog_internal.h                 |    8 +
src/include/catalog/catversion.h                   |    2 +-
src/include/catalog/pg_control.h                   |    8 +-
src/include/catalog/pg_proc.dat                    |   14 +
src/include/commands/progress.h                    |   16 +
src/include/miscadmin.h                            |    6 +
src/include/postmaster/datachecksum_state.h        |   58 +
src/include/postmaster/proctypelist.h              |    2 +
src/include/replication/decode.h                   |    1 +
src/include/storage/bufpage.h                      |    2 +-
src/include/storage/checksum.h                     |   16 +
src/include/storage/lwlocklist.h                   |    1 +
src/include/storage/procsignal.h                   |    4 +
src/include/utils/backend_progress.h               |    1 +
src/test/modules/Makefile                          |    1 +
src/test/modules/meson.build                       |    1 +
src/test/modules/test_checksums/.gitignore         |    2 +
src/test/modules/test_checksums/Makefile           |   40 +
src/test/modules/test_checksums/README             |   30 +
src/test/modules/test_checksums/meson.build        |   38 +
src/test/modules/test_checksums/t/001_basic.pl     |   63 +
src/test/modules/test_checksums/t/002_restarts.pl  |  110 ++
.../test_checksums/t/003_standby_restarts.pl       |  114 ++
src/test/modules/test_checksums/t/004_offline.pl   |   82 +
src/test/modules/test_checksums/t/005_injection.pl |   74 +
.../modules/test_checksums/t/006_pgbench_single.pl |  275 ++++
.../test_checksums/t/007_pgbench_standby.pl        |  400 +++++
src/test/modules/test_checksums/t/008_pitr.pl      |  189 +++
src/test/modules/test_checksums/t/009_fpi.pl       |   64 +
.../test_checksums/t/DataChecksums/Utils.pm        |  262 ++++
.../modules/test_checksums/test_checksums--1.0.sql |   24 +
src/test/modules/test_checksums/test_checksums.c   |  184 +++
.../modules/test_checksums/test_checksums.control  |    4 +
src/test/perl/PostgreSQL/Test/Cluster.pm           |   36 +
src/test/regress/expected/rules.out                |   35 +
src/test/regress/expected/stats.out                |   18 +-
src/tools/pgindent/typedefs.list                   |    7 +
80 files changed, 5132 insertions(+), 58 deletions(-)


Re: pgsql: Online enabling and disabling of data checksums

From: Aleksander Alekseev
Date:
Hi Daniel,

> Online enabling and disabling of data checksums
>
> [...]

I noticed a little mistake:

```
/*
 * Await state transition to "on" in all backends. When done we know that
 * data data checksums are both written and verified in all backends.
 */
```

The word "data" is repeated twice.

Also there are inconsistencies in the way
XLogCtlData->data_checksum_version,
ControlFileData->data_checksum_version and certain variables are
assigned. Sometimes a hardcoded 0 is used and sometimes
PG_DATA_CHECKSUM_OFF. I suggest using values of the enum
ChecksumStateType for readability / consistency.

Here are corresponding patches.

--
Best regards,
Aleksander Alekseev

Attachments

Re: pgsql: Online enabling and disabling of data checksums

From: Daniel Gustafsson
Date:
> On 6 Apr 2026, at 16:39, Aleksander Alekseev <aleksander@tigerdata.com> wrote:
>
> Hi Daniel,
>
>> Online enabling and disabling of data checksums
>>
>> [...]
>
> I noticed a little mistake:

Thanks for looking!

> ```
> /*
> * Await state transition to "on" in all backends. When done we know that
> * data data checksums are both written and verified in all backends.
> */
> ```
>
> The word "data" is repeated twice.

Ugh.

> Also there are inconsistencies in the way
> XLogCtlData->data_checksum_version,
> ControlFileData->data_checksum_version and certain variables are
> assigned. Sometimes a hardcoded 0 is used and sometimes
> PG_DATA_CHECKSUM_OFF. I suggest using values of the enum
> ChecksumStateType for readability / consistency.

PG_DATA_CHECKSUM_OFF didn't exist until quite late in the lifetime of the
patch, and clearly not all uses of 0 were ported over.

> Here are corresponding patches.

I will take another look later today when I have more time, and commit them.

--
Daniel Gustafsson