Re: Online enabling of checksums

Поиск
Список
Период
Сортировка
От Michael Banck
Тема Re: Online enabling of checksums
Дата
Msg-id 20180310150930.GB17017@nighthawk.caipicrew.dd-dns.de
обсуждение исходный текст
Ответ на Re: Online enabling of checksums  (Magnus Hagander <magnus@hagander.net>)
Ответы Re: Online enabling of checksums  (Daniel Gustafsson <daniel@yesql.se>)
Список pgsql-hackers
Hi,

I had a closer look at v3 of the patch now.

On Sat, Mar 03, 2018 at 07:23:31PM +0100, Magnus Hagander wrote:
> Attached is a rebased patch which removes this optimization, updates the
> pg_proc entry for the new format, and changes pg_verify_checksums to use -r
> instead of -o for relfilenode.
 
The patch applies fine with minimal fuzz and compiles with no warnings;
make check and the added isolation tests, as well as the added checksum
tests pass.

If I blindly run "SELECT pg_enable_data_checksums();" on new cluster, I
get:

|postgres=# SELECT pg_enable_data_checksums();
| pg_enable_data_checksums 
|--------------------------
| t
|(1 row)
|
|postgres=# SHOW data_checksums ;
| data_checksums 
|----------------
| inprogress
|(1 row)


However, inspecting the log one sees:

|2018-03-10 14:15:57.702 CET [3313] ERROR:  Database template0 does not allow connections.
|2018-03-10 14:15:57.702 CET [3313] HINT:  Allow connections using ALTER DATABASE and try again.
|2018-03-10 14:15:57.702 CET [3152] LOG:  background worker "checksum helper launcher" (PID 3313) exited with exit code
1

and the background worker is no longer running without any obvious hint
to the client.

I am aware that this is discussed already, but as-is the user experience
is pretty bad, I think pg_enable_data_checksums() should either bail
earlier if it cannot connect to all databases, or it should be better
documented.

Otherwise, if I allow connections to template0, the patch works as
expected, I have not managed to break it so far.

Some further review comments:

> diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
> index f4bc2d4161..89afecb341 100644
> --- a/doc/src/sgml/wal.sgml
> +++ b/doc/src/sgml/wal.sgml
> @@ -230,6 +230,73 @@
[...]
> +   <para>
> +    Checksums can be enabled or disabled online, by calling the appropriate
> +    <link linkend="functions-admin-checksum">functions</link>.
> +    Disabling of checksums takes effect immediately when the function is called.
> +   </para>
> +
> +   <para>
> +    Enabling checksums will put the cluster in <literal>inprogress</literal> mode.
> +    During this time, checksums will be written but not verified. In addition to
> +    this, a background worker process is started that enables checksums on all
> +    existing data in the cluster. Once this worker has completed processing all
> +    databases in the cluster, the checksum mode will automatically switch to
> +    <literal>on</literal>.
> +   </para>
> +
> +   <para>
> +    If the cluster is stopped while in <literal>inprogress</literal> mode, for
> +    any reason, then this process must be restarted manually. To do this,
> +    re-execute the function <function>pg_enable_data_checksums()</function>
> +    once the cluster has been restarted.
> +   </para>

Maybe document the above issue here, unless it is clear that the
templat0-needs-to-allow-connections issue will go away before the patch
is pushed.

> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
> index 47a6c4d895..56aaa88de1 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c

[...]

> +void
> +SetDataChecksumsOn(void)
> +{
> +    if (!DataChecksumsEnabledOrInProgress())
> +        elog(ERROR, "Checksums not enabled or in progress");
> +
> +    LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> +
> +    if (ControlFile->data_checksum_version != PG_DATA_CHECKSUM_INPROGRESS_VERSION)
> +    {
> +        LWLockRelease(ControlFileLock);
> +        elog(ERROR, "Checksums not in in_progress mode");

The string used in "SHOW data_checksums" is "inprogress", not
"in_progress".

[...]

> @@ -7769,6 +7839,16 @@ StartupXLOG(void)
>      CompleteCommitTsInitialization();
>  
>      /*
> +     * If we reach this point with checksums in inprogress state, we notify
> +     * the user that they need to manually restart the process to enable
> +     * checksums.
> +     */
> +    if (ControlFile->data_checksum_version == PG_DATA_CHECKSUM_INPROGRESS_VERSION)
> +        ereport(WARNING,
> +                (errmsg("checksum state is \"in progress\" with no worker"),
> +                 errhint("Either disable or enable checksums by calling the pg_disable_data_checksums() or
pg_enable_data_checksums()functions.")));
 

Again, string is "inprogress", not "in progress", not sure if that
matters.

> diff --git a/src/backend/postmaster/checksumhelper.c b/src/backend/postmaster/checksumhelper.c
> new file mode 100644
> index 0000000000..6aa71bcf30
> --- /dev/null
> +++ b/src/backend/postmaster/checksumhelper.c
> @@ -0,0 +1,631 @@
> +/*-------------------------------------------------------------------------
> + *
> + * checksumhelper.c
> + *      Backend worker to walk the database and write checksums to pages

Backend or Background?

[...]

> +typedef struct ChecksumHelperShmemStruct
> +{
> +    pg_atomic_flag launcher_started;
> +    bool        success;
> +    bool        process_shared_catalogs;
> +    /* Parameter values  set on start */

double space.

[...]

> +static bool
> +ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrategy strategy)
> +{
> +    BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
> +    BlockNumber b;
> +
> +    for (b = 0; b < numblocks; b++)
> +    {
> +        Buffer        buf = ReadBufferExtended(reln, forkNum, b, RBM_NORMAL, strategy);
> +        Page        page;
> +        PageHeader    pagehdr;
> +        uint16        checksum;
> +
> +        /* Need to get an exclusive lock before we can flag as dirty */
> +        LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
> +
> +        /* Do we already have a valid checksum? */
> +        page = BufferGetPage(buf);
> +        pagehdr = (PageHeader) page;
> +        checksum = pg_checksum_page((char *) page, b);

This looks like it does not take the segment number into account;
however it is also unclear to me what the purpose of this is, as
checksum is never validated against the pagehdr, and nothing is done
with it. Indeed, I even get a compiler warning about pagehdr and
checksum:

git/postgresql/build/../src/backend/postmaster/checksumhelper.c:
In function ‘ProcessSingleRelationFork’:
git/postgresql/build/../src/backend/postmaster/checksumhelper.c:155:11:
warning: variable ‘checksum’ set but not used
[-Wunused-but-set-variable]
   uint16  checksum;
           ^~~~~~~~
git/postgresql/build/../src/backend/postmaster/checksumhelper.c:154:14:
warning: variable ‘pagehdr’ set but not used [-Wunused-but-set-variable]
   PageHeader pagehdr;
              ^~~~~~~

I guess the above block running pg_checksum_page() is a leftover from
previous versions of the patch and should be removed... 

> +        /*
> +         * Mark the buffer as dirty and force a full page write.
> +         * We have to re-write the page to wal even if the checksum hasn't
> +         * changed, because if there is a replica it might have a slightly
> +         * different version of the page with an invalid checksum, caused
> +         * by unlogged changes (e.g. hintbits) on the master happening while
> +         * checksums were off. This can happen if there was a valid checksum
> +         * on the page at one point in the past, so only when checksums
> +         * are first on, then off, and then turned on again.
> +         */
> +        START_CRIT_SECTION();
> +        MarkBufferDirty(buf);
> +        log_newpage_buffer(buf, false);
> +        END_CRIT_SECTION();

... seeing how MarkBufferDirty(buf) is now run unconditionally.

[...]

> +    /*
> +     * Initialize a connection to shared catalogs only.
> +     */
> +    BackgroundWorkerInitializeConnection(NULL, NULL);
> +
> +    /*
> +     * Set up so first run processes shared catalogs, but not once in every db
> +     */

Comment should have a full stop like the above and below ones?

> +    ChecksumHelperShmem->process_shared_catalogs = true;
> +
> +    /*
> +     * Create a database list.  We don't need to concern ourselves with
> +     * rebuilding this list during runtime since any new created database will
> +     * be running with checksums turned on from the start.
> +     */

[...]

> +    DatabaseList = BuildDatabaseList();
> +
> +    /*
> +     * If there are no databases at all to checksum, we can exit immediately
> +     * as there is no work to do.
> +     */
> +    if (DatabaseList == NIL || list_length(DatabaseList) == 0)
> +        return;
> +
> +    while (true)
> +    {
> +        List       *remaining = NIL;
> +        ListCell   *lc,
> +                   *lc2;
> +        List       *CurrentDatabases = NIL;
> +
> +        foreach(lc, DatabaseList)
> +        {
> +            ChecksumHelperDatabase *db = (ChecksumHelperDatabase *) lfirst(lc);
> +
> +            if (ProcessDatabase(db))
> +            {
> +                pfree(db->dbname);
> +                pfree(db);
> +
> +                if (ChecksumHelperShmem->process_shared_catalogs)
> +
> +                    /*
> +                     * Now that one database has completed shared catalogs, we
> +                     * don't have to process them again .

Stray space.

> +                     */
> +                    ChecksumHelperShmem->process_shared_catalogs = false;
> +            }
> +            else
> +            {
> +                /*
> +                 * Put failed databases on the remaining list.
> +                 */
> +                remaining = lappend(remaining, db);
> +            }
> +        }
> +        list_free(DatabaseList);
> +
> +        DatabaseList = remaining;
> +        remaining = NIL;
> +
> +        /*
> +         * DatabaseList now has all databases not yet processed. This can be
> +         * because they failed for some reason, or because the database was
> +         * DROPed between us getting the database list and trying to process

DROPed looks wrong, and there's no other occurence of it in the source
tree. DROPped looks even weirder, so maybe just "dropped"?

> +         * it. Get a fresh list of databases to detect the second case with.

That sentence looks unfinished or at least is unclear to me.

> +         * Any database that still exists but failed we retry for a limited

I'm not a native speaker, but this looks wrong to me as well, maybe "We
retry any database that still exists but failed for a limited [...]"?

> diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
> index 0fe98a550e..4bb2b7e6ec 100644
> --- a/src/bin/pg_upgrade/controldata.c
> +++ b/src/bin/pg_upgrade/controldata.c
> @@ -591,6 +591,15 @@ check_control_data(ControlData *oldctrl,
>       */
>  
>      /*
> +     * If checksums have been turned on in the old cluster, but the
> +     * checksumhelper have yet to finish, then disallow upgrading. The user
> +     * should either let the process finish, or turn off checksums, before
> +     * retrying.
> +     */
> +    if (oldctrl->data_checksum_version == 2)
> +        pg_fatal("transition to data checksums not completed in old cluster\n");
> +
> +    /*
>       * We might eventually allow upgrades from checksum to no-checksum
>       * clusters.
>       */

See below about src/bin/pg_upgrade/pg_upgrade.h having
data_checksum_version be a bool. 

I checked pg_ugprade (from master to master though), and could find no
off-hand issues, i.e. it reported all issues correctly.

> --- /dev/null
> +++ b/src/bin/pg_verify_checksums/Makefile

[...]

> +check:
> +       $(prove_check)
> +
> +installcheck:
> +       $(prove_installcheck)

If I run "make check" in src/bin/pg_verify_checksums, I get a fat perl
error:

|src/bin/pg_verify_checksums$ LANG=C make check
|rm -rf '/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build'/tmp_install
|/bin/mkdir -p '/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build'/tmp_install/log
|make -C '../../..' DESTDIR='/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build'/tmp_install install
>'/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build'/tmp_install/log/install.log2>&1
 
|rm -rf '/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/src/bin/pg_verify_checksums'/tmp_check
|/bin/mkdir -p '/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/src/bin/pg_verify_checksums'/tmp_check
|cd /home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/../src/bin/pg_verify_checksums &&
TESTDIR='/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/src/bin/pg_verify_checksums'
PATH="/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/tmp_install//bin:$PATH"
LD_LIBRARY_PATH="/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/tmp_install//lib"PGPORT='65432'
PG_REGRESS='/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/src/bin/pg_verify_checksums/../../../src/test/regress/pg_regress'
/usr/bin/prove-I /home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/../src/test/perl/ -I
/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/../src/bin/pg_verify_checksums t/*.pl
 
|Cannot detect source of 't/*.pl'! at /usr/share/perl/5.24/TAP/Parser/IteratorFactory.pm line 261.
|    TAP::Parser::IteratorFactory::detect_source(TAP::Parser::IteratorFactory=HASH(0x55eed10df3e8),
TAP::Parser::Source=HASH(0x55eed10bd358))called at /usr/share/perl/5.24/TAP/Parser/IteratorFactory.pm line 211
 
|    TAP::Parser::IteratorFactory::make_iterator(TAP::Parser::IteratorFactory=HASH(0x55eed10df3e8),
TAP::Parser::Source=HASH(0x55eed10bd358))called at /usr/share/perl/5.24/TAP/Parser.pm line 472
 
|    TAP::Parser::_initialize(TAP::Parser=HASH(0x55eed10df328), HASH(0x55eed0ea2e90)) called at
/usr/share/perl/5.24/TAP/Object.pmline 55
 
|    TAP::Object::new("TAP::Parser", HASH(0x55eed0ea2e90)) called at /usr/share/perl/5.24/TAP/Object.pm line 130
|    TAP::Object::_construct(TAP::Harness=HASH(0x55eed09176b0), "TAP::Parser", HASH(0x55eed0ea2e90)) called at
/usr/share/perl/5.24/TAP/Harness.pmline 852
 
|    TAP::Harness::make_parser(TAP::Harness=HASH(0x55eed09176b0), TAP::Parser::Scheduler::Job=HASH(0x55eed0fdc708))
calledat /usr/share/perl/5.24/TAP/Harness.pm line 651
 
|    TAP::Harness::_aggregate_single(TAP::Harness=HASH(0x55eed09176b0), TAP::Parser::Aggregator=HASH(0x55eed091e520),
TAP::Parser::Scheduler=HASH(0x55eed0fdc6a8))called at /usr/share/perl/5.24/TAP/Harness.pm line 743
 
|    TAP::Harness::aggregate_tests(TAP::Harness=HASH(0x55eed09176b0), TAP::Parser::Aggregator=HASH(0x55eed091e520),
"t/*.pl")called at /usr/share/perl/5.24/TAP/Harness.pm line 558
 
|    TAP::Harness::__ANON__() called at /usr/share/perl/5.24/TAP/Harness.pm line 571
|    TAP::Harness::runtests(TAP::Harness=HASH(0x55eed09176b0), "t/*.pl") called at /usr/share/perl/5.24/App/Prove.pm
line546
 
|    App::Prove::_runtests(App::Prove=HASH(0x55eed090b0c8), HASH(0x55eed0d79cf0), "t/*.pl") called at
/usr/share/perl/5.24/App/Prove.pmline 504
 
|    App::Prove::run(App::Prove=HASH(0x55eed090b0c8)) called at /usr/bin/prove line 13
|Makefile:39: recipe for target 'check' failed
|make: *** [check] Error 2


> diff --git a/src/bin/pg_verify_checksums/pg_verify_checksums.c b/src/bin/pg_verify_checksums/pg_verify_checksums.c
> new file mode 100644
> index 0000000000..a4bfe7284d
> --- /dev/null
> +++ b/src/bin/pg_verify_checksums/pg_verify_checksums.c

[...]

> +    if (DataDir == NULL)
> +    {
> +        if (optind < argc)
> +            DataDir = argv[optind++];
> +        else
> +            DataDir = getenv("PGDATA");
> +    }
> +
> +    /* Complain if any arguments remain */
> +    if (optind < argc)
> +    {
> +        fprintf(stderr, _("%s: too many command-line arguments (first is \"%s\")\n"),
> +                progname, argv[optind]);
> +        fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
> +                progname);
> +        exit(1);
> +    }
> +
> +    if (DataDir == NULL)
> +    {
> +        fprintf(stderr, _("%s: no data directory specified\n"), progname);
> +        fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
> +        exit(1);
> +    }

Those two if (DataDir == NULL) checks could maybe be put together into
one block.

> diff --git a/src/include/postmaster/checksumhelper.h b/src/include/postmaster/checksumhelper.h
> new file mode 100644
> index 0000000000..7f296264a9
> --- /dev/null
> +++ b/src/include/postmaster/checksumhelper.h
> @@ -0,0 +1,31 @@
> +/*-------------------------------------------------------------------------
> + *
> + * checksumhelper.h
> + *      header file for checksum helper deamon

"deamon" is surely wrong (it'd be "daemon"), but maybe "(background)
worker" is better?

> diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
> index 85dd10c45a..bd46bf2ce6 100644
> --- a/src/include/storage/bufpage.h
> +++ b/src/include/storage/bufpage.h
> @@ -194,6 +194,7 @@ typedef PageHeaderData *PageHeader;
>   */
>  #define PG_PAGE_LAYOUT_VERSION        4
>  #define PG_DATA_CHECKSUM_VERSION    1
> +#define PG_DATA_CHECKSUM_INPROGRESS_VERSION        2

I am not very sure about the semantics of PG_DATA_CHECKSUM_VERSION being
1, but I assumed it was a version, like, if we ever decide to use a
different checksumming algorithm, we'd bump it to 2.

Now PG_DATA_CHECKSUM_INPROGRESS_VERSION is defined to 2, which I agree
is convenient, but is there some strategy what to do about this in case
the PG_DATA_CHECKSUM_VERSION needs to be increased?

In any case, src/bin/pg_upgrade/pg_upgrade.h has

        bool            data_checksum_version;

in the ControlData struct, which might need updating?


That's all for now.

Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: [HACKERS] GUC for cleanup indexes threshold.
Следующее
От: Alexander Korotkov
Дата:
Сообщение: Re: [HACKERS] [PATCH] Incremental sort