Re: Online checksums verification in the backend

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: Online checksums verification in the backend
Дата
Msg-id 20200316030638.GA2331@paquier.xyz
обсуждение исходный текст
Ответ на Re: Online checksums verification in the backend  (Julien Rouhaud <rjuju123@gmail.com>)
Ответы Re: Online checksums verification in the backend  (Julien Rouhaud <rjuju123@gmail.com>)
Список pgsql-hackers
On Wed, Mar 11, 2020 at 08:18:23AM +0100, Julien Rouhaud wrote:
> The cfbot reported a build failure, so here's a rebased v2 which also contains
> the pg_stat_report_failure() call and extra log info.

+ * - if a block is not found in shared_buffers, the LWLock is relased and the
+ *   block is read from disk without taking any lock.  If an error is detected,
+ *   the read block will be discarded and retrieved again while holding the
+ *   LWLock.  This is because an error due to concurrent write is possible but
+ *   very unlikely, so it's better to have an optimistic approach to limit
+ *   locking overhead
This can be risky with false positives, no?  With a large amount of
shared buffer eviction you actually increase the risk of torn page
reads.  Instead of a logic relying on partition mapping locks, which
could be unwise on performance grounds, did you consider different
approaches?  For example a kind of pre-emptive lock on the page in
storage to prevent any shared buffer operation to happen while the
block is read from storage, that would act like a barrier.

+ * Vacuum's GUCs are used to avoid consuming too much resources while running
+ * this tool.
Shouldn't this involve separate GUCs instead of the VACUUM ones?  I
guess that this leads to the fact that this function may be better as
a contrib module, with the addition of some better-suited APIs in core
(see paragraph above).

+Run
+    make check
+or
+    make installcheck
Why is installcheck mentioned here?

I don't think that it is appropriate to place the SQL-callable part in
the existing checksum.c.  I would suggest instead a new file, say
checksumfuncs.c in src/backend/utils/adt/, holding any SQL functions
for checksums.

-SUBDIRS = perl regress isolation modules authentication recovery
 subscription
+SUBDIRS = perl regress isolation modules authentication check_relation \
+     recovery subscription
It seems to me that this test would be a good fit for
src/test/modules/test_misc/.

+static void
+check_all_relations(TupleDesc tupdesc, Tuplestorestate *tupstore,
+                         ForkNumber forknum)
Per the argument of bloat, I think that I would remove
check_all_relation() as this function could take a very long time to
run, and just make the SQL function strict.

+ * - if a block is dirty in shared_buffers, it's ignored as it'll be flushed to
+ *   disk either before the end of the next checkpoint or during recovery in
+ *   case of unsafe shutdown
Wouldn't it actually be a good thing to check that the page on storage
is fine in this case?  This depends on the system settings and the
checkpoint frequency, but checkpoint_timeout can be extended up to 1
day.  And plenty of things could happen to the storage in one day,
including a base backup that includes a corrupted page on storage,
that this function would not be able to detect.

+ * - if a block is otherwise found in shared_buffers, an IO lock is taken on
+ *   the block and the block is then read from storage, ignoring the block in
+ *   shared_buffers
Yeah, I think that you are right here to check the page on storage
anyway.

+ *   we detect if a block is in shared_buffers or not.  See get_buffer()
+ *   comments for more details about the locking strategy.
get_buffer() does not exist in your patch, check_get_buffer() does.

+ * - if a block is not found in shared_buffers, the LWLock is relased and the
[...]
+ * To avoid torn page and possible false postives when reading data, and
Typos.

+   if (!DataChecksumsEnabled())
+       elog(ERROR, "Data checksums are not enabled");
Note that elog() is for the class of errors which are never expected,
and here a caller of pg_check_relation() with checksums disabled can
trigger that.  So you need to call ereport() with
ERRCODE_FEATURE_NOT_SUPPORTED.

+ * - if a block is dirty in shared_buffers, it's ignored as it'll be flushed to
+ *   disk either before the end of the next checkpoint or during recovery in
+ *   case of unsafe shutdown
Not sure that the indentation is going to react well on that part of
the patch, perhaps it would be better to add some "/*-------" at the
beginning and end of the comment block to tell pgindent to ignore this
part?

Based on the feedback gathered on this thread, I guess that you should
have a SRF returning the list of broken blocks, as well as NOTICE
messages.  Another thing to consider is the addition of a range
argument to only check a certain portion of the blocks, say one
segment file at a time, etc.  Fine by me to not include in the first
flavor of the patch.

The patch needs documentation.
--
Michael

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Следующее
От: Noah Misch
Дата:
Сообщение: Re: [HACKERS] WAL logging problem in 9.4.3?