Re: Online verification of checksums

Поиск
Список
Период
Сортировка
От Fabien COELHO
Тема Re: Online verification of checksums
Дата
Msg-id alpine.DEB.2.21.1903030743240.8095@lancre
обсуждение исходный текст
Ответ на Re: Online verification of checksums  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: Online verification of checksums
Список pgsql-hackers
Bonjour Michaël,

>> I gotta say, my conclusion from this debate is that it's simply a
>> mistake to do this without involvement of the server that can use
>> locking to prevent these kind of issues.  It seems pretty absurd to me
>> to have hacky workarounds around partial writes of a live server, around
>> truncation, etc, even though the server has ways to deal with that.
>
> I agree with Andres on this one.  We are never going to make this stuff 
> safe if we don't handle page reads with the proper locks because of torn 
> pages. What I think we should do is provide a SQL function which reads a 
> page in shared mode, and then checks its checksum if its LSN is older 
> than the previous redo point.  This discards cases with rather hot 
> pages, but if the page is hot enough then the backend re-reading the 
> page would just do the same by verifying the page checksum by itself. -- 
> Michael

My 0.02€ about that, as one of the reviewer of the patch:

I agree that having a server function (extension?) to do a full checksum 
verification, possibly bandwidth-controlled, would be a good thing. 
However it would have side effects, such as interfering deeply with the 
server page cache, which may or may not be desirable.

On the other hand I also see value in an independent system-level external 
tool capable of a best effort checksum verification: the current check 
that the cluster is offline to prevent pg_verify_checksum from running is 
kind of artificial, and when online simply counting 
online-database-related checksum issues looks like a reasonable 
compromise.

So basically I think that allowing pg_verify_checksum to run on an online 
cluster is still a good thing, provided that expected errors are correctly 
handled.

-- 
Fabien.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: WIP: BRIN multi-range indexes
Следующее
От: Fabien COELHO
Дата:
Сообщение: RE: Timeout parameters