Re: Offline enabling/disabling of data checksums

From Fabien COELHO
Subject Re: Offline enabling/disabling of data checksums
Date
Msg-id alpine.DEB.2.21.1812281542250.6632@lancre
In reply to Re: Offline enabling/disabling of data checksums  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
>>> [...]
>>
>> I'm not sure data checksums are particularly great evidence. For example
>> with the recent fsync issues, we might have ended with partial writes
>> (and thus invalid checksums). The OS might have even told us about the
>> failure, but we've gracefully ignored it. So I'm afraid data checksums
>> are not a particularly great proof it's not our fault.
>
> They are a great evidence that your data is corrupt. You *want* to know
> that your data is corrupt. Even if our best recommendation is "go restore
> your backups", you still want to know. Otherwise you are sitting around on
> data that's corrupt and you don't know about it.
>
> There are certainly many things we can do to improve the experience. But
> not telling people their data is corrupt when it is, isn't one of them.

Yep, anyone should want to know if their database is corrupt, rather than
ignore the fact.

One reason not to enable it could be if the implementation is not trusted,
i.e. if false positives (a corrupt page detected while the data are okay and
there was only an issue with computing or storing the checksum) can occur.

There is also the performance impact. I did some quick-and-dirty pgbench
simple-update, single-thread performance tests to compare with vs without
checksums. Enabling checksums on these tests seems to induce a 1.4%
performance penalty, although I'm only moderately confident about it given
the standard deviation. At least it is an indication, and it seems to me
that it is consistent with other figures previously reported on the list.
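
A minimal sketch of how such a with/without comparison could be summarized,
assuming several pgbench runs per configuration with the tps of each run
collected by hand. The function name and the sample values are hypothetical
placeholders, not the measurements discussed above.

```python
from statistics import mean, stdev

def checksum_penalty(tps_without, tps_with):
    """Return (penalty_pct, noise_pct) for two lists of per-run tps values."""
    base = mean(tps_without)
    # Relative slowdown of the checksum-enabled runs, in percent.
    penalty_pct = 100.0 * (base - mean(tps_with)) / base
    # Run-to-run noise: the larger relative standard deviation of the two sets.
    noise_pct = 100.0 * max(stdev(tps_without) / base,
                            stdev(tps_with) / mean(tps_with))
    return penalty_pct, noise_pct

# Placeholder numbers for illustration only, NOT results from this thread.
without = [1000.0, 1010.0, 990.0]
with_cs = [985.0, 995.0, 975.0]
penalty, noise = checksum_penalty(without, with_cs)
print(f"penalty {penalty:.1f}% (run-to-run noise about {noise:.1f}%)")
```

When the penalty is of the same order as the noise, as it is with these
placeholder values, the figure is an indication rather than a firm result.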

-- 
Fabien.

