Re: better page-level checksums

From Robert Haas
Subject Re: better page-level checksums
Msg-id CA+TgmoabE38p2wPqhQ4Q_r-n6KYz-RxNBnPUtCcY3B7C89j_iQ@mail.gmail.com
In response to Re: better page-level checksums  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses Re: better page-level checksums  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Fri, Jun 10, 2022 at 9:36 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> I think there ought to be a bit more principled analysis here than just
> "let's add a lot more bits".  There is probably some kind of information
> to be had about how many CRC bits are useful for a given block size, say.
>
> And then there is the question of performance.  When data checksums were
> first added, there was a lot of concern about that.  CRC is usually
> baked directly into hardware, so it's about as cheap as we can hope for.
>   SHA not so much.

That's all pretty fair. I have to admit that SHA checksums sound quite
expensive, and also that I'm no expert on what kinds of checksums
would be best for this sort of application. Based on the earlier
discussions around TDE, I do think that people want tamper-resistant
checksums here too -- like maybe something where you can't recompute
the checksum without access to some secret. I could propose naive ways
to do that, like prepending a fixed chunk of secret bytes to the
beginning of every block and then running SHA512 or something over the
result, but I'm sure that people with actual knowledge of cryptography
have developed much better and more robust ways of doing this sort of
thing.
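
To make the naive idea above concrete, here is a minimal Python sketch,
not PostgreSQL code. It puts the secret-prefix construction next to
HMAC, which is one standard keyed construction; HMAC is not something
proposed in this thread, and the function names, the 8192-byte block
size, and the example secret are assumptions made for illustration.

```python
import hashlib
import hmac

BLOCK_SIZE = 8192  # assumption: PostgreSQL's default block size


def naive_keyed_checksum(secret: bytes, block: bytes) -> bytes:
    """Naive approach: SHA-512 over (secret || block)."""
    return hashlib.sha512(secret + block).digest()


def hmac_checksum(secret: bytes, block: bytes) -> bytes:
    """HMAC-SHA-512 over the block, keyed with the secret."""
    return hmac.new(secret, block, hashlib.sha512).digest()


def verify(secret: bytes, block: bytes, stored: bytes) -> bool:
    """Recompute the HMAC and compare it in constant time."""
    return hmac.compare_digest(hmac_checksum(secret, block), stored)


if __name__ == "__main__":
    secret = b"hypothetical-cluster-secret"  # illustrative value only
    block = bytes(BLOCK_SIZE)                # an all-zero page
    tag = hmac_checksum(secret, block)
    print(len(tag))                          # digest length in bytes
    print(verify(secret, block, tag))        # unmodified block
    print(verify(secret, b"\x01" + block[1:], tag))  # tampered block
```

Either function yields a 64-byte value that cannot be recomputed
without the secret. One known weakness of the plain secret-prefix form
with SHA-2 hashes is length extension, which HMAC is designed to
avoid; whether that matters for fixed-size blocks is a question for
someone with real cryptography background.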

I've really been devoting most of my mental energy here to
understanding what problems there are at the PostgreSQL level - i.e.
when we carve out bytes for a wider checksum, what breaks? The only
research that I did to try to understand what algorithms might make
sense was a quick Google search, which led me to the list of
algorithms that btrfs uses. I figured that was a good starting point
because, like a filesystem, we're checksumming fixed-size blocks of
data. However, I didn't intend to present the results of that quick
look as the definitive answer to the question of what might make sense
for PostgreSQL, and would be interested in hearing what you or anyone
else thinks about that.
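
As a rough illustration of what "carving out bytes" costs, the Python
sketch below computes how much of a page a few digest widths would
occupy. It assumes the default 8192-byte block size and uses only
algorithms from the Python standard library as stand-ins. zlib's CRC-32
is not the CRC-32C variant, and the algorithm selection is an assumption
for illustration, not a recommendation.

```python
import hashlib
import zlib

BLOCK_SIZE = 8192  # assumption: PostgreSQL's default block size

# Each entry maps a label to a function returning the raw digest bytes.
CANDIDATES = {
    "crc32": lambda b: zlib.crc32(b).to_bytes(4, "big"),
    "sha256": lambda b: hashlib.sha256(b).digest(),
    "blake2b-256": lambda b: hashlib.blake2b(b, digest_size=32).digest(),
    "sha512": lambda b: hashlib.sha512(b).digest(),
}


def checksum_overhead(page: bytes) -> dict:
    """Return the digest size in bytes for each candidate algorithm."""
    return {name: len(fn(page)) for name, fn in CANDIDATES.items()}


if __name__ == "__main__":
    page = bytes(BLOCK_SIZE)
    for name, size in checksum_overhead(page).items():
        share = 100.0 * size / BLOCK_SIZE
        print(f"{name:12s} {size:3d} bytes  {share:.2f}% of the page")
```

Even the widest of these takes well under 1% of an 8192-byte page, so
the harder question is probably the one raised above: what breaks when
those bytes are no longer available for tuples.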

-- 
Robert Haas
EDB: http://www.enterprisedb.com