Re: Transparent column encryption

From Jacob Champion
Subject Re: Transparent column encryption
Date
Msg-id CAAWbhmhk_Wd-4GR=NMfMccke=ZUpprpjRTd3jRRckXQwdu0ZLw@mail.gmail.com
In response to Re: Transparent column encryption  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Transparent column encryption  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Jul 26, 2022 at 10:52 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Jul 21, 2022 at 2:30 PM Jacob Champion <jchampion@timescale.com> wrote:
> > A minimum padding option would fix the leak here, right? If every
> > entry is the same length then there's no information to be gained, at
> > least in an offline analysis.
>
> Sure, but padding every text column that you have, even the ones
> containing only 'a', out to the length of Don Quixote in the original
> Spanish, is unlikely to be an appealing option.

If you are honestly trying to conceal Don Quixote, I suspect you are
already in the business of making unappealing decisions. I don't think
that's necessarily an argument against hiding the length for
real-world use cases.

> > I think some work around that is probably going to be needed for
> > serious use of this encryption, in part because of the use of text
> > format as the canonical input. If the encrypted values of 1, 10, 100,
> > and 1000 hypothetically leaked their exact lengths, then an encrypted
> > int wouldn't be very useful. So I'd want to quantify (and possibly
> > configure) exactly how much data you can encrypt in a single message
> > before the length starts being leaked, and then make sure that my
> > encrypted values stay inside that bound.
>
> I think most ciphers these days are block ciphers, so you're going to
> get output that is a multiple of the block size anyway - e.g. I think
> for AES it's 128 bits = 16 bytes. So small differences in length will
> be concealed naturally, which may be good enough for some use cases.

Right. My point is, if you have a column that has exactly one
important value that is 17 bytes long when converted to text, you're
going to want to know that block size exactly, because the encryption
will be effectively useless for that value. That size needs to be
documented, and it'd be helpful to know that it's longer than, say,
the longest text representation of our fixed-length column types.
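
To make the bucket boundaries concrete, here is a rough sketch of the
arithmetic. It assumes a 16-byte block and PKCS#7-style padding, which is
an assumption for illustration only; this message does not say which
padding the patchset uses, and the exact boundary (16 vs. 17 bytes) shifts
with the scheme. The name padded_length is hypothetical.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def padded_length(plaintext_len: int, block_size: int = BLOCK_SIZE) -> int:
    """Padded length under PKCS#7-style padding (assumed scheme, not the patch's)."""
    # PKCS#7 always adds between 1 and block_size bytes, so a plaintext that
    # is an exact multiple of the block size grows by one full block.
    return (plaintext_len // block_size + 1) * block_size


# Text forms of 1, 10, 100 and 1000 (1 to 4 characters) share one bucket.
small_buckets = {len(str(n)): padded_length(len(str(n))) for n in (1, 10, 100, 1000)}

# Longest decimal text forms of the fixed-width integer types (their minimum values).
int_text_lengths = {
    "int2": len(str(-2**15)),
    "int4": len(str(-2**31)),
    "int8": len(str(-2**63)),
}
int_buckets = {name: padded_length(n) for name, n in int_text_lengths.items()}

if __name__ == "__main__":
    print(small_buckets)
    print(int_text_lengths)
    print(int_buckets)
```

Under these assumptions the longest int2 and int4 text forms stay inside a
single block, but the longest int8 text form does not, so an int8 column
could still reveal whether a value is "long" or "short".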

> I'm not really convinced that it's worth putting a lot of effort into
> bolstering the security of this kind of tech above what it naturally
> gives. I think it's likely to be a wild goose chase.

If the goal is to provide real encryption, and not just a toy, I think
you're going to need to put a *lot* of effort into analysis. Even if
the result of the analysis is "we don't plan to address this in v1".

Crypto is inherently a cycle of
make-it-and-break-it-and-fix-it-and-break-it-again. If that's
considered a "wild goose chase" and not seriously pursued at some
level, then this implementation will probably not last long in the
face of real abuse. (That doesn't mean you have to take my advice; I'm
just a dude with opinions -- but you will need to have real
cryptographers look at this, and you're going to need to think about
how the system will evolve when it's broken.)

> If you have major
> worries about someone reading your disk in its entirety, use full-disk
> encryption.

This patchset was designed to protect against the evil DBA case, I
think. Full disk encryption doesn't help.

> Selective encryption is only suitable when you want to add
> a modest level of protection for individual value and are willing to
> accept that some information leakage is likely if an adversary can in
> fact read the full disk.

...but there's a known countermeasure to this particular leakage,
right? Which would make it more suitable for that case.
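
For illustration, a minimum-padding countermeasure could look roughly like
the sketch below. This is not the patchset's design: the function names,
the 4-byte length prefix, and the zero fill are all assumptions made for
the example. The padded bytes are what would be handed to the cipher, so
the real length travels inside the ciphertext.

```python
import struct


def pad_to_minimum(data: bytes, min_len: int) -> bytes:
    """Hypothetical helper: length prefix plus zero fill up to min_len bytes."""
    fill = b"\x00" * max(0, min_len - len(data))
    # The 4-byte big-endian prefix records the true length so padding is reversible.
    return struct.pack(">I", len(data)) + data + fill


def unpad(padded: bytes) -> bytes:
    """Recover the original bytes from pad_to_minimum() output."""
    (true_len,) = struct.unpack(">I", padded[:4])
    return padded[4:4 + true_len]


if __name__ == "__main__":
    for value in (b"1", b"10", b"100", b"1000"):
        padded = pad_to_minimum(value, 32)
        print(value, len(padded), unpad(padded) == value)
```

Every value at or below the configured minimum comes out the same length;
anything longer still leaks its size, which is the bound a user would need
to know about.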

> Padding values to try to further obscure
> things may be situationally useful, but if you find yourself worrying
> too much about that sort of thing, you likely should have picked
> stronger medicine initially.

In my experience, this entire field is the application of
situationally useful protection. That's one of the reasons it's hard,
and designing this sort of patch is going to be hard too. Putting that
on the user isn't quite fair when you're the ones designing the
system; you determine what they have to worry about when you choose
the crypto.

--Jacob


