Re: Transparent column encryption

Поиск
Список
Период
Сортировка
От Jacob Champion
Тема Re: Transparent column encryption
Дата
Msg-id 9e664928-95ce-4da5-5fd1-6dad3f68c5b5@timescale.com
обсуждение исходный текст
Ответ на Re: Transparent column encryption  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Ответы Re: Transparent column encryption  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Список pgsql-hackers
On 7/12/22 11:29, Peter Eisentraut wrote:
> 
> Updated patch, to resolve some merge conflicts.

Thank you for working on this; it's an exciting feature.

>     The CEK key
>     material is in turn encrypted by an assymmetric key called the column
>     master key (CMK).

I'm not yet understanding why the CMK is asymmetric. Maybe you could use
the public key to add ephemeral, single-use encryption keys that no one
but the private key holder could use (after you forget them on your
side, that is). But since the entire column is encrypted with a single
CEK, you would essentially only be able to do that if you created an
entirely new column or table; do I have that right?

I'm used to public keys being safe for... publication, but if I'm
understanding correctly, it's important that the server admin doesn't
get hold of the public key for your CMK, because then they could
substitute their own CEKs transparently and undermine future encrypted
writes. That seems surprising. Am I just missing something important
about RSAES-OAEP?

> +#define PG_CEK_AEAD_AES_128_CBC_HMAC_SHA_256   130
> +#define PG_CEK_AEAD_AES_192_CBC_HMAC_SHA_384   131
> +#define PG_CEK_AEAD_AES_256_CBC_HMAC_SHA_384   132
> +#define PG_CEK_AEAD_AES_256_CBC_HMAC_SHA_512   133

It looks like these ciphersuites were abandoned by the IETF. Are there
existing implementations of them that have been audited/analyzed? Are
they safe (and do we know that the claims made in the draft are
correct)? How do they compare to other constructions like AES-GCM-SIV
and XChacha20-Poly1305?

> +-- \gencr
> +-- (This just tests the parameter passing; there is no encryption here.)
> +CREATE TABLE test_gencr (a int, b text);
> +INSERT INTO test_gencr VALUES (1, 'one') \gencr
> +SELECT * FROM test_gencr WHERE a = 1 \gencr
> + a |  b
> +---+-----
> + 1 | one
> +(1 row)
> +
> +INSERT INTO test_gencr VALUES ($1, $2) \gencr 2 'two'
> +SELECT * FROM test_gencr WHERE a IN ($1, $2) \gencr 2 3
> + a |  b
> +---+-----
> + 2 | two
> +(1 row)
I'd expect \gencr to error out without sending plaintext. I know that
under the hood this is just setting up a prepared statement, but if I'm
using \gencr, presumably I really do want to be encrypting my data.
Would it be a problem to always set force-column-encryption for the
parameters we're given here? Any unencrypted columns could be provided
directly.

Another idle thought I had was that it'd be nice to have some syntax for
providing a null value to \gencr (assuming I didn't overlook it in the
patch). But that brings me to...

> +  <para>
> +   Null values are not encrypted by transparent column encryption; null values
> +   sent by the client are visible as null values in the database.  If the fact
> +   that a value is null needs to be hidden from the server, this information
> +   needs to be encoded into a nonnull value in the client somehow.
> +  </para>

This is a major gap, IMO. Especially with the switch to authenticated
ciphers, because it means you can't sign your NULL values. And having
each client or user that's out there solve this with a magic in-band
value seems like a recipe for pain.

Since we're requiring "canonical" use of text format, and the docs say
there are no embedded or trailing nulls allowed in text values, could we
steal the use of a single zero byte to mean NULL? One additional
complication would be that the client would have to double-check that
we're not writing a NULL into a NOT NULL column, and complain if it
reads one during decryption. Another complication would be that the
client would need to complain if it got a plaintext NULL.

(The need for robust client-side validation of encrypted columns might
be something to expand on in the docs more generally, since before this
feature, it could probably be assumed that the server was buggy if it
sent you unparsable junk in a column.)

> +   <para>
> +    The <quote>associated data</quote> in these algorithms consists of 4
> +    bytes: The ASCII letters <literal>P</literal> and <literal>G</literal>
> +    (byte values 80 and 71), followed by the algorithm ID as a 16-bit unsigned
> +    integer in network byte order.
> +   </para>

Is this AD intended as a placeholder for the future, or does it serve a
particular purpose?

Thanks,
--Jacob



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths
Следующее
От: Justin Pryzby
Дата:
Сообщение: Re: MERGE and parsing with prepared statements