Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Дата
Msg-id 20190810025451.daujankh4ftlkb5s@momjian.us
обсуждение исходный текст
Ответ на Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Sehrope Sarkuni <sehrope@jackdb.com>)
Ответы Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Bruce Momjian <bruce@momjian.us>)
Список pgsql-hackers
On Thu, Aug  8, 2019 at 10:17:53PM -0400, Sehrope Sarkuni wrote:
> On Thu, Aug 8, 2019 at 2:16 PM Bruce Momjian <bruce@momjian.us> wrote:
> 
>     On Wed, Aug  7, 2019 at 08:56:18AM -0400, Sehrope Sarkuni wrote:
>     > Simplest approach for derived keys would be to use immutable attributes
>     of the
>     > WAL files as an input to the key derivation. Something like HKDF(MDEK,
>     "WAL:" |
> 
>     So, I am thinking we should use "WAL:" for WAL and "REL:" for heap/index
>     files.
> 
> 
> Sounds good. Any unique convention is fine. Main thing to keep in mind is that
> they're directly tied to the master key so it's not possible to rotate them
> without changing the master key.

A recent email talked about using two different encryption keys for
heap/index and WAL, which allows for future features, and allows for key
rotation of the two independently.  (I already stated how hard key
rotation would be with WAL and pg_rewind.)

> This is in contrast to saving a WDEK key to a file (similar to how the MDEK key
> would be saved) and unlocking it with the MDEK. That has more moving parts but
> would allow that key to be independent of the MDEK. In a later message Stephen
> refers to an example of a replica receiving encrypted WAL and applying it with
> a different MDEK for the page buffers. That's doable with an independent WDEK.

I assumed we would call a command on boot to get a passphrase, which
would unlock the encryption keys.  Is there more being described above?

>     > | timeline_id || wal_segment_num) should be fine for this as it is:
> 
>     I considered using the timeline in the nonce, but then remembered that
>     in timeline switch, we _copy_ the part of the old WAL up to the timeline
>     switch to the new timeline;  see:
> 
>         https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/
>     backend/access/transam/xlog.c;h=f55352385732c6b0124eff5265462f3883fe7435;hb
>     =HEAD#l5502
> 
>        * Initialize the starting WAL segment for the new timeline. If the
>     switch
>        * happens in the middle of a segment, copy data from the last WAL
>     segment
>        * of the old timeline up to the switch point, to the starting WAL
>     segment
>        * on the new timeline.
> 
>     We would need to decrypt/encrypt to do the copy, and just wasn't sure of
>     the value of the timeline in the nonce.  One value to it is that if
>     there some WAL that generated after the timeline switch in the old
>     primary that isn't transfered, there would be potentially new data
>     encrypted with the same key/nonce in the new primary, but if that WAL is
>     not used, odds are it is gone/destroyed/inaccessible, or it would have
>     been used during the switchover, so it didn't seem worth worrying about.
> 
>     One _big_ reason to add the timeline is if you had a standby that you
>     recovered and rolled forward only to a specific transaction, then
>     continued running it as a new primary.  In that case, you would have
>     different WAL encrypted with the same key/nonce, but that sounds like
>     the same as promoting two standbys, and we should just document not to
>     do it.
> 
>     Maybe we need to consider this further.
> 
> 
> Good points. Yes, anything short of generating a new key at promotion time will
> have these issues. If we're not going to do that, no point in adding the
> timeline id if it does not change anything. I had thought only the combo was
> unique but sounds like the segment number is unique on its own. One thing I
> like about a unique per-file key is that it simplifies the IV generation (i.e.
> can start at zero).

Uh, well, the segment number is unique, _except_ for a timeline switch,
where the segment number exists in the old timeline file, and in the new
timeline file, though the new timeline file has more WAL because it has
writes only in the new timeline.  So, in a way, the WAL data is the same
in the old and new timeline files, e.g. 000000010000000000000001 and
000000020000000000000001, but 000000010000000000000001 stops at the
timeline switch and 000000020000000000000001 has more WAL data that
represents cluster activity since the WAL switch, though the two files
are both 16MB.

> What about discarding the rest of the WAL file at promotion and skipping to a
> new file? With a random per-file key in the first page header would ensure that
> going forward all WAL data is encrypted differently. Combine that with
> independent WAL and MDEK keys and everything would be different between two
> replicas promoted from the same point.

Yes, we could do that, but I am hesitant to change the WAL format just
for this, unless there is value to the random number.  We already
discussed that the LSN could be duplicated in different heap/index
files, so we would still have issues.

>     > A unique WDEK per WAL file that is derived from the segment number would
>     not
>     > have that problem. A unique key per-file means the IVs can all start at
>     zero
>     > and the each file can be treated as one encrypted stream. Any encryption/
>     > decryption code would only need to touch the write/read callsites.
> 
>     So, I am now wondering when we should be using a non-zero nonce to
>     start, and when we should be using derived keys.   Should we add the
>     page-number to the derived key for heap/index files too and just use the
>     LSN for nonce, or add the LSN to the derived key too?
> 
> 
> The main cost of using multiple keys is that you need to derive or unlock them
> for each usage.

Right.

> A per-type, per-relation, or per-file derived key with the same non-repeating
> guarantees for the IV (ex: LSN + Page Number) is as secure but allows for
> caching all needed derived keys in memory (it's one per open file descriptor).

Yes, that is why I was trying to use a single encryption key.

> Having page-level derived keys incorporating the LSN + Page Number and starting
> the per-page IV at zero works, but you'd have to perform an HKDF for each page
> read or write. A cache of those derived keys would be much larger (32-bytes per
> page) so presumably you're not going to have them all cached or maybe not
> bother with any caching,

The good news is that the heap/index writes mostly happen in the
background.  I think we are good just using the IV of LSN + Page Number
so we can keep encrypting simple, right?

>     > Even without a per-page MAC, a MAC at some level for WAL has its own
>     benefits
>     > such as perfect corruption detection. It could be per-record,
>     per-N-records,
>     > per-checkpoint, or per-file. The current WAL file format already handles
>     > arbitrary gaps so there is significantly more flexibility in adding it vs
>     > pages. I'm not saying it should be a requirement but, unlike pages, I
>     would not
>     > rule it out just yet as it may not be that complicated.
> 
>     We already have a CRC in the WAL that detects corruption, and that would
>     be encrypted, so it is a MAC. 
> 
> 
> Encrypting a CRC does not make it a cryptographic MAC. It'd have problems
> similar to those discussed for the per-page CRC though it'd still be useful for
> basic corruption detection.

OK. I thought only the CRC length was the problem for using it for a MAC.
  
> 
>     It is an int32, so twice as many bits as
>     the heap/index page CRC --- better, but not great. It would be pretty
> 
>     trivial to increase that to 64 bite if desired.
> 
> 
> "64 bite" is referring to "bit" or "byte"? ;-) I'm guessing bits...

Sorry, bits.

> For the WAL record CRC I think it makes sense to keep the shared wal buffer
> format in place and leave it on the plaintext (rather than on the cipher text
> like is being proposed for the page buffers). The WAL records are not fixed
> length so if the rest of the stream is encrypted there no way for a program
> without the key to be able to figure out the record offsets. Would need *some*
> information like the record size left in plaintext. Not sure the implications
> of that.

OK, I am assuming we can encrypt when the WAL buffers are written to the
file system.

> I still think there could be a separate full MAC at some aggregated level.
> Per-page seems like a good fit as that's how the writes happen and it could be
> calculated just before the actual per-page write.

Well, since people can re-insert old pages without detection, and
because adding a per-page MAC to heap/index would change the page
format, meaning off-line encryption of pages would be very hard, it just
doesn't seem worth it.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: Locale support