Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

From Tomas Vondra
Subject Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Msg-id 20190809151854.7tyuo6owxnumq45h@development
In reply to Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)  (Masahiko Sawada <sawada.mshk@gmail.com>)
RE: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)  ("Smith, Peter" <peters@fast.au.fujitsu.com>)
List pgsql-hackers
On Fri, Aug 09, 2019 at 11:51:23PM +0900, Masahiko Sawada wrote:
>On Fri, Aug 9, 2019 at 10:25 AM Bruce Momjian <bruce@momjian.us> wrote:
>>
>> On Thu, Aug  8, 2019 at 06:31:42PM -0400, Stephen Frost wrote:
>> > > >Crash recovery doesn't happen "all the time" and neither does vacuum
>> > > >freeze, and autovacuum processes are independent of individual client
>> > > >backends- we don't need to (and shouldn't) have the keys in shared
>> > > >memory.
>> > >
>> > > Don't people do physical replication / HA pretty much all the time?
>> >
>> > Strictly speaking, that isn't actually crash recovery, it's physical
>> > replication / HA, and while those are certainly nice to have it's no
>> > guarantee that they're required or that you'd want to have the same keys
>> > for them- conceptually, at least, you could have WAL with one key that
>> > both sides know and then different keys for the actual data files, if we
>> > go with the approach where the WAL is encrypted with one key and then
>> > otherwise is plaintext.
>>
>> Uh, yes, you could have two encryption keys in the data directory, one
>> for heap/indexes, one for WAL, both unlocked with the same passphrase,
>> but what would be the value in that?
>>
>> > > >>That might allow crash recovery and the freeze part of VACUUM FREEZE to
>> > > >>work.  (I don't think we could vacuum since we couldn't read the index
>> > > >>pages to find the matching rows since the index values would be encrypted
>> > > >>too.  We might be able to not encrypt the tid in the index tuple.)
>> > > >
>> > > >Why do we need the indexed values to vacuum the index..?  We don't
>> > > >today, as I recall.  We would need the tids though, yes.
>> > >
>> > > Well, we also do collect statistics on the data, for example. But even
>> > > if we assume we wouldn't do that for encrypted indexes (which seems like
>> > > a pretty bad idea to me), you'd probably end up leaking information
>> > > about ordering of the values. Which is generally a pretty serious
>> > > information leak, AFAICS.
>> >
>> > I agree entirely that order information would be bad to leak- but this
>> > is all new ground here and we haven't actually sorted out what such a
>> > partially encrypted btree would look like.  We don't actually have to
>> > have the down-links in the tree be unencrypted to allow vacuuming of
>> > leaf pages, after all.
>>
>> Agreed, but I think we kind of know that the value in cluster-wide
>> encryption is different from multi-key encryption --- both have their
>> value, but right now cluster-wide is the easiest and simplest, and
>> probably meets more user needs than multi-key encryption.  If others
>> want to start scoping out what multi-key encryption would look like, we
>> can discuss it.  I personally would like to focus on cluster-wide
>> encryption for PG 13.
>
>I agree that cluster-wide is simpler, but I'm not sure that it
>meets real needs from users. One example is re-encryption; when a
>key leak happens, with cluster-wide encryption we end up having to
>re-encrypt the whole database regardless of the amount of sensitive
>user data in it. I think that's a big constraint for users because it's
>common that the data that needs to be encrypted, such as a master
>table, doesn't account for a large portion of the database. That's one
>reason why I think finer-grained encryption, such as per
>table/tablespace, is required.
>

TBH I think it's mostly pointless to design for key leakage.

My understanding is that all this work is motivated by the assumption that
Bob can obtain access to the data directory (say, a backup of it). So if
he also manages to get access to the encryption key, we probably have to
assume he already has access to a current snapshot of the data directory,
which means any re-encryption is pretty futile.

What we can (and should) optimize for is key rotation, but as that only
changes the master key and not the actual encryption keys, the overhead is
pretty low.
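
To illustrate why rotation is cheap, here is a minimal sketch of the
two-level scheme, assuming a master key that only wraps a data key. The
cipher is a toy HMAC-based stream construction picked so the example runs
with the Python standard library alone; it is not real cryptography and not
what PostgreSQL would use. All function and variable names are hypothetical.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Prefix a random nonce, then XOR the plaintext with the keystream."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Inverse of toy_encrypt: split off the nonce and XOR again."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def rotate_master_key(old_master: bytes, new_master: bytes,
                      wrapped_data_key: bytes) -> bytes:
    """Re-wrap the data key under a new master key; data pages are untouched."""
    data_key = toy_decrypt(old_master, wrapped_data_key)
    return toy_encrypt(new_master, data_key)


# The data key encrypts the pages; the master key only encrypts the data key.
data_key = os.urandom(32)
old_master = os.urandom(32)
pages = [toy_encrypt(data_key, b"page %d" % i) for i in range(3)]
wrapped = toy_encrypt(old_master, data_key)

# Rotation touches one small wrapped key, not the pages.
new_master = os.urandom(32)
rewrapped = rotate_master_key(old_master, new_master, wrapped)
```

The cost of rotation here is one decrypt and one encrypt of a 32-byte key,
independent of how many pages exist.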

We can of course support "forced" re-encryption, but I think it's
acceptable if that's fairly expensive as long as it can be throttled and
executed in the background (kinda similar to the patch to enable checksums
in the background).
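
A rough sketch of what "throttled" could mean here, assuming pages are
re-encrypted in small batches with a pause between batches. The single-byte
XOR "cipher" is a stand-in so the example is self-contained, and
reencrypt_throttled and its parameters are hypothetical names, not anything
from an existing patch.

```python
import time
from typing import Callable, List


def reencrypt_throttled(pages: List[bytes],
                        decrypt_old: Callable[[bytes], bytes],
                        encrypt_new: Callable[[bytes], bytes],
                        batch_size: int = 2,
                        pause_seconds: float = 0.01,
                        sleep: Callable[[float], None] = time.sleep) -> int:
    """Re-encrypt pages in place, pausing between batches.

    Returns the number of pauses taken.
    """
    pauses = 0
    for start in range(0, len(pages), batch_size):
        end = min(start + batch_size, len(pages))
        for i in range(start, end):
            pages[i] = encrypt_new(decrypt_old(pages[i]))
        # Yield to foreground work unless this was the last batch.
        if end < len(pages):
            sleep(pause_seconds)
            pauses += 1
    return pauses


def xor_with(key: int) -> Callable[[bytes], bytes]:
    """Toy cipher: XOR every byte with one key byte (its own inverse)."""
    return lambda data: bytes(b ^ key for b in data)


old_cipher, new_cipher = xor_with(0x21), xor_with(0x5A)
plain = [b"page-%d" % i for i in range(5)]
pages = [old_cipher(p) for p in plain]

pauses = reencrypt_throttled(pages, old_cipher, new_cipher,
                             batch_size=2, pause_seconds=0.0)
```

Tuning batch_size and pause_seconds trades total run time against the load
imposed on regular traffic.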

>And in terms of feature development, would we implement
>fine-granularity encryption in the future even if the first step is
>cluster-wide encryption? And would both TDEs encrypt the same kinds of
>database objects (i.e. only tables, indexes and WAL)? If so, how
>would users choose between them depending on the case?
>
>I imagined the case where we had the cluster-wide encryption as the
>first TDE feature. We would enable TDE at initdb time by specifying a
>command-line parameter for TDE. Then TDE is enabled cluster-wide
>and all tables/indexes and WAL are automatically encrypted. Then, if
>we want to implement finer-grained encryption, how can we
>make users use it? WAL encryption and table/index encryption are
>enabled at the same time, but we want to enable encryption for
>particular tables/indexes after initdb. If the cluster-wide encryption
>is something like a shortcut for encrypting all tables/indexes, I
>personally think that implementing the finer-grained one first
>and then using it to achieve the coarser granularity would be
>easier.
>

Not sure, but I'd expect it to be the other way around, i.e. the more
granular encryption being more complicated. One reason is that with
cluster-wide you can just assume everything is encrypted and handle it the
same way, while with fine-grained encryption you need to check whether each
individual object is encrypted, maybe handle it in different ways, etc.
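
To make the difference concrete, a small sketch under heavy assumptions:
RELATION_KEYS is a hypothetical per-relation catalog, the XOR "cipher" is a
toy, and neither function reflects actual PostgreSQL code paths.

```python
from typing import Dict

CLUSTER_KEY = 0x42  # toy single-byte key for the cluster-wide case

# Hypothetical catalog for the fine-grained case: relation name -> key.
# A relation that is absent from the mapping is stored in plaintext.
RELATION_KEYS: Dict[str, int] = {"accounts": 0x17}


def xor_bytes(data: bytes, key: int) -> bytes:
    """Toy cipher: XOR each byte with the key (encrypt and decrypt alike)."""
    return bytes(b ^ key for b in data)


def read_page_cluster_wide(raw: bytes) -> bytes:
    """Cluster-wide: every page is encrypted, so always decrypt."""
    return xor_bytes(raw, CLUSTER_KEY)


def read_page_fine_grained(relation: str, raw: bytes) -> bytes:
    """Fine-grained: look up the relation first, then branch."""
    key = RELATION_KEYS.get(relation)
    if key is None:
        return raw  # not encrypted, pass through unchanged
    return xor_bytes(raw, key)


stored_accounts = xor_bytes(b"balance=10", RELATION_KEYS["accounts"])
stored_logs = b"started"  # plaintext relation
```

Even in this toy form the fine-grained path needs a lookup and a branch per
object, and every other code path that touches pages would need the same.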

But that's just my guess, really.

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



