Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)

From: Masahiko Sawada
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Date:
Msg-id: CAD21AoBc-o=KZ=BPB5wWVNnBepqe8yqVs_D3eAd3Tr=X=tTGpQ@mail.gmail.com
In response to: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Bruce Momjian <bruce@momjian.us>)
Responses: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)  (Bruce Momjian <bruce@momjian.us>)
RE: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)  ("Smith, Peter" <peters@fast.au.fujitsu.com>)
List: pgsql-hackers
On Thu, Aug 15, 2019 at 10:19 AM Bruce Momjian <bruce@momjian.us> wrote:
>
> On Wed, Aug 14, 2019 at 04:36:35PM +0200, Antonin Houska wrote:
> > I can work on it right away but don't know where to start.
>
> I think the big open question is whether there will be acceptance of an
> all-cluster encryption feature.  I guess if no one objects, we can move
> forward.
>

I still feel that we need per-table/tablespace keys, although they
might not be part of the first implementation. I think
table/tablespace-level and cluster-level encryption would be almost
equally secure, but the former would have an advantage in terms of
operation and performance.

> > First, I think we should use a code repository to integrate [1] and [2]
> > instead of sending diffs back and forth. That would force us to resolve
> > conflicts soon and help to avoid duplicate work. The diffs would be created
> > only when we need to post the next patch version to pgsql-hackers for review,
> > otherwise the discussions of details can take place elsewhere.
>
> Well, we can do that, or just follow the TODO list and apply items as we
> complete them.  We have found that doing everything in one big patch is
> just too hard to review and get accepted.
>
> > The most difficult problem I see now regarding the collaboration is agreement
> > on the key management user interface. The Full-Cluster Encryption feature [1]
> > should not add configuration variables or even tools that the next, more
> > sophisticated version [2] deprecates immediately. Part of the problem is that
>
> Yes, the all-cluster encryption feature has _no_ SQL-level API to
> control it, just a GUC variable that you can use SHOW to see the
> encryption mode.
>
> > [2] puts all (key management related) interaction of postgres with the
> > environment into an external library. As I pointed out in my response to [2],
> > this will not work for frontend applications (e.g. pg_waldump). I think the
> > key management UI for [2] needs to be designed first even if PG 13 should
> > adopt only [1].
>
> I think there are several directions we can go after all-cluster
> encryption, and it does matter because we would want minimal API
> breakage.  The options are:
>
> 1)  Allow per-table encryption control to limit encryption overhead,
> though all of WAL still needs to be encrypted;  we could add a
> per-record encryption flag to WAL records to avoid that.
>
> 2)  Allow user-controlled keys, which are always unlocked, and encrypt
> WAL with one key
>
> 3)  Encrypt only the user-data portion of pages with user-controlled
> keys.  FREEZE and crash recovery work since only the user data is
> encrypted.  WAL is not encrypted, except for the user-data portion
>
> I think once we implement all-cluster encryption, there will be little
> value to #1 unless we find that page encryption is a big performance
> hit, which I think is unlikely based on performance tests so far.
>
> I don't think #2 has much value since the keys have to always be
> unlocked to allow freeze and crash recovery.
>
> I don't think #3 is viable since there is too much information leakage,
> particularly for indexes because the tid is visible.
>
> Now, if someone says they still want 2 & 3, which has happened many
> times, explain how these issues can be reasonably addressed.
>
> I frankly think we will implement all-cluster encryption, and nothing
> else.  I think the next big encryption feature after that will be
> client-side encryption support, which can be done now but is complex;
> it needs to be easier.
>
> > At least it should be clear how [2] will retrieve the master key because [1]
> > should not do it in a different way. (The GUC cluster_passphrase_command
> > mentioned in [3] seems viable, although I think [1] uses approach which is
> > more convenient if the passphrase should be read from console.)

I think we can also provide a way to pass the encryption key directly
to the postmaster rather than using a passphrase. Since users commonly
store keys in a KMS, that would be useful.
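
To illustrate the two options, here is a minimal Python sketch (it is
not code from the patch). It assumes that the passphrase command's
stdout is the passphrase and that a key is derived from it with PBKDF2;
the salt, the iteration count and the kek_from_key_command helper are
hypothetical choices made only for this example.

```python
import hashlib
import subprocess


def run_passphrase_command(command: str) -> bytes:
    """Run a shell command and return its stdout without trailing newlines."""
    result = subprocess.run(command, shell=True, check=True, capture_output=True)
    return result.stdout.rstrip(b"\n")


def kek_from_passphrase(passphrase: bytes, salt: bytes, key_len: int = 32) -> bytes:
    """Derive a key from a passphrase (assumption: PBKDF2-HMAC-SHA256)."""
    # The iteration count is illustrative, not a value taken from the patch.
    return hashlib.pbkdf2_hmac("sha256", passphrase, salt, 100_000, dklen=key_len)


def kek_from_key_command(command: str, key_len: int = 32) -> bytes:
    """Hypothetical alternative: the command prints the key itself as hex,
    for example after fetching it from a KMS, so no derivation is needed."""
    key = bytes.fromhex(run_passphrase_command(command).decode("ascii").strip())
    if len(key) != key_len:
        raise ValueError(f"expected a {key_len}-byte key, got {len(key)} bytes")
    return key
```

With the direct form, the server never sees a passphrase at all; the
command is the only place that knows how to reach the KMS.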

> > Rotation of
> > the master key is another thing that both versions of the feature should do in
> > the same way. And of course, the frontend applications need a consistent approach
> > too.
>
> I don't see the value of an external library for key storage.

I think the big benefit is that PostgreSQL can work seamlessly with
external services such as a KMS. For instance, during key rotation
PostgreSQL can register a new key with the KMS and use it, and it can
remove keys when they are no longer necessary. That is, it enables
PostgreSQL not only to get keys from the KMS but also to register and
remove them. We can also decrypt the MDEK in the KMS instead of in
PostgreSQL, which is safer. In addition, once someone creates a plugin
library for an external service, individual projects don't need to
create their own.
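
As a rough sketch of what such a plugin boundary could look like (in
Python for brevity; the real interface would be C, and every name here
is hypothetical rather than taken from the patch), the three operations
mentioned above are get, register and remove. The in-memory class only
stands in for a real KMS client.

```python
import os
from abc import ABC, abstractmethod


class KeyManagementPlugin(ABC):
    """Hypothetical plugin interface: the three operations discussed above."""

    @abstractmethod
    def register_key(self, key_id: str, key: bytes) -> None: ...

    @abstractmethod
    def get_key(self, key_id: str) -> bytes: ...

    @abstractmethod
    def remove_key(self, key_id: str) -> None: ...


class InMemoryKMS(KeyManagementPlugin):
    """Stand-in for an external KMS; keeps keys in a dict."""

    def __init__(self) -> None:
        self._keys: dict[str, bytes] = {}

    def register_key(self, key_id: str, key: bytes) -> None:
        self._keys[key_id] = key

    def get_key(self, key_id: str) -> bytes:
        return self._keys[key_id]  # raises KeyError for unknown ids

    def remove_key(self, key_id: str) -> None:
        del self._keys[key_id]


def rotate_key(kms: KeyManagementPlugin, old_id: str, new_id: str,
               key_len: int = 32) -> bytes:
    """Register a fresh random key, then retire the old one."""
    new_key = os.urandom(key_len)
    kms.register_key(new_id, new_key)
    # Re-wrapping any keys protected by the old key would happen here,
    # before the old key is removed.
    kms.remove_key(old_id)
    return new_key
```

A plugin for a particular service would implement the same three
methods, so the server-side rotation logic would not need to change.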


BTW I've created a PoC patch for the cluster encryption feature. The
attached patch set implements some items from the TODO list, and some
of them can be used even for finer-granularity encryption. The
implemented components are as follows:

* Initialization stuff (initdb support). initdb has new command-line
options: --enc-cipher and --cluster-passphrase-command. The --enc-cipher
option accepts either aes-128 or aes-256, while
--cluster-passphrase-command accepts an arbitrary command. ControlFile
has an integer indicating the cluster encryption mode: 'off', 'aes-128'
or 'aes-256'.

* 3-tier encryption keys. During initdb we create the KEK and MDEK and
write the metadata file (global/pg_kmgr). When the postmaster starts up,
it reads the kmgr file, verifies the passphrase using HMAC, unwraps the
MDEK and derives the TDEK and WDEK from the MDEK. Currently the MDEK,
TDEK and WDEK are stored in shared memory as this is still a PoC, but we
could also keep them in process-local memory.

* All cryptographic functions are implemented using OpenSSL. Since
HKDF and key wrap were introduced in OpenSSL 1.1.0, the patch requires
1.1.0 or higher.

* Buffer encryption. All table and index data except for the vm and fsm
is transparently encrypted.
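
To make the startup sequence of the 3-tier key item concrete, here is a
minimal Python sketch of two of its steps: the HMAC check and the HKDF
derivation. It is not the patch's code (the patch uses OpenSSL from C).
What exactly the HMAC covers, the separate hmac_key, and the "TDEK" and
"WDEK" info labels are assumptions for illustration. The unwrap step is
omitted because Python's standard library has no AES key wrap.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it."""
    if length > 255 * 32:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: an empty salt is replaced by HashLen zero bytes, per the RFC.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated and truncated.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def verify_passphrase(hmac_key: bytes, wrapped_mdek: bytes, stored_hmac: bytes) -> bool:
    """Assumption: the kmgr file stores an HMAC over the wrapped MDEK, keyed by
    a key derived from the passphrase; a mismatch means a wrong passphrase."""
    expected = hmac.new(hmac_key, wrapped_mdek, hashlib.sha256).digest()
    return hmac.compare_digest(expected, stored_hmac)


def derive_data_keys(mdek: bytes, key_len: int = 32) -> tuple[bytes, bytes]:
    """Derive the table and WAL data keys from the MDEK using distinct labels."""
    tdek = hkdf_sha256(mdek, b"TDEK", key_len)  # hypothetical label
    wdek = hkdf_sha256(mdek, b"WDEK", key_len)  # hypothetical label
    return tdek, wdek
```

Because the TDEK and WDEK are derived deterministically, only the
wrapped MDEK has to be stored on disk.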

Missing features so far are as follows:

* WAL encryption
* Temporary file encryption
* Command-line tool to change passphrase (KEK key rotation)
* Front-end tool support (pg_waldump, pg_rewind)
* Documentation
* Regression tests

Since some of the above items are already implemented in other patches,
we can use those.

We can create a database cluster with cluster encryption enabled as follows:

$ initdb -D data --enc-cipher=aes-128 \
    --cluster-passphrase-command='echo "secret password"'
$ pg_controldata | grep encryption
Data encryption cipher:               aes-128
$ pg_ctl start

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
