Re: Transparent Data Encryption (TDE) and encrypted files
| From | Tomas Vondra |
|---|---|
| Subject | Re: Transparent Data Encryption (TDE) and encrypted files |
| Date | |
| Msg-id | 20191004204657.kxkkwn47g6uiakjv@development |
| In response to | Re: Transparent Data Encryption (TDE) and encrypted files (Stephen Frost <sfrost@snowman.net>) |
| Responses | Re: Transparent Data Encryption (TDE) and encrypted files |
| List | pgsql-hackers |
On Thu, Oct 03, 2019 at 01:26:55PM -0400, Stephen Frost wrote:
>Greetings,
>
>* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
>> On Thu, Oct 03, 2019 at 11:51:41AM -0400, Stephen Frost wrote:
>> >* Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
>> >>On Thu, Oct 03, 2019 at 10:40:40AM -0400, Stephen Frost wrote:
>> >>>People who are looking for 'encrypt all the things' should and will be
>> >>>looking at filesystem-level encryption options.  That's not what this
>> >>>feature is about.
>> >>
>> >>That's almost certainly not true, at least not universally.
>> >>
>> >>It may be true for some people, but a lot of the people asking for
>> >>in-database encryption essentially want to do filesystem encryption but
>> >>can't use it for various reasons. E.g. because they're running in
>> >>environments that make filesystem encryption impossible to use (OS not
>> >>supporting it directly, no access to the block device, lack of admin
>> >>privileges, ...). Or maybe they worry about people with fs access.
>> >
>> >Anyone coming from other database systems isn't asking for that though
>> >and it wouldn't be a comparable offering to other systems.
>>
>> I don't think that's quite accurate. In the previous message you claimed
>> (1) this isn't what other database systems do and (2) people who want to
>> encrypt everything should just use fs encryption, because that's not
>> what TDE is about.
>>
>> Regarding (1), I'm pretty sure Oracle TDE does pretty much exactly this,
>> at least in the mode with tablespace-level encryption. It's true there
>> is also column-level mode, but from my experience it's far less used
>> because it has a number of annoying limitations.
>
>We're probably being too general and that's ending up with us talking
>past each other.  Yes, Oracle provides tablespace and column level
>encryption, but neither case results in *everything* being encrypted.
>
Possibly. There are far too many different TDE definitions in all those
various threads.
>> So I'm somewhat puzzled by your claim that people coming from other
>> systems are asking for the column-level mode. At least I'm assuming
>> that's what they're asking for, because I don't see other options.
>
>I've seen asks for tablespace, table, and column-level, but it's always
>been about the actual data.  Something like clog is an entirely internal
>structure that doesn't include the actual data.  Yes, it's possible it
>could somehow be used for a side-channel attack, as could other things,
>such as WAL, and as such I'm not sure that forcing a policy of "encrypt
>everything" is actually a sensible approach and it definitely adds
>complexity and makes it a lot more difficult to come up with a sensible
>solution.
>
IMHO the proven design principle is "deny all" by default, i.e. we
should start with the assumption that clog is encrypted and then present
arguments why it's not needed. Maybe it's 100% fine and we don't need to
encrypt it, or maybe it's a minor information leak and is not worth the
extra complexity, or maybe it's not needed for v1. But how do you know?
I don't think that discussion happened anywhere in those threads.
>> >>If you look at the two threads discussing the FDE design, both of
>> >>them pretty much started as "let's do FDE in the database".
>> >
>> >And that's how some folks continue to see it- let's just encrypt all the
>> >things, until they actually look at it and start thinking about what
>> >that means and how to implement it.
>>
>> This argument also works the other way, though. On Oracle, people often
>> start with the column-level encryption because it seems naturally
>> superior (hey, I can encrypt just the columns I want, ...) and then they
>> start running into the various limitations and eventually just switch to
>> the tablespace-level encryption.
>>
>> Now, maybe we'll be able to solve those limitations - but I think it's
>> pretty unlikely, because those limitations seem quite inherent to how
>> encryption affects indexes etc.
>
>It would probably be useful to discuss the specific limitations that
>you've seen cause people to move away from column-level encryption.
>
>I definitely agree that figuring out how to make things work with
>indexes is a non-trivial challenge, though I'm hopeful that we can come
>up with something sensible.
>
Hope is hardly something we should use to drive design decisions ...
As for the column-level limitations in Oracle, this is what the docs [1]
say:
----- <quote> -----
Do not use TDE column encryption with the following database features:
    Index types other than B-tree
    Range scan search through an index
    Synchronous change data capture
    Transportable tablespaces
    Columns that have been created as identity columns
In addition, you cannot use TDE column encryption to encrypt columns
used in foreign key constraints.
----- </quote> -----
Now, some of that is obviously specific to Oracle, but at least some of
it seems to affect us too - certainly range scans through indexes,
possibly data capture (I believe that's mostly logical decoding),
non-btree indexes and identity columns.
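To make the range-scan point concrete, here is a rough sketch in Python. It makes no claim about how Oracle implements column encryption: `token()` is a hypothetical stand-in (HMAC-SHA256 with a made-up key) for any deterministic transform of a column value. Equal inputs give equal tokens, so an equality probe still works against an index built over the tokens, but the token order has nothing to do with the value order, so a range of adjacent values is scattered across the index.

```python
import hashlib
import hmac

# Hypothetical column key, for illustration only (no key management here).
KEY = b"illustrative column key"

def token(value: int) -> bytes:
    # Deterministic keyed transform standing in for column encryption:
    # equal inputs give equal tokens, but token order is unrelated to value order.
    return hmac.new(KEY, str(value).encode(), hashlib.sha256).digest()

values = list(range(100, 150))

# What a B-tree built over the encrypted column would effectively sort by.
index = sorted((token(v), v) for v in values)

# Equality lookup: transform the probe value and search for that token.
probe = token(123)
equal_hits = [v for t, v in index if t == probe]

# Range scan 110..119: adjacent entries in a plaintext index,
# scattered positions in the token-ordered index.
positions = [i for i, (t, v) in enumerate(index) if 110 <= v <= 119]
order_by_token = [v for t, v in index]

print("equality hits:", equal_hits)
print("index positions of 110..119:", positions)
```

With a randomized scheme (a fresh nonce per value) even the equality probe would stop working, which is presumably why such features tend to be limited to plain B-tree equality lookups.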
Oracle also has a handy "TDE best practices" document [2], which says
when to use column-level encryption - let me quote a couple of points:
* Location of sensitive information is known
* Less than 5% of all application columns are encryption candidates
* Encryption candidates are not foreign-key columns
* Indexes over encryption candidates are normal B-tree indexes (this
  also means no support for indexes on expressions, and likely partial
  indexes)
* No support from hardware crypto acceleration.
Now, maybe we can relax some of those limitations, or maybe those
limitations are acceptable for some applications. But it certainly does
not seem like a clearly superior choice.
There are other interesting arguments in [2]; it's worth a read.
>> >Yeah, it'd be great to just encrypt everything, with a bunch of
>> >different keys, all of which are stored somewhere else, and can be
>> >updated and changed by the user when they need to do a rekeying, but
>> >then you have to start asking about what keys need to be available when
>> >for doing crash recovery, how do you handle a crash in the middle of a
>> >rekeying, how do you handle updating keys from the user, etc..
>> >
>> >Sure, we could offer a dead simple "here, use this one key at database
>> >start to just encrypt everything" and that would be enough for some set
>> >of users (a very small set, imv, but that's subjective, obviously), but
>> >I don't think we could dare promote that as having TDE because it
>> >wouldn't be at all comparable to what other databases have, and it
>> >wouldn't materially move us in the direction of having real TDE.
>>
>> I think that very much depends on the definition of "real TDE".  I
>> don't know what exactly that means at this point. And as I said before,
>> I think such a simple mode *is* comparable to (at least some) solutions
>> available in other databases (as explained above).
>
>When I was researching this, I couldn't find any example of a database
>that wouldn't start without the one magic key that encrypts everything.
>I'm happy to be told that I was wrong in my understanding of that, with
>some examples.
>
>> As for the users, I don't have any objective data about this, but I
>> think the number of people wanting such a simple solution is non-trivial.
>> That does not mean we can't extend it to support more advanced features.
>
>The concern that I raised before and that I continue to worry about is
>that providing such a simple capability will have a lot of limitations
>too (such as having a single key and only being able to rekey during a
>complete downtime, because we have to re-encrypt clog, etc, etc), and
>I don't see it helping us get to more granular TDE because, for that,
>where we really need to start is by building a vault of some kind to
>store the keys in and then figuring out how we do things like crash
>recovery in a sensible way and, ideally, without needing to have access
>to all of (any of?) the keys.
>
Eh? I don't think that "simple mode" has to use a single encryption key
internally, I think the design with a single *master* key and multiple
encryption keys works just fine. So when changing the master key, it's
enough to re-encrypt the encryption keys. No need for a downtime etc.
Of course, in some cases it may be desirable to change those encryption
keys too, but that seems like a pretty inherent feature.
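A minimal sketch of that key hierarchy, in Python, may help. Everything here is hypothetical and for illustration only: `KeyStore`, `toy_encrypt` and `toy_decrypt` are made-up names, and the "cipher" is a toy SHA-256 keystream standing in for a real one (it is not secure and not a proposal for the actual implementation). The point is only that rotating the master key rewrites the small wrapped keys, while data encrypted under the data keys is left alone.

```python
import hashlib
import secrets

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream (SHA-256 in counter mode); a stand-in for a real cipher.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # Fresh random nonce per message, stored in front of the ciphertext.
    nonce = secrets.token_bytes(16)
    ks = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, ks))

def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    ks = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, ks))

class KeyStore:
    """Hypothetical key hierarchy: one master key wraps many data keys."""

    def __init__(self, master_key: bytes):
        self._master = master_key
        self.wrapped = {}  # name -> data key encrypted under the master key

    def create_data_key(self, name: str) -> None:
        self.wrapped[name] = toy_encrypt(self._master, secrets.token_bytes(32))

    def data_key(self, name: str) -> bytes:
        return toy_decrypt(self._master, self.wrapped[name])

    def rotate_master(self, new_master: bytes) -> None:
        # Only the wrapped data keys are re-encrypted; the data is untouched.
        self.wrapped = {
            name: toy_encrypt(new_master, toy_decrypt(self._master, blob))
            for name, blob in self.wrapped.items()
        }
        self._master = new_master

store = KeyStore(secrets.token_bytes(32))
store.create_data_key("tablespace_a")
page = toy_encrypt(store.data_key("tablespace_a"), b"some heap page contents")

store.rotate_master(secrets.token_bytes(32))

# The page was never rewritten, yet it still decrypts with the same data key.
recovered = toy_decrypt(store.data_key("tablespace_a"), page)
print(recovered)
```

Changing a data key itself is a different operation: that one does require re-encrypting whatever was written under it.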
>> >>>>I'm not sold on the comments that have been made about encrypting the
>> >>>>server log. I agree that could leak data, but that seems like somebody
>> >>>>else's problem: the log files aren't really under PostgreSQL's
>> >>>>management in the same way as pg_clog is. If you want to secure your
>> >>>>logs, send them to syslog and configure it to do whatever you need.
>> >>>
>> >>>I agree with this.
>> >>
>> >>I don't. I know it's not an easy problem to solve, but it may contain
>> >>user data (which is what we manage). We may allow disabling that, at
>> >>which point it becomes someone else's problem.
>> >
>> >We also send user data to clients, but I don't imagine we're suggesting
>> >that we need to control what some downstream application does with that
>> >data or how it gets stored.  There's definitely a lot of room for
>> >improvement in our logging (in an ideal world, we'd have a way to
>> >actually store the logs in the database, at which point it could be
>> >encrypted or not that way...), but I'm not seeing the need for us to
>> >have a way to encrypt the log files.  If we did encrypt them, we'd have
>> >to make sure to do it in a way that users could still access them
>> >without the database being up and running, which might be tricky if the
>> >key is in the vault...
>>
>> That's a bit of a straw-man argument, really. The client is obviously
>> meant to receive and handle sensitive data, that's its main purpose.
>> For logging systems the situation is a bit different, it's a general
>> purpose tool, with no idea what the data is.
>
>The argument you're making is that the log isn't intended to have
>sensitive data, but while that might be a nice place to get to, we
>certainly aren't there today, which means that people should really be
>sending the logs to a location that's trusted.
>
Which means they can't really send it anywhere, because they don't have
control over what will be in error messages etc.
Let me quote the PCI DSS standard, which seems like a good example:
    3.4 Render Primary Account Number (PAN), at minimum, unreadable
    anywhere it is stored (including data on portable digital media,
    backup media, in logs) by using any of the following approaches:
    * One-way hashes based on strong cryptography
    * Truncation
    * Index tokens and pads (pads must be securely stored)
    * Strong cryptography with associated key management processes and
      procedures.
I'm no PCI DSS expert, but how can you comply with this (assuming you
want to store PAN in the database) by only sending the data to trusted
systems?
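For illustration, the "truncation" option from that list, applied to a log line before it is written, could look roughly like the Python sketch below. The assumptions are loud ones: `truncate_pans` is a hypothetical helper, the "13 to 19 digits" pattern is only a guess at what a PAN looks like, and the table and column names in the sample line are made up. It also shows the weakness: a filter like this has to guess from the text, because nothing tells it which messages actually contain a PAN.

```python
import re

# Assumption: any standalone run of 13 to 19 digits is treated as a possible PAN.
PAN_RE = re.compile(r"\b\d{13,19}\b")

def truncate_pans(message: str) -> str:
    # Keep only the last four digits of anything that looks like a PAN.
    return PAN_RE.sub(
        lambda m: "*" * (len(m.group()) - 4) + m.group()[-4:],
        message,
    )

# Sample log line with a hypothetical column name and a well-known test number.
line = "DETAIL:  Key (pan)=(4111111111111111) already exists."
redacted = truncate_pans(line)
print(redacted)
```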
>> I do understand it's pretty pointless to send encrypted messages to such
>> external tools, but IMO it'd be good to implement that at least for our
>> internal logging collector.
>
>It's also less than user friendly to log to encrypted files that you
>can't read without having the database system being up, so we'd have to
>figure out at least a solution to that problem, and then if you have
>downstream systems where the logs are going to, you have to decrypt
>them, or have a way to have them not be encrypted perhaps.
>
I don't see why the database would have to be up, as long as the vault
is accessible somehow (i.e. I can imagine a tool for reading encrypted
logs, requesting the key from the same vault).
>In general, wrt the logs, I feel like it's at least a reasonably small
>and independent piece of this, though I wonder if it'll cause similar
>problems when it comes to dealing with crash recovery (how do we log if
>we don't have the key from the vault because we haven't done crash
>recovery yet, for example...).
>
Possibly, I don't have an opinion on this.
regards
[1] https://docs.oracle.com/en/database/oracle/oracle-database/18/asoag/configuring-transparent-data-encryption.html
[2] https://www.oracle.com/technetwork/database/security/twp-transparent-data-encryption-bes-130696.pdf
-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 