Re: Transparent column encryption
From | Jelte Fennema-Nio |
---|---|
Subject | Re: Transparent column encryption |
Date | |
Msg-id | CAGECzQSWbraE2Xqk2BUvG94vOkEF1_DT1b1TedqHPJsVrLukcA@mail.gmail.com |
In response to | Re: Transparent column encryption (Peter Eisentraut <peter@eisentraut.org>) |
List | pgsql-hackers |
On Thu, 18 Apr 2024 at 13:25, Peter Eisentraut <peter@eisentraut.org> wrote:
> Hopefully, the reason for key rotation is mainly that policies require
> key rotation, not that keys get compromised all the time.

These key rotation policies are generally in place to reduce the impact
of a key compromise by limiting the time a compromised key is valid.

> This
> seems pretty standard to me. For example, I can change the password on
> my laptop's file system encryption, which somehow wraps a lower-level
> key, but I can't reencrypt the actual file system in place.

I think the threat model for this proposal and a laptop's file system
encryption are different enough that the same choices/tradeoffs don't
automatically translate. Specifically, in this proposal the unencrypted
CEK is present on all servers that need to read/write those encrypted
values. A successful attacker would then be able to read the encrypted
values forever with this key, because it effectively cannot be rotated.
That is a much bigger attack surface and risk than a laptop's disk
encryption. So I feel quite strongly that shipping the proposed feature
without being able to re-encrypt columns in an online fashion would be a
mistake.

> That's the
> reason for having this two-tier key system in the first place.

If we allow for online rotation of the actual encryption key, then maybe
we don't even need this two-tier system ;)

Not having this two-tier system would have a few benefits in my opinion:

1. We wouldn't need to send encrypted key material from the server to
   every client, which seems nice from a security, bandwidth and client
   implementation perspective.
2. Asymmetric encryption of columns suddenly becomes an option, allowing
   certain clients to enter encrypted data into the database but not
   read it.

> Two problems here: One, for deterministic encryption, everyone needs to
> agree on the representation, otherwise equality comparisons won't work.
> Two, if you give clients the option of storing text or binary, then
> clients also get back a mix of text or binary, and it will be a mess.
> Just giving the option of storing the payload in binary wouldn't be that
> hard, but it's not clear what you can sensibly do with that in the end.

How about defining at column creation time whether the underlying value
should be binary or not? Something like:

    CREATE TABLE t(
        mytime timestamp ENCRYPTED WITH (column_encryption_key = cek1, binary=true)
    );

> Even if the identifiers
> somehow were global (but OIDs can also change and are not guaranteed
> unique forever),

OIDs of existing rows can't just change while a connection is active,
right? (All I know is that upgrades can change them, but that seems
fine.) Also, they are unique within a catalog table, right?

> the state of which keys have already been sent is
> session state.

I agree that this is the case. But it's state that can be tracked fairly
easily by a transaction pooler, similar to how prepared statements can
be tracked. And this is easier to do when the IDs of the same keys are
the same across every session to the server, because if they differ then
you need to re-map the IDs.

> This is kind of like SASL or TLS can add new methods dynamically without
> requiring a new version. I mean, as we are learning, making new
> protocol versions is kind of hard, so the point was to avoid it.

Fair enough.

> I guess you could do that, but wouldn't that making the decoding of
> these messages much more complicated? You would first have to read the
> "short" variant, decode the format, and then decide to read the rest.
> Seems weird.

I see your point. But with the current approach, even for queries that
don't return any encrypted columns, these useless fields would be part
of the RowDescription. It seems quite annoying to add extra network and
parsing overhead to all of your queries even if only a small percentage
use the encryption feature.
Maybe we should add a new message type instead, like
EncryptedRowDescription, or add some flag field at the start of
RowDescription that can be used to indicate that there is encryption
info for some of the columns.

> Yes, that's what would happen, and that's the intention, so that for
> example you can use pg_dump to back up encrypted columns without having
> to decrypt them.

Okay, makes sense. But I think it would be good to document that.

> > A related question to this is that currently libpq throws an error if
> > e.g. a master key realm is not defined but another one is. Is that
> > really what we want? Is not having one of the realms really that
> > different from not providing any realms at all?
>
> Can you provide a more concrete example of what scenario you have a
> concern about?

A server has tables A and B. A is encrypted with master key realm X
and B is encrypted with master key realm Y. If libpq is only given a
key for realm X and then tries to read table B, an error is thrown,
whereas if you don't provide any realm at all, you can read from table
B just fine; you will just get bytea fields back.
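
To make the realm scenario concrete, here is a small Python model of the
two behaviours as I understand them. It is only a sketch under stated
assumptions: `fetch_value`, `MissingRealmError`, the dict of configured
realms and the returned tags are all names made up for illustration, the
decryption step is a stand-in, and none of this is libpq's actual code.

```python
class MissingRealmError(Exception):
    """Raised when some realms are configured but the needed one is not."""


def fetch_value(column_realm, configured_realms, ciphertext):
    """Model what the client hands back for one encrypted value.

    configured_realms maps realm name -> key material (hypothetical shape).
    """
    if not configured_realms:
        # No realms configured at all: the value is passed through
        # undecrypted, so the caller just sees bytea.
        return ("bytea", ciphertext)
    if column_realm not in configured_realms:
        # Some realms are configured, but not the one this column needs:
        # this is the case that currently ends in an error.
        raise MissingRealmError(f"no key configured for realm {column_realm!r}")
    # Stand-in for real decryption, which this sketch does not implement.
    return ("decrypted", ciphertext)


# Table B uses realm Y.
print(fetch_value("Y", {}, b"\x01\x02"))  # ('bytea', b'\x01\x02')
try:
    fetch_value("Y", {"X": b"key-x"}, b"\x01\x02")
except MissingRealmError as exc:
    print("error:", exc)
```

The odd part is visible in the two calls: configuring no realms lets the
read of table B succeed, while configuring only the unrelated realm X
makes the same read fail.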