Re: [FEATURE REQUEST] Encrypted indexes over encrypted data

Поиск
Список
Период
Сортировка
От Bear Giles
Тема Re: [FEATURE REQUEST] Encrypted indexes over encrypted data
Дата
Msg-id CALBNtw4c=MRPXfCPeEnFnb44_5pBOBGYFpp7THj0QLKyaSBc9A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [FEATURE REQUEST] Encrypted indexes over encrypted data  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
There are alternatives. If you know what you want to find, e.g., a search by username or email address, you can store a strong hash of the value as an indexed column. By "strong hash" I mean don't just use md5 or sha1, or even one round with a salt. I can give you more details about how and why offline.

So you might have a record with:

   id serial primary key,
   email_hash text not null indexed,
   first_name_hash text indexed,
   last_name_hash text indexed,
   phone_number_hash text indexed ,
   'wallet'_containing_all_encrypted_values text

and that allows you to search on email, first name, last name, or phone number, or some combination on them. But no expressions. The hashing would be done in your app, not the database. You also probably want to convert everything to lowercase, maybe remove spaces, etc., before computing the hash.

You should be prepared to handle multiple matches. It's unlikely that an email or phone number hash won't be unique but it's safest to always be prepared for more than one match, decrypt the 'wallet', and then do a final comparison. That also gives you a bit of protection from an attacker creating an account and then changing the hash values to match someone else. You can use that to support very limited expressions, e.g., also keep a hash on the first three letters of their last name, but that will compromise your security a bit since it allows an attacker to perform some statistical analysis on the data.

Finally there's the general advice that hashes (and encrypted values) should always have a version number of some sort. It could be something as simple as 3$hash, or it could be a composite column or even a user-defined type. The # indicates is a lookup into a table, perhaps in your app, that tells you which hashing algorithm and salt to use. It makes life a lot easier if the security audit tells you that you need to change your cipher/salt/key/whatever but you can't do it immediately since you don't know everything you need in order to do it, e.g., the password that you need in order to recompute the hash value. With that version number it's easy to continue to accept the existing password so they can log in, and in the background you quietly recompute the hash using the new salt/algorithm/whatever and update their record. I've worked for some pretty knowledgeable companies that have overlooked this.

On Thu, Aug 9, 2018 at 6:05 AM, Andres Freund <andres@anarazel.de> wrote:


On August 9, 2018 5:30:26 PM GMT+05:30, Danylo Hlynskyi <abcz2.uprola@gmail.com> wrote:
> ?Is it hard to implement soluition 2?

Yes.

To the point that I'm fairly certain that an implementation would be considered to costly to maintain (vs benefit) of proposed.

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: Re: [HACKERS] proposal - Default namespaces for XPath expressions(PostgreSQL 11)
Следующее
От: Tom Lane
Дата:
Сообщение: Re: libpq should not look up all host addresses at once