Re: Why the index is not used ?

From: Tomas Vondra
Subject: Re: Why the index is not used ?
Msg-id: 88005798-8b90-3535-cc1c-a915a9e12d17@2ndquadrant.com
In reply to: RE: Why the index is not used ?  (ROS Didier <didier.ros@edf.fr>)
Responses: RE: Why the index is not used ?  (ROS Didier <didier.ros@edf.fr>)
           RE: Why the index is not used ?  (ROS Didier <didier.ros@edf.fr>)
List: pgsql-sql

Hi,

On 10/07/2018 08:32 PM, ROS Didier wrote:
> Hi Francisco
> 
>     Thank you for your remark. 
> You're right, but it's the only procedure I found to make search on
> encrypted fields with good response times (using index) !
> 

Unfortunately, that kinda invalidates the whole purpose of in-database
encryption - you'll have encrypted on-disk data in one place, and then
plaintext right next to it. If you're dealing with credit card numbers,
then you presumably care about PCI DSS, and this is likely a direct
violation of that.

> Regarding access to the file system, our servers are in protected
> network areas. Few people can connect to it.
> 

Then why do you need encryption at all? If you assume access to the
filesystem / storage is protected, why do you bother with encryption?
What is your threat model?

> it's not the best solution, but we have data encryption needs and
> good performance needs too. I do not know how to do it except the
> specified procedure..
>
> if anyone has any proposals to put this in place, I'm interested.
> 

One thing you could do is hash the value and then search by the hash.
So aside from having the encrypted column you'll also have a short hash,
and you may use it in the query *together* with the original condition.
It does not need to be unique (in fact it should not be, so that the
hash cannot be reversed), but it needs to have enough distinct values to
make the index efficient. Say, 10k values should be enough, because with
a roughly uniform distribution that means 0.01% selectivity.

So the function might look like this, for example:

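  CREATE FUNCTION cchash(text) RETURNS int AS $$
    -- take the modulo first: abs() of the smallest int4 would overflow
    SELECT abs(hashtext($1) % 10000);
  $$ LANGUAGE sql IMMUTABLE;  -- IMMUTABLE: needed if used in an index expression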

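and then be used like this, with the hash kept in its own column (cc_hash
below, computed from the plaintext when the row is written, because the
encrypted value differs on every encryption):

  CREATE INDEX idx_cartedecredit_cc02 ON cartedecredit(cc_hash);

and in the query

  SELECT pgp_sym_decrypt(cc, 'motdepasse') FROM cartedecredit
   WHERE pgp_sym_decrypt(cc, 'motdepasse')='test value 32'
     AND cc_hash = cchash('test value 32');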
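
Since pgp_sym_encrypt() salts its output, the short hash has to be derived
from the plaintext and stored next to the ciphertext. A minimal sketch of
that setup follows. It assumes that pgcrypto is installed and that cc is a
bytea column holding pgp_sym_encrypt() output; the column name cc_hash and
the index name are hypothetical, not something from the original schema.

```sql
-- Assumption: pgcrypto is available and cartedecredit.cc is bytea.
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Bucketing function, repeated here so the sketch runs on its own.
-- Modulo before abs() keeps the result in 0..9999 without overflow.
CREATE OR REPLACE FUNCTION cchash(text) RETURNS int AS $$
  SELECT abs(hashtext($1) % 10000);
$$ LANGUAGE sql IMMUTABLE;

-- Hypothetical extra column for the short, non-unique hash.
ALTER TABLE cartedecredit ADD COLUMN cc_hash int;

-- One-time backfill from the decrypted value; new rows would need
-- cc_hash set from the plaintext at insert time.
UPDATE cartedecredit
   SET cc_hash = cchash(pgp_sym_decrypt(cc, 'motdepasse'));

-- Plain b-tree index on the stored hash (hypothetical index name).
CREATE INDEX idx_cartedecredit_cc_hash ON cartedecredit (cc_hash);

-- The hash condition narrows the candidates through the index; the
-- decrypt comparison then discards rows that merely share a bucket.
SELECT pgp_sym_decrypt(cc, 'motdepasse')
  FROM cartedecredit
 WHERE cc_hash = cchash('test value 32')
   AND pgp_sym_decrypt(cc, 'motdepasse') = 'test value 32';
```

Running the final SELECT under EXPLAIN should show whether the planner
picks the index on cc_hash for a given table size.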

Obviously, this does not really solve the issues with having to pass the
password to the query, making it visible in pg_stat_activity, various
logs etc.
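
To see the exposure for yourself, something like the following could be
run from a second session while the search query is executing. It only
reads the standard pg_stat_activity view; the LIKE pattern is simply an
illustrative filter, and seeing other users' query text requires
sufficient privileges.

```sql
-- List other sessions whose current (or last) statement calls
-- pgp_sym_decrypt; the query column holds the statement text,
-- including any password written into it as a literal.
SELECT pid, usename, state, query
  FROM pg_stat_activity
 WHERE query LIKE '%pgp_sym_decrypt%'
   AND pid <> pg_backend_pid();
```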

Which is why people generally use full-disk encryption (FDE), which is
transparent to the database and provides the same level of protection.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

