Thread: Encryption - searching and sorting

Encryption - searching and sorting

From
David Welton
Date:
Hi,

We have a situation where HIPAA data needs to be encrypted.
Since we have lots of users, and a number of users who access the data
of different people, we cannot simply encrypt the disk and call it
good - it's not fine-grained enough.

So far, we've been encrypting each row, and that actually works out
fairly well, but now that we need to do searching and sorting, things
have naturally become a bit more difficult...

I've been testing a few different solutions with our data, which
shouldn't exceed 10,000 rows or thereabouts, in terms of what needs to
be encrypted/decrypted/searched and sorted.

I prototyped something like this:
http://www.doc.ic.ac.uk/teaching/distinguished-projects/2009/w.harrower.pdf
in Ruby (we're using Rails), and the performance is pretty good (well,
insertion is pretty slow, but that's ok for us), and also allows us to
search for substrings.

However, that did nothing about sorting.  So the next idea was to do
something like this: save a list of names (the data we're storing that
must be encrypted) and database row ids as a Ruby list, and encrypt
that: encrypt(marshal([list ... of ... names])).  The advantage over
having to decrypt single rows is that it seems to be a lot faster to
decrypt one big chunk of data rather than lots of little things.
Searching through a list of N thousand names is actually fairly quick
in Ruby, as is sorting.  So... this would probably work, but it's
pretty gross as a solution in that we're going to have to manually
keep a lot of data synced, and it feels awfully strange to be doing
everything in the application.
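
A minimal sketch of the encrypt(marshal(list)) idea described above, not the poster's actual code. Assumptions: the list is an array of [row_id, name] pairs, the cipher is AES-256-GCM from Ruby's standard OpenSSL binding, and the helper names (encrypt_index, decrypt_index, sorted_ids, search_ids) are hypothetical. Key storage and keeping the blob in sync with the rows are out of scope here.

```ruby
require "openssl"

IV_LEN  = 12 # default GCM nonce length in bytes
TAG_LEN = 16 # default GCM authentication tag length in bytes

# Serialize the [row_id, name] pairs and encrypt them as one blob.
# Layout of the result: IV, then auth tag, then ciphertext.
def encrypt_index(pairs, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv # fresh nonce on every write
  ciphertext = cipher.update(Marshal.dump(pairs)) + cipher.final
  iv + cipher.auth_tag + ciphertext
end

# Decrypt the whole blob in one operation. The GCM tag is verified in
# cipher.final, so Marshal.load only ever sees authenticated data.
def decrypt_index(blob, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.decrypt
  cipher.key = key
  cipher.iv = blob.byteslice(0, IV_LEN)
  cipher.auth_tag = blob.byteslice(IV_LEN, TAG_LEN)
  plaintext = cipher.update(blob.byteslice((IV_LEN + TAG_LEN)..-1)) + cipher.final
  Marshal.load(plaintext)
end

# Row ids ordered by name, case-insensitively.
def sorted_ids(index)
  index.sort_by { |_id, name| name.downcase }.map(&:first)
end

# Row ids whose name contains the given substring, case-insensitively.
def search_ids(index, needle)
  needle = needle.downcase
  index.select { |_id, name| name.downcase.include?(needle) }.map(&:first)
end

key   = OpenSSL::Random.random_bytes(32) # in practice, loaded from a key store
pairs = [[3, "Welton"], [1, "Moncure"], [2, "Wolff"]]
blob  = encrypt_index(pairs, key)        # this is what would be stored
index = decrypt_index(blob, key)
puts sorted_ids(index).inspect
puts search_ids(index, "wo").inspect
```

The trade-off is the one noted above: one decrypt per query instead of one per row, at the cost of rewriting the blob whenever a name changes.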

However, I can't think of a way to create an "index" like that in
Postgres, either.  Am I overlooking something?  The trick, I think, is
to keep the encrypt/decrypt operations to a minimum even if that
requires encrypting/decrypting a lot of data at once.  Perhaps
something like decrypting to a temporary table, running the queries I
need, and then dumping and encrypting the table back to its binary
field?

Thoughts?

Thank you,
--
David N. Welton

http://www.dedasys.com/

Re: Encryption - searching and sorting

From
Bruno Wolff III
Date:
On Thu, May 03, 2012 at 15:42:00 +0200,
   David Welton <davidw@dedasys.com> wrote:
>
>Thoughts?

Peter Wayner wrote a book, Translucent Databases, that has some techniques
for helping solve problems like this. It won't magically solve your
problem, but might give you some more ideas on how you can do it.

Re: Encryption - searching and sorting

From
Matthias
Date:
2012/5/14 Bruno Wolff III <bruno@wolff.to>:
> On Thu, May 03, 2012 at 15:42:00 +0200,
>  David Welton <davidw@dedasys.com> wrote:
>>
>>
>> Thoughts?

Something I found interesting while researching exactly the same problem:

http://web.mit.edu/ralucap/www/CryptDB-sosp11.pdf

I haven't used any of it because the most interesting index operators
for me are not supported, nor do I know how well it performs in
reality, but the section on encryption and fast searching with the
different algorithms is a really interesting read.

-Matthias

Re: Encryption - searching and sorting

From
Merlin Moncure
Date:
On Thu, May 3, 2012 at 8:42 AM, David Welton <davidw@dedasys.com> wrote:
> Hi,
>
> We have a situation where HIPAA data needs to be encrypted.
> Since we have lots of users, and a number of users who access the data
> of different people, we cannot simply encrypt the disk and call it
> good - it's not fine-grained enough.
>
> So far, we've been encrypting each row, and that actually works out
> fairly well, but now that we need to do searching and sorting, things
> have naturally become a bit more difficult...
>
> I've been testing a few different solutions with our data, which
> shouldn't exceed 10,000 rows or thereabouts, in terms of what needs to
> be encrypted/decrypted/searched and sorted.
>
> I prototyped something like this:
> http://www.doc.ic.ac.uk/teaching/distinguished-projects/2009/w.harrower.pdf
> in Ruby (we're using Rails), and the performance is pretty good (well,
> insertion is pretty slow, but that's ok for us), and also allows us to
> search for substrings.
>
> However, that did nothing about sorting.  So the next idea was to do
> something like this: save a list of names (the data we're storing that
> must be encrypted) and database row ids as a Ruby list, and encrypt
> that: encrypt(marshal([list ... of ... names])).  The advantage over
> having to decrypt single rows is that it seems to be a lot faster to
> decrypt one big chunk of data rather than lots of little things.
> Searching through a list of N thousand names is actually fairly quick
> in Ruby, as is sorting.  So... this would probably work, but it's
> pretty gross as a solution in that we're going to have to manually
> keep a lot of data synced, and it feels awfully strange to be doing
> everything in the application.
>
> However, I can't think of a way to create an "index" like that in
> Postgres, either.  Am I overlooking something?  The trick, I think, is
> to keep the encrypt/decrypt operations to a minimum even if that
> requires encrypting/decrypting a lot of data at once.  Perhaps
> something like decrypting to a temporary table, running the queries I
> need, and then dumping and encrypting the table back to its binary
> field?

When you say 'encrypting each row', what does that mean exactly?
Specifically, where is the encryption key and is it available for use
in the database?  If it isn't, which is naturally more secure, then
sorting on the database in natural order is going to be difficult or
impossible.  Indexed lookups are doable though.  Partial key lookups
*might* be possible if you used CBC mode and a static initialization
vector (which is terribly insecure).
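
One common way to get exact-match indexed lookups without giving the database the encryption key is a keyed hash (sometimes called a blind index) stored next to the ciphertext. The reply above does not name a technique, so this is an assumed illustration rather than the author's method. The helper names (blind_index, build_lookup, find_ids) are hypothetical, and a plain Ruby Hash stands in for an indexed database column. Note that equal names produce equal hashes, so this does reveal which rows share a value, and it supports neither sorting nor partial matches.

```ruby
require "openssl"

# Keyed hash of the normalized value. The index key lives in the
# application and should be separate from the encryption key.
def blind_index(value, index_key)
  normalized = value.strip.downcase
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), index_key, normalized)
end

# Stand-in for an indexed column: maps each hash to the matching row ids.
def build_lookup(rows, index_key)
  lookup = {}
  rows.each do |id, name|
    (lookup[blind_index(name, index_key)] ||= []) << id
  end
  lookup
end

# Exact-match lookup: hash the search term the same way and compare hashes.
def find_ids(lookup, term, index_key)
  lookup.fetch(blind_index(term, index_key), [])
end

index_key = OpenSSL::Random.random_bytes(32)
rows      = { 1 => "Moncure", 2 => "Wolff", 3 => "Welton" }
lookup    = build_lookup(rows, index_key)

puts find_ids(lookup, " WELTON ", index_key).inspect # whole-value match
puts find_ids(lookup, "Wel", index_key).inspect      # prefix only, no match
```

In a real schema the hash would go in its own column with an ordinary btree index, and the query would compare that column against the hash computed by the application.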

merlin