Discussion: Index on a Decrypt / Bytea2Text Function

Index on a Decrypt / Bytea2Text Function

From
Anthony Presley
Date:
Hi all,

We tend to do a lot of lookups on our database that look something like:

select
    e.id
from
    employee e, app_user au
where
    au.id = user_id and
    au.corporation_id = $1 and
    e.ssn is not null and
    e.ssn != ' ' and
    e.ssn != '' and
    e.deleted = 'N' and
    bytea2text(DECRYPT(decode(e.ssn, 'hex'), text2bytea(cast(e.id as text)), 'bf')) = $2

The analyze here looks like:

> explain analyze select e.id from employee e, app_user au where
  au.id=user_id and au.corporation_id=41197 and e.ssn is not null and
  e.ssn!=' ' and e.ssn!='' and e.deleted='N' and
  bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), 'bf'))='188622250';

                                QUERY PLAN
--------------------------------------------------------------------------
 Nested Loop  (cost=0.00..19282.05 rows=122 width=8) (actual time=24.591..192.435 rows=1 loops=1)
   ->  Index Scan using emp_del on employee e  (cost=0.00..18625.99 rows=122 width=16) (actual time=24.556..192.398 rows=1 loops=1)
         Index Cond: (deleted = 'N'::bpchar)
         Filter: ((ssn IS NOT NULL) AND (ssn <> ' '::text) AND (ssn <> ''::text) AND (bytea2text(decrypt(decode(ssn, 'hex'::text), text2bytea((id)::text), 'bf'::text)) = '188622250'::text))
   ->  Index Scan using app_user_pkey on app_user au  (cost=0.00..5.36 rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1)
         Index Cond: (au.id = e.user_id)
         Filter: (au.corporation_id = 41197)
 Total runtime: 192.565 ms
(8 rows)

Almost 100% of this time appears to be spent in the bytea2text() and
decrypt() function calls.

How would I create an index based on the results of the decrypt and
bytea2text function to improve this select statement?

Thanks!


--
Anthony


Re: Index on a Decrypt / Bytea2Text Function

From
Bill Moran
Date:
In response to Anthony Presley <anthony@resolution.com>:

> How would I create an index based on the results of the decrypt and
> bytea2text function to improve this select statement?

The best way would be to unencrypt the column and use a normal index.

Since you're simply using a value in another column as the key anyway,
your design has created all the performance headaches of encryption
with no actual security.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: Index on a Decrypt / Bytea2Text Function

From
Thom Brown
Date:
On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote:
> How would I create an index based on the results of the decrypt and
> bytea2text function to improve this select statement?

Would the following work?:

CREATE INDEX idx_employee_functional ON employee
(bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS
text)),'bf')));

Thom

Re: Index on a Decrypt / Bytea2Text Function

From
Thom Brown
Date:
On 14 July 2010 20:32, Bill Moran <wmoran@potentialtech.com> wrote:
> The best way would be to unencrypt the column and use a normal index.
>
> Since you're simply using a value in another column as the key anyway,
> your design has created all the performance headaches of encryption
> with no actual security.

Yes, I immediately thought about what's actually happening as soon as
I sent the last message.  Forget the functional index.

Thom

Re: Index on a Decrypt / Bytea2Text Function

From
Tom Lane
Date:
Thom Brown <thombrown@gmail.com> writes:
> On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote:
>> select
>>        e.id
>> from
>> employee e ,app_user au
>>        where
>> au.id=user_id and
>> au.corporation_id=$1 and
>> e.ssn is not null and
>> e.ssn!=' ' and
>> e.ssn!='' and
>> e.deleted='N'and
>> bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)),
>> 'bf'))=$2
>>
>> How would I create an index based on the results of the decrypt and
>> bytea2text function to improve this select statement?

> Would the following work?:

> CREATE INDEX idx_employee_functional ON employee
> (bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS
> text)),'bf')));

That would work as far as speeding up the query goes.  However, as Bill
Moran points out nearby, the query reveals a totally incompetent
security design.  There is no value to speak of in encrypting a data
value and then storing the decryption key right beside it.  Perhaps the
excuse is to not have the SSN in cleartext on disk, never mind whether a
halfway competent attacker could get it back --- but even with that
barely-useful goal, you do *not* want an index like this, because all
the index entries will be cleartext SSNs.

What you really need is to take two steps back and figure out why you
want to encrypt this data and what threats you intend to protect
against.  It's probably possible to make a credibly-secure design that
runs faster than this does, but there's no point at all in improving
the performance of a fundamentally broken design.

            regards, tom lane
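
One hedged sketch of what a faster design without cleartext SSNs in any index might look like: store a keyed digest (HMAC) of each SSN in its own indexed column and compare digests instead of decrypting every row. Nothing below comes from the thread: the `ssn_lookup` column, the `blind_index` and `find_employee_ids` helpers, and the application-held `index_key` are all assumptions made for illustration, and SQLite stands in for PostgreSQL only so that the example runs with the Python standard library. A reversibly encrypted copy of the SSN, if the application needs to read it back, would be stored separately and is not shown.

```python
import hashlib
import hmac
import secrets
import sqlite3


def blind_index(ssn: str, index_key: bytes) -> str:
    """Return a keyed digest of the SSN's digits; it supports equality lookups only."""
    digits = "".join(ch for ch in ssn if ch.isdigit())
    return hmac.new(index_key, digits.encode("ascii"), hashlib.sha256).hexdigest()


def find_employee_ids(conn: sqlite3.Connection, ssn: str, index_key: bytes) -> list:
    """Look up non-deleted employees by SSN through the indexed digest column."""
    rows = conn.execute(
        "SELECT id FROM employee WHERE ssn_lookup = ? AND deleted = 'N' ORDER BY id",
        (blind_index(ssn, index_key),),
    )
    return [row[0] for row in rows]


# Assumption: the key is held by the application and never stored in the database.
index_key = secrets.token_bytes(32)

# SQLite stands in for PostgreSQL here; the table is a hypothetical, cut-down employee table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, deleted TEXT, ssn_lookup TEXT)")
conn.execute("CREATE INDEX employee_ssn_lookup ON employee (ssn_lookup)")
conn.executemany(
    "INSERT INTO employee (id, deleted, ssn_lookup) VALUES (?, ?, ?)",
    [
        (1, "N", blind_index("188-62-2250", index_key)),
        (2, "N", blind_index("123-45-6789", index_key)),
    ],
)

# The lookup hashes the search value once and lets the ordinary index do the rest.
matches = find_employee_ids(conn, "188622250", index_key)
```

Because there are only about a billion possible nine-digit SSNs, the digests are only as safe as the key: anyone who obtains `index_key` can enumerate them, so this narrows the exposure rather than eliminating it.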

Re: Index on a Decrypt / Bytea2Text Function

From
Anthony Presley
Date:
On Wed, 2010-07-14 at 15:56 -0400, Tom Lane wrote:
> That would work as far as speeding up the query goes.  However, as Bill
> Moran points out nearby, the query reveals a totally incompetent
> security design.  There is no value to speak of in encrypting a data
> value and then storing the decryption key right beside it.  Perhaps the
> excuse is to not have the SSN in cleartext on disk, never mind whether a
> halfway competent attacker could get it back --- but even with that
> barely-useful goal, you do *not* want an index like this, because all
> the index entries will be cleartext SSNs.
>
> What you really need is to take two steps back and figure out why you
> want to encrypt this data and what threats you intend to protect
> against.  It's probably possible to make a credibly-secure design that
> runs faster than this does, but there's no point at all in improving
> the performance of a fundamentally broken design.
>
>             regards, tom lane
>

Yes, you are right ... the security here serves no purpose other than to
not have SSNs stored on disk unencrypted.  Unfortunately, we need to be
able to reverse the encryption easily and quickly, so that we can get at
the unencrypted data ... because our application exports to payroll
providers, and many of them still use SSNs as keys.  That is, storing
the SSN in an irreversible form (such as a salted one-way hash) won't
work.

In reality, our application has to be able to show / hide the SSN, so
someone breaking into the application (it is likely easier to steal a
manager's password than to hack into the server) would be able to
access the data.

Even if we wanted to tackle *real* security here, I'm not sure how we'd
go about it.  Encrypting any data in a web app would require that the
encryption key and/or salt be stored in some combination of the
database and the app code, all of which is vulnerable if someone breaks
into and/or steals the server.  There isn't a "client" piece, like you'd
have with Carbonite, etc.


--
Anthony


Re: Index on a Decrypt / Bytea2Text Function

From
Anthony Presley
Date:
On Wed, 2010-07-14 at 20:32 +0100, Thom Brown wrote:
> Would the following work?:
>
> CREATE INDEX idx_employee_functional ON employee
> (bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS
> text)),'bf')));

Unfortunately, that doesn't work:

# CREATE INDEX idx_employee_functional ON employee
(bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS
text)),'bf')));
ERROR:  functions in index expression must be marked IMMUTABLE

Guess we'll need to come up with something else.


--
Anthony


Re: Index on a Decrypt / Bytea2Text Function

From
Bill Moran
Date:
In response to Anthony Presley <anthony@resolution.com>:

> Yes, you are right ... the security here serves no purpose other than to
> not have SSNs stored on disk unencrypted.  Unfortunately, we need to be
> able to reverse the encryption easily and quickly, so that we can get at
> the unencrypted data ... because our application exports to payroll
> providers, and many of them still use SSNs as keys.  That is, storing
> the SSN in an irreversible form (such as a salted one-way hash) won't
> work.
>
> In reality, our application has to be able to show / hide the SSN, so
> someone breaking into the application (it is likely easier to steal a
> manager's password than to hack into the server) would be able to
> access the data.
>
> Even if we wanted to tackle *real* security here, I'm not sure how we'd
> go about it.  Encrypting any data in a web app would require that the
> encryption key and/or salt be stored in some combination of the
> database and the app code, all of which is vulnerable if someone breaks
> into and/or steals the server.  There isn't a "client" piece, like you'd
> have with Carbonite, etc.

You need to do more research on this.  I understand that bosses are
arbitrarily requiring "encrypt the SSN" without understanding what they're
asking for, but doing it half-assed like this is irresponsible to the
point of being criminal.

As Tom says, first identify which attack vectors you're protecting against.

If you just want to protect the data if the server is physically stolen,
disk encryption of the partition where PG has its data files is
probably your best bet, and on most OSes is pretty easy to set up.  The
only headache there is that someone has to manually enter the disk
passphrase any time the system is rebooted.

If you want to protect against other attack vectors, such as SQL injection,
it's a little trickier, but still doable.  The complexity depends on
the rules of your access model.

In the simplest case, you generate a shared secret that every SSN is
encrypted with.  You don't keep the shared secret anywhere in the DB.
But when you create users that need access to the SSNs, you create a
copy of the shared secret that's encrypted with their password.  Now,
when they log in, they can decrypt the shared secret and get access to
the SSNs.
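
A minimal sketch of the scheme just described, in Python with only the standard library. Everything named here is an assumption for illustration: the `wrap_secret` and `unwrap_secret` helpers, the record layout, and the PBKDF2 work factor. Since the standard library has no cipher, the 32-byte shared secret is wrapped by XOR with a one-time key derived from the password and a fresh salt, plus an HMAC tag to detect a wrong password; a real system would more likely use a vetted authenticated cipher from a cryptography library.

```python
import hashlib
import hmac
import secrets

PBKDF2_ROUNDS = 100_000  # assumed work factor, chosen only for this sketch


def _derive(password: str, salt: bytes) -> tuple:
    """Derive a 32-byte wrapping key and a 32-byte MAC key from the password."""
    material = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ROUNDS, dklen=64
    )
    return material[:32], material[32:]


def wrap_secret(shared_secret: bytes, password: str) -> dict:
    """Build the per-user record that would be stored alongside the user row."""
    if len(shared_secret) != 32:
        raise ValueError("this sketch wraps exactly 32 bytes")
    salt = secrets.token_bytes(16)  # fresh salt, so each wrapping key is used once
    wrap_key, mac_key = _derive(password, salt)
    wrapped = bytes(a ^ b for a, b in zip(shared_secret, wrap_key))
    tag = hmac.new(mac_key, salt + wrapped, hashlib.sha256).digest()
    return {"salt": salt, "wrapped": wrapped, "tag": tag}


def unwrap_secret(record: dict, password: str) -> bytes:
    """Recover the shared secret, or raise ValueError if the password is wrong."""
    wrap_key, mac_key = _derive(password, record["salt"])
    expected = hmac.new(mac_key, record["salt"] + record["wrapped"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, record["tag"]):
        raise ValueError("wrong password or corrupted record")
    return bytes(a ^ b for a, b in zip(record["wrapped"], wrap_key))


# The shared secret itself is never stored; only the per-user wrapped copies are.
shared_secret = secrets.token_bytes(32)
alice_record = wrap_secret(shared_secret, "alice's password")
bob_record = wrap_secret(shared_secret, "bob's password")

# At login (or on password re-entry) a user's password unlocks the same secret.
recovered = unwrap_secret(alice_record, "alice's password")
```

One consequence of this layout is that a password change only requires re-wrapping that user's copy of the secret, while rotating the shared secret itself means re-encrypting every SSN and re-wrapping every copy.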

It even makes your application look better, because when they go to access
the SSNs, a screen pops up saying "please enter your password again to
verify that your session hasn't been hijacked" ... which you can tout
as a benefit to the users.

If your access rules are more complex (which it doesn't seem that they
are, but I only know so much), the implementation gets more complex, with
hierarchies of shared secrets and interesting grant/revoke routines --
maybe even PKI -- but at its core, it's the same setup as what I just
described, with additional layers.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/