Обсуждение: pgcrypto and database encryption

Поиск
Список
Период
Сортировка

pgcrypto and database encryption

От
Silvana Di Martino
Дата:
Going on with my evaluation of pgcrypto as a possible solution for our (EU)
privacy law problems, I have to report the following facts:

1) pgcrypto (understandably) supplies just a small collections of server-side
functions, not a general-purpose database-encryption system.

2) This means that pgcrypto does not supply you with any password management
tool. All pgcrypto functions expect to receive the password from the calling
SQL SELECT code as a parameter.

3) This means that you have to manage by yourself all the details of the
password one-time entry at the RDBMS start-up time, its storing (in RAM,
only) and its passage to the SQL SELECTs. IMHO, this is still far to satisfy
our need for a standard and reliable solution to our "privacy law and DB
encryption" problem.

4) What could actually solve our problem is something like the following
scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL
parameter called "pw". This parameter would contain a sequence of
comma-separated databasename/encryption-password pairs. I mean, something
like this:

postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner"

Imagine that postmaster stores these passwords in memory (and only in memory)
as database-specific and database-wide string variables. Now, pgcrypto
functions could read the required password directly from memory (like an
environment variable).

In this way, we could have an encrypted database and a simple way to manage
its passwords.

Any comment?

5) There is also a problem related to what pgcrypto can encrypt and what it
cannot. For example: pgcrypto encrypt functions cannot be applied to DATE and
TIME data types because it would mess up them and make them unacceptable by
the RDBMS engine. We would need specific encrypted data types like ENCDATA
and ENCTIME to handle these cases.

Any comment?

PS: I apologize for bothering americans with these all-EU details but it looks
like this mailing list is the only one I can use for communicating with
others EU pgsql-admins affected by this problem.
-----------------------------------------
Alessandro Bottoni and Silvana Di Martino
alessandrobottoni@interfree.it
silvanadimartino@tin.it

Re: pgcrypto and database encryption

От
Stephan Szabo
Дата:
On Sun, 7 Mar 2004, Silvana Di Martino wrote:

> 3) This means that you have to manage by yourself all the details of the
> password one-time entry at the RDBMS start-up time, its storing (in RAM,
> only) and its passage to the SQL SELECTs. IMHO, this is still far to satisfy
> our need for a standard and reliable solution to our "privacy law and DB
> encryption" problem.
>
> 4) What could actually solve our problem is something like the following
> scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL
> parameter called "pw". This parameter would contain a sequence of
> comma-separated databasename/encryption-password pairs. I mean, something
> like this:
>
> postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner"
>
> Imagine that postmaster stores these passwords in memory (and only in memory)
> as database-specific and database-wide string variables. Now, pgcrypto
> functions could read the required password directly from memory (like an
> environment variable).
>
> In this way, we could have an encrypted database and a simple way to manage
> its passwords.
>
> Any comment?
>
> 5) There is also a problem related to what pgcrypto can encrypt and what it
> cannot. For example: pgcrypto encrypt functions cannot be applied to DATE and
> TIME data types because it would mess up them and make them unacceptable by
> the RDBMS engine. We would need specific encrypted data types like ENCDATA
> and ENCTIME to handle these cases.
>
> Any comment?

Who has access to the unencrypted versions of the data? Are there people
who will have access to the database who might have access to some of the
encrypted data but not all or someone who has access to the database but
none of the encrypted data?  If so, then you presumably need a system
where the decryption will only work for the appropriate people and that
needs to be built in.

Also, what manipulation do you want to do with these values in the
database?  Are you going to need to subtract two encrypted dates or
determine that one is greater than the other?  What about substring or
ordering for encrypted strings?  Does this have to be automatic or is
explicit description of the fact that you want to decrypt them and then do
something sufficient?

Finally, there are some questions about reliability of the system as a
whole. In a system like the -pw above, where do those passwords come from,
is it from a human typing at a console?  If so, you need to consider
downtime and recovery time issues based on who has access to the
passwords.  This doesn't so much affect the law in question but it may
affect contracts that you have if they have downtime requirements.


Re: pgcrypto and database encryption

От
Joe Conway
Дата:
Silvana Di Martino wrote:
> 4) What could actually solve our problem is something like the following
> scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL
> parameter called "pw". This parameter would contain a sequence of
> comma-separated databasename/encryption-password pairs. I mean, something
> like this:
>
> postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner"

But you mentioned earlier that the DBA cannot know the passwords, so who
is going to type all that in?

Does the law require protection from a determined DBA, or just casual
viewing by the DBA? *If* it's the latter, you could do something like this:

1. Export an environment variable , say PGMASTERPASS containing a hex
    encoded password, something like:

    PGMASTERPASS=0102000304 pg_ctl start

2. Use a C function to grab the value of the environment variable -- one
    exists in PL/R already. You could write your own based on that.

3. Combine the master password with other information to make it
    sufficiently unique as a key for your various purposes. For example,
    you might use the md5 hashed password for the current user from
    pg_shadow. This combining should be done securely -- I'd recommend
    taking the HMAC of the user password using the master as the key. The
    result of the HMAC becomes your data encryption/decryption key.

> 5) There is also a problem related to what pgcrypto can encrypt and what it
> cannot. For example: pgcrypto encrypt functions cannot be applied to DATE and
> TIME data types because it would mess up them and make them unacceptable by
> the RDBMS engine. We would need specific encrypted data types like ENCDATA
> and ENCTIME to handle these cases.

Just use bytea for the encrypted stuff, and write plpgsql functions to
convert the bytea output of the decrypt function back to its native
datatype. Here's a more-or-less complete example of what I mean by all this:

--8<--------------------------------------------------------------------

create or replace function text2bytea(text) returns bytea as '
begin
  return $1;
end;
' language plpgsql;

create or replace function timestamp2bytea(timestamp with time zone)
returns bytea as '
begin
  return $1;
end;
' language plpgsql;

create or replace function encrypt_timestamp(timestamp with time zone)
returns bytea as '
declare
  v_in alias for $1;
  v_masterpass bytea;
  v_userpass bytea;
  v_key bytea;
  v_data bytea;
  v_iv bytea;           --skip for simplicity at the moment
begin
  select into v_masterpass decode(value,''hex'') from plr_environ()
where name=''PGMASTERPASS'';
  select into v_userpass text2bytea(substr(passwd,4)) from pg_shadow
where usename = current_user;
  v_key := hmac(v_userpass, v_masterpass, ''sha1'');
  v_data := timestamp2bytea(v_in);

  return encrypt(v_data, v_key, ''aes'');
end;
' language plpgsql;

create or replace function decrypt_timestamp(bytea) returns timestamp
with time zone as '
declare
  v_in alias for $1;
  v_masterpass bytea;
  v_userpass bytea;
  v_key bytea;
  v_data bytea;
  v_iv bytea;           --skip for simplicity at the moment
begin
  select into v_masterpass decode(value,''hex'') from plr_environ()
where name=''PGMASTERPASS'';
  select into v_userpass text2bytea(substr(passwd,4)) from pg_shadow
where usename = current_user;
  v_key := hmac(v_userpass, v_masterpass, ''sha1'');
  v_data := decrypt(v_in, v_key, ''aes'');

  return v_data;
end;
' language plpgsql;

-- here you can see the master password
regression=# select decode(value,'hex') from plr_environ() where
name='PGMASTERPASS';
         decode
----------------------
  \001\002\000\003\004
(1 row)

-- here is an encrypted timestamp
regression=# select encrypt_timestamp(now());
                                         encrypt_timestamp
--------------------------------------------------------------------------------------------------
  \340\333*\0221r\177\022e\011_]X
\374\302Y\201\364\264\362\351e\331\006\266\\\331\236\300\256\335
(1 row)

-- and to prove that it works, this example goes full circle
regression=# select decrypt_timestamp(encrypt_timestamp(now()));
        decrypt_timestamp
-------------------------------
  2004-03-07 10:16:56.192193-08
(1 row)

--8<--------------------------------------------------------------------

As I said above (and others in this thread too), if the DBA (or anyone
with root access on the database server) is sufficiently determined,
they can get around this scheme and view whatever data they want. If
you're really concerned about that scenario, the data should be
encrypted in your application before it ever gets sent to the database,
using a key that is unavailable on the database server.

HTH,

Joe


Re: pgcrypto and database encryption

От
matt@ymogen.net
Дата:
> Does the law require protection from a determined DBA, or just casual
> viewing by the DBA? *If* it's the latter, you could do something like
> this:

The EU directive (and therefore the laws of indiviual countries) requires
that  if someone gets access to your *DATABASE* they cannot get personal
details of individuals out of it.  That is all.  It is intended to protect
people against the kind of idiotic organisations that put their customer
lists in an Excel sheet on their extranet without a password.

This thread has covered many interesting and diverting issues, but the
fundamental issue of legal compliance is more than satisfied by:
1)   Encrypting 'personal information' stored in a DB
2)   Keeping the keys on a different server than the DB
3)   Making reasonable efforts [1] to keep those keys secrets


M


[1]  As far as I can tell from discussions with the Data Protection
Registrar, you do not have to protect them against someone rooting the app
server (since that is essentially impossible without silly investments in
specialised hardware or other excessive costs).

Re: pgcrypto and database encryption

От
Silvana Di Martino
Дата:
Alle 17:40, domenica 7 marzo 2004, Stephan Szabo ha scritto:
> Who has access to the unencrypted versions of the data? Are there people
> who will have access to the database who might have access to some of the
> encrypted data but not all or someone who has access to the database but
> none of the encrypted data?  If so, then you presumably need a system
> where the decryption will only work for the appropriate people and that
> needs to be built in.

Yes: each "authorized operator" must be able to access data belonging to him,
only, even if the database contains data belonging to many different
operators. Moreover, SysAdmin (that are not "authorized operator") must not
be able to access any data, even if they are able to access the database (as
a whole) for maintenance. So, decryption must work for authorized operator
only.

> Also, what manipulation do you want to do with these values in the
> database?  Are you going to need to subtract two encrypted dates or
> determine that one is greater than the other?

Yes: we will need to perform basic operations on encrypted dates, times and so
on. Of course, it would be great to be able to delegate the on-the-fly
decryption/encryption work to the encryption system.

> What about substring or
> ordering for encrypted strings?  Does this have to be automatic or is
> explicit description of the fact that you want to decrypt them and then do
> something sufficient?

Same as above: we would need to perform substring-wise operations and ordering
on encrypted strings. The encryption/decryption mechanism should be
automatic.

>
> Finally, there are some questions about reliability of the system as a
> whole. In a system like the -pw above, where do those passwords come from,
> is it from a human typing at a console?  If so, you need to consider
> downtime and recovery time issues based on who has access to the
> passwords.  This doesn't so much affect the law in question but it may
> affect contracts that you have if they have downtime requirements.

The password should come from a human typing at the console (not our choice:
law imposes it). Downtime will be affected: this is out of our control and
must be accepted by our customers.

Thanks for your comments.

-----------------------------------------
Alessandro Bottoni and Silvana Di Martino
alessandrobottoni@interfree.it
silvanadimartino@tin.it

Re: pgcrypto and database encryption

От
Silvana Di Martino
Дата:
Alle 22:16, domenica 7 marzo 2004, matt@ymogen.net ha scritto:
> The EU directive (and therefore the laws of indiviual countries) requires
> that  if someone gets access to your *DATABASE* they cannot get personal
> details of individuals out of it.  That is all.  It is intended to protect
> people against the kind of idiotic organisations that put their customer
> lists in an Excel sheet on their extranet without a password.
>
> This thread has covered many interesting and diverting issues, but the
> fundamental issue of legal compliance is more than satisfied by:
> 1)   Encrypting 'personal information' stored in a DB

Fine! Now tell me: how do you perform such encryption on PostgreSQL? Using
pgcrypto and suppling your password from external (PHP, Python) code? Where
do you store this password? In a config.php file? How many different
encryption/decryption programs/implementations/logics are you willing to have
on your "n" PostgreSQL servers? How do you maintain them (after the original
programmers are gone)?

Using a few commercial RDBMS, it is just a matter to switch the encryption
feature on and supply the required password each time you start the RDBMS
service up. Unfortunately, PostgreSQL does not supply us with such a
comfortable feature. So, how many commercial licenses of your favorite
commercial RDBMS are you willing (or can you afford) to buy to replace all
your PostgreSQL servers? How much time (programmer's working hours, each at
50 Euro average cost) are you  willing to invest in converting your
PostgreSQL databases to SQL Server, for example?

I'm perfectly aware that law is clear and simple. Nevertheless, its
implementation isn't.

> 2)   Keeping the keys on a different server than the DB

Fine. How and when do you supply the password to the encryption/decryption
process? On demand? At postmaster init time? Using which channel/method?
XML-RPC? SOAP? How do you protect them from a hacker's program that tries to
impersonate the legitimate encrypting program and ask for it?

Once again, Devil is in the details...

> 3)   Making reasonable efforts [1] to keep those keys secrets

> [1]  As far as I can tell from discussions with the Data Protection
> Registrar, you do not have to protect them against someone rooting the app
> server (since that is essentially impossible without silly investments in
> specialised hardware or other excessive costs).

What does "reasonable" mean? We defined already that we are not forced to
replace Police in fighting organized crime but we still have to define a lot
of details. For example: 15.000 euro for a new database license and the
porting of data is a "silly investemen" or a "reasonable investement"? Read
the italian law and you will get surprised from the answer.

Again, I'm sorry to bother you all with such details. Just give me a solution
(that is: a PostgreSQL database encryption method I can actually use) and I
will let you alone.

See you.

-----------------------------------------
Alessandro Bottoni and Silvana Di Martino
alessandrobottoni@interfree.it
silvanadimartino@tin.it

Re: pgcrypto and database encryption

От
Silvana Di Martino
Дата:
Alle 18:19, domenica 7 marzo 2004, Joe Conway ha scritto:
> Silvana Di Martino wrote:
> > 4) What could actually solve our problem is something like the following
> > scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL
> > parameter called "pw". This parameter would contain a sequence of
> > comma-separated databasename/encryption-password pairs. I mean, something
> > like this:
> >
> > postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner"
>
> But you mentioned earlier that the DBA cannot know the passwords, so who
> is going to type all that in?

Accordingly to law, a "authorized operator". A piece of paper transforms a
generic user/sysadmin into a trusted person who can perform such operations
(again, not our choice: law imposes it).

> Does the law require protection from a determined DBA, or just casual
> viewing by the DBA? *If* it's the latter, you could do something like this:
>
> 1. Export an environment variable , say PGMASTERPASS containing a hex
>     encoded password, something like:
>
>     PGMASTERPASS=0102000304 pg_ctl start
>
> 2. Use a C function to grab the value of the environment variable -- one
>     exists in PL/R already. You could write your own based on that.
>
> 3. Combine the master password with other information to make it
>     sufficiently unique as a key for your various purposes. For example,
>     you might use the md5 hashed password for the current user from
>     pg_shadow. This combining should be done securely -- I'd recommend
>     taking the HMAC of the user password using the master as the key. The
>     result of the HMAC becomes your data encryption/decryption key.

This looks like a viable solution. Thanks for the idea.

> > 5) There is also a problem related to what pgcrypto can encrypt and what
> > it cannot. For example: pgcrypto encrypt functions cannot be applied to
> > DATE and TIME data types because it would mess up them and make them
> > unacceptable by the RDBMS engine. We would need specific encrypted data
> > types like ENCDATA and ENCTIME to handle these cases.
>
> Just use bytea for the encrypted stuff, and write plpgsql functions to
> convert the bytea output of the decrypt function back to its native
> datatype. Here's a more-or-less complete example of what I mean by all
> this:

Thanks again! I will use it!

> As I said above (and others in this thread too), if the DBA (or anyone
> with root access on the database server) is sufficiently determined,
> they can get around this scheme and view whatever data they want. If
> you're really concerned about that scenario, the data should be
> encrypted in your application before it ever gets sent to the database,
> using a key that is unavailable on the database server.

Yes, sure: the encryption key must be unavailable on the db server (at least,
unavailable on its hard disk). Using Blowfish or Rijandel and a 128 - 1024
character-long password it would be almost impossible to decript our data
with widely available hardware in human-reasonable time. Nevertheless, this
does not mean that the on-the-fly encryption/decryption process must
necessarly take place out of the DB server/host (a hypothesys that would
complicate our work a lot). Even a determined and skilled sysadmin will have
big problems in finding, reading and decrypting a password stored in the
volatile memory of a running (and strictly surveilled) host by a RDBMS
application (a task similar to finding a mimetic computer virus in memory).
Adding to this the fact that the original password could be supplied by a
remote user (or process) operating on a SSH channel (encrypted channel +
user/process PKI authentication on both channel ends), it could be even hard
to get it by impersonating the legitimate RDBMS program and ask for it to the
password server (a "man-in-the-middle" attack).

Thanks again for your help.
-----------------------------------------
Alessandro Bottoni and Silvana Di Martino
alessandrobottoni@interfree.it
silvanadimartino@tin.it

Re: pgcrypto and database encryption

От
Bruno Wolff III
Дата:
On Mon, Mar 08, 2004 at 08:37:37 +0000,
  Silvana Di Martino <silvanadimartino@tin.it> wrote:
> Alle 18:19, domenica 7 marzo 2004, Joe Conway ha scritto:
> > Silvana Di Martino wrote:
> > > 4) What could actually solve our problem is something like the following
> > > scenario. Imagine that postmaster (or pg_ctrl) could accept a new CL
> > > parameter called "pw". This parameter would contain a sequence of
> > > comma-separated databasename/encryption-password pairs. I mean, something
> > > like this:
> > >
> > > postmaster -i -pw=postnuke:"arriba!",phpnuke:"blade runner"
> >
> > But you mentioned earlier that the DBA cannot know the passwords, so who
> > is going to type all that in?
>
> Accordingly to law, a "authorized operator". A piece of paper transforms a
> generic user/sysadmin into a trusted person who can perform such operations
> (again, not our choice: law imposes it).

This sounds like the real solution to your problem. You should do what
you need to to make the system administrator an authorized user and
then use more practical security methods to secure the data.

Re: pgcrypto and database encryption

От
Stephan Szabo
Дата:
On Mon, 8 Mar 2004, Silvana Di Martino wrote:

> Alle 17:40, domenica 7 marzo 2004, Stephan Szabo ha scritto:
> > Who has access to the unencrypted versions of the data? Are there people
> > who will have access to the database who might have access to some of the
> > encrypted data but not all or someone who has access to the database but
> > none of the encrypted data?  If so, then you presumably need a system
> > where the decryption will only work for the appropriate people and that
> > needs to be built in.
>
> Yes: each "authorized operator" must be able to access data belonging to him,
> only, even if the database contains data belonging to many different
> operators. Moreover, SysAdmin (that are not "authorized operator") must not
> be able to access any data, even if they are able to access the database (as
> a whole) for maintenance. So, decryption must work for authorized operator
> only.

That's what I figured, but given a system where the password is given on
startup to the database, you would need another layer for the security,
have you though about how that would work? A not terrible option would be
to compose the key from something given by the user (perhaps at session
start with a query) with something already stored, but I don't know if the
law would allow that.

Also, there's a question about multiple databases in one instance that
might want separate encryption esp where the authorized users might be
non-overlapping sets.  It might be better to allow a human to use a query
to initialize the passwords.

> > Also, what manipulation do you want to do with these values in the
> > database?  Are you going to need to subtract two encrypted dates or
> > determine that one is greater than the other?
>
> Yes: we will need to perform basic operations on encrypted dates, times and so
> on. Of course, it would be great to be able to delegate the on-the-fly
> decryption/encryption work to the encryption system.
>
> > What about substring or
> > ordering for encrypted strings?  Does this have to be automatic or is
> > explicit description of the fact that you want to decrypt them and then do
> > something sufficient?
>
> Same as above: we would need to perform substring-wise operations and ordering
> on encrypted strings. The encryption/decryption mechanism should be
> automatic.

As above, automatic runs into questions about who can do it and maint
issues, like if encryption happens on insert and the admin cannot see the
decrypted form, the admin cannot backup the database using something like
pg_dump.  This is probably best solved by doing whatever is necessary to
make the admin authorized.

It should be reasonably easy to make types that took input and encrypted
it and had say appropriate output function that checked and decrypted
along with implicit casts that do the same thing. It'd be CPU consuming,
but workable.  Here you run into the question of how it's configured
again.

Re: pgcrypto and database encryption

От
Silvana Di Martino
Дата:
Alle 14:47, lunedì 8 marzo 2004, Bruno Wolff III ha scritto:
> > Accordingly to law, a "authorized operator". A piece of paper transforms
> > a generic user/sysadmin into a trusted person who can perform such
> > operations (again, not our choice: law imposes it).
>
> This sounds like the real solution to your problem. You should do what
> you need to to make the system administrator an authorized user and
> then use more practical security methods to secure the data.

I'm going to think very carefully over this possibility... ;-)

-----------------------------------------
Alessandro Bottoni and Silvana Di Martino
alessandrobottoni@interfree.it
silvanadimartino@tin.it

Re: pgcrypto and database encryption

От
Silvana Di Martino
Дата:
Alle 15:08, lunedì 8 marzo 2004, Stephan Szabo ha scritto:
> That's what I figured, but given a system where the password is given on
> startup to the database, you would need another layer for the security,
> have you though about how that would work? A not terrible option would be
> to compose the key from something given by the user (perhaps at session
> start with a query) with something already stored, but I don't know if the
> law would allow that.

> Also, there's a question about multiple databases in one instance that
> might want separate encryption esp where the authorized users might be
> non-overlapping sets.  It might be better to allow a human to use a query
> to initialize the passwords.

I'm not sure (nobody can actually read the mind of our lawmakers) but I think
that using the same "password's password" (aka "global server password") for
all of the databases managed by a single instance of the PostgreSQL server
would be fine if we are able to guarantee that each user has his own
password.

So, we should have this scenario:
- a single "global server password", supplied just once at the server start-up
by a trusted admin via SSH/SSL and used to encrypt/decrypt all other
passwords. This password should never be stored on disk/db on the same host
as the RDBMS. It must be encrypted and kept in volatile memory (RAM), only.
If the server is shut down, the trusted admin will have to supply it again.
- a set of database-specific passwords used to encrypt/decrypt data. These
passwords could be encrypted using the "global server password" and safely
stored into a table of the database. If we use strong encryption and a long
password (Blowfish or Rijandel with a 512-1024 char-long key) to encrypt the
keys it should be very hard (almost impossible) to crack them.

The user that connect to a database, must authenticate himself using Ident,
SSH or Kerberos. Once accepted the user's connection, a set of server-side
function (pgcrypto) encrypt/decryp data using the database-specific key.

This is almost the same technique used by Oracle. To implement it, PostgreSQL
just lacks a way to supply the "global server password" at start-up. A SQL
Query cannot be used for this both because the password must be visible to
all the database managed by a server and because this password must not be
stored inside the db.

Note that the trusted administrator _can_ be replaced by a server process if:
- the client process that asks for a "global server password" can authenticat
itself in a reliable way (SSH, Kerberos, SSH any other PKI) as a trusted
PostgreSQL server.
- the communication channel is encrypted (SSH/SSL)

> As above, automatic runs into questions about who can do it and maint
> issues, like if encryption happens on insert and the admin cannot see the
> decrypted form, the admin cannot backup the database using something like
> pg_dump.  This is probably best solved by doing whatever is necessary to
> make the admin authorized.

Well, data encryption is meant _also_ to protect data from the admin's eye. As
long as I can see, there is no need to access or decrypt data for performing
backup, restore and any other database-level operation. From the admin point
of view, the database can and should be a black box.

> It should be reasonably easy to make types that took input and encrypted
> it and had say appropriate output function that checked and decrypted
> along with implicit casts that do the same thing. It'd be CPU consuming,
> but workable.  Here you run into the question of how it's configured
> again.

Specific encrytped data types are coming out to be a real need. Without them,
I cannot see how we could perform data sorting, comparison and any other
basic operation. IMHO, this topic should be brought to the attention of the
PostgreSQL developers with the intent to develop a standard set of reliable
encrypted data types. I'm sure that performance would be hugely affected by
encryption but, again, this is out of our reach: law imposes it.

-----------------------------------------
Alessandro Bottoni and Silvana Di Martino
alessandrobottoni@interfree.it
silvanadimartino@tin.it

Re: pgcrypto and database encryption

От
Stephan Szabo
Дата:
On Mon, 8 Mar 2004, Silvana Di Martino wrote:

> Alle 15:08, luned� 8 marzo 2004, Stephan Szabo ha scritto:
> > That's what I figured, but given a system where the password is given on
> > startup to the database, you would need another layer for the security,
> > have you though about how that would work? A not terrible option would be
> > to compose the key from something given by the user (perhaps at session
> > start with a query) with something already stored, but I don't know if the
> > law would allow that.
>
> > Also, there's a question about multiple databases in one instance that
> > might want separate encryption esp where the authorized users might be
> > non-overlapping sets.  It might be better to allow a human to use a query
> > to initialize the passwords.
>
> I'm not sure (nobody can actually read the mind of our lawmakers) but I think
> that using the same "password's password" (aka "global server password") for
> all of the databases managed by a single instance of the PostgreSQL server
> would be fine if we are able to guarantee that each user has his own
> password.

Except that you then need someone who is authorized to view all the
databases correct since that user would potentially be able to get all the
keys?  That still seems to miss the case where the sets of authorized
users have no intersections (which might come up in shared hosting).

> The user that connect to a database, must authenticate himself using Ident,
> SSH or Kerberos. Once accepted the user's connection, a set of server-side
> function (pgcrypto) encrypt/decryp data using the database-specific key.

I think there might still be issues with people who have access to some
(presumably non-encrypted) data in the database while not having access to
the encrypted data unless the SQL permissions system were sufficient.

> This is almost the same technique used by Oracle. To implement it, PostgreSQL
> just lacks a way to supply the "global server password" at start-up. A SQL
> Query cannot be used for this both because the password must be visible to
> all the database managed by a server and because this password must not be
> stored inside the db.

I think C functions with access to a shared memory block wouldn't fail
those two considerations.

> Note that the trusted administrator _can_ be replaced by a server process if:
> - the client process that asks for a "global server password" can authenticat
> itself in a reliable way (SSH, Kerberos, SSH any other PKI) as a trusted
> PostgreSQL server.
> - the communication channel is encrypted (SSH/SSL)

This sounds alot better all in all.  This also seems like it might be
possible to prototype outside the database proper so as to get an idea of
what it would look like.

> > As above, automatic runs into questions about who can do it and maint
> > issues, like if encryption happens on insert and the admin cannot see the
> > decrypted form, the admin cannot backup the database using something like
> > pg_dump.  This is probably best solved by doing whatever is necessary to
> > make the admin authorized.
>
> Well, data encryption is meant _also_ to protect data from the admin's eye. As
> long as I can see, there is no need to access or decrypt data for performing
> backup, restore and any other database-level operation. From the admin point
> of view, the database can and should be a black box.

But, given the fact that the form of the backup is a set of SQL
statements, if the encryption is automatic on input, either the SQL
statements contain encrypted data and you need some way to on restore tell
the system to not encrypted them again or they contain decrypted data. If
accessing the encrypted data without being properly authorized throws an
error rather than giving you the encrypted form, how does the admin do the
selects necessary on the table to even get the encrypted form.

> > It should be reasonably easy to make types that took input and encrypted
> > it and had say appropriate output function that checked and decrypted
> > along with implicit casts that do the same thing. It'd be CPU consuming,
> > but workable.  Here you run into the question of how it's configured
> > again.
>
> Specific encrytped data types are coming out to be a real need. Without them,
> I cannot see how we could perform data sorting, comparison and any other
> basic operation. IMHO, this topic should be brought to the attention of the
> PostgreSQL developers with the intent to develop a standard set of reliable
> encrypted data types. I'm sure that performance would be hugely affected by

Given that it's possible that the types might be an input function, an
output function and one or more type conversion functions, I'd think it'd
be possible to prototype them first.

Re: pgcrypto and database encryption

От
Silvana Di Martino
Дата:
Alle 22:07, lunedì 8 marzo 2004, Stephan Szabo ha scritto:
> > I'm not sure (nobody can actually read the mind of our lawmakers) but I
> > think that using the same "password's password" (aka "global server
> > password") for all of the databases managed by a single instance of the
> > PostgreSQL server would be fine if we are able to guarantee that each
> > user has his own password.
>
> Except that you then need someone who is authorized to view all the
> databases correct since that user would potentially be able to get all the
> keys?  That still seems to miss the case where the sets of authorized
> users have no intersections (which might come up in shared hosting).

Yes, right! The DBAdmin would be authenticated and allowed to access _all_ of
the databases living on a server and, as a consequence, the encryption system
would decrypt and show him _all_ of the data in clear form. To avoid this, we
would need an access control system able to distingush a DBAdmin from a
authorized operator and act consquentely. This could be done with pgcrypto
and PL/PGSQL functions if we had access to the current login username from
within a PL/PGSL function. If not, we could likely do it from within a C
function. In any case, we will have to implement the whole system.

In any case, there will always be an intersection point among all user groups:
the DBA(s).

> > The user that connect to a database, must authenticate himself using
> > Ident, SSH or Kerberos. Once accepted the user's connection, a set of
> > server-side function (pgcrypto) encrypt/decryp data using the
> > database-specific key.
>
> I think there might still be issues with people who have access to some
> (presumably non-encrypted) data in the database while not having access to
> the encrypted data unless the SQL permissions system were sufficient.

While pgcrypto can work at a column level, our hypothetical system will work
at a database level (because its dependence from the PostgreSQL
authentication system). In order to be able to allow the access to a
table/column of a db, while forbidding the access to another table/access of
the same db, we should be able to read the privileges of the current user and
act consequently. I'm not sure but I think there is not any way to read the
privileges of the current user from within a PL/PGSQL or C function so this
would be impossible.

> > This is almost the same technique used by Oracle. To implement it,
> > PostgreSQL just lacks a way to supply the "global server password" at
> > start-up. A SQL Query cannot be used for this both because the password
> > must be visible to all the database managed by a server and because this
> > password must not be stored inside the db.
>
> I think C functions with access to a shared memory block wouldn't fail
> those two considerations.

I agree.

> > Note that the trusted administrator _can_ be replaced by a server process
> > if: - the client process that asks for a "global server password" can
> > authenticat itself in a reliable way (SSH, Kerberos, SSH any other PKI)
> > as a trusted PostgreSQL server.
> > - the communication channel is encrypted (SSH/SSL)
>
> This sounds alot better all in all.  This also seems like it might be
> possible to prototype outside the database proper so as to get an idea of
> what it would look like.

I like this approach much more than any other. A "password server" based on
HTTP, SOAP or XML-RPC (and protected by SSL or SSH) would be an interesting
general purpose tool per se. Its use with PostgreSQL would "just" require to
add to the RDBMS engine the ability to communicate with the password server
via HTTP, SOAP or XML-RPC. That is: another interesting general purpose
feature of the RDBMS engine per se.

> But, given the fact that the form of the backup is a set of SQL
> statements, if the encryption is automatic on input, either the SQL
> statements contain encrypted data and you need some way to on restore tell
> the system to not encrypted them again or they contain decrypted data. If
> accessing the encrypted data without being properly authorized throws an
> error rather than giving you the encrypted form, how does the admin do the
> selects necessary on the table to even get the encrypted form.

Yes, right... unfortunatley. We would need a way to tell the system to always
backup _encrypted_ data and restore them without re-encrypting. A specific
couple of programs like pg_enc_dump and pg_enc_restore would be needed. The
internals of such programs could be managed in this way:
- always backup (download) data in their original format (encrypted if they
are encrypted, clear if they are clear)
- from within your SQL dump file, when needed, signal to the RDBMS engine that
we are restoring an encrypted DB. This should be used by SQL code (pgcrypto
functions) to prevent a second, inopportune encryption of encrypted data. I
think that either a SQL query or a specific, hypothetical "sql file header"
variable could be used for this task.
Again, a specific set of encrypted data types, each including the intelligence
needed to handle this conditional encryption, will be of great help.

> > Specific encrytped data types are coming out to be a real need. Without
> > them, I cannot see how we could perform data sorting, comparison and any
> > other basic operation. IMHO, this topic should be brought to the
> > attention of the PostgreSQL developers with the intent to develop a
> > standard set of reliable encrypted data types. I'm sure that performance
> > would be hugely affected by
>
> Given that it's possible that the types might be an input function, an
> output function and one or more type conversion functions, I'd think it'd
> be possible to prototype them first.

I agree. This could be a good starting point. Pgcrypto could be used as a
basis for this code, apparently.

See you.
-----------------------------------------
Alessandro Bottoni and Silvana Di Martino
alessandrobottoni@interfree.it
silvanadimartino@tin.it