Discussion: Re: Key encryption and relational integrity
On 26/03/2019 18:08, Adrian Klaver wrote:
> On 3/26/19 9:08 AM, Moreno Andreo wrote:
>> On 26/03/2019 15:24, Adrian Klaver wrote:
>>> On 3/26/19 7:19 AM, Moreno Andreo wrote:
>>>> Hello folks :-)
>>>>
>>>> Is there any workaround to implement key encryption without
>>>> breaking relational integrity?
>>>
>>> This is going to need more information.
>> OK, I'll try to be as clear as I can
>>> For starters 'key' has separate meanings for encryption and RI. I
>>> could make some guesses about what you want, but to avoid false
>>> assumptions a simple example would be helpful.
>> In a master-detail relation, I need to encrypt one of master table PK
>> or detail table FK, in order to achieve pseudonymization, required by
>> GDPR in Europe when managing particular data
>> Imagine I have
>> Table users
>> id   surname   last name
>> 1    John      Doe
>> 2    Jane      Doe
>> 3    Foo       Bar
>>
>> Table medications
>> id   user_id   med
>> 1    1         Medication
>> 2    1         Ear check
>> ...
>> ...
>> medications.user_id is FK on users.id
>> we should achieve
>>
>> Table medications
>> id   user_id     med
>> 1    sgkighs98   Medication
>> 2    sghighs98   Ear check
>>
>> or the opposite (users.id encryption and medications.user_id kept plain)
>>
>> At a first glance, it IS breaking relational integrity, so is there a
>> way to manage this encryption internally so RI is kept safe?
>
> Not that I know of. RI is based on maintaining a link between parent
> and child. So by definition you would be able to get to the parent
> record via the child.
That's what I was afraid of :-(
>
> A quick search on pseudonymisation found a boatload of interpretations
> of how to implement this:
>
> "'Pseudonymisation' means the processing of personal data in such a
> manner that the personal data can no longer be attributed to a
> specific data subject without the use of additional information,
> provided that such additional information is kept separately and is
> subject to technical and organisational measures to ensure that the
> personal data are not attributed to an identified or identifiable
> natural person."
>
>
> To me it would seem something like:
>
> Table medications
> id   user_id     med
> 1    sgkighs98   Medication
> 2    sghighs98   Ear check
>
>
>
> Table users
> id          surname   last name
> sgkighs98   John      Doe
> jkopkl1     Jane      Doe
> uepoti21    Foo       Bar
>
> Where there is no direct link between the two.

Are you sure there isn't?... the key "sgkighs98" is present in both
tables and I can join the tables on that field, so pseudonymisation
does not apply, it's just "separation" (that was OK with the last
privacy act, but not with GDPR).

The problem is not on the application side... there you can do almost
anything you want to do. The problem is that if someone breaks into the
server (data breach) it is easy to join patients and their medications.

> Instead permissions would prevent linking from medications to users
> even via a SELECT. One could also use pgcrypto:
>
> https://www.postgresql.org/docs/10/pgcrypto.html
>
> on the users table to further hide the personal info.
That's what I tried: I used it to encrypt first name, last name, street
address and some other fields (that would be the best solution because
RI was not impacted at all), but the customer stated that they have to
perform real-time search (like when you type in the Google search box),
and the query that has to decrypt all names and return only the ones
that begin with a certain set of characters is way too slow (tried on a
good i7 configuration, that's about 2 seconds for each key pressed on a
2500-row table). So I dropped this approach.
>
> *NOTE* I am not a lawyer so any advice on my part as to meeting legal
> requirements are just me thinking out loud. I would suggest, if not
> already done, getting proper legal advice on what the section quoted
> above actually means.

Relax, I'm not here to ask and then sue anyone :-)
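
The slow search described above comes from decrypting every name on each keystroke. One workaround that is sometimes discussed for this situation, and that nobody in this thread proposed, is a keyed "blind index": next to each encrypted name, store HMAC tokens of its first few characters and look up the token of what the user typed. The sketch below is only an illustration under stated assumptions: `INDEX_KEY`, `MAX_PREFIX`, `prefix_tokens` and `search` are hypothetical names, and a plain dict stands in for the table.

```python
import hashlib
import hmac

# Hypothetical key (assumption): it would have to live outside the database server.
INDEX_KEY = b"example-index-key"
# Assumption: only the first few characters of each name are indexed.
MAX_PREFIX = 4


def token(prefix: str) -> str:
    """Keyed hash (HMAC-SHA256, hex) of a lower-cased prefix."""
    return hmac.new(INDEX_KEY, prefix.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def prefix_tokens(name: str) -> set[str]:
    """Tokens for every prefix of `name`, up to MAX_PREFIX characters."""
    return {token(name[:n]) for n in range(1, min(len(name), MAX_PREFIX) + 1)}


# Stand-in for a table: row id -> tokens stored next to the encrypted name.
index = {
    1: prefix_tokens("Doe"),
    2: prefix_tokens("Dorian"),
    3: prefix_tokens("Bar"),
}


def search(typed: str) -> list[int]:
    """Return ids of candidate rows for `typed` without decrypting any name."""
    wanted = token(typed[:MAX_PREFIX])
    return sorted(row_id for row_id, tokens in index.items() if wanted in tokens)
```

Input longer than `MAX_PREFIX` only narrows the result to candidates, which would still have to be decrypted and filtered by the application. The tokens also reveal which rows share a prefix, so this trades some confidentiality for lookup speed.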
On 3/28/19 10:36 AM, Moreno Andreo wrote:
> On 26/03/2019 18:08, Adrian Klaver wrote:
>> On 3/26/19 9:08 AM, Moreno Andreo wrote:
>>> On 26/03/2019 15:24, Adrian Klaver wrote:
>>>> On 3/26/19 7:19 AM, Moreno Andreo wrote:
>>>>> Hello folks :-)
>>>>>
>>>>> Is there any workaround to implement key encryption without
>>>>> breaking relational integrity?
>>>>
>>>> This is going to need more information.
>>> OK, I'll try to be as clear as I can
>>>> For starters 'key' has separate meanings for encryption and RI. I
>>>> could make some guesses about what you want, but to avoid false
>>>> assumptions a simple example would be helpful.
>>> In a master-detail relation, I need to encrypt one of master table PK
>>> or detail table FK, in order to achieve pseudonymization, required by
>>> GDPR in Europe when managing particular data
>>> Imagine I have
>>> Table users
>>> id   surname   last name
>>> 1    John      Doe
>>> 2    Jane      Doe
>>> 3    Foo       Bar
>>>
>>> Table medications
>>> id   user_id   med
>>> 1    1         Medication
>>> 2    1         Ear check
>>> ...
>>> ...
>>> medications.user_id is FK on users.id
>>> we should achieve
>>>
>>> Table medications
>>> id   user_id     med
>>> 1    sgkighs98   Medication
>>> 2    sghighs98   Ear check
>>>
>>> or the opposite (users.id encryption and medications.user_id kept plain)
>>>
>>> At a first glance, it IS breaking relational integrity, so is there a
>>> way to manage this encryption internally so RI is kept safe?
>>
>> Not that I know of. RI is based on maintaining a link between parent
>> and child. So by definition you would be able to get to the parent
>> record via the child.
> That's what I was afraid of :-(
>>
>> A quick search on pseudonymisation found a boatload of interpretations
>> of how to implement this:
>>
>> "'Pseudonymisation' means the processing of personal data in such a
>> manner that the personal data can no longer be attributed to a
>> specific data subject without the use of additional information,
>> provided that such additional information is kept separately and is
>> subject to technical and organisational measures to ensure that the
>> personal data are not attributed to an identified or identifiable
>> natural person."
>>
>>
>> To me it would seem something like:
>>
>> Table medications
>> id   user_id     med
>> 1    sgkighs98   Medication
>> 2    sghighs98   Ear check
>>
>>
>>
>> Table users
>> id          surname   last name
>> sgkighs98   John      Doe
>> jkopkl1     Jane      Doe
>> uepoti21    Foo       Bar
>>
>> Where there is no direct link between the two.
>
> Are you sure there isn't?... the key "sgkighs98" is present on both
> tables and I can join tables on that field, so the pseudonymisation does
> not apply, it's just "separation" (that was OK with the last privacy
> act, but not with GDPR

Yes, but you can use permissions to make the users table unreachable by
folks with insufficient permission.

>
> The problem is not on the application side... there you can do almost
> anything you want to do. The problem is that if someone breaks in the
> server (data breach) it is easy to join patients and their medications.

That really depends on what level of user they break in as. That is a
separate security issue. It is also the difference between
pseudonymisation and anonymization, where the latter makes the data
totally unrelated to an individual's personal information.

>
>> Instead permissions would prevent linking from medications to users
>> even via a SELECT. One could also use pgcrypto:
>>
>> https://www.postgresql.org/docs/10/pgcrypto.html
>>
>> on the users table to further hide the personal info.
> That's what I used to try to encrypt first name, last name, street
> address and some other fields (that would be the best solution because
> RI was not impacted at all), but the customer stated that they have to
> perform real-time search (like when you type in the Google search box),
> and the query that has to decrypt all names and return only the ones
> that begin with a certain set of characters is way too slow (tried on a
> good i7 configuration, that's about 2 seconds for each key pressed on a
> 2500-row table). So I dropped this approach.
>>
>> *NOTE* I am not a lawyer so any advice on my part as to meeting legal
>> requirements are just me thinking out loud. I would suggest, if not
>> already done, getting proper legal advice on what the section quoted
>> above actually means.
> Relax, I'm not here to ask and then sue anyone :-)

Hey, I live in the US, it's just best policy to make that clear :)

>
>
>

--
Adrian Klaver
adrian.klaver@aklaver.com
On 2019-03-28 18:36:40 +0100, Moreno Andreo wrote:
> On 26/03/2019 18:08, Adrian Klaver wrote:
> > To me it would seem something like:
> >
> > Table medications
> > id   user_id     med
> > 1    sgkighs98   Medication
> > 2    sghighs98   Ear check
> >
> >
> >
> > Table users
> > id          surname   last name
> > sgkighs98   John      Doe
> > jkopkl1     Jane      Doe
> > uepoti21    Foo       Bar
> >
> > Where there is no direct link between the two.
>
> Are you sure there isn't?... the key "sgkighs98" is present on both
> tables and I can join tables on that field, so the pseudonymisation
> does not apply,

Yes. It doesn't matter whether the key is 'sgkighs98' or 1438 or
692da0c1-cf2d-476d-8910-7f82c050f8fe.

> it's just "separation" (that was OK with the last privacy act, but not
> with GDPR

I doubt that this is correct. The GDPR doesn't prescribe specific
technical means (there may be laws or standards in your country which
prescribe such means for medical data, but that's not the GDPR).

> The problem is not on the application side... there you can do almost
> anything you want to do. The problem is that if someone breaks in the
> server (data breach) it is easy to join patients and their
> medications.

I sure hope that the doctors are able to join patients and their
medications. So at some level that has to be possible. If you assume a
break-in into the server, there will always be a level of penetration at
which the attacker will be able to access any data an authorized user
can access.

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
On 28/03/2019 23:29, Peter J. Holzer wrote:
> On 2019-03-28 18:36:40 +0100, Moreno Andreo wrote:
>> On 26/03/2019 18:08, Adrian Klaver wrote:
>>> To me it would seem something like:
>>>
>>> Table medications
>>> id   user_id     med
>>> 1    sgkighs98   Medication
>>> 2    sghighs98   Ear check
>>>
>>>
>>>
>>> Table users
>>> id          surname   last name
>>> sgkighs98   John      Doe
>>> jkopkl1     Jane      Doe
>>> uepoti21    Foo       Bar
>>>
>>> Where there is no direct link between the two.
>> Are you sure there isn't?... the key "sgkighs98" is present on both
>> tables and I can join tables on that field, so the pseudonymisation
>> does not apply,
> Yes. It doesn't matter whether the key is 'sgkighs98' or 1438 or
> 692da0c1-cf2d-476d-8910-7f82c050f8fe.
>
>> it's just "separation" (that was OK with the last privacy act, but not
>> with GDPR
> I doubt that this is correct. The GDPR doesn't prescribe specific
> technical means (there may be laws or standards in your country which
> prescribe such means for medical data, but that's not the GDPR).

That is what a privacy consultant told me: there was an Italian law
(196/2003) that introduced "minimal security measures", which was
revoked when the GDPR came into application.

>> The problem is not on the application side... there you can do almost
>> anything you want to do. The problem is that if someone breaks in the
>> server (data breach) it is easy to join patients and their
>> medications.
> I sure hope that the doctors are able to join patients and their
> medications. So at some level that has to be possible.

It would be possible at the application level, which resides on another
server (so it would comply with the separation between the
pseudonymisation and the method to reverse it).

> If you assume a
> break-in into the server, there will always be a level of penetration at
> which the attacker will be able to access any data an authorized user
> can access.

That's not what I got from reading the GDPR article... but I may have
misunderstood (legal text is not my cup of tea). My understanding was
that even in a data breach event there should be a mechanism that
prevents (or "mitigates the risk that") the attacker from gaining access
to the data in the "joined" form, so he cannot learn that patient John
Doe has got Alzheimer's, for instance, but only that in that database
there is a patient whose name is John Doe and someone who has got
Alzheimer's.

And I tried to find a solution, and since I did not much like what I
found (and it seems that neither do you :-) ), I came here hoping that
someone would share their experience to shed some light on the topic.

> hp
>
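
The arrangement described in this message, where the database holds only a pseudonym and the means to reverse it sits with the application on another server, could look roughly like the sketch below. It is an illustration under stated assumptions rather than anyone's actual implementation: the keyed hash (HMAC-SHA256), the name `PSEUDONYM_KEY` and the helper functions are all hypothetical, and in-memory structures stand in for the two tables.

```python
import hashlib
import hmac

# Hypothetical secret (assumption): held by the application server only,
# never stored in the database.
PSEUDONYM_KEY = b"kept-on-the-application-server"


def pseudonym(user_id: int, key: bytes = PSEUDONYM_KEY) -> str:
    """Derive a stable pseudonym for a user id with HMAC-SHA256."""
    digest = hmac.new(key, str(user_id).encode("ascii"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability only


# What the database would hold: plain ids in users, pseudonyms in medications.
users = {1: "John Doe", 2: "Jane Doe", 3: "Foo Bar"}
medications = [
    (1, pseudonym(1), "Medication"),
    (2, pseudonym(1), "Ear check"),
]


def medications_for(user_id: int, key: bytes) -> list[str]:
    """Application-side join: it only works with the right key."""
    wanted = pseudonym(user_id, key)
    return [med for _, pid, med in medications if hmac.compare_digest(pid, wanted)]
```

With this layout the database can no longer declare `medications.user_id` as a foreign key on `users.id`, which is the loss of relational integrity the thread started from. Anyone who obtains the key can redo the join, for example by computing the pseudonym of every id in `users`.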
On 2019-03-29 17:01:07 +0100, Moreno Andreo wrote:
> On 28/03/2019 23:29, Peter J. Holzer wrote:
> > On 2019-03-28 18:36:40 +0100, Moreno Andreo wrote:
> > > it's just "separation" (that was OK with the last privacy act, but not
> > > with GDPR
> > I doubt that this is correct. The GDPR doesn't prescribe specific
> > technical means (there may be laws or standards in your country which
> > prescribe such means for medical data, but that's not the GDPR).
>
> That was told me by a privacy consultant, there was an Italian law
> (196/2003) that introduced "minimal security measures", that has been
> revoked with the GDPR appliance.
>
> > > The problem is not on the application side... there you can do almost
> > > anything you want to do. The problem is that if someone breaks in the
> > > server (data breach) it is easy to join patients and their
> > > medications.
> > I sure hope that the doctors are able to join patients and their
> > medications. So at some level that has to be possible.
> It would be possible at application level, that resides on another server
> (so it would be compliant the separation between the pseudonymisation and
> the reverse method)

But why would you assume that an attacker cannot get access to that
"other server"?

> > If you assume a break-in into the server, there will always be a
> > level of penetration at which the attacker will be able to access
> > any data an authorized user can access.
>
> That's not what I got reading the GDPR article... but I may have
> misunderstood (juridic text is not my cup of tea). My understanding was that
> even in a data breach event there should be a mechanism that prevents (or
> "mitigate the risk that") the attacker to gain access to the data in the

Quoting from article 32 of the GDPR:

| Taking into account the state of the art, the costs of implementation
| and the nature, scope, context and purposes of processing as well as the
| risk of varying likelihood and severity for the rights and freedoms of
| natural persons, the controller and the processor shall implement
| appropriate technical and organisational measures to ensure a level of
| security appropriate to the risk, including inter alia as appropriate:

This is basically the gist of the technical part of the GDPR. The
controller and processor are responsible to "ensure a level of security
appropriate to the risk", and it is their job to determine how to do
that. The GDPR doesn't say how to do that at all (the legislators were
wise enough to see that any attempt to do that would result in a mess).

So you can't say "the GDPR says we have to do it this way" (and if your
consultant says that it is probably time to get a different one). You
have to consider all the risks (and yes, an attacker getting access to
some or all of the data is a risk, but a doctor not being able to access
a patient's records is also a risk) and implement the best you can do
considering "the state of the art, the costs of implementation", etc.

> "joined" form, so he cannot acquire that patient John Doe has got Alzheimer,
> for instance, but only that in that database there is a patient which name
> is John Doe and someone that has got Alzheimer.

I'm not talking about the GDPR here, but about the technical
impossibility. If an authorized user (say a doctor or a nurse) can get
the information that John Doe has Alzheimer's (and as a patient one
would hope that they can), then there will *always* be a way for an
attacker to acquire the privileges of that authorized user and get the
same information. There is no way around that. You can make it harder,
but you can't prevent it.

A much better way (IMHO) is to reduce the attack surface: Store only
data you need, allow access only for personnel who are actually
involved in treating that patient, use good authentication, physically
separate systems which can access the data from the internet, don't
throw printouts into the waste paper (don't laugh - that happened). If
there are people who need access to pseudonymized or aggregate data,
copy that data to a separate system ...

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
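
The last suggestion above, copying pseudonymized data to a separate system for people who do not need identities, can be sketched as follows. This is only an illustration under stated assumptions: SQLite in-memory databases stand in for the production server and the separate system, the split into first and last name is made up for the example, and `EXPORT_KEY`, `build_production` and `export_pseudonymized` are hypothetical names. The production side keeps its ordinary foreign key, so relational integrity there is untouched.

```python
import hashlib
import hmac
import sqlite3

# Hypothetical key (assumption): kept away from the system that receives the copy.
EXPORT_KEY = b"hypothetical-export-key"


def pseudonym(user_id: int) -> str:
    """Keyed, stable replacement for a user id (HMAC-SHA256, shortened)."""
    return hmac.new(EXPORT_KEY, str(user_id).encode("ascii"), hashlib.sha256).hexdigest()[:16]


def build_production() -> sqlite3.Connection:
    """Stand-in for the production database, with a normal foreign key."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
        CREATE TABLE medications (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id),
            med TEXT
        );
        INSERT INTO users VALUES (1, 'John', 'Doe'), (2, 'Jane', 'Doe'), (3, 'Foo', 'Bar');
        INSERT INTO medications VALUES (1, 1, 'Medication'), (2, 1, 'Ear check');
        """
    )
    return db


def export_pseudonymized(prod: sqlite3.Connection) -> sqlite3.Connection:
    """Copy medications to a separate database: no users table, pseudonyms instead of ids."""
    copy = sqlite3.connect(":memory:")
    copy.execute("CREATE TABLE medications (id INTEGER PRIMARY KEY, subject TEXT, med TEXT)")
    rows = prod.execute("SELECT id, user_id, med FROM medications ORDER BY id").fetchall()
    copy.executemany(
        "INSERT INTO medications VALUES (?, ?, ?)",
        [(med_id, pseudonym(user_id), med) for med_id, user_id, med in rows],
    )
    copy.commit()
    return copy
```

Rows belonging to the same patient still share a `subject` value in the copy, so per-patient statistics remain possible there while names never leave production.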
On 3/29/19 9:01 AM, Moreno Andreo wrote:

> And I tried to find a solution, and since I did not like that much what
> I found (and it seems that neither you do :-) ), I came here hoping that
> someone would share his experience to shed some light on the topic.

From what you have posted, the biggest issue you are having is less
than real-time search on patient names due to the need to meet
pseudonymisation. To me that is always going to be a problem as there
are two opposing forces at work: overhead to implement pseudonymisation
vs quick lookup. Might be time to lower expectations on what can be
done.

>
>
>> hp
>>
>

--
Adrian Klaver
adrian.klaver@aklaver.com
On 29/03/2019 20:23, Adrian Klaver wrote:
> On 3/29/19 9:01 AM, Moreno Andreo wrote:
>
>> And I tried to find a solution, and since I did not like that much
>> what I found (and it seems that neither you do :-) ), I came here
>> hoping that someone would share his experience to shed some light on
>> the topic.
>
> From what you have posted the biggest issue you are having is less
> than real time search on patient names due to the need to meet
> pseudonymisation. To me that is always going to be a problem as there
> are two opposing forces at work, overhead to implement
> pseudonymisation vs quick lookup. Might be time to lower expectations
> on what can be done.

... or just do NOT meet pseudonymisation at all, but try to enforce
other rules suggested by GDPR.

Peter highlighted a point

"The GDPR doesn't say how to do that at all (the legislators were wise
enough that any attempt to do that would result in a mess).

So you can't say "the GDPR says we have to do it this way" (and if your
consultant says that it is probably time to get a different one). You
have to consider all the risks (and yes, an attacker getting access to
some or all of the data is a risk, but a doctor not being able to access
a patient's records is also a risk) and implement the best you can do
considering "the state of the art, the costs of implementation", etc."

that seems absolutely right to me. I'm not forced to use
pseudonymisation if there's a risk of making things worse in a system.
I've got to talk about these "two opposing forces at work" with a
privacy expert (maybe choosing another one, as Peter suggested :-) ) and
ask him whether they could be used as grounds for declining
pseudonymisation because "pseudonymisation puts overall performance or
database integrity at risk".

What do you think?

>
>>
>>
>>> hp
>>>
>>
>
On 01/04/19, Moreno Andreo (moreno.andreo@evolu-s.it) wrote:
...
> I'm not forced to use pseudonymisation if there's the risk to get
> things worse in a system. I've got to speak about these "two opposing
> forces at work" to a privacy expert (maybe choosing another one, as
> Peter suggested :-) ) and ask him if it could be used as a matter of
> declining pseudonymisation because of "pseudonymisation puts at risk
> overall performance or database integrity"

How to interpret the pseudonymisation conditions is ... complicated. The
UK's Information Commissioner's Office (ICO) writes that
pseudonymisation relates to:

    “…the processing of personal data in such a manner that the personal
    data can no longer be attributed to a specific data subject without
    the use of additional information, provided that such additional
    information is kept separately and is subject to technical and
    organisational measures to ensure that the personal data are not
    attributed to an identified or identifiable natural person.”

and that this "...can reduce the risks to the data subjects".

The concept of application realms may be relevant to consider here. An
application may be considered GDPR compliant without pseudonymisation if
other measures are taken and the use case is appropriate.

On the other hand, a copy of a production database in testing which has
been pseudonymised may, if compromised, still leak personal data. As the
ICO states:

    “…Personal data which have undergone pseudonymisation, which could
    be attributed to a natural person by the use of additional
    information should be considered to be information on an
    identifiable natural person…”

    https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/what-is-personal-data/what-is-personal-data/

If leakage occurs pseudonymisation has achieved nothing. Therefore it
may be useful to determine if data in a usage realm should be either
fully anonymised or not at all. In the latter case the normal GDPR
controls must all pertain.

Rory
On 01/04/2019 20:48, Rory Campbell-Lange wrote:
> On 01/04/19, Moreno Andreo (moreno.andreo@evolu-s.it) wrote:
> ...
>> I'm not forced to use pseudonymisation if there's the risk to get
>> things worse in a system. I've got to speak about these "two opposing
>> forces at work" to a privacy expert (maybe choosing another one, as
>> Peter suggested :-) ) and ask him if it could be used as a matter of
>> declining pseudonymisation because of "pseudonymisation puts at risk
>> overall performance or database integrity"
> How to interpret the pseudonymisation conditions is ... complicated.

Yes, it is indeed... :-)

> The
> UK's Information Commissioner's Office (ICO) writes that
> pseudonymisation relates to:
>
>     “…the processing of personal data in such a manner that the personal
>     data can no longer be attributed to a specific data subject without
>     the use of additional information, provided that such additional
>     information is kept separately and is subject to technical and
>     organisational measures to ensure that the personal data are not
>     attributed to an identified or identifiable natural person.”
>
> and that this "...can reduce the risks to the data subjects".
>
> The concept of application realms may be relevant to consider here. An
> application may be considered GDPR compliant without pseudonymisation if
> other measures are taken and the use case is appropriate.

That could be my case, so I'll have to discuss the strategy and measures
to be adopted with a privacy consultant.

>
> On the other hand, a copy of a production database in testing which has
> been pseudonymised may, if compromised, still leak personal data. As the
> ICO states:
>
>     “…Personal data which have undergone pseudonymisation, which could
>     be attributed to a natural person by the use of additional
>     information should be considered to be information on an
>     identifiable natural person…”
>
> https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/what-is-personal-data/what-is-personal-data/
>
> If leakage occurs pseudonymisation has achieved nothing.

That's another aspect of the question.

Thanks for the clarification,
Moreno.-