Anonymized database dumps

Поиск
Список
Период
Сортировка
От Janning Vygen
Тема Anonymized database dumps
Дата
Msg-id 4F6614B5.9050407@kicktipp.de
обсуждение исходный текст
Ответы Re: Anonymized database dumps  (Kiriakos Georgiou <kg.postgresql@olympiakos.com>)
Re: Anonymized database dumps  (Bill Moran <wmoran@potentialtech.com>)
Список pgsql-general
Hi,

I am working on postgresql 9.1 and loving it!

Sometimes we need a full database dump to test some performance issues
with real data.

Of course we don't like to have sensible data like bunches of e-mail
addresses on our development machines as they are of no interest for
developers and should be kept secure.

So we need an anonymized database dump. I thought about a few ways to
achieve this.

1. Best solution would be a special db user and some rules which fire on
reading some tables and replace privacy data with some random data. Now
doing a dump as this special user doesn't even copy the sensible data at
all. The user just has a different view on this database even when he
calls pg_dump.

But as rules are not fired on COPY it can't work, right?

2. The other solution I can think of is something like

pg_dump | sed > pgdump_anon

where 'sed' does a lot of magical replace operations on the content of
the dump. I don't think this is going to work reliable.

3. More reliable would be to dump the database, restore it on a
different server, run some sql script which randomize some data, and
dump it again. hmm, seems to be the only reliable way so far. But it is
no fun when dumping and restoring takes an hour.

Does anybody has a better idea how to achieve an anonymized database dump?

regards
Janning





--
Kicktipp GmbH

Venloer Straße 8, 40477 Düsseldorf
Sitz der Gesellschaft: Düsseldorf
Geschäftsführung: Janning Vygen
Handelsregister Düsseldorf: HRB 55639

http://www.kicktipp.de/

В списке pgsql-general по дате отправления:

Предыдущее
От: Adrian Klaver
Дата:
Сообщение: Re: One more query
Следующее
От: jgenoese
Дата:
Сообщение: Ubuntu 11.10 Postgres 9.1.3 is missing pg_resetxlog; where can I get it?