Re: pervasiveness of surrogate (also called synthetic) keys

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: pervasiveness of surrogate (also called synthetic) keys
Дата
Msg-id 4DBF6DE7.7060602@postnewspapers.com.au
обсуждение исходный текст
Ответ на Re: pervasiveness of surrogate (also called synthetic) keys  (Rob Sargent <robjsargent@gmail.com>)
Ответы Re: pervasiveness of surrogate (also called synthetic) keys
Re: pervasiveness of surrogate (also called synthetic) keys
Список pgsql-general
On 03/05/11 10:06, Rob Sargent wrote:

> My wife works (at the sql level) with shall we say "records about
> people".  Real records, real people.  Somewhere around 2 million unique
> individuals, several million source records.  They don't all have ssn,
> they don't all have a drivers license.  They don't all have an address,
> many have several addresses (especially over time) and separate people
> have at one time or another lived at the same address.

... and that's before we get into the horror of "what is someone's
name". Which name? Which spelling? Do they even have a single canonical
name? Is their canonical name - if any - expressable in the character
set used by the service? Is it even covered by Unicode?!? Does it make
any sense to split their name up into the traditional
english-speaking-recent-western "family" and "given" name parts? Is
there a single consistent way to do so for their name even if it does? etc.

SSN? What if they don't live in the US or aren't a citizen? What if they
have more than one SSN (yes, it happens)? Or there's one being shared by
more than one person (again, this happens) and they can't get it fixed
or don't want to?

My mother's postal address - before street numbering was introduced a
few years ago on her road - used to be "Rural Delivery Area 2,
SomeTownName, New Zealand". You'd be amazed how many systems could not
cope with that; she used to have to register all sorts of things to her
parents' address in the nearby town.

People decide to key a database on US Social Security number - because
it's a "unique natural key" then discover the exciting problems with
that. While they're battling those, business needs change and the
database starts needing to accept people from other countries, who don't
have a US SSN and don't know what one is or what it looks like.

Tracking people/companies in databases are ideal candidates for
synthetic keying with a solid split/merge procedure to handle
duplicates, overlapping identity records, etc.

--
Craig Ringer



В списке pgsql-general по дате отправления:

Предыдущее
От: Dann Corbit
Дата:
Сообщение: Re: pervasiveness of surrogate (also called synthetic) keys
Следующее
От: Greg Smith
Дата:
Сообщение: Re: pervasiveness of surrogate (also called synthetic) keys