Обсуждение: Create on insert a unique random number

Поиск
Список
Период
Сортировка

Create on insert a unique random number

От
"Campbell, Lance"
Дата:
<div class="Section1"><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">When inserting a record is there a way to have postgres create a random number for a field such that
itis unique?</span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt; 
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">Thanks, </span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;
font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span style="font-size:10.0pt;
font-family:Arial">Lance Campbell</span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;
font-family:Arial">Project Manager/Software Architect</span></font><p class="MsoNormal"><font face="Arial"
size="2"><spanstyle="font-size:10.0pt; 
font-family:Arial">Web Services at Public Affairs</span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;font-family:Arial">University</span></font><fontface="Arial" size="2"><span
style="font-size:10.0pt;font-family:Arial">of Illinois</span></font><p class="MsoNormal"><font face="Arial"
size="2"><spanstyle="font-size:10.0pt; 
font-family:Arial">217.333.0382</span></font><p class="MsoNormal"><font face="Arial" size="2"><span
style="font-size:10.0pt;
font-family:Arial">http://webservices.uiuc.edu</span></font><p class="MsoNormal"><font face="Times New Roman"
size="3"><spanstyle="font-size: 
12.0pt"> </span></font></div>

Re: Create on insert a unique random number

От
chester c young
Дата:
> When inserting a record is there a way to have postgres create a
> random number for a field such that it is unique?


you could use oid


     ____________________________________________________________________________________
Looking for last minute shopping deals?  
Find them fast with Yahoo! Search.  http://tools.search.yahoo.com/newsearch/category.php?category=shopping


Re: Create on insert a unique random number

От
"A. Kretschmer"
Дата:
am  Tue, dem 18.03.2008, um  9:43:01 -0700 mailte chester c young folgendes:
> 
> > When inserting a record is there a way to have postgres create a
> > random number for a field such that it is unique?
> 
> 
> you could use oid

No! No oid, it's deprecated. The solution: use serial.


Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net


Re: Create on insert a unique random number

От
"Campbell, Lance"
Дата:
I created the following table:

create table xyz (
n serial,
abc character varying,
constraint n_pkey primary key (n));

Each time I do an insert:

insert into xyz(abc) values('adf6');

The field n is not random but is sequential.  Is there something I
should do to make the serial number random?

Thanks,


Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu

-----Original Message-----
From: pgsql-sql-owner@postgresql.org
[mailto:pgsql-sql-owner@postgresql.org] On Behalf Of A. Kretschmer
Sent: Tuesday, March 18, 2008 11:52 AM
To: pgsql-sql@postgresql.org
Subject: Re: [SQL] Create on insert a unique random number

am  Tue, dem 18.03.2008, um  9:43:01 -0700 mailte chester c young
folgendes:
>
> > When inserting a record is there a way to have postgres create a
> > random number for a field such that it is unique?
>
>
> you could use oid

No! No oid, it's deprecated. The solution: use serial.


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: Create on insert a unique random number

От
paul rivers
Дата:
Campbell, Lance wrote:
> I created the following table:
>
> create table xyz (
> n serial, 
> abc character varying, 
> constraint n_pkey primary key (n));
>
> Each time I do an insert:
>
> insert into xyz(abc) values('adf6');
>
> The field n is not random but is sequential.  Is there something I
> should do to make the serial number random?
>
> Thanks,
>
>
>   

Why do you want a random number? Would a guid work?

Paul



Re: Create on insert a unique random number

От
"D'Arcy J.M. Cain"
Дата:
On Tue, 18 Mar 2008 17:51:39 +0100
"A. Kretschmer" <andreas.kretschmer@schollglas.com> wrote:
> am  Tue, dem 18.03.2008, um  9:43:01 -0700 mailte chester c young folgendes:
> > > When inserting a record is there a way to have postgres create a
> > > random number for a field such that it is unique?
> > 
> > you could use oid
> 
> No! No oid, it's deprecated. The solution: use serial.

Not exactly random, is it?

See http://archives.postgresql.org/pgsql-novice/2002-05/msg00198.php
for a possible solution.  Read the followups as well.  There is an
issue with collisions which will only get worse with time.

I wonder though, what is the purpose of this?  I suspect that this is
either a homework problem or you may be attacking some real-world
problem from the wrong angle.

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: Create on insert a unique random number

От
Vivek Khera
Дата:
On Mar 18, 2008, at 1:03 PM, Campbell, Lance wrote:

> The field n is not random but is sequential.  Is there something I
> should do to make the serial number random?

Depending on your "randomness" need, you can alter the increment of  
the sequence so it changes by a different amount than "1" on every  
increment, though it will still be constant increment.

You need to specify what the purpose of it being random is, then you  
might get more useful responses.  Does it need to be random for some  
security purpose?  If so, describe the level of security,  
specifically, against what threat are you defending?


Re: Create on insert a unique random number

От
"A. Kretschmer"
Дата:
am  Tue, dem 18.03.2008, um 12:03:31 -0500 mailte Campbell, Lance folgendes:
> I created the following table:
> 
> create table xyz (
> n serial, 
> abc character varying, 
> constraint n_pkey primary key (n));
> 
> Each time I do an insert:
> 
> insert into xyz(abc) values('adf6');
> 
> The field n is not random but is sequential.  Is there something I
> should do to make the serial number random?

Why do you mean to need a random value? But you can use something like
md5(nextval('your_sequence')) as default. But i can't see an advantage.


>  
> 
> -----Original Message-----

Please, no top-posting. I'm reading from top to bottom.


Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net


Re: Create on insert a unique random number

От
"Campbell, Lance"
Дата:
Thanks for all of your input.  It appears that the best way to do this
is to create a default random number in the primary id field in the
table definition and then return that value after insert.  If an
exception occurs because of duplicates I will simple perform the same
insert statement again. I doubt there would be many duplicate hits if I
use a really large number.

Why use a random number as a primary key?  Security via obscurity.

I build web applications for a living.  In most of my applications it is
preferable to use a random primary key.  Why?

Example:

I built a web application called the Form Builder.  It allows
individuals to create web forms.  After a user is done building their
web form the tool provides a URL for the user to access the form.
Obviously the URL has the random ID of the form in it.  Most of the
forms created with this tool can be accessed and filled out by the
general public.

So why not use a sequential number?  So if I used a sequential number
then a student or outside individual could easily change the number in
the URL to see what other forms there are.  It is not that they don't
have access to the forms but they should not be messing with them if
they really don't have a reason to.  So by using a very large random
number the users filling out a form cannot easily guess what another
form ID is.

Does that make sense?  I have at least a dozen tools that I prefer to
use this approach on.

This is not a security approach.  It is more about not giving obvious
access to people that want to mess around.

Thanks,

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu

-----Original Message-----
From: pgsql-sql-owner@postgresql.org
[mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Vivek Khera
Sent: Tuesday, March 18, 2008 12:57 PM
To: pgsql-sql@postgresql.org
Subject: Re: [SQL] Create on insert a unique random number


On Mar 18, 2008, at 1:03 PM, Campbell, Lance wrote:

> The field n is not random but is sequential.  Is there something I
> should do to make the serial number random?

Depending on your "randomness" need, you can alter the increment of
the sequence so it changes by a different amount than "1" on every
increment, though it will still be constant increment.

You need to specify what the purpose of it being random is, then you
might get more useful responses.  Does it need to be random for some
security purpose?  If so, describe the level of security,
specifically, against what threat are you defending?

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: Create on insert a unique random number

От
Erik Jones
Дата:
On Mar 18, 2008, at 1:40 PM, Campbell, Lance wrote:

> Thanks for all of your input.  It appears that the best way to do this
> is to create a default random number in the primary id field in the
> table definition and then return that value after insert.  If an
> exception occurs because of duplicates I will simple perform the same
> insert statement again. I doubt there would be many duplicate hits
> if I
> use a really large number.
>
> Why use a random number as a primary key?  Security via obscurity.
>
> I build web applications for a living.  In most of my applications
> it is
> preferable to use a random primary key.  Why?
>
> Example:
>
> I built a web application called the Form Builder.  It allows
> individuals to create web forms.  After a user is done building their
> web form the tool provides a URL for the user to access the form.
> Obviously the URL has the random ID of the form in it.  Most of the
> forms created with this tool can be accessed and filled out by the
> general public.
>
> So why not use a sequential number?  So if I used a sequential number
> then a student or outside individual could easily change the number in
> the URL to see what other forms there are.  It is not that they don't
> have access to the forms but they should not be messing with them if
> they really don't have a reason to.  So by using a very large random
> number the users filling out a form cannot easily guess what another
> form ID is.
>
> Does that make sense?  I have at least a dozen tools that I prefer to
> use this approach on.
>
> This is not a security approach.  It is more about not giving obvious
> access to people that want to mess around.

I'd say it makes total sense and we do much the same thing with
unsubscribe links in the emails we push here.  However, we keep the
primary key based on sequences and for sent messages generate a random
id using md5(now()::text).  In a sense, then, we have "public" and
"private" keys.

Erik Jones

DBA | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com





Re: Create on insert a unique random number

От
Vivek Khera
Дата:
On Mar 18, 2008, at 2:40 PM, Campbell, Lance wrote:

> Why use a random number as a primary key?  Security via obscurity.
>
> I build web applications for a living.  In most of my applications  
> it is
> preferable to use a random primary key.  Why?

Don't expose the actual ID to the end user; only expose a reversible  
encrypted form of it.  We use a relatively simple hash + check  
character.   If you have several examples of it, you can reverse  
engineer it, but the casual "hacker" is easily thwarted.

You can use stronger encryption on the number when exposed to end  
users if you need.  You're making your DB overly complex.



Re: Create on insert a unique random number

От
Steve Midgley
Дата:
At 11:58 AM 3/18/2008, pgsql-sql-owner@postgresql.org wrote:
>Date: Tue, 18 Mar 2008 13:40:42 -0500
>From: "Campbell, Lance" <lance@uiuc.edu>
>To: "Vivek Khera" <vivek@khera.org>,
>         <pgsql-sql@postgresql.org>
>Subject: Re: Create on insert a unique random number
>Message-ID: 
><B10E6810AC2A2F4EA7550D072CDE8760CDDC34@SAB-FENWICK.sab.uiuc.edu>
>
>Thanks for all of your input.  It appears that the best way to do this
>is to create a default random number in the primary id field in the
>table definition and then return that value after insert.  If an
>exception occurs because of duplicates I will simple perform the same
>insert statement again. I doubt there would be many duplicate hits if 
>I
>use a really large number.
>[snip]
>I built a web application called the Form Builder.  It allows
>individuals to create web forms.  After a user is done building their
>web form the tool provides a URL for the user to access the form.
>Obviously the URL has the random ID of the form in it.  Most of the
>forms created with this tool can be accessed and filled out by the
>general public.
[snip]

Hi Lance,

I think I "get you" as a fellow web systems (aka middleware) guy. My 
opinion is that the use of a "sparse index" is totally reasonable for 
the purpose you describe. But I would argue that you could take it a 
little further in implementation that might keep your db design sane 
while still giving you the sparse index function on the front-end.

1) Create a second field (as someone recommend on this list) that is an 
MD5 of your primary key. Use that as your "accessor" index from the web 
application. But keep the primary key as an integer serial, so that it 
works as expected, and you can build relations normally. I think in the 
end you'll be happier with this method than messing around with a 
custom primary key system.. You can build a trigger that generates the 
MD5 hash every time a record is created (or you can do it in your ORM 
layer in the web app).

2) Also, (but OT) put a monitor on your weblogs to look for "404" 
errors ("page not found" for the sql-only people here). This will 
supplement your sparse index by detecting people who are scanning your 
sparse index space and generating lots of "misses."

Hope that helps,

Steve



Re: Create on insert a unique random number

От
"D'Arcy J.M. Cain"
Дата:
On Tue, 18 Mar 2008 13:40:42 -0500
"Campbell, Lance" <lance@uiuc.edu> wrote:
> Why use a random number as a primary key?  Security via obscurity.

Something with very short shelf life but...

> I build web applications for a living.  In most of my applications it is
> preferable to use a random primary key.  Why?  

I understand why you might need a random field.  My question is, why
does it have to be the primary key?  I'm also not sure why it has to be
unique.  You can always base the URL on both the primary key and the
security field.  Now you don't need to worry about collisions.  In
addition the serial number can be a public reference to the record.

Off-topic but related, funny story, I was once in charge of a medium
sized ISP and some suit came to me and suggested that for extra
security we should not let users pick passwords that already existed in
the system.  My response was "So the error message should be that
someone in the system already has the password that you tried to use?"

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: Create on insert a unique random number

От
"D'Arcy J.M. Cain"
Дата:
On Tue, 18 Mar 2008 12:23:35 -0700
Steve Midgley <public@misuse.org> wrote:
> 1) Create a second field (as someone recommend on this list) that is an 
> MD5 of your primary key. Use that as your "accessor" index from the web 

I strongly disagree for three reasons.  First, if you are going to
generate a key then don't store it.  Just generate it every time.
Second, don't generate it based on a known field.  You may think that
it is secure but what if you private key is compromised?  Do you then
change everyone's security code?  Third, what if one person's
code is compromised?  If it is based on a calculation then you
can't change that one person's security code.

Generate a random number and store that.  You will be much happier when
something goes wrong and something always goes wrong.

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: Create on insert a unique random number

От
Steve Midgley
Дата:
At 12:36 PM 3/18/2008, D'Arcy J.M. Cain wrote:
>On Tue, 18 Mar 2008 12:23:35 -0700
>Steve Midgley <public@misuse.org> wrote:
> > 1) Create a second field (as someone recommend on this list) that 
> is an
> > MD5 of your primary key. Use that as your "accessor" index from the 
> web
>
>I strongly disagree for three reasons.  First, if you are going to
>generate a key then don't store it.  Just generate it every time.
>Second, don't generate it based on a known field.  You may think that
>it is secure but what if you private key is compromised?  Do you then
>change everyone's security code?  Third, what if one person's
>code is compromised?  If it is based on a calculation then you
>can't change that one person's security code.
>
>Generate a random number and store that.  You will be much happier 
>when
>something goes wrong and something always goes wrong.
>
>--
>D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three 
>wolves

Hi D'Arcy,

I'm not clear on your concern here - an MD5 hash doesn't have a private 
key that can be compromised, afaik. It's a one way hash. I don't see 
much difference between making an MD5 of the primary key and generating 
a random number for the "public primary key", except that you shouldn't 
get index collisions with the MD5 method (whereas eventually you will 
with a random number, though of course using a GUID would eliminate 
that concern for practical purposes).

The issue raised by the OP, I believe, is not about security of the 
primary key # itself or its ability to provide unauthorized access to 
the underlying records. The system in question protects its records 
from unauthorized access already.

The issue is about creating an index into a sparse hash so that each 
record is somewhat randomly located in a sparse hash "index space". 
(One valid reason to do this would be if you wanted to hide the total 
number of records in your table from competitors or customers). (Just 
for reference of my view on the problem: 
http://en.wikipedia.org/wiki/Hash_table)

Whether SHA-1 or MD5, I think the point is that if you don't care about 
speed in generating the hash index (which the OP doesn't apparently), 
hash indexing via an encryption algorithm will ensure that the hash 
index is relatively free of "clustering" - which as I understand it, is 
the point of this exercise. Encryption as a hash index generator is 
imperfect for sure, as the Wikipedia article goes at length to discuss, 
but from my perspective it "does the job" - at least as far as the OP 
describes it (or I understood it!). [smile]

I may be way off here of course, and I appreciate the input - any 
thoughts?

Steve



Re: Create on insert a unique random number

От
hubert depesz lubaczewski
Дата:
On Tue, Mar 18, 2008 at 01:40:42PM -0500, Campbell, Lance wrote:
> This is not a security approach.  It is more about not giving obvious
> access to people that want to mess around.

1. keep primary key using standard serial. it will make your life a bit
simpler.
2. add column for text random identifiers (it doesn't have to be number,
and adding characters makes for much better "randomness").
3. check this:
http://www.depesz.com/index.php/2007/06/25/random-text-record-identifiers/

depesz

-- 
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)


Re: Create on insert a unique random number

От
"D'Arcy J.M. Cain"
Дата:
On Tue, 18 Mar 2008 13:57:39 -0700
Steve Midgley <public@misuse.org> wrote:
> At 12:36 PM 3/18/2008, D'Arcy J.M. Cain wrote:
> >On Tue, 18 Mar 2008 12:23:35 -0700
> >Steve Midgley <public@misuse.org> wrote:
> > > 1) Create a second field (as someone recommend on this list) that 
> > is an
> > > MD5 of your primary key. Use that as your "accessor" index from the 
> > web
> >
> >I strongly disagree for three reasons.  First, if you are going to
> >generate a key then don't store it.  Just generate it every time.
> >Second, don't generate it based on a known field.  You may think that
> >it is secure but what if you private key is compromised?  Do you then
> >change everyone's security code?  Third, what if one person's
> >code is compromised?  If it is based on a calculation then you
> >can't change that one person's security code.

> I'm not clear on your concern here - an MD5 hash doesn't have a private 
> key that can be compromised, afaik. It's a one way hash. I don't see 

Right so it is even less useful than I implied.  It can never be
changed so why store it when it can be re-generated at any time.

> much difference between making an MD5 of the primary key and generating 
> a random number for the "public primary key", except that you shouldn't 
> get index collisions with the MD5 method (whereas eventually you will 
> with a random number, though of course using a GUID would eliminate 
> that concern for practical purposes).

But your suggestion was to base this key on the serial primary key so
where is your index collision protection?  You are going to get
collisions on both the serial key and, to a lesser extent, your
generated one. Besides, has anyone ever demonstrated a real issue with
lookups using serial primary keys? I think you are trying to second
guess the database engine with this and I don't think that that is a
great idea. 

> The issue is about creating an index into a sparse hash so that each 
> record is somewhat randomly located in a sparse hash "index space". 
> (One valid reason to do this would be if you wanted to hide the total 
> number of records in your table from competitors or customers). (Just 

If that is your goal then start your serial at something other than 1.
Start at 1,000,000 for example and your first user will think that
you already have one million clients.  Actually, he will think that
you started elsewhere than 1 but he won't know where.

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.


Re: Create on insert a unique random number

От
Steve Midgley
Дата:
At 06:47 AM 3/19/2008, D'Arcy J.M. Cain wrote:
>But your suggestion was to base this key on the serial primary key so
>where is your index collision protection?  You are going to get
>collisions on both the serial key and, to a lesser extent, your
>generated one. Besides, has anyone ever demonstrated a real issue with
>lookups using serial primary keys? I think you are trying to second
>guess the database engine with this and I don't think that that is a
>great idea.
Hi D'Arcy,

I'm not following this line. Maybe we're talking about two different 
things here.. I don't know if Lance is using "CRUD" methodology per se, 
but that's a well accepted web approach and uses (generally) serial 
primary keys in the URL structure as (where numbers are serial pk's):

[website]/contact/12345
[website]/property/45678  [and the client sends GET, POST, PUT, DELETE http requests, or 
mimics, to activate various functions]

Whether CRUD of otherwise, in the model I was promoting, there would be 
two index columns in the table along with other data, a public index 
and a serial primary key. The public index is based on the primary key:

pk | public_pk
1  | md5(1 + fixed salt)
2  | md5(2 + fixed salt)
...

AFAIK, an MD5 hash is guaranteed to generate a unique output for any 
unique input, so the serial key and fixed salt would guarantee no hash 
index collisions on the MD5 output. Of course if a competitor knows 
you're using MD5 and they know your salt, they could calculate all the 
md5 integer hashes and see which ones exist..

But I could care less if he uses md5 or sha-1 or Guids! (I just picked 
MD5 because another poster recommended it and it's very easy to 
implement in Pg). The point I care about is that there would be a 
public_pk that associates to one-and-only-one serial pk. Also that 
public_pk should be 1) not easily guessable, 2) non-clustering (and 
therefore non-serial). Then his url's would look like something like:

[website]/contact/c4ca4238a0b923820dcc509a6f75849b
[website]/property/c81e728d9d4c2f636f067f89cc14862c

> > The issue is about creating an index into a sparse hash so that 
> each
> > record is somewhat randomly located in a sparse hash "index space". 
>
> > (One valid reason to do this would be if you wanted to hide the 
> total
> > number of records in your table from competitors or customers). 
> (Just
>
>If that is your goal then start your serial at something other than 1.
>Start at 1,000,000 for example and your first user will think that
>you already have one million clients.  Actually, he will think that
>you started elsewhere than 1 but he won't know where.

The original post did not want users to be able to type in random 
integers like:

/contact/343

And find out if that record #343 exists or not (regardless of whether 
they can get access to the record - the error generated on 
no-authorization may be different from record-not-found). So starting 
at a million does not fix the OP's issue.
From my perspective, wherever you start your serial index, competitors 
can watch it grow over time, if it's a numeric serial. That could be 
more valuable in many businesses than knowing the initial size of the 
table.

Anyway, I hope that clears up what I was recommending! I didn't 
anticipate it would stir up this much analysis and I hope the OP finds 
your input and mine useful in coming up with a final answer to his 
issue. Thanks for taking the time to consider the issue and I'll look 
forward to any additional ideas or comments you have on this too!

Sincerely,

Steve



Re: Create on insert a unique random number

От
"D'Arcy J.M. Cain"
Дата:
On Wed, 19 Mar 2008 08:28:28 -0700
Steve Midgley <public@misuse.org> wrote:
> I'm not following this line. Maybe we're talking about two different 
> things here.. I don't know if Lance is using "CRUD" methodology per se, 
> but that's a well accepted web approach and uses (generally) serial 
> primary keys in the URL structure as (where numbers are serial pk's):
> 
> [website]/contact/12345
> [website]/property/45678
>    [and the client sends GET, POST, PUT, DELETE http requests, or 
> mimics, to activate various functions]

Yes, I do this all the time.

> Whether CRUD of otherwise, in the model I was promoting, there would be 
> two index columns in the table along with other data, a public index 
> and a serial primary key. The public index is based on the primary key:
> 
> pk | public_pk
> 1  | md5(1 + fixed salt)
> 2  | md5(2 + fixed salt)
> ...
> 
> AFAIK, an MD5 hash is guaranteed to generate a unique output for any 
> unique input, so the serial key and fixed salt would guarantee no hash 
> index collisions on the MD5 output. Of course if a competitor knows 
> you're using MD5 and they know your salt, they could calculate all the 
> md5 integer hashes and see which ones exist..
> 
> But I could care less if he uses md5 or sha-1 or Guids! (I just picked 
> MD5 because another poster recommended it and it's very easy to 
> implement in Pg). The point I care about is that there would be a 
> public_pk that associates to one-and-only-one serial pk. Also that 
> public_pk should be 1) not easily guessable, 2) non-clustering (and 
> therefore non-serial). Then his url's would look like something like:
> 
> [website]/contact/c4ca4238a0b923820dcc509a6f75849b
> [website]/property/c81e728d9d4c2f636f067f89cc14862c

Right and, as you state above, they could be guessable if someone gets
their hands on a relatively small amount of information.  If you simply
generate a random string of n characters where n is based on the amount
of security you need, you can store that and store it in a separate
field in the record.  You don't even need to make them unique.  Just
incorporate the serial number as well as the random string.  There may
conceivably be two records with "1ed6f54e5636837ddae4ef33397ee2cb" as
the key but only one that looks like
"021857.1ed6f54e5636837ddae4ef33397ee2cb".  In fact, you could md5 the
serial key and just string the two together if you really wanted
security through more obscurity but that's probably overkill.

The point here is that no one can guess what someone's URL is, even if
they know the ID, administrators can call up records by ID and
individual secret keys can be changed if compromised without affecting
anyone else.  Also, it's a normalized table.  Storing a value that you
can generate is unnormalized.

> The original post did not want users to be able to type in random 
> integers like:
> 
> /contact/343

See above.  That's not what I was suggesting.

> And find out if that record #343 exists or not (regardless of whether 
> they can get access to the record - the error generated on 
> no-authorization may be different from record-not-found). So starting 
> at a million does not fix the OP's issue.

Certainly you would generate the same error to the web user, even if
you differentiate in your internal error log.

>  From my perspective, wherever you start your serial index, competitors 
> can watch it grow over time, if it's a numeric serial. That could be 
> more valuable in many businesses than knowing the initial size of the 
> table.

I guess it depends on the business case.  Certainly we can always find
a use case where a specific solution fails but that's not how we really
work.  We get all the details of the requirements and then code what
solves them.  See above for the "overkill" method that solves that
issue if it really is one.

> Anyway, I hope that clears up what I was recommending! I didn't 
> anticipate it would stir up this much analysis and I hope the OP finds 
> your input and mine useful in coming up with a final answer to his 
> issue. Thanks for taking the time to consider the issue and I'll look 
> forward to any additional ideas or comments you have on this too!

Yes, discussion is always useful, and fun.  :-)

-- 
D'Arcy J.M. Cain <darcy@druid.net>         |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.