Обсуждение: timestamp as primary key?

Поиск
Список
Период
Сортировка

timestamp as primary key?

От
"Joe Kramer"
Дата:
Hello,

I have table for online chat system that keep messages sent between users.

CREATE TABLE chat_message
(
  message_time timestamp without time zone NOT NULL DEFAULT now(),
  message_body text,
  user_id_from bigint,
  user_id_to bigint,
  CONSTRAINT chat_message_pkey PRIMARY KEY (message_time)
)
WITHOUT OIDS;

I don't want to add int primary key because I don't ever need to find
messages by unique id.

Question: is it okay to use timestamp as primary key, or there is
possibility of collision? (Meaning two processes may INSERT into table
within same millisecond.) It is a web application.


Thanks.

Re: timestamp as primary key?

От
Tomasz Ostrowski
Дата:
On Thu, 19 Oct 2006, Joe Kramer wrote:

> Question: is it okay to use timestamp as primary key, or there is
> possibility of collision? (Meaning two processes may INSERT into table
> within same millisecond.) It is a web application.

If your insert fail you can always try again after some random short
time. Just do that several times and if it fails, say, 5 times
show an error to a user (sugest him to try a lottery - he'll surely
win ;-) ).

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
                                                      Winnie the Pooh

Re: timestamp as primary key?

От
Richard Huxton
Дата:
Joe Kramer wrote:
> Hello,
>
> I have table for online chat system that keep messages sent between users.
>
> CREATE TABLE chat_message
> (
>  message_time timestamp without time zone NOT NULL DEFAULT now(),

Hmm - timestamp without time zone. So you don't actually care when the
message was sent? Or you know all your users will be in one timezone and
don't care about calculating elapsed times, perhaps. You might want to
re-read the date+time part of the manuals again - the difference between
with/without time zone is subtle but important.

>  message_body text,
>  user_id_from bigint,

An 8-byte integer for user_id? And you're worried about 4 bytes for a
pkey. Well, if you've going to have billions of registered users then I
can see why you'll want to save that four bytes per row.

>  user_id_to bigint,

I'm assuming you've left out fkey references to a user table.

>  CONSTRAINT chat_message_pkey PRIMARY KEY (message_time)
> )
> WITHOUT OIDS;

Without OIDs is a good idea - especially if you're going to have
billions of users sending messages.

> I don't want to add int primary key because I don't ever need to find
> messages by unique id.

So why the meaningless constraint? You're not supposed to have a primary
key to keep the RDBMS police from kicking down your door and dragging
you away, you're supposed to have it one so you can distinguish
individual rows. If you genuinely don't want to be able to locate an
individual message then don't have a primary key at all. Don't lie to
yourself by choosing columns that aren't unique.

If you think you don't need a primary key at all, think about how you
would recover from the following:
   1. Insert a batch of messages from archive.
   2. Accidentally insert the same batch again.
   3. Delete archive.
   4. Notice duplicates.
Best of luck.

> Question: is it okay to use timestamp as primary key, or there is
> possibility of collision? (Meaning two processes may INSERT into table
> within same millisecond.) It is a web application.

You tell us. Is it possible that a database server could allocate the
same now() time for two rows given current hardware (or future hardware
for that matter). What happens if you do multiple inserts per
transaction? What about with multiple processors? What if the system
clock gets reset? What if you end up having two servers and need to
merge their message stores? Does this start to sound like a lot of
uncertain, uncontrollable things to be sure about? So long as you can
*guarantee* that it's impossible to duplicate the time you're fine.

I'd take a step back and check you're clear on all the use-cases for
this message store. I'm guessing you'll decide you *do* need to be able
to distinguish between different messages, that there is no pre-existing
primary key and that you'll want an auto-incremented integer primary-key.

--
   Richard Huxton
   Archonet Ltd

Re: timestamp as primary key?

От
"John D. Burger"
Дата:
cckramer wrote:

> I have table for online chat system that keep messages sent between
> users.

> Question: is it okay to use timestamp as primary key, or there is
> possibility of collision? (Meaning two processes may INSERT into table
> within same millisecond.) It is a web application.

tometzky wrote:

> If your insert fail you can always try again after some random short
> time.

But then the timestamp field does not accurately represent the actual
time of the event.  If you really want a primary key, and you really
don't want to just use a sequence default, I would make the key a
composite:

   PRIMARY KEY (user_id_from, user_id_to, message_time)

This should cut way down on the possibility of key collision.

- John D. Burger
   MITRE


Re: timestamp as primary key?

От
AgentM
Дата:
On Oct 19, 2006, at 10:30 , John D. Burger wrote:

> cckramer wrote:
>
>> I have table for online chat system that keep messages sent
>> between users.
>
>> Question: is it okay to use timestamp as primary key, or there is
>> possibility of collision? (Meaning two processes may INSERT into
>> table
>> within same millisecond.) It is a web application.
>
> tometzky wrote:
>
>> If your insert fail you can always try again after some random short
>> time.
>
> But then the timestamp field does not accurately represent the
> actual time of the event.  If you really want a primary key, and
> you really don't want to just use a sequence default, I would make
> the key a composite:
>
>   PRIMARY KEY (user_id_from, user_id_to, message_time)
>
> This should cut way down on the possibility of key collision.

Only if each message is contained in its own transaction since now()
is effectively a constant throughout a transaction. In this case, I
would choose a surrogate key since it is likely that the table will
be referenced.

-M

Re: timestamp as primary key?

От
"Jim C. Nasby"
Дата:
On Thu, Oct 19, 2006 at 10:36:29AM -0400, AgentM wrote:
> Only if each message is contained in its own transaction since now()
> is effectively a constant throughout a transaction. In this case, I
> would choose a surrogate key since it is likely that the table will
> be referenced.

See timeofday().
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)