Re: uuid type for postgres

Поиск
Список
Период
Сортировка
От mark@mark.mielke.cc
Тема Re: uuid type for postgres
Дата
Msg-id 20050906211624.GB3069@mark.mielke.cc
обсуждение исходный текст
Ответ на Re: uuid type for postgres  (David Fetter <david@fetter.org>)
Ответы Re: uuid type for postgres  (Greg Stark <gsstark@mit.edu>)
Список pgsql-hackers
On Tue, Sep 06, 2005 at 01:49:28PM -0700, David Fetter wrote:
> On Tue, Sep 06, 2005 at 09:16:13PM +0000, nathan wagner wrote:
> > On Tue, Sep 06, 2005 at 03:57:55PM -0400, jonah.harris@gmail.com wrote:
> > > I agree with Josh on the UUID type, it gets abused far too often
> > Out of curiosity, how does it get abused?  It doesn't seem to me
> > that it would be any more prone to abuse than any other type.
> A typical abuse, as I have seen it, is for OO coders on the front-end
> to turn the entire databse into what amounts to an associative array.
> It then becomes somewhere between difficult and impossible to get good
> performance, as UUID is the only thing useful as a PK/FK.

Replace "UUID" with "SERIAL" in your above paragraph, and I think your
argument is still true. Perhaps UUID makes it easier, as it crosses
the table boundary - but somebody so inclined, can use *val() to
define their own "database wide SERIAL identifier".

I don't see anything intrinisic about SERIAL, UUID, or OO abstracting
techniques that requires people to write inefficient code. People
write inefficient code because they don't know any better. They don't
properly look at how their data will be used, and what path the data
will take to get from the tables on disk, to the user they are
presenting the information to. If they had, in the case you describe,
perhaps they'd be horrified. Or perhaps they would STILL not know
any better. :-)

Technically, the UUID is 4x the size of a SERIAL, or 2x the size of a
SERIAL8. For this cost in terms of efficiency (table size, index size),
you are purchasing the scaleability of being able to, at any point in
the future, more easily (than SERIAL) merge your tables with other
tables, and have a relatively opaque handle to give out to allow
people to uniquely identify a collection of rows. For me, both are
appealing. For others, one or the other may be more appealing.

In terms of portability, perhaps (somebody else made this
arguments). If other systems don't have a UUID type, you end screwed
if you ever wished to ditch PostgreSQL. The reason I don't buy this
argument, though, is that in my current application, I dumped being
able to ditch PostgreSQL a long time ago. There are just too many neat
things I can do with PostgreSQL, that I can't do with MySQL, and that
I either can't, or don't want to do with Oracle. You're stuck with me
now... :-)

If people truly felt this way, there are many PostgreSQL extensions
that should be phased out, rather than introduced.

I'd like to point out the obvious - a UUID is just an identifier. As
it isn't linked to the table, it doesn't require generation to be
server-side. Any application can generate a UUID as two BIGINTs, and
use these two BIGINTs as a primary key, and accomplish the same.
Before I stumbled upon pguuid, and became interested in the rather
impressive extension interfaces provided by PostgreSQL, I was
considering doing this myself. The costs would have been more
complicated queries, and limited practical use from the psql command
line.

I've become comfortable with the concept of a UUID over the last two
or three years, as they have been a requirement for us to use for a
product at my primary place of employment. We use a source management
system with replicas at each site, that is synchronized periodically.
We write software on top of this system to perform automated tasks at
different sites (sometimes executing at the other sites), or that
requires configuration information to be associated with the data
stored in this source management system. In a previous product, we
used names to access the data. This failed horribly when the names
changed. We now frequently, and liberally use the UUID for the
objects. This allows us to associate data with, access or manipulate
the objects at any site, without ANY problem. To go back to a name
based access method seems a step BACKWARDS.

UUID definately has a place. It's only a question of the imagination
of the user as to how abused, or how well used, it is.

I think it deserves a spot in PostgreSQL, and that people would
possibly use it more than they would SERIAL. Eventually, SERIAL wraps
around. So you switch to SERIAL8. At the point that you have SERIAL8,
you aren't worried terribly about disk space, and you realize there is
usually no benefit at all to the numbers being ordered so closely. Why
not pick a scheme that is based on time? Perhaps create a revision
code field to deal with objects created simultaneously from the same
source. And why not identify the source to prevent collisions from
multiple sources? If we go from 8 bytes, to 16 bytes, we can encode
all of this information neatly. Welcome, UUID. :-)

mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all                      and in the darkness
bindthem...
 
                          http://mark.mielke.cc/



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Rod Taylor
Дата:
Сообщение: Re: PITR on different hardware
Следующее
От: Greg Stark
Дата:
Сообщение: Re: uuid type for postgres