Re: uuid type for postgres
От | mark@mark.mielke.cc |
---|---|
Тема | Re: uuid type for postgres |
Дата | |
Msg-id | 20050906211624.GB3069@mark.mielke.cc обсуждение исходный текст |
Ответ на | Re: uuid type for postgres (David Fetter <david@fetter.org>) |
Ответы |
Re: uuid type for postgres
(Greg Stark <gsstark@mit.edu>)
|
Список | pgsql-hackers |
On Tue, Sep 06, 2005 at 01:49:28PM -0700, David Fetter wrote: > On Tue, Sep 06, 2005 at 09:16:13PM +0000, nathan wagner wrote: > > On Tue, Sep 06, 2005 at 03:57:55PM -0400, jonah.harris@gmail.com wrote: > > > I agree with Josh on the UUID type, it gets abused far too often > > Out of curiosity, how does it get abused? It doesn't seem to me > > that it would be any more prone to abuse than any other type. > A typical abuse, as I have seen it, is for OO coders on the front-end > to turn the entire databse into what amounts to an associative array. > It then becomes somewhere between difficult and impossible to get good > performance, as UUID is the only thing useful as a PK/FK. Replace "UUID" with "SERIAL" in your above paragraph, and I think your argument is still true. Perhaps UUID makes it easier, as it crosses the table boundary - but somebody so inclined, can use *val() to define their own "database wide SERIAL identifier". I don't see anything intrinisic about SERIAL, UUID, or OO abstracting techniques that requires people to write inefficient code. People write inefficient code because they don't know any better. They don't properly look at how their data will be used, and what path the data will take to get from the tables on disk, to the user they are presenting the information to. If they had, in the case you describe, perhaps they'd be horrified. Or perhaps they would STILL not know any better. :-) Technically, the UUID is 4x the size of a SERIAL, or 2x the size of a SERIAL8. For this cost in terms of efficiency (table size, index size), you are purchasing the scaleability of being able to, at any point in the future, more easily (than SERIAL) merge your tables with other tables, and have a relatively opaque handle to give out to allow people to uniquely identify a collection of rows. For me, both are appealing. For others, one or the other may be more appealing. In terms of portability, perhaps (somebody else made this arguments). If other systems don't have a UUID type, you end screwed if you ever wished to ditch PostgreSQL. The reason I don't buy this argument, though, is that in my current application, I dumped being able to ditch PostgreSQL a long time ago. There are just too many neat things I can do with PostgreSQL, that I can't do with MySQL, and that I either can't, or don't want to do with Oracle. You're stuck with me now... :-) If people truly felt this way, there are many PostgreSQL extensions that should be phased out, rather than introduced. I'd like to point out the obvious - a UUID is just an identifier. As it isn't linked to the table, it doesn't require generation to be server-side. Any application can generate a UUID as two BIGINTs, and use these two BIGINTs as a primary key, and accomplish the same. Before I stumbled upon pguuid, and became interested in the rather impressive extension interfaces provided by PostgreSQL, I was considering doing this myself. The costs would have been more complicated queries, and limited practical use from the psql command line. I've become comfortable with the concept of a UUID over the last two or three years, as they have been a requirement for us to use for a product at my primary place of employment. We use a source management system with replicas at each site, that is synchronized periodically. We write software on top of this system to perform automated tasks at different sites (sometimes executing at the other sites), or that requires configuration information to be associated with the data stored in this source management system. In a previous product, we used names to access the data. This failed horribly when the names changed. We now frequently, and liberally use the UUID for the objects. This allows us to associate data with, access or manipulate the objects at any site, without ANY problem. To go back to a name based access method seems a step BACKWARDS. UUID definately has a place. It's only a question of the imagination of the user as to how abused, or how well used, it is. I think it deserves a spot in PostgreSQL, and that people would possibly use it more than they would SERIAL. Eventually, SERIAL wraps around. So you switch to SERIAL8. At the point that you have SERIAL8, you aren't worried terribly about disk space, and you realize there is usually no benefit at all to the numbers being ordered so closely. Why not pick a scheme that is based on time? Perhaps create a revision code field to deal with objects created simultaneously from the same source. And why not identify the source to prevent collisions from multiple sources? If we go from 8 bytes, to 16 bytes, we can encode all of this information neatly. Welcome, UUID. :-) mark -- mark@mielke.cc / markm@ncf.ca / markm@nortel.com __________________________ . . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder |\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ | | | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bindthem... http://mark.mielke.cc/
В списке pgsql-hackers по дате отправления: