Обсуждение: oid's and primary keys on insert
Hello, I know it is posible to get the oid of the last row inserted to from a cursor. My question is what happens when/if PostgreSQL stops useing oids? Or, to put it another way, my application needs to get the primary key of some rows when they are inserted and not others, is there a future proof way of doing this? Thanks -- * * Rob Brown-Bayliss *
On Fri, 2002-08-09 at 21:34, Richard Huxton wrote: > The standard way of doing this is with a sequence or the SERIAL type (assuming > you don't have a natural primary key). You can use the currval() and > nextval() functions to get the last used/next to use values for a given > sequence. It works with multiple clients and there is now an int8 based > version for those needing a lot of inserts. Unfortunatley I am not useing a sequence directly, I am useing a text field that is like this: 46-X The X is a sequence, the 46 is site identification number, so that when I replicate teh data to teh main site their is a way of knowing where it came from, and also ensuring that the primary key for the table is unique across several sites. It is created and inserted by a trigger. So as you can see I don't actually know what the key will be before an insert. Currently I can then get the oid and then get the primary key for that table row, but if OIDs are banished then I am screwed... -- * * Rob Brown-Bayliss *
Rob Brown-Bayliss wrote: > On Fri, 2002-08-09 at 21:34, Richard Huxton wrote: > > >>The standard way of doing this is with a sequence or the SERIAL type (assuming >>you don't have a natural primary key). You can use the currval() and >>nextval() functions to get the last used/next to use values for a given >>sequence. It works with multiple clients and there is now an int8 based >>version for those needing a lot of inserts. > > > Unfortunatley I am not useing a sequence directly, I am useing a text > field that is like this: 46-X > > The X is a sequence, the 46 is site identification number, so that when > I replicate teh data to teh main site their is a way of knowing where it > came from, and also ensuring that the primary key for the table is > unique across several sites. It is created and inserted by a trigger. > > So as you can see I don't actually know what the key will be before an > insert. > > Currently I can then get the oid and then get the primary key for that > table row, but if OIDs are banished then I am screwed... > <Sorry if this has been suggested already, or will not work for some discussed reason -- I haven't followed this thread up to now.> We're doing something very similar. What we did was to write a plpgsql function that creates a primary key value as a combination of the nextval() from a sequence and a "local site identification number" (not what we called it, but same concept). We have the application call this function first to get the primary key for the main header record, then insert the header record, then reuse that value as the foreign key in the detail records. HTH, Joe
On Sat, 2002-08-10 at 08:13, Elaine Lindelef wrote: > Add another column of type SERIAL and use it to store a plain > incremented integer, and use it like you used to use OIDs. Except that this brings me back to the reason for not useing a sequence as a primary key: What happens when site2 has already placed a row with value 1234 in this column and site1's sequence is up to 1233, I insert the row, and now I have two rows with 1234 as the ID row... At least OID's will be unique as when inserting rows from another site they still use the OID from the local machine. -- * * Rob Brown-Bayliss *
On Sat, 2002-08-10 at 08:22, Joe Conway wrote: > what we called it, but same concept). We have the application call this > function first to get the primary key for the main header record, then > insert the header record, then reuse that value as the foreign key in > the detail records. Yeah, that would work, It would requirechanging the trigger, but as the app is not in use yet thats not a big problem, call the function and then insert with the key generated... Why are the developers (maybe) getting rid of OID's? Wont they then have to come up with another way of identifiying data inside postgresql? -- * * Rob Brown-Bayliss *
On Sat, Aug 10, 2002 at 08:32:39AM +1200, Rob Brown-Bayliss wrote: > Why are the developers (maybe) getting rid of OID's? Wont they then > have to come up with another way of identifiying data inside postgresql? They already have. OIDs won't go away for everything, just on a table, if you created the table that way. A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
On Sat, 2002-08-10 at 08:34, Andrew Sullivan wrote: > On Sat, Aug 10, 2002 at 08:32:39AM +1200, Rob Brown-Bayliss wrote: > > > Why are the developers (maybe) getting rid of OID's? Wont they then > > have to come up with another way of identifiying data inside postgresql? > > They already have. OIDs won't go away for everything, just on a > table, if you created the table that way. I thought there is a possibility that even that will disappear? Are OID's created by default at the moment? -- * * Rob Brown-Bayliss *
On 10 Aug 2002, Rob Brown-Bayliss wrote: > On Sat, 2002-08-10 at 08:13, Elaine Lindelef wrote: > > > Add another column of type SERIAL and use it to store a plain > > incremented integer, and use it like you used to use OIDs. > > Except that this brings me back to the reason for not useing a sequence > as a primary key: > > What happens when site2 has already placed a row with value 1234 in this > column and site1's sequence is up to 1233, I insert the row, and now I > have two rows with 1234 as the ID row... > > At least OID's will be unique as when inserting rows from another site > they still use the OID from the local machine. I believe you are misunderstanding the suggestion. what happens if for your table without OID you define a column called oid with a sequence attached? How is that different to using the OID column from a table with OID? I haven't paid attention to this thread until now but are you confusing primary key with OID? And in a separate message you ask how will PostgreSQL identify data if OIDs are removed from the system? Well doesn't the data identify the data? -- Nigel J. Andrews Director --- Logictree Systems Limited Computer Consultants
On Fri, 9 Aug 2002, Nigel J. Andrews wrote: > > And in a separate message you ask how will PostgreSQL identify data if OIDs are > removed from the system? Well doesn't the data identify the data? It's sad to reply to one's own message but... Ah, I see, you don't know the data in order to find the data. I'm not entirely sure how you get in that situation but I'm sure you do. :) I'm equally not sure how you would get out of without knowing the data. The call to a function to combine two data values to make a unique value would seem to me to require that you know the data. However, I would wary of using the return information from an insert unless I knew with certainty that there was only one row of data being inserted and perhaps more especially unless I knew with certainty that there are no and are never going to be rules defined on inserts to a table. Obviously if you control both database and application then you can be more certain about rules but if not consider the situation where the DB design changes and the table you are currently inserting to becomes a view. In that situation you need to run maintenance on your application because of the DB change and more importantly you'd find that if you can't rewrite the insert procedure to write to the tables (for some reason) then you've got no access to such information as the OID of an inserted row. -- Nigel J. Andrews Director --- Logictree Systems Limited Computer Consultants
On Sat, 2002-08-10 at 08:55, Nigel J. Andrews wrote: > I believe you are misunderstanding the suggestion. > > what happens if for your table without OID you define a column called oid with > a sequence attached? How is that different to using the OID column from a > table with OID? My application (will) runs on several sites, each site updating the main site each night, and each morning getting updates from all other sites via the main site. So if all goes well, each morning each site will start with essentially identical databases. Many tables in the database refer to transactions specific to one site or another. The primary key on these tables are inserted via a trigger, and are in the form of "location-sequence" so that this 46-23456 represents the primary key for location 46, sequence 23456 on that machine. I cant just use a sequence at each site otherwise I could end up with 46 identical primary keys (23456) refering to different data. I cant just make a OID sequence on each machine as this is teh same problem, and I only need to know the oid to retrieve the primary key from the row just inserted, after that I don't care about it at all... And inserts are nice enough to return teh OID of a just inserted row which is helpful. > And in a separate message you ask how will PostgreSQL identify data if OIDs are > removed from the system? Well doesn't the data identify the data? Doesnt' PostgreSQL use the OID's for it's own internal use? So that every item has a unique identifier for postgresql to play with, not necessarily the user) Thats what I thought but I really have know idea why or what they are for. -- * * Rob Brown-Bayliss *
Rob Brown-Bayliss wrote: > On Sat, 2002-08-10 at 08:34, Andrew Sullivan wrote: > >>On Sat, Aug 10, 2002 at 08:32:39AM +1200, Rob Brown-Bayliss wrote: >> >> >>>Why are the developers (maybe) getting rid of OID's? Wont they then >>>have to come up with another way of identifiying data inside postgresql? >> >>They already have. OIDs won't go away for everything, just on a >>table, if you created the table that way. > > > I thought there is a possibility that even that will disappear? > > Are OID's created by default at the moment? > I don't think they will disappear, but in some future release they may not be created by default for user tables (they are created by default in current development sources, IIRC). They are still, and will most likely will continue to, be used for certain system tables. There is at least one good reason to not create oids on user tables: since they are generated from one source for the entire PostgreSQL cluster, in a large installation they can roll-over at 4 billion. By using them for system tables only, the concern of roll-over effectively goes away. There have been discussions of making per table oid generators but it doesn't look like that will happen for 7.3. I think going to 8 byte oids has been rejected to due to the performance hit and additional per row overhead. Joe
On Sat, 2002-08-10 at 09:09, Nigel J. Andrews wrote: > It's sad to reply to one's own message but... I do it daily :o( > I'm equally not sure how you would get out of without knowing the data. The > call to a function to combine two data values to make a unique value would seem > to me to require that you know the data. Heres what I am doing (sort of): -- -- TOC Entry ID 71 (OID 2429711) -- -- Name: "get_loc_key" () Type: FUNCTION Owner: postgres -- CREATE FUNCTION "get_loc_key" () RETURNS integer AS 'SELECT int_value FROM local_data WHERE descriptor=''sys_locations_key'' ' LANGUAGE 'sql'; -- -- TOC Entry ID 73 (OID 2429713) -- -- Name: "set_unique_key" () Type: FUNCTION Owner: rob -- CREATE FUNCTION "set_unique_key" () RETURNS text AS ' DECLARE sequence_number text; location_number text; retval text; BEGIN location_number := to_char(get_loc_key(),''999MI''); location_number := trim(both '' '' from location_number); sequence_number := to_char(nextval(''location_seq''),''999999999MI''); sequence_number := trim(both '' '' from sequence_number); retval := location_number || ''-'' || sequence_number; RETURN retval; END; ' LANGUAGE 'plpgsql'; and set_unique_key is the default value for the primary key field. I could I guess not have it set as default and first call teh function andthen insert it to the filed in the insert statement, but then it's open to forgetfullness. By default then it cant be left out. > However, I would wary of using the > return information from an insert unless I knew with certainty that there was Is it possible for teh oid returned to be the oid from another insert? I didn't know that. If thats the case then none of this matters :o) -- * * Rob Brown-Bayliss *
Joe Conway wrote: > There is at least one good reason to not create oids on user tables: > since they are generated from one source for the entire PostgreSQL > cluster, in a large installation they can roll-over at 4 billion. By > using them for system tables only, the concern of roll-over effectively > goes away. > > There have been discussions of making per table oid generators but it > doesn't look like that will happen for 7.3. I think going to 8 byte oids > has been rejected to due to the performance hit and additional per row > overhead. > We could have one sequence of OID (4 bytes) per table and a prefix (4 bytes) for a specific table in the system table. So we could have an effective OID of 8 bytes and still keep the benefit of system wide unique OID. JLL
On 10 Aug 2002, Rob Brown-Bayliss wrote: > On Sat, 2002-08-10 at 09:09, Nigel J. Andrews wrote: > > It's sad to reply to one's own message but... > > I do it daily :o( > > > I'm equally not sure how you would get out of without knowing the data. The > > call to a function to combine two data values to make a unique value would seem > > to me to require that you know the data. > > Heres what I am doing (sort of): > > -- > -- TOC Entry ID 71 (OID 2429711) > -- > -- Name: "get_loc_key" () Type: FUNCTION Owner: postgres > -- > > CREATE FUNCTION "get_loc_key" () RETURNS integer AS 'SELECT int_value > FROM local_data WHERE descriptor=''sys_locations_key'' > ' LANGUAGE 'sql'; > > -- > -- TOC Entry ID 73 (OID 2429713) > -- > -- Name: "set_unique_key" () Type: FUNCTION Owner: rob > -- > > CREATE FUNCTION "set_unique_key" () RETURNS text AS ' > DECLARE > sequence_number text; > location_number text; > retval text; > BEGIN > location_number := to_char(get_loc_key(),''999MI''); > location_number := trim(both '' '' from location_number); > sequence_number := to_char(nextval(''location_seq''),''999999999MI''); > sequence_number := trim(both '' '' from sequence_number); > retval := location_number || ''-'' || sequence_number; > RETURN retval; > END; > ' LANGUAGE 'plpgsql'; > > and set_unique_key is the default value for the primary key field. I > could I guess not have it set as default and first call teh function > andthen insert it to the filed in the insert statement, but then it's > open to forgetfullness. By default then it cant be left out. Okay, I get it. Now to my mind this is where you are going wrong. I think the generally accepted method of do this sort of thing with sequences is either: 1. SELECT nextval('seqname'); 2. Remember the result in variable say currid 3. INSERT INTO mytable (idcol, ...) VALUES (<value from currid>, ...) 4. Use value from variable currid whereever else it is that you require it or 1. INSERT INTO mytable (idcol, ...) VALUES (<value from currid>, ...) 2. SELECT curval('seqname'); 3. Remember the result in variable say currid 4. Use value from variable currid whereever else it is that you require it In your case you would use set_unique_key() instead of the nextval(...) in the first version. Yes, you have to remember to do the insert in two stages and you could set the set_unique_key() function as the default on your primary key column or as you say you could not do that and rely on a NOT NULL constraint to catch incorrectly performed inserts. Even with the default setting you have you still have to make sure the application is doing inserts correctly, i.e. not specifing the primary key data or using the correct function call, so not using a default setting should be no real pain. The advantage of trying it your way and using the returned OID from the INSERT statement is that you do have only the one statement to run in the backend. My view is that the extra little bit of work required doing this the two step way is worth it becuase there's no trap waiting should you get the situation where the OID information is no longer available. > > > However, I would wary of using the > > return information from an insert unless I knew with certainty that there was > > Is it possible for teh oid returned to be the oid from another insert? > I didn't know that. If thats the case then none of this matters :o) No, but with an ON INSERT rule you will find that there is no useful information returned from your insert other than the standard success/failure indicators. There has been discussion of this on the list and one of the arguments for this behaviour is that a rule can be doing alsort of stuff, like inserting to other tables so items such as number of rows inserted is meaningless in a rule situation. I myself use an ON INSERT rule to take insert data in a view's format and insert it into the appropiate base tables, so for a single 'row inserted into my view' there is really anything between 1 and 5 rows inserted into tables. To top it off there is not even an OID column available from a view, unless one is included from an underlying table. The point being that the OID is _not_ the primary key and it's just luck that some objects in a database have a column that may be unique and therefore may be used to look up a specific row. System tables are a slightly different beasts since they by definition are part of the system and therefore if they depend on OIDs for associations between tables then they will always use OIDs, until they don't, if you see what I mean. -- Nigel J. Andrews Director --- Logictree Systems Limited Computer Consultants
On Sat, Aug 10, 2002 at 09:13:46AM +1200, Rob Brown-Bayliss wrote: > So if all goes well, each morning each site will start with essentially > identical databases. Many tables in the database refer to transactions > specific to one site or another. The primary key on these tables are > inserted via a trigger, and are in the form of "location-sequence" so > that this 46-23456 represents the primary key for location 46, sequence > 23456 on that machine. Why don't you use a sequence at each site and a "site id" field? Then put a unique index on the site id + the sequence. You'd get something like |site_id|trans_id| ------------------ | 46 | 23456 | Also makes it easier to get "all transactions from site 46". You can even set each site's site_id field to the appropriate default, if you want, so that you don't have to handle it in your code. > Doesnt' PostgreSQL use the OID's for it's own internal use? So that > every item has a unique identifier for postgresql to play with, not > necessarily the user) No, every item doesn't need it. System things do. So tables are identified by OID. But a field in a table is field n of table OIDnn. A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
On Fri, Aug 09, 2002 at 11:01:30PM +0100, Nigel J. Andrews wrote: > accepted method of do this sort of thing with sequences is either: > > 1. SELECT nextval('seqname'); > 2. Remember the result in variable say currid > 3. INSERT INTO mytable (idcol, ...) VALUES (<value from currid>, ...) > 4. Use value from variable currid whereever else it is that you require it > > or > > 1. INSERT INTO mytable (idcol, ...) VALUES (<value from currid>, ...) > 2. SELECT curval('seqname'); > 3. Remember the result in variable say currid > 4. Use value from variable currid whereever else it is that you require it In the second example, you don't need to get "currid". You can just do INSERT and then SELECT currval('seqname'). Remember, currval is defined as sticking _with your connection_, so you don't have to worry about someone else messing with your currval. (That's also the reason you can't get a currval() until you have SELECTed nextval(), either implicitly or explicitly.) A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
On Fri, 9 Aug 2002, Andrew Sullivan wrote: > On Fri, Aug 09, 2002 at 11:01:30PM +0100, Nigel J. Andrews wrote: > > accepted method of do this sort of thing with sequences is either: > > > > 1. SELECT nextval('seqname'); > > 2. Remember the result in variable say currid > > 3. INSERT INTO mytable (idcol, ...) VALUES (<value from currid>, ...) > > 4. Use value from variable currid whereever else it is that you require it > > > > or > > > > 1. INSERT INTO mytable (idcol, ...) VALUES (<value from currid>, ...) > > 2. SELECT curval('seqname'); > > 3. Remember the result in variable say currid > > 4. Use value from variable currid whereever else it is that you require it > > In the second example, you don't need to get "currid". You can just > do INSERT and then SELECT currval('seqname'). Oops, that's wine, cut and paste for you. I just rearranged and renumbered the lines from the first version. > Remember, currval is > defined as sticking _with your connection_, so you don't have to > worry about someone else messing with your currval. (That's also the > reason you can't get a currval() until you have SELECTed nextval(), > either implicitly or explicitly.) Yes, as Andrew points out in step 1 of version 2 the <value from currid> should be: nextval('seqname') and that the application variable currid can be avoided by just selecting curval('seqname') whereever it's needed. -- Nigel J. Andrews Director --- Logictree Systems Limited Computer Consultants
> Why don't you use a sequence at each site and a "site id" field? > Then put a unique index on the site id + the sequence. You'd get > something like > > |site_id|trans_id| > ------------------ > | 46 | 23456 | > > Also makes it easier to get "all transactions from site 46". You can > even set each site's site_id field to the appropriate default, if you > want, so that you don't have to handle it in your code. I started out that way, but thought the whole select and update thing would get messy, like select blah from wah where site_id = 43 and trans_id = 5345 rather than a single where pkey = 43-5345 Thats realy the only reason I didn't do that. Not realy a goot reason at that... -- * * Rob Brown-Bayliss *
Rob Brown-Bayliss wrote: >>Why don't you use a sequence at each site and a "site id" field? >>Then put a unique index on the site id + the sequence. You'd get >>something like >> >> |site_id|trans_id| >> ------------------ >> | 46 | 23456 | >> >>Also makes it easier to get "all transactions from site 46". You can >>even set each site's site_id field to the appropriate default, if you >>want, so that you don't have to handle it in your code. > > > I started out that way, but thought the whole select and update thing > would get messy, like select blah from wah where site_id = 43 and > trans_id = 5345 rather than a single where pkey = 43-5345 > > Thats realy the only reason I didn't do that. Not realy a goot reason > at that... > I had the exact same experience. Having a 2 column primary key on every table (and hence for every foreign key as well) gets unmanageable pretty quickly. Joe
On Sat, 2002-08-10 at 15:23, Joe Conway wrote: > > Thats realy the only reason I didn't do that. Not realy a goot reason > > at that... > > > > I had the exact same experience. Having a 2 column primary key on every > table (and hence for every foreign key as well) gets unmanageable pretty > quickly. Yep, exactly. I have just been thinking that it was not really a good reason and I must have had another reason there it is, foreign keys... Thanks, I almost convinced myself to go back... -- * * Rob Brown-Bayliss *
On 10 Aug 2002, Rob Brown-Bayliss wrote: > The primary key on these tables are > inserted via a trigger, and are in the form of "location-sequence" so > that this 46-23456 represents the primary key for location 46, sequence > 23456 on that machine. > > I cant just use a sequence at each site otherwise I could end up with 46 > identical primary keys (23456) refering to different data. Your problem appears to be confusion about what your primary key is. If "46-23456" uniquely identifies a particular datum, and "23456" does not, you should never use "23456" as a primary key, but always "46-23456". You can do this either by combining the two into one column in the table, and creating a function to concatenate the site code and ID when doing an insert, or by having the codes in two separate columns and making your primary key the concatenation of the two columns. Note that object IDs being present or not is really irrelevant; object IDs don't do anything that you can't do explicitly with an integer column and a sequence. > Doesnt' PostgreSQL use the OID's for it's own internal use? So that > every item has a unique identifier for postgresql to play with, not > necessarily the user) It does for some things, yes. But not for every table. That's a good reason to remove them; it's an extra four bytes of data that ought to be made explicit if the data are being used, or just dropped if the data are not being used. cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC
On Fri, 9 Aug 2002, Jean-Luc Lachance wrote: > Joe Conway wrote: > > We could have one sequence of OID (4 bytes) per table and a prefix (4 > bytes) for a specific table in the system table. So we could have an > effective OID of 8 bytes and still keep the benefit of system wide > unique OID. Or we could just create an explicit "object ID" column in those system tables that need it, and drop the whole object IDs thing entirely. Then all this confusion over the purpose of object IDs would go away. cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC
On Fri, Aug 09, 2002 at 08:23:54PM -0700, Joe Conway wrote: > I had the exact same experience. Having a 2 column primary key on every > table (and hence for every foreign key as well) gets unmanageable pretty > quickly. You could write a trigger which populates the primary key as a combination of site_id and trans_id. Then you still get the advantage of a simple sequence plus a site id, without needing a two-column primary key. Of course, then you pay the price of a trigger every time you insert. A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
Curt Sampson <cjs@cynic.net> writes: > On Fri, 9 Aug 2002, Jean-Luc Lachance wrote: > > Joe Conway wrote: > > We could have one sequence of OID (4 bytes) per table and a prefix > > (4 bytes) for a specific table in the system table. So we could > > have an effective OID of 8 bytes and still keep the benefit of > > system wide unique OID. IMHO, it would be a better idea to eventually change to not creating OIDs by default on user tables (for backwards compatibility, we should probably wait a little while to make this change -- but perhaps toggle it with a GUC option, disabled by default, in the short-term?). Since WITHOUT OIDS is a space optimization in development sources, we're most of the way there already... > Or we could just create an explicit "object ID" column in those > system tables that need it, and drop the whole object IDs thing > entirely. As far as I can tell, OIDs on system tables are exactly that: an explicit "object ID" column that uniquely identifies entries in system catalogs. Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC