Re: coalesce in plpgsql, and other style questions
От | Ross Boylan |
---|---|
Тема | Re: coalesce in plpgsql, and other style questions |
Дата | |
Msg-id | 1339548151.5384.157.camel@corn.betterworld.us обсуждение исходный текст |
Ответ на | Re: coalesce in plpgsql, and other style questions (Jeff Davis <pgsql@j-davis.com>) |
Ответы |
Re: coalesce in plpgsql, and other style questions
(Jeff Davis <pgsql@j-davis.com>)
Re: coalesce in plpgsql, and other style questions (Daniel Staal <DStaal@usa.net>) |
Список | pgsql-novice |
Thanks for your responses. Some followup below. On Tue, 2012-06-12 at 16:30 -0700, Jeff Davis wrote: > On Tue, 2012-06-12 at 10:46 -0700, Ross Boylan wrote: > > I just wrote my first pl/pgsql function, and would appreciate any > > comments people have on it. I'll be writing a bunch of similar > > functions, with semantics "give me the id of the object if exists, > > otherwise create it and give me the id." > > > > My solution seems excessively procedural to me. > > I think what you're trying to do is procedural, and that's not > necessarily bad. You are trying to get an existing ID if available, or > assign a new ID if not; and I assume you are doing so to save the space > associated with storing the full names (essentially like dictionary > encoding). > > > I thought I could get > > the right semantics with something like > > select coalesce((select id from mytable where name='foo'), > > (insert into mytable ('name') values('foo') returning id)) > > but I could not get that to work in plgsql. > > In pl/pgsql, you can't just use SELECT, you have to use something like > SELECT INTO the_id ... or use PERFORM. > > Also, you have a potential race condition there, because someone might > insert "foo" into mytable after the select (first arg to coalesce) and > before the insert (second arg to coalesce). Practically, it's just me so there shouldn't be any risk. But I'd like to understand the general issue. I thought transaction would take care of this, so that within a transaction the state of the database does not change from actions in other sessions. Then if I commit and have conflict, the commit fails. I guess the sequence I'm using for did assures that did is unique across all transactions, and so the 2 transactions would not be in conflict, since they have different primary keys. As you said later, a unique constraint on one/some of the other fields will at least prevent bogus records that are "the same" from my point of view. I was a bit on the fence about making it unique, since I thought I might have different hosts that shared the same name. But those scenarios are unlikely, and I really don't want to deal with it if it does happen. But is my whole model of how transactions are operating off? I'm basically generalizing from Gemstone, an object database. > > > Also, I wonder if calling a column 'name' is asking for trouble. > > I recommend against it. I'll change it. > > > The actual function is a little more complicated because the table has > > the possibility of canonical entries with other entries pointing to > > them, I use 'did' for database id to avoid confusion with various other > > message ids the system uses; it's for tracking emails. > > > > Here's the code. As I said, I'd love comments on any aspect of it. > > > > /* gom = get or make functions retrieve did if available, otherwise create record > > and return its did. */ > > > > create or replace function gom_hostids(IN hostname text, IN canonicalname text = NULL, > > OUT hostid bigint, OUT canonicalid bigint) language plpgsql as $$ > > Sometimes it's good to declare your variables with names like > in_hostname or out_canonicalid, to prevent confusion reading > similarly-named variables coming from the tables. I'm slowly discovering the problems of name clashes; thanks for the suggestion. > > > DECLARE BTW, is DECLARE necessary if there are no declarations? > > BEGIN > > select did, canonical into hostid, canonicalid from host > > where name = hostname; > > if FOUND then > > return; > > end if; > > if canonicalname is not NULL then > > select did into canonicalid from host where name = canonicalname; > > if not FOUND then > > insert into host (name) values(canonicalname) returning did into canonicalid; > > end if; > > end if; > > if hostname != canonical then > > Is canonical a proper variable here? It's not in the argument list, and > it's not DECLAREd. Did you mean canonicalname? canonical is a column name in the table. Perhaps canonical_did would be more appropriate for it (and rename the output parameter out_canonical_did from canonicalid). > > > insert into host (name, canonical) values(hostname, canonicalid) > > returning did into hostid; > > else > > hostid := canonicalid; > > I would recommend keeping more explicit track of NULL values and what > should happen to them. It looks like a NULL canonicalname would fall > into this else branch (I could be mistaken), that's right. > but it's a little hard to > follow exactly what you want to happen. A few comments would help. I trimmed a comment before the start of the function; it doesn't exactly hit that issue, but it does discuss NULL's: /* Note that if this function creates a canonical host, the canonical field of that host will be NULL rather than pointing to itself. This function may return canonicalid = NULL; it does not currently set canonicalid to hostid in that case. Also if caller supplies an existing hostname with a bogus canonicalname the canonicalid on file of that hostname will be returned. */ > > > return; > > END > > $$; > > Again, you have a potential race between checking for the canonicalname > in host, and inserting it into host when it doesn't exist. If another > transaction runs in between those two, you will get duplicates in the > host table. > > > Table definition: > > create table host ( > > did bigint primary key default (nextval('rb_id_seq')), ---database id, like oid > > name text, --IP address legal, usually enclosed in []. More commonly an internet domain. > > -- domain is a sql keyword and so I used host. > > canonical bigint references host (did) --if not null, preferred name for host > > --may refer to self > > ) inherits (RBObject); > > > > I'm using Posgresql 8.4. > > I recommend a unique constraint on host.name. That will catch the race > conditions I mentioned above, and turn them into errors. You can catch > those errors in plpgsql, allowing you to do what you want in that case. > > Regards, > Jeff Davis >
В списке pgsql-novice по дате отправления: