Обсуждение: plpgsql; execute query inside exists
Hi there, I would like to use EXISTS in a small plpgsql function but I always get a "syntax error". How can I execute a query inside the EXISTS function? IF NOT EXISTS(EXECUTE 'SELECT * FROM '|| tmp_tbl) THEN CREATE TABLE tt(); ERROR: syntax error at or near "EXECUTE" LINE 1: SELECT NOT EXISTS(EXECUTE 'SELECT * FROM '|| $1 ) Sorry for my lame question but I'm new in postgres. Thanks!
On Mon, Oct 17, 2011 at 2:32 AM, <jozsef.kurucz@invitel.hu> wrote: > Hi there, > > I would like to use EXISTS in a small plpgsql function but I always > get a "syntax error". How can I execute a query inside the > EXISTS function? > > > > IF NOT EXISTS(EXECUTE 'SELECT * FROM '|| tmp_tbl) > THEN > CREATE TABLE tt(); > > > > > ERROR: syntax error at or near "EXECUTE" > LINE 1: SELECT NOT EXISTS(EXECUTE 'SELECT * FROM '|| $1 ) EXECUTE is a top level statement -- you can't run it inside a query like that. Also, EXISTS is not a way to check to see if a table does exist -- it is a clause for the presence of a row and returns true if it finds one -- but if the table does not exist you would get an SQL error. A better way to do this is to query information_schema: PERFORM 1 FROM information_schema.tables where schema_name = x and table_name = y; IF FOUND THEN CREATE TABLE ... END IF; (there is a race condition in the above code -- do you see it? if concurrent access to this function is an issue, you have to LOCK an object before running the PERFORM or perhaps use an advisory lock). merlin
On Mon, Oct 17, 2011 at 8:20 AM, Merlin Moncure <mmoncure@gmail.com> wrote: > On Mon, Oct 17, 2011 at 2:32 AM, <jozsef.kurucz@invitel.hu> wrote: >> Hi there, >> >> I would like to use EXISTS in a small plpgsql function but I always >> get a "syntax error". How can I execute a query inside the >> EXISTS function? >> >> >> >> IF NOT EXISTS(EXECUTE 'SELECT * FROM '|| tmp_tbl) >> THEN >> CREATE TABLE tt(); >> >> >> >> >> ERROR: syntax error at or near "EXECUTE" >> LINE 1: SELECT NOT EXISTS(EXECUTE 'SELECT * FROM '|| $1 ) > > > EXECUTE is a top level statement -- you can't run it inside a query > like that. Also, EXISTS is not a way to check to see if a table does > exist -- it is a clause for the presence of a row and returns true if > it finds one -- but if the table does not exist you would get an SQL > error. > > A better way to do this is to query information_schema: > > PERFORM 1 FROM information_schema.tables where schema_name = x and > table_name = y; > > IF FOUND THEN > CREATE TABLE ... > END IF; oops.. meant to say IF NOT FOUND... :-). merlin
On 17 October 2011 15:20, Merlin Moncure <mmoncure@gmail.com> wrote: > A better way to do this is to query information_schema: > > PERFORM 1 FROM information_schema.tables where schema_name = x and > table_name = y; > > IF FOUND THEN > CREATE TABLE ... > END IF; > > (there is a race condition in the above code -- do you see it? if > concurrent access to this function is an issue, you have to LOCK an > object before running the PERFORM or perhaps use an advisory lock). Is there? You'd think that with transactional DDL and the code running in a single transaction (namely inside a stored function) it would be concurrency-safe. -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.
On Mon, Oct 17, 2011 at 8:44 AM, Alban Hertroys <haramrae@gmail.com> wrote: > On 17 October 2011 15:20, Merlin Moncure <mmoncure@gmail.com> wrote: >> A better way to do this is to query information_schema: >> >> PERFORM 1 FROM information_schema.tables where schema_name = x and >> table_name = y; >> >> IF FOUND THEN >> CREATE TABLE ... >> END IF; >> >> (there is a race condition in the above code -- do you see it? if >> concurrent access to this function is an issue, you have to LOCK an >> object before running the PERFORM or perhaps use an advisory lock). > > Is there? You'd think that with transactional DDL and the code running > in a single transaction (namely inside a stored function) it would be > concurrency-safe. Transactional DDL does not protect you from race conditions any more than MVCC protects you from race conditions in regular DML. What transactional DDL does is roll back the changes in the event of an error so you don't have half written schema changes in your database. MVCC gives a rigorous definition of visibility rules and transactions guarantee only a complete unit of work getting committed to the database. You still have to code defensively against multi-user access however. The good news is that multi user coding is about an order of magnitude easier in sql (especially postgres variant) than in any other development platform that I'm aware of. The race condition here is basically the same problem that affects 'upsert' patterns: test record(s) if found update if not found insert; The problem comes that in between the test and the insert case someone else can also test and get the insert in before you do. You have two general strategies to get around this: locking and retry. I greatly advise going the locking route unless your concurrency requirements are very high. It's much simpler, and since you're not invoking a subtransaction, faster in the uncontested case. merlin
On 17 October 2011 16:24, Merlin Moncure <mmoncure@gmail.com> wrote: > On Mon, Oct 17, 2011 at 8:44 AM, Alban Hertroys <haramrae@gmail.com> wrote: >> On 17 October 2011 15:20, Merlin Moncure <mmoncure@gmail.com> wrote: >>> A better way to do this is to query information_schema: >>> >>> PERFORM 1 FROM information_schema.tables where schema_name = x and >>> table_name = y; >>> >>> IF FOUND THEN >>> CREATE TABLE ... >>> END IF; >>> >>> (there is a race condition in the above code -- do you see it? if >>> concurrent access to this function is an issue, you have to LOCK an >>> object before running the PERFORM or perhaps use an advisory lock). >> >> Is there? You'd think that with transactional DDL and the code running >> in a single transaction (namely inside a stored function) it would be >> concurrency-safe. > > Transactional DDL does not protect you from race conditions any more > than MVCC protects you from race conditions in regular DML. What > transactional DDL does is roll back the changes in the event of an > error so you don't have half written schema changes in your database. > MVCC gives a rigorous definition of visibility rules and transactions > guarantee only a complete unit of work getting committed to the > database. You still have to code defensively against multi-user > access however. The good news is that multi user coding is about an > order of magnitude easier in sql (especially postgres variant) than in > any other development platform that I'm aware of. > > The race condition here is basically the same problem that affects > 'upsert' patterns: > > test record(s) if found update if not found insert; > > The problem comes that in between the test and the insert case someone > else can also test and get the insert in before you do. You have two > general strategies to get around this: locking and retry. I greatly > advise going the locking route unless your concurrency requirements > are very high. It's much simpler, and since you're not invoking a > subtransaction, faster in the uncontested case. So what would happen if you don't lock? I think it's this: Session A | Session B ----------------+----------------- SELECT x | SELECT x NOT FOUND | NOT FOUND ... | CREATE TABLE CREATE TABLE | <commit> <error> | <rollback> | If I understand correctly, if you don't mind the error and the subsequent rollback in Session A, than there's not much need to lock, or is there? It is important to be aware of the possible rollback of such a transaction, of course. And what would you lock? A record in information_schema.tables? That's a read-only view. A table that doesn't exist yet? Can't do. A record in the pg_ schema? Rather not... I suppose you could work around that problem by keeping track of your own tables that were generated using aforementioned plpgsql function. Then you have a table (that you own) with records to lock. -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.
On Mon, Oct 17, 2011 at 10:28 AM, Alban Hertroys <haramrae@gmail.com> wrote: > On 17 October 2011 16:24, Merlin Moncure <mmoncure@gmail.com> wrote: >> On Mon, Oct 17, 2011 at 8:44 AM, Alban Hertroys <haramrae@gmail.com> wrote: >>> On 17 October 2011 15:20, Merlin Moncure <mmoncure@gmail.com> wrote: >>>> A better way to do this is to query information_schema: >>>> >>>> PERFORM 1 FROM information_schema.tables where schema_name = x and >>>> table_name = y; >>>> >>>> IF FOUND THEN >>>> CREATE TABLE ... >>>> END IF; >>>> >>>> (there is a race condition in the above code -- do you see it? if >>>> concurrent access to this function is an issue, you have to LOCK an >>>> object before running the PERFORM or perhaps use an advisory lock). >>> >>> Is there? You'd think that with transactional DDL and the code running >>> in a single transaction (namely inside a stored function) it would be >>> concurrency-safe. >> >> Transactional DDL does not protect you from race conditions any more >> than MVCC protects you from race conditions in regular DML. What >> transactional DDL does is roll back the changes in the event of an >> error so you don't have half written schema changes in your database. >> MVCC gives a rigorous definition of visibility rules and transactions >> guarantee only a complete unit of work getting committed to the >> database. You still have to code defensively against multi-user >> access however. The good news is that multi user coding is about an >> order of magnitude easier in sql (especially postgres variant) than in >> any other development platform that I'm aware of. >> >> The race condition here is basically the same problem that affects >> 'upsert' patterns: >> >> test record(s) if found update if not found insert; >> >> The problem comes that in between the test and the insert case someone >> else can also test and get the insert in before you do. You have two >> general strategies to get around this: locking and retry. I greatly >> advise going the locking route unless your concurrency requirements >> are very high. It's much simpler, and since you're not invoking a >> subtransaction, faster in the uncontested case. > > So what would happen if you don't lock? I think it's this: > > Session A | Session B > ----------------+----------------- > SELECT x | SELECT x > NOT FOUND | NOT FOUND > ... | CREATE TABLE > CREATE TABLE | <commit> > <error> | > <rollback> | > > If I understand correctly, if you don't mind the error and the > subsequent rollback in Session A, than there's not much need to lock, > or is there? It is important to be aware of the possible rollback of > such a transaction, of course. right -- allowing for rollback and retrying (either from the client or in the procedure via sub-transaction) is always an option. I prefer to lock -- it's faster (usually) and gives more regular behavior. > And what would you lock? A record in information_schema.tables? That's > a read-only view. A table that doesn't exist yet? Can't do. A record > in the pg_ schema? Rather not... > I suppose you could work around that problem by keeping track of your > own tables that were generated using aforementioned plpgsql function. > Then you have a table (that you own) with records to lock. yeah -- you could use an advisory lock or a special table created for that purpose, or a row of a table that does your tracking. Agree that locking system catalogs is *not* advisable. merlin
Hi, Thanks for the reply! But I don't want to check if the table exists, I want to see the result of the SELECT query, if a row presence or not. The tmp_tbl is a dynamic generated table name, but when I write the code without EXECUTE, I get syntax error too. In this case how can I check if a SELECT has result or not? SELECT INTO rndmd5 md5(random()::text); tmp_tbl := 'tbl_tmp_' || rndmd5; IF NOT EXISTS(SELECT * FROM tmp_tbl) THEN END IF; ERROR: syntax error at or near "$1" LINE 1: SELECT NOT EXISTS(SELECT * FROM $1 ) Thanks! 2011/10/17 Merlin Moncure <mmoncure@gmail.com>: > On Mon, Oct 17, 2011 at 8:20 AM, Merlin Moncure <mmoncure@gmail.com> wrote: >> On Mon, Oct 17, 2011 at 2:32 AM, <jozsef.kurucz@invitel.hu> wrote: >>> Hi there, >>> >>> I would like to use EXISTS in a small plpgsql function but I always >>> get a "syntax error". How can I execute a query inside the >>> EXISTS function? >>> >>> >>> >>> IF NOT EXISTS(EXECUTE 'SELECT * FROM '|| tmp_tbl) >>> THEN >>> CREATE TABLE tt(); >>> >>> >>> >>> >>> ERROR: syntax error at or near "EXECUTE" >>> LINE 1: SELECT NOT EXISTS(EXECUTE 'SELECT * FROM '|| $1 ) >> >> >> EXECUTE is a top level statement -- you can't run it inside a query >> like that. Also, EXISTS is not a way to check to see if a table does >> exist -- it is a clause for the presence of a row and returns true if >> it finds one -- but if the table does not exist you would get an SQL >> error. >> >> A better way to do this is to query information_schema: >> >> PERFORM 1 FROM information_schema.tables where schema_name = x and >> table_name = y; >> >> IF FOUND THEN >> CREATE TABLE ... >> END IF; > > oops.. meant to say IF NOT FOUND... :-). > > merlin > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
On 18 October 2011 09:57, <jozsef.kurucz@invitel.hu> wrote: > Hi, > > Thanks for the reply! > But I don't want to check if the table exists, I want to see the > result of the SELECT query, if a row presence or not. So you want to check that the table contains data? In that case it makes no sense to create the table if it doesn't contain data. It may very well exist already. > The tmp_tbl is a dynamic generated table name, but when I write the > code without EXECUTE, I get syntax error too. They were explaining why you got the error, they were not telling you to leave out EXECUTE for dynamic SQL. > In this case how can I check if a SELECT has result or not? > > SELECT INTO rndmd5 md5(random()::text); > > tmp_tbl := 'tbl_tmp_' || rndmd5; > > > IF NOT EXISTS(SELECT * FROM tmp_tbl) > THEN > END IF; You really should at least limit the possible amount of results from that SELECT statement. You're not interested in the results. Anyway, the way to do this in plpgsql is: EXECUTE SELECT * FROM tmp_tbl LIMIT 1; IF NOT FOUND THEN ... END IF; -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.