Discussion: "stored procedures"
So the topic of "real" "stored procedures" came up again. Meaning a function-like object that executes outside of a regular transaction, with the ability to start and stop SQL transactions itself. I would like to collect some specs on this feature. So does anyone have links to documentation of existing implementations, or their own spec writeup? A lot of people appear to have a very clear idea of this concept in their own head, so let's start collecting those.
Hi Peter 2011/4/21 Peter Eisentraut <peter_e@gmx.net>: > So the topic of "real" "stored procedures" came up again. Meaning a > function-like object that executes outside of a regular transaction, > with the ability to start and stop SQL transactions itself. > > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. > I had a patch for "transactional" procedures, but it has been lost :( http://okbob.blogspot.com/2007/11/stacked-recordset-multirecordset.html What I (we) expect - very important points: 1. support for explicit transaction control, not only subtransactions; 2. the correct, usual behavior of OUT parameters (important for the JDBC people) *** attention: overloading interacts with OUT parameters too ***. Not necessary but nice: 3. support for multiple result sets and the RETURN_STATUS variable (RETURN_STATUS is defined by ANSI). Regards Pavel > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Apr 21, 2011 at 11:24 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > So the topic of "real" "stored procedures" came up again. Meaning a > function-like object that executes outside of a regular transaction, > with the ability to start and stop SQL transactions itself. > > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. EDB has an implementation of this in Advanced Server. A stored procedure can issue a COMMIT, which commits the current transaction and begins a new one. This might or might not be what people are imagining for this feature. If we end up doing something else, one thing to consider is the impact on third-party tools like PGPOOL, which currently keep track of whether or not a transaction is in progress by snooping on the stream of SQL commands. If a procedure can be started with no transaction in progress and return with one open, or the other way around, that method will break horribly. That's not necessarily a reason not to do it, but I suspect we would want to add some kind of protocol-level information about the transaction state instead so that such tools could continue to work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > EDB has an implementation of this in Advanced Server. A stored > procedure can issue a COMMIT, which commits the current transaction > and begins a new one. This might or might not be what people are > imagining for this feature. If we end up doing something else, one > thing to consider is the impact on third-party tools like PGPOOL, > which currently keep track of whether or not a transaction is in > progress by snooping on the stream of SQL commands. If a procedure > can be started with no transaction in progress and return with one > open, or the other way around, that method will break horribly. > That's not necessarily a reason not to do it, but I suspect we would > want to add some kind of protocol-level information about the > transaction state instead so that such tools could continue to work. Huh? There's been a transaction state indicator in the protocol since 7.4 (see ReadyForQuery). It's not our problem if PGPOOL is still using methods that were appropriate ten years ago. regards, tom lane
On Thu, Apr 21, 2011 at 12:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> EDB has an implementation of this in Advanced Server. A stored >> procedure can issue a COMMIT, which commits the current transaction >> and begins a new one. This might or might not be what people are >> imagining for this feature. If we end up doing something else, one >> thing to consider is the impact on third-party tools like PGPOOL, >> which currently keep track of whether or not a transaction is in >> progress by snooping on the stream of SQL commands. If a procedure >> can be started with no transaction in progress and return with one >> open, or the other way around, that method will break horribly. >> That's not necessarily a reason not to do it, but I suspect we would >> want to add some kind of protocol-level information about the >> transaction state instead so that such tools could continue to work. > > Huh? There's been a transaction state indicator in the protocol since > 7.4 (see ReadyForQuery). It's not our problem if PGPOOL is still using > methods that were appropriate ten years ago. Hmm. Well, maybe we need some PGPOOL folks to weigh in. Possibly it's just a case of "it ain't broke, so we haven't fixed it". -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
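For reference, the transaction state Tom mentions rides on the ReadyForQuery message ('Z') that the backend sends at the end of each command cycle: after the int32 length comes a single status byte, 'I' (idle), 'T' (inside a transaction block), or 'E' (failed transaction). A minimal sketch of how a pooler could decode it, assuming it already splits the stream into protocol messages:

```python
import struct

# ReadyForQuery ('Z') carries one status byte after the int32 length:
#   b'I' - idle, b'T' - inside a transaction block, b'E' - failed transaction
STATUS = {b"I": "idle", b"T": "in_transaction", b"E": "failed_transaction"}

def transaction_state(msg: bytes) -> str:
    """Decode a ReadyForQuery message (type byte + int32 length + status byte)."""
    if msg[0:1] != b"Z":
        raise ValueError("not a ReadyForQuery message")
    (length,) = struct.unpack("!I", msg[1:5])  # length includes itself: always 5
    if length != 5:
        raise ValueError("unexpected ReadyForQuery length")
    return STATUS[msg[5:6]]

# Example: the backend reports an open transaction block
assert transaction_state(b"Z" + struct.pack("!I", 5) + b"T") == "in_transaction"
```

A pooler watching this byte needs no SQL snooping at all, which is the point of Tom's objection.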
Peter, > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. Delta between SPs and Functions for PostgreSQL: * SPs are executed using CALL or EXECUTE, and not SELECT. * SPs do not return a value ** optional: SPs *may* have OUT parameters. * SPs have internal transactions including begin/commit ** optional: SPs can run non-transaction statements, like CREATE INDEX CONCURRENTLY and VACUUM ** corollary: SPs may not be called as part of a larger query ** question: if an SP is called by another SP, what is its transaction context? * optional: SPs can return multisets (ala SQL Server). ** question: how would multisets be handled on the client end? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Hello 2011/4/21 Josh Berkus <josh@agliodbs.com>: > Peter, > >> I would like to collect some specs on this feature. So does anyone have >> links to documentation of existing implementations, or their own spec >> writeup? A lot of people appear to have a very clear idea of this >> concept in their own head, so let's start collecting those. > > Delta between SPs and Functions for PostgreSQL: > > * SPs are executed using CALL or EXECUTE, and not SELECT. > > * SPs do not return a value > ** optional: SPs *may* have OUT parameters. An SP can return a value - a result status or RETURNED_SQLSTATE. The result status is a hidden OUT parameter. > > * SPs have internal transactions including begin/commit > ** optional: SPs can run non-transaction statements, > like CREATE INDEX CONCURRENTLY and VACUUM > ** corollary: SPs may not be called as part of a larger query > ** question: if an SP is called by another SP, what is its > transaction context? > > * optional: SPs can return multisets (ala SQL Server). > ** question: how would multisets be handled on the client end? > You should use some "next" function to iterate over the result sets: http://dev.mysql.com/doc/refman/5.0/en/mysql-next-result.html A similar function exists in the MSSQL API too. Regards Pavel Stehule > -- > Josh Berkus > PostgreSQL Experts Inc. > http://pgexperts.com
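The "next result" iteration Pavel points at can be sketched on the client side; the names here (a MultiResultCursor class with a next_result() method) are hypothetical stand-ins for whatever the driver would actually expose, modeled loosely on mysql_next_result():

```python
# Hypothetical client-side shape for a multi-result-set CALL: the driver
# exposes the current result set plus a way to advance to the next one
# until none remain, as in the MySQL and MSSQL client APIs.
class MultiResultCursor:
    def __init__(self, result_sets):
        self._sets = list(result_sets)   # e.g. produced by CALL my_proc(...)
        self._pos = 0

    def fetchall(self):
        return self._sets[self._pos]

    def next_result(self) -> bool:
        """Advance to the next result set; False when exhausted."""
        if self._pos + 1 >= len(self._sets):
            return False
        self._pos += 1
        return True

# Drain every result set, as a tool like psql would when displaying each in turn
cur = MultiResultCursor([[("a", 1)], [("b", 2), ("c", 3)]])
all_sets = [cur.fetchall()]
while cur.next_result():
    all_sets.append(cur.fetchall())
assert all_sets == [[("a", 1)], [("b", 2), ("c", 3)]]
```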
I'm pretty close to agreement with Josh, I think. Josh Berkus <josh@agliodbs.com> wrote: > Delta between SPs and Functions for PostgreSQL: > > * SPs are executed using CALL or EXECUTE, and not SELECT. Agreed, although some products will search for a matching procedure name if the start of a statement doesn't match any reserved word. That can be handy -- you run them more or less like commands. > * SPs do not return a value I've used some products where these were available, although in some cases only setting what in PostgreSQL would be the equivalent of an integer session GUC. > ** optional: SPs *may* have OUT parameters. Support for those would be important to handle some common uses of SPs. > * SPs have internal transactions including begin/commit Yeah. Entering or leaving an SP should not start or end a transaction. BEGIN, COMMIT, ROLLBACK, and SAVEPOINT should all be available and should not disrupt statement flow. > ** optional: SPs can run non-transaction statements, > like CREATE INDEX CONCURRENTLY and VACUUM That seems important. > ** corollary: SPs may not be called as part of a larger query OK. > ** question: if an SP is called by another SP, what is its > transaction context? Entering or leaving an SP should not start or end a transaction. > * optional: SPs can return multisets (ala SQL Server). I think that's important. > ** question: how would multisets be handled on the client end? In previous discussions there seemed to be a feeling that unless we were going to go to a new major version of the protocol, the return from an SP would be an array of result sets. We would probably want to reserve the first one for OUT parameters (and if we decide to support it, the return value). Tools like psql would need to display each in its turn, similar to what we do for some backslash commands. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Josh Berkus <josh@agliodbs.com> wrote: >> ** question: if an SP is called by another SP, what is its >> transaction context? > Entering or leaving an SP should not start or end a transaction. That all sounds mighty hand-wavy and at serious risk of tripping over implementation details. Some things to think about: 1. Are you expecting the procedure definition to be fetched from a system catalog? You're going to need to be inside a transaction to do that. 2. Are you expecting the procedure to take any input parameters? You're going to need to be inside a transaction to evaluate the inputs, unless perhaps you restrict the feature to an extremely lobotomized subset of possible arguments (no user-defined types, no expressions, just for starters). 3. What sort of primitive operations do you expect the SP to be able to execute "outside a transaction"? The plpgsql model where all the primitive operations are really SQL ain't gonna work. I think that we could finesse #1 and #2, along these lines: The CALL command is ordinary SQL but not allowed inside a transaction block, much like some existing commands like VACUUM. So we start a transaction to parse and execute it. The CALL looks up the procedure definition and evaluates any input arguments. It then copies this info to some outside-the-transaction memory context, terminates its transaction, and calls the procedure. On return it starts a new transaction, in which it can call the output functions that are going to have to be executed in order to pass anything back to the client. (This implies that OUT argument values are collected up during SP execution and not actually passed back to the client till later. People who were hoping to stream vast amounts of data to the client will not be happy. But I see no way around that unless you want to try to execute output functions outside a transaction, which strikes me as a quagmire.) I'm less sure what to do about #3. 
The most attractive approach would probably be to make people use a non-SQL script interpreter --- perl, python, or whatever floats your boat --- which would likely mean that we have not just one SP implementation language but N of them. But we've solved that problem before. Calling another SP ... particularly one with a different implementation language ... could be a bit tricky too. The above proposal assumes that SPs are always entered outside a transaction, but do we want to make that same restriction for the call-another-SP case? And if not, how's it going to work? Again, you'll have to be inside a transaction at least long enough to get the SP's definition out of the catalogs. regards, tom lane
On Thu, Apr 21, 2011 at 1:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > 3. What sort of primitive operations do you expect the SP to be > able to execute "outside a transaction"? The plpgsql model where > all the primitive operations are really SQL ain't gonna work. > I'm less sure what to do about #3. The most attractive approach would > probably be to make people use a non-SQL script interpreter --- perl, > python, or whatever floats your boat --- which would likely mean that > we have not just one SP implementation language but N of them. But > we've solved that problem before. Does this mean you do or don't expect plpgsql to be able to run as a procedure? Should SPI-based routines generally be able to run as a procedure (I hope so)? If so, what API enhancements would be needed? (I was thinking of SPI_is_proc, or something like that.) I'd like to see plpgsql work as much as possible as it does now, except obviously you can't have exception handlers. What about cancelling? Cancel the currently running query, or the whole procedure (I'm assuming the latter)? How would that work? > Calling another SP ... particularly one with a different implementation > language ... could be a bit tricky too. The above proposal assumes that > SPs are always entered outside a transaction, but do we want to make > that same restriction for the call-another-SP case? And if not, how's > it going to work? Again, you'll have to be inside a transaction at > least long enough to get the SP's definition out of the catalogs. This restriction (CALL only when no transaction is open) is OK, I think. You can always code up a function otherwise. merlin
2011/4/21 Tom Lane <tgl@sss.pgh.pa.us>: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Josh Berkus <josh@agliodbs.com> wrote: >>> ** question: if an SP is called by another SP, what is its >>> transaction context? > >> Entering or leaving an SP should not start or end a transaction. > > That all sounds mighty hand-wavy and at serious risk of tripping over > implementation details. Some things to think about: That doesn't mean the SQL inside an SP is non-transactional. A stored procedure is just a client module moved onto the server. You can call SQL statements from psql without an outer implicit or explicit transaction too. The point is that a CALL statement should not start an outer transaction when one isn't requested, but all the inner SQL statements still run in their own transactions. The questions about mutable or immutable parameters are important - but they don't mean an SP without an outer transaction is impossible. Regards Pavel > > 1. Are you expecting the procedure definition to be fetched from a > system catalog? You're going to need to be inside a transaction > to do that. > > 2. Are you expecting the procedure to take any input parameters? > You're going to need to be inside a transaction to evaluate the > inputs, unless perhaps you restrict the feature to an extremely > lobotomized subset of possible arguments (no user-defined types, > no expressions, just for starters). > > 3. What sort of primitive operations do you expect the SP to be > able to execute "outside a transaction"? The plpgsql model where > all the primitive operations are really SQL ain't gonna work. > > I think that we could finesse #1 and #2, along these lines: > The CALL command is ordinary SQL but not allowed inside a transaction > block, much like some existing commands like VACUUM. So we start a > transaction to parse and execute it. The CALL looks up the procedure > definition and evaluates any input arguments.
It then copies this info to > some outside-the-transaction memory context, terminates its transaction, > and calls the procedure. On return it starts a new transaction, in > which it can call the output functions that are going to have to be > executed in order to pass anything back to the client. (This implies > that OUT argument values are collected up during SP execution and not > actually passed back to the client till later. People who were hoping > to stream vast amounts of data to the client will not be happy. But > I see no way around that unless you want to try to execute output > functions outside a transaction, which strikes me as a quagmire.) > > I'm less sure what to do about #3. The most attractive approach would > probably be to make people use a non-SQL script interpreter --- perl, > python, or whatever floats your boat --- which would likely mean that > we have not just one SP implementation language but N of them. But > we've solved that problem before. > > Calling another SP ... particularly one with a different implementation > language ... could be a bit tricky too. The above proposal assumes that > SPs are always entered outside a transaction, but do we want to make > that same restriction for the call-another-SP case? And if not, how's > it going to work? Again, you'll have to be inside a transaction at > least long enough to get the SP's definition out of the catalogs. > > regards, tom lane
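Tom's proposed CALL sequence can be sketched as a small state machine. Everything here is illustrative, not actual backend API: the catalog is a dict, eval_expr stands in for real expression evaluation, and the trace marks where the bracketing transactions begin and end:

```python
# Illustrative control flow for the proposal: CALL is rejected inside a
# transaction block, brackets its setup (catalog lookup, argument
# evaluation) in one transaction, runs the procedure body outside any
# transaction, then opens a second transaction to return OUT values.
def call_procedure(catalog, name, arg_exprs, in_transaction_block=False):
    if in_transaction_block:
        raise RuntimeError("CALL cannot run inside a transaction block")

    trace = ["begin txn"]                 # txn 1: parse CALL, look up proc,
    proc = catalog[name]                  #        evaluate input arguments
    args = [eval_expr(e) for e in arg_exprs]
    saved = (proc, list(args))            # copy to outside-txn memory context
    trace.append("commit txn")

    outs = saved[0](*saved[1])            # run the body; it may start and
    trace.append("run procedure")         # commit transactions itself

    trace.append("begin txn")             # txn 2: run output functions and
    trace.append("return OUT params")     #        send OUT values to client
    trace.append("commit txn")
    return outs, trace

def eval_expr(e):
    return e  # stand-in for real input-expression evaluation

catalog = {"transfer": lambda a, b: {"moved": a + b}}
outs, trace = call_procedure(catalog, "transfer", [40, 2])
assert outs == {"moved": 42}
```

Note how the OUT values are only shipped back in the second transaction, which is exactly why Tom observes that streaming results during execution would not work under this scheme.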
On Thu, Apr 21, 2011 at 2:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Josh Berkus <josh@agliodbs.com> wrote: >>> ** question: if an SP is called by another SP, what is its >>> transaction context? > >> Entering or leaving an SP should not start or end a transaction. > > That all sounds mighty hand-wavy and at serious risk of tripping over > implementation details. Some things to think about: > > 1. Are you expecting the procedure definition to be fetched from a > system catalog? You're going to need to be inside a transaction > to do that. > > 2. Are you expecting the procedure to take any input parameters? > You're going to need to be inside a transaction to evaluate the > inputs, unless perhaps you restrict the feature to an extremely > lobotomized subset of possible arguments (no user-defined types, > no expressions, just for starters). > > 3. What sort of primitive operations do you expect the SP to be > able to execute "outside a transaction"? The plpgsql model where > all the primitive operations are really SQL ain't gonna work. I think we could handle a lot of these details cleanly if we had autonomous transactions as a system primitive. When you enter a stored procedure at the outermost level, you begin a transaction, which will remain open until the outermost stored procedure exits. Any transactions that the stored procedure begins, commits, or rolls back are in fact autonomous subtransactions under the hood. Possibly conditions like IF (1/0) THEN ... END IF that throw run time errors get evaluated in the outer transaction context, so any errors stops execution at that point - and we also avoid beginning and ending a gabazillion transactions. Possibly I am still waving my hands. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Apr 21, 2011 at 2:37 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Apr 21, 2011 at 2:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >>> Josh Berkus <josh@agliodbs.com> wrote: >>>> ** question: if an SP is called by another SP, what is its >>>> transaction context? >> >>> Entering or leaving an SP should not start or end a transaction. >> >> That all sounds mighty hand-wavy and at serious risk of tripping over >> implementation details. Some things to think about: >> >> 1. Are you expecting the procedure definition to be fetched from a >> system catalog? You're going to need to be inside a transaction >> to do that. >> >> 2. Are you expecting the procedure to take any input parameters? >> You're going to need to be inside a transaction to evaluate the >> inputs, unless perhaps you restrict the feature to an extremely >> lobotomized subset of possible arguments (no user-defined types, >> no expressions, just for starters). >> >> 3. What sort of primitive operations do you expect the SP to be >> able to execute "outside a transaction"? The plpgsql model where >> all the primitive operations are really SQL ain't gonna work. > > I think we could handle a lot of these details cleanly if we had > autonomous transactions as a system primitive. When you enter a > stored procedure at the outermost level, you begin a transaction, > which will remain open until the outermost stored procedure exits. If you do it that (base it on AT) way, then you can't: 1) call any utility command (vacuum, etc) 2) run for an arbitrary amount of time 3) discard any locks (except advisory) 4) deal with serialization isolation/mvcc snapshot issues that plague functions. Points 2 & (especially) 4 for me are painful. 
#4 explained: If you are trying to tuck all the gory MVCC details into server-side functions, there is no really effective way to prevent serialization errors, because the snapshot is already taken when you enter the function. Even if you LOCK something on function line #1, it's already too late. Non-transactional procedures don't have this problem, and they allow encapsulating all that nastiness in the server. merlin
On Thu, Apr 21, 2011 at 3:51 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > On Thu, Apr 21, 2011 at 2:37 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Thu, Apr 21, 2011 at 2:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >>>> Josh Berkus <josh@agliodbs.com> wrote: >>>>> ** question: if an SP is called by another SP, what is its >>>>> transaction context? >>> >>>> Entering or leaving an SP should not start or end a transaction. >>> >>> That all sounds mighty hand-wavy and at serious risk of tripping over >>> implementation details. Some things to think about: >>> >>> 1. Are you expecting the procedure definition to be fetched from a >>> system catalog? You're going to need to be inside a transaction >>> to do that. >>> >>> 2. Are you expecting the procedure to take any input parameters? >>> You're going to need to be inside a transaction to evaluate the >>> inputs, unless perhaps you restrict the feature to an extremely >>> lobotomized subset of possible arguments (no user-defined types, >>> no expressions, just for starters). >>> >>> 3. What sort of primitive operations do you expect the SP to be >>> able to execute "outside a transaction"? The plpgsql model where >>> all the primitive operations are really SQL ain't gonna work. >> >> I think we could handle a lot of these details cleanly if we had >> autonomous transactions as a system primitive. When you enter a >> stored procedure at the outermost level, you begin a transaction, >> which will remain open until the outermost stored procedure exits. > > If you do it that (base it on AT) way, then you can't: > 1) call any utility command (vacuum, etc) > 2) run for an arbitrary amount of time > 3) discard any locks (except advisory) > 4) deal with serialization isolation/mvcc snapshot issues that plague functions. > > Points 2 & (especially) 4 for me are painful. 
> > #4 explained: > If you are trying to tuck all the gory mvcc details into server side > functions, there is no real effective way to prevent serialization > errors because the snapshot is already made when you enter the > function. Even if you LOCK something on function line#1, it's already > too late. No transaction procedures don't have this problem and allow > encapsulating all that nastiness in the server. Yes, those sound like a potent set of restrictions that "gut" what the facility ought to be able to be useful for. If what you want is something that runs inside a pre-existing transaction, that rules out doing VACUUM or, really, *anything* that generates transactions, without jumping through hoops to try to change their behaviour. My preference would be to expect that stored procedures are sure to generate at least one transaction, and potentially as many more as they choose to generate. One of the most recent things I implemented was a process that does bulk updates to customer balances. We don't want the balance tuples locked, so the process needs to COMMIT after each update. At present, that means I'm doing a round trip from client to server each time. If I had these "autonomous transaction procedures," I could perhaps do the whole thing in a stored procedure, which would: a) Pull the list of transactions it's supposed to process; b) Loop on them: - BEGIN; Do the processing for a transaction, COMMIT. That's not terribly different from a vacuum utility that: a) Pulls a list of tables it's supposed to vacuum; b) Loop on them: VACUUM the table Autovac ought to make that sort of thing limitedly useful; you'd usually rather just use autovac. Mind you, we might discover that implementing autovac mostly in the stored procedure language is easier and better than having it mostly in C. And this might further make it easy to add "hooks" to allow site-specific logic to affect autovacuum policy. 
(Note that Slony-I versions 1.0, 1.1, and possibly 1.2 had the 'cleanup thread' - which, notably, vacuums tables - written mostly in C. 2.0 shifted the bulk of the logic into pl/pgsql, which made it much simpler to read and verify, and made some of the components usable by administrators.) I'd expect SPs NOT to be nestable, or at least, not in a sense that allows rolling back the activity of a "child" that thought it COMMITted its work. It seems to me that we've already got perfectly good stored functions that run strictly inside an existing transactional context - if you want logic that does that, use a stored function; it's already perfectly good for that. If you want a stored procedure that runs its own transaction(s), do so; don't expect every kind of transactional logic out of SPs. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
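The balance-update pattern described above - commit after each row so no tuple stays locked for the whole batch - is easy to sketch with Python's sqlite3 standing in for the server-side procedure loop (the schema and amounts are invented for illustration):

```python
import sqlite3

# Commit-per-row batch update: each balance change is its own transaction,
# so row locks are released immediately instead of being held for the run.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balances (customer TEXT PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO balances VALUES (?, ?)",
                 [("alice", 100), ("bob", 200)])
conn.commit()

pending = [("alice", -30), ("bob", 50)]   # a) the list of work to process
for customer, delta in pending:           # b) loop, committing per item
    conn.execute("UPDATE balances SET amount = amount + ? WHERE customer = ?",
                 (delta, customer))
    conn.commit()                         # locks on this row drop here

rows = dict(conn.execute("SELECT customer, amount FROM balances"))
assert rows == {"alice": 70, "bob": 250}
```

Today this loop has to live in the client, with a round trip per commit; a transaction-controlling stored procedure would let the whole loop run server-side.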
Peter Eisentraut wrote: > So the topic of "real" "stored procedures" came up again. Meaning a > function-like object that executes outside of a regular transaction, > with the ability to start and stop SQL transactions itself. > > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. I've thought a lot about this too. The general case of a stored procedure should be all-powerful, and be able to directly invoke any code written in SQL or other languages that a DBMS client can directly invoke on the DBMS, as if it were a client, except that the procedure is stored and executed entirely in the DBMS. But the stored procedure also has its own lexical variables and supports conditionals and iteration and recursion. A stored procedure is invoked as a statement and doesn't have a "return" value; in contrast, a function has a return value and is invoked within a value expression of a statement. A stored procedure can see and update the database, and can have IN/INOUT/OUT parameters. A stored procedure can have side-effects out of band, such as user I/O, if Pg supports that. The general stored procedure should be orthogonal to other concerns, in particular to transactions and savepoints; executing one should not implicitly start or commit or rollback a transaction or savepoint. However, it should be possible to explicitly declare that a procedure is a transaction, so that its start and end are neatly paired regardless of how the procedure exits - that is, the transaction lifetime is attached to the procedure's lexical scope - but this would be optional. A stored procedure should be able to do data manipulation, data definition, explicit transaction control (except perhaps when defined to be a transaction), privilege control, message passing, and so on.
As for semantics, let's say that when a stored procedure is invoked, its definition is pulled from the system catalog in a snapshot and compiled, then run normally no matter what it does, even if the definition of the procedure itself is changed during its execution; in the latter case, that just means that once the execution finishes, subsequent calls would invoke the updated version or fail. So just compiling the procedure may need a catalog lock or whatever, but once it starts executing a transaction isn't required. Any stored procedure in general should be able to invoke stored procedures, to any level of nesting, just like in any normal programming language. There might be restrictions on what individual procedures can do depending on how they're declared; for example, if one is declared to have a scope-bound transaction, then it, or ones it invokes, can't contain explicit transaction control statements. But such restrictions are an orthogonal or case-dependent matter. (When we have a distinct stored procedure, I also believe that a stored function should be more restricted, such as only having IN parameters and not being able to see the database except by way of its parameters, and that it should be deterministic. But that ship has sailed and I'm not going to argue for any changes to functions.) -- Darren Duncan
Merlin Moncure <mmoncure@gmail.com> writes: > On Thu, Apr 21, 2011 at 1:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> 3. What sort of primitive operations do you expect the SP to be >> able to execute "outside a transaction"? The plpgsql model where >> all the primitive operations are really SQL ain't gonna work. > Does this mean you do or don't expect plpgsql to be able to run as > procedure? Should SPI based routines generally be able to run as a > procedure (I hope so)? If so, what API enhancements would be needed? > (I was thinking, SPI_is_proc, or something like that). I'd like to > see plpgsql work as much as possible as it does now, except obviously > you can't have exception handlers. You can't have arithmetic, comparisons, or much of anything outside a transaction with plpgsql. That model just plain doesn't work for this purpose, I think. You really want a control language that's independent of the SQL engine, and for better or worse plpgsql is built inside that engine. > What about cancelling? Cancel the current running query, or the whole > procedure (I'm assuming the latter? How would that work? Good question. If you're imagining that the SP could decide to cancel a database request partway through, it seems even further afield from what could reasonably be done in a single-threaded backend. Maybe we should think about the SP controlling a second backend (or even multiple backends?) that's executing the "transactional" operations. dblink on steroids, as it were. regards, tom lane
> Robert Haas <robertmhaas@gmail.com> writes: >> EDB has an implementation of this in Advanced Server. A stored >> procedure can issue a COMMIT, which commits the current transaction >> and begins a new one. This might or might not be what people are >> imagining for this feature. If we end up doing something else, one >> thing to consider is the impact on third-party tools like PGPOOL, >> which currently keep track of whether or not a transaction is in >> progress by snooping on the stream of SQL commands. If a procedure >> can be started with no transaction in progress and return with one >> open, or the other way around, that method will break horribly. >> That's not necessarily a reason not to do it, but I suspect we would >> want to add some kind of protocol-level information about the >> transaction state instead so that such tools could continue to work. > > Huh? There's been a transaction state indicator in the protocol since > 7.4 (see ReadyForQuery). It's not our problem if PGPOOL is still using > methods that were appropriate ten years ago. Pgpool has been using the info since 2004 (7.4 was born in 2003). -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
On 4/21/11 3:07 PM, Tom Lane wrote: > Maybe we should think about the SP controlling a second backend (or even > multiple backends?) that's executing the "transactional" operations. > dblink on steroids, as it were. This is how people are doing this now (using dblink I mean). -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > On 4/21/11 3:07 PM, Tom Lane wrote: >> Maybe we should think about the SP controlling a second backend (or even >> multiple backends?) that's executing the "transactional" operations. >> dblink on steroids, as it were. > This is how people are doing this now (using dblink I mean). Right, and it works. But it's notationally painful, management of the connection information poses security issues, etc etc. Perhaps those sorts of things could be addressed, though. regards, tom lane
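For readers following along, the dblink workaround being discussed looks roughly like this (a sketch only; it assumes the dblink module is installed, a loopback connection is permitted, and a hypothetical audit_log table exists):

```sql
-- Poor man's autonomous transaction: the INSERT runs in a second backend
-- and commits there immediately, so it survives even if the caller's
-- transaction later rolls back.
CREATE OR REPLACE FUNCTION log_autonomous(msg text) RETURNS void AS $$
BEGIN
    PERFORM dblink_connect('loopback', 'dbname=' || current_database());
    PERFORM dblink_exec('loopback',
        'INSERT INTO audit_log(message) VALUES (' || quote_literal(msg) || ')');
    PERFORM dblink_disconnect('loopback');
END;
$$ LANGUAGE plpgsql;
```

The notational pain and the connection-credential management visible here are exactly the drawbacks Tom identifies.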
On Apr 21, 2011, at 3:51 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > If you do it that (base it on AT) way, then you can't: > 1) call any utility command (vacuum, etc) > 2) run for an arbitrary amount of time > 3) discard any locks (except advisory) > 4) deal with serialization isolation/mvcc snapshot issues that plague functions. It is not obvious to me that you cannot do these things. > ...Robert
>> What about cancelling? Cancel the current running query, or the whole >> procedure (I'm assuming the latter? How would that work? > > Good question. If you're imagining that the SP could decide to cancel a > database request partway through, it seems even further afield from what > could reasonably be done in a single-threaded backend. > > Maybe we should think about the SP controlling a second backend (or even > multiple backends?) that's executing the "transactional" operations. > dblink on steroids, as it were. SPs are executed in a separate process in DB2 and in Oracle - but sometimes there is significant overhead from the interprocess communication, which is the reason collections are popular in PL/SQL. A special backend for SPs is probably the simplest solution - but there can be performance problems :( Regards Pavel
On Thu, Apr 21, 2011 at 8:34 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Apr 21, 2011, at 3:51 PM, Merlin Moncure <mmoncure@gmail.com> wrote: >> If you do it that (base it on AT) way, then you can't: >> 1) call any utility command (vacuum, etc) >> 2) run for an arbitrary amount of time >> 3) discard any locks (except advisory) >> 4) deal with serialization isolation/mvcc snapshot issues that plague functions. > > It is not obvious to me that you cannot do these things. yeah...I think I misunderstood what you were saying ("When you enter a stored procedure at the outermost level, you begin a transaction..."). Those restrictions only apply when there is an open transaction controlling the context of what is running. If you are handing the command textually off to another backend which then runs it, then you are mostly good...although #2 still might be a problem, and #3 if you happen to grab any. merlin
On Thu, Apr 21, 2011 at 5:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> On Thu, Apr 21, 2011 at 1:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> 3. What sort of primitive operations do you expect the SP to be >>> able to execute "outside a transaction"? The plpgsql model where >>> all the primitive operations are really SQL ain't gonna work. > >> Does this mean you do or don't expect plpgsql to be able to run as >> procedure? Should SPI based routines generally be able to run as a >> procedure (I hope so)? If so, what API enhancements would be needed? >> (I was thinking, SPI_is_proc, or something like that). I'd like to >> see plpgsql work as much as possible as it does now, except obviously >> you can't have exception handlers. > > You can't have arithmetic, comparisons, or much of anything outside a > transaction with plpgsql. That model just plain doesn't work for this > purpose, I think. You really want a control language that's independent > of the SQL engine, and for better or worse plpgsql is built inside that > engine. I'm arguing against a separate language, or at least questioning if plpgsql truly can't be run without an outer transaction context. Just because a transaction isn't set up on procedure invocation doesn't mean you can't set one up to do things in the procedure? It wouldn't bother me in the least if, in plpgsql procedures, you had to set up and tear down a transaction on every line. You can always dip into a function if/when you need the turbo boost. plpgsql is kind of a special case anyways in that it uses the sql engine for a lot of core operations. The other pls use their own engines to manage non-query code. Setting up a new control language implies that postgres needs to know the procedure language textually so it can read off a line and do something with it.
I don't like this restriction -- wouldn't it be better if the current crop of language handlers could run procedures without major changes? C functions with SPI? However it's implemented internally, the more userland mindspace recovered for writing procedures, the better off we are. merlin
Merlin Moncure <mmoncure@gmail.com> writes: > It wouldn't bother me in the lest that if in plpgsql procedures if you > had to set up and tear down a transaction on every line. It would once you noticed the performance impact ... regards, tom lane
On Fri, Apr 22, 2011 at 9:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> It wouldn't bother me in the lest that if in plpgsql procedures if you >> had to set up and tear down a transaction on every line. > > It would once you noticed the performance impact ... I'm aware of the impact. It would suck, but perhaps it's not *quite* as bad as you think, considering: *) faster performance is only an explicit transaction/function away *) perhaps some optimizations are possible...x := x + 1; can be directly evaluated? *) simple logic (IF <variable>) can be directly evaluated? *) how bad is it really? from my measurements in queries/sec: 6.7k selects single client, 12k selects piped through a single-user backend, 13.5k piped through a single-user backend in one transaction, 23k in plpgsql 'execute' in a loop (which is really two queries, one to build the query and one to execute), 100k in non-dynamic query plpgsql in a loop. even if our plpgsql lines/sec dropped from 100k to 10k, maybe that's acceptable? Point being, procedures aren't trying to meet the same set of use cases that functions meet. I see them doing things you currently can't do with functions (points 1-4 above, plus any syntax sugar/salt CALL brings to the table). You have tons of tools to deal with performance problems should they occur. merlin
Merlin Moncure <mmoncure@gmail.com> wrote: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> You can't have arithmetic, comparisons, or much of anything >> outside a transaction with plpgsql. That model just plain >> doesn't work for this purpose, I think. You really want a >> control language that's independent of the SQL engine, and for >> better or worse plpgsql is built inside that engine. > > I'm arguing against a separate language, or at least questioning > if plpgsql truly can't be run without an outer transaction > context. Just because a transaction isn't set up on procedure > invocation, doesn't mean you can't set them up to do things in the > procedure? Right -- I don't think anyone has suggested that transactions can't be started and ended "within" a SP. And I have argued that if a SP is called while a transaction is active, it runs within the context of that transaction. > It wouldn't bother me in the lest that if in plpgsql procedures if > you had to set up and tear down a transaction on every line. +1 > You can always dip into a function if/when you need the turbo > boost. Or BEGIN a transaction. > Setting up a new control language implies that postgres needs to > know the procedure language textually so it can read off a line > and do something with it. I don't like this restriction -- > wouldn't it be better if the current crop of language handlers > could run procedures without major changes? C functions with SPI? > However it's internally implemented, the more userland mindspace > recovered for use of writing procedures the better off we are. +1 -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Merlin Moncure <mmoncure@gmail.com> wrote: >> wouldn't it be better if the current crop of language handlers >> could run procedures without major changes? C functions with SPI? >> However it's internally implemented, the more userland mindspace >> recovered for use of writing procedures the better off we are. > +1 I'd like a pony, too. Let's be perfectly clear about this: there is no part of plpgsql that can run outside a transaction today, and probably no part of the other PLs either, and changing that "without major changes" is wishful thinking of the first order. regards, tom lane
On Fri, Apr 22, 2011 at 10:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Merlin Moncure <mmoncure@gmail.com> wrote: >>> wouldn't it be better if the current crop of language handlers >>> could run procedures without major changes? C functions with SPI? >>> However it's internally implemented, the more userland mindspace >>> recovered for use of writing procedures the better off we are. > >> +1 > > I'd like a pony, too. Let's be perfectly clear about this: there is no > part of plpgsql that can run outside a transaction today, and probably > no part of the other PLs either, and changing that "without major > changes" is wishful thinking of the first order. Well, ok, but scope of the change and performance issues aside, is this a technically feasible route, that is, does anything jump out that makes it unworkable? merlin
On Apr 22, 2011, at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Merlin Moncure <mmoncure@gmail.com> wrote: >>> wouldn't it be better if the current crop of language handlers >>> could run procedures without major changes? C functions with SPI? >>> However it's internally implemented, the more userland mindspace >>> recovered for use of writing procedures the better off we are. > >> +1 > > I'd like a pony, too. Let's be perfectly clear about this: there is no > part of plpgsql that can run outside a transaction today, and probably > no part of the other PLs either, and changing that "without major > changes" is wishful thinking of the first order. Correct me if I am wrong here, but the basic issue is, I think, that an error might occur. And transactions are how we make sure that when control returns to the top level, we've released any heavyweight locks, lightweight locks, buffer pins, backend-local memory allocations, etc. that we were holding when the error occurred. ...Robert
Robert Haas <robertmhaas@gmail.com> wrote: > On Apr 22, 2011, at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'd like a pony, too. No ponies for me; make mine an Arabian stallion. >> Let's be perfectly clear about this: there is no part of plpgsql >> that can run outside a transaction today, and probably no part of >> the other PLs either It's a good thing that nobody is suggesting that transactions can't be started and terminated at need within a SP. And I'm not suggesting that a SP couldn't be run within a transaction to avoid the overhead of that, when desired. > Correct me if I am wrong here, but the basic issue is, I think, > that an error might occur. And transactions are how we make sure > that when control returns to the top level, we've released any > heavyweight locks, lightweight locks, buffer pins, backend-local > memory allocations, etc. that we were holding when the error > occurred. If that's the issue, then the biggest problem would seem to be in preparing an SP which isn't within an existing transaction at startup. As someone previously mentioned, there would need to be a transaction to prepare it for execution which would then be completed before processing the body of the SP. Yes, that's hand-wavy, but I thought we were at the phase of brainstorming about what would make for desirable features, not mapping out the implementation details. Of course it's valuable to identify possible implementation issues for such desirable features, even this early; but let's not get bogged down in such details before we reach some kind of consensus on what we might all like. -Kevin
Robert Haas <robertmhaas@gmail.com> writes: > On Apr 22, 2011, at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'd like a pony, too. Let's be perfectly clear about this: there is no >> part of plpgsql that can run outside a transaction today, and probably >> no part of the other PLs either, and changing that "without major >> changes" is wishful thinking of the first order. > Correct me if I am wrong here, but the basic issue is, I think, that an error might occur. And transactions are how we make sure that when control returns to the top level, we've released any heavyweight locks, lightweight locks, buffer pins, backend-local memory allocations, etc. that we were holding when the error occurred. Well, yes, all that infrastructure is tied to transactions. Now if you don't use any of it, then you don't have a problem. The real difficulty is that plpgsql uses SQL expressions freely and there's no guarantees about what parts of the infrastructure a random function, operator, or datatype I/O function might use. (Examples: domain_in can invoke pretty much arbitrary code as a consequence of domain CHECK constraints, and will certainly do catalog accesses even without those. Almost any array operator will do catalog accesses to get the properties of the array element type. And so on.) You could possibly lobotomize plpgsql down to a small number of datatypes and operators that are known not to ever do anything more interesting than palloc() and elog(), but IMO the usefulness would be low and the fragility high. It'd be better to give the task to an interpreter that was never built to depend on a SQL environment in the first place. Thus my thought about perl etc. regards, tom lane
Tom, > >> I'd like a pony, too. Let's be perfectly clear about this: there is > >> no > >> part of plpgsql that can run outside a transaction today, and > >> probably > >> no part of the other PLs either, and changing that "without major > >> changes" is wishful thinking of the first order. I always thought that it was pretty clear that autonomous transactions were a major feature, and very difficult to implement. Otherwise we'd have done SPs back in 7.4 when we first had this discussion. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco
On Fri, Apr 22, 2011 at 11:06 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Apr 22, 2011, at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I'd like a pony, too. Let's be perfectly clear about this: there is no >>> part of plpgsql that can run outside a transaction today, and probably >>> no part of the other PLs either, and changing that "without major >>> changes" is wishful thinking of the first order. > >> Correct me if I am wrong here, but the basic issue is, I think, that an error might occur. And transactions are how we make sure that when control returns to the top level, we've released any heavyweight locks, lightweight locks, buffer pins, backend-local memory allocations, etc. that we were holding when the error occurred. > > Well, yes, all that infrastructure is tied to transactions. Now if you > don't use any of it, then you don't have a problem. The real difficulty > is that plpgsql uses SQL expressions freely and there's no guarantees > about what parts of the infrastructure a random function, operator, or > datatype I/O function might use. (Examples: domain_in can invoke pretty > much arbitrary code as a consequence of domain CHECK constraints, and > will certainly do catalog accesses even without those. Almost any array > operator will do catalog accesses to get the properties of the array > element type. And so on.) Just to be clear (I'm really trying not to be obtuse here), does that mean you can't touch that infrastructure at all in a procedure in this vein, or can you set up a transaction temporarily in cases you need it (maybe at the statement level)? If you are well and truly locked out of the sql engine with no doorway in, then I'd have to agree, plpgsql is out. merlin
On 04/22/2011 12:06 PM, Tom Lane wrote: > You could possibly lobotomize plpgsql down to a small number of > datatypes and operators that are known not to ever do anything more > interesting than palloc() and elog(), but IMO the usefulness would be > low and the fragility high. It'd be better to give the task to an > interpreter that was never built to depend on a SQL environment in the > first place. Thus my thought about perl etc. > It's not clear to me what the other interpreter would actually be doing. cheers andrew
On fre, 2011-04-22 at 08:37 -0500, Merlin Moncure wrote: > It wouldn't bother me in the lest that if in plpgsql procedures if you > had to set up and tear down a transaction on every line. It would probably be more reasonable and feasible to have a setup where you can end a transaction in plpgsql but a new one would start right away.
On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On fre, 2011-04-22 at 08:37 -0500, Merlin Moncure wrote: >> It wouldn't bother me in the lest that if in plpgsql procedures if you >> had to set up and tear down a transaction on every line. > > It would probably be more reasonable and feasible to have a setup where > you can end a transaction in plpgsql but a new one would start right > away. ya, that's an idea. if nothing else, it would certainly be faster, and you'd still be able to control things properly. Just thinking out loud here, but maybe you could make a cut-down version of StartTransaction() that does non-transactional setup like memory, guc, etc. but doesn't set the state (or sets it to something else, like TRANS_PROCEDURE). We get here maybe by a new protocol first char. One thing that's not clear is how you'd get there via a simple query (sent via PQexec vs hypothetical PQcall). The protocol and syntax portions are a whole separate issue anyways... I poked around a bit in pl_exec.c and pl_handler.c. My thinking is that in strategic points, in particular in exec_stmt(), you check if in procedure state and not already in a transaction, set one up there, run the statement, and take it down afterwards. Maybe you do this on every statement, or maybe as Peter suggests it's user controlled, but I'm curious how this would turn out. You'd also have to be in a transaction during the function call setup/compilation, and the portions that handle the input arguments. However the main execution loop which passes over the exec state ISTM is fairly self-contained and won't be problematic if run outside of a transaction. This is the key point -- the SPI routines when run would always be in *a* transaction, just not always the same transaction. :-) What exactly SPI_connect does, and what the other SPI functions would do if invoked from a different transaction, is a mystery to me and presumably a big problem.
I'm quite aware this is all awfully light on detail, and of the million + 1 assumptions I'm making, but since you're getting basically injected directly into a function from the tiny lizard brain of postgres in tcop, I wonder if it could be worked out... merlin
Merlin Moncure <mmoncure@gmail.com> writes: > On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> It would probably be more reasonable and feasible to have a setup where >> you can end a transaction in plpgsql but a new one would start right >> away. > ya, that's an idea. Yeah, that's a good thought. Then we'd have a very well-defined collection of state that had to be preserved through such an operation, ie, the variable values and control state of the SP. It also gets rid of the feeling that you ought not be in a transaction when you enter the SP. There's still the problem of whether you can invoke operations such as VACUUM from such an SP. I think we'd want to insist that they terminate the current xact, which is perhaps not too cool. regards, tom lane
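To make the proposal concrete, under this model a procedure might look something like the following. This is entirely hypothetical syntax (CREATE PROCEDURE and an in-procedure COMMIT do not exist in PostgreSQL as of this discussion; the jobs table is illustrative): each COMMIT ends the current transaction and a new one begins immediately, with the procedure's variable state (here, batches) carried across the boundary.

```sql
-- Hypothetical syntax: delete in batches, committing between batches so
-- locks and the xid horizon are released as work proceeds.
CREATE PROCEDURE purge_done_jobs() AS $$
DECLARE
    batches integer := 0;
BEGIN
    LOOP
        DELETE FROM jobs WHERE id IN
            (SELECT id FROM jobs WHERE done LIMIT 10000);
        EXIT WHEN NOT FOUND;
        batches := batches + 1;
        COMMIT;   -- ends this transaction; the next starts right away
    END LOOP;
    RAISE NOTICE 'purged in % batches', batches;
END;
$$ LANGUAGE plpgsql;
```

The "well-defined collection of state" Tom mentions is exactly what must survive each COMMIT here: the loop position and the value of batches.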
On Fri, Apr 22, 2011 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >>> It would probably be more reasonable and feasible to have a setup where >>> you can end a transaction in plpgsql but a new one would start right >>> away. > >> ya, that's an idea. > > Yeah, that's a good thought. Then we'd have a very well-defined > collection of state that had to be preserved through such an operation, > ie, the variable values and control state of the SP. It also gets rid > of the feeling that you ought not be in a transaction when you enter > the SP. hm, another neat thing about this is that it skirts the unfortunate confusion between sql 'begin' and plpgsql 'begin'... merlin
Merlin Moncure <mmoncure@gmail.com> wrote: > hm, another neat thing about this is that it skirts the > unfortunate confusion between sql 'begin' and plpgsql 'begin'... I hadn't thought about that. There is the SQL-standard START TRANSACTION synonym, so there is a way to deal with it -- but since BEGIN seems to be used more heavily there would clearly be confusion. -Kevin
On Apr 22, 2011, at 3:50 PM, Tom Lane wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >>> It would probably be more reasonable and feasible to have a setup where >>> you can end a transaction in plpgsql but a new one would start right >>> away. > >> ya, that's an idea. > > Yeah, that's a good thought. Then we'd have a very well-defined > collection of state that had to be preserved through such an operation, > ie, the variable values and control state of the SP. It also gets rid > of the feeling that you ought not be in a transaction when you enter > the SP. > > There's still the problem of whether you can invoke operations such as > VACUUM from such an SP. I think we'd want to insist that they terminate > the current xact, which is perhaps not too cool. Dumb question, but wouldn't this kind of approach open up a window where (say) datatypes, operators, catalogs, etc, could disappear/change out from under you, being that you're now in a different transaction/snapshot; presuming there is a concurrent transaction from a different backend modifying the objects in question? In the non-explicit transaction case, locking wouldn't work to keep these objects around due to the transaction scope of locks (unless locks are part of the transaction state carried forward across the implicit transactions). If so, could that be done in such a way that it would take precedence over a parallel backend attempting to acquire the same locks without blocking the procedure? Regards, David -- David Christensen End Point Corporation david@endpoint.com
On 21.04.2011 17:24, Peter Eisentraut wrote: > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. Peter, what I like from "the other" is that stored procedures are able to return result sets. Susanne -- Susanne Ebrecht - 2ndQuadrant PostgreSQL Development, 24x7 Support, Training and Services www.2ndQuadrant.com
On Fri, Apr 22, 2011 at 11:46 PM, David Christensen <david@endpoint.com> wrote: > > On Apr 22, 2011, at 3:50 PM, Tom Lane wrote: > >> Merlin Moncure <mmoncure@gmail.com> writes: >>> On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >>>> It would probably be more reasonable and feasible to have a setup where >>>> you can end a transaction in plpgsql but a new one would start right >>>> away. >> >>> ya, that's an idea. >> >> Yeah, that's a good thought. Then we'd have a very well-defined >> collection of state that had to be preserved through such an operation, >> ie, the variable values and control state of the SP. It also gets rid >> of the feeling that you ought not be in a transaction when you enter >> the SP. >> >> There's still the problem of whether you can invoke operations such as >> VACUUM from such an SP. I think we'd want to insist that they terminate >> the current xact, which is perhaps not too cool. > > > Dumb question, but wouldn't this kind of approach open up a window where (say) datatypes, operators, catalogs, etc, could disappear/change out from under you, being that you're now in a different transaction/snapshot; presuming there is a concurrent transaction from a different backend modifying the objects in question? That's a good question. This is already a problem for functions -- an object you are dependent upon in the function body can disappear at any time. If you grabbed the lock first you're ok, but otherwise you're not and the caller will receive an error. Starting with 8.3 there is plan cache machinery that invalidates plans used inside plpgsql which should prevent the worst problems. If you're cavalier about deleting objects that are used in a lot of functions you can get really burned from a performance standpoint, but that's no different than dealing with functions today. Procedures, unlike functions, however, can no longer rely on catalogs remaining static, visibility-wise, through execution.
pl_comp.c is full of catalog lookups, and that means that some assumptions made during compilation are no longer valid for procedures. A missing table isn't such a big deal, but maybe it's possible to make intermediate changes while a procedure is executing that can cause an expression to parse differently, or not at all (for example, replacing a scalar function with a setof)? This could be a minefield of problems or possibly not -- I really just don't know all the details and perhaps some experimentation is in order. One thing that's tempting is to force recompilation upon certain things happening so you can catch this stuff proactively, but plpgsql function compilation is very slow and this approach is probably very complex. Ideally we can just bail from the procedure if external events cause things to go awry. merlin
Merlin Moncure <mmoncure@gmail.com> wrote: > Procedures unlike functions however can no longer rely that > catalogs remain static visibility wise through execution for > functions. If you start from the perspective that stored procedures are in many respects more like psql scripts than functions, this shouldn't be too surprising. If you have a psql script with multiple database transactions, you know that other processes can change things between transactions. Same deal with SPs. The whole raison d'être for SPs is that there are cases where people need something *different* from functions. While it would be *nice* to leverage plpgsql syntax for a stored procedure language, if it means we have to behave like a function, it's not worth it. -Kevin
On Mon, Apr 25, 2011 at 9:18 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Merlin Moncure <mmoncure@gmail.com> wrote: > >> Procedures unlike functions however can no longer rely that >> catalogs remain static visibility wise through execution for >> functions. > > If you start from the perspective that stored procedures are in many > respects more like psql scripts than functions, this shouldn't be > too surprising. If you have a psql script with multiple database > transactions, you know that other processes can change things > between transactions. Same deal with SPs. > > The whole raison d'être for SPs is that there are cases where people > need something *different* from functions. While it would be *nice* > to leverage plpgsql syntax for a stored procedure language, if it > means we have to behave like a function, it's not worth it. As noted above, it would be really nice if the SPI interface could be recovered for use in writing procedures. plpgsql the language is less of a sure thing, but it would be truly unfortunate if it couldn't be saved, on grounds of user retraining alone. If a sneaky injection of transaction manipulation gets the job done without rewriting the entire language, then great, but it's an open question whether that's possible, and I'm about 2 orders of magnitude too unfamiliar with the code to say either way. I'm inclined to just poke around and see what breaks. OTOH, if you go the fully textual route you can get away with doing things that are not at all sensible in the plpgsql world (or at least not without a serious rethink of how it works), like connecting to databases mid-procedure, and a cleaner attack at things like running 'CLUSTER' than the flush-transaction-state methodology above. So I see we have three choices:
1. recover SPI, recover plpgsql (and other pls), transaction flush command (SPI_flush()?)
2. recover SPI, replace plpgsql (with what?)
3. no spi, custom-built language, most flexibility, database reconnects, aka 'tabula rasa'
#1 is probably the easiest and most appealing on a lot of levels, but fraught with technical danger, and the most limiting? merlin
On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: > So the topic of "real" "stored procedures" came up again. Meaning a > function-like object that executes outside of a regular transaction, > with the ability to start and stop SQL transactions itself. I would like to add a note about the SQL standard here. Some people have been using terminology that a "function" does this and a "procedure" does something else. Others have also mentioned the use of a CALL statement to invoke procedures. Both procedures (as in CREATE PROCEDURE etc.) and the CALL statement are specified by the SQL standard, and they make no mention of any supertransactional behavior or autonomous transactions for procedures. As far as I can tell, it's just a Pascal-like difference that functions return values and procedures don't. So procedure-like objects with a special transaction behavior will need a different syntax or a syntax addition.
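For reference, the standard syntax Peter describes is roughly the following; per his reading, nothing about it implies any special transactional behavior. This is a sketch of SQL-standard-flavored syntax only, not something PostgreSQL accepts today, and the accounts table is illustrative:

```sql
-- Standard-style procedure: like a function, but it returns no value
-- and is invoked with CALL rather than referenced in a query.
CREATE PROCEDURE transfer(IN src integer, IN dst integer, IN amt numeric)
LANGUAGE SQL
AS 'UPDATE accounts SET balance = balance - amt WHERE id = src;
    UPDATE accounts SET balance = balance + amt WHERE id = dst;';

CALL transfer(1, 2, 100.00);
```

Hence Peter's conclusion: anything supertransactional layered on top of this would need a syntax addition beyond what the standard specifies.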
On 04/25/2011 02:18 PM, Peter Eisentraut wrote: > On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: >> So the topic of "real" "stored procedures" came up again. Meaning a >> function-like object that executes outside of a regular transaction, >> with the ability to start and stop SQL transactions itself. > I would like to add a note about the SQL standard here. > > Some people have been using terminology that a "function" does this and > a "procedure" does something else. Others have also mentioned the use > of a CALL statement to invoke procedures. > > Both procedures (as in CREATE PROCEDURE etc.) and the CALL statement are > specified by the SQL standard, and they make no mention of any > supertransactional behavior or autonomous transactions for procedures. > As far as I can tell, it's just a Pascal-like difference that functions > return values and procedures don't. > > So procedure-like objects with a special transaction behavior will need > a different syntax or a syntax addition. > The trouble is that people using at least some other databases call supertransactional program units "stored procedures". Maybe we need a keyword to designate supertransactional behaviour, but if we call them anything but procedures there is likely to be endless confusion, ISTM, especially if we have something called a procedure which is never supertransactional. cheers andrew
On Mon, Apr 25, 2011 at 1:18 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: >> So the topic of "real" "stored procedures" came up again. Meaning a >> function-like object that executes outside of a regular transaction, >> with the ability to start and stop SQL transactions itself. > > I would like to add a note about the SQL standard here. > > Some people have been using terminology that a "function" does this and > a "procedure" does something else. Others have also mentioned the use > of a CALL statement to invoke procedures. > > Both procedures (as in CREATE PROCEDURE etc.) and the CALL statement are > specified by the SQL standard, and they make no mention of any > supertransactional behavior or autonomous transactions for procedures. > As far as I can tell, it's just a Pascal-like difference that functions > return values and procedures don't. > > So procedure-like objects with a special transaction behavior will need > a different syntax or a syntax addition. hm. does the sql standard prohibit the use of extra transactional features? are you sure it's not implied that any sql (including START TRANSACTION etc) is valid? meaning, unless otherwise specified, you should be able to do those things, and that our functions because they force one transaction operation are non-standard, not the other way around. merlin
On mån, 2011-04-25 at 13:34 -0500, Merlin Moncure wrote: > hm. does the sql standard prohibit the use of extra transactional > features? It doesn't prohibit anything. It just kindly requests that standard syntax has standard behavior. > are you sure it's not implied that any sql (including > START TRANSACTION etc) is valid? meaning, unless otherwise specified, > you should be able to do those things, and that our functions because > they force one transaction operation are non-standard, not the other > way around. Syntactically, it appears to be allowed, and there's something about savepoint levels. So that might be something related. In any case, if we use standard syntax, that should be researched.
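[For readers following the syntax question: the SQL-standard form being discussed looks roughly like the sketch below -- CREATE PROCEDURE with a BEGIN ATOMIC body, invoked via CALL. Names are illustrative, exact clauses vary by standard edition, and PostgreSQL does not accept this syntax at this point; per Peter's reading, nothing here implies any special transaction behavior.]

```sql
-- Sketch of SQL-standard procedure syntax (illustrative only).
CREATE PROCEDURE transfer(IN from_acct INTEGER,
                          IN to_acct   INTEGER,
                          IN amount    NUMERIC)
LANGUAGE SQL
BEGIN ATOMIC
    UPDATE accounts SET balance = balance - amount WHERE id = from_acct;
    UPDATE accounts SET balance = balance + amount WHERE id = to_acct;
END;

-- Procedures are invoked with CALL rather than SELECT:
CALL transfer(1, 2, 100.00);
```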
On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: > So the topic of "real" "stored procedures" came up again. Meaning a > function-like object that executes outside of a regular transaction, > with the ability to start and stop SQL transactions itself. > > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. Another point, as there appear to be diverging camps about supertransactional stored procedures vs. autonomous transactions, what would be the actual use cases of any of these features? Let's collect some, so we can think of ways to make them work.
Peter Eisentraut <peter_e@gmx.net> wrote: > what would be the actual use cases of any of these features? > Let's collect some, so we can think of ways to make them work. The two things which leap to mind for me are: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. The \d commands would be changed to call the SPs for releases recent enough to support this. Eventually psql would be free of worrying about which release contained which columns in which system tables, because it would just be passing the parameters in and displaying whatever results came back. I have used products which implemented something like this, and found it quite useful. (2) In certain types of loads -- in particular converting data from old systems into the database for a new system -- you need to load several tables in parallel, with queries among the tables which are being loaded. The ability to batch many DML statements into one transaction is important, to avoid excessive COMMIT overhead and related disk output; however, the ability to ANALYZE tables periodically is equally important, to prevent each access to an initially-empty table from being done as a table scan after it has millions of rows. VACUUM might become equally important if there are counts or totals being accumulated in some tables, or status columns are being updated, as rows are added to other tables. I've often had to do something like this during conversions. This could be handled in an external program (I've often done it in Java), but performance might be better if a stored procedure in PostgreSQL was able to keep SQL/MED streams of data open while committing and performing this maintenance every so many rows. -Kevin
Peter Eisentraut wrote: > > Another point, as there appear to be diverging camps about > supertransactional stored procedures vs. autonomous transactions, what > would be the actual use cases of any of these features? Looping over hundreds of identical schemas, executing DDL statements on each. We can't do this in a single transaction because it consumes all of shared memory with locks. -- todd
On Mon, Apr 25, 2011 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: >> So the topic of "real" "stored procedures" came up again. Meaning a >> function-like object that executes outside of a regular transaction, >> with the ability to start and stop SQL transactions itself. >> >> I would like to collect some specs on this feature. So does anyone have >> links to documentation of existing implementations, or their own spec >> writeup? A lot of people appear to have a very clear idea of this >> concept in their own head, so let's start collecting those. > > Another point, as there appear to be diverging camps about > supertransactional stored procedures vs. autonomous transactions, what > would be the actual use cases of any of these features? Let's collect > some, so we can think of ways to make them work. My answer is this: plpgsql, with its first-class SQL expressions, direct access to the postgres type system, and other nifty features, has proven for me to be superior to all other languages in terms of defect rate, output progress for input work, and other metrics one might apply, by a significant margin. By adding super-transactional (I prefer the phrasing, 'explicit control of transaction state') features you can eliminate all kinds of cases where you might otherwise be forced into coding on the client side. Lots of people prefer not to do this (or recoil in horror at the mere suggestion of doing so), and that's fine, but I don't like being prohibited from doing so by technical constraint. Explicit transaction controls remove those constraints. Anyone who really 'gets' plpgsql programming knows exactly what I'm talking about and has bumped into those constraints. Autonomous transactions, basically a formalization of the dblink-style technique of running SQL in a parallel transaction state, are also useful, but for different reasons. 
You can extend them pretty far to do most of the things explicit transactions give you (like creating lots of tables or running 'CLUSTER'), although I find having to force users to maintain separate transaction states just to do so to be a bit of a kludge, and the outermost function still has to terminate within a limited timeframe. merlin
Peter Eisentraut wrote: > Another point, as there appear to be diverging camps about > supertransactional stored procedures vs. autonomous transactions, what > would be the actual use cases of any of these features? Let's collect > some, so we can think of ways to make them work. An analogy I like to use for a very capable DBMS is that of an operating system, and each autonomous transaction is like a distinct process/thread in this system. The DBMS is like a virtual machine in which processes/autonomous transactions run. Like with an operating system, a process/auto-transaction can be started by another one, or by the OS/DBMS (or a root process/auto), and once running all processes are mutually independent to a large extent, in that each has its own separable privileges or state or view of the database, the database being an analogy to the file system. A process/auto-transaction can be started by a DBMS client, analogous to a user, but it doesn't have to be. The message passing feature that Pg has, listen/notify, is like inter-process communication between these processes/autos. A stored procedure always runs within the context of one process/auto, and a regular transaction or savepoint or whatever is specific to a process/auto. Has anyone else thought of the DBMS-as-operating-system analogy? I don't recall specifically reading this anywhere, but expect the thought may be common. -- Darren Duncan
On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: > (1) All the \d commands in psql should be implemented in SPs so > that they are available from any client, through calling one SP > equivalent to one \d command. You don't need stored procedures with special transaction behavior for this. In fact, you probably shouldn't use them even if you had them, because you surely want a consistent view of, say, a table. > (2) In certain types of loads -- in particular converting data from > old systems into the database for a new system -- you need to load > several tables in parallel, with queries among the tables which are > being loaded. The ability to batch many DML statements into one > transaction is important, to avoid excessive COMMIT overhead and > related disk output; however, the ability to ANALYZE tables > periodically is equally important, to prevent each access to an > initially-empty table from being done as a table scan after it has > millions of rows. VACUUM might become equally important if there > are counts or totals being accumulated in some tables, or status > columns are being updated, as rows are added to other tables. I'm not sure I really follow this. If your aim is to batch DML statements and avoid COMMIT overhead, why would you want to use stored procedures that possibly span multiple transactions?
Peter Eisentraut <peter_e@gmx.net> writes: > On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: >> (1) All the \d commands in psql should be implemented in SPs so >> that they are available from any client, through calling one SP >> equivalent to one \d command. > You don't need stored procedures with special transaction behavior for > this. No, but what you *would* need is the ability to return multiple result sets from one call. Even then, you could not exactly duplicate the current output of \d; but you could duplicate the functionality. regards, tom lane
Peter Eisentraut <peter_e@gmx.net> wrote: > On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: >> (1) All the \d commands in psql should be implemented in SPs so >> that they are available from any client, through calling one SP >> equivalent to one \d command. > > You don't need stored procedures with special transaction behavior > for this. In fact, you probably shouldn't use them even if you > had them, because you surely want a consistent view of, say, a > table. Agreed. I was just outlining use cases here, not trying to make a case for something in particular with each one. >> (2) In certain types of loads -- in particular converting data >> from old systems into the database for a new system -- you need >> to load several tables in parallel, with queries among the tables >> which are being loaded. The ability to batch many DML statements >> into one transaction is important, to avoid excessive COMMIT >> overhead and related disk output; however, the ability to ANALYZE >> tables periodically is equally important, to prevent each access >> to an initially-empty table from being done as a table scan after >> it has millions of rows. VACUUM might become equally important >> if there are counts or totals being accumulated in some tables, >> or status columns are being updated, as rows are added to other >> tables. > > I'm not sure I really follow this. If your aim is to batch DML > statements and avoid COMMIT overhead, why would you want to use > stored procedures that possibly span multiple transactions? The point is that if such a conversion is run in a situation where table access is always done on a plan based on empty tables, it starts to get pretty slow after a while. You need to commit, analyze, and start a new transaction for the queries to make new plans which run well. This obviously isn't an issue when you're blasting entire tables in through COPY commands without needing to reference other data being concurrently loaded. 
So, rough pseudo-code where this is done in a client app with autovacuum disabled would look something like:

open input stream of non-normalized data
open database connection
while not EOF on input
    start transaction
    for 50000 top level inputs (break on EOF)
        parse apart messy data, load into multiple tables
        (logic involves queries against tables being loaded)
        (some updates besides straight inserts)
        (print exceptions for questionable or undigestable data)
    end for
    commit transaction
    vacuum analyze
end while

In database products with stored procedures it has usually been faster to use an SP in the target database than to use a client program. -Kevin
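[This loop is exactly what a transaction-controlling procedure could absorb. A sketch of how it might look in the proposed feature -- to be clear, COMMIT inside a procedure body, CREATE PROCEDURE itself, and the load_one_input helper are all hypothetical here; nothing like this runs in PostgreSQL today:]

```sql
-- HYPOTHETICAL syntax sketch only: assumes the proposed ability to end
-- and restart transactions inside a procedure body.
CREATE PROCEDURE load_messy_data() LANGUAGE plpgsql AS $$
DECLARE
    rec record;
    n   integer := 0;
BEGIN
    -- (the input cursor would itself have to survive commits,
    --  i.e. behave like a WITH HOLD cursor)
    FOR rec IN SELECT * FROM staging_input LOOP
        PERFORM load_one_input(rec);   -- hypothetical parse/load helper
        n := n + 1;
        IF n % 50000 = 0 THEN
            COMMIT;                    -- proposed: end the transaction,
            VACUUM ANALYZE;            -- do maintenance outside one,
                                       -- and start a new one implicitly
        END IF;
    END LOOP;
    COMMIT;
END;
$$;
```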
> Another point, as there appear to be diverging camps about > supertransactional stored procedures vs. autonomous transactions, what > would be the actual use cases of any of these features? Let's collect > some, so we can think of ways to make them work. Here's where I wanted autonomous transactions just last week, and didn't have them so I had to use a python script outside the database: -- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned table. -- doing a backfill operation for 10GB of computed data, taking 8 hours, where I don't want to hold a transaction open for 8 hours since this is a high-volume OLTP database. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Tom Lane <tgl@sss.pgh.pa.us> wrote: > No, but what you *would* need is the ability to return multiple > result sets from one call. At least. > Even then, you could not exactly duplicate the current output of > \d; but you could duplicate the functionality. I would think that psql could duplicate the output pretty closely, especially if the output of the stored procedure was a stream of intermingled result sets and messages (as from ereport). This is what many products provide. They usually show messages with a class '00' SQLSTATE just as plain text lines, and decorate the more severe levels with appropriate additional information. A while back I included a link to show what Sybase returns from their sp_help SP for various object types: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc36273.1550/html/sprocs/X85190.htm Note the lines like: Object does not have any indexes. This came from the server as a SQLSTATE '00000' message. -Kevin
Josh Berkus <josh@agliodbs.com> wrote: > -- doing a backfill operation for 10GB of computed data, taking 8 > hours, where I don't want to hold a transaction open for 8 hours > since this is a high-volume OLTP database. Been there, done that. Definitely not a rare use case. -Kevin
On Tue, Apr 26, 2011 at 11:55 PM, Josh Berkus <josh@agliodbs.com> wrote: > Here's where I wanted autonomous transactions just last week, and didn't > have them so I had to use a python script outside the database: > > -- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned > table. > > -- doing a backfill operation for 10GB of computed data, taking 8 hours, > where I don't want to hold a transaction open for 8 hours since this is > a high-volume OLTP database. These don't seem like compelling use cases at all to me. You said you had to fall back to using a python script outside the database, but what disadvantage does that have? Why is moving your application logic into the database an improvement? Honestly in every case where I've had to move code that had been in a function to the application I've found there were tons of benefits. Everything from being able to better control the behaviour, to being able to parallelize the processing over multiple connections, being able to run parts of it at different times, being able to see the progress and control it from another session, being able to manage the code in version control, the list just goes on. Trying to move all the code into the database just makes life harder. Autonomous transactions have value on their own. But it's not so that you can run create index concurrently or vacuum or whatever. They're useful so that a single session can do things like log errors even when a transaction rolls back. Actually that's the only example I can think of but it's a pretty good use case on its own and I'm sure it's not entirely unique. -- greg
On Mon, Apr 25, 2011 at 12:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > Another point, as there appear to be diverging camps about > supertransactional stored procedures vs. autonomous transactions, what > would be the actual use cases of any of these features? Let's collect > some, so we can think of ways to make them work. Some number of moons ago it would have been highly desirable to be able to create daemon worker processes out of UDFs. In practice, many such daemons want to do their own snapshot management (that is to say, acquire new ones...) and there's no nice way to do that by extending postgres. Instead, you cargo cult onto what autovacuum does and release your own postgres binary to use SPI from outside a snapshot. Although it would be better still to have a worker pool type mechanic (see the "async" discussion happening recently), being able to have contribs or modules where one could run: SELECT do_the_thing(); And block indefinitely doing cross-snapshot work would be pretty useful, I feel. As a thought exercise, could one create: SELECT autovacuum(tuning, parameters, one, through, n); as a C UDF without bizarro snapshot mangling? (I believe we did play some tricks to escape the snapshot even in this case, but they weren't very lucid in the code, if memory serves). In any case, I've encountered at least a few situations where I'd like to be able to opt-out of getting one and exactly one snapshot in the daemon/worker case. -- fdr
On Tue, Apr 26, 2011 at 10:12 PM, Greg Stark <gsstark@mit.edu> wrote: > On Tue, Apr 26, 2011 at 11:55 PM, Josh Berkus <josh@agliodbs.com> wrote: >> Here's where I wanted autonomous transactions just last week, and didn't >> have them so I had to use a python script outside the database: >> >> -- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned >> table. >> >> -- doing a backfill operation for 10GB of computed data, taking 8 hours, >> where I don't want to hold a transaction open for 8 hours since this is >> a high-volume OLTP database. > > These don't seem like compelling use cases at all to me. You said you > had to fall back to using a python script outside the database, but > what disadvantage does that have? Why is moving your application logic > into the database an improvement? > > Honestly in every case where I've had to move code that had been in a > function to the application I've found there were tons of benefits. > Everything from being able to better control the behaviour, to being > able to parallelize the processing over multiple connections, being > able to run parts of it at different times, being able to see the > progress and control it from another session, being able to manage the > code in version control, the list just goes on. Trying to move all the > code into the database just makes life harder. my experience has been the opposite. merlin
> These don't seem like compelling use cases at all to me. You said you > had to fall back to using a python script outside the database, but > what disadvantage does that have? Why is moving your application logic > into the database an improvement? Since both were part of a code rollout, it complicated our deployment process considerably and took a deployment which could have been push-button automatic and forced us to do it by manually logging into the shell on the database server. > Trying to move all the > code into the database just makes life harder. It might make *your* life harder. It makes *mine* easier. If you pursue your argument a little further, Greg, why do we have functions at all? We could do it all in the application. > Autonomous transactions have value on their own. But it's not so that > you can run create index concurrently or vacuum or whatever. Why not? Why are you so intent on making my life harder? > They're > useful so that a single session can do things like log errors even > when a transaction rolls back. That's *also* an excellent use case. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, Apr 27, 2011 at 6:48 PM, Josh Berkus <josh@agliodbs.com> wrote: > If you pursue your argument a little further, Greg, why do we have > functions at all? We could do it all in the application. > >> Autonomous transactions have value on their own. But it's not so that >> you can run create index concurrently or vacuum or whatever. > > Why not? Why are you so intent on making my life harder? Because we want to be able to manipulate data in queries in data-type-specific ways. For example we want to do aggregations on the result of a function or index scans across a user data type, etc. If all the functions do is implement application logic then you end up having half your application logic in the application and half in the database and it's hard to keep them in sync. To take the argument to the opposite extreme, would you suggest we should have html formatting functions in the database so that people can have their entire web server just be print $dbh->('select web_page(url)') ? >> They're >> useful so that a single session can do things like log errors even >> when a transaction rolls back. > > That's *also* an excellent use case. What makes it an excellent use case is that it's basically impossible to do without autonomous transactions. You can hack it with dblink but it's much less clean and much higher overhead. -- greg
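[For reference, the dblink hack Greg alludes to looks roughly like this -- a sketch assuming the dblink contrib module is installed; the connection string, error_log table, and function name are illustrative:]

```sql
-- Log via a second connection, whose commit is independent of the
-- outer transaction's fate.  (Sketch; assumes contrib/dblink.)
CREATE OR REPLACE FUNCTION log_error(msg text) RETURNS void
LANGUAGE plpgsql AS $$
BEGIN
    PERFORM dblink_exec(
        'dbname=' || current_database(),
        'INSERT INTO error_log (logged_at, message) VALUES ('
            || quote_literal(now()::text) || ', '
            || quote_literal(msg) || ')');
END;
$$;
```

The INSERT survives even if the calling transaction later rolls back -- which is the point -- but at the cost of spinning up a whole new backend connection per call, which is the "much higher overhead" part.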
Greg, > Because we want to be able to manipulate data in queries in > data-type-specific ways. For example we want to do aggregations on the > result of a function or index scans across a user data type, etc. I don't see how this is different from wanting to capture error output, which would face the same issues. You seem to be wanting to make a hard feature easier by telling me that I don't actually want the things I want. Wanna make it even easier? Then Stored Procedures are just functions without a return value. That's a 40-line patch. Done! > If > all the functions do is implement application logic then you end up > having half your application logic in the application and half in the > database and it's hard to keep them in sync. You build your applications your way, and I'll build mine my way. I'll just ask you not to try to dictate to me how I should build applications. Especially since, based on the responses on this thread, a LOT of people would like to have multitransaction control inside a stored procedure script. I suspect that your experience of application development has been rather narrow. > To take the argument in the opposite extreme would you suggest we > should have html formatting functions in the database so that people > can have their entire web server just be print $dbh->('select > web_page(url)') ? Actually, you can already sort of do that using XSLT. So I don't necessarily think that's a prohibitive idea, depending on implementation. After all, many of the new non-relational databases implement exactly this. >>> They're >>> useful so that a single session can do things like log errors even >>> when a transaction rolls back. >> >> That's *also* an excellent use case. You could do it by using application code. -- Josh Berkus PostgreSQL Experts Inc. 
http://pgexperts.com
On Apr 27, 2011, at 3:28 PM, Josh Berkus wrote: > Actually, you can already sort of do that using XSLT. So I don't > necessarily think that's a prohibitive idea, depending on implementation. > After all, many of the new non-relational databases implement exactly this. The proposed JSON data type and construction functions (once there's agreement on an implementation) will allow this, too. Just serve JSON. Boom, instant REST server. David
On Apr 26, 2011, at 6:08 PM, Kevin Grittner wrote: > Josh Berkus <josh@agliodbs.com> wrote: >> -- doing a backfill operation for 10GB of computed data, taking 8 >> hours, where I don't want to hold a transaction open for 8 hours >> since this is a high-volume OLTP database. > > Been there, done that. Definitely not a rare use case. We do that so often we've actually written a framework around it and are working on a daemon that will deal with any backfills that have been registered in the system. If we could control transactions, that daemon could be entirely in the database... but since we can't, we have to write it in another language outside the database and switch back and forth between the two worlds. -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On fre, 2011-04-22 at 08:37 -0500, Merlin Moncure wrote: >> It wouldn't bother me in the least if in plpgsql procedures you >> had to set up and tear down a transaction on every line. > > It would probably be more reasonable and feasible to have a setup where > you can end a transaction in plpgsql but a new one would start right > away. I've been poking around to see how this might be done, and yes there are a lot of issues. The good news is that, from what I can tell so far, there are relatively few problems inside plpgsql itself in terms of making it span transactions (there is a small assumption in the simple eval code but that can probably be easily fixed). The problems are further up, in that plpgsql relies on various structures that are tucked into the transaction memory context. The very first thing that has to be figured out to implement supertransactional behaviors is under which memory context the various structures plpgsql depends on will live, especially the execution state. I'm thinking it should live in the message context, with some participation at the portal level, possibly via a new routine (PortalRunProcedure) that is special in that it has to communicate to plpgsql that it is a procedure and what to do when doing transactional management. For example, it is currently managing the ExecutorQueryDesc and should probably continue doing so. One way to do this is to inject a callback somewhere (in the queryDesc?) which could be accessible at the lower levels (ideally even in SPI if we want to be able to get to this from other PLs). The callback implementation would kill the snapshot, reset the transaction etc. Most other transaction management is not happening here, but in postgres.c, so I'm not sure if this is the right place. 
I'd also like to defer the snapshot creation as long as possible after flushing the current transaction so that it's possible to sneak a lock into the procedure body to deal with serialization problems. If that can't be worked out, maybe a textual implementation or something based on autonomous approach is better. merlin
Josh Berkus wrote: > Peter, > > > I would like to collect some specs on this feature. So does anyone have > > links to documentation of existing implementations, or their own spec > > writeup? A lot of people appear to have a very clear idea of this > > concept in their own head, so let's start collecting those. > > Delta between SPs and Functions for PostgreSQL: > > * SPs are executed using CALL or EXECUTE, and not SELECT. > > * SPs do not return a value > ** optional: SPs *may* have OUT parameters. [ Late reply.] What is it about stored procedures that would require them not to return a value, or to use CALL? I am trying to understand what part of this is "procedures" (doesn't return a value; we decided there isn't much value for that syntax vs. functions), and anonymous transactions. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Mon, May 9, 2011 at 1:41 PM, Bruce Momjian <bruce@momjian.us> wrote: > Josh Berkus wrote: >> Peter, >> >> > I would like to collect some specs on this feature. So does anyone have >> > links to documentation of existing implementations, or their own spec >> > writeup? A lot of people appear to have a very clear idea of this >> > concept in their own head, so let's start collecting those. >> >> Delta between SPs and Functions for PostgreSQL: >> >> * SPs are executed using CALL or EXECUTE, and not SELECT. >> >> * SPs do not return a value >> ** optional: SPs *may* have OUT parameters. > > [ Late reply.] > > What is it about stored procedures that would require it not to return a > value or use CALL? I am trying to understand what part of this is > "procedures" (doesn't return a value; we decided there isn't much value > for that syntax vs. functions), and anonymous transactions. FWICT the sql standard. The only summary of standard behaviors I can find outside of the standard itself is here: http://farrago.sourceforge.net/design/UserDefinedTypesAndRoutines.html. Peter's synopsis of how the standard works is murky at best and competing implementations are all over the place... SQL Server's 'CALL' feature is basically what I personally would like to see. It would complement our functions nicely. Procedures return values and are invoked with CALL. Functions return values and are in-query callable. The fact that 'CALL' is not allowed inside a query seems to make it pretty darn convenient to make the additional distinction of allowing transactional control statements there and not in functions. You don't *have* to allow transactional control statements and could offer this feature as essentially a syntax-sugar enhancement, but then you run the risk of boxing yourself out of useful properties of this feature later on because of backwards compatibility issues (in particular, the assumption that you are in a running transaction in the procedure body). merlin
Kevin Grittner wrote: > Peter Eisentraut <peter_e@gmx.net> wrote: > > > what would be the actual use cases of any of these features? > > Let's collect some, so we can think of ways to make them work. > > The two things which leap to mind for me are: > > (1) All the \d commands in psql should be implemented in SPs so > that they are available from any client, through calling one SP > equivalent to one \d command. The \d commands would be changed to > call the SPs for releases recent enough to support this. Eventually > psql would be free of worrying about which release contained which > columns in which system tables, because it would just be passing the > parameters in and displaying whatever results came back. > > I have used products which implemented something like this, and > found it quite useful. Uh, why does this require stored procedures? Seems our existing function capabilities are even better suited to this. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: > >> (1) All the \d commands in psql should be implemented in SPs so > >> that they are available from any client, through calling one SP > >> equivalent to one \d command. > > > You don't need stored procedures with special transaction behavior for > > this. > > No, but what you *would* need is the ability to return multiple result > sets from one call. Even then, you could not exactly duplicate the > current output of \d; but you could duplicate the functionality. Oh, good point. Thanks. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 05/09/2011 08:20 PM, Bruce Momjian wrote: > Tom Lane wrote: >> Peter Eisentraut<peter_e@gmx.net> writes: >>> On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: >>>> (1) All the \d commands in psql should be implemented in SPs so >>>> that they are available from any client, through calling one SP >>>> equivalent to one \d command. >>> You don't need stored procedures with special transaction behavior for >>> this. >> No, but what you *would* need is the ability to return multiple result >> sets from one call. Even then, you could not exactly duplicate the >> current output of \d; but you could duplicate the functionality. > Oh, good point. Thanks. Multiple resultsets in one call would be a good thing, though, no? cheers andrew
On Mon, May 9, 2011 at 9:21 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > > > On 05/09/2011 08:20 PM, Bruce Momjian wrote: >> >> Tom Lane wrote: >>> >>> Peter Eisentraut<peter_e@gmx.net> writes: >>>> >>>> On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: >>>>> >>>>> (1) All the \d commands in psql should be implemented in SPs so >>>>> that they are available from any client, through calling one SP >>>>> equivalent to one \d command. >>>> >>>> You don't need stored procedures with special transaction behavior for >>>> this. >>> >>> No, but what you *would* need is the ability to return multiple result >>> sets from one call. Even then, you could not exactly duplicate the >>> current output of \d; but you could duplicate the functionality. >> >> Oh, good point. Thanks. > > Multiple resultsets in one call would be a good thing, though, no? > > cheers I *thought* the purpose of having stored procedures was to allow a substrate supporting running multiple transactions, so it could do things like: - Managing vacuums - Managing transactions - Replacing some of the need for dblink. - Being an in-DB piece that could manage LISTENs It seems to be getting "bikeshedded" into something with more "functional argument functionality" than stored functions. I think we could have a perfectly successful implementation of "stored procedures" that supports ZERO ability to pass arguments in or out. That's quite likely to represent a good start. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
Christopher Browne wrote: > > Multiple resultsets in one call would be a good thing, though, no? > > > > cheers > > I *thought* the purpose of having stored procedures was to allow a > substrate supporting running multiple transactions, so it could do > things like: > - Managing vacuums > - Managing transactions > - Replacing some of the need for dblink. > - Being an in-DB piece that could manage LISTENs > > It seems to be getting "bikeshedded" into something with more > "functional argument functionality" than stored functions. > > I think we could have a perfectly successful implementation of "stored > procedures" that supports ZERO ability to pass arguments in or out. > That's quite likely to represent a good start. I am kind of confused too, particularly with the CALL syntax. I thought our function call usage was superior in every way to CALL, so why implement CALL? I assume for SQL-standards compliance, right? Does multiple result sets require CALL? I assume autonomous transactions don't require CALL. Are we assuming no one is going to want a function that allows multiple result sets or autonomous transactions? That seems unlikely. I would think CALL is independent of those features. Maybe we need those features to support SQL-standard CALL, and we will just add those features to functions too. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
2011/5/10 Bruce Momjian <bruce@momjian.us>: > Christopher Browne wrote: >> > Multiple resultsets in one call would be a good thing, though, no? >> > >> > cheers >> >> I *thought* the purpose of having stored procedures was to allow a >> substrate supporting running multiple transactions, so it could do >> things like: >> - Managing vacuums >> - Managing transactions >> - Replacing some of the need for dblink. >> - Being an in-DB piece that could manage LISTENs >> >> It seems to be getting "bikeshedded" into something with more >> "functional argument functionality" than stored functions. >> >> I think we could have a perfectly successful implementation of "stored >> procedures" that supports ZERO ability to pass arguments in or out. >> That's quite likely to represent a good start. > > I am kind of confused too, particularly with the CALL syntax. I thought > our function call usage was superior in every way to CALL, so why > implement CALL? I assume for SQL-standards compliance, right? Does > multiple result sets require CALL? I assume autonomous transactions > don't require CALL. > no - you are a little bit confused :). CALL and function execution share nothing. There are significant differences between a function and a procedure. A function is called only from the executor - from some plan - and you have to know the structure of the result before it runs. The execution of CALL is much simpler - you just execute the code, without a plan and without waiting for any particular result, if there even is one. > Are we assuming no one is going to want a function that allows multiple > result sets or autonomous transactions? That seems unlikely. I would > think CALL is independent of those features. Maybe we need those > features to support SQL-standard CALL, and we will just add those > features to functions too. > We can use a SETOF of cursors to return multiple result sets now. 
But there are a few complications: a) The client has to wait for all of the sets in a multiple result set to finish - at least in PL/pgSQL. b) The 'psql' client doesn't support unpacking a result that is a multiple result set. c) Using cursors isn't very comfortable - in comparison to MS SQL or MySQL. Regards Pavel > -- > Bruce Momjian <bruce@momjian.us> http://momjian.us > EnterpriseDB http://enterprisedb.com > > + It's impossible for everything to be true. +
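The cursor-based workaround Pavel describes can be sketched as follows. The function and table names (`multi_result`, `t1`, `t2`) are placeholders for illustration, not anything from the thread:

```sql
-- Return "multiple result sets" as a set of named cursors.
CREATE OR REPLACE FUNCTION multi_result() RETURNS SETOF refcursor AS $$
DECLARE
    c1 refcursor := 'result_one';
    c2 refcursor := 'result_two';
BEGIN
    OPEN c1 FOR SELECT * FROM t1;
    RETURN NEXT c1;
    OPEN c2 FOR SELECT * FROM t2;
    RETURN NEXT c2;
END;
$$ LANGUAGE plpgsql;

-- The client must fetch each set explicitly, inside one transaction:
BEGIN;
SELECT * FROM multi_result();
FETCH ALL FROM result_one;
FETCH ALL FROM result_two;
COMMIT;
```

This also illustrates complication (c): the caller has to know to FETCH from each named cursor within the surrounding transaction, whereas MS SQL or MySQL clients receive the result sets directly.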
On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > no - you are little bit confused :). CALL and function execution > shares nothing. There is significant differences between function and > procedure. Function is called only from executor - from some plan, and > you have to know a structure of result before run. The execution of > CALL is much simple - you just execute code - without plan and waiting > for any result - if there is. Now I'm a little confused, or you are. Surely any SQL has to be planned and executed, regardless of whether it appears in a function, a stored procedure, or anywhere else. Non-SQL statements within a stored procedure don't need to go through the planner and executor, but that's true in PL/python or PL/pgsql or whatever today. I think people are using the term "stored procedures" to refer to approximately whatever it is that they're unhappy that functions don't allow, and that's leading to a lot of people talking across each other. The main features seem to be (1) explicit transaction control and/or execution of commands like VACUUM that can't be invoked from within a transaction, (2) autonomous transactions, and (3) returning multiple result sets. But I don't think anybody would be desperately unhappy if it magically became possible to do those things from regular functions, unlikely as that may seem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2011/5/10 Robert Haas <robertmhaas@gmail.com>: > On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> no - you are little bit confused :). CALL and function execution >> shares nothing. There is significant differences between function and >> procedure. Function is called only from executor - from some plan, and >> you have to know a structure of result before run. The execution of >> CALL is much simple - you just execute code - without plan and waiting >> for any result - if there is. > > Now I'm a little confused, or you are. Surely any SQL has to be > planned and executed, regardless of whether it appears in a function, > a stored procedure, or anywhere else. Non-SQL statements within a > stored procedure don't need to go through the planner and executor, > but that's true in PL/python or PL/pgsql or whatever today. > The CALL statement is a "utility command" rather than SQL. It has to execute some non-SQL code. You could think of the CALL statement as a synonym for SELECT, but that isn't correct (in my opinion). "Stored procedures" predate stored functions (more correctly UDFs - user-defined functions). These "old time" stored procedures were simple - they were client code moved onto the server. Usually these procedures were executed in a different process or a different thread. Inside a procedure you had the full client-side functionality, without the network overhead. The CALL statement is more or less a remote call. It isn't an SQL statement. > I think people are using the term "stored procedures" to refer to > approximately whatever it is that they're unhappy that functions don't > allow, and that's leading to a lot of people talking across each > other. The main features seem to be (1) explicit transaction control > and/or execution of commands like VACUUM that can't be invoked from > within a transaction, (2) autonomous transactions, and (3) returning > multiple result sets. 
> But I don't think anybody would be desperately > unhappy if it magically became possible to do those things from > regular functions, unlikely as that may seem. > yes. @2 Autonomous transactions don't need stored procedures. An autonomous transaction can be isolated by a function flag, or by some special PL/pgSQL statement, like: BEGIN EXECUTE AUTONOMOUS '....'; END; @3 is possible now too - but it isn't very user friendly. Point 3 is strange: Oracle doesn't support it, and support in DB2 is a little bit odd. But it is well supported by MySQL, MSSQL, and maybe Informix and Sybase. Regards Pavel > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
Hi, On 05/10/2011 02:55 PM, Robert Haas wrote: > On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> no - you are little bit confused :). CALL and function execution >> shares nothing. There is significant differences between function and >> procedure. Function is called only from executor - from some plan, and >> you have to know a structure of result before run. The execution of >> CALL is much simple - you just execute code - without plan and waiting >> for any result - if there is. I think the distinction between function and procedure is misleading here. Some envision stored *procedures* to be able to return values, result sets and possibly even *multiple* result sets. > The main features seem to be (1) explicit transaction control > and/or execution of commands like VACUUM that can't be invoked from > within a transaction, I think that's the main point of stored procedures. > (2) autonomous transactions To me autonomous transactions seem orthogonal. Those can be used to implement (1) above, but might have other uses for regular transactions as well. (The point I'm taking home here is that you might want to control not only one concurrent transaction, but several from a "stored procedure". So far, I assumed only one.) > and (3) returning > multiple result sets. But I don't think anybody would be desperately > unhappy if it magically became possible to do those things from > regular functions, unlikely as that may seem. That point definitely is on my wish-list for UDFs already. I didn't think of this as having to do with stored procedures, either. Regards Markus
On 9 May 2011 20:52, Merlin Moncure <mmoncure@gmail.com> wrote: > On Mon, May 9, 2011 at 1:41 PM, Bruce Momjian <bruce@momjian.us> wrote: >> Josh Berkus wrote: >>> Peter, >>> >>> > I would like to collect some specs on this feature. So does anyone have >>> > links to documentation of existing implementations, or their own spec >>> > writeup? A lot of people appear to have a very clear idea of this >>> > concept in their own head, so let's start collecting those. >>> >>> Delta between SPs and Functions for PostgreSQL: >>> >>> * SPs are executed using CALL or EXECUTE, and not SELECT. >>> >>> * SPs do not return a value >>> ** optional: SPs *may* have OUT parameters. >> >> [ Late reply.] >> >> What is it about stored procedures that would require it not to return a >> value or use CALL? I am trying to understand what part of this is >> "procedures" (doesn't return a values, we decided there isn't much value >> for that syntax vs. functions), and anonymous transactions. > > FWICT the sql standard. The only summary of standard behaviors I can > find outside of the standard itself is here: > http://farrago.sourceforge.net/design/UserDefinedTypesAndRoutines.html. > Peter's synopsis of how the standard works is murky at best and > competing implementations are all over the place...SQL server's > 'CALL' feature is basically what I personally would like to see. It > would complement our functions nicely. > > Procedures return values and are invoked with CALL. Functions return > values and are in-query callable. > > The fact that 'CALL' is not allowed inside a query seems to make it > pretty darn convenient to make the additional distinction of allowing > transactional control statements there and not in functions. 
> You don't *have* to allow transactional control statements and could offer > this feature as an essentially syntax-sugar enhancement, but then run > the risk of boxing yourself out of useful properties of this feature > later on because of backwards-compatibility issues (in particular, the > assumption that you are in a running transaction in the procedure > body). I've seen no mention of SQL/PSM. Isn't all of this covered by that? -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Aug 31, 2011 at 9:00 AM, Thom Brown <thom@linux.com> wrote: > On 9 May 2011 20:52, Merlin Moncure <mmoncure@gmail.com> wrote: >> On Mon, May 9, 2011 at 1:41 PM, Bruce Momjian <bruce@momjian.us> wrote: >>> Josh Berkus wrote: >>>> Peter, >>>> >>>> > I would like to collect some specs on this feature. So does anyone have >>>> > links to documentation of existing implementations, or their own spec >>>> > writeup? A lot of people appear to have a very clear idea of this >>>> > concept in their own head, so let's start collecting those. >>>> >>>> Delta between SPs and Functions for PostgreSQL: >>>> >>>> * SPs are executed using CALL or EXECUTE, and not SELECT. >>>> >>>> * SPs do not return a value >>>> ** optional: SPs *may* have OUT parameters. >>> >>> [ Late reply.] >>> >>> What is it about stored procedures that would require it not to return a >>> value or use CALL? I am trying to understand what part of this is >>> "procedures" (doesn't return a values, we decided there isn't much value >>> for that syntax vs. functions), and anonymous transactions. >> >> FWICT the sql standard. The only summary of standard behaviors I can >> find outside of the standard itself is here: >> http://farrago.sourceforge.net/design/UserDefinedTypesAndRoutines.html. >> Peter's synopsis of how the standard works is murky at best and >> competing implementations are all over the place...SQL server's >> 'CALL' feature is basically what I personally would like to see. It >> would complement our functions nicely. >> >> Procedures return values and are invoked with CALL. Functions return >> values and are in-query callable. >> >> The fact that 'CALL' is not allowed inside a query seems to make it >> pretty darn convenient to make the additional distinction of allowing >> transactional control statements there and not in functions. 
>> You don't *have* to allow transactional control statements and could offer >> this feature as an essentially syntax-sugar enhancement, but then run >> the risk of boxing yourself out of useful properties of this feature >> later on because of backwards-compatibility issues (in particular, the >> assumption that you are in a running transaction in the procedure >> body). > > I've seen no mention of SQL/PSM. Isn't all of this covered by that? That's the $64k question. My take is that 'CALL' doesn't implicitly set up a transaction state, and a proper PSM implementation would allow transaction control mid-procedure. Functions will always be called in-transaction, since there is no way I can see to execute a function except from an outer query (or the special case of DO). I think there's zero point in making CALL work without dealing with the transaction issue -- in fact it could end up being a huge mistake to do so. Pavel's PSM implementation (see: http://www.pgsql.cz/index.php/SQL/PSM_Manual) works under the constraints of pg's understanding of what functions should and should not be allowed to do. It allows creation of PSM *functions* -- that's all. IMNSHO, stored procedures should run in-process, and the execution engine needs to be modified to not automatically spin up a transaction and a snapshot when running them, but must allow a PL to do that at appropriate times. plpgsql and the other PLs FWICT make no assumptions that strictly invalidate their use in that fashion, outside of some unfortunate ambiguity issues around 'begin', 'end', etc. If there is no current transaction, each statement should create one if it's determined that the statement is interfacing with the SQL engine in such a way that a transaction would be required, and immediately tear it down, exactly as if an SQL script were run inside the backend. The SPI interface can probably work 'as-is', and should probably return an error if you arrive in certain functions while not in a transaction. 
An out of process, autonomous transaction type implementation should probably not sit under stored procedures for a number of reasons -- mainly that it's going to expose too many implementation details to the user. For example, does a SP heavy app have 2*N running processes? Or do we slot them into a defined number of backends for that purpose? Yuck & yuck. I like the AT feature, and kludge it frequently via dblink, but it's a solution for a different set of problems. merlin
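The dblink kludge Merlin mentions looks roughly like this - a sketch, assuming the dblink extension is installed and that the connection string, function name, and audit_log table (all placeholders) fit your setup:

```sql
-- Emulate an autonomous transaction: dblink_exec runs the INSERT over
-- a second connection, so it commits independently of the caller.
CREATE OR REPLACE FUNCTION log_autonomous(msg text) RETURNS void AS $$
BEGIN
    PERFORM dblink_exec('dbname=mydb',
        format('INSERT INTO audit_log(message) VALUES (%L)', msg));
END;
$$ LANGUAGE plpgsql;

BEGIN;
SELECT log_autonomous('starting risky update');
-- ... work that may fail ...
ROLLBACK;  -- the audit_log row survives, having committed elsewhere
```

The "yuck" factor is visible even here: every such call burns a second backend and a fresh connection setup, which is exactly the implementation detail Merlin argues a real SP feature should not expose.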
On 8/31/11 12:15 PM, Merlin Moncure wrote: > An out of process, autonomous transaction type implementation should > probably not sit under stored procedures for a number of reasons -- > mainly that it's going to expose too many implementation details to > the user. For example, does a SP heavy app have 2*N running > processes? Or do we slot them into a defined number of backends for > that purpose? Yuck & yuck. I like the AT feature, and kludge it > frequently via dblink, but it's a solution for a different set of > problems. I think that transaction control without parallelism would be the 80% solution. That is, an SP has transaction control, but those transactions are strictly serial, and cannot be run in parallel. For example, if you were writing an SP in PL/pgSQL, each "BEGIN ... END" block would be an explicit transaction, and standalone-only statements would be allowed between BEGIN ... END blocks, or possibly in their own special block type (I prefer the latter). One issue we'd need to deal with is exception control around single-statement transactions and non-transactional statements (VACUUM, CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.). In some cases, the user is going to want to catch exceptions and abort the SP, and in other cases ignore them, so both need to be possible. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
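Under Josh's serial-transaction model, a procedure might hypothetically look like the following. None of this syntax existed at the time; the procedure body, CREATE PROCEDURE form, and table names are purely illustrative:

```sql
-- Hypothetical syntax: each BEGIN ... END block is its own explicit
-- transaction, and standalone-only statements run between blocks,
-- with no transaction in progress.
CREATE PROCEDURE rebuild_summary() AS $$
  BEGIN                    -- transaction 1
    DELETE FROM summary;
  END;

  VACUUM summary;          -- legal here: no transaction is open

  BEGIN                    -- transaction 2
    INSERT INTO summary
      SELECT dept, sum(amount) FROM orders GROUP BY dept;
  END;
$$;
```

The exception-control question Josh raises then becomes: if the VACUUM between the two blocks fails, does the procedure abort before transaction 2, or continue? Both behaviors would need to be expressible.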
On Thu, Sep 1, 2011 at 1:18 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 8/31/11 12:15 PM, Merlin Moncure wrote: >> An out of process, autonomous transaction type implementation should >> probably not sit under stored procedures for a number of reasons -- >> mainly that it's going to expose too many implementation details to >> the user. For example, does a SP heavy app have 2*N running >> processes? Or do we slot them into a defined number of backends for >> that purpose? Yuck & yuck. I like the AT feature, and kludge it >> frequently via dblink, but it's a solution for a different set of >> problems. > > I think that transaction control without parallelism would be the 80% > solution. That is, an SP has transaction control, but those > transactions are strictly serial, and cannot be run in parallel. For > example, if you were writing an SP in PL/pgSQL, each "BEGIN ... END" > block would be an explicit transaction, and standalone-only statements > be allowed between BEGIN ... END blocks, or possibly in their own > special block type (I prefer the latter). > > One issue we'd need to deal with is exception control around > single-statement transactions and non-transactional statements (VACUUM, > CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.). In some cases, the user > is going to want to catch exceptions and abort the SP, and in other > cases ignore them, so both need to be possible. Yep, "+1" on that. Leaving out parallelism, and having the mechanism operate under the auspices of a single connection, makes a fine start, and perhaps is enough even in the longer run. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
On Thu, Sep 1, 2011 at 12:18 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 8/31/11 12:15 PM, Merlin Moncure wrote: >> An out of process, autonomous transaction type implementation should >> probably not sit under stored procedures for a number of reasons -- >> mainly that it's going to expose too many implementation details to >> the user. For example, does a SP heavy app have 2*N running >> processes? Or do we slot them into a defined number of backends for >> that purpose? Yuck & yuck. I like the AT feature, and kludge it >> frequently via dblink, but it's a solution for a different set of >> problems. > > I think that transaction control without parallelism would be the 80% > solution. That is, an SP has transaction control, but those > transactions are strictly serial, and cannot be run in parallel. For > example, if you were writing an SP in PL/pgSQL, each "BEGIN ... END" > block would be an explicit transaction, and standalone-only statements > be allowed between BEGIN ... END blocks, or possibly in their own > special block type (I prefer the latter). > > One issue we'd need to deal with is exception control around > single-statement transactions and non-transactional statements (VACUUM, > CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.). In some cases, the user > is going to want to catch exceptions and abort the SP, and in other > cases ignore them, so both need to be possible. Totally agree -- was thinking about this very issue. One of the things I'd really like to see SPs be able to do is to abstract some of the nasty details of MVCC away from the client -- setting the isolation mode, retrying on serialization errors, etc. This requires error handling. Unfortunately, this (exception handling in a non-transactional context) is probably going to add some complexity to the implementation. Are we on the right track here (that is, maybe we really *should* be looking at out of process execution)? How do procedures fit in terms of execution from the tcop down? merlin
Are "stored procedures" planned for the future? I think it is the "most missing" feature in Postgres today. Using dblink to emulate commit in a transaction is very complicated -- View this message in context: http://postgresql.1045698.n5.nabble.com/stored-procedures-tp4331060p5752274.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Hello
2013/4/16 aasat <satriani@veranet.pl>
Are "stored procedures" planned for the future? I think it is the "most missing" feature
in Postgres today.
It is on the ToDo list, but probably nobody is working on this feature at the moment.
Using dblink to emulate commit in a transaction is very complicated
Autonomous transactions will probably be implemented first - and they should be a really nice feature.
Regards
Pavel