Discussion: "stored procedures"
So the topic of "real" "stored procedures" came up again. Meaning a function-like object that executes outside of a regular transaction, with the ability to start and stop SQL transactions itself. I would like to collect some specs on this feature. So does anyone have links to documentation of existing implementations, or their own spec writeup? A lot of people appear to have a very clear idea of this concept in their own head, so let's start collecting those.
Hi Peter 2011/4/21 Peter Eisentraut <peter_e@gmx.net>: > So the topic of "real" "stored procedures" came up again. Meaning a > function-like object that executes outside of a regular transaction, > with the ability to start and stop SQL transactions itself. > > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. > I had a patch for "transactional" procedures, but it has been lost :( http://okbob.blogspot.com/2007/11/stacked-recordset-multirecordset.html What I (we) expect - very important points: 1. support for explicit transaction control, not only subtransactions; 2. the correct, usual behavior of OUT parameters (important for the JDBC people) *** attention: overloading interacts with OUT parameters too ***. Not necessary but nice: 3. support for multiple result sets and the RETURN_STATUS variable (RETURN_STATUS is defined by ANSI). Regards Pavel > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Apr 21, 2011 at 11:24 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > So the topic of "real" "stored procedures" came up again. Meaning a > function-like object that executes outside of a regular transaction, > with the ability to start and stop SQL transactions itself. > > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. EDB has an implementation of this in Advanced Server. A stored procedure can issue a COMMIT, which commits the current transaction and begins a new one. This might or might not be what people are imagining for this feature. If we end up doing something else, one thing to consider is the impact on third-party tools like PGPOOL, which currently keep track of whether or not a transaction is in progress by snooping on the stream of SQL commands. If a procedure can be started with no transaction in progress and return with one open, or the other way around, that method will break horribly. That's not necessarily a reason not to do it, but I suspect we would want to add some kind of protocol-level information about the transaction state instead so that such tools could continue to work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > EDB has an implementation of this in Advanced Server. A stored > procedure can issue a COMMIT, which commits the current transaction > and begins a new one. This might or might not be what people are > imagining for this feature. If we end up doing something else, one > thing to consider is the impact on third-party tools like PGPOOL, > which currently keep track of whether or not a transaction is in > progress by snooping on the stream of SQL commands. If a procedure > can be started with no transaction in progress and return with one > open, or the other way around, that method will break horribly. > That's not necessarily a reason not to do it, but I suspect we would > want to add some kind of protocol-level information about the > transaction state instead so that such tools could continue to work. Huh? There's been a transaction state indicator in the protocol since 7.4 (see ReadyForQuery). It's not our problem if PGPOOL is still using methods that were appropriate ten years ago. regards, tom lane
On Thu, Apr 21, 2011 at 12:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> EDB has an implementation of this in Advanced Server. A stored >> procedure can issue a COMMIT, which commits the current transaction >> and begins a new one. This might or might not be what people are >> imagining for this feature. If we end up doing something else, one >> thing to consider is the impact on third-party tools like PGPOOL, >> which currently keep track of whether or not a transaction is in >> progress by snooping on the stream of SQL commands. If a procedure >> can be started with no transaction in progress and return with one >> open, or the other way around, that method will break horribly. >> That's not necessarily a reason not to do it, but I suspect we would >> want to add some kind of protocol-level information about the >> transaction state instead so that such tools could continue to work. > > Huh? There's been a transaction state indicator in the protocol since > 7.4 (see ReadyForQuery). It's not our problem if PGPOOL is still using > methods that were appropriate ten years ago. Hmm. Well, maybe we need some PGPOOL folks to weigh in. Possibly it's just a case of "it ain't broke, so we haven't fixed it". -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
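For reference, the transaction state Tom mentions rides on the ReadyForQuery message ('Z') that the backend sends at the end of each command cycle: after the int32 length comes a single status byte, 'I' (idle), 'T' (inside a transaction block), or 'E' (failed transaction). A minimal sketch of how a pooler could decode it, assuming it already splits the stream into protocol messages:

```python
import struct

# ReadyForQuery ('Z') carries one status byte after the int32 length:
#   b'I' - idle, b'T' - inside a transaction block, b'E' - failed transaction
STATUS = {b"I": "idle", b"T": "in_transaction", b"E": "failed_transaction"}

def transaction_state(msg: bytes) -> str:
    """Decode a ReadyForQuery message (type byte + int32 length + status byte)."""
    if msg[0:1] != b"Z":
        raise ValueError("not a ReadyForQuery message")
    (length,) = struct.unpack("!I", msg[1:5])  # length includes itself: always 5
    if length != 5:
        raise ValueError("unexpected ReadyForQuery length")
    return STATUS[msg[5:6]]

# Example: the backend reports an open transaction block
assert transaction_state(b"Z" + struct.pack("!I", 5) + b"T") == "in_transaction"
```

A pooler watching this byte needs no SQL snooping at all, which is the point of Tom's objection.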
Peter, > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. Delta between SPs and Functions for PostgreSQL: * SPs are executed using CALL or EXECUTE, and not SELECT. * SPs do not return a value ** optional: SPs *may* have OUT parameters. * SPs have internal transactions including begin/commit ** optional: SPs can run non-transaction statements, like CREATE INDEX CONCURRENTLY and VACUUM ** corollary: SPs may not be called as part of a larger query ** question: if an SP is called by another SP, what is its transaction context? * optional: SPs can return multisets (ala SQL Server). ** question: how would multisets be handled on the client end? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Hello 2011/4/21 Josh Berkus <josh@agliodbs.com>: > Peter, > >> I would like to collect some specs on this feature. So does anyone have >> links to documentation of existing implementations, or their own spec >> writeup? A lot of people appear to have a very clear idea of this >> concept in their own head, so let's start collecting those. > > Delta between SPs and Functions for PostgreSQL: > > * SPs are executed using CALL or EXECUTE, and not SELECT. > > * SPs do not return a value > ** optional: SPs *may* have OUT parameters. An SP can return a value - a result status or RETURNED_SQLSTATE. The result status is a hidden OUT parameter. > > * SPs have internal transactions including begin/commit > ** optional: SPs can run non-transaction statements, > like CREATE INDEX CONCURRENTLY and VACUUM > ** corollary: SPs may not be called as part of a larger query > ** question: if an SP is called by another SP, what is its > transaction context? > > * optional: SPs can return multisets (ala SQL Server). > ** question: how would multisets be handled on the client end? > You should use some "next" function to iterate over the result sets: http://dev.mysql.com/doc/refman/5.0/en/mysql-next-result.html A similar function exists in the MSSQL API too. Regards Pavel Stehule > -- > Josh Berkus > PostgreSQL Experts Inc. > http://pgexperts.com
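The "next result" iteration Pavel points at can be sketched on the client side; the names here (a MultiResultCursor class with a next_result() method) are hypothetical stand-ins for whatever the driver would actually expose, modeled loosely on mysql_next_result():

```python
# Hypothetical client-side shape for a multi-result-set CALL: the driver
# exposes the current result set plus a way to advance to the next one
# until none remain, as in the MySQL and MSSQL client APIs.
class MultiResultCursor:
    def __init__(self, result_sets):
        self._sets = list(result_sets)   # e.g. produced by CALL my_proc(...)
        self._pos = 0

    def fetchall(self):
        return self._sets[self._pos]

    def next_result(self) -> bool:
        """Advance to the next result set; False when exhausted."""
        if self._pos + 1 >= len(self._sets):
            return False
        self._pos += 1
        return True

# Drain every result set, as a tool like psql would when displaying each in turn
cur = MultiResultCursor([[("a", 1)], [("b", 2), ("c", 3)]])
all_sets = [cur.fetchall()]
while cur.next_result():
    all_sets.append(cur.fetchall())
assert all_sets == [[("a", 1)], [("b", 2), ("c", 3)]]
```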
I'm pretty close to agreement with Josh, I think. Josh Berkus <josh@agliodbs.com> wrote: > Delta between SPs and Functions for PostgreSQL: > > * SPs are executed using CALL or EXECUTE, and not SELECT. Agreed, although some products will search for a matching procedure name if the start of a statement doesn't match any reserved word. That can be handy -- you run them more or less like commands. > * SPs do not return a value I've used some products where these were available, although in some cases only setting what in PostgreSQL would be the equivalent of an integer session GUC. > ** optional: SPs *may* have OUT parameters. Support for those would be important to handle some common uses of SPs. > * SPs have internal transactions including begin/commit Yeah. Entering or leaving an SP should not start or end a transaction. BEGIN, COMMIT, ROLLBACK, and SAVEPOINT should all be available and should not disrupt statement flow. > ** optional: SPs can run non-transaction statements, > like CREATE INDEX CONCURRENTLY and VACUUM That seems important. > ** corollary: SPs may not be called as part of a larger query OK. > ** question: if an SP is called by another SP, what is its > transaction context? Entering or leaving an SP should not start or end a transaction. > * optional: SPs can return multisets (ala SQL Server). I think that's important. > ** question: how would multisets be handled on the client end? In previous discussions there seemed to be a feeling that unless we were going to go to a new major version of the protocol, the return from an SP would be an array of result sets. We would probably want to reserve the first one for OUT parameters (and if we decide to support it, the return value). Tools like psql would need to display each in its turn, similar to what we do for some backslash commands. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Josh Berkus <josh@agliodbs.com> wrote: >> ** question: if an SP is called by another SP, what is its >> transaction context? > Entering or leaving an SP should not start or end a transaction. That all sounds mighty hand-wavy and at serious risk of tripping over implementation details. Some things to think about: 1. Are you expecting the procedure definition to be fetched from a system catalog? You're going to need to be inside a transaction to do that. 2. Are you expecting the procedure to take any input parameters? You're going to need to be inside a transaction to evaluate the inputs, unless perhaps you restrict the feature to an extremely lobotomized subset of possible arguments (no user-defined types, no expressions, just for starters). 3. What sort of primitive operations do you expect the SP to be able to execute "outside a transaction"? The plpgsql model where all the primitive operations are really SQL ain't gonna work. I think that we could finesse #1 and #2, along these lines: The CALL command is ordinary SQL but not allowed inside a transaction block, much like some existing commands like VACUUM. So we start a transaction to parse and execute it. The CALL looks up the procedure definition and evaluates any input arguments. It then copies this info to some outside-the-transaction memory context, terminates its transaction, and calls the procedure. On return it starts a new transaction, in which it can call the output functions that are going to have to be executed in order to pass anything back to the client. (This implies that OUT argument values are collected up during SP execution and not actually passed back to the client till later. People who were hoping to stream vast amounts of data to the client will not be happy. But I see no way around that unless you want to try to execute output functions outside a transaction, which strikes me as a quagmire.) I'm less sure what to do about #3. 
The most attractive approach would probably be to make people use a non-SQL script interpreter --- perl, python, or whatever floats your boat --- which would likely mean that we have not just one SP implementation language but N of them. But we've solved that problem before. Calling another SP ... particularly one with a different implementation language ... could be a bit tricky too. The above proposal assumes that SPs are always entered outside a transaction, but do we want to make that same restriction for the call-another-SP case? And if not, how's it going to work? Again, you'll have to be inside a transaction at least long enough to get the SP's definition out of the catalogs. regards, tom lane
On Thu, Apr 21, 2011 at 1:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > 3. What sort of primitive operations do you expect the SP to be > able to execute "outside a transaction"? The plpgsql model where > all the primitive operations are really SQL ain't gonna work. > I'm less sure what to do about #3. The most attractive approach would > probably be to make people use a non-SQL script interpreter --- perl, > python, or whatever floats your boat --- which would likely mean that > we have not just one SP implementation language but N of them. But > we've solved that problem before. Does this mean you do or don't expect plpgsql to be able to run as a procedure? Should SPI-based routines generally be able to run as a procedure (I hope so)? If so, what API enhancements would be needed? (I was thinking of SPI_is_proc, or something like that.) I'd like to see plpgsql work as much as possible as it does now, except obviously you can't have exception handlers. What about cancelling? Cancel the currently running query, or the whole procedure (I'm assuming the latter)? How would that work? > Calling another SP ... particularly one with a different implementation > language ... could be a bit tricky too. The above proposal assumes that > SPs are always entered outside a transaction, but do we want to make > that same restriction for the call-another-SP case? And if not, how's > it going to work? Again, you'll have to be inside a transaction at > least long enough to get the SP's definition out of the catalogs. This restriction (CALL only when no transaction is open) is OK, I think. You can always code up a function otherwise. merlin
2011/4/21 Tom Lane <tgl@sss.pgh.pa.us>: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Josh Berkus <josh@agliodbs.com> wrote: >>> ** question: if an SP is called by another SP, what is its >>> transaction context? > >> Entering or leaving an SP should not start or end a transaction. > > That all sounds mighty hand-wavy and at serious risk of tripping over > implementation details. Some things to think about: That doesn't mean the SQL inside an SP is non-transactional. A stored procedure is just a client module moved onto the server. You can call SQL statements from psql without an outer implicit or explicit transaction too. The point is that a CALL statement should not start an outer transaction when one isn't requested, but all the inner SQL statements still run in their own transactions. The questions about mutable or immutable parameters are important - but they don't mean an SP without an outer transaction is impossible. Regards Pavel > > 1. Are you expecting the procedure definition to be fetched from a > system catalog? You're going to need to be inside a transaction > to do that. > > 2. Are you expecting the procedure to take any input parameters? > You're going to need to be inside a transaction to evaluate the > inputs, unless perhaps you restrict the feature to an extremely > lobotomized subset of possible arguments (no user-defined types, > no expressions, just for starters). > > 3. What sort of primitive operations do you expect the SP to be > able to execute "outside a transaction"? The plpgsql model where > all the primitive operations are really SQL ain't gonna work. > > I think that we could finesse #1 and #2, along these lines: > The CALL command is ordinary SQL but not allowed inside a transaction > block, much like some existing commands like VACUUM. So we start a > transaction to parse and execute it. The CALL looks up the procedure > definition and evaluates any input arguments.
It then copies this info to > some outside-the-transaction memory context, terminates its transaction, > and calls the procedure. On return it starts a new transaction, in > which it can call the output functions that are going to have to be > executed in order to pass anything back to the client. (This implies > that OUT argument values are collected up during SP execution and not > actually passed back to the client till later. People who were hoping > to stream vast amounts of data to the client will not be happy. But > I see no way around that unless you want to try to execute output > functions outside a transaction, which strikes me as a quagmire.) > > I'm less sure what to do about #3. The most attractive approach would > probably be to make people use a non-SQL script interpreter --- perl, > python, or whatever floats your boat --- which would likely mean that > we have not just one SP implementation language but N of them. But > we've solved that problem before. > > Calling another SP ... particularly one with a different implementation > language ... could be a bit tricky too. The above proposal assumes that > SPs are always entered outside a transaction, but do we want to make > that same restriction for the call-another-SP case? And if not, how's > it going to work? Again, you'll have to be inside a transaction at > least long enough to get the SP's definition out of the catalogs. > > regards, tom lane
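Tom's proposed CALL sequence can be sketched as a small state machine. Everything here is illustrative, not actual backend API: the catalog is a dict, eval_expr stands in for real expression evaluation, and the trace marks where the bracketing transactions begin and end:

```python
# Illustrative control flow for the proposal: CALL is rejected inside a
# transaction block, brackets its setup (catalog lookup, argument
# evaluation) in one transaction, runs the procedure body outside any
# transaction, then opens a second transaction to return OUT values.
def call_procedure(catalog, name, arg_exprs, in_transaction_block=False):
    if in_transaction_block:
        raise RuntimeError("CALL cannot run inside a transaction block")

    trace = ["begin txn"]                 # txn 1: parse CALL, look up proc,
    proc = catalog[name]                  #        evaluate input arguments
    args = [eval_expr(e) for e in arg_exprs]
    saved = (proc, list(args))            # copy to outside-txn memory context
    trace.append("commit txn")

    outs = saved[0](*saved[1])            # run the body; it may start and
    trace.append("run procedure")         # commit transactions itself

    trace.append("begin txn")             # txn 2: run output functions and
    trace.append("return OUT params")     #        send OUT values to client
    trace.append("commit txn")
    return outs, trace

def eval_expr(e):
    return e  # stand-in for real input-expression evaluation

catalog = {"transfer": lambda a, b: {"moved": a + b}}
outs, trace = call_procedure(catalog, "transfer", [40, 2])
assert outs == {"moved": 42}
```

Note how the OUT values are only shipped back in the second transaction, which is exactly why Tom observes that streaming results during execution would not work under this scheme.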
On Thu, Apr 21, 2011 at 2:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Josh Berkus <josh@agliodbs.com> wrote: >>> ** question: if an SP is called by another SP, what is its >>> transaction context? > >> Entering or leaving an SP should not start or end a transaction. > > That all sounds mighty hand-wavy and at serious risk of tripping over > implementation details. Some things to think about: > > 1. Are you expecting the procedure definition to be fetched from a > system catalog? You're going to need to be inside a transaction > to do that. > > 2. Are you expecting the procedure to take any input parameters? > You're going to need to be inside a transaction to evaluate the > inputs, unless perhaps you restrict the feature to an extremely > lobotomized subset of possible arguments (no user-defined types, > no expressions, just for starters). > > 3. What sort of primitive operations do you expect the SP to be > able to execute "outside a transaction"? The plpgsql model where > all the primitive operations are really SQL ain't gonna work. I think we could handle a lot of these details cleanly if we had autonomous transactions as a system primitive. When you enter a stored procedure at the outermost level, you begin a transaction, which will remain open until the outermost stored procedure exits. Any transactions that the stored procedure begins, commits, or rolls back are in fact autonomous subtransactions under the hood. Possibly conditions like IF (1/0) THEN ... END IF that throw run time errors get evaluated in the outer transaction context, so any errors stops execution at that point - and we also avoid beginning and ending a gabazillion transactions. Possibly I am still waving my hands. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Apr 21, 2011 at 2:37 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Apr 21, 2011 at 2:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >>> Josh Berkus <josh@agliodbs.com> wrote: >>>> ** question: if an SP is called by another SP, what is its >>>> transaction context? >> >>> Entering or leaving an SP should not start or end a transaction. >> >> That all sounds mighty hand-wavy and at serious risk of tripping over >> implementation details. Some things to think about: >> >> 1. Are you expecting the procedure definition to be fetched from a >> system catalog? You're going to need to be inside a transaction >> to do that. >> >> 2. Are you expecting the procedure to take any input parameters? >> You're going to need to be inside a transaction to evaluate the >> inputs, unless perhaps you restrict the feature to an extremely >> lobotomized subset of possible arguments (no user-defined types, >> no expressions, just for starters). >> >> 3. What sort of primitive operations do you expect the SP to be >> able to execute "outside a transaction"? The plpgsql model where >> all the primitive operations are really SQL ain't gonna work. > > I think we could handle a lot of these details cleanly if we had > autonomous transactions as a system primitive. When you enter a > stored procedure at the outermost level, you begin a transaction, > which will remain open until the outermost stored procedure exits. If you do it that (base it on AT) way, then you can't: 1) call any utility command (vacuum, etc) 2) run for an arbitrary amount of time 3) discard any locks (except advisory) 4) deal with serialization isolation/mvcc snapshot issues that plague functions. Points 2 & (especially) 4 for me are painful. 
#4 explained: If you are trying to tuck all the gory MVCC details into server-side functions, there is no really effective way to prevent serialization errors, because the snapshot is already taken when you enter the function. Even if you LOCK something on function line #1, it's already too late. Non-transactional procedures don't have this problem, and they allow encapsulating all that nastiness in the server. merlin
On Thu, Apr 21, 2011 at 3:51 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > On Thu, Apr 21, 2011 at 2:37 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Thu, Apr 21, 2011 at 2:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >>>> Josh Berkus <josh@agliodbs.com> wrote: >>>>> ** question: if an SP is called by another SP, what is its >>>>> transaction context? >>> >>>> Entering or leaving an SP should not start or end a transaction. >>> >>> That all sounds mighty hand-wavy and at serious risk of tripping over >>> implementation details. Some things to think about: >>> >>> 1. Are you expecting the procedure definition to be fetched from a >>> system catalog? You're going to need to be inside a transaction >>> to do that. >>> >>> 2. Are you expecting the procedure to take any input parameters? >>> You're going to need to be inside a transaction to evaluate the >>> inputs, unless perhaps you restrict the feature to an extremely >>> lobotomized subset of possible arguments (no user-defined types, >>> no expressions, just for starters). >>> >>> 3. What sort of primitive operations do you expect the SP to be >>> able to execute "outside a transaction"? The plpgsql model where >>> all the primitive operations are really SQL ain't gonna work. >> >> I think we could handle a lot of these details cleanly if we had >> autonomous transactions as a system primitive. When you enter a >> stored procedure at the outermost level, you begin a transaction, >> which will remain open until the outermost stored procedure exits. > > If you do it that (base it on AT) way, then you can't: > 1) call any utility command (vacuum, etc) > 2) run for an arbitrary amount of time > 3) discard any locks (except advisory) > 4) deal with serialization isolation/mvcc snapshot issues that plague functions. > > Points 2 & (especially) 4 for me are painful. 
> > #4 explained: > If you are trying to tuck all the gory mvcc details into server side > functions, there is no real effective way to prevent serialization > errors because the snapshot is already made when you enter the > function. Even if you LOCK something on function line#1, it's already > too late. No transaction procedures don't have this problem and allow > encapsulating all that nastiness in the server. Yes, those sound like a potent set of restrictions that "gut" what the facility ought to be able to be useful for. If what you want is something that runs inside a pre-existing transaction, that rules out doing VACUUM or, really, *anything* that generates transactions, without jumping through hoops to try to change their behaviour. My preference would be to expect that stored procedures are sure to generate at least one transaction, and potentially as many more as they choose to generate. One of the most recent things I implemented was a process that does bulk updates to customer balances. We don't want the balance tuples locked, so the process needs to COMMIT after each update. At present, that means I'm doing a round trip from client to server each time. If I had these "autonomous transaction procedures," I could perhaps do the whole thing in a stored procedure, which would: a) Pull the list of transactions it's supposed to process; b) Loop on them: - BEGIN; Do the processing for a transaction, COMMIT. That's not terribly different from a vacuum utility that: a) Pulls a list of tables it's supposed to vacuum; b) Loop on them: VACUUM the table Autovac ought to make that sort of thing limitedly useful; you'd usually rather just use autovac. Mind you, we might discover that implementing autovac mostly in the stored procedure language is easier and better than having it mostly in C. And this might further make it easy to add "hooks" to allow site-specific logic to affect autovacuum policy. 
(Note that Slony-I versions 1.0, 1.1, and possibly 1.2 had the 'cleanup thread' - which, notably, vacuums tables - written mostly in C. 2.0 shifted the bulk of the logic into pl/pgsql, which made it much simpler to read and verify, and made some of the components usable by administrators.) I'd expect SPs NOT to be nestable, or at least, not in a sense that allows rolling back the activity of a "child" that thought it COMMITted its work. It seems to me that we've already got perfectly good stored functions that run strictly inside an existing transactional context - if you want logic that does that, use a stored function; it's already perfectly good for that. If you want a stored procedure that runs its own transaction(s), do so; don't expect every kind of transactional logic out of SPs. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
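The balance-update pattern described above - commit after each row so no tuple stays locked for the whole batch - is easy to sketch with Python's sqlite3 standing in for the server-side procedure loop (the schema and amounts are invented for illustration):

```python
import sqlite3

# Commit-per-row batch update: each balance change is its own transaction,
# so row locks are released immediately instead of being held for the run.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balances (customer TEXT PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO balances VALUES (?, ?)",
                 [("alice", 100), ("bob", 200)])
conn.commit()

pending = [("alice", -30), ("bob", 50)]   # a) the list of work to process
for customer, delta in pending:           # b) loop, committing per item
    conn.execute("UPDATE balances SET amount = amount + ? WHERE customer = ?",
                 (delta, customer))
    conn.commit()                         # locks on this row drop here

rows = dict(conn.execute("SELECT customer, amount FROM balances"))
assert rows == {"alice": 70, "bob": 250}
```

Today this loop has to live in the client, with a round trip per commit; a transaction-controlling stored procedure would let the whole loop run server-side.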
Peter Eisentraut wrote: > So the topic of "real" "stored procedures" came up again. Meaning a > function-like object that executes outside of a regular transaction, > with the ability to start and stop SQL transactions itself. > > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. I've thought a lot about this too. The general case of a stored procedure should be all-powerful, and be able to directly invoke any code written in SQL or other languages that a DBMS client can directly invoke on the DBMS, as if it were a client, except that the procedure is stored and executed entirely in the DBMS. But the stored procedure also has its own lexical variables and supports conditionals and iteration and recursion. A stored procedure is invoked as a statement and doesn't have a "return" value; in contrast, a function has a return value and is invoked within a value expression of a statement. A stored procedure can see and update the database, and can have IN/INOUT/OUT parameters. A stored procedure can have side-effects out of band, such as user I/O, if Pg supports that. The general stored procedure should be orthogonal to other concerns, in particular to transactions and savepoints; executing one should not implicitly start or commit or rollback a transaction or savepoint. However, it should be possible to explicitly declare that a procedure is a transaction, so that its start and end are neatly paired regardless of how the procedure exits - that is, the transaction lifetime is attached to the procedure's lexical scope - but this would be optional. A stored procedure should be able to do data manipulation, data definition, explicit transaction control (except perhaps when defined to be a transaction), privilege control, message passing, and so on.
As for semantics, let's say that when a stored procedure is invoked, its definition is pulled from the system catalog in a snapshot and compiled, then run normally no matter what it does, even if the definition of the procedure itself is changed during its execution; in the latter case, that just means that once the execution finishes, subsequent calls would invoke the updated version or fail. So just compiling the procedure may need a catalog lock or whatever, but once it starts executing a transaction isn't required. Any stored procedure in general should be able to invoke stored procedures, to any level of nesting, just like in any normal programming language. There might be restrictions on what individual procedures can do depending on how they're declared; for example, if one is declared to have a scope-bound transaction, then it, or ones it invokes, can't contain explicit transaction control statements. But such restrictions are an orthogonal or case-dependent matter. (When we have a distinct stored procedure, I also believe that a stored function should be more restricted, such as only having IN parameters and not being able to see the database except by way of its parameters, and that it should be deterministic. But that ship has sailed and I'm not going to argue for any changes to functions.) -- Darren Duncan
Merlin Moncure <mmoncure@gmail.com> writes: > On Thu, Apr 21, 2011 at 1:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> 3. What sort of primitive operations do you expect the SP to be >> able to execute "outside a transaction"? The plpgsql model where >> all the primitive operations are really SQL ain't gonna work. > Does this mean you do or don't expect plpgsql to be able to run as > procedure? Should SPI based routines generally be able to run as a > procedure (I hope so)? If so, what API enhancements would be needed? > (I was thinking, SPI_is_proc, or something like that). I'd like to > see plpgsql work as much as possible as it does now, except obviously > you can't have exception handlers. You can't have arithmetic, comparisons, or much of anything outside a transaction with plpgsql. That model just plain doesn't work for this purpose, I think. You really want a control language that's independent of the SQL engine, and for better or worse plpgsql is built inside that engine. > What about cancelling? Cancel the current running query, or the whole > procedure (I'm assuming the latter? How would that work? Good question. If you're imagining that the SP could decide to cancel a database request partway through, it seems even further afield from what could reasonably be done in a single-threaded backend. Maybe we should think about the SP controlling a second backend (or even multiple backends?) that's executing the "transactional" operations. dblink on steroids, as it were. regards, tom lane
> Robert Haas <robertmhaas@gmail.com> writes: >> EDB has an implementation of this in Advanced Server. A stored >> procedure can issue a COMMIT, which commits the current transaction >> and begins a new one. This might or might not be what people are >> imagining for this feature. If we end up doing something else, one >> thing to consider is the impact on third-party tools like PGPOOL, >> which currently keep track of whether or not a transaction is in >> progress by snooping on the stream of SQL commands. If a procedure >> can be started with no transaction in progress and return with one >> open, or the other way around, that method will break horribly. >> That's not necessarily a reason not to do it, but I suspect we would >> want to add some kind of protocol-level information about the >> transaction state instead so that such tools could continue to work. > > Huh? There's been a transaction state indicator in the protocol since > 7.4 (see ReadyForQuery). It's not our problem if PGPOOL is still using > methods that were appropriate ten years ago. Pgpool has been using the info since 2004 (7.4 was born in 2003). -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
On 4/21/11 3:07 PM, Tom Lane wrote: > Maybe we should think about the SP controlling a second backend (or even > multiple backends?) that's executing the "transactional" operations. > dblink on steroids, as it were. This is how people are doing this now (using dblink I mean). -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > On 4/21/11 3:07 PM, Tom Lane wrote: >> Maybe we should think about the SP controlling a second backend (or even >> multiple backends?) that's executing the "transactional" operations. >> dblink on steroids, as it were. > This is how people are doing this now (using dblink I mean). Right, and it works. But it's notationally painful, management of the connection information poses security issues, etc etc. Perhaps those sorts of things could be addressed, though. regards, tom lane
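For readers following along, the dblink workaround being discussed looks roughly like this (a sketch only; it assumes the dblink module is installed, a loopback connection is permitted, and a hypothetical audit_log table exists):

```sql
-- Poor man's autonomous transaction: the INSERT runs in a second backend
-- and commits there immediately, so it survives even if the caller's
-- transaction later rolls back.
CREATE OR REPLACE FUNCTION log_autonomous(msg text) RETURNS void AS $$
BEGIN
    PERFORM dblink_connect('loopback', 'dbname=' || current_database());
    PERFORM dblink_exec('loopback',
        'INSERT INTO audit_log(message) VALUES (' || quote_literal(msg) || ')');
    PERFORM dblink_disconnect('loopback');
END;
$$ LANGUAGE plpgsql;
```

The notational pain and the connection-credential management visible here are exactly the drawbacks Tom identifies.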
On Apr 21, 2011, at 3:51 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > If you do it that (base it on AT) way, then you can't: > 1) call any utility command (vacuum, etc) > 2) run for an arbitrary amount of time > 3) discard any locks (except advisory) > 4) deal with serialization isolation/mvcc snapshot issues that plague functions. It is not obvious to me that you cannot do these things. > ...Robert
>> What about cancelling? Cancel the current running query, or the whole >> procedure (I'm assuming the latter? How would that work? > > Good question. If you're imagining that the SP could decide to cancel a > database request partway through, it seems even further afield from what > could reasonably be done in a single-threaded backend. > > Maybe we should think about the SP controlling a second backend (or even > multiple backends?) that's executing the "transactional" operations. > dblink on steroids, as it were. SPs are executed in a separate process in DB2 and in Oracle - but sometimes there is significant overhead from the interprocess communication, which is the reason collections are popular in PL/SQL. A special backend for SPs is probably the simplest solution - but there can be performance problems :( Regards Pavel
On Thu, Apr 21, 2011 at 8:34 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Apr 21, 2011, at 3:51 PM, Merlin Moncure <mmoncure@gmail.com> wrote: >> If you do it that (base it on AT) way, then you can't: >> 1) call any utility command (vacuum, etc) >> 2) run for an arbitrary amount of time >> 3) discard any locks (except advisory) >> 4) deal with serialization isolation/mvcc snapshot issues that plague functions. > > It is not obvious to me that you cannot do these things. yeah...I think I misunderstood what you were saying ("When you enter a stored procedure at the outermost level, you begin a transaction..."). Those restrictions only apply when there is an open transaction controlling the context of what is running. If you are handing the command textually off to another backend which then runs it, then you are mostly good...although #2 still might be a problem, and #3 if you happen to grab any. merlin
On Thu, Apr 21, 2011 at 5:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> On Thu, Apr 21, 2011 at 1:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> 3. What sort of primitive operations do you expect the SP to be >>> able to execute "outside a transaction"? The plpgsql model where >>> all the primitive operations are really SQL ain't gonna work. > >> Does this mean you do or don't expect plpgsql to be able to run as >> procedure? Should SPI based routines generally be able to run as a >> procedure (I hope so)? If so, what API enhancements would be needed? >> (I was thinking, SPI_is_proc, or something like that). I'd like to >> see plpgsql work as much as possible as it does now, except obviously >> you can't have exception handlers. > > You can't have arithmetic, comparisons, or much of anything outside a > transaction with plpgsql. That model just plain doesn't work for this > purpose, I think. You really want a control language that's independent > of the SQL engine, and for better or worse plpgsql is built inside that > engine. I'm arguing against a separate language, or at least questioning if plpgsql truly can't be run without an outer transaction context. Just because a transaction isn't set up on procedure invocation doesn't mean you can't set one up to do things in the procedure? It wouldn't bother me in the least if, in plpgsql procedures, you had to set up and tear down a transaction on every line. You can always dip into a function if/when you need the turbo boost. plpgsql is kind of a special case anyways in that it uses the sql engine for a lot of core operations. The other pls use their own engines to manage non-query code. Setting up a new control language implies that postgres needs to know the procedure language textually so it can read off a line and do something with it.
I don't like this restriction -- wouldn't it be better if the current crop of language handlers could run procedures without major changes? C functions with SPI? However it's implemented internally, the more userland mindspace recovered for writing procedures, the better off we are. merlin
Merlin Moncure <mmoncure@gmail.com> writes: > It wouldn't bother me in the lest that if in plpgsql procedures if you > had to set up and tear down a transaction on every line. It would once you noticed the performance impact ... regards, tom lane
On Fri, Apr 22, 2011 at 9:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> It wouldn't bother me in the lest that if in plpgsql procedures if you >> had to set up and tear down a transaction on every line. > > It would once you noticed the performance impact ... I'm aware of the impact. It would suck, but perhaps it's not *quite* as bad as you think, considering: *) faster performance is only an explicit transaction/function away *) perhaps some optimizations are possible...x := x + 1; can be directly evaluated? *) simple logic (IF <variable>) can be directly evaluated? *) how bad is it really? from my measurements in queries/sec: 6.7k selects single client, 12k selects piped through a single-user backend, 13.5k piped through a single-user backend in one transaction, 23k in plpgsql 'execute' in a loop (which is really two queries, one to build the query and one to execute), 100k in non-dynamic query plpgsql in a loop. even if our plpgsql lines/sec dropped from 100k to 10k, maybe that's acceptable? Point being, procedures aren't trying to meet the same set of use cases that functions meet. I see them doing things you currently can't do with functions (points 1-4 above, plus any syntax sugar/salt CALL brings to the table). You have tons of tools to deal with performance problems should they occur. merlin
Merlin Moncure <mmoncure@gmail.com> wrote: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> You can't have arithmetic, comparisons, or much of anything >> outside a transaction with plpgsql. That model just plain >> doesn't work for this purpose, I think. You really want a >> control language that's independent of the SQL engine, and for >> better or worse plpgsql is built inside that engine. > > I'm arguing against a separate language, or at least questioning > if plpgsql truly can't be run without an outer transaction > context. Just because a transaction isn't set up on procedure > invocation, doesn't mean you can't set them up to do things in the > procedure? Right -- I don't think anyone has suggested that transactions can't be started and ended "within" a SP. And I have argued that if a SP is called while a transaction is active, it runs within the context of that transaction. > It wouldn't bother me in the lest that if in plpgsql procedures if > you had to set up and tear down a transaction on every line. +1 > You can always dip into a function if/when you need the turbo > boost. Or BEGIN a transaction. > Setting up a new control language implies that postgres needs to > know the procedure language textually so it can read off a line > and do something with it. I don't like this restriction -- > wouldn't it be better if the current crop of language handlers > could run procedures without major changes? C functions with SPI? > However it's internally implemented, the more userland mindspace > recovered for use of writing procedures the better off we are. +1 -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Merlin Moncure <mmoncure@gmail.com> wrote: >> wouldn't it be better if the current crop of language handlers >> could run procedures without major changes? C functions with SPI? >> However it's internally implemented, the more userland mindspace >> recovered for use of writing procedures the better off we are. > +1 I'd like a pony, too. Let's be perfectly clear about this: there is no part of plpgsql that can run outside a transaction today, and probably no part of the other PLs either, and changing that "without major changes" is wishful thinking of the first order. regards, tom lane
On Fri, Apr 22, 2011 at 10:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Merlin Moncure <mmoncure@gmail.com> wrote: >>> wouldn't it be better if the current crop of language handlers >>> could run procedures without major changes? C functions with SPI? >>> However it's internally implemented, the more userland mindspace >>> recovered for use of writing procedures the better off we are. > >> +1 > > I'd like a pony, too. Let's be perfectly clear about this: there is no > part of plpgsql that can run outside a transaction today, and probably > no part of the other PLs either, and changing that "without major > changes" is wishful thinking of the first order. Well, ok, but scope of the change and performance issues aside, is this a technically feasible route, that is, does anything jump out that makes it unworkable? merlin
On Apr 22, 2011, at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Merlin Moncure <mmoncure@gmail.com> wrote: >>> wouldn't it be better if the current crop of language handlers >>> could run procedures without major changes? C functions with SPI? >>> However it's internally implemented, the more userland mindspace >>> recovered for use of writing procedures the better off we are. > >> +1 > > I'd like a pony, too. Let's be perfectly clear about this: there is no > part of plpgsql that can run outside a transaction today, and probably > no part of the other PLs either, and changing that "without major > changes" is wishful thinking of the first order. Correct me if I am wrong here, but the basic issue is, I think, that an error might occur. And transactions are how we make sure that when control returns to the top level, we've released any heavyweight locks, lightweight locks, buffer pins, backend-local memory allocations, etc. that we were holding when the error occurred. ...Robert
Robert Haas <robertmhaas@gmail.com> wrote: > On Apr 22, 2011, at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'd like a pony, too. No ponies for me; make mine an Arabian stallion. >> Let's be perfectly clear about this: there is no part of plpgsql >> that can run outside a transaction today, and probably no part of >> the other PLs either It's a good thing that nobody is suggesting that transactions can't be started and terminated at need within a SP. And I'm not suggesting that a SP couldn't be run within a transaction to avoid the overhead of that, when desired. > Correct me if I am wrong here, but the basic issue is, I think, > that an error might occur. And transactions are how we make sure > that when control returns to the top level, we've released any > heavyweight locks, lightweight locks, buffer pins, backend-local > memory allocations, etc. that we were holding when the error > occurred. If that's the issue, then the biggest problem would seem to be in preparing an SP which isn't within an existing transaction at startup. As someone previously mentioned, there would need to be a transaction to prepare it for execution which would then be completed before processing the body of the SP. Yes, that's hand-wavy, but I thought we were at the phase of brainstorming about what would make for desirable features, not mapping out the implementation details. Of course it's valuable to identify possible implementation issues for such desirable features, even this early; but let's not get bogged down in such details before we reach some kind of consensus on what we might all like. -Kevin
Robert Haas <robertmhaas@gmail.com> writes: > On Apr 22, 2011, at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'd like a pony, too. Let's be perfectly clear about this: there is no >> part of plpgsql that can run outside a transaction today, and probably >> no part of the other PLs either, and changing that "without major >> changes" is wishful thinking of the first order. > Correct me if I am wrong here, but the basic issue is, I think, that an error might occur. And transactions are how we make sure that when control returns to the top level, we've released any heavyweight locks, lightweight locks, buffer pins, backend-local memory allocations, etc. that we were holding when the error occurred. Well, yes, all that infrastructure is tied to transactions. Now if you don't use any of it, then you don't have a problem. The real difficulty is that plpgsql uses SQL expressions freely and there's no guarantees about what parts of the infrastructure a random function, operator, or datatype I/O function might use. (Examples: domain_in can invoke pretty much arbitrary code as a consequence of domain CHECK constraints, and will certainly do catalog accesses even without those. Almost any array operator will do catalog accesses to get the properties of the array element type. And so on.) You could possibly lobotomize plpgsql down to a small number of datatypes and operators that are known not to ever do anything more interesting than palloc() and elog(), but IMO the usefulness would be low and the fragility high. It'd be better to give the task to an interpreter that was never built to depend on a SQL environment in the first place. Thus my thought about perl etc. regards, tom lane
Tom, > >> I'd like a pony, too. Let's be perfectly clear about this: there is > >> no > >> part of plpgsql that can run outside a transaction today, and > >> probably > >> no part of the other PLs either, and changing that "without major > >> changes" is wishful thinking of the first order. I always thought that it was pretty clear that autonomous transactions were a major feature, and very difficult to implement. Otherwise we'd have done SPs back in 7.4 when we first had this discussion. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco
On Fri, Apr 22, 2011 at 11:06 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Apr 22, 2011, at 11:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I'd like a pony, too. Let's be perfectly clear about this: there is no >>> part of plpgsql that can run outside a transaction today, and probably >>> no part of the other PLs either, and changing that "without major >>> changes" is wishful thinking of the first order. > >> Correct me if I am wrong here, but the basic issue is, I think, that an error might occur. And transactions are how we make sure that when control returns to the top level, we've released any heavyweight locks, lightweight locks, buffer pins, backend-local memory allocations, etc. that we were holding when the error occurred. > > Well, yes, all that infrastructure is tied to transactions. Now if you > don't use any of it, then you don't have a problem. The real difficulty > is that plpgsql uses SQL expressions freely and there's no guarantees > about what parts of the infrastructure a random function, operator, or > datatype I/O function might use. (Examples: domain_in can invoke pretty > much arbitrary code as a consequence of domain CHECK constraints, and > will certainly do catalog accesses even without those. Almost any array > operator will do catalog accesses to get the properties of the array > element type. And so on.) Just to be clear (I'm really trying not to be obtuse here), does that mean you can't touch that infrastructure at all in a procedure in this vein, or can you set up a transaction temporarily in cases you need it (maybe at the statement level)? If you are well and truly locked out of the sql engine with no doorway in, then I'd have to agree, plpgsql is out. merlin
On 04/22/2011 12:06 PM, Tom Lane wrote: > You could possibly lobotomize plpgsql down to a small number of > datatypes and operators that are known not to ever do anything more > interesting than palloc() and elog(), but IMO the usefulness would be > low and the fragility high. It'd be better to give the task to an > interpreter that was never built to depend on a SQL environment in the > first place. Thus my thought about perl etc. > It's not clear to me what the other interpreter would actually be doing. cheers andrew
On fre, 2011-04-22 at 08:37 -0500, Merlin Moncure wrote: > It wouldn't bother me in the lest that if in plpgsql procedures if you > had to set up and tear down a transaction on every line. It would probably be more reasonable and feasible to have a setup where you can end a transaction in plpgsql but a new one would start right away.
On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On fre, 2011-04-22 at 08:37 -0500, Merlin Moncure wrote: >> It wouldn't bother me in the lest that if in plpgsql procedures if you >> had to set up and tear down a transaction on every line. > > It would probably be more reasonable and feasible to have a setup where > you can end a transaction in plpgsql but a new one would start right > away. ya, that's an idea. if nothing else, it would certainly be faster, and you'd still be able to control things properly. Just thinking out loud here, but maybe you could make a cut-down version of StartTransaction() that does non-transactional setup like memory, guc, etc. but doesn't set the state (or sets it to something else, like TRANS_PROCEDURE). We get here maybe by a new protocol first char. One thing that's not clear is how you'd get there via a simple query (sent via PQexec vs hypothetical PQcall). The protocol and syntax portions are a whole separate issue anyways... I poked around a bit in pl_exec.c and pl_handler.c. My thinking is that in strategic points, in particular in exec_stmt(), you check if in procedure state and not already in a transaction, set one up there, run the statement, and take it down afterwards. Maybe you do this on every statement, or maybe as Peter suggests it's user controlled, but I'm curious how this would turn out. You'd also have to be in a transaction during the function call setup/compilation, and the portions that handle the input arguments. However the main execution loop which passes over the exec state ISTM is fairly self-contained and won't be problematic if run outside of a transaction. This is the key point -- the SPI routines when run would always be in *a* transaction, just not always the same transaction. :-) What exactly SPI_connect does, and what the other SPI functions would do if invoked from a different transaction, is a mystery to me and presumably a big problem.
I'm quite aware this is all awfully light on detail, and of the million + 1 assumptions I'm making, but since you're getting basically injected directly into a function from the tiny lizard brain of postgres in tcop, I wonder if it could be worked out... merlin
Merlin Moncure <mmoncure@gmail.com> writes: > On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> It would probably be more reasonable and feasible to have a setup where >> you can end a transaction in plpgsql but a new one would start right >> away. > ya, that's an idea. Yeah, that's a good thought. Then we'd have a very well-defined collection of state that had to be preserved through such an operation, ie, the variable values and control state of the SP. It also gets rid of the feeling that you ought not be in a transaction when you enter the SP. There's still the problem of whether you can invoke operations such as VACUUM from such an SP. I think we'd want to insist that they terminate the current xact, which is perhaps not too cool. regards, tom lane
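To make the proposal concrete, under this model a procedure might look something like the following. This is entirely hypothetical syntax (CREATE PROCEDURE and an in-procedure COMMIT do not exist in PostgreSQL as of this discussion; the jobs table is illustrative): each COMMIT ends the current transaction and a new one begins immediately, with the procedure's variable state (here, batches) carried across the boundary.

```sql
-- Hypothetical syntax: delete in batches, committing between batches so
-- locks and the xid horizon are released as work proceeds.
CREATE PROCEDURE purge_done_jobs() AS $$
DECLARE
    batches integer := 0;
BEGIN
    LOOP
        DELETE FROM jobs WHERE id IN
            (SELECT id FROM jobs WHERE done LIMIT 10000);
        EXIT WHEN NOT FOUND;
        batches := batches + 1;
        COMMIT;   -- ends this transaction; the next starts right away
    END LOOP;
    RAISE NOTICE 'purged in % batches', batches;
END;
$$ LANGUAGE plpgsql;
```

The "well-defined collection of state" Tom mentions is exactly what must survive each COMMIT here: the loop position and the value of batches.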
On Fri, Apr 22, 2011 at 3:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >>> It would probably be more reasonable and feasible to have a setup where >>> you can end a transaction in plpgsql but a new one would start right >>> away. > >> ya, that's an idea. > > Yeah, that's a good thought. Then we'd have a very well-defined > collection of state that had to be preserved through such an operation, > ie, the variable values and control state of the SP. It also gets rid > of the feeling that you ought not be in a transaction when you enter > the SP. hm, another neat thing about this is that it skirts the unfortunate confusion between sql 'begin' and plpgsql 'begin'... merlin
Merlin Moncure <mmoncure@gmail.com> wrote: > hm, another neat thing about this is that it skirts the > unfortunate confusion between sql 'begin' and plpgsql 'begin'... I hadn't thought about that. There is the SQL-standard START TRANSACTION synonym, so there is a way to deal with it -- but since BEGIN seems to be used more heavily there would clearly be confusion. -Kevin
On Apr 22, 2011, at 3:50 PM, Tom Lane wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >>> It would probably be more reasonable and feasible to have a setup where >>> you can end a transaction in plpgsql but a new one would start right >>> away. > >> ya, that's an idea. > > Yeah, that's a good thought. Then we'd have a very well-defined > collection of state that had to be preserved through such an operation, > ie, the variable values and control state of the SP. It also gets rid > of the feeling that you ought not be in a transaction when you enter > the SP. > > There's still the problem of whether you can invoke operations such as > VACUUM from such an SP. I think we'd want to insist that they terminate > the current xact, which is perhaps not too cool. Dumb question, but wouldn't this kind of approach open up a window where (say) datatypes, operators, catalogs, etc, could disappear/change out from under you, being that you're now in a different transaction/snapshot; presuming there is a concurrent transaction from a different backend modifying the objects in question? In the non-explicit transaction case, locking wouldn't work to keep these objects around due to the transaction scope of locks (unless locks are part of the transaction state carried forward across the implicit transactions). If so, could that be done in such a way that it would take precedence over a parallel backend attempting to acquire the same locks without blocking the procedure? Regards, David -- David Christensen End Point Corporation david@endpoint.com
On 21.04.2011 17:24, Peter Eisentraut wrote: > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. Peter, what I like from "the other" is that stored procedures are able to return result sets. Susanne -- Susanne Ebrecht - 2ndQuadrant PostgreSQL Development, 24x7 Support, Training and Services www.2ndQuadrant.com
On Fri, Apr 22, 2011 at 11:46 PM, David Christensen <david@endpoint.com> wrote: > > On Apr 22, 2011, at 3:50 PM, Tom Lane wrote: > >> Merlin Moncure <mmoncure@gmail.com> writes: >>> On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >>>> It would probably be more reasonable and feasible to have a setup where >>>> you can end a transaction in plpgsql but a new one would start right >>>> away. >> >>> ya, that's an idea. >> >> Yeah, that's a good thought. Then we'd have a very well-defined >> collection of state that had to be preserved through such an operation, >> ie, the variable values and control state of the SP. It also gets rid >> of the feeling that you ought not be in a transaction when you enter >> the SP. >> >> There's still the problem of whether you can invoke operations such as >> VACUUM from such an SP. I think we'd want to insist that they terminate >> the current xact, which is perhaps not too cool. > > > Dumb question, but wouldn't this kind of approach open up a window where (say) datatypes, operators, catalogs, etc, could disappear/change out from under you, being that you're now in a different transaction/snapshot; presuming there is a concurrent transaction from a different backend modifying the objects in question? That's a good question. This is already a problem for functions -- an object you are dependent upon in the function body can disappear at any time. If you grabbed the lock first you're ok, but otherwise you're not and the caller will receive an error. Starting with 8.3 there is plan cache machinery that invalidates plans used inside plpgsql which should prevent the worst problems. If you're cavalier about deleting objects that are used in a lot of functions you can get really burned from a performance standpoint, but that's no different than dealing with functions today. Procedures, unlike functions, however, can no longer rely on catalogs remaining static, visibility-wise, through execution.
pl_comp.c is full of catalog lookups, and that means that some assumptions made during compilation are no longer valid for procedures. A missing table isn't such a big deal, but maybe it's possible to make intermediate changes while a procedure is executing that can cause an expression to parse differently, or not at all (for example, replacing a scalar function with a setof)? This could be a minefield of problems or possibly not -- I really just don't know all the details and perhaps some experimentation is in order. One thing that's tempting is to force recompilation upon certain things happening so you can catch this stuff proactively, but plpgsql function compilation is very slow and this approach is probably very complex. Ideally we can just bail from the procedure if external events cause things to go awry. merlin
Merlin Moncure <mmoncure@gmail.com> wrote: > Procedures unlike functions however can no longer rely that > catalogs remain static visibility wise through execution for > functions. If you start from the perspective that stored procedures are in many respects more like psql scripts than functions, this shouldn't be too surprising. If you have a psql script with multiple database transactions, you know that other processes can change things between transactions. Same deal with SPs. The whole raison d'être for SPs is that there are cases where people need something *different* from functions. While it would be *nice* to leverage plpgsql syntax for a stored procedure language, if it means we have to behave like a function, it's not worth it. -Kevin
On Mon, Apr 25, 2011 at 9:18 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > Merlin Moncure <mmoncure@gmail.com> wrote: > >> Procedures unlike functions however can no longer rely that >> catalogs remain static visibility wise through execution for >> functions. > > If you start from the perspective that stored procedures are in many > respects more like psql scripts than functions, this shouldn't be > too surprising. If you have a psql script with multiple database > transactions, you know that other processes can change things > between transactions. Same deal with SPs. > > The whole raison d'être for SPs is that there are cases where people > need something *different* from functions. While it would be *nice* > to leverage plpgsql syntax for a stored procedure language, if it > means we have to behave like a function, it's not worth it. As noted above, it would be really nice if the SPI interface could be recovered for use in writing procedures. plpgsql the language is less of a sure thing, but it would be truly unfortunate if it couldn't be saved, on grounds of user retraining alone. If a sneaky injection of transaction manipulation gets the job done without rewriting the entire language, then great, but it's an open question whether that's possible, and I'm about 2 orders of magnitude too unfamiliar with the code to say either way. I'm inclined to just poke around and see what breaks. OTOH, if you go the fully textual route you can get away with doing things that are not at all sensible in the plpgsql world (or at least not without a serious rethink of how it works), like connecting to databases mid-procedure, and a cleaner attack at things like running 'CLUSTER' than the flush-transaction-state methodology above. So I see we have three choices:
1. recover SPI, recover plpgsql (and other pls), transaction flush command (SPI_flush()?)
2. recover SPI, replace plpgsql (with what?)
3. no spi, custom-built language, most flexibility, database reconnects, aka 'tabula rasa'
#1 is probably the easiest and most appealing on a lot of levels, but fraught with technical danger, and the most limiting? merlin
On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: > So the topic of "real" "stored procedures" came up again. Meaning a > function-like object that executes outside of a regular transaction, > with the ability to start and stop SQL transactions itself. I would like to add a note about the SQL standard here. Some people have been using terminology that a "function" does this and a "procedure" does something else. Others have also mentioned the use of a CALL statement to invoke procedures. Both procedures (as in CREATE PROCEDURE etc.) and the CALL statement are specified by the SQL standard, and they make no mention of any supertransactional behavior or autonomous transactions for procedures. As far as I can tell, it's just a Pascal-like difference that functions return values and procedures don't. So procedure-like objects with a special transaction behavior will need a different syntax or a syntax addition.
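For reference, the standard syntax Peter describes is roughly the following; per his reading, nothing about it implies any special transactional behavior. This is a sketch of SQL-standard-flavored syntax only, not something PostgreSQL accepts today, and the accounts table is illustrative:

```sql
-- Standard-style procedure: like a function, but it returns no value
-- and is invoked with CALL rather than referenced in a query.
CREATE PROCEDURE transfer(IN src integer, IN dst integer, IN amt numeric)
LANGUAGE SQL
AS 'UPDATE accounts SET balance = balance - amt WHERE id = src;
    UPDATE accounts SET balance = balance + amt WHERE id = dst;';

CALL transfer(1, 2, 100.00);
```

Hence Peter's conclusion: anything supertransactional layered on top of this would need a syntax addition beyond what the standard specifies.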
On 04/25/2011 02:18 PM, Peter Eisentraut wrote: > On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: >> So the topic of "real" "stored procedures" came up again. Meaning a >> function-like object that executes outside of a regular transaction, >> with the ability to start and stop SQL transactions itself. > I would like to add a note about the SQL standard here. > > Some people have been using terminology that a "function" does this and > a "procedure" does something else. Others have also mentioned the use > of a CALL statement to invoke procedures. > > Both procedures (as in CREATE PROCEDURE etc.) and the CALL statement are > specified by the SQL standard, and they make no mention of any > supertransactional behavior or autonomous transactions for procedures. > As far as I can tell, it's just a Pascal-like difference that functions > return values and procedures don't. > > So procedure-like objects with a special transaction behavior will need > a different syntax or a syntax addition. > The trouble is that people using at least some other databases call supertransactional program units "stored procedures". Maybe we need a keyword to designate supertransactional behaviour, but if we call them anything but procedures there is likely to be endless confusion, ISTM, especially if we have something called a procedure which is never supertransactional. cheers andrew
On Mon, Apr 25, 2011 at 1:18 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: >> So the topic of "real" "stored procedures" came up again. Meaning a >> function-like object that executes outside of a regular transaction, >> with the ability to start and stop SQL transactions itself. > > I would like to add a note about the SQL standard here. > > Some people have been using terminology that a "function" does this and > a "procedure" does something else. Others have also mentioned the use > of a CALL statement to invoke procedures. > > Both procedures (as in CREATE PROCEDURE etc.) and the CALL statement are > specified by the SQL standard, and they make no mention of any > supertransactional behavior or autonomous transactions for procedures. > As far as I can tell, it's just a Pascal-like difference that functions > return values and procedures don't. > > So procedure-like objects with a special transaction behavior will need > a different syntax or a syntax addition. hm. does the sql standard prohibit the use of extra transactional features? are you sure it's not implied that any sql (including START TRANSACTION etc) is valid? meaning, unless otherwise specified, you should be able to do those things, and that our functions because they force one transaction operation are non-standard, not the other way around. merlin
On mån, 2011-04-25 at 13:34 -0500, Merlin Moncure wrote: > hm. does the sql standard prohibit the use of extra transactional > features? It doesn't prohibit anything. It just kindly requests that standard syntax has standard behavior. > are you sure it's not implied that any sql (including > START TRANSACTION etc) is valid? meaning, unless otherwise specified, > you should be able to do those things, and that our functions because > they force one transaction operation are non-standard, not the other > way around. Syntactically, it appears to be allowed, and there's something about savepoint levels. So that might be something related. In any case, if we use standard syntax, that should be researched.
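[For readers following the syntax question: the SQL-standard form being discussed looks roughly like the sketch below -- CREATE PROCEDURE with a BEGIN ATOMIC body, invoked via CALL. Names are illustrative, exact clauses vary by standard edition, and PostgreSQL does not accept this syntax at this point; per Peter's reading, nothing here implies any special transaction behavior.]

```sql
-- Sketch of SQL-standard procedure syntax (illustrative only).
CREATE PROCEDURE transfer(IN from_acct INTEGER,
                          IN to_acct   INTEGER,
                          IN amount    NUMERIC)
LANGUAGE SQL
BEGIN ATOMIC
    UPDATE accounts SET balance = balance - amount WHERE id = from_acct;
    UPDATE accounts SET balance = balance + amount WHERE id = to_acct;
END;

-- Procedures are invoked with CALL rather than SELECT:
CALL transfer(1, 2, 100.00);
```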
On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: > So the topic of "real" "stored procedures" came up again. Meaning a > function-like object that executes outside of a regular transaction, > with the ability to start and stop SQL transactions itself. > > I would like to collect some specs on this feature. So does anyone have > links to documentation of existing implementations, or their own spec > writeup? A lot of people appear to have a very clear idea of this > concept in their own head, so let's start collecting those. Another point, as there appear to be diverging camps about supertransactional stored procedures vs. autonomous transactions, what would be the actual use cases of any of these features? Let's collect some, so we can think of ways to make them work.
Peter Eisentraut <peter_e@gmx.net> wrote: > what would be the actual use cases of any of these features? > Let's collect some, so we can think of ways to make them work. The two things which leap to mind for me are: (1) All the \d commands in psql should be implemented in SPs so that they are available from any client, through calling one SP equivalent to one \d command. The \d commands would be changed to call the SPs for releases recent enough to support this. Eventually psql would be free of worrying about which release contained which columns in which system tables, because it would just be passing the parameters in and displaying whatever results came back. I have used products which implemented something like this, and found it quite useful. (2) In certain types of loads -- in particular converting data from old systems into the database for a new system -- you need to load several tables in parallel, with queries among the tables which are being loaded. The ability to batch many DML statements into one transaction is important, to avoid excessive COMMIT overhead and related disk output; however, the ability to ANALYZE tables periodically is equally important, to prevent each access to an initially-empty table from being done as a table scan after it has millions of rows. VACUUM might become equally important if there are counts or totals being accumulated in some tables, or status columns are being updated, as rows are added to other tables. I've often had to do something like this during conversions. This could be handled in an external program (I've often done it in Java), but performance might be better if a stored procedure in PostgreSQL was able to keep SQL/MED streams of data open while committing and performing this maintenance every so many rows. -Kevin
Peter Eisentraut wrote: > > Another point, as there appear to be diverging camps about > supertransactional stored procedures vs. autonomous transactions, what > would be the actual use cases of any of these features? Looping over hundreds of identical schemas, executing DDL statements on each. We can't do this in a single transaction because it consumes all of shared memory with locks. -- todd
On Mon, Apr 25, 2011 at 2:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On tor, 2011-04-21 at 18:24 +0300, Peter Eisentraut wrote: >> So the topic of "real" "stored procedures" came up again. Meaning a >> function-like object that executes outside of a regular transaction, >> with the ability to start and stop SQL transactions itself. >> >> I would like to collect some specs on this feature. So does anyone have >> links to documentation of existing implementations, or their own spec >> writeup? A lot of people appear to have a very clear idea of this >> concept in their own head, so let's start collecting those. > > Another point, as there appear to be diverging camps about > supertransactional stored procedures vs. autonomous transactions, what > would be the actual use cases of any of these features? Let's collect > some, so we can think of ways to make them work. My answer is this: plpgsql, with its first-class SQL expressions, direct access to the postgres type system, and other nifty features, has proven for me to be superior to all other languages in terms of defect rate, output progress for input work, and other metrics one might apply, by a significant margin. By adding super-transactional (I prefer the phrasing, 'explicit control of transaction state') features you can eliminate all kinds of cases where you might otherwise be forced into coding on the client side. Lots of people prefer not to do this (or recoil in horror at the mere suggestion of doing so), and that's fine, but I don't like being prohibited from doing so by technical constraint. Explicit transaction controls remove those constraints. Anyone who really 'gets' plpgsql programming knows exactly what I'm talking about and has bumped into those constraints. Autonomous transactions, basically a formalization of the dblink-style technique of running SQL in a parallel transaction state, are also useful, but for different reasons. 
You can extend them pretty far to do most of the things explicit transactions give you (like creating lots of tables or running 'CLUSTER'), although I find having to force users to maintain separate transaction states just to do so to be a bit of a kludge, and the outermost function still has to terminate within a limited timeframe. merlin
Peter Eisentraut wrote: > Another point, as there appear to be diverging camps about > supertransactional stored procedures vs. autonomous transactions, what > would be the actual use cases of any of these features? Let's collect > some, so we can think of ways to make them work. An analogy I like to use for a very capable DBMS is that of an operating system, and each autonomous transaction is like a distinct process/thread in this system. The DBMS is like a virtual machine in which processes/autonomous transactions run. Like with an operating system, a process/auto-transaction can be started by another one, or by the OS/DBMS (or a root process/auto), and once running all processes are mutually independent to a large extent, in that each has its own separable privileges or state or view of the database, the database being an analogy to the file system. A process/auto-transaction can be started by a DBMS client, analogous to a user, but it doesn't have to be. The message passing feature that Pg has, listen/notify, is like inter-process communication between these processes/autos. A stored procedure always runs within the context of one process/auto, and a regular transaction or savepoint or whatever is specific to a process/auto. Has anyone else thought of the DBMS-as-operating-system analogy? I don't recall specifically reading this anywhere, but expect the thought may be common. -- Darren Duncan
On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: > (1) All the \d commands in psql should be implemented in SPs so > that they are available from any client, through calling one SP > equivalent to one \d command. You don't need stored procedures with special transaction behavior for this. In fact, you probably shouldn't use them even if you had them, because you surely want a consistent view of, say, a table. > (2) In certain types of loads -- in particular converting data from > old systems into the database for a new system -- you need to load > several tables in parallel, with queries among the tables which are > being loaded. The ability to batch many DML statements into one > transaction is important, to avoid excessive COMMIT overhead and > related disk output; however, the ability to ANALYZE tables > periodically is equally important, to prevent each access to an > initially-empty table from being done as a table scan after it has > millions of rows. VACUUM might become equally important if there > are counts or totals being accumulated in some tables, or status > columns are being updated, as rows are added to other tables. I'm not sure I really follow this. If your aim is to batch DML statements and avoid COMMIT overhead, why would you want to use stored procedures that possibly span multiple transactions?
Peter Eisentraut <peter_e@gmx.net> writes: > On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: >> (1) All the \d commands in psql should be implemented in SPs so >> that they are available from any client, through calling one SP >> equivalent to one \d command. > You don't need stored procedures with special transaction behavior for > this. No, but what you *would* need is the ability to return multiple result sets from one call. Even then, you could not exactly duplicate the current output of \d; but you could duplicate the functionality. regards, tom lane
Peter Eisentraut <peter_e@gmx.net> wrote: > On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: >> (1) All the \d commands in psql should be implemented in SPs so >> that they are available from any client, through calling one SP >> equivalent to one \d command. > > You don't need stored procedures with special transaction behavior > for this. In fact, you probably shouldn't use them even if you > had them, because you surely want a consistent view of, say, a > table. Agreed. I was just outlining use cases here, not trying to make a case for something in particular with each one. >> (2) In certain types of loads -- in particular converting data >> from old systems into the database for a new system -- you need >> to load several tables in parallel, with queries among the tables >> which are being loaded. The ability to batch many DML statements >> into one transaction is important, to avoid excessive COMMIT >> overhead and related disk output; however, the ability to ANALYZE >> tables periodically is equally important, to prevent each access >> to an initially-empty table from being done as a table scan after >> it has millions of rows. VACUUM might become equally important >> if there are counts or totals being accumulated in some tables, >> or status columns are being updated, as rows are added to other >> tables. > > I'm not sure I really follow this. If your aim is to batch DML > statements and avoid COMMIT overhead, why would you want to use > stored procedures that possibly span multiple transactions? The point is that if such a conversion is run in a situation where table access is always done on a plan based on empty tables, it starts to get pretty slow after a while. You need to commit, analyze, and start a new transaction for the queries to make new plans which run well. This obviously isn't an issue when you're blasting entire tables in through COPY commands without needing to reference other data being concurrently loaded. 
So, rough pseudo-code where this is done in a client app with autovacuum disabled would look something like:

open input stream of non-normalized data
open database connection
while not EOF on input
    start transaction
    for 50000 top level inputs (break on EOF)
        parse apart messy data, load into multiple tables
        (logic involves queries against tables being loaded)
        (some updates besides straight inserts)
        (print exceptions for questionable or undigestable data)
    end for
    commit transaction
    vacuum analyze
end while

In database products with stored procedures it has usually been faster to use an SP in the target database than to use a client program. -Kevin
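[This loop is exactly what a transaction-controlling procedure could absorb. A sketch of how it might look in the proposed feature -- to be clear, COMMIT inside a procedure body, CREATE PROCEDURE itself, and the load_one_input helper are all hypothetical here; nothing like this runs in PostgreSQL today:]

```sql
-- HYPOTHETICAL syntax sketch only: assumes the proposed ability to end
-- and restart transactions inside a procedure body.
CREATE PROCEDURE load_messy_data() LANGUAGE plpgsql AS $$
DECLARE
    rec record;
    n   integer := 0;
BEGIN
    -- (the input cursor would itself have to survive commits,
    --  i.e. behave like a WITH HOLD cursor)
    FOR rec IN SELECT * FROM staging_input LOOP
        PERFORM load_one_input(rec);   -- hypothetical parse/load helper
        n := n + 1;
        IF n % 50000 = 0 THEN
            COMMIT;                    -- proposed: end the transaction,
            VACUUM ANALYZE;            -- do maintenance outside one,
                                       -- and start a new one implicitly
        END IF;
    END LOOP;
    COMMIT;
END;
$$;
```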
> Another point, as there appear to be diverging camps about > supertransactional stored procedures vs. autonomous transactions, what > would be the actual use cases of any of these features? Let's collect > some, so we can think of ways to make them work. Here's where I wanted autonomous transactions just last week, and didn't have them so I had to use a python script outside the database: -- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned table. -- doing a backfill operation for 10GB of computed data, taking 8 hours, where I don't want to hold a transaction open for 8 hours since this is a high-volume OLTP database. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Tom Lane <tgl@sss.pgh.pa.us> wrote: > No, but what you *would* need is the ability to return multiple > result sets from one call. At least. > Even then, you could not exactly duplicate the current output of > \d; but you could duplicate the functionality. I would think that psql could duplicate the output pretty closely, especially if the output of the stored procedure was a stream of intermingled result sets and messages (as from ereport). This is what many products provide. They usually show messages with a class '00' SQLSTATE just as plain text lines, and decorate the more severe levels with appropriate additional information. A while back I included a link to show what Sybase returns from their sp_help SP for various object types: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc36273.1550/html/sprocs/X85190.htm Note the lines like: Object does not have any indexes. This came from the server as a SQLSTATE '00000' message. -Kevin
Josh Berkus <josh@agliodbs.com> wrote: > -- doing a backfill operation for 10GB of computed data, taking 8 > hours, where I don't want to hold a transaction open for 8 hours > since this is a high-volume OLTP database. Been there, done that. Definitely not a rare use case. -Kevin
On Tue, Apr 26, 2011 at 11:55 PM, Josh Berkus <josh@agliodbs.com> wrote: > Here's where I wanted autonomous transactions just last week, and didn't > have them so I had to use a python script outside the database: > > -- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned > table. > > -- doing a backfill operation for 10GB of computed data, taking 8 hours, > where I don't want to hold a transaction open for 8 hours since this is > a high-volume OLTP database. These don't seem like compelling use cases at all to me. You said you had to fall back to using a python script outside the database, but what disadvantage does that have? Why is moving your application logic into the database an improvement? Honestly in every case where I've had to move code that had been in a function to the application I've found there were tons of benefits. Everything from being able to better control the behaviour, to being able to parallelize the processing over multiple connections, being able to run parts of it at different times, being able to see the progress and control it from another session, being able to manage the code in version control, the list just goes on. Trying to move all the code into the database just makes life harder. Autonomous transactions have value on their own. But it's not so that you can run create index concurrently or vacuum or whatever. They're useful so that a single session can do things like log errors even when a transaction rolls back. Actually that's the only example I can think of but it's a pretty good use case on its own and I'm sure it's not entirely unique. -- greg
On Mon, Apr 25, 2011 at 12:07 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > Another point, as there appear to be diverging camps about > supertransactional stored procedures vs. autonomous transactions, what > would be the actual use cases of any of these features? Let's collect > some, so we can think of ways to make them work. Some number of moons ago it would have been highly desirable to be able to create daemon worker processes out of UDFs. In practice, many such daemons want to do their own snapshot management (that is to say, acquire new ones...) and there's no nice way to do that by extending postgres. Instead, you cargo cult onto what autovacuum does and release your own postgres binary to use SPI from outside a snapshot. Although it would be better still to have a worker pool type mechanic (see the "async" discussion happening recently), being able to have contribs or modules where one could run: SELECT do_the_thing(); And block indefinitely doing cross-snapshot work would be pretty useful, I feel. As a thought exercise, could one create: SELECT autovacuum(tuning, parameters, one, through, n); as a C UDF without bizarro snapshot mangling? (I believe we did play some tricks to escape the snapshot even in this case, but they weren't very lucid in the code, if memory serves). In any case, I've encountered at least a few situations where I'd like to be able to opt-out of getting one and exactly one snapshot in the daemon/worker case. -- fdr
On Tue, Apr 26, 2011 at 10:12 PM, Greg Stark <gsstark@mit.edu> wrote: > On Tue, Apr 26, 2011 at 11:55 PM, Josh Berkus <josh@agliodbs.com> wrote: >> Here's where I wanted autonomous transactions just last week, and didn't >> have them so I had to use a python script outside the database: >> >> -- doing a CREATE INDEX CONCURRENTLY for 150 partitions on a partitioned >> table. >> >> -- doing a backfill operation for 10GB of computed data, taking 8 hours, >> where I don't want to hold a transaction open for 8 hours since this is >> a high-volume OLTP database. > > These don't seem like compelling use cases at all to me. You said you > had to fall back to using a python script outside the database, but > what disadvantage does that have? Why is moving your application logic > into the database an improvement? > > Honestly in every case where I've had to move code that had been in a > function to the application I've found there were tons of benefits. > Everything from being able to better control the behaviour, to being > able to parallelize the processing over multiple connections, being > able to run parts of it at different times, being able to see the > progress and control it from another session, being able to manage the > code in version control, the list just goes on. Trying to move all the > code into the database just makes life harder. my experience has been the opposite. merlin
> These don't seem like compelling use cases at all to me. You said you > had to fall back to using a python script outside the database, but > what disadvantage does that have? Why is moving your application logic > into the database an improvement? Since both were part of a code rollout, it complicated our deployment process considerably and took a deployment which could have been push-button automatic and forced us to do it by manually logging into the shell on the database server. > Trying to move all the > code into the database just makes life harder. It might make *your* life harder. It makes *mine* easier. If you pursue your argument a little further, Greg, why do we have functions at all? We could do it all in the application. > Autonomous transactions have value on their own. But it's not so that > you can run create index concurrently or vacuum or whatever. Why not? Why are you so intent on making my life harder? > They're > useful so that a single session can do things like log errors even > when a transaction rolls back. That's *also* an excellent use case. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
On Wed, Apr 27, 2011 at 6:48 PM, Josh Berkus <josh@agliodbs.com> wrote: > If you pursue your argument a little further, Greg, why do we have > functions at all? We could do it all in the application. > >> Autonomous transactions have value on their own. But it's not so that >> you can run create index concurrently or vacuum or whatever. > > Why not? Why are you so intent on making my life harder? Because we want to be able to manipulate data in queries in data-type-specific ways. For example we want to do aggregations on the result of a function or index scans across a user data type, etc. If all the functions do is implement application logic then you end up having half your application logic in the application and half in the database and it's hard to keep them in sync. To take the argument to the opposite extreme, would you suggest we should have html formatting functions in the database so that people can have their entire web server just be print $dbh->('select web_page(url)') ? >> They're >> useful so that a single session can do things like log errors even >> when a transaction rolls back. > > That's *also* an excellent use case. What makes it an excellent use case is that it's basically impossible to do without autonomous transactions. You can hack it with dblink but it's much less clean and much higher overhead. -- greg
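[For reference, the dblink hack Greg alludes to looks roughly like this -- a sketch assuming the dblink contrib module is installed; the connection string, error_log table, and function name are illustrative:]

```sql
-- Log via a second connection, whose commit is independent of the
-- outer transaction's fate.  (Sketch; assumes contrib/dblink.)
CREATE OR REPLACE FUNCTION log_error(msg text) RETURNS void
LANGUAGE plpgsql AS $$
BEGIN
    PERFORM dblink_exec(
        'dbname=' || current_database(),
        'INSERT INTO error_log (logged_at, message) VALUES ('
            || quote_literal(now()::text) || ', '
            || quote_literal(msg) || ')');
END;
$$;
```

The INSERT survives even if the calling transaction later rolls back -- which is the point -- but at the cost of spinning up a whole new backend connection per call, which is the "much higher overhead" part.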
Greg, > Because we want to be able to manipulate data in queries in > data-type-specific ways. For example we want to do aggregations on the > result of a function or index scans across a user data type, etc. I don't see how this is different from wanting to capture error output, which would face the same issues. You seem to be wanting to make a hard feature easier by telling me that I don't actually want the things I want. Wanna make it even easier? Then Stored Procedures are just functions without a return value. That's a 40-line patch. Done! > If > all the functions do is implement application logic then you end up > having half your application logic in the application and half in the > database and it's hard to keep them in sync. You build your applications your way, and I'll build mine my way. I'll just ask you not to try to dictate to me how I should build applications. Especially since, based on the responses on this thread, a LOT of people would like to have multitransaction control inside a stored procedure script. I suspect that your experience of application development has been rather narrow. > To take the argument in the opposite extreme would you suggest we > should have html formatting functions in the database so that people > can have their entire web server just be print $dbh->('select > web_page(url)') ? Actually, you can already sort of do that using XSLT. So I don't necessarily think that's a prohibitive idea, depending on implementation. After all, many of the new non-relational databases implement exactly this. >>> They're >>> useful so that a single session can do things like log errors even >>> when a transaction rolls back. >> >> That's *also* an excellent use case. You could do it by using application code. -- Josh Berkus PostgreSQL Experts Inc. 
http://pgexperts.com
On Apr 27, 2011, at 3:28 PM, Josh Berkus wrote: > Actually, you can already sort of do that using XSLT. So I don't > necessarily think that's a prohibitive idea, depending on implementation. > After all, many of the new non-relational databases implement exactly this. The proposed JSON data type and construction functions (once there's agreement on an implementation) will allow this, too. Just serve JSON. Boom, instant REST server. David
On Apr 26, 2011, at 6:08 PM, Kevin Grittner wrote: > Josh Berkus <josh@agliodbs.com> wrote: >> -- doing a backfill operation for 10GB of computed data, taking 8 >> hours, where I don't want to hold a transaction open for 8 hours >> since this is a high-volume OLTP database. > > Been there, done that. Definitely not a rare use case. We do that so often we've actually written a framework around it and are working on a daemon that will deal with any backfills that have been registered in the system. If we could control transactions, that daemon could be entirely in the database... but since we can't, we have to write it in another language outside the database and switch back and forth between the two worlds. -- Jim C. Nasby, Database Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
On Fri, Apr 22, 2011 at 1:28 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On fre, 2011-04-22 at 08:37 -0500, Merlin Moncure wrote: >> It wouldn't bother me in the least if in plpgsql procedures you >> had to set up and tear down a transaction on every line. > > It would probably be more reasonable and feasible to have a setup where > you can end a transaction in plpgsql but a new one would start right > away. I've been poking around to see how this might be done, and yes there are a lot of issues. The good news is that, from what I can tell so far, there are relatively few problems inside plpgsql itself in terms of making it span transactions (there is a small assumption in the simple eval code but that can probably be easily fixed). The problems are further up, in that plpgsql relies on various structures that are tucked into the transaction memory context. The very first thing that has to be figured out to implement supertransactional behaviors is under which memory context the various structures plpgsql depends on will live, especially the execution state. I'm thinking it should live in the message context, with some participation at the portal level, possibly via a new routine (PortalRunProcedure) that is special in that it has to communicate to plpgsql that it is a procedure and what to do when doing transactional management. For example, it is currently managing the ExecutorQueryDesc and should probably continue doing so. One way to do this is to inject a callback somewhere (in the queryDesc?) which could be accessible at the lower levels (ideally even in SPI if we want to be able to get to this from other PLs). The callback implementation would kill the snapshot, reset the transaction etc. Most other transaction management is not happening here, but in postgres.c, so I'm not sure if this is the right place. 
I'd also like to defer the snapshot creation as long as possible after flushing the current transaction so that it's possible to sneak a lock into the procedure body to deal with serialization problems. If that can't be worked out, maybe a textual implementation or something based on autonomous approach is better. merlin
Josh Berkus wrote: > Peter, > > > I would like to collect some specs on this feature. So does anyone have > > links to documentation of existing implementations, or their own spec > > writeup? A lot of people appear to have a very clear idea of this > > concept in their own head, so let's start collecting those. > > Delta between SPs and Functions for PostgreSQL: > > * SPs are executed using CALL or EXECUTE, and not SELECT. > > * SPs do not return a value > ** optional: SPs *may* have OUT parameters. [ Late reply.] What is it about stored procedures that would require them not to return a value, or to use CALL? I am trying to understand what part of this is "procedures" (doesn't return a value; we decided there isn't much value for that syntax vs. functions), and anonymous transactions. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Mon, May 9, 2011 at 1:41 PM, Bruce Momjian <bruce@momjian.us> wrote: > Josh Berkus wrote: >> Peter, >> >> > I would like to collect some specs on this feature. So does anyone have >> > links to documentation of existing implementations, or their own spec >> > writeup? A lot of people appear to have a very clear idea of this >> > concept in their own head, so let's start collecting those. >> >> Delta between SPs and Functions for PostgreSQL: >> >> * SPs are executed using CALL or EXECUTE, and not SELECT. >> >> * SPs do not return a value >> ** optional: SPs *may* have OUT parameters. > > [ Late reply.] > > What is it about stored procedures that would require it not to return a > value or use CALL? I am trying to understand what part of this is > "procedures" (doesn't return a value; we decided there isn't much value > for that syntax vs. functions), and anonymous transactions. FWICT the sql standard. The only summary of standard behaviors I can find outside of the standard itself is here: http://farrago.sourceforge.net/design/UserDefinedTypesAndRoutines.html. Peter's synopsis of how the standard works is murky at best and competing implementations are all over the place... SQL Server's 'CALL' feature is basically what I personally would like to see. It would complement our functions nicely. Procedures return values and are invoked with CALL. Functions return values and are in-query callable. The fact that 'CALL' is not allowed inside a query seems to make it pretty darn convenient to make the additional distinction of allowing transactional control statements there and not in functions. You don't *have* to allow transactional control statements and could offer this feature as essentially a syntax-sugar enhancement, but then you run the risk of boxing yourself out of useful properties of this feature later on because of backwards compatibility issues (in particular, the assumption that you are in a running transaction in the procedure body). merlin
Kevin Grittner wrote: > Peter Eisentraut <peter_e@gmx.net> wrote: > > > what would be the actual use cases of any of these features? > > Let's collect some, so we can think of ways to make them work. > > The two things which leap to mind for me are: > > (1) All the \d commands in psql should be implemented in SPs so > that they are available from any client, through calling one SP > equivalent to one \d command. The \d commands would be changed to > call the SPs for releases recent enough to support this. Eventually > psql would be free of worrying about which release contained which > columns in which system tables, because it would just be passing the > parameters in and displaying whatever results came back. > > I have used products which implemented something like this, and > found it quite useful. Uh, why does this require stored procedures? Seems our existing function capabilities are even better suited to this. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: > >> (1) All the \d commands in psql should be implemented in SPs so > >> that they are available from any client, through calling one SP > >> equivalent to one \d command. > > > You don't need stored procedures with special transaction behavior for > > this. > > No, but what you *would* need is the ability to return multiple result > sets from one call. Even then, you could not exactly duplicate the > current output of \d; but you could duplicate the functionality. Oh, good point. Thanks. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On 05/09/2011 08:20 PM, Bruce Momjian wrote: > Tom Lane wrote: >> Peter Eisentraut<peter_e@gmx.net> writes: >>> On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: >>>> (1) All the \d commands in psql should be implemented in SPs so >>>> that they are available from any client, through calling one SP >>>> equivalent to one \d command. >>> You don't need stored procedures with special transaction behavior for >>> this. >> No, but what you *would* need is the ability to return multiple result >> sets from one call. Even then, you could not exactly duplicate the >> current output of \d; but you could duplicate the functionality. > Oh, good point. Thanks. Multiple resultsets in one call would be a good thing, though, no? cheers andrew
On Mon, May 9, 2011 at 9:21 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > > > On 05/09/2011 08:20 PM, Bruce Momjian wrote: >> >> Tom Lane wrote: >>> >>> Peter Eisentraut<peter_e@gmx.net> writes: >>>> >>>> On mån, 2011-04-25 at 14:35 -0500, Kevin Grittner wrote: >>>>> >>>>> (1) All the \d commands in psql should be implemented in SPs so >>>>> that they are available from any client, through calling one SP >>>>> equivalent to one \d command. >>>> >>>> You don't need stored procedures with special transaction behavior for >>>> this. >>> >>> No, but what you *would* need is the ability to return multiple result >>> sets from one call. Even then, you could not exactly duplicate the >>> current output of \d; but you could duplicate the functionality. >> >> Oh, good point. Thanks. > > Multiple resultsets in one call would be a good thing, though, no? > > cheers I *thought* the purpose of having stored procedures was to allow a substrate supporting running multiple transactions, so it could do things like: - Managing vacuums - Managing transactions - Replacing some of the need for dblink. - Being an in-DB piece that could manage LISTENs It seems to be getting "bikeshedded" into something with more "functional argument functionality" than stored functions. I think we could have a perfectly successful implementation of "stored procedures" that supports ZERO ability to pass arguments in or out. That's quite likely to represent a good start. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
Christopher Browne wrote: > > Multiple resultsets in one call would be a good thing, though, no? > > > > cheers > > I *thought* the purpose of having stored procedures was to allow a > substrate supporting running multiple transactions, so it could do > things like: > - Managing vacuums > - Managing transactions > - Replacing some of the need for dblink. > - Being an in-DB piece that could manage LISTENs > > It seems to be getting "bikeshedded" into something with more > "functional argument functionality" than stored functions. > > I think we could have a perfectly successful implementation of "stored > procedures" that supports ZERO ability to pass arguments in or out. > That's quite likely to represent a good start. I am kind of confused too, particularly with the CALL syntax. I thought our function call usage was superior in every way to CALL, so why implement CALL? I assume for SQL-standards compliance, right? Does multiple result sets require CALL? I assume autonomous transactions don't require CALL. Are we assuming no one is going to want a function that allows multiple result sets or autonomous transactions? That seems unlikely. I would think CALL is independent of those features. Maybe we need those features to support SQL-standard CALL, and we will just add those features to functions too. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
2011/5/10 Bruce Momjian <bruce@momjian.us>: > Christopher Browne wrote: >> > Multiple resultsets in one call would be a good thing, though, no? >> > >> > cheers >> >> I *thought* the purpose of having stored procedures was to allow a >> substrate supporting running multiple transactions, so it could do >> things like: >> - Managing vacuums >> - Managing transactions >> - Replacing some of the need for dblink. >> - Being an in-DB piece that could manage LISTENs >> >> It seems to be getting "bikeshedded" into something with more >> "functional argument functionality" than stored functions. >> >> I think we could have a perfectly successful implementation of "stored >> procedures" that supports ZERO ability to pass arguments in or out. >> That's quite likely to represent a good start. > > I am kind of confused too, particularly with the CALL syntax. I thought > our function call usage was superior in every way to CALL, so why > implement CALL? I assume for SQL-standards compliance, right? Does > multiple result sets require CALL? I assume autonomous transactions > don't require CALL. > no - you are a little bit confused :). CALL and function execution share nothing. There are significant differences between a function and a procedure. A function is called only from the executor - from some plan - and you have to know the structure of the result before it runs. The execution of CALL is much simpler - you just execute the code, without a plan and without waiting for any particular result, if there even is one. > Are we assuming no one is going to want a function that allows multiple > result sets or autonomous transactions? That seems unlikely. I would > think CALL is independent of those features. Maybe we need those > features to support SQL-standard CALL, and we will just add those > features to functions too. > We can use a SETOF of cursors to return multiple result sets now. 
But there are a few complications: a) The client has to wait for all of the sets in a multiple result set to finish - at least in PL/pgSQL. b) The 'psql' client doesn't support unpacking a result that is a multiple result set. c) Using cursors isn't very comfortable - in comparison to MS SQL or MySQL. Regards Pavel > -- > Bruce Momjian <bruce@momjian.us> http://momjian.us > EnterpriseDB http://enterprisedb.com > > + It's impossible for everything to be true. +
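The cursor-based workaround Pavel describes can be sketched as follows. The function and table names (`multi_result`, `t1`, `t2`) are placeholders for illustration, not anything from the thread:

```sql
-- Return "multiple result sets" as a set of named cursors.
CREATE OR REPLACE FUNCTION multi_result() RETURNS SETOF refcursor AS $$
DECLARE
    c1 refcursor := 'result_one';
    c2 refcursor := 'result_two';
BEGIN
    OPEN c1 FOR SELECT * FROM t1;
    RETURN NEXT c1;
    OPEN c2 FOR SELECT * FROM t2;
    RETURN NEXT c2;
END;
$$ LANGUAGE plpgsql;

-- The client must fetch each set explicitly, inside one transaction:
BEGIN;
SELECT * FROM multi_result();
FETCH ALL FROM result_one;
FETCH ALL FROM result_two;
COMMIT;
```

This also illustrates complication (c): the caller has to know to FETCH from each named cursor within the surrounding transaction, whereas MS SQL or MySQL clients receive the result sets directly.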
On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > no - you are little bit confused :). CALL and function execution > shares nothing. There is significant differences between function and > procedure. Function is called only from executor - from some plan, and > you have to know a structure of result before run. The execution of > CALL is much simple - you just execute code - without plan and waiting > for any result - if there is. Now I'm a little confused, or you are. Surely any SQL has to be planned and executed, regardless of whether it appears in a function, a stored procedure, or anywhere else. Non-SQL statements within a stored procedure don't need to go through the planner and executor, but that's true in PL/python or PL/pgsql or whatever today. I think people are using the term "stored procedures" to refer to approximately whatever it is that they're unhappy that functions don't allow, and that's leading to a lot of people talking across each other. The main features seem to be (1) explicit transaction control and/or execution of commands like VACUUM that can't be invoked from within a transaction, (2) autonomous transactions, and (3) returning multiple result sets. But I don't think anybody would be desperately unhappy if it magically became possible to do those things from regular functions, unlikely as that may seem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2011/5/10 Robert Haas <robertmhaas@gmail.com>: > On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> no - you are little bit confused :). CALL and function execution >> shares nothing. There is significant differences between function and >> procedure. Function is called only from executor - from some plan, and >> you have to know a structure of result before run. The execution of >> CALL is much simple - you just execute code - without plan and waiting >> for any result - if there is. > > Now I'm a little confused, or you are. Surely any SQL has to be > planned and executed, regardless of whether it appears in a function, > a stored procedure, or anywhere else. Non-SQL statements within a > stored procedure don't need to go through the planner and executor, > but that's true in PL/python or PL/pgsql or whatever today. > The CALL statement is a "utility command" rather than SQL. It has to execute some non-SQL code. You could think of the CALL statement as a synonym for SELECT, but that isn't correct (in my opinion). "Stored procedures" predate stored functions (more correctly UDFs - user-defined functions). These "old time" stored procedures were simple - they were client code moved onto the server. Usually these procedures were executed in a different process or a different thread. Inside a procedure you had the full client-side functionality, without the network overhead. The CALL statement is more or less a remote call. It isn't an SQL statement. > I think people are using the term "stored procedures" to refer to > approximately whatever it is that they're unhappy that functions don't > allow, and that's leading to a lot of people talking across each > other. The main features seem to be (1) explicit transaction control > and/or execution of commands like VACUUM that can't be invoked from > within a transaction, (2) autonomous transactions, and (3) returning > multiple result sets. 
> But I don't think anybody would be desperately > unhappy if it magically became possible to do those things from > regular functions, unlikely as that may seem. > yes. @2 Autonomous transactions don't need stored procedures. An autonomous transaction can be isolated by a function flag, or by some special PL/pgSQL statement, like: BEGIN EXECUTE AUTONOMOUS '....'; END; @3 is possible now too - but it isn't very user friendly. Point 3 is strange: Oracle doesn't support it, and support in DB2 is a little bit odd. But it is well supported by MySQL, MSSQL, and maybe Informix and Sybase. Regards Pavel > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
Hi, On 05/10/2011 02:55 PM, Robert Haas wrote: > On Mon, May 9, 2011 at 11:58 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> no - you are little bit confused :). CALL and function execution >> shares nothing. There is significant differences between function and >> procedure. Function is called only from executor - from some plan, and >> you have to know a structure of result before run. The execution of >> CALL is much simple - you just execute code - without plan and waiting >> for any result - if there is. I think the distinction between function and procedure is misleading here. Some envision stored *procedures* to be able to return values, result sets and possibly even *multiple* result sets. > The main features seem to be (1) explicit transaction control > and/or execution of commands like VACUUM that can't be invoked from > within a transaction, I think that's the main point of stored procedures. > (2) autonomous transactions To me autonomous transactions seem orthogonal. Those can be used to implement (1) above, but might have other uses for regular transactions as well. (The point I'm taking home here is that you might want to control not only one concurrent transaction, but several from a "stored procedure". So far, I assumed only one.) > and (3) returning > multiple result sets. But I don't think anybody would be desperately > unhappy if it magically became possible to do those things from > regular functions, unlikely as that may seem. That point definitely is on my wish-list for UDFs already. I didn't think of this as having to do with stored procedures, either. Regards Markus
On 9 May 2011 20:52, Merlin Moncure <mmoncure@gmail.com> wrote: > On Mon, May 9, 2011 at 1:41 PM, Bruce Momjian <bruce@momjian.us> wrote: >> Josh Berkus wrote: >>> Peter, >>> >>> > I would like to collect some specs on this feature. So does anyone have >>> > links to documentation of existing implementations, or their own spec >>> > writeup? A lot of people appear to have a very clear idea of this >>> > concept in their own head, so let's start collecting those. >>> >>> Delta between SPs and Functions for PostgreSQL: >>> >>> * SPs are executed using CALL or EXECUTE, and not SELECT. >>> >>> * SPs do not return a value >>> ** optional: SPs *may* have OUT parameters. >> >> [ Late reply.] >> >> What is it about stored procedures that would require it not to return a >> value or use CALL? I am trying to understand what part of this is >> "procedures" (doesn't return a values, we decided there isn't much value >> for that syntax vs. functions), and anonymous transactions. > > FWICT the sql standard. The only summary of standard behaviors I can > find outside of the standard itself is here: > http://farrago.sourceforge.net/design/UserDefinedTypesAndRoutines.html. > Peter's synopsis of how the standard works is murky at best and > competing implementations are all over the place...SQL server's > 'CALL' feature is basically what I personally would like to see. It > would complement our functions nicely. > > Procedures return values and are invoked with CALL. Functions return > values and are in-query callable. > > The fact that 'CALL' is not allowed inside a query seems to make it > pretty darn convenient to make the additional distinction of allowing > transactional control statements there and not in functions. 
> You don't *have* to allow transactional control statements and could offer > this feature as an essentially syntax-sugar enhancement, but then run > the risk of boxing yourself out of useful properties of this feature > later on because of backwards-compatibility issues (in particular, the > assumption that you are in a running transaction in the procedure > body). I've seen no mention of SQL/PSM. Isn't all of this covered by that? -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Aug 31, 2011 at 9:00 AM, Thom Brown <thom@linux.com> wrote: > On 9 May 2011 20:52, Merlin Moncure <mmoncure@gmail.com> wrote: >> On Mon, May 9, 2011 at 1:41 PM, Bruce Momjian <bruce@momjian.us> wrote: >>> Josh Berkus wrote: >>>> Peter, >>>> >>>> > I would like to collect some specs on this feature. So does anyone have >>>> > links to documentation of existing implementations, or their own spec >>>> > writeup? A lot of people appear to have a very clear idea of this >>>> > concept in their own head, so let's start collecting those. >>>> >>>> Delta between SPs and Functions for PostgreSQL: >>>> >>>> * SPs are executed using CALL or EXECUTE, and not SELECT. >>>> >>>> * SPs do not return a value >>>> ** optional: SPs *may* have OUT parameters. >>> >>> [ Late reply.] >>> >>> What is it about stored procedures that would require it not to return a >>> value or use CALL? I am trying to understand what part of this is >>> "procedures" (doesn't return a values, we decided there isn't much value >>> for that syntax vs. functions), and anonymous transactions. >> >> FWICT the sql standard. The only summary of standard behaviors I can >> find outside of the standard itself is here: >> http://farrago.sourceforge.net/design/UserDefinedTypesAndRoutines.html. >> Peter's synopsis of how the standard works is murky at best and >> competing implementations are all over the place...SQL server's >> 'CALL' feature is basically what I personally would like to see. It >> would complement our functions nicely. >> >> Procedures return values and are invoked with CALL. Functions return >> values and are in-query callable. >> >> The fact that 'CALL' is not allowed inside a query seems to make it >> pretty darn convenient to make the additional distinction of allowing >> transactional control statements there and not in functions. 
>> You don't *have* to allow transactional control statements and could offer >> this feature as an essentially syntax-sugar enhancement, but then run >> the risk of boxing yourself out of useful properties of this feature >> later on because of backwards-compatibility issues (in particular, the >> assumption that you are in a running transaction in the procedure >> body). > > I've seen no mention of SQL/PSM. Isn't all of this covered by that? That's the $64k question. My take is that 'CALL' doesn't implicitly set up a transaction state, and a proper PSM implementation would allow transaction control mid-procedure. Functions will always be called in-transaction, since there is no way I can see to execute a function except from an outer query (or the special case of DO). I think there's zero point in making CALL work without dealing with the transaction issue -- in fact it could end up being a huge mistake to do so. Pavel's PSM implementation (see: http://www.pgsql.cz/index.php/SQL/PSM_Manual) works under the constraints of pg's understanding of what functions should and should not be allowed to do. It allows creation of PSM *functions* -- that's all. IMNSHO, stored procedures should run in-process, and the execution engine needs to be modified to not automatically spin up a transaction and a snapshot when running them, but must allow a PL to do that at appropriate times. plpgsql and the other PLs FWICT make no assumptions that strictly invalidate their use in that fashion, outside of some unfortunate ambiguity issues around 'begin', 'end', etc. If there is no current transaction, each statement should create one if it's determined that the statement is interfacing with the SQL engine in such a way that a transaction would be required, and immediately tear it down, exactly as if an SQL script were run inside the backend. The SPI interface can probably work 'as-is', and should probably return an error if you arrive in certain functions while not in a transaction. 
An out of process, autonomous transaction type implementation should probably not sit under stored procedures for a number of reasons -- mainly that it's going to expose too many implementation details to the user. For example, does a SP heavy app have 2*N running processes? Or do we slot them into a defined number of backends for that purpose? Yuck & yuck. I like the AT feature, and kludge it frequently via dblink, but it's a solution for a different set of problems. merlin
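The dblink kludge Merlin mentions looks roughly like this - a sketch, assuming the dblink extension is installed and that the connection string, function name, and audit_log table (all placeholders) fit your setup:

```sql
-- Emulate an autonomous transaction: dblink_exec runs the INSERT over
-- a second connection, so it commits independently of the caller.
CREATE OR REPLACE FUNCTION log_autonomous(msg text) RETURNS void AS $$
BEGIN
    PERFORM dblink_exec('dbname=mydb',
        format('INSERT INTO audit_log(message) VALUES (%L)', msg));
END;
$$ LANGUAGE plpgsql;

BEGIN;
SELECT log_autonomous('starting risky update');
-- ... work that may fail ...
ROLLBACK;  -- the audit_log row survives, having committed elsewhere
```

The "yuck" factor is visible even here: every such call burns a second backend and a fresh connection setup, which is exactly the implementation detail Merlin argues a real SP feature should not expose.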
On 8/31/11 12:15 PM, Merlin Moncure wrote: > An out of process, autonomous transaction type implementation should > probably not sit under stored procedures for a number of reasons -- > mainly that it's going to expose too many implementation details to > the user. For example, does a SP heavy app have 2*N running > processes? Or do we slot them into a defined number of backends for > that purpose? Yuck & yuck. I like the AT feature, and kludge it > frequently via dblink, but it's a solution for a different set of > problems. I think that transaction control without parallelism would be the 80% solution. That is, an SP has transaction control, but those transactions are strictly serial, and cannot be run in parallel. For example, if you were writing an SP in PL/pgSQL, each "BEGIN ... END" block would be an explicit transaction, and standalone-only statements would be allowed between BEGIN ... END blocks, or possibly in their own special block type (I prefer the latter). One issue we'd need to deal with is exception control around single-statement transactions and non-transactional statements (VACUUM, CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.). In some cases, the user is going to want to catch exceptions and abort the SP, and in other cases ignore them, so both need to be possible. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
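Under Josh's serial-transaction model, a procedure might hypothetically look like the following. None of this syntax existed at the time; the procedure body, CREATE PROCEDURE form, and table names are purely illustrative:

```sql
-- Hypothetical syntax: each BEGIN ... END block is its own explicit
-- transaction, and standalone-only statements run between blocks,
-- with no transaction in progress.
CREATE PROCEDURE rebuild_summary() AS $$
  BEGIN                    -- transaction 1
    DELETE FROM summary;
  END;

  VACUUM summary;          -- legal here: no transaction is open

  BEGIN                    -- transaction 2
    INSERT INTO summary
      SELECT dept, sum(amount) FROM orders GROUP BY dept;
  END;
$$;
```

The exception-control question Josh raises then becomes: if the VACUUM between the two blocks fails, does the procedure abort before transaction 2, or continue? Both behaviors would need to be expressible.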
On Thu, Sep 1, 2011 at 1:18 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 8/31/11 12:15 PM, Merlin Moncure wrote: >> An out of process, autonomous transaction type implementation should >> probably not sit under stored procedures for a number of reasons -- >> mainly that it's going to expose too many implementation details to >> the user. For example, does a SP heavy app have 2*N running >> processes? Or do we slot them into a defined number of backends for >> that purpose? Yuck & yuck. I like the AT feature, and kludge it >> frequently via dblink, but it's a solution for a different set of >> problems. > > I think that transaction control without parallelism would be the 80% > solution. That is, an SP has transaction control, but those > transactions are strictly serial, and cannot be run in parallel. For > example, if you were writing an SP in PL/pgSQL, each "BEGIN ... END" > block would be an explicit transaction, and standalone-only statements > be allowed between BEGIN ... END blocks, or possibly in their own > special block type (I prefer the latter). > > One issue we'd need to deal with is exception control around > single-statement transactions and non-transactional statements (VACUUM, > CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.). In some cases, the user > is going to want to catch exceptions and abort the SP, and in other > cases ignore them, so both need to be possible. Yep, "+1" on that. Leaving out parallelism, and having the mechanism operate under the auspices of a single connection, makes a fine start, and perhaps is enough even in the longer run. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
On Thu, Sep 1, 2011 at 12:18 PM, Josh Berkus <josh@agliodbs.com> wrote: > On 8/31/11 12:15 PM, Merlin Moncure wrote: >> An out of process, autonomous transaction type implementation should >> probably not sit under stored procedures for a number of reasons -- >> mainly that it's going to expose too many implementation details to >> the user. For example, does a SP heavy app have 2*N running >> processes? Or do we slot them into a defined number of backends for >> that purpose? Yuck & yuck. I like the AT feature, and kludge it >> frequently via dblink, but it's a solution for a different set of >> problems. > > I think that transaction control without parallelism would be the 80% > solution. That is, an SP has transaction control, but those > transactions are strictly serial, and cannot be run in parallel. For > example, if you were writing an SP in PL/pgSQL, each "BEGIN ... END" > block would be an explicit transaction, and standalone-only statements > be allowed between BEGIN ... END blocks, or possibly in their own > special block type (I prefer the latter). > > One issue we'd need to deal with is exception control around > single-statement transactions and non-transactional statements (VACUUM, > CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.). In some cases, the user > is going to want to catch exceptions and abort the SP, and in other > cases ignore them, so both need to be possible. Totally agree -- was thinking about this very issue. One of the things I'd really like to see SPs be able to do is to abstract some of the nasty details of MVCC away from the client -- setting the isolation mode, retrying on serialization errors, etc. This requires error handling. Unfortunately, this (exception handling in a non-transactional context) is probably going to add some complexity to the implementation. Are we on the right track here (that is, maybe we really *should* be looking at out of process execution)? How do procedures fit in terms of execution from the tcop down? merlin
Are "stored procedures" planned for the future? I think it is the "most missing" feature in Postgres today. Using dblink to emulate commit in a transaction is very complicated -- View this message in context: http://postgresql.1045698.n5.nabble.com/stored-procedures-tp4331060p5752274.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Hello
2013/4/16 aasat <satriani@veranet.pl>
Are "stored procedures" planned for the future? I think it is the "most missing" feature
in Postgres today.
It is on the ToDo list, but probably nobody is working on this feature at the moment.
Using dblink to emulate commit in a transaction is very complicated
Autonomous transactions will probably be implemented first - and they should be a really nice feature.
Regards
Pavel