Обсуждение: The transaction that "happens" with function invocation

Поиск
Список
Период
Сортировка

The transaction that "happens" with function invocation

От
jr@amanue.com (Jim Rosenberg)
Дата:
I need some clarification on the issue of functions and transactions.
I gather -- from bits and pieces of various mailing list postings --
that whenever a function created with CREATE FUNCTION is invoked,
there is some kind of "automatic" transaction wrapping this function
invocation. While I would rate the quality of PostgreSQL documentation
overall to be *EXCELLENT*, on this one issue there are some gaps, it
seems to me. We need complete documentation on just exactly how this
"automatic function transaction" works. There are several issues.

Is this an "ordinary" transaction, or is it "special"? If it is special,
in exactly what ways? If it's ordinary, then the following should work,
even absent nested transactions. Suppose I want to create a function in
a loadable language where I make my own decision about where the
transaction boundaries are going to be. E.g. I want a query loop, and
I want the transaction scoped to the contents of each loop iteration,
not the whole function. So: since I can't have nested transactions [yet!],
here's what I do: the function begins by executing COMMIT. Now there's
no transaction in force. I start my query loop, and inside the body of the
loop I put BEGIN and COMMIT / ROLLBACK just like I would if the statements
weren't inside a function. I make sure before exiting to have a dummy
BEGIN. Arguably, this is a demented way to program, but *will it work*?
[I gather it won't ...] If it won't work then that means there *is*
something "special" about the function transaction, and I would like to
see this documented.

What about error handling / ROLLBACK? How exactly does this happen?
Suppose my function is written in -- let's say -- pgTCL. I'm using
catch to trap errors. Normally I would expect a catch handler to do a
ROLLBACK if the code assumes a transaction is happening. Do I do
that inside a function in the usual way? Or does something "magic"
happen to roll back this transaction? If there is magic, how is a catch
handler supposed to "register" an error?

If these issues *are* discussed in the documentation, please accept
my apology and let me know where, but if not please please please
document these things.

* * *

I'm still trying to decide if PostgreSQL functions can "really" be
used to encapsulate business logic, or if it requires a 3-tier
architecture to do it properly. (I suspect the latter ...) Business
logic is tricky stuff. I *hope* the designers of PostgreSQL are not
making a decision for me that this encapsulation "has to" take place
at exactly the granularity of the transaction. That's a decision I
want to make for myself. There are many cases in business logic where
you want to have many transactions at the database level rolled into a
single encapsulated unit of work. Take a payroll system. Each person's
pay may be a transaction at the database level, but you don't want to
allow a person not to be paid "by mistake". It would be horrendous to
make the whole pay a single transaction. Do you really want to roll back
an entire payroll because of a problem with one person's check? I don't
think so.

PostgreSQL took a good step when it allowed a function to be invoked
under the privilege of the function creator. Encapsulating business
logic is *all about* privilege escalation. Obviously, nested transactions
will be a *HUGE* step forward. [I can't wait ...] But even with nested
transactions, we will still need clear documentation on how any transaction
wrapping a function works.

Re: The transaction that "happens" with function invocation

От
"Nigel J. Andrews"
Дата:
On Tue, 10 Jun 2003, (Jim Rosenberg) wrote:

> I need some clarification on the issue of functions and transactions.
> I gather -- from bits and pieces of various mailing list postings --
> that whenever a function created with CREATE FUNCTION is invoked,
> there is some kind of "automatic" transaction wrapping this function
> invocation. While I would rate the quality of PostgreSQL documentation
> overall to be *EXCELLENT*, on this one issue there are some gaps, it
> seems to me. We need complete documentation on just exactly how this
> "automatic function transaction" works. There are several issues.
>
> Is this an "ordinary" transaction, or is it "special"? If it is special,
> in exactly what ways? If it's ordinary, then the following should work,
> even absent nested transactions. Suppose I want to create a function in
> a loadable language where I make my own decision about where the
> transaction boundaries are going to be. E.g. I want a query loop, and
> I want the transaction scoped to the contents of each loop iteration,
> not the whole function. So: since I can't have nested transactions [yet!],
> here's what I do: the function begins by executing COMMIT. Now there's
> no transaction in force. I start my query loop, and inside the body of the
> loop I put BEGIN and COMMIT / ROLLBACK just like I would if the statements
> weren't inside a function. I make sure before exiting to have a dummy
> BEGIN. Arguably, this is a demented way to program, but *will it work*?
> [I gather it won't ...] If it won't work then that means there *is*
> something "special" about the function transaction, and I would like to
> see this documented.

Not exactly anything special.

What it is is that each statement gets an implicit transaction around it not
that each function starts a transaction.

Your SELECT * FROM your_table has an implicit transaction, if there isn't
already one started, as does UPDATE your_table SET your_field = your_value. So
when you invoke your function it's not because you are invoking a function that
means that a transaction is started. In your function you've been given the
opportunity to perform extra processing in a select query and so you can get
the opportunity to 'see' that you're in a transaction whereas normally you'd
just be sat twiddling your thumbs waiting for psql to spew the results out at
you.


> What about error handling / ROLLBACK? How exactly does this happen?
> Suppose my function is written in -- let's say -- pgTCL. I'm using
> catch to trap errors. Normally I would expect a catch handler to do a
> ROLLBACK if the code assumes a transaction is happening. Do I do
> that inside a function in the usual way? Or does something "magic"
> happen to roll back this transaction? If there is magic, how is a catch
> handler supposed to "register" an error?
>
> If these issues *are* discussed in the documentation, please accept
> my apology and let me know where, but if not please please please
> document these things.

Not sure, I'm sure it must be somewhere but I don't know off hand.

Nested transactions of course let you do what you want, probably still a good
idea not to try and commit the outermost statement's transaction though.


> I'm still trying to decide if PostgreSQL functions can "really" be
> used to encapsulate business logic, or if it requires a 3-tier
> architecture to do it properly. (I suspect the latter ...) Business
> logic is tricky stuff. I *hope* the designers of PostgreSQL are not
> making a decision for me that this encapsulation "has to" take place
> at exactly the granularity of the transaction. That's a decision I
> want to make for myself. There are many cases in business logic where
> you want to have many transactions at the database level rolled into a
> single encapsulated unit of work. Take a payroll system. Each person's
> pay may be a transaction at the database level, but you don't want to
> allow a person not to be paid "by mistake". It would be horrendous to
> make the whole pay a single transaction. Do you really want to roll back
> an entire payroll because of a problem with one person's check? I don't
> think so.

Why not? I can sort of see your logic but if you're doing it within a database
function then that function also has to take care of making the payments
doesn't it?


--
Nigel Andrews



Re: The transaction that "happens" with function invocation

От
Richard Huxton
Дата:
On Tuesday 10 Jun 2003 2:03 pm, Jim Rosenberg wrote:
> I need some clarification on the issue of functions and transactions.
> I gather -- from bits and pieces of various mailing list postings --
> that whenever a function created with CREATE FUNCTION is invoked,
> there is some kind of "automatic" transaction wrapping this function
> invocation. While I would rate the quality of PostgreSQL documentation
> overall to be *EXCELLENT*, on this one issue there are some gaps, it
> seems to me. We need complete documentation on just exactly how this
> "automatic function transaction" works. There are several issues.

It takes place inside the same transaction as anything else. There is no
additional transaction "wrapping" a function-call. Any SQL statement occurs
within a transaction, either explicitly (BEGIN...COMMIT) or implicitly
(autocommit).

> Is this an "ordinary" transaction, or is it "special"? If it is special,
> in exactly what ways? If it's ordinary, then the following should work,
> even absent nested transactions. Suppose I want to create a function in
> a loadable language where I make my own decision about where the
> transaction boundaries are going to be.

Sorry - can't do. This will have to wait until we have nested transactions.

> If these issues *are* discussed in the documentation, please accept
> my apology and let me know where, but if not please please please
> document these things.

Think you've got the wrong end of the stick somewhere. Is there a part of the
docs which implies this? (in which case they need some rewriting)

> I'm still trying to decide if PostgreSQL functions can "really" be
> used to encapsulate business logic, or if it requires a 3-tier
> architecture to do it properly. (I suspect the latter ...)

Yep - latter.

> Business
> logic is tricky stuff. I *hope* the designers of PostgreSQL are not
> making a decision for me that this encapsulation "has to" take place
> at exactly the granularity of the transaction.

The decision is which features to work on first. Nested transactions impact a
lot of other areas.

> That's a decision I
> want to make for myself. There are many cases in business logic where
> you want to have many transactions at the database level rolled into a
> single encapsulated unit of work. Take a payroll system. Each person's
> pay may be a transaction at the database level, but you don't want to
> allow a person not to be paid "by mistake". It would be horrendous to
> make the whole pay a single transaction. Do you really want to roll back
> an entire payroll because of a problem with one person's check? I don't
> think so.

Hmm - I'd disagree on this specific example. At some point you do want a
"commit all changes" transaction to mark the payroll finished. Having said
that, nested transactions allowing a function to rollback sub-transactions
will be a big gain in many cases.

At present, you'll have to do it at the application level I'm afraid.

--
  Richard Huxton

Re: The transaction that "happens" with function invocation

От
Tom Lane
Дата:
jr@amanue.com (Jim Rosenberg) writes:
> I need some clarification on the issue of functions and transactions.

http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=tutorial-transactions.html
See particularly the last couple of paragraphs.

            regards, tom lane

Re: The transaction that "happens" with function

От
Jim Rosenberg
Дата:
--On Tuesday, June 10, 2003 9:51 AM -0400 Tom Lane <tgl@sss.pgh.pa.us>
wrote:

> jr@amanue.com (Jim Rosenberg) writes:
>> I need some clarification on the issue of functions and transactions.
>
> http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=tutorial-
> transactions.html See particularly the last couple of paragraphs.

Thanks!

Oh oh oh oh [Sound of light bulbs going on ...] I see where i went astray.
Some databases have the concept -- missing from PostgreSQL -- of *no
transaction*. In PostgreSQL, it appears, whenever I'm "in" SQL I'm in
*some* transaction. In some databases it's not necessary to have "any"
transaction if one hasn't changed anything, but PostgreSQL doesn't work
this way.

I don't suppose "no transaction" is easier to do than nested transactions
...

This reminds me of my favorite joke [an original creation, I might add :-)]
about why it's so hard to get no-salt potato chips at the grocery store: it
seems there is a world-wide cartel that controls the price of no-salt.

--On Tuesday, June 10, 2003 2:33 PM +0100 Richard Huxton <dev@archonet.com>
wrote:

>> Take a payroll system. Each person's
>> pay may be a transaction at the database level, but you don't want to
>> allow a person not to be paid "by mistake". It would be horrendous to
>> make the whole pay a single transaction. Do you really want to roll back
>> an entire payroll because of a problem with one person's check? I don't
>> think so.
>
> Hmm - I'd disagree on this specific example. At some point you do want a
> "commit all changes" transaction to mark the payroll finished.

The thinking here is very simple. There are twenty very burly guys outside
your office waiting for their paychecks. There is one very mild-mannered
sweet director of graphics design who is not waiting outside your office.
She is in Hawaii, on vacation. She has already agreed to be the company
guinea pig for the new deduction scheme you just launched. You already
warned her that this might mess up her pay check, we'll have to make an
adjustment. That's OK, she says, I won't be here. And sure enough, her pay
blows up. And you're going to roll back the WHOLE PAYROLL because of this?
As I said, I don't think so.

A structure where there's a series of business transactions that are posted
together, but where at the database level each business transaction is a
*separate* database transaction is *very common*. One needs to ask: *WHY*
are transactions rolled back, anyway. The answer usually is that there's
some inconsistency in the data, and it's always easier to deal with such
things sooner rather than later. You roll back the transaction because it's
*dangerous* to let the data go forward. But it may not be dangerous for
there to be more data yet to be posted. In this situation, you post what is
postable, fail what is not. But you still may want the whole posting
routine encapsulated as a unit of business logic where users can't temper
with it.

I could imagine making the whole payroll be a single transaction if there
were some rule at the company that *says* explicitly that no one can be
paid unless everyone is paid.

* * *

At any rate, such issues are a function of the individual application, and
shouldn't be dictated by the database. I repeat that there are very strong
business reasons for insisting that business logic be encapsulated at a
higher-level unit than the database-level transaction.