Обсуждение: Writing most code in Stored Procedures

Поиск
Список
Период
Сортировка

Writing most code in Stored Procedures

От
Rohit
Дата:
I have few queries regarding the use of Stored Procedures, Functions
and Triggers in an RDBMS.

(1) When to use Stored Procedure? Writing an INSERT query in a Stored
Procedure is better or firing it from the application level?

(2) Can a Trigger call a Stored Procedure?

(3) What type of code must reside in application and what type in
RDBMS?

(4) Is it faster to work at application level or at the database level?


Re: Writing most code in Stored Procedures

От
Richard Huxton
Дата:
Rohit wrote:
> I have few queries regarding the use of Stored Procedures, Functions
> and Triggers in an RDBMS.

These are all easy questions to answer: "it depends".

OK, so you might want some reasons...

> (1) When to use Stored Procedure? Writing an INSERT query in a Stored
> Procedure is better or firing it from the application level?

Some people like to have all access to the database through stored
procedures. This can make it simpler to control access to the data and
seems popular with MSSQL developers.

Personally, I use SQL as my interface - that's its purpose. Actually,
that's not quite true, I use various wrappers that generate SQL for most
queries.

> (2) Can a Trigger call a Stored Procedure?

It has to, although they're just called functions in PostgreSQL, can be
written in a range of procedural languages and take place within a
transaction the same as any other function-call.

> (3) What type of code must reside in application and what type in
> RDBMS?

Anything that's to do with data integrity I put in the database.
Anything that's to do with process I put in the application (or
middle-ware layer).
The tricky bit is deciding what to do when you have something that's a
bit of both.

> (4) Is it faster to work at application level or at the database level?

Probably faster in the database, assuming you have only one machine. If
you have more than one machine then you can have each machine designed
for its purpose. Of course, faster to run might be at the cost of more
difficult to write. It might also mean your process code can crash the
database server.

--
   Richard Huxton
   Archonet Ltd

Re: Writing most code in Stored Procedures

От
"Josh Tolley"
Дата:
On 8/15/07, Rohit <rpk.general@gmail.com> wrote:
> I have few queries regarding the use of Stored Procedures, Functions
> and Triggers in an RDBMS.
>
> (1) When to use Stored Procedure? Writing an INSERT query in a Stored
> Procedure is better or firing it from the application level?
>
> (2) Can a Trigger call a Stored Procedure?
>
> (3) What type of code must reside in application and what type in
> RDBMS?
>
> (4) Is it faster to work at application level or at the database level?

One of the major advantages of stored procedures over application code
is that stored procedures can get to the data much more quickly than
an application can, in general. An application needs to talk to
PostgreSQL through some sort of driver (JDBC, libpq, etc.) and
typically data need to traverse a network to get from the database
machine to the application server. Stored procedures, on the other
hand, don't have the overhead of either the network or the driver.
However, stored procedures can be more difficult to debug, profile,
etc., so they might not be the best for really complex logic. They
tend to be really useful if you're doing something that requires lots
and lots of queries to the database, and don't need anything else
(data from other sources, user interaction, etc.), because that takes
greatest advantage of their quick connection to the data. Richard
Huxton's point that stored procedures are typically best for data
integrity types of functions, whereas business logic should often be
in application code is an excellent rule of thumb.

-Josh

Re: Writing most code in Stored Procedures

От
Steve Manes
Дата:
On 8/15/07, Rohit <rpk.general@gmail.com> wrote:
> I have few queries regarding the use of Stored Procedures, Functions
> and Triggers in an RDBMS.
>
> (1) When to use Stored Procedure? Writing an INSERT query in a Stored
> Procedure is better or firing it from the application level?
>
> (2) Can a Trigger call a Stored Procedure?
>
> (3) What type of code must reside in application and what type in
> RDBMS?
>
> (4) Is it faster to work at application level or at the database level?

I'm fairly hardcore about keeping as much business logic as I can in the
database.  In fact, I only do SELECTs from the application, and usually
via Views.  All inserts, updates and deletes are via procs.  I'm a
proponent of separating application code from presentation and
application code from database code.  And HTML from layout style, for
that matter.

In addition to the other reasons you've gotten:

It lets me blackbox commonly used functions, such as a function to
insert a normalized demographic record for a customer, an organization,
a user, a guarantor, a physician, etc.

It isolates database logic so it can be debugged separately from the
application.

It reduces the application's vulnerability to SQL injection, especially
if another developer (never me <g>) forgets to massage potentially
tainted user input.

Another is because I typically do my web application programming in PHP5
but the offline scripts in Perl.  Both can call the same stored
procedures so I don't have multiple copies of database code to maintain
and debug.

Another is because I want transactions to start and end in the database,
not in external application code which might crash before a COMMIT.

Another is because I'm a freelancer and generally hand off my
applications to the client's tech department, which is often a junior
level grasshopper who knows just enough SQL to be dangerous. Using
stored procedures raises the bar on what they need to know about RDMBSes
  before they start hacking working code.

And, yes, it's faster.  Particularly if business logic decisions have to
be made in context with a transaction.

Re: Writing most code in Stored Procedures

От
"Trevor Talbot"
Дата:
On 8/15/07, Steve Manes <smanes@magpie.com> wrote:

> I'm fairly hardcore about keeping as much business logic as I can in the
> database.  In fact, I only do SELECTs from the application, and usually
> via Views.  All inserts, updates and deletes are via procs.  I'm a
> proponent of separating application code from presentation and
> application code from database code.  And HTML from layout style, for
> that matter.
>
> In addition to the other reasons you've gotten:

> Another is because I want transactions to start and end in the database,
> not in external application code which might crash before a COMMIT.

Hmm, how do you handle this logically?  Do your applications never
need to submit chunks of work at once?  Or do you do something like
fill in a temporary table, and have a proc work from that?

> Another is because I'm a freelancer and generally hand off my
> applications to the client's tech department, which is often a junior
> level grasshopper who knows just enough SQL to be dangerous. Using
> stored procedures raises the bar on what they need to know about RDMBSes
>   before they start hacking working code.

...although that helps explain "why" a bit more: if you're stuck with
people who can't use transactions properly, it makes sense to keep
them out of reach...

Re: Writing most code in Stored Procedures

От
Steve Manes
Дата:
Trevor Talbot wrote:
>> Another is because I want transactions to start and end in the database,
>> not in external application code which might crash before a COMMIT.
>
> Hmm, how do you handle this logically?  Do your applications never
> need to submit chunks of work at once?  Or do you do something like
> fill in a temporary table, and have a proc work from that?

Of course not "never" but the goal is one logical database transaction
per page invocation.  It's a discipline I got into when I was working on
a site that clocked 3/4 billion page views month on a severely overtaxed
Sun Oracle box.  A lot of it involves caching on the web server.  For
instance, static data like pulldown data gets cached and refreshed
either by expiry or by the application blowing the cache after an
update.  If I have a thousand users pulling the same list of physicians
a hundred times a day, I cache it.  If users need to view a complex
report where the dependent data only changes once every 24 hours (like
an adserver pull), I cache it and have an offline job rebuild it when
fresh data becomes available.

Result sets get cached either in the user's session or in hidden DIVs to
reduce redundant database calls.

I build database-intensive applications starting at the database,
beginning with an ERD, then the proc primitives.  Then I wireframe the
application from the perspective of the database, building macros of
procedures.  After testing, these go into the application's API.  The
brass ring is being able to take a validated, prepared and filtered POST
array and send it to a stored procedure.  And rather than building a
200+ line SELECT statement in the application, building a view to
encapsulate it.

I'm not saying that's the best way to build an application.  It's just
what I'm most comfortable doing and it works for me.



Re: Writing most code in Stored Procedures

От
Lew
Дата:
Rohit wrote:
>> (4) Is it faster to work at application level or at the database level?

Richard Huxton wrote:
> Probably faster in the database, assuming you have only one machine. If
> you have more than one machine then you can have each machine designed
> for its purpose. Of course, faster to run might be at the cost of more
> difficult to write. It might also mean your process code can crash the
> database server.

Which would you rather have, a wrong answer really quickly, a correct answer
slowly, or no answer at all?

Be clear on the metric for "faster" before devising strategies.  Personally I
prefer "more correct" first.  As Richard hinted, reliability and stability are
usually important also, I'd say more important.

"Faster to run" also depends on the degree of client concurrency.  Sometimes
middleware can scale logic better than the database engine can.  Systems can
achieve higher aggregate throughput with perhaps a reduction in individual
response time.


If you put logic where it "belongs" that can help scalability.  The database
tends to be a central resource for all concurrent clients, so logic in that
layer is potentially a bottleneck.  Danger in that layer is danger to everyone.

Business logic that relies on data but performs session-specific processing
should live in the middleware in many cases.  One can throw extra hardware at
that layer to parallelize client services, and errors tend to not escape a
specific client session.

Business logic that is actually data-integrity logic could break the system if
it were in the middleware layer.  Deadlocks, race conditions and other
nastiness argue that that type of logic belongs in a more central location,
with the data.  One can then exploit the database engine capabilities for this
stuff.

As Richard metioned, there are plenty of boundary cases that require a
judgment call.  Try to analyze which approach will have more risk in such
cases; sometimes that helps discriminate.  It ain't always easy; that's why
they pay us the big bucks.

--
Lew

Re: Writing most code in Stored Procedures

От
"Mohd Kamal Bin Mustafa"
Дата:
On 8/16/07, Steve Manes <smanes@magpie.com> wrote:
> On 8/15/07, Rohit <rpk.general@gmail.com> wrote:
>
> Another is because I typically do my web application programming in PHP5
> but the offline scripts in Perl.  Both can call the same stored
> procedures so I don't have multiple copies of database code to maintain
> and debug.
I'm also thinking the same thing once before. Doing as much logic in
the database and at the application layer most of it would be just a
"SELECT .." call. It allows me to experiment with different kind of
langguage (PHP, Python, Ruby) without having to rewrite the business
logic of the application. But using this kind of approach make it hard
to debug and maintain. Changing simple thing would mean updating the
database schema and it would always frigthened me some silly mistake
can affect the integrity of the data. In the end, I slowly move the
logic out of the db and decided on one platform for the application
(PHP).

--
kamal, www.k4ml.com

Re: Writing most code in Stored Procedures

От
Guy Rouillier
Дата:
Steve Manes wrote:
> I'm fairly hardcore about keeping as much business logic as I can in the
> database.  In fact, I only do SELECTs from the application, and usually
> via Views.  All inserts, updates and deletes are via procs.
...
> And, yes, it's faster.  Particularly if business logic decisions have to
> be made in context with a transaction.

I have a thread I started ages ago over on the PERFORM list that I'm
sadly just now being able to provide some insight on.  I'll be replying
on that thread in more detail, but the short of it turns out to be that
at least in this one application, using stored procs for inserts is
slowing down the app considerably.  The app does high volume inserts and
updates, about 16 million rows a day.  By switching from stored procs to
inline inserts, elapsed time dropped from 2595 seconds to 991 seconds
for a test run.

So the moral of the story is that, as anyone who has worked
professionally for a number of years knows, no magic bullets exist.
General guidelines can be helpful, but each scenario must be
individually investigated.

--
Guy Rouillier

Re: Writing most code in Stored Procedures

От
Kenneth Downs
Дата:
Guy Rouillier wrote:
> Steve Manes wrote:
>> I'm fairly hardcore about keeping as much business logic as I can in
>> the database.  In fact, I only do SELECTs from the application, and
>> usually via Views.  All inserts, updates and deletes are via procs.
> ...
>> And, yes, it's faster.  Particularly if business logic decisions have
>> to be made in context with a transaction.
>
> I have a thread I started ages ago over on the PERFORM list that I'm
> sadly just now being able to provide some insight on.  I'll be
> replying on that thread in more detail, but the short of it turns out
> to be that at least in this one application, using stored procs for
> inserts is slowing down the app considerably.  The app does high
> volume inserts and updates, about 16 million rows a day.  By switching
> from stored procs to inline inserts, elapsed time dropped from 2595
> seconds to 991 seconds for a test run.
>
> So the moral of the story is that, as anyone who has worked
> professionally for a number of years knows, no magic bullets exist.
> General guidelines can be helpful, but each scenario must be
> individually investigated.
>

I would suggest separating the strategy of server-side biz rules from
the implementation method of stored procedures.  You can also implement
biz rules as triggers and stick with inline inserts, updates and
deletes, which is what we do in my shop.

That being said, the enforcement of the biz rules has to be done
somewhere, and no matter where it is done it will involve disk reads and
validations.  It always *seemed* to me that putting the code on a table
trigger involves the least possible overhead for doing these things. It
provides complete real-time enforcement within the transaction with no
network overhead, and has the simplest possible interface: plain old SQL.

--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com    www.andromeda-project.org
631-689-7200   Fax: 631-689-0527
cell: 631-379-0010


Re: Writing most code in Stored Procedures

От
Steve Manes
Дата:
Guy Rouillier wrote:
> I have a thread I started ages ago over on the PERFORM list that I'm
> sadly just now being able to provide some insight on.  I'll be replying
> on that thread in more detail, but the short of it turns out to be that
> at least in this one application, using stored procs for inserts is
> slowing down the app considerably.  The app does high volume inserts and
> updates, about 16 million rows a day.  By switching from stored procs to
> inline inserts, elapsed time dropped from 2595 seconds to 991 seconds
> for a test run.
>
> So the moral of the story is that, as anyone who has worked
> professionally for a number of years knows, no magic bullets exist.
> General guidelines can be helpful, but each scenario must be
> individually investigated.

Absolutely.  You can't assume that every application is going to fit
neatly into the same development jig.  Hope the MVC frameworks zealots
don't read that. <g>

I worked on a fairly high volume site using PG, an ad trafficking
workflow application, which imported ~2 million placements daily from
DoubleClick, OAS and Accipiter.  Everything had to be imported and the
reports run and cached by 8am so the clients stare blankly at 10,000 row
Excel charts over their morning coffee.

Moving all the application-bound inserts into stored procedures didn't
achieve nearly the performance enhancement I'd assumed I'd get, which I
figured was due to the overhead of the procs themselves.


Re: Writing most code in Stored Procedures

От
Ron Johnson
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/17/07 18:00, Steve Manes wrote:
> Guy Rouillier wrote:
>> I have a thread I started ages ago over on the PERFORM list that I'm
>> sadly just now being able to provide some insight on.  I'll be
>> replying on that thread in more detail, but the short of it turns out
>> to be that at least in this one application, using stored procs for
>> inserts is slowing down the app considerably.  The app does high
>> volume inserts and updates, about 16 million rows a day.  By switching
>> from stored procs to inline inserts, elapsed time dropped from 2595
>> seconds to 991 seconds for a test run.
>>
>> So the moral of the story is that, as anyone who has worked
>> professionally for a number of years knows, no magic bullets exist.
>> General guidelines can be helpful, but each scenario must be
>> individually investigated.
>
> Absolutely.  You can't assume that every application is going to fit
> neatly into the same development jig.  Hope the MVC frameworks zealots
> don't read that. <g>
>
> I worked on a fairly high volume site using PG, an ad trafficking
> workflow application, which imported ~2 million placements daily from
> DoubleClick, OAS and Accipiter.  Everything had to be imported and the
> reports run and cached by 8am so the clients stare blankly at 10,000 row
> Excel charts over their morning coffee.
>
> Moving all the application-bound inserts into stored procedures didn't
> achieve nearly the performance enhancement I'd assumed I'd get, which I
> figured was due to the overhead of the procs themselves.

Would that be because the original app was written in a compiled
language, but the SPs in an interpreted language?

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGxjK1S9HxQb37XmcRAnYWAKCRV+INrpvl83lhnt4iadIMrBNIRgCgr8J2
UK3F87ji/24mrISLl+WmLnY=
=5csM
-----END PGP SIGNATURE-----

Re: Writing most code in Stored Procedures

От
Steve Manes
Дата:
Ron Johnson wrote:
>> Moving all the application-bound inserts into stored procedures didn't
>> achieve nearly the performance enhancement I'd assumed I'd get, which I
>> figured was due to the overhead of the procs themselves.
>
> Would that be because the original app was written in a compiled
> language, but the SPs in an interpreted language?

No, because the application language was Perl5 for both.  I think it was
just the overhead of 2 million inserts via procs versus 2 million inline
inserts (without the proc overhead).


Re: Writing most code in Stored Procedures

От
Guy Rouillier
Дата:
Steve Manes wrote:
> Moving all the application-bound inserts into stored procedures didn't
> achieve nearly the performance enhancement I'd assumed I'd get, which I
> figured was due to the overhead of the procs themselves.

That's the conclusion I'm coming to as well for my app with very high
insert/update rates.  Is there a PG knowledgebase where this kind of
information might be useful?

--
Guy Rouillier

Re: Writing most code in Stored Procedures

От
Ron Johnson
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/17/07 21:45, Steve Manes wrote:
> Ron Johnson wrote:
>>> Moving all the application-bound inserts into stored procedures didn't
>>> achieve nearly the performance enhancement I'd assumed I'd get, which I
>>> figured was due to the overhead of the procs themselves.
>>
>> Would that be because the original app was written in a compiled
>> language, but the SPs in an interpreted language?
>
> No, because the application language was Perl5 for both.  I think it was
> just the overhead of 2 million inserts via procs versus 2 million inline
> inserts (without the proc overhead).

Interesting.  Does PG have to initiate the Perl interpreter every
time you call a Perl-written SP?

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGxwiAS9HxQb37XmcRAgi2AJ9Yq2drImecZVTbZR0Wo4VKlpaiXwCgpjHo
8KcDWqDpW6BWNWCj+ZUQFU4=
=Qlpg
-----END PGP SIGNATURE-----

Re: Writing most code in Stored Procedures

От
"Josh Tolley"
Дата:
On 8/18/07, Ron Johnson <ron.l.johnson@cox.net> wrote:
> Interesting.  Does PG have to initiate the Perl interpreter every
> time you call a Perl-written SP?

IIRC PostgreSQL should only load the perl interpreter once per session.

- Josh

Re: Writing most code in Stored Procedures

От
"Joshua D. Drake"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Josh Tolley wrote:
> On 8/18/07, Ron Johnson <ron.l.johnson@cox.net> wrote:
>> Interesting.  Does PG have to initiate the Perl interpreter every
>> time you call a Perl-written SP?
>
> IIRC PostgreSQL should only load the perl interpreter once per session.

Right.

Joshua D. Drake

>
> - Josh
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>


- --

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
            UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGxxmFATb/zqfZUUQRAoL+AJ0cyZ3FaB82PGbGvC/fPYWPo9UcjwCgjiDo
U5XCNKFZhhdiJ3Lf5x68lKQ=
=4HMG
-----END PGP SIGNATURE-----

Re: Writing most code in Stored Procedures

От
Ron Johnson
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/18/07 11:08, Joshua D. Drake wrote:
> Josh Tolley wrote:
>> On 8/18/07, Ron Johnson <ron.l.johnson@cox.net> wrote:
>>> Interesting.  Does PG have to initiate the Perl interpreter every
>>> time you call a Perl-written SP?
>> IIRC PostgreSQL should only load the perl interpreter once per session.
>
> Right.

I'd have been stunned if you did it any other way.

So why is Perl-SP-INSERT so much slower than Perl-SQL-INSERT?

(I can imagine that the SP code path would be longer, but since IO
is the slowest part of the system, I'm surprised that it's *that*
much slower.)

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGxx5jS9HxQb37XmcRAn1vAKCEWTovgcj/w/uFVK0nankGdbOFuACg6HSq
dUnMN0dPsdQ8NKSDW0EahcU=
=P8gN
-----END PGP SIGNATURE-----

Re: Writing most code in Stored Procedures

От
"Joshua D. Drake"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ron Johnson wrote:
> On 08/18/07 11:08, Joshua D. Drake wrote:
>> Josh Tolley wrote:
>>> On 8/18/07, Ron Johnson <ron.l.johnson@cox.net> wrote:
>>>> Interesting.  Does PG have to initiate the Perl interpreter every
>>>> time you call a Perl-written SP?
>>> IIRC PostgreSQL should only load the perl interpreter once per session.
>> Right.
>
> I'd have been stunned if you did it any other way.
>
> So why is Perl-SP-INSERT so much slower than Perl-SQL-INSERT?
>
> (I can imagine that the SP code path would be longer, but since IO
> is the slowest part of the system, I'm surprised that it's *that*
> much slower.)

We would have to see his script as there are several variables at hand here.

1. Is he using do or execute with his perl-sql? If he is using execute
then he is getting the advantage of prepared query inserts.

2. Did he use a single transaction? If so then he is not paying for 2
million commits, which he would be with the procedure.

Sincerely,

Joshua D. Drake




- ---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq



- --

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
            UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGxzFYATb/zqfZUUQRAvUWAKCHm4EOtmrblRcnoVLVSEA0/Hd1JQCfTbgu
4tRGmCvqZSuBsGvQyUOQBzU=
=edfC
-----END PGP SIGNATURE-----

Re: Writing most code in Stored Procedures

От
Steve Manes
Дата:
Ron Johnson wrote:
> Interesting.  Does PG have to initiate the Perl interpreter every
> time you call a Perl-written SP?

I mean the *application* language was Perl for both the inline insert
and the proc call.  The proc was written in plpgsql.

Re: Writing most code in Stored Procedures

От
Guy Rouillier
Дата:
Ron Johnson wrote:

> So why is Perl-SP-INSERT so much slower than Perl-SQL-INSERT?
>
> (I can imagine that the SP code path would be longer, but since IO
> is the slowest part of the system, I'm surprised that it's *that*
> much slower.)

I'm guessing that since PG allows overloaded SP names, the slowness is
coming from resolving which SP to run.  But that is just a guess.  In my
environment, I don't *have* overloaded SPs, only a single version of a
given name.  But when I was doing the conversion from Oracle, it took me
  a couple tries to get the SP signatures correct, so I'm pretty sure PG
is still going through the resolution logic, even if you only have a
single instance of a given name.

--
Guy Rouillier