Discussion: stored procedures and dynamic queries
Any general rule about dynamically generated queries in stored
procedures vs. performance? I was going to write stuff as simple as

    create or replace function EditQty(int, int, int, varchar(10))
    returns boolean as '
    declare
        _uid alias for $1;
        _aid alias for $2;
        _qty alias for $3;
        _table alias for $4;
        _modified timestamp;
    begin
        _table := ''shop_commerce_basket'' || _table;
        _modified := now();
        update _table set qty = _qty, modified = _modified
            where uid = _uid and aid = _aid;
        if not found then
            insert into _table (uid, aid, qty) values (_uid, _aid, _qty);
        end if;
    end;
    ' language plpgsql;

Is it going to perform worse than with a static table name? Where can
I find some clue about the effects of similar decisions?

thx

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Ivan Sergio Borgonovo wrote:
> Any general rule about dynamically generated queries in stored
> procedures vs. performance?

It's the same as any prepared-plan vs. plan-each-time trade-off. A
query built using EXECUTE will have to be planned each time. That
costs you something, but it means the plan will have all the
information it needs. A pre-planned query saves planning time on the
second, third etc. runs, but the plan won't change with the query
parameters.

So: a query that's going to be executed a million times in a loop with
the same plan each time implies you want a pre-planned query. A query
executed once, or one where changing input parameters would benefit
from changing plans, implies you want to re-plan each time.

If you don't have a good reason to think it matters one way or the
other, then it probably doesn't. If it's not worth testing, then it's
not worth optimising either.

Of course, in your example the query wouldn't work at all - you'd need
to use the EXECUTE command in plpgsql.

--
Richard Huxton
Archonet Ltd
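[For posterity, a minimal sketch of the EXECUTE-based version Richard describes, reusing the names from the original post; the quote_ident call and the GET DIAGNOSTICS row count are additions not shown in the thread. Note that in plpgsql EXECUTE does not set FOUND, hence GET DIAGNOSTICS:]

```sql
-- Rebuilding the query text on every call means it is re-planned
-- each time, so the table name can vary per call.
create or replace function EditQty(int, int, int, varchar(10))
returns boolean as $$
declare
    _uid   alias for $1;
    _aid   alias for $2;
    _qty   alias for $3;
    _table text := 'shop_commerce_basket' || $4;
    _rows  integer;
begin
    execute 'update ' || quote_ident(_table)
         || ' set qty = '  || _qty
         || ', modified = now()'
         || ' where uid = ' || _uid || ' and aid = ' || _aid;
    -- EXECUTE does not set FOUND; use GET DIAGNOSTICS instead
    get diagnostics _rows = row_count;
    if _rows = 0 then
        execute 'insert into ' || quote_ident(_table)
             || ' (uid, aid, qty) values ('
             || _uid || ', ' || _aid || ', ' || _qty || ')';
    end if;
    return true;
end;
$$ language plpgsql;
```

[Concatenating the parameters directly into the query string is tolerable here only because they are typed as int; any string parameter would need quote_literal to avoid SQL injection.]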
On Mon, 03 Dec 2007 19:06:29 +0000
Richard Huxton <dev@archonet.com> wrote:

> Ivan Sergio Borgonovo wrote:
> > Any general rule about dynamically generated queries in stored
> > procedures vs. performance?
>
> It's the same decision as any prepared-plan vs. plan-each-time
> trade-off.

Should I guess this trade-off by consulting haruspices, or is it
possible to gain a little bit of culture, or is it something based on
heuristics and experience? If the latter, any good reading?

How should I take into account parameters like:
- is it an insert/update vs. a select query?
- is it performed on a large table?
- is it performed frequently? frequently with the same parameters?
  frequently with different parameters?
- is the table growing?
- are there frequent deletes?
etc... I'm just guessing at things that may have an impact.

> Of course, in your example the query wouldn't work at all - you'd
> need to use the EXECUTE command in plpgsql.

OK, it was just adapted from a more complicated one and I forgot to
fix it.

thx

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Ivan Sergio Borgonovo wrote:
> On Mon, 03 Dec 2007 19:06:29 +0000
> Richard Huxton <dev@archonet.com> wrote:
>
>> Ivan Sergio Borgonovo wrote:
>>> Any general rule about dynamically generated queries in stored
>>> procedures vs. performance?
>> It's the same decision as any prepared-plan vs. plan-each-time
>> trade-off.
>
> Should I guess this trade-off by consulting haruspices, or is it
> possible to gain a little bit of culture, or is it something based
> on heuristics and experience?

Unless it's an obvious decision (millions of small identical queries
vs. occasional large complex ones) then you'll have to test. That's
going to be true of any decision like this on any system.

> If the latter, any good reading?
> How should I take into account parameters like:
> - is it an insert/update vs. a select query?
> - is it performed on a large table?
> - is it performed frequently? frequently with the same parameters?
>   frequently with different parameters?
> - is the table growing?
> - are there frequent deletes?

Read the manuals, particularly the sections on MVCC, and also the
planner and EXPLAIN. Read up on the WAL and checkpoints. You might
want to skim through the archives of the performance list. You might
find some of the community-related resources useful too. Some of the
following might be a little out of date, so check:

http://www.westnet.com/~gsmith/content/postgresql/
http://www.powerpostgresql.com/PerfList
http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php

HTH

--
Richard Huxton
Archonet Ltd
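[As a concrete starting point for the planner reading suggested here, the pre-planned vs. re-planned distinction can be seen directly in psql; the table name below is only illustrative:]

```sql
-- A prepared statement is planned once, before the parameter value is
-- known; a plain query is planned with the actual value visible.
-- Comparing the two EXPLAIN outputs shows what pre-planning costs you.
prepare get_basket(int) as
    select aid, qty from shop_commerce_basket1 where uid = $1;

explain execute get_basket(42);  -- generic plan, chosen without the value

explain select aid, qty
from shop_commerce_basket1
where uid = 42;                  -- plan chosen with the actual value

deallocate get_basket;
```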
On Tue, 04 Dec 2007 08:14:56 +0000
Richard Huxton <dev@archonet.com> wrote:

> Unless it's an obvious decision (millions of small identical
> queries vs. occasional large complex ones) then you'll have to
> test. That's going to be true of any decision like this on any
> system.

:(

I'm trying to grasp a general idea from the viewpoint of a developer
rather than a sysadmin. At this moment I'm not interested in
optimisation; I'm interested in understanding the trade-offs of
certain decisions in the face of a cleaner interface. Most of the
documents available are from a sysadmin point of view.

That makes me think that unless I write terrible SQL it won't make a
big difference, and the first place I'll have to look if the
application needs to run faster is the pg config.

This part (for posterity) looks like the most interesting for
developers:
http://www.gtsm.com/oscon2003/toc.html
starting from Functions.

Still, there are some things I can't understand; I'll come back.

Thanks for the right pointers.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Ivan Sergio Borgonovo wrote:
> On Tue, 04 Dec 2007 08:14:56 +0000
> Richard Huxton <dev@archonet.com> wrote:
>
>> Unless it's an obvious decision (millions of small identical
>> queries vs. occasional large complex ones) then you'll have to
>> test. That's going to be true of any decision like this on any
>> system.
>
> :(
>
> I'm trying to grasp a general idea from the viewpoint of a developer
> rather than a sysadmin. At this moment I'm not interested in
> optimisation; I'm interested in understanding the trade-offs of
> certain decisions in the face of a cleaner interface.

Always go for the cleaner design. If it turns out that isn't fast
enough, *then* start worrying about having a bad but faster design.

> Most of the documents available are from a sysadmin point of view.
> That makes me think that unless I write terrible SQL it won't make a
> big difference, and the first place I'll have to look if the
> application needs to run faster is the pg config.

The whole point of an RDBMS is that you don't have to worry about
this. If you have to start tweaking the fine details of these things,
then that's the point where the RDBMS has reached its limits. In a
perfect world you wouldn't need to configure PG either, but it's not
that clever, I'm afraid.

Keep your database design clean, likewise with your queries, consider
whether you can cache certain results, and get everything working
first. Then look for where the bottlenecks are: do you have any
unexpectedly long-running queries? (See www.pgfoundry.org for some
tools to help with log analysis.)

> This part (for posterity) looks like the most interesting for
> developers:
> http://www.gtsm.com/oscon2003/toc.html
> starting from Functions.

Note that this is quite old now, so some performance-related
assumptions will be wrong for current versions of PG.

> Still, there are some things I can't understand; I'll come back.

--
Richard Huxton
Archonet Ltd
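[The hunt for unexpectedly long-running queries that Richard mentions usually starts with the server log; a minimal postgresql.conf fragment for that (the threshold value is only an example):]

```ini
# Log every statement that takes longer than 200 ms, for later
# analysis with the log-analysis tools hosted on pgfoundry.org.
log_min_duration_statement = 200
```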
--- Richard Huxton <dev@archonet.com> wrote:

> Ivan Sergio Borgonovo wrote:
> > On Tue, 04 Dec 2007 08:14:56 +0000
> > Richard Huxton <dev@archonet.com> wrote:
> >
> >> Unless it's an obvious decision (millions of small identical
> >> queries vs. occasional large complex ones) then you'll have to
> >> test. That's going to be true of any decision like this on any
> >> system.
> >
> > :(
> >
> > I'm trying to grasp a general idea from the viewpoint of a
> > developer rather than a sysadmin. At this moment I'm not
> > interested in optimisation; I'm interested in understanding the
> > trade-offs of certain decisions in the face of a cleaner
> > interface.
>
> Always go for the cleaner design. If it turns out that isn't fast
> enough, *then* start worrying about having a bad but faster design.

I don't know about "clean", but the mantra here is "First you make it
provably correct, then you make it fast." I am a fan of making things
as simple as practicable, BUT NO SIMPLER. I don't know how that meshes
with "clean", but the general idea is right.

If you look at my code, my C++ and my Java code is infinitely better
than my SQL, so with either of those I end up making fewer variants,
and my starting point is generally much closer to my end point; that
is just a function of experience. With my SQL code, I generally find
myself producing a variety of scripts to support a given task, and
these include exploration of just about everything SQL has to offer. I
TRY everything, from joins to correlated subqueries to stored
procedures to views, and beyond. And I am still trying to develop an
intuition as to which options will serve best in a given situation,
analogous to the design patterns I routinely use in my C++ and Java
code.

Your reaction to Richard's advice to test seems to imply you want a
more direct approach to "THE" answer. That is chasing something that
doesn't exist!

I tell you, as a developer (and NOT a sysadmin), there is no
substitute for experience, and THAT can ONLY be obtained through trial
and error. I haven't seen any books about design patterns in SQL
(while there are plenty for C++ and Java, and other languages), so
there is no short cut. In short, the general idea most developers I
know work with is "code and test, and then code again and test again,
until you've tried the variety of options that exist." There is no
substitute for testing yourself.

I have seen repeated advice to replace correlated subqueries by left
joins, claiming that ubiquitous experience is that the joins are
faster; and yet, in my tests, in most cases there was little
difference in speed, while in others the correlated subqueries were
faster. So it appears the advice was based on experience with dated
software, and the RDBMS in question had subsequently dramatically
improved how it handled correlated subqueries. And my use of EXPLAIN
confirmed what I was finding through testing.

> > Most of the documents available are from a sysadmin point of view.
> > That makes me think that unless I write terrible SQL it won't make
> > a big difference, and the first place I'll have to look if the
> > application needs to run faster is the pg config.
>
> The whole point of an RDBMS is that you don't have to worry about
> this. If you have to start tweaking the fine details of these
> things, then that's the point where the RDBMS has reached its
> limits. In a perfect world you wouldn't need to configure PG either,
> but it's not that clever, I'm afraid.

I am not sure I buy this, if I properly understand it. Trust me, I
have written some really bad but simple queries that took hours to
complete a task that was completed in less than a minute with smarter
code. And yet the "bad" code I'd written was similar in nature to
examples used in some texts to explain ideas in SQL.

The point is, until you get extensive experience in SQL programming
and optimization, you won't know what is bad code until you test it.

Personally, I rely on the sysadmin to administer the RDBMS properly,
to ensure it is configured appropriately for our application, AND I
ask his or her advice and input on how I design and implement my SQL
code, as well as for input on distributed application architecture.
You can't do it all. On my development machine, I just use whatever
the default configuration is, so I have it up and running in a flash
and can focus on my development. I'll change that configuration ONLY
if the sysadmin tells me there is a problem with the default.

My advice, therefore, is to forget about configuration issues and
focus on getting your SQL right, and then fast, and let your sysadmin
advise, and possibly help, with changes to your configuration should
he or she feel it needs to be modified to better match how your
application will behave once in production.

> Keep your database design clean, likewise with your queries,
> consider whether you can cache certain results, and get everything
> working first.

Richard, could you possibly clarify what you mean by a "clean design"?
Is it different from what I normally do with regard to ensuring all
the requisite data is available, properly normalized, with a suitable
suite of indices, keys, &c., and as simple as practicable? (That is,
over-simplification is avoided.) I also tend to ensure that all user
access to the data is through either a stored procedure or a read-only
view (perhaps with a little paranoia thrown in ;).

Ted
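[To make the correlated-subquery vs. join comparison above concrete, here are two equivalent formulations one might time against each other; the users table is hypothetical, and only EXPLAIN ANALYZE on your own data can say which wins:]

```sql
-- Users with no rows in the basket table, written both ways.

-- 1. correlated subquery
select u.uid
from users u
where not exists (select 1
                  from shop_commerce_basket1 b
                  where b.uid = u.uid);

-- 2. left join / IS NULL
select u.uid
from users u
left join shop_commerce_basket1 b on b.uid = u.uid
where b.uid is null;

-- Prefix either query with EXPLAIN ANALYZE to compare the actual
-- plans and timings rather than guessing.
```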
On Tue, 04 Dec 2007 13:54:15 +0000
Richard Huxton <dev@archonet.com> wrote:

> Always go for the cleaner design. If it turns out that isn't fast
> enough, *then* start worrying about having a bad but faster design.

mmm, yeah, right. I expressed myself badly. What I mean is that I
first have to know the boundaries in order to know what a good design
is. I'm ready to refactor... I'd just like to avoid having to do it
out of ignorance of common knowledge about good practice.

BTW, still a good reading for devs:
http://www.gtsm.com/oscon2004/

> > Most of the documents available are from a sysadmin point of view.
> > That makes me think that unless I write terrible SQL it won't make
> > a big difference, and the first place I'll have to look if the
> > application needs to run faster is the pg config.
>
> The whole point of an RDBMS is that you don't have to worry about
> this. If you have to start tweaking the fine details of these

This will definitely be the last resort. These days you can't wear as
many hats as before.

> Keep your database design clean, likewise with your queries,
> consider whether you can cache certain results, and get everything
> working first.

In the end... if you don't look too much at the details, everything
will reach a defined deterministic state after all ;)

> Note that this is quite old now, so some performance-related
> assumptions will be wrong for current versions of PG.

I noticed. Maybe this will be part of some other question later.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
As a side note, there is actually a book on design patterns in SQL,
although I personally haven't read it. From the reviews I recall, I
think it's mostly based on Oracle features. It still might be a good
read as far as PostgreSQL is concerned, except for the sections on
graphs and recursive trees, since Oracle has special syntactic sugar
for that kind of stuff that is unique to Oracle.

http://www.rampant-books.com/book_2006_1_sql_coding_styles.htm

Hope that helps,
Regina

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Ted Byers
Sent: Tuesday, December 04, 2007 9:59 AM
To: Richard Huxton; Ivan Sergio Borgonovo
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] stored procedures and dynamic queries