Обсуждение: Re: [BUGS] BUG #3244: problem with PREPARE

Поиск
Список
Период
Сортировка

Re: [BUGS] BUG #3244: problem with PREPARE

От
Tom Lane
Дата:
"William Lawrance" <bill.lawrance@bull.com> writes:
> This program that does   "PQprepare" and then
> "PQexecPrepared" has worked previously, but doesn't
> work now.
> ...
>     strcpy(openStmt, "declare C1 cursor for select cola" 
>                        " from tprep"
>                        " where cola = $1");
>     res = PQprepare(conn, "stmtopen", openStmt, 0, 0);

I looked into this a bit and found that the issue comes from my recent
changes in support of plan caching.  To simplify matters, I instituted
a rule that utility statements don't have any interesting
transformations done at parse analysis time; see this new comment in
analyze.c:
* For optimizable statements, we are careful to obtain a suitable lock on* each referenced table, and other modules of
thebackend preserve or* re-obtain these locks before depending on the results.  It is therefore* okay to do significant
semanticanalysis of these statements.  For* utility commands, no locks are obtained here (and if they were, we could*
notbe sure we'd still have them at execution).  Hence the general rule* for utility commands is to just dump them into
aQuery node untransformed.* parse_analyze does do some purely syntactic transformations on CREATE TABLE* and ALTER
TABLE,but that's about it.  In cases where this module contains* mechanisms that are useful for utility statements, we
provideseparate* subroutines that should be called at the beginning of utility execution;* an example is
analyzeIndexStmt.

This means that "preparing" a DECLARE CURSOR is now effectively a no-op;
it doesn't do much more than detect basic syntax errors that the Bison
grammar can catch.  If you run this program without having created
the tprep table, the PQprepare doesn't fail!  But the bigger problem,
at least for Bill's complaint, is that we also don't notice, let alone
assign datatypes to, any parameter symbols appearing in the query.

I don't see any particular problem in this for the other command types
that had their analyze-time processing removed; there's no value in a
parameter in CREATE VIEW, for example.  But evidently there's some
interest in having parameters in prepared DECLARE CURSOR commands.

The easiest answer I can think of at the moment is to run parse analysis
for a DECLARE CURSOR and then throw away the result.  To avoid this
overhead in cases where it's useless, we could probably teach analyze.c
to do it only if p_variableparams is true (which essentially would mean
that the DECLARE CURSOR came in via PQprepare or equivalent, and not as
a simply executable statement).

Plan B would be to promote DECLARE CURSOR to an "optimizable statement"
that is treated under the same rules as SELECT/UPDATE/etc, in particular
that we assume locks obtained at analysis are held through to execution.
This might be a cleaner answer overall, but I have no idea right now
about the effort required or any possible downsides.

Comments, better ideas?
        regards, tom lane


Re: [BUGS] BUG #3244: problem with PREPARE

От
"William Lawrance"
Дата:
We were about to submit a patch to ECPG to improve the performance
of embedded SQL, when we discovered that PREPARE had quit working.

Background:

We have a benchmark for a very large customer that consists of several
hundred programs containing several thousand embedded SQL statements.
In a three hour execution, millions of calls are made to the DBMS server.
This benchmark has been successfully executed using Oracle, DB2, and
PostgreSQL. In the benchmark, Postgres is shown to be slower, by far, than
the other DBMS systems.

While searching for ways to improve the PostgreSQL performance,
we noticed that query plans are not saved and re-used in the server
if accessed from the ECPG interface. In the customer benchmark this is key.
The program processes a large input file of work. For each item in the file
there is a large sequence of the application and SQL that must be executed.
For each subsequent item of the input file, many of the same SQL statements
are processed.

What we found was that each SQL statement was constructed by ECPG as
an ASCII string and presented to the postmaster to be re-optimized each
time the SQL was received. We found that using the 'prepare' interface
to save the query plan after the first execution would be a
significant savings - approximately 30% of the elapsed time of the
application.  The reduction in elapsed time was over 1 hour for the
benchmark.

Details:

Our first attempt to use the ECPG prepare interface revealed that ECPG
doesn't use the PQlib prepare function. The ECPG prepare replaces any
parameters with their values and presents a new SQL statement to the
postmaster each time. We then tried to use the PQlib prepare interface.
There are several difficulties to be encountered when attempting to use
this within a program using the ECPG interface. For example, the
connection structure for PQlib isn't readily available, and the
transaction semantics must be synchronized with ECPG's state. This did
work, but it was fairly clumsy.

Since we wanted to do this in a cleaner manner, and also wished to avoid
changing the applications if possible, we used the following approach:
   Within the "execute.c" module, we added routines to manage a cache   of prepared statements. These routines are able
tosearch, insert,   and delete entries in the cache. The key for these cache entries is   the text of the SQL statement
aspassed by ECPG from the application   program.
 
   Within the same module, we replaced the "ECPGexecute" function.   This is the function that is called to execute a
statementafter   some preliminary housekeeping is done. The original "ECPGexecute"   function constructs an ASCII
stringby replacing each host variable   with its current value and then calling "PQexec". The new   "ECPGexecute"
functiondoes the following:
 
     - build an array of the current values of the host variables.
     - search the cache for an entry indicating that this statement       has already been prepare'd, via  "PQprepare"
     - If no entry was found in the previous step, call "PQprepare"       for the statement and then insert an entry
forit into the       cache. If this requires an entry to be re-used, execute a       "DEALLOCATE PREPARE.." for the
previouscontents.
 
     - At this point, the SQL statement has been prepare'd by PQlib,       either when the statement was executed in
thepast, or in       the previous step.
 
     - call "PQexecPrepared", using the array of parameters built       in the first step above.






-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Sunday, April 22, 2007 7:53 PM
To: William Lawrance
Cc: pgsql-bugs@postgresql.org; pgsql-hackers@postgresql.org
Subject: Re: [BUGS] BUG #3244: problem with PREPARE


"William Lawrance" <bill.lawrance@bull.com> writes:
> This program that does   "PQprepare" and then
> "PQexecPrepared" has worked previously, but doesn't
> work now.
> ...
>     strcpy(openStmt, "declare C1 cursor for select cola"
>                        " from tprep"
>                        " where cola = $1");
>     res = PQprepare(conn, "stmtopen", openStmt, 0, 0);

I looked into this a bit and found that the issue comes from my recent
changes in support of plan caching.  To simplify matters, I instituted
a rule that utility statements don't have any interesting
transformations done at parse analysis time; see this new comment in
analyze.c:
* For optimizable statements, we are careful to obtain a suitable lock on* each referenced table, and other modules of
thebackend preserve or* re-obtain these locks before depending on the results.  It is therefore* okay to do significant
semanticanalysis of these statements.  For* utility commands, no locks are obtained here (and if they were, we could*
notbe sure we'd still have them at execution).  Hence the general rule* for utility commands is to just dump them into
aQuery node
 
untransformed.* parse_analyze does do some purely syntactic transformations on CREATE
TABLE* and ALTER TABLE, but that's about it.  In cases where this module
contains* mechanisms that are useful for utility statements, we provide separate* subroutines that should be called at
thebeginning of utility execution;* an example is analyzeIndexStmt.
 

This means that "preparing" a DECLARE CURSOR is now effectively a no-op;
it doesn't do much more than detect basic syntax errors that the Bison
grammar can catch.  If you run this program without having created
the tprep table, the PQprepare doesn't fail!  But the bigger problem,
at least for Bill's complaint, is that we also don't notice, let alone
assign datatypes to, any parameter symbols appearing in the query.

I don't see any particular problem in this for the other command types
that had their analyze-time processing removed; there's no value in a
parameter in CREATE VIEW, for example.  But evidently there's some
interest in having parameters in prepared DECLARE CURSOR commands.

The easiest answer I can think of at the moment is to run parse analysis
for a DECLARE CURSOR and then throw away the result.  To avoid this
overhead in cases where it's useless, we could probably teach analyze.c
to do it only if p_variableparams is true (which essentially would mean
that the DECLARE CURSOR came in via PQprepare or equivalent, and not as
a simply executable statement).

Plan B would be to promote DECLARE CURSOR to an "optimizable statement"
that is treated under the same rules as SELECT/UPDATE/etc, in particular
that we assume locks obtained at analysis are held through to execution.
This might be a cleaner answer overall, but I have no idea right now
about the effort required or any possible downsides.

Comments, better ideas?
        regards, tom lane



Re: [BUGS] BUG #3244: problem with PREPARE

От
Michael Meskes
Дата:
On Mon, Apr 23, 2007 at 02:02:04PM -0700, William Lawrance wrote:
> Our first attempt to use the ECPG prepare interface revealed that ECPG
> doesn't use the PQlib prepare function. The ECPG prepare replaces any
> parameters with their values and presents a new SQL statement to the

This is true and should also be documented. The reason for this
behaviour is simply that ECPG prepare feature was added before the
backend had its own prepare feature. And no one changed it so far.

> There are several difficulties to be encountered when attempting to use
> this within a program using the ECPG interface. For example, the
> connection structure for PQlib isn't readily available, and the
> transaction semantics must be synchronized with ECPG's state. This did
> work, but it was fairly clumsy.

Right, that's what makes it non trivial.

> Since we wanted to do this in a cleaner manner, and also wished to avoid
> changing the applications if possible, we used the following approach:
> 
>     Within the "execute.c" module, we added routines to manage a cache
>     of prepared statements. These routines are able to search, insert,
>     and delete entries in the cache. The key for these cache entries is
>     the text of the SQL statement as passed by ECPG from the application
>     program.
> 
>     Within the same module, we replaced the "ECPGexecute" function.
>     This is the function that is called to execute a statement after
>     some preliminary housekeeping is done. The original "ECPGexecute"
>     function constructs an ASCII string by replacing each host variable
>     with its current value and then calling "PQexec". The new
>     "ECPGexecute" function does the following:
> 
>       - build an array of the current values of the host variables.
> 
>       - search the cache for an entry indicating that this statement
>         has already been prepare'd, via  "PQprepare"
> 
>       - If no entry was found in the previous step, call "PQprepare"
>         for the statement and then insert an entry for it into the
>         cache. If this requires an entry to be re-used, execute a
>         "DEALLOCATE PREPARE.." for the previous contents.
> 
>       - At this point, the SQL statement has been prepare'd by PQlib,
>         either when the statement was executed in the past, or in
>         the previous step.
> 
>       - call "PQexecPrepared", using the array of parameters built
>         in the first step above.

Does this mean you prepare ALL statements? Or where you only talking
about statements that are prepared in the application?

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!


Re: [BUGS] BUG #3244: problem with PREPARE

От
Bruce Momjian
Дата:
Michael Meskes wrote:
> On Mon, Apr 23, 2007 at 02:02:04PM -0700, William Lawrance wrote:
> > Our first attempt to use the ECPG prepare interface revealed that ECPG
> > doesn't use the PQlib prepare function. The ECPG prepare replaces any
> > parameters with their values and presents a new SQL statement to the
> 
> This is true and should also be documented. The reason for this
> behaviour is simply that ECPG prepare feature was added before the
> backend had its own prepare feature. And no one changed it so far.

It is in the TODO:
       o Use backend PREPARE/EXECUTE facility for ecpg where possible

--  Bruce Momjian  <bruce@momjian.us>          http://momjian.us EnterpriseDB
http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: [BUGS] BUG #3244: problem with PREPARE

От
"William Lawrance"
Дата:
In the modified version of ECPG that we used for our benchmark, we
PREPARE'd all statements.


-----Original Message-----
From: Michael Meskes [mailto:meskes@postgresql.org]
Sent: Tuesday, April 24, 2007 12:45 AM
To: William Lawrance
Cc: Tom Lane; pgsql-bugs@postgresql.org; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [BUGS] BUG #3244: problem with PREPARE


On Mon, Apr 23, 2007 at 02:02:04PM -0700, William Lawrance wrote:
> Our first attempt to use the ECPG prepare interface revealed that ECPG
> doesn't use the PQlib prepare function. The ECPG prepare replaces any
> parameters with their values and presents a new SQL statement to the

This is true and should also be documented. The reason for this
behaviour is simply that ECPG prepare feature was added before the
backend had its own prepare feature. And no one changed it so far.

> There are several difficulties to be encountered when attempting to use
> this within a program using the ECPG interface. For example, the
> connection structure for PQlib isn't readily available, and the
> transaction semantics must be synchronized with ECPG's state. This did
> work, but it was fairly clumsy.

Right, that's what makes it non trivial.

> Since we wanted to do this in a cleaner manner, and also wished to avoid
> changing the applications if possible, we used the following approach:
> 
>     Within the "execute.c" module, we added routines to manage a cache
>     of prepared statements. These routines are able to search, insert,
>     and delete entries in the cache. The key for these cache entries is
>     the text of the SQL statement as passed by ECPG from the application
>     program.
> 
>     Within the same module, we replaced the "ECPGexecute" function.
>     This is the function that is called to execute a statement after
>     some preliminary housekeeping is done. The original "ECPGexecute"
>     function constructs an ASCII string by replacing each host variable
>     with its current value and then calling "PQexec". The new
>     "ECPGexecute" function does the following:
> 
>       - build an array of the current values of the host variables.
> 
>       - search the cache for an entry indicating that this statement
>         has already been prepare'd, via  "PQprepare"
> 
>       - If no entry was found in the previous step, call "PQprepare"
>         for the statement and then insert an entry for it into the
>         cache. If this requires an entry to be re-used, execute a
>         "DEALLOCATE PREPARE.." for the previous contents.
> 
>       - At this point, the SQL statement has been prepare'd by PQlib,
>         either when the statement was executed in the past, or in
>         the previous step.
> 
>       - call "PQexecPrepared", using the array of parameters built
>         in the first step above.

Does this mean you prepare ALL statements? Or where you only talking
about statements that are prepared in the application?

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!


Re: [BUGS] BUG #3244: problem with PREPARE

От
Tom Lane
Дата:
I wrote:
> The easiest answer I can think of at the moment is to run parse analysis
> for a DECLARE CURSOR and then throw away the result.  To avoid this
> overhead in cases where it's useless, we could probably teach analyze.c
> to do it only if p_variableparams is true (which essentially would mean
> that the DECLARE CURSOR came in via PQprepare or equivalent, and not as
> a simply executable statement).

> Plan B would be to promote DECLARE CURSOR to an "optimizable statement"
> that is treated under the same rules as SELECT/UPDATE/etc, in particular
> that we assume locks obtained at analysis are held through to execution.
> This might be a cleaner answer overall, but I have no idea right now
> about the effort required or any possible downsides.

I looked into Plan B a bit more, and decided that the least ugly way to
deal with DECLARE CURSOR is to treat it a bit like SELECT INTO: it is a
SELECT for purposes of parsing and planning, and then we have to divert
just before calling the executor.  This would require special-casing in
pretty nearly just the same places that currently treat SELECT INTO as
a special case.  The query representation would be:

In raw grammar output: same as now, ie, a DeclareCursorStmt with a
raw SelectStmt tree below it.

After parse analysis: a CMD_SELECT Query with nonempty utilityStmt
(the original DeclareCursorStmt, but with its query field now NULL
since we don't need the raw grammar output anymore; or maybe we should
invent a separate node type for the purpose).

After planning: a PlannedStmt.  Just as PlannedStmt carries an "into"
field for SELECT INTO, it'd have to carry a field for DECLARE CURSOR.

There seem to be enough places that know about SELECT INTO that this'd
be a bit tedious to do --- most of a day's work probably.  Is it worth
the trouble, or should I just do the klugy fix?  Thoughts?
        regards, tom lane