Обсуждение: Re: :PgSQL: More Queestions

Поиск
Список
Период
Сортировка

Re: :PgSQL: More Queestions

От
David Wheeler
Дата:
On Tuesday, November 19, 2002, at 03:42  PM, Jeff Urlwin wrote:

> You probably only need dTHR to support older, pre-threading perls.  I 
> don't
> believe you need the #ifdef, but it can't hurt (except visually in your
> code).

Okay. What is it?

>>
>> * In dbd_st_prepare(), is there a way to determine NUM_OF_FIELDS, 
>> NAME,
>> etc. -- that is, before executing the statement?
>
> Only if you want to fully parse the SQL :)

Okay, that makes sense. Thanks.

> DBD::ODBC, specifically doesn't handle comments.  Reasoning: comments 
> are
> (typically) DB vendor specific and I do not believe that ODBC itself
> declares a comment capability.  Therefore, it's really not safe for
> something like DBD::ODBC to look for them.  It may be for you.  I do 
> not see
> *much* benefit in adding comments to the queries themselves, within 
> perl,
> but then again, if you had perl read a file of SQL Statements and have 
> it
> generically prepare() and execute() them, then, there's probably value
> there.

Yeah. We have a Bricolage script that does that, but, as it happens, we 
were parsing out comments in Perl before passing them to DBD:Pg's 
prepare(). I didn't notice that there were any tests in DBD::Pg's test 
suite for comments, either. And it's not documented. So the way I look 
at it, if I leave the current parser, I'll leave the comment parsing 
part. But if I change it (as I'm seriously considering, in light of 
PostgreSQL 7.3's support for prepared statements), I'll probably do no 
parsing for comments.

> In the preparse(), we're looking for placeholders to notify DBI that 
> we need
> specific parameters to execute the query and, in the case of DBD::ODBC,
> later notify the ODBC Driver that we are binding parameters (and what 
> type
> they are, VARCHAR, etc).  Then the Driver does the binding in whatever 
> DBMS
> specific way it needs to.  You may have to do more, as you *are* the 
> driver.
> Note that there is also a way in DBD::ODBC to ignore :foo style 
> parameters
> because some databases use that for syntax in stored procedures or 
> triggers.
> For example, with Oracle a trigger can access :old.column_name or
> :new.column_name and DBD::ODBC allows you to turn off treating 
> :anything as
> a bind variable to support that.  You may not need that...

I understand that the goal is to convert the placeholders from '?' or 
':nn' to the PostgreSQL internal version (':pn'). What I'm referring to 
specifically, however, is this snippet from DBD::Pg:
        if (in_literal) {            /* Check if literal ends but keep quotes in literal */            if (*src ==
in_literal){                int bs = 0;                char *str;                str = src-1; /* Back a character. */
            while (*(str - bs) == '\\')                    bs++;                if (!(bs & 1))                    /* bs
isan even number? */                    in_literal = 0;            }            *dest++ = *src++;            continue;
     }
 

in_literal is set when the last character wasn't a placeholder 
character ('?' or ':') and is either a single or a double quotation 
mark. So while I understand that one might want to ignore placeholder 
characters, I don't really understand what the above code is doing. 
Probably'll be easier for me after I've been looking a C for a while...

Maybe it's just too complex, because, looking at DBD::ODBC's 
dbd_preparse(), the handling of literals in the query seems a good deal 
more straight-forward (though it doesn't appear to handle '\'' or "\"" 
-- am I reading that right?

> It's going to depend upon what you need to handle.  For the most part, 
> it
> shouldn't change after the prepare, but in DBD::ODBC, for example, 
> it's more
> complex because some statements can return multiple result sets.

Ah, that makes sense. Not sure if it's an issue for PostgreSQL, but I 
doesn't appear to be much of an overhead to set it on a per-execute 
basis...

> svp is a temporary reference to obtain a pointer to a scalar value 
> (scalar
> value pointer).  You are then casting it to a pointer to a phs_t, which
> holds your parameter information.  You'll "create" the phs_t instances 
> when
> you preparse the query.  In DBD::ODBC, the ftype is queried from the 
> driver
> itself (may go back to the database for information) to determine if 
> it's
> numeric, varchar, etc.  The phs_t instance can hold whatever you need 
> to
> track the parameter (type, scale, etc).  Some drivers assume 
> everything is a
> varchar and the database itself performs the conversion.
>
> So: all_params_hv is in your statement handle and you actually put the
> information in the all_params_hv hash when you preparse the statement 
> (and
> update it during execute/bind_param).  Each "phs_t" represents a 
> parameter
> in the query.  The information contained in the phs_t instance comes 
> from
> your parsing the query and, possibly querying the database to 
> determine the
> type of the parameter.
>
> Also, the all_params_hv is handy in handling the relatively new DBI
> attribute ParamValues...

Yes, thank you. I'm spending a lot of time right now just studying the 
code in dbd_preparse() and dbd_st_execute() in both DBD::Pg and 
DBD::ODBC. I think I'll keep at it until I feel I understand it as well 
as I can, and then post my thoughts on what to do next.

Thanks,

David

-- 
David Wheeler                                     AIM: dwTheory
david@wheeler.net                                 ICQ: 15726394
http://david.wheeler.net/                      Yahoo!: dew7e                                               Jabber:
Theory@jabber.org



Re: :PgSQL: More Queestions

От
Adam Witney
Дата:
> I understand that the goal is to convert the placeholders from '?' or
> ':nn' to the PostgreSQL internal version (':pn'). What I'm referring to
> specifically, however, is this snippet from DBD::Pg:
> 
>        if (in_literal) {
>            /* Check if literal ends but keep quotes in literal */
>            if (*src == in_literal) {
>                int bs = 0;
>                char *str;
>                str = src-1; /* Back a character. */
>                while (*(str - bs) == '\\')
>                    bs++;
>                if (!(bs & 1))
>                    /* bs is an even number? */
>                    in_literal = 0;
>            }
>            *dest++ = *src++;
>            continue;
>        }
> 
> in_literal is set when the last character wasn't a placeholder
> character ('?' or ':') and is either a single or a double quotation
> mark. So while I understand that one might want to ignore placeholder
> characters, I don't really understand what the above code is doing.
> Probably'll be easier for me after I've been looking a C for a while...

One thing you may want to keep in mind is the situation concerning array
slices

If you send a statement like this

my $sth = $dbh->prepare("select cube[1:2][1][1] from test4");

Then the :2 gets treated as a placeholder and gets substituted. From the
trace

dbd_st_execute: statement = >select cube[1NULL][1][1] from test4<
ERROR:  parser: parse error at or near "NULL" at character 14 error 7
recorded: ERROR:  parser: parse error at or near "
NULL" at character 14

As no parameters have been bound to the placeholder then a NULL is
substituted and so it fails.

Or maybe I was doing it wrong and there is a way to get around this already?

Thanks

adam


-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.



Re: :PgSQL: More Queestions

От
Tim Bunce
Дата:
On Tue, Nov 19, 2002 at 11:03:23PM -0800, David Wheeler wrote:
> On Tuesday, November 19, 2002, at 03:42  PM, Jeff Urlwin wrote:
> 
> But if I change it (as I'm seriously considering, in light of 
> PostgreSQL 7.3's support for prepared statements), I'll probably do no 
> parsing for comments.

I think that would be a bad move.

> I understand that the goal is to convert the placeholders from '?' or 
> ':nn' to the PostgreSQL internal version (':pn'). What I'm referring to 
> specifically, however, is this snippet from DBD::Pg:
> 
>         if (in_literal) {
>             /* Check if literal ends but keep quotes in literal */
>             if (*src == in_literal) {
>                 int bs = 0;
>                 char *str;
>                 str = src-1; /* Back a character. */
>                 while (*(str - bs) == '\\')
>                     bs++;
>                 if (!(bs & 1))
>                     /* bs is an even number? */
>                     in_literal = 0;
>             }
>             *dest++ = *src++;
>             continue;
>         }
> 
> in_literal is set when the last character wasn't a placeholder 
> character ('?' or ':') and is either a single or a double quotation 
> mark. So while I understand that one might want to ignore placeholder 
> characters, I don't really understand what the above code is doing. 

If it's seen the start of a string ("..." or '...') then it just
keeps copying the string till it finds the same type of quote
character to mark the end of the string. The 'fiddly bit in the
middle' is handling backslashes used to escape quote chars in the
middle of the string:  "...\"..." and "...\\" (odd vs even number).

> Maybe it's just too complex, because, looking at DBD::ODBC's 
> dbd_preparse(), the handling of literals in the query seems a good deal 
> more straight-forward (though it doesn't appear to handle '\'' or "\"" 
> -- am I reading that right?

ANSI standard doesn't use backslashes, it uses doubling: "...""..."

Take a look at dbd_preparse in DBD::Oracle.

There's also a preparse() in DBI.xs which was destined to become a
standard service offered to drivers - but isn't quite ready yet.

Tim.


Re: :PgSQL: More Queestions

От
Rudy Lippan
Дата:
On Wed, 20 Nov 2002, Tim Bunce wrote:

> > Maybe it's just too complex, because, looking at DBD::ODBC's 
> > dbd_preparse(), the handling of literals in the query seems a good deal 
> > more straight-forward (though it doesn't appear to handle '\'' or "\"" 
> > -- am I reading that right?
> 
> ANSI standard doesn't use backslashes, it uses doubling: "...""..."
> 

Postgres does, however ugly it may be. So this needs to be suported lest 
a \" thow off  placeholder counting and thus breaking execute().  But at 
least pg supports geminate "s.

Later,

-r



Re: :PgSQL: More Queestions

От
"Jeff Urlwin"
Дата:
>
>
> On Tuesday, November 19, 2002, at 03:42  PM, Jeff Urlwin wrote:
>
> > You probably only need dTHR to support older, pre-threading perls.  I
> > don't
> > believe you need the #ifdef, but it can't hurt (except visually in your
> > code).
>
> Okay. What is it?

See the other posts.  They did a better job of describing it.

>
> > In the preparse(), we're looking for placeholders to notify DBI that
> > we need
> > specific parameters to execute the query and, in the case of DBD::ODBC,
> > later notify the ODBC Driver that we are binding parameters (and what
> > type
> > they are, VARCHAR, etc).  Then the Driver does the binding in whatever
> > DBMS
> > specific way it needs to.  You may have to do more, as you *are* the
> > driver.
> > Note that there is also a way in DBD::ODBC to ignore :foo style
> > parameters
> > because some databases use that for syntax in stored procedures or
> > triggers.
> > For example, with Oracle a trigger can access :old.column_name or
> > :new.column_name and DBD::ODBC allows you to turn off treating
> > :anything as
> > a bind variable to support that.  You may not need that...
>
> I understand that the goal is to convert the placeholders from '?' or
> ':nn' to the PostgreSQL internal version (':pn'). What I'm referring to
> specifically, however, is this snippet from DBD::Pg:
>
>          if (in_literal) {
>              /* Check if literal ends but keep quotes in literal */
>              if (*src == in_literal) {
>                  int bs = 0;
>                  char *str;
>                  str = src-1; /* Back a character. */
>                  while (*(str - bs) == '\\')
>                      bs++;
>                  if (!(bs & 1))
>                      /* bs is an even number? */
>                      in_literal = 0;
>              }
>              *dest++ = *src++;
>              continue;
>          }
>
> in_literal is set when the last character wasn't a placeholder
> character ('?' or ':') and is either a single or a double quotation
> mark. So while I understand that one might want to ignore placeholder
> characters, I don't really understand what the above code is doing.
> Probably'll be easier for me after I've been looking a C for a while...

I'm not sure what it's really trying to do, either, really...

>
> Maybe it's just too complex, because, looking at DBD::ODBC's
> dbd_preparse(), the handling of literals in the query seems a good deal
> more straight-forward (though it doesn't appear to handle '\'' or "\""
> -- am I reading that right?

Nope, it handles " or '.
if (*src == '"' || *src == '\'') {    etc...}
>
> > It's going to depend upon what you need to handle.  For the most part,
> > it
> > shouldn't change after the prepare, but in DBD::ODBC, for example,
> > it's more
> > complex because some statements can return multiple result sets.
>
> Ah, that makes sense. Not sure if it's an issue for PostgreSQL, but I
> doesn't appear to be much of an overhead to set it on a per-execute
> basis...

Actually, if you can get away with doing it only once, the first execute, go
with it.  DBD::ODBC tries to do that, but rechecks under two conditions:1) we "know" there are multiple result sets in
thisquery via already
 
experiencing it2) the user sets a DBD::ODBC private attributed to recheck the result set
types (this is to support nasty things like stored procedures returning only
one result set per call, but a different result set based upon the input
(yes, I've seen this!).

My advice: if you don't have to support multiple result sets, do it once per
execute.  If you setup that "flag" to avoid re-doing work and find that you
need to support multiple-result sets, you can always clear the flag...

>
> > svp is a temporary reference to obtain a pointer to a scalar value
> > (scalar
> > value pointer).  You are then casting it to a pointer to a phs_t, which
> > holds your parameter information.  You'll "create" the phs_t instances
> > when
> > you preparse the query.  In DBD::ODBC, the ftype is queried from the
> > driver
> > itself (may go back to the database for information) to determine if
> > it's
> > numeric, varchar, etc.  The phs_t instance can hold whatever you need
> > to
> > track the parameter (type, scale, etc).  Some drivers assume
> > everything is a
> > varchar and the database itself performs the conversion.
> >
> > So: all_params_hv is in your statement handle and you actually put the
> > information in the all_params_hv hash when you preparse the statement
> > (and
> > update it during execute/bind_param).  Each "phs_t" represents a
> > parameter
> > in the query.  The information contained in the phs_t instance comes
> > from
> > your parsing the query and, possibly querying the database to
> > determine the
> > type of the parameter.
> >
> > Also, the all_params_hv is handy in handling the relatively new DBI
> > attribute ParamValues...
>
> Yes, thank you. I'm spending a lot of time right now just studying the
> code in dbd_preparse() and dbd_st_execute() in both DBD::Pg and
> DBD::ODBC. I think I'll keep at it until I feel I understand it as well
> as I can, and then post my thoughts on what to do next.

I would make the statement that DBD::Oracle may provide a better reference
on the pre-parse stuff.  DBD::ODBC's is probably a bit watered down from
DBD::Oracle -- especially because I'm avoiding comments.

Regards,

Jeff




Re: :PgSQL: More Queestions

От
Tim Bunce
Дата:
Or ignore colons that have a digit as the previous character
as that would never be a placeholder.

Tim.

On Thu, Nov 21, 2002 at 02:20:52PM -0500, Rudy Lippan wrote:
> On Thu, 21 Nov 2002, Adam Witney wrote:
> 
> > One thing you may want to keep in mind is the situation concerning array
> > slices
> > 
> > If you send a statement like this
> > 
> > my $sth = $dbh->prepare("select cube[1:2][1][1] from test4");
> > 
> > Then the :2 gets treated as a placeholder and gets substituted. From the
> > trace
> 
> Ugg.
> 
> Maybe as a possible idea, what do you think?
> 
> --- dbdimp.orig Thu Nov 21 15:51:06 2002
> +++ dbdimp.c    Thu Nov 21 15:56:24 2002
> @@ -692,6 +692,8 @@
>          if (*src != ':' && *src != '?') {
>              if (*src == '\'' || *src == '"') {
>                  in_literal = *src;
> +            } else if ('[' == *src) {
> +               in_literal = ']';
>              }
>              *dest++ = *src++;
>              continue;
> @@ -1139,6 +1141,8 @@
>              if (*src != ':' && *src != '?') {
>                  if (*src == '\'' || *src == '"') {
>                      in_literal = *src;
> +                } else if ('[' == *src) {
> +                   in_literal = ']';
>                  }
>                  *dest++ = *src++;
>                  continue;
> 
> 
> Untested, but compiles. :) (well I did do a quick test on your above stmt 
> and it works for that one, but may break other things. YMMV)
> 
> 
> > 
> > dbd_st_execute: statement = >select cube[1NULL][1][1] from test4<
> > ERROR:  parser: parse error at or near "NULL" at character 14 error 7
> > recorded: ERROR:  parser: parse error at or near "
> > NULL" at character 14
> > 
> > As no parameters have been bound to the placeholder then a NULL is
> > substituted and so it fails.
> > 
> > Or maybe I was doing it wrong and there is a way to get around this already?
> > 
> 
> As a quick workaround $dbh->prepare("cube[1:?][1][1]"); make '2' your
> first param to execute()
> 
> 
> -r
> 


Re: :PgSQL: More Queestions

От
Rudy Lippan
Дата:
On Thu, 21 Nov 2002, Adam Witney wrote:

> One thing you may want to keep in mind is the situation concerning array
> slices
> 
> If you send a statement like this
> 
> my $sth = $dbh->prepare("select cube[1:2][1][1] from test4");
> 
> Then the :2 gets treated as a placeholder and gets substituted. From the
> trace

Ugg.

Maybe as a possible idea, what do you think?

--- dbdimp.orig Thu Nov 21 15:51:06 2002
+++ dbdimp.c    Thu Nov 21 15:56:24 2002
@@ -692,6 +692,8 @@        if (*src != ':' && *src != '?') {            if (*src == '\'' || *src == '"') {
 in_literal = *src;
 
+            } else if ('[' == *src) {
+               in_literal = ']';            }            *dest++ = *src++;            continue;
@@ -1139,6 +1141,8 @@            if (*src != ':' && *src != '?') {                if (*src == '\'' || *src == '"') {
               in_literal = *src;
 
+                } else if ('[' == *src) {
+                   in_literal = ']';                }                *dest++ = *src++;                continue;


Untested, but compiles. :) (well I did do a quick test on your above stmt 
and it works for that one, but may break other things. YMMV)


> 
> dbd_st_execute: statement = >select cube[1NULL][1][1] from test4<
> ERROR:  parser: parse error at or near "NULL" at character 14 error 7
> recorded: ERROR:  parser: parse error at or near "
> NULL" at character 14
> 
> As no parameters have been bound to the placeholder then a NULL is
> substituted and so it fails.
> 
> Or maybe I was doing it wrong and there is a way to get around this already?
> 

As a quick workaround $dbh->prepare("cube[1:?][1][1]"); make '2' your
first param to execute()


-r



Re: :PgSQL: More Queestions

От
Tim Bunce
Дата:
On Thu, Nov 21, 2002 at 03:54:56PM -0500, Rudy Lippan wrote:
> On Thu, 21 Nov 2002, Tim Bunce wrote:
> 
> > Or ignore colons that have a digit as the previous character
> > as that would never be a placeholder.
> 
> Since DBD::Pg emulates place holders, it is possible to have them after a
> digit.

Possible sure, but place holder values should always be viewed and used
as complete lexical tokens. Being over-clever with 'token concatenation'
by doing things like "SELECT * FROM log_:1" and binding a date like
20021122 as a number to select from table log_20021122 is a bad idea.

(FYI I'm not familiar with PostgresSQL's dialect of SQL.)

> I thought that using the [ as a literal marker might break less existing
> code, for example $d->prepare->("insert into foo (int_date)
> VALUES(200301:1"); Or any such other silly uses.

Such silly uses deserve to be broken and replaced with $d->prepare->("insert into foo (int_date) VALUES(:1)")
or $d->prepare->("insert into foo (int_date) VALUES(200301$foo)")

> And besides the array
> data type will probably be used less than other types.
> 
> The best solution might be to treat [] as a literal wrt placeholders, but
> allow [:1 for the [:1] case, which is the only case that would work with
> the existing code. Currently, For [:1:2] preparse would see 2 place holders

Viewing place holder values as complete lexical tokens make that
not a sensible thing to do. And, for the only valid uses of
concatenated placeholders, the same effect could be achieved with
one placeholder and concatenating the values bound to it.

> and for [:1::2] preparse would see a place holder and a cast.

Is "::2" a valid cast that people would actually use in real code?

So far I think I'd just extend my original proposal to say
"Ignore colons that have a digit or colon as the previous character".

Tim.


Re: :PgSQL: More Queestions

От
Rudy Lippan
Дата:
On Thu, 21 Nov 2002, Tim Bunce wrote:

> 
> Or ignore colons that have a digit as the previous character
> as that would never be a placeholder.
> 

Since DBD::Pg emulates place holders, it is possible to have them after a
digit.

I thought that using the [ as a literal marker might break less existing
code, for example $d->prepare->("insert into foo (int_date)
VALUES(200301:1"); Or any such other silly uses. And besides the array
data type will probably be used less than other types.

The best solution might be to treat [] as a literal wrt placeholders, but
allow [:1 for the [:1] case, which is the only case that would work with
the existing code. Currently, For [:1:2] preparse would see 2 place
holders and for [:1::2] preparse would see a place holder and a cast.

Thougths?

-r