Обсуждение: JBoss w/int8 primary keys in postgres ...
Greetings all,
    Having just read a thread on the lack of implicit type cooersion by
postgresql when planning a query to use an index scan as opposed to a
table scan (thread over on psql-performance list at
http://archives.postgresql.org/pgsql-performance/2003-09/msg00090.php),
it hit me that the DB backing our EJB application has int8 primary keys
throughout (cooresponding to java datatype Long), and that the vast
majority of queries that JBoss is issuing with respect to pk values are
getting worst-case treatment as sequential scans as opposed to index
scans just from the int4 / int8 type mismatch on the primary key index.
Sample query generated by JBossCMP code:
LOG:  query: SELECT id FROM address WHERE (contact=30410)
Here's the table:
social=# \d address;
        Table "public.address"
     Column    |  Type   | Modifiers
--------------+---------+-----------
  id           | bigint  | not null
  name         | text    |
  streetnumber | integer | not null
  street       | text    |
  street2      | text    |
  city         | text    |
  state        | text    |
  zip          | text    |
  contact      | bigint  |
Indexes: pk_address primary key btree (id),
          address_contact btree (contact)
Here's how it gets analyzed, since the 31410 gets treated by int4
naturally:
explain analyze SELECT id FROM address WHERE (contact=30410);
                                             QUERY PLAN
------------------------------------------------------------------------
---------------------------
  Seq Scan on address  (cost=0.00..166.51 rows=1 width=8) (actual
time=17.41..17.41 rows=0 loops=1)
    Filter: (contact = 30410)
  Total runtime: 17.50 msec
(3 rows)
Explicitly casting the literal number to int8, making it match the
index type lets us do an index scan:
# explain analyze SELECT id FROM address WHERE (contact=30410::int8);
                                                       QUERY PLAN
------------------------------------------------------------------------
-----------------------------------------------
  Index Scan using address_contact on address  (cost=0.00..4.20 rows=1
width=8) (actual time=0.04..0.04 rows=0 loops=1)
    Index Cond: (contact = 30410::bigint)
  Total runtime: 0.12 msec
(3 rows)
A great bit better, of course.
Is there any hope to get JBoss + PG JDBC to do this typecasting
automatically? Assuming JBoss is using PreparedStatements for its dirty
work, could the setLong() method on the JDBC driver's PreparedStatement
  implementation possibly always punch in the trailing '::int8' ?
Or does someone already know that JBossCMP does not use
PreparedStatements, and I'm really barking up the wrong tree here.
Thanks,
James
			
		James, Have you run vacuum analyze on the database ? Dave On Sat, 2003-09-06 at 14:38, James Robinson wrote: > Greetings all, > > Having just read a thread on the lack of implicit type cooersion by > postgresql when planning a query to use an index scan as opposed to a > table scan (thread over on psql-performance list at > http://archives.postgresql.org/pgsql-performance/2003-09/msg00090.php), > it hit me that the DB backing our EJB application has int8 primary keys > throughout (cooresponding to java datatype Long), and that the vast > majority of queries that JBoss is issuing with respect to pk values are > getting worst-case treatment as sequential scans as opposed to index > scans just from the int4 / int8 type mismatch on the primary key index. > Sample query generated by JBossCMP code: > > LOG: query: SELECT id FROM address WHERE (contact=30410) > > Here's the table: > social=# \d address; > Table "public.address" > Column | Type | Modifiers > --------------+---------+----------- > id | bigint | not null > name | text | > streetnumber | integer | not null > street | text | > street2 | text | > city | text | > state | text | > zip | text | > contact | bigint | > Indexes: pk_address primary key btree (id), > address_contact btree (contact) > > Here's how it gets analyzed, since the 31410 gets treated by int4 > naturally: > > explain analyze SELECT id FROM address WHERE (contact=30410); > QUERY PLAN > ------------------------------------------------------------------------ > --------------------------- > Seq Scan on address (cost=0.00..166.51 rows=1 width=8) (actual > time=17.41..17.41 rows=0 loops=1) > Filter: (contact = 30410) > Total runtime: 17.50 msec > (3 rows) > > Explicitly casting the literal number to int8, making it match the > index type lets us do an index scan: > > # explain analyze SELECT id FROM address WHERE (contact=30410::int8); > QUERY PLAN > ------------------------------------------------------------------------ > ----------------------------------------------- > Index Scan using address_contact on address (cost=0.00..4.20 rows=1 > width=8) (actual time=0.04..0.04 rows=0 loops=1) > Index Cond: (contact = 30410::bigint) > Total runtime: 0.12 msec > (3 rows) > > A great bit better, of course. > > Is there any hope to get JBoss + PG JDBC to do this typecasting > automatically? Assuming JBoss is using PreparedStatements for its dirty > work, could the setLong() method on the JDBC driver's PreparedStatement > implementation possibly always punch in the trailing '::int8' ? > > Or does someone already know that JBossCMP does not use > PreparedStatements, and I'm really barking up the wrong tree here. > > Thanks, > > James > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > -- Dave Cramer <Dave@micro-automation.net>
On Sat, Sep 06, 2003 at 02:38:40PM -0400, James Robinson wrote: > Is there any hope to get JBoss + PG JDBC to do this typecasting > automatically? Assuming JBoss is using PreparedStatements for its dirty > work, could the setLong() method on the JDBC driver's PreparedStatement > implementation possibly always punch in the trailing '::int8' ? Something like this was tried without much success. See: http://archives.postgresql.org/pgsql-jdbc/2002-10/msg00163.php http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/interfaces/jdbc/org/postgresql/jdbc1/AbstractJdbc1Statement.java#rev1.6 http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/interfaces/jdbc/org/postgresql/jdbc1/AbstractJdbc1Statement.java#rev1.12 Reading that thread, the change that was tried was quoting the value to get an implicit cast from 'text' to the expected type -- whatever it was -- happening. Perhaps it's reasonable to have setLong() always append ::int8 since that does reflect the possible range of the parameter's type -- anyone more familiar with this area want to comment? I guess that you end up with the reverse problem of things that expect an int4 value but use setLong() will break, but maybe that is less common and is going to have other problems anyway if you try to use a java long that's larger than int4 can handle .. -O
On 06/09/2003 23:25 Oliver Jowett wrote: > On Sat, Sep 06, 2003 at 02:38:40PM -0400, James Robinson wrote: > > > Is there any hope to get JBoss + PG JDBC to do this typecasting > > automatically? Assuming JBoss is using PreparedStatements for its dirty > > > work, could the setLong() method on the JDBC driver's PreparedStatement > > > implementation possibly always punch in the trailing '::int8' ? > > Something like this was tried without much success. See: > > http://archives.postgresql.org/pgsql-jdbc/2002-10/msg00163.php > > http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/interfaces/jdbc/org/postgresql/jdbc1/AbstractJdbc1Statement.java#rev1.6 > > http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/interfaces/jdbc/org/postgresql/jdbc1/AbstractJdbc1Statement.java#rev1.12 > > Reading that thread, the change that was tried was quoting the value to > get > an implicit cast from 'text' to the expected type -- whatever it was -- > happening. > > Perhaps it's reasonable to have setLong() always append ::int8 since that > does reflect the possible range of the parameter's type -- anyone more > familiar with this area want to comment? I guess that you end up with the > reverse problem of things that expect an int4 value but use setLong() > will > break, but maybe that is less common and is going to have other problems > anyway if you try to use a java long that's larger than int4 can handle Looking at the sources for the 7.3.4 driver, the code is almost there. It looks like the type is carried across to org.postgresql.core.QueryExecutor but is never send to the BE. I'm not familiar with the FE-BE protocol but it just looks like we need to pass the m_bindTypes[] array across and insert 2 lines of code to send :: followed by send m_bindTypes[i]. To prevent breaking existing code, we would need make this the non-default behavior. It would need to settable in the url string to be used in an EJB container. Comments? -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for the Smaller Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
On Sunday, September 7, 2003, at 06:44 AM, Paul Thomas wrote: > It would need to settable in the url string to be used in an EJB > container. > > Comments? Well, reading the associated threads more closely, I see that Tom announced that there was a proposed fix in the backend for 7.4 (see http://archives.postgresql.org/pgsql-jdbc/2002-10/msg00164.php). I'm cooking 7.4B2 right now to see if anything is indeed different. But, given the most recent discussion in performance-land, dated this past Friday, I'm not expecting too much. A backend fix would be the clearest answer, since the principle of least surprise is being violated by the backend, not the JDBC driver. I'm just not sure if that is/will be done in a timely enough fashon, given I have this problem *now*, and our particular frontend-of-choice can indeed inject a lossless cast in the setLong() path (and, likewise, perhaps a '::int2' for the setShort() path as well). CMP will be the death of us all. James
On 07/09/2003 20:09 James Robinson wrote: > > On Sunday, September 7, 2003, at 06:44 AM, Paul Thomas wrote: > >> It would need to settable in the url string to be used in an EJB >> container. >> >> Comments? > > > Well, reading the associated threads more closely, I see that Tom > announced that there was a proposed fix in the backend for 7.4 (see > http://archives.postgresql.org/pgsql-jdbc/2002-10/msg00164.php). I'm > cooking 7.4B2 right now to see if anything is indeed different. But, > given the most recent discussion in performance-land, dated this past > Friday, I'm not expecting too much. Looking at the cvs HEAD sources, there's nothing there that would help you. > A backend fix would be the clearest answer, since the principle of least > surprise is being violated by the backend, not the JDBC driver. I'm just > not sure if that is/will be done in a timely enough fashon, given I have > this problem *now*, and our particular frontend-of-choice can indeed > inject a lossless cast in the setLong() path (and, likewise, perhaps a > '::int2' for the setShort() path as well). > Whether you consider PGs very tight typing a bug or not is a matter of opinion. It probably would help if the BE could do type promotion (int2->int4->int8). But that would only help those using that particular version of PG. As the JDBC driver is designed to work for current + many past releases of PG, modifications along the line I've proposed would also help users of older versions of PG and provide fairly consistent behavior across versions. > CMP will be the death of us all. No, just those who choose to go the EJB route. My preference is DAO. -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for the Smaller Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
On Monday, September 8, 2003, at 10:18  AM, Paul Thomas wrote:
> As the JDBC driver is designed to work for current + many past
> releases of PG, modifications along the line I've proposed would also
> help users of older versions of PG and provide fairly consistent
> behavior across versions.
I'm certainly not against such a proposal, making our
PreparedStatements smarter, but it would not help out in the JBossCMP
case, nor would it help out any other frontend, including the CLI psql
client, in that:
create table test (
    pk int8 not null primary key,
    name text;
);
copy into test values ( .... )
...
select name from test where id = 42;
just doesn't do what you would reasonably expect.
>
>> CMP will be the death of us all.
>
> No, just those who choose to go the EJB route. My preference is DAO.
I am enslaved to entity beans for now, and am ignorant for the most
part regarding the philosophies of both JDO and DAO. How would DAO get
around this issue?
Thanks,
James
			
		On 09/09/2003 01:57 James Robinson wrote:
>
> On Monday, September 8, 2003, at 10:18  AM, Paul Thomas wrote:
>
>> As the JDBC driver is designed to work for current + many past releases
>> of PG, modifications along the line I've proposed would also help users
>> of older versions of PG and provide fairly consistent behavior across
>> versions.
>
> I'm certainly not against such a proposal, making our PreparedStatements
> smarter, but it would not help out in the JBossCMP case, nor would it
> help out any other frontend, including the CLI psql client, in that:
>
> create table test (
>     pk int8 not null primary key,
>     name text;
> );
> copy into test values ( .... )
>
> ...
> select name from test where id = 42;
>
> just doesn't do what you would reasonably expect.
With the proposed modifications to the JDBC driver enabled
PreparedStatement pstmt = con.prepareStatement("select name from test
where id = ?");
pstmt.setLong(1, 42);
pstmt.executeQuery();
would actually produce:
select name from test where id = 42::int8
which _would_ cause an index to be used on your table.
>>> CMP will be the death of us all.
>>
>> No, just those who choose to go the EJB route. My preference is DAO.
>
> I am enslaved to entity beans for now, and am ignorant for the most part
> regarding the philosophies of both JDO and DAO. How would DAO get around
> this issue?
Take a look at the sample Adventure application on java.sun.com. It uses
DAOs.
HTH
--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller
Business |
| Computer Consultants         |
http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+
			
		> With the proposed modifications to the JDBC driver enabled
>
> PreparedStatement pstmt = con.prepareStatement("select name from test
> where id = ?");
> pstmt.setLong(1, 42);
> pstmt.executeQuery();
>
> would actually produce:
>
> select name from test where id = 42::int8
Silly question, but why don't you simply ensure any ?'s are quoted?
select name from test where id = '42'
This allows PostgreSQL to choose the best type for the job -- in this
case it will try to match that of id.  It will work that way on all
older and newer versions of PostgreSQL.
			
		Вложения
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi!
This is an old discussion. :)
At the office we resolved this by making the driver to quote every
setLong or SetBigDecimal (the problem exist here too.)
Its really easy to do but it's not the correct way: this a backend's
problem and have to be fixed there and not at the driver.
If you already have an application in production state this "hack" can
resolve for moment but this need to be resolved correctly in the future.
I paste the old mails about this subject as it's look like Barry is busy
or away.
Regards.
João Paulo Ribeiro
- -------- Original Message --------
Subject: Re: [JDBC] One little tip for int8 and decimal :)
Date: Thu, 28 Mar 2002 09:28:56 -0800
From: Barry Lind <barry@xythos.com>
To: João Paulo Ribeiro <jp@mobicomp.com>
CC: dave@fastcrypt.com
References: <3CA200D9.10100@mobicomp.com> <3CA27CB5.1080002@xythos.com>
<3CA309BD.8050405@mobicomp.com>
João,
There are two reasons that I don't what to make this change:
1) With this change in place it is less likely that the real problem
will get fixed.  Putting workarounds in place often have the effect of
ensuring that the real problem never gets fixed.  If the amount of
resources that it has taken the various people to discuss and build
workarounds to this problem where dedicated to fixing the real problem,
the real problem whould have been fixed by now.
2) I am concerned that this change will break existing code.  I am
concerned that there exists in the postgresql parser instances where a
quoted number is not allowed.  This change would then break existing
code.  (Although I am less concerned about your approach than what has
been proposed in the past, which was to have the driver produce explicit
casts  so  1234  would become  1234::int8 .  I think just your approach
of changing 1234 to '1234' is less likely to break existing code).
Finally, most people who have run into this problem have been able to
work around the problem either by explicitly casting in their sql
statements (i.e. adding ::int8) or by calling
setString(Long.toString(var)) in their code.
Having said all of that I do understand the problem you are facing
because you are working with a tool set that doesn't allow you to use
the workarounds that others can take advantage of.  So I will agree to
add the workaround you suggest to the jdbc driver at the end of the 7.3
development cycle if the real problem hasn't been fixed in the backend.
I would suggest that you send an email note to the pgsql-general and/or
pgsql-hackers list to explain the importance of getting this problem
fixed in the backend becuase you can't work around the problem since you
don't have control over the sql that is being generated in the tool set
you are using.  I think many developers who are in a position to fix the
real problem don't think the problem is as bad as it is because they
assume you can work around the problem easily by changing your code.
thanks,
- --Barry
João Paulo Ribeiro wrote:
> Barry,
>
> I understand that the problem is in the backend and this is why i called
> it a little tip.
> Maybe you are suggesting  that the fix that not appears in versions  <
> 7.2 will apear someday, but what we do in the while?
> You are trying to tell me that its better to use another database?
> Because if you use preparedstatement to acess big tables with int8 or
> decimal in postgres its better to forget it.
> Postgres without this will not be usefull for business.
>
> I perfectly understand that the problem is not in the JDBC driver but
> can someone tell me why we cant fix it here?
>
> Advantages:
> - its very easy to fix
> - it dont have implication with backward compatibilies
> - it will fix the problem with some older versions of postgres  (7.0 and
> 7.1  and dont know about the others)
> - it wil not make problem compatibilities in the futures
>
> Disadvantages:
> - the setString(...) method is slower than the set(...)
>
> I talk for the point view of someone that use postgres for work with
> medium databases (>400MB) and that can wait for the fix that can will
> appear to late.
> If we didnt made the fix we swapped out to Oracle.
> This is creating a very bad image of java with postgres.
>
> But as i said it was a suggestion.
>
> Best regards.
> João Paulo Ribeiro
>
> Barry Lind wrote:
>
>> João,
>>
>> This has been discussed before on the mailing list and rejected as the
>> wrong way to fix the problem.  The correct way to fix this problem is
>> to fix the problem in the backend, not to work around the problem in
>> each of the front ends.
>>
>> --Barry
>>
>>
>> João Paulo Ribeiro wrote:
>>
>>> Hi!
>>>
>>> We are working with  java and postgresql for a while.
>>>
>>> In our experiences we have seen the problem with int8  and decimal:
>>> postgres dont convert this types easyli and because of this the
>>> result sometimes wil not be the expected.
>>>
>>> A simple example:
>>> We have this table:
>>>
>>> create table test(
>>>
>>> data int8 NOT NULL PRIMARY KEY
>>>
>>> );
>>>
>>>
>>> we  put n lines (n> 70000) :)
>>>
>>> if we try to make query withou explicit cast the postgres will not
>>> use the index.
>>> Example:
>>>
>>> pvs=# explain select * from test where data=12345;
>>>
>>> NOTICE:  QUERY PLAN:
>>>
>>> Seq Scan on test  (cost=0.00..22.50 rows=1 width=8)
>>>
>>> EXPLAIN
>>>
>>> pvs=#
>>>
>>>
>>> but with a explicit cast:
>>>
>>> pvs=# explain select * from test where data=12345::int8;
>>>
>>> NOTICE:  QUERY PLAN:
>>>
>>> Index Scan using test_pkey on test  (cost=0.00..4.82 rows=1 width=8)
>>>
>>> EXPLAIN
>>>
>>> pvs=#
>>>
>>> another aproach is to force the postgresql to evaluate and transform
>>> the value to the desired datatype using quotes '
>>>
>>> pvs=# explain select * from test where data='12345';
>>>
>>> NOTICE:  QUERY PLAN:
>>>
>>> Index Scan using test_pkey on test  (cost=0.00..4.82 rows=1 width=8)
>>>
>>> EXPLAIN
>>>
>>> pvs=#
>>>
>>>
>>> This problem is well known for the postgres user.
>>> But the problem go further when you use JDBC to access the postgresql.
>>> Using the same table.
>>> We have a little program that make a simple query:
>>> ...
>>>
>>>        DBConnection con = someKindOfDbPool.allocateConnection();
>>>
>>>        PreparedStatement  ps = con.prepareStatement("Select * from
>>> user2 where obid=?");
>>>
>>>        ps.setlong(1,123456);
>>>
>>>        ps.executeQuery();
>>>
>>> ...
>>>
>>> This query will never use the index because of the problem explained
>>> above.
>>> We can use setBigDecimal and problem will persist.
>>>
>>> I use DODs with Enhydra  and the data layer generated by the DODs
>>> have this problem.
>>>
>>> What we propose is to change the prepared statment to force postgres
>>> to correctly use the index and the result will be the expected. :)
>>> For example,  at the office we made a little change to the setLong
>>> and setBigDecimal from PreparedStatement class.
>>>
>>> The orginal look like:
>>>
>>> public void setBigDecimal(int parameterIndex, BigDecimal x) throws
>>> SQLException
>>>        {
>>>      if (x == null)
>>>      setNull(parameterIndex, Types.OTHER);
>>>      else
>>>      set(parameterIndex, x.toString());
>>>        }
>>>
>>>
>>> public void setLong(int parameterIndex, long x) throws SQLException {
>>>         set(parameterIndex, (new Long(x)).toString());
>>> }
>>>
>>>
>>> and we changed de set(...) to setString(..) and its look like:
>>>
>>>
>>> public void setBigDecimal(int parameterIndex, BigDecimal x) throws
>>> SQLException {
>>>      if (x == null)
>>>      setNull(parameterIndex, Types.OTHER);
>>>      else
>>>      setString(parameterIndex, x.toString());
>>> }
>>>
>>>
>>> public void setLong(int parameterIndex, long x) throws SQLException {
>>>        setString(parameterIndex, (new Long(x)).toString());
>>>
>>> }
>>>
>>> With this change when we use the setBigdecimal or the setLong in a
>>> query and we expect that a index will be used, it will really be
>>> used. :)
>>>
>>> This has been working in a production database for couple of month
>>> and is really working fine.
>>>
>>>
>>> Regards.
>>>
>>> João Paulo Ribeiro & Marco Leal
>>>
>>>
>>
>>
>>
>
>
- --
- ----------------------------------------------------------------------------
MobiComp - Mobile Computing & Wireless Solutions
phone: +351 253 305 250     fax: +351 253 305 251
web: http://www.mobicomp.com
- ----------------------------------------------------------------------------
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQE/XfXjbwM7R+C+9B0RAlepAKDF11Yldz95snv58Ac7vj6bu15xYQCgzLWB
ia3iLpA+jwiP3FOaIHuDSd8=
=XsMs
-----END PGP SIGNATURE-----
			
		On 09/09/2003 16:46 João Ribeiro wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi! > > This is an old discussion. :) > At the office we resolved this by making the driver to quote every > setLong or SetBigDecimal (the problem exist here too.) > Its really easy to do but it's not the correct way: this a backend's > problem and have to be fixed there and not at the driver. > If you already have an application in production state this "hack" can > resolve for moment but this need to be resolved correctly in the future. > > I paste the old mails about this subject as it's look like Barry is busy > or away. Tnx for that João. I agree that a back-end(BE) change to allow some form of type promotion int4->int8 when looking for suitable indexes would be the best solution. The problem is explaining to the BE developers that in EJB/CMP, no SQL is written by the programmer and get-arounds like type casting or quoting can't be used. This makes PostgreSQL a poor choice enterprise database ATM as EJBs are an extremely important enterprise technology. I can see Barry's point about not hacking the driver to overcome something which should be handled in the BE. But here we are 18 months after Barry's email and nothing has changed so I'm wondering when, if ever, a solution will be implemented. -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for the Smaller Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
Rod,
That was tried once and needed to be pulled out.  That change ended up
breaking some code since there are some circumstances where '5' is not
the same as 5 in the parser.  I don't remember the details but they
should be in the maillist archives (or the cvs commit message for the
revert on the change).
thanks,
--Barry
PS.  Because of having tried and failed to fix this in the driver
before, I am gun shy on trying a fix again at the driver level.  The
server is the proper place for this to be fixed.  And I think it is in
7.4, but haven't verified that myself.
Rod Taylor wrote:
>>With the proposed modifications to the JDBC driver enabled
>>
>>PreparedStatement pstmt = con.prepareStatement("select name from test
>>where id = ?");
>>pstmt.setLong(1, 42);
>>pstmt.executeQuery();
>>
>>would actually produce:
>>
>>select name from test where id = 42::int8
>
>
> Silly question, but why don't you simply ensure any ?'s are quoted?
>
> select name from test where id = '42'
>
> This allows PostgreSQL to choose the best type for the job -- in this
> case it will try to match that of id.  It will work that way on all
> older and newer versions of PostgreSQL.
			
		On 09/09/2003 18:39 Barry Lind wrote: > Rod, > > That was tried once and needed to be pulled out. That change ended up > breaking some code since there are some circumstances where '5' is not > the same as 5 in the parser. I don't remember the details but they > should be in the maillist archives (or the cvs commit message for the > revert on the change). > > thanks, > --Barry > > PS. Because of having tried and failed to fix this in the driver > before, I am gun shy on trying a fix again at the driver level. The > server is the proper place for this to be fixed. And I think it is in > 7.4, but haven't verified that myself. Is there any way we could find out for definite? Until this index/type cast issue is sorted out I don't see any way that PostgreSQL can even be considered ready for enterprise use. EJB/CMP is an important enterprise technology and int8 index columns are not unknown in big company databases. -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for the Smaller Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
There are perfectly valid reasons why postgres uses a sequence scan as opposed to an index scan. http://www.postgresql.org/docs/faqs/FAQ.html That being said, the reason I asked if you had vacuumed the database lately is that that is the only way to update the planner statistics. So please vacuum analyze the entire db, if it then doesn't use an index scan you can try disabling sequence scans in the configuration file. Dave On Wed, 2003-09-10 at 09:31, Paul Thomas wrote: > On 09/09/2003 18:39 Barry Lind wrote: > > Rod, > > > > That was tried once and needed to be pulled out. That change ended up > > breaking some code since there are some circumstances where '5' is not > > the same as 5 in the parser. I don't remember the details but they > > should be in the maillist archives (or the cvs commit message for the > > revert on the change). > > > > thanks, > > --Barry > > > > PS. Because of having tried and failed to fix this in the driver > > before, I am gun shy on trying a fix again at the driver level. The > > server is the proper place for this to be fixed. And I think it is in > > 7.4, but haven't verified that myself. > > Is there any way we could find out for definite? Until this index/type > cast issue is sorted out I don't see any way that PostgreSQL can even be > considered ready for enterprise use. EJB/CMP is an important enterprise > technology and int8 index columns are not unknown in big company databases. -- Dave Cramer <Dave@micro-automation.net>
On Wed, 10 Sep 2003 14:31:53, Paul Thomas wrote: > Is there any way we could find out for definite? Until this index/type > cast issue is sorted out I don't see any way that PostgreSQL can even > be > considered ready for enterprise use. EJB/CMP is an important enterprise > technology and int8 index columns are not unknown in big company > databases. I ran 7.4B2 a few days ago, and it did not act any different than 7.3 does -- that is, the problem still exists in the backend. Shall I petition hackers, stating that EJB/CMP is not a fly-by-night technology (rather, more like the COBOL of our generation), and that postgres would make a great backend for something like JBoss, or any DB-neutral relational persistence generation framework using any of the available interfaces, not just JDBC, if this issue was solved once and for all? Or will we be told 'show us the backend code that passes the regression tests', which is a valid response.
On 10/09/2003 17:56 James Robinson wrote:
>
> On Wed, 10 Sep 2003 14:31:53, Paul Thomas wrote:
>
>> Is there any way we could find out for definite? Until this index/type
>> cast issue is sorted out I don't see any way that PostgreSQL can even be
>> considered ready for enterprise use. EJB/CMP is an important enterprise
>> technology and int8 index columns are not unknown in big company
>> databases.
>
> I ran 7.4B2 a few days ago, and it did not act any different than 7.3
> does -- that is, the problem still exists in the backend.
I didn't have a lot of hope that 7.4 would be the answer. This int8 index
column is a well known gotcha and the advice has always been either a type
cast or quotation. I can't recall anyone saying anything about the problem
going away in 7.4...
> Shall I petition hackers, stating that EJB/CMP is not a fly-by-night
> technology (rather, more like the COBOL of our generation), and that
> postgres would make a great backend for something like JBoss, or any
> DB-neutral relational persistence generation framework using any of the
> available interfaces, not just JDBC, if this issue was solved once and
> for all?
>
> Or will we be told 'show us the backend code that passes the regression
> tests', which is a valid response.
I think the first big hurdle is going to be making them realize that with
CMP there _is no_ SQL source to modify in the first place. Yes, I think
you need to petition hackers and maybe x-post to advocacy too - there are
also people there who need to be made aware that PostgreSQL has a serious
Achilles heel as an enterprise database!
This morning I discovered an inconsistency of behavior in 7.3.4. I have
one table which has 2 fields
     material_id             integer
     quantity_available      double precision
on which I need to do a search of the form:
     select * from mytable where material_id = 123 and
quantity_available > 0
I created the index on (material_id, quantity_available) and, in psql, did
an
     explain analyze select * from mytable where material_id = 123 and
quantity_available > 0;
to check that the index was being used - which it was. But it printed up a
most interesting line:
     Index Cond: ((material_id = 123) AND (quantity_available > 0::double
precision))
~~~~~~~~~~~~~~~~~~~
So it seems to know when a number should be treated as double precision
but not int8! Maybe the inconsistent behavior of the current production
release will strengthen your argument.
HTH
--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller
Business |
| Computer Consultants         |
http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+
			
		On Wed, Sep 10, 2003 at 07:23:02PM +0100, Paul Thomas wrote: > >Shall I petition hackers, stating that EJB/CMP is not a fly-by-night > >technology (rather, more like the COBOL of our generation), and that > >postgres would make a great backend for something like JBoss, or any > >DB-neutral relational persistence generation framework using any of the > >available interfaces, not just JDBC, if this issue was solved once and > >for all? > > > >Or will we be told 'show us the backend code that passes the regression > >tests', which is a valid response. > > I think the first big hurdle is going to be making them realize that with > CMP there _is no_ SQL source to modify in the first place. Yes, I think > you need to petition hackers and maybe x-post to advocacy too - there are > also people there who need to be made aware that PostgreSQL has a serious > Achilles heel as an enterprise database! I have to ask .. If your CMP implementation knows enough about Postgresql types to use an int8 column, why can't it also know about the index behaviour? Or is it ending up with an int8 column via other means? (alias from a standard type name? manual configuration?) Our inhouse CMP implementation has to know about postgresql specifics for other reasons anyway (e.g. "use bytea for storing complex serializable objects"), so while we don't currently use int8 columns with indexes it wouldn't be hard to add. We don't support mapping to arbitary schemas, though, so we really do need to know more about the DB's guts to get table creation etc. right. For CMPs that do map to user-provided schemas, from memory the couple I've looked at (Weblogics, the J2EE RI) both let you edit the SQL to be used directly (via an extension DD), presumably since machine-generated code is hard to do a correct mapping with in all cases. Does JBoss not let you do this? -O
> > I have to ask .. If your CMP implementation knows enough about > Postgresql > types to use an int8 column, why can't it also know about the index > behaviour? > > Or is it ending up with an int8 column via other means? (alias from a > standard type name? manual configuration?) > > Our inhouse CMP implementation has to know about postgresql specifics > for > other reasons anyway (e.g. "use bytea for storing complex serializable > objects"), so while we don't currently use int8 columns with indexes it > wouldn't be hard to add. We don't support mapping to arbitary schemas, > though, so we really do need to know more about the DB's guts to get > table > creation etc. right. > > For CMPs that do map to user-provided schemas, from memory the couple > I've > looked at (Weblogics, the J2EE RI) both let you edit the SQL to be used > directly (via an extension DD), presumably since machine-generated > code is > hard to do a correct mapping with in all cases. Does JBoss not let you > do > this? > > -O From the perspective of someone running the EJB server who has to deploy the beans (possibly written by 3rd parties who have no clue what particular DB (or, for that matter, which EJB server it is being deployed on -- in theory, anyway)) on a given DB backend, say, PG, the CMP component of JBoss lets you tell it two major things off the top: which JDBC driver URL to bind into the datasource, plus the name of a set of typemappings that map Java datatype (i.e. java.lang.Long) to JDBC datatype to SQL datatype that JBoss would use when issuing the create table command for persisting a component that uses that datatype. It is this suite of XML mapping tags that lets you describe to JBoss that, for example, byte arrays should correspond to 'bytea' columns. This type info, AFAIK, is only used at table creation time. Other things it lets you vary per SQL backend are things like how to lock a row -- how exactly to do a 'select for update', as well as how to issue foreign key constraints, etc. Attached, for you personal enjoyment, is our current PG settings block. JBoss does do full table creation, so it is pretty spiffy in that respect. Enough voluminous code in there to make you both respect it, yet fear it if you would like to just focus on business-level app logic yet have to go pokin' around in there to really try to figure out what it is thinking. JBoss does let you provide hand-written SQL implementations of most any finder (not quite sure offhand about the all-important findByPrimaryKey case) using a custom DD, but I don't remember seeing anything that would let you override the SQL that it issues when it needs to emit an update, create, or delete statement, as well as the entire class of selects that it will generate when navigating container-managed relationships. So, through converting all of our SQL-container-neutral EQL finder strings into JBoss+Postgres+hack to fix our int8 pk-fields tags in a custom DD, we could work around ~50% or so of the selects to get 'em to use index scans, but that has a serious wrong way feel for it. I'm much more inclined to whack all of our EJBs to use Integer as the PK type (a little sed/awk magic on all of the source), then a second sed/awk pass on the 'CREATE TABLE' section of a pg_dump of the production schema to massage the columns into int4, then dropdb + restore the massaged dump. A fun 2AM site update. That is to say, we're nowhere near consumed the lower 32 bits of our pk space yet. But that would be giving up completely, wouldn't it? On looking at this problem sideways, could I somehow build functional indexes on the result of cooercing the int8 pk field to int4 and have the 'select * from foo where id = 44' queries somehow find the int4-oriented index?
Вложения
Paul Thomas <paul@tmsl.demon.co.uk> writes: > I think the first big hurdle is going to be making them realize that with > CMP there _is no_ SQL source to modify in the first place. Yes, I think > you need to petition hackers and maybe x-post to advocacy too - there are > also people there who need to be made aware that PostgreSQL has a serious > Achilles heel as an enterprise database! <rolls eyes> Do you think we have not heard about the datatype-mismatch issue ten thousand times before? Don't waste our time with "petitions". Give us a practical way to fix it (ie, one that doesn't create more problems than it solves). See for example http://archives.postgresql.org/pgsql-hackers/2002-11/msg00468.php as a recent discussion of the pitfalls involved. It occurred to me this afternoon that we might be able to improve matters for int8 without necessarily fixing the general problem. The problems cited in the above message mostly stem from trying to type small constants as int2 so that "int2col = 42" is indexable. Once upon a time it seemed that was where the hot buttons were, but if your hot button is mostly int8, maybe we could fix that by removing the int8-vs-int4 cross-type operators, and not touch the initial typing of integer literals just yet. Does someone want to explore the consequences of trying that? regards, tom lane
On Wed, Sep 10, 2003 at 08:54:33PM -0400, James Robinson wrote:
> From the perspective of someone running the EJB server who has to
> deploy the beans (possibly written by 3rd parties who have no clue what
> particular DB (or, for that matter, which EJB server it is being
> deployed on -- in theory, anyway)) on a given DB backend, say, PG, the
> CMP component of JBoss lets you tell it two major things off the top:
> which JDBC driver URL to bind into the datasource, plus the name of a
> set of typemappings that map Java datatype (i.e. java.lang.Long) to
> JDBC datatype to SQL datatype that JBoss would use when issuing the
> create table command for persisting a component that uses that
> datatype. It is this suite of XML mapping tags that lets you describe
> to JBoss that, for example, byte arrays should correspond to 'bytea'
> columns. This type info, AFAIK, is only used at table creation time.
This sounds like a reasonable way to operate.
In the absence of a backend fix, it sounds like appending explicit types to
parameter values is the way to go. That is, have setLong(12345) produce a
parameter of "12345::int8", setShort(12345) produce "12345::int2", etc.
This moves the breakage from "setLong() vs. int8 column" to "setLong() vs.
int4 column". Is this a reasonable thing to do? ISTM that the case where the
Java and DB types match is the case that should work generally, and the case
where type conversion is needed is dependant on Postgresql's exact behaviour
anyway.
Something related to think about: with PREPARE, we already explicitly
specify the types of parameters based on the method (or sql typecode) used
to set them. Given:
  CREATE TABLE foo(bar int4 not null primary key);
and this code fragment:
  PreparedStatement ps = c.prepareStatement("SELECT * FROM foo WHERE bar = ?");
  ps.setLong(1, 12345)
The current driver generates something like this normally:
  SELECT * FROM foo WHERE bar = 12345
which should use the index. However with server-side prepared queries
enabled the query is transformed to something like this:
  PREPARE jdbc_statement_N(int8) AS SELECT * FROM foo WHERE bar = $1;
  EXECUTE jdbc_statement_N(12345)
Which I suspect won't use the index as $1 is typed to int8. (anyone know for
sure? I can't find an easy way to get at the query plan produced by
PREPARE).
If it doesn't use the index this breaks the transparency of converting to
PREPARE i.e. the "setInt() vs. int8 column" is already broken in some cases
with the current driver. So making setLong() vs. int8 work at the expense of
this case doesn't really break things further. Honest. :)
-O
			
		On Thu, Sep 11, 2003 at 05:28:34PM +1200, Oliver Jowett wrote: > If it doesn't use the index this breaks the transparency of converting to > PREPARE i.e. the "setInt() vs. int8 column" is already broken in some cases Sorry, this should read 'the "setLong() vs int4 column" case'. -O
Oliver Jowett <oliver@opencloud.com> writes:
> which should use the index. However with server-side prepared queries
> enabled the query is transformed to something like this:
>   PREPARE jdbc_statement_N(int8) AS SELECT * FROM foo WHERE bar = $1;
>   EXECUTE jdbc_statement_N(12345)
> Which I suspect won't use the index as $1 is typed to int8. (anyone know for
> sure? I can't find an easy way to get at the query plan produced by
> PREPARE).
In 7.4 you can do EXPLAIN EXECUTE.  But no, this wouldn't use the index
:-(.  What exactly is causing the driver to decide to type the parameter
as int8?
            regards, tom lane
			
		On Thu, Sep 11, 2003 at 01:40:21AM -0400, Tom Lane wrote: > Oliver Jowett <oliver@opencloud.com> writes: > > which should use the index. However with server-side prepared queries > > enabled the query is transformed to something like this: > > > PREPARE jdbc_statement_N(int8) AS SELECT * FROM foo WHERE bar = $1; > > EXECUTE jdbc_statement_N(12345) > > > Which I suspect won't use the index as $1 is typed to int8. (anyone know for > > sure? I can't find an easy way to get at the query plan produced by > > PREPARE). > > In 7.4 you can do EXPLAIN EXECUTE. But no, this wouldn't use the index > :-(. What exactly is causing the driver to decide to type the parameter > as int8? JDBC doesn't provide a separate parameter-typing step for parameterized queries, so the driver has to infer the type from the method used to set the actual value of a parameter. In this case, setLong() takes a Java long which is a signed 64-bit integer, so the driver picks int8. There's a more general interface (setObject()) that allows both a value and a target SQL type to be specified, but that's not commonly used for primitive types thanks to Java's primitive-vs-Object distinction. See http://java.sun.com/j2se/1.4.2/docs/api/java/sql/PreparedStatement.html -O
On 11/09/2003 04:42 Tom Lane wrote: > Paul Thomas <paul@tmsl.demon.co.uk> writes: > > I think the first big hurdle is going to be making them realize that > with > > CMP there _is no_ SQL source to modify in the first place. Yes, I think > > > you need to petition hackers and maybe x-post to advocacy too - there > are > > also people there who need to be made aware that PostgreSQL has a > serious > > Achilles heel as an enterprise database! > > <rolls eyes> Do you think we have not heard about the datatype-mismatch > issue ten thousand times before? Don't waste our time with "petitions". > Give us a practical way to fix it (ie, one that doesn't create more > problems than it solves). See for example > http://archives.postgresql.org/pgsql-hackers/2002-11/msg00468.php > as a recent discussion of the pitfalls involved. Without meaningful knowledge of the back-end internals its difficult for me to see where that discussion fits in... > > It occurred to me this afternoon that we might be able to improve > matters for int8 without necessarily fixing the general problem. > The problems cited in the above message mostly stem from trying to > type small constants as int2 so that "int2col = 42" is indexable. > Once upon a time it seemed that was where the hot buttons were, but > if your hot button is mostly int8, maybe we could fix that by removing > the int8-vs-int4 cross-type operators, and not touch the initial typing > of integer literals just yet. Does someone want to explore the > consequences of trying that? Maybe the answer lies in finding out why I can define define an index on a double precision column 'quantity' and the the planner will use the index :- test_db=#\d type_test Table "public.type_test" Column | Type | Modifiers ----------+------------------+----------- id | integer | age | smallint | quantity | double precision | Indexes: type_test_age btree (age), type_test_qty btree (quantity) test_db=# explain select * from type_test where quantity = 0; QUERY PLAN -------------------------------------------------------------------------------- Index Scan using type_test_qty on type_test (cost=0.00..3.35 rows=1 width=14) Index Cond: (quantity = 0::double precision) (2 rows) -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for the Smaller Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
On Wednesday, September 10, 2003, at 11:42 PM, Tom Lane wrote: > Once upon a time it seemed that was where the hot buttons were, but > if your hot button is mostly int8, maybe we could fix that by removing > the int8-vs-int4 cross-type operators, and not touch the initial typing > of integer literals just yet. Does someone want to explore the > consequences of trying that? > I would be delighted, sitting on a 59 table production OLTP system, all with int8 primary keys and foreign key references out the yin yang. Things are running within within our needs currently, but just having realized that things could potentially be much better, the desire for 'wow, we're doing all of these O(n) searches that we thought were log2(n), so this could be so much better than we realized without us touching our app logic at all.' My personal need-to-be-scratched is int8 exclusively. What sort of 'fool' would ever have a int2 index :-) ? Probably the same number of 'fools' with int8. James
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi!
As i told before if you have an app already in production state i
recomend you to do as we did:
patch the setLong  and setBigDecimal methods and compile your own
version of postgres JDBC driver.
It's very easy to be done. Here is an example:
The orginal look like:
public void setBigDecimal(int parameterIndex, BigDecimal x) throws
SQLException
       {
     if (x == null)
     setNull(parameterIndex, Types.OTHER);
     else
     set(parameterIndex, x.toString());
       }
public void setLong(int parameterIndex, long x) throws SQLException {
        set(parameterIndex, (new Long(x)).toString());
}
and we changed de set(...) to setString(..) and its look like:
public void setBigDecimal(int parameterIndex, BigDecimal x) throws
SQLException {
     if (x == null)
     setNull(parameterIndex, Types.OTHER);
     else
     setString(parameterIndex, x.toString());
}
public void setLong(int parameterIndex, long x) throws SQLException {
       setString(parameterIndex, (new Long(x)).toString());
}
With this change when we use the setBigdecimal or the setLong in a
query the value passed will be evaluated and the will be correctly
used.
This has been working in a production database for more than a year.
Regards.
João Paulo Ribeiro
James Robinson wrote:
>
> On Wednesday, September 10, 2003, at 11:42  PM, Tom Lane wrote:
>
>> Once upon a time it seemed that was where the hot buttons were, but
>> if your hot button is mostly int8, maybe we could fix that by removing
>> the int8-vs-int4 cross-type operators, and not touch the initial typing
>> of integer literals just yet.  Does someone want to explore the
>> consequences of trying that?
>>
>
> I would be delighted, sitting on a 59 table production OLTP system,
> all with int8 primary keys and foreign key references out the yin yang.
> Things are running within within our needs currently, but just having
> realized that things could potentially be much better, the desire for
> 'wow, we're doing all of these O(n) searches that we thought were
> log2(n), so this could be so much better than we realized without
> us touching our app logic at all.'
>
> My personal need-to-be-scratched is int8 exclusively. What sort of 'fool'
> would ever have a int2 index :-) ? Probably the same number of 'fools'
> with int8.
>
> James
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQE/YZzwbwM7R+C+9B0RAjZfAKCwHXfajcKndw8mDf31CPzpm0BinwCeLu2S
PvX0B9KY5zfwiVJFi29MRD8=
=3Jay
-----END PGP SIGNATURE-----