Обсуждение: Performance improvement proposal. Removal of toLowerCase calls.
Hi, I would like to propose an optimization to improve performance in the jdbc driver. The performance improvement has been tested on commodity hardware using an industry standard Java benchmark. The overall benchmark metric reports an improvement in performance. Profiling using sampling showed calls reduced from 1100 to 0 when the benchmark workload is running. The optimization will eliminate calls to the method toLowerCase(java.util.Locale). This in the pg-jdbc org.postgresql.jdbc2.AbstractJdbc2ResultSet.findColumnIndex(java.lang.String) method when setting up the column index registry. For the optimization to be enabled I suggest relying on a new system property. Making the existing functionality the default behaviour to ensure existing applications do not break when the driver is upgraded. The change removes the call toLowerCase when putting items in the registry [1]. Essentially what's being proposed is removing sanitizing the key names. For best performance application code should pass SQL to the driver with column names already folded to lower case. But upper case names will still be matched in the second lookup [2] in the method. For this optimization to work this feature introduces a requirement on applications. To use all lower or upper case column names. Are there other factors that might affect this optimization that need to be taken into consideration ? Regards, Jeremy Whiting Senior Software Engineer Red Hat [1] https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2742 [2] https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2752
Jeremy.
I am even wondering why we would be doing that and even worse using a locale to do it.
AFAICT, the db would return everything in lower case anyway, and if the column really had mixed case it should be returned with it's original case intact.
Dave
On Thu, Jan 16, 2014 at 10:44 AM, Jeremy Whiting <jwhiting@redhat.com> wrote:
Hi,
I would like to propose an optimization to improve performance in the
jdbc driver. The performance improvement has been tested on commodity
hardware using an industry standard Java benchmark. The overall
benchmark metric reports an improvement in performance. Profiling using
sampling showed calls reduced from 1100 to 0 when the benchmark workload
is running.
The optimization will eliminate calls to the method
toLowerCase(java.util.Locale). This in the pg-jdbc
org.postgresql.jdbc2.AbstractJdbc2ResultSet.findColumnIndex(java.lang.String)
method when setting up the column index registry.
For the optimization to be enabled I suggest relying on a new system
property. Making the existing functionality the default behaviour to
ensure existing applications do not break when the driver is upgraded.
The change removes the call toLowerCase when putting items in the
registry [1]. Essentially what's being proposed is removing sanitizing
the key names. For best performance application code should pass SQL to
the driver with column names already folded to lower case. But upper
case names will still be matched in the second lookup [2] in the method.
For this optimization to work this feature introduces a requirement on
applications. To use all lower or upper case column names.
Are there other factors that might affect this optimization that need
to be taken into consideration ?
Regards,
Jeremy Whiting
Senior Software Engineer
Red Hat
[1]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2742
[2]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2752
--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc
Hi Dave,
On 16/01/14 16:12, Dave Cramer wrote:
On 16/01/14 16:12, Dave Cramer wrote:
I suspect it's caters for when the request SQL is mixed case. To get a match from the db server the response SQL (as you say is lower case) you'd need to make sure the .put(...) on the registry is either all upper or lower.Jeremy.I am even wondering why we would be doing that and even worse using a locale to do it.
Shall I prepare a PR with a test case ?AFAICT, the db would return everything in lower case anyway, and if the column really had mixed case it should be returned with it's original case intact.
Jeremy
DaveOn Thu, Jan 16, 2014 at 10:44 AM, Jeremy Whiting <jwhiting@redhat.com> wrote:Hi,
I would like to propose an optimization to improve performance in the
jdbc driver. The performance improvement has been tested on commodity
hardware using an industry standard Java benchmark. The overall
benchmark metric reports an improvement in performance. Profiling using
sampling showed calls reduced from 1100 to 0 when the benchmark workload
is running.
The optimization will eliminate calls to the method
toLowerCase(java.util.Locale). This in the pg-jdbc
org.postgresql.jdbc2.AbstractJdbc2ResultSet.findColumnIndex(java.lang.String)
method when setting up the column index registry.
For the optimization to be enabled I suggest relying on a new system
property. Making the existing functionality the default behaviour to
ensure existing applications do not break when the driver is upgraded.
The change removes the call toLowerCase when putting items in the
registry [1]. Essentially what's being proposed is removing sanitizing
the key names. For best performance application code should pass SQL to
the driver with column names already folded to lower case. But upper
case names will still be matched in the second lookup [2] in the method.
For this optimization to work this feature introduces a requirement on
applications. To use all lower or upper case column names.
Are there other factors that might affect this optimization that need
to be taken into consideration ?
Regards,
Jeremy Whiting
Senior Software Engineer
Red Hat
[1]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2742
[2]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2752
--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc
Well postgresql folds to lower case by default so the case of the request SQL is irrelevant unless the column is double quoted or the column really does have mixed case.
Please do prepare PR and test case
Thanks
On Thu, Jan 16, 2014 at 12:50 PM, Jeremy Whiting <jwhiting@redhat.com> wrote:
Hi Dave,
On 16/01/14 16:12, Dave Cramer wrote:I suspect it's caters for when the request SQL is mixed case. To get a match from the db server the response SQL (as you say is lower case) you'd need to make sure the .put(...) on the registry is either all upper or lower.Jeremy.I am even wondering why we would be doing that and even worse using a locale to do it.Shall I prepare a PR with a test case ?AFAICT, the db would return everything in lower case anyway, and if the column really had mixed case it should be returned with it's original case intact.
JeremyDaveOn Thu, Jan 16, 2014 at 10:44 AM, Jeremy Whiting <jwhiting@redhat.com> wrote:Hi,
I would like to propose an optimization to improve performance in the
jdbc driver. The performance improvement has been tested on commodity
hardware using an industry standard Java benchmark. The overall
benchmark metric reports an improvement in performance. Profiling using
sampling showed calls reduced from 1100 to 0 when the benchmark workload
is running.
The optimization will eliminate calls to the method
toLowerCase(java.util.Locale). This in the pg-jdbc
org.postgresql.jdbc2.AbstractJdbc2ResultSet.findColumnIndex(java.lang.String)
method when setting up the column index registry.
For the optimization to be enabled I suggest relying on a new system
property. Making the existing functionality the default behaviour to
ensure existing applications do not break when the driver is upgraded.
The change removes the call toLowerCase when putting items in the
registry [1]. Essentially what's being proposed is removing sanitizing
the key names. For best performance application code should pass SQL to
the driver with column names already folded to lower case. But upper
case names will still be matched in the second lookup [2] in the method.
For this optimization to work this feature introduces a requirement on
applications. To use all lower or upper case column names.
Are there other factors that might affect this optimization that need
to be taken into consideration ?
Regards,
Jeremy Whiting
Senior Software Engineer
Red Hat
[1]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2742
[2]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2752
--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc
Jeremy Whiting wrote: > Hi, > I would like to propose an optimization to improve performance in the > jdbc driver. The performance improvement has been tested on commodity > hardware using an industry standard Java benchmark. The overall > benchmark metric reports an improvement in performance. Profiling using > sampling showed calls reduced from 1100 to 0 when the benchmark workload > is running. > > The optimization will eliminate calls to the method > toLowerCase(java.util.Locale). This in the pg-jdbc > org.postgresql.jdbc2.AbstractJdbc2ResultSet.findColumnIndex(java.lang.String) > method when setting up the column index registry. > > For the optimization to be enabled I suggest relying on a new system > property. Making the existing functionality the default behaviour to > ensure existing applications do not break when the driver is upgraded. > > The change removes the call toLowerCase when putting items in the > registry [1]. Essentially what's being proposed is removing sanitizing > the key names. For best performance application code should pass SQL to > the driver with column names already folded to lower case. But upper > case names will still be matched in the second lookup [2] in the method. **************** > For this optimization to work this feature introduces a requirement on > applications. To use all lower or upper case column names. **************** This is going to break existing applications, if the requirement is to have either upper or lowercase. I test explicitly with the MyJSQLView application the use of mixed case column, key index names because there are applications that used mixed case. The proper way to handle is quote to handle the mixed case so it is store as such. Perhaps the optimization could check for quoting then do no addition process and store directly intact. Other wise the optimization could initiate as proposed. danap. > > Are there other factors that might affect this optimization that need > to be taken into consideration ? > > Regards, > Jeremy Whiting > > Senior Software Engineer > Red Hat > > [1] > https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2742 > [2] > https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2752 >
On Thu, Jan 16, 2014 at 2:21 PM, dmp <danap@ttc-cmc.net> wrote:
Are you sure ? This is in the resultset, so any column names should have come back from the db.
Jeremy Whiting wrote:****************Hi,
I would like to propose an optimization to improve performance in the
jdbc driver. The performance improvement has been tested on commodity
hardware using an industry standard Java benchmark. The overall
benchmark metric reports an improvement in performance. Profiling using
sampling showed calls reduced from 1100 to 0 when the benchmark workload
is running.
The optimization will eliminate calls to the method
toLowerCase(java.util.Locale). This in the pg-jdbc
org.postgresql.jdbc2.AbstractJdbc2ResultSet.findColumnIndex(java.lang.String)
method when setting up the column index registry.
For the optimization to be enabled I suggest relying on a new system
property. Making the existing functionality the default behaviour to
ensure existing applications do not break when the driver is upgraded.
The change removes the call toLowerCase when putting items in the
registry [1]. Essentially what's being proposed is removing sanitizing
the key names. For best performance application code should pass SQL to
the driver with column names already folded to lower case. But upper
case names will still be matched in the second lookup [2] in the method.****************For this optimization to work this feature introduces a requirement on
applications. To use all lower or upper case column names.
This is going to break existing applications, if the requirement is to have
either upper or lowercase. I test explicitly with the MyJSQLView application
the use of mixed case column, key index names because there are applications
that used mixed case. The proper way to handle is quote to handle the mixed
case so it is store as such.
Perhaps the optimization could check for quoting then do no addition process
and store directly intact. Other wise the optimization could initiate as
proposed.
danap.
Which means that they should come back in lower case anyway.
I would think the current code would break apps that having real mixed case columns in the database ?
Jeremy's proposal would leave the case alone and store it in the map with mixed case.
[2]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2752
--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc
> Are you sure ? This is in the resultset, so any column names should have come back from the db. > Which means that they should come back in lower case anyway. create table foo ( "ID" integer, "Bar" varchar ); select * from foo; Should return ID and Bar as the column names. Danap probably also referred to the column names that are returned through getTableColumns() or getIndexColumns() or anyother API call that returns information about tables, columns or other identifiers. I do the same thing in SQL Workbench/J:check the returned column names whether they are upper, lower or mixed case in order to know whether generatedSQL statements (e.g. an update resulting from editing the data) need to quote the column names or not. Thomas
On Thu, Jan 16, 2014 at 2:47 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:
create table fooAre you sure ? This is in the resultset, so any column names should have come back from the db.
Which means that they should come back in lower case anyway.
(
"ID" integer,
"Bar" varchar
);
select *
from foo;
Should return ID and Bar as the column names.
Exactly! While I haven't actually tested the code. The lines that Jeremy is referring to would seem to suggest that the return columns would be id and bar.
Does anyone have time to test this ?
Dave Cramer wrote on 16.01.2014 20:51: > > Are you sure ? This is in the resultset, so any column names should have come back from the db. > Which means that they should come back in lower case anyway. > > create table foo > ( > "ID" integer, > "Bar" varchar > ); > > select * > from foo; > > Should return ID and Bar as the column names. > > > Exactly! While I haven't actually tested the code. The lines that Jeremy is referring to would seem to suggest that >the return columns would be id and bar. > > Does anyone have time to test this ? The current driver returns this correctly for a ResultSet. Here is a "transcript" from running my application on the console (easier to copy & paste than a screenshot) SQL Workbench/J console interface started. Enter 'exit' to quit. Enter 'WbHelp' for a list of SQL Workbench/J specific commands Config directory: C:\Projects\sqlworkbench\conf SQL> wbconnect {Postgres}/WbTest; Connection to "User=thomas, Schema=public, URL=jdbc:postgresql://localhost/wbtest" successful thomas@wbtest/public> create table foo ("ID" integer, "Bar" varchar, "FoBaR" varchar); Table 'foo' created thomas@wbtest/public> insert into foo values (1, 'foo', 'bar'); INSERT executed successfully 1 row affected thomas@wbtest/public> select * from foo; ID | Bar | FoBaR ---+-----+------ 1 | foo | bar (1 Row) SELECT executed successfully thomas@wbtest/public> All the various API calls in DatabaseMetaData do return identifiers in the correct case as well.
Thomas Kellerer wrote: > Dave Cramer wrote on 16.01.2014 20:51: >> >> Are you sure ? This is in the resultset, so any column names should >> have come back from the db. >> Which means that they should come back in lower case anyway. >> >> create table foo >> ( >> "ID" integer, >> "Bar" varchar >> ); >> >> select * >> from foo; >> >> Should return ID and Bar as the column names. >> >> >> Exactly! While I haven't actually tested the code. The lines that >> Jeremy is referring to would seem to suggest that >> the return columns would be id and bar. >> >> Does anyone have time to test this ? > > The current driver returns this correctly for a ResultSet. > > Here is a "transcript" from running my application on the console > (easier to copy & paste than a screenshot) > > SQL Workbench/J console interface started. > Enter 'exit' to quit. > Enter 'WbHelp' for a list of SQL Workbench/J specific commands > Config directory: C:\Projects\sqlworkbench\conf > > SQL> wbconnect {Postgres}/WbTest; > Connection to "User=thomas, Schema=public, > URL=jdbc:postgresql://localhost/wbtest" successful > > thomas@wbtest/public> create table foo ("ID" integer, "Bar" varchar, > "FoBaR" varchar); > Table 'foo' created > thomas@wbtest/public> insert into foo values (1, 'foo', 'bar'); > INSERT executed successfully > 1 row affected > thomas@wbtest/public> select * from foo; > ID | Bar | FoBaR > ---+-----+------ > 1 | foo | bar > > (1 Row) > SELECT executed successfully > thomas@wbtest/public> > > All the various API calls in DatabaseMetaData do return identifiers in > the correct case as well. > This is also the type of results that I have observed in the past in testing. The tableMetaData.getColumnName(i) will also return back the correct mixed case result. Ex. from table below. 1 Host 2 Db 3 Username 4 select_priv Not only does this apply to columns but also to table names that are quoted Normally I use the following table to test such occurrences in my code. DROP TABLE IF EXISTS "keY_tAble2"; CREATE TABLE "keY_tAble2" ( "Host" char(60) NOT NULL default '', "Db" char(64) NOT NULL default '', "Username" char(16) NOT NULL default '', Select_priv boolean NOT NULL default TRUE, PRIMARY KEY ("Host","Db","Username") ); danap.
Hi Dave and danap,
On 16/01/14 19:33, Dave Cramer wrote:
On 16/01/14 19:33, Dave Cramer wrote:
I have found this to be incorrect. The column registry is built using fields/columns that have come back from the db.On Thu, Jan 16, 2014 at 2:21 PM, dmp <danap@ttc-cmc.net> wrote:Jeremy Whiting wrote:Hi,
I would like to propose an optimization to improve performance in the
jdbc driver. The performance improvement has been tested on commodity
hardware using an industry standard Java benchmark. The overall
benchmark metric reports an improvement in performance. Profiling using
sampling showed calls reduced from 1100 to 0 when the benchmark workload
is running.
The optimization will eliminate calls to the method
toLowerCase(java.util.Locale). This in the pg-jdbc
org.postgresql.jdbc2.AbstractJdbc2ResultSet.findColumnIndex(java.lang.String)
method when setting up the column index registry.
For the optimization to be enabled I suggest relying on a new system
property. Making the existing functionality the default behaviour to
ensure existing applications do not break when the driver is upgraded.
The change removes the call toLowerCase when putting items in the
registry [1]. Essentially what's being proposed is removing sanitizing
the key names. For best performance application code should pass SQL to
the driver with column names already folded to lower case.
Yes it's incompatible. Mixed case situations should use the existing behaviour in 9.3-1100 which is proposed to be the default.****************But upper
case names will still be matched in the second lookup [2] in the method.****************For this optimization to work this feature introduces a requirement on
applications. To use all lower or upper case column names.
This is going to break existing applications, if the requirement is to have
either upper or lowercase.
Column names coming back from the database are sanitized. Before putting into the column registry.I test explicitly with the MyJSQLView application
the use of mixed case column, key index names because there are applications
that used mixed case. The proper way to handle is quote to handle the mixed
case so it is store as such.
Perhaps the optimization could check for quoting then do no addition process
and store directly intact. Other wise the optimization could initiate as
proposed.
danap.Are you sure ? This is in the resultset, so any column names should have come back from the db.Which means that they should come back in lower case anyway.I would think the current code would break apps that having real mixed case columns in the database ?
The proposal has a hard requirement of either all upper or lower case columns defined in the database.Jeremy's proposal would leave the case alone and store it in the map with mixed case.
If a user attempts to enable the optimization with an application that uses mixed case the driver does throw an exception of type org.postgresql.util.PSQLException. With this message "The column name <column name> was not found in this ResultSet.".
Jeremy
[2]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2752
--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc
-- Jeremy Whiting Senior Software Engineer, Performance Team Red Hat ------------------------------------------------------------ Registered Address: Red Hat UK Ltd, 64 Baker Street, 4th Floor, London. W1U 7DF. United Kingdom. Registered in England and Wales under Company Registration No. 03798903. Directors: Michael Cunningham (USA), Charlie Peters (USA), Matt Parson (USA), Paul Hickey (Ireland)
Jeremy,
At this point send a PR and we can test it. If you are proposing having an optimization switch which defaults to off I can't see why I would not merge it.
On Fri, Jan 17, 2014 at 11:48 AM, Jeremy Whiting <jwhiting@redhat.com> wrote:
Hi Dave and danap,
On 16/01/14 19:33, Dave Cramer wrote:I have found this to be incorrect. The column registry is built using fields/columns that have come back from the db.On Thu, Jan 16, 2014 at 2:21 PM, dmp <danap@ttc-cmc.net> wrote:Jeremy Whiting wrote:Hi,
I would like to propose an optimization to improve performance in the
jdbc driver. The performance improvement has been tested on commodity
hardware using an industry standard Java benchmark. The overall
benchmark metric reports an improvement in performance. Profiling using
sampling showed calls reduced from 1100 to 0 when the benchmark workload
is running.
The optimization will eliminate calls to the method
toLowerCase(java.util.Locale). This in the pg-jdbc
org.postgresql.jdbc2.AbstractJdbc2ResultSet.findColumnIndex(java.lang.String)
method when setting up the column index registry.
For the optimization to be enabled I suggest relying on a new system
property. Making the existing functionality the default behaviour to
ensure existing applications do not break when the driver is upgraded.
The change removes the call toLowerCase when putting items in the
registry [1]. Essentially what's being proposed is removing sanitizing
the key names. For best performance application code should pass SQL to
the driver with column names already folded to lower case.Yes it's incompatible. Mixed case situations should use the existing behaviour in 9.3-1100 which is proposed to be the default.****************But upper
case names will still be matched in the second lookup [2] in the method.****************For this optimization to work this feature introduces a requirement on
applications. To use all lower or upper case column names.
This is going to break existing applications, if the requirement is to have
either upper or lowercase.Column names coming back from the database are sanitized. Before putting into the column registry.I test explicitly with the MyJSQLView application
the use of mixed case column, key index names because there are applications
that used mixed case. The proper way to handle is quote to handle the mixed
case so it is store as such.
Perhaps the optimization could check for quoting then do no addition process
and store directly intact. Other wise the optimization could initiate as
proposed.
danap.Are you sure ? This is in the resultset, so any column names should have come back from the db.Which means that they should come back in lower case anyway.I would think the current code would break apps that having real mixed case columns in the database ?The proposal has a hard requirement of either all upper or lower case columns defined in the database.Jeremy's proposal would leave the case alone and store it in the map with mixed case.
If a user attempts to enable the optimization with an application that uses mixed case the driver does throw an exception of type org.postgresql.util.PSQLException. With this message "The column name <column name> was not found in this ResultSet.".
Jeremy[2]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2752
--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc-- Jeremy Whiting Senior Software Engineer, Performance Team Red Hat ------------------------------------------------------------ Registered Address: Red Hat UK Ltd, 64 Baker Street, 4th Floor, London. W1U 7DF. United Kingdom. Registered in England and Wales under Company Registration No. 03798903. Directors: Michael Cunningham (USA), Charlie Peters (USA), Matt Parson (USA), Paul Hickey (Ireland)
Hi Dave,
A pull request has been created. See [1].
In the commit are:
Changes to the Driver and DataSource configuration. To recognise the new property called "disableColumnSanitiser". The default value for this property is false. Comments to rename the property are welcome.
3 additional test case classes added to the jdbc2 test suite. Each class groups tests for:
a) ColumnSanitiserEnabledTest. Tests to check existing and default behaviour still works as expected.
b) ColumnSanitiserDisabledTest. Tests to check new behaviour works as expected.
c) CaseOptimiserDataSourceTest. Checking a datasource configured with the sanitiser disabled will recognise the property configuration.
I've attempted to conform to the coding guidelines where possible. The test-suite passes for 1.7 and 1.6 spec. 1.5 fails due to code in master [1] branch.
Regards,
Jeremy
[1] https://github.com/pgjdbc/pgjdbc/pull/113
[2] http://www.postgresql.org/message-id/52DFA5D0.7000901@redhat.com
On 17/01/14 16:55, Dave Cramer wrote:
A pull request has been created. See [1].
In the commit are:
Changes to the Driver and DataSource configuration. To recognise the new property called "disableColumnSanitiser". The default value for this property is false. Comments to rename the property are welcome.
3 additional test case classes added to the jdbc2 test suite. Each class groups tests for:
a) ColumnSanitiserEnabledTest. Tests to check existing and default behaviour still works as expected.
b) ColumnSanitiserDisabledTest. Tests to check new behaviour works as expected.
c) CaseOptimiserDataSourceTest. Checking a datasource configured with the sanitiser disabled will recognise the property configuration.
I've attempted to conform to the coding guidelines where possible. The test-suite passes for 1.7 and 1.6 spec. 1.5 fails due to code in master [1] branch.
Regards,
Jeremy
[1] https://github.com/pgjdbc/pgjdbc/pull/113
[2] http://www.postgresql.org/message-id/52DFA5D0.7000901@redhat.com
On 17/01/14 16:55, Dave Cramer wrote:
Jeremy,At this point send a PR and we can test it. If you are proposing having an optimization switch which defaults to off I can't see why I would not merge it.On Fri, Jan 17, 2014 at 11:48 AM, Jeremy Whiting <jwhiting@redhat.com> wrote:Hi Dave and danap,
On 16/01/14 19:33, Dave Cramer wrote:I have found this to be incorrect. The column registry is built using fields/columns that have come back from the db.On Thu, Jan 16, 2014 at 2:21 PM, dmp <danap@ttc-cmc.net> wrote:Jeremy Whiting wrote:Hi,
I would like to propose an optimization to improve performance in the
jdbc driver. The performance improvement has been tested on commodity
hardware using an industry standard Java benchmark. The overall
benchmark metric reports an improvement in performance. Profiling using
sampling showed calls reduced from 1100 to 0 when the benchmark workload
is running.
The optimization will eliminate calls to the method
toLowerCase(java.util.Locale). This in the pg-jdbc
org.postgresql.jdbc2.AbstractJdbc2ResultSet.findColumnIndex(java.lang.String)
method when setting up the column index registry.
For the optimization to be enabled I suggest relying on a new system
property. Making the existing functionality the default behaviour to
ensure existing applications do not break when the driver is upgraded.
The change removes the call toLowerCase when putting items in the
registry [1]. Essentially what's being proposed is removing sanitizing
the key names. For best performance application code should pass SQL to
the driver with column names already folded to lower case.Yes it's incompatible. Mixed case situations should use the existing behaviour in 9.3-1100 which is proposed to be the default.****************But upper
case names will still be matched in the second lookup [2] in the method.****************For this optimization to work this feature introduces a requirement on
applications. To use all lower or upper case column names.
This is going to break existing applications, if the requirement is to have
either upper or lowercase.Column names coming back from the database are sanitized. Before putting into the column registry.I test explicitly with the MyJSQLView application
the use of mixed case column, key index names because there are applications
that used mixed case. The proper way to handle is quote to handle the mixed
case so it is store as such.
Perhaps the optimization could check for quoting then do no addition process
and store directly intact. Other wise the optimization could initiate as
proposed.
danap.Are you sure ? This is in the resultset, so any column names should have come back from the db.Which means that they should come back in lower case anyway.I would think the current code would break apps that having real mixed case columns in the database ?The proposal has a hard requirement of either all upper or lower case columns defined in the database.Jeremy's proposal would leave the case alone and store it in the map with mixed case.
If a user attempts to enable the optimization with an application that uses mixed case the driver does throw an exception of type org.postgresql.util.PSQLException. With this message "The column name <column name> was not found in this ResultSet.".
Jeremy[2]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2752
--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc-- Jeremy Whiting Senior Software Engineer, Performance Team Red Hat ------------------------------------------------------------ Registered Address: Red Hat UK Ltd, 64 Baker Street, 4th Floor, London. W1U 7DF. United Kingdom. Registered in England and Wales under Company Registration No. 03798903. Directors: Michael Cunningham (USA), Charlie Peters (USA), Matt Parson (USA), Paul Hickey (Ireland)
Hi Jeremy,
I saw the PR. Thanks
What tests fail in 1.5 ?
On Wed, Jan 22, 2014 at 2:11 PM, Jeremy Whiting <jwhiting@redhat.com> wrote:
Hi Dave,
A pull request has been created. See [1].
In the commit are:
Changes to the Driver and DataSource configuration. To recognise the new property called "disableColumnSanitiser". The default value for this property is false. Comments to rename the property are welcome.
3 additional test case classes added to the jdbc2 test suite. Each class groups tests for:
a) ColumnSanitiserEnabledTest. Tests to check existing and default behaviour still works as expected.
b) ColumnSanitiserDisabledTest. Tests to check new behaviour works as expected.
c) CaseOptimiserDataSourceTest. Checking a datasource configured with the sanitiser disabled will recognise the property configuration.
I've attempted to conform to the coding guidelines where possible. The test-suite passes for 1.7 and 1.6 spec. 1.5 fails due to code in master [1] branch.
Regards,
Jeremy
[1] https://github.com/pgjdbc/pgjdbc/pull/113
[2] http://www.postgresql.org/message-id/52DFA5D0.7000901@redhat.com
On 17/01/14 16:55, Dave Cramer wrote:Jeremy,At this point send a PR and we can test it. If you are proposing having an optimization switch which defaults to off I can't see why I would not merge it.On Fri, Jan 17, 2014 at 11:48 AM, Jeremy Whiting <jwhiting@redhat.com> wrote:Hi Dave and danap,
On 16/01/14 19:33, Dave Cramer wrote:I have found this to be incorrect. The column registry is built using fields/columns that have come back from the db.On Thu, Jan 16, 2014 at 2:21 PM, dmp <danap@ttc-cmc.net> wrote:Jeremy Whiting wrote:Hi,
I would like to propose an optimization to improve performance in the
jdbc driver. The performance improvement has been tested on commodity
hardware using an industry standard Java benchmark. The overall
benchmark metric reports an improvement in performance. Profiling using
sampling showed calls reduced from 1100 to 0 when the benchmark workload
is running.
The optimization will eliminate calls to the method
toLowerCase(java.util.Locale). This in the pg-jdbc
org.postgresql.jdbc2.AbstractJdbc2ResultSet.findColumnIndex(java.lang.String)
method when setting up the column index registry.
For the optimization to be enabled I suggest relying on a new system
property. Making the existing functionality the default behaviour to
ensure existing applications do not break when the driver is upgraded.
The change removes the call toLowerCase when putting items in the
registry [1]. Essentially what's being proposed is removing sanitizing
the key names. For best performance application code should pass SQL to
the driver with column names already folded to lower case.Yes it's incompatible. Mixed case situations should use the existing behaviour in 9.3-1100 which is proposed to be the default.****************But upper
case names will still be matched in the second lookup [2] in the method.****************For this optimization to work this feature introduces a requirement on
applications. To use all lower or upper case column names.
This is going to break existing applications, if the requirement is to have
either upper or lowercase.Column names coming back from the database are sanitized. Before putting into the column registry.I test explicitly with the MyJSQLView application
the use of mixed case column, key index names because there are applications
that used mixed case. The proper way to handle is quote to handle the mixed
case so it is store as such.
Perhaps the optimization could check for quoting then do no addition process
and store directly intact. Other wise the optimization could initiate as
proposed.
danap.Are you sure ? This is in the resultset, so any column names should have come back from the db.Which means that they should come back in lower case anyway.I would think the current code would break apps that having real mixed case columns in the database ?The proposal has a hard requirement of either all upper or lower case columns defined in the database.Jeremy's proposal would leave the case alone and store it in the map with mixed case.
If a user attempts to enable the optimization with an application that uses mixed case the driver does throw an exception of type org.postgresql.util.PSQLException. With this message "The column name <column name> was not found in this ResultSet.".
Jeremy[2]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2752
--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc-- Jeremy Whiting Senior Software Engineer, Performance Team Red Hat ------------------------------------------------------------ Registered Address: Red Hat UK Ltd, 64 Baker Street, 4th Floor, London. W1U 7DF. United Kingdom. Registered in England and Wales under Company Registration No. 03798903. Directors: Michael Cunningham (USA), Charlie Peters (USA), Matt Parson (USA), Paul Hickey (Ireland)
Hi Dave,
It's not tests that fail but compilation errors.
The compile phase output is attached to an email in a new thread I started earlier. Subject is "Master branch compile error using 1.5 spec.".
Jeremy
On 22/01/14 19:38, Dave Cramer wrote:
It's not tests that fail but compilation errors.
The compile phase output is attached to an email in a new thread I started earlier. Subject is "Master branch compile error using 1.5 spec.".
Jeremy
On 22/01/14 19:38, Dave Cramer wrote:
Hi Jeremy,I saw the PR. ThanksWhat tests fail in 1.5 ?On Wed, Jan 22, 2014 at 2:11 PM, Jeremy Whiting <jwhiting@redhat.com> wrote:Hi Dave,
A pull request has been created. See [1].
In the commit are:
Changes to the Driver and DataSource configuration. To recognise the new property called "disableColumnSanitiser". The default value for this property is false. Comments to rename the property are welcome.
3 additional test case classes added to the jdbc2 test suite. Each class groups tests for:
a) ColumnSanitiserEnabledTest. Tests to check existing and default behaviour still works as expected.
b) ColumnSanitiserDisabledTest. Tests to check new behaviour works as expected.
c) CaseOptimiserDataSourceTest. Checking a datasource configured with the sanitiser disabled will recognise the property configuration.
I've attempted to conform to the coding guidelines where possible. The test-suite passes for 1.7 and 1.6 spec. 1.5 fails due to code in master [1] branch.
Regards,
Jeremy
[1] https://github.com/pgjdbc/pgjdbc/pull/113
[2] http://www.postgresql.org/message-id/52DFA5D0.7000901@redhat.com
On 17/01/14 16:55, Dave Cramer wrote:Jeremy,At this point send a PR and we can test it. If you are proposing having an optimization switch which defaults to off I can't see why I would not merge it.On Fri, Jan 17, 2014 at 11:48 AM, Jeremy Whiting <jwhiting@redhat.com> wrote:Hi Dave and danap,
On 16/01/14 19:33, Dave Cramer wrote:I have found this to be incorrect. The column registry is built using fields/columns that have come back from the db.On Thu, Jan 16, 2014 at 2:21 PM, dmp <danap@ttc-cmc.net> wrote:Jeremy Whiting wrote:Hi,
I would like to propose an optimization to improve performance in the
jdbc driver. The performance improvement has been tested on commodity
hardware using an industry standard Java benchmark. The overall
benchmark metric reports an improvement in performance. Profiling using
sampling showed calls reduced from 1100 to 0 when the benchmark workload
is running.
The optimization will eliminate calls to the method
toLowerCase(java.util.Locale). This in the pg-jdbc
org.postgresql.jdbc2.AbstractJdbc2ResultSet.findColumnIndex(java.lang.String)
method when setting up the column index registry.
For the optimization to be enabled I suggest relying on a new system
property. Making the existing functionality the default behaviour to
ensure existing applications do not break when the driver is upgraded.
The change removes the call toLowerCase when putting items in the
registry [1]. Essentially what's being proposed is removing sanitizing
the key names. For best performance application code should pass SQL to
the driver with column names already folded to lower case.Yes it's incompatible. Mixed case situations should use the existing behaviour in 9.3-1100 which is proposed to be the default.****************But upper
case names will still be matched in the second lookup [2] in the method.****************For this optimization to work this feature introduces a requirement on
applications. To use all lower or upper case column names.
This is going to break existing applications, if the requirement is to have
either upper or lowercase.Column names coming back from the database are sanitized. Before putting into the column registry.I test explicitly with the MyJSQLView application
the use of mixed case column, key index names because there are applications
that used mixed case. The proper way to handle is quote to handle the mixed
case so it is store as such.
Perhaps the optimization could check for quoting then do no addition process
and store directly intact. Other wise the optimization could initiate as
proposed.
danap.Are you sure ? This is in the resultset, so any column names should have come back from the db.Which means that they should come back in lower case anyway.I would think the current code would break apps that having real mixed case columns in the database ?The proposal has a hard requirement of either all upper or lower case columns defined in the database.Jeremy's proposal would leave the case alone and store it in the map with mixed case.
If a user attempts to enable the optimization with an application that uses mixed case the driver does throw an exception of type org.postgresql.util.PSQLException. With this message "The column name <column name> was not found in this ResultSet.".
Jeremy[2]
https://github.com/pgjdbc/pgjdbc/blob/REL9_3_STABLE/org/postgresql/jdbc2/AbstractJdbc2ResultSet.java?#L2752
--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc-- Jeremy Whiting Senior Software Engineer, Performance Team Red Hat ------------------------------------------------------------ Registered Address: Red Hat UK Ltd, 64 Baker Street, 4th Floor, London. W1U 7DF. United Kingdom. Registered in England and Wales under Company Registration No. 03798903. Directors: Michael Cunningham (USA), Charlie Peters (USA), Matt Parson (USA), Paul Hickey (Ireland)
Dave,
Having merged in your change on master the compilation error is fixed. The testsuite is passing for the three different jdk specs.
For those interested to know the matching with combinations of case possible take a look at the test cases covering the proposed behaviour.
https://github.com/whitingjr/pgjdbc/blob/remove-toLowerCase-calls/org/postgresql/test/jdbc2/ColumnSanitiserDisabledTest.java?#L66
Jeremy
On 22/01/14 21:08, Jeremy Whiting wrote:
Having merged in your change on master the compilation error is fixed. The testsuite is passing for the three different jdk specs.
For those interested to know the matching with combinations of case possible take a look at the test cases covering the proposed behaviour.
https://github.com/whitingjr/pgjdbc/blob/remove-toLowerCase-calls/org/postgresql/test/jdbc2/ColumnSanitiserDisabledTest.java?#L66
Jeremy
On 22/01/14 21:08, Jeremy Whiting wrote:
Hi Dave,
It's not tests that fail but compilation errors.
The compile phase output is attached to an email in a new thread I started earlier. Subject is "Master branch compile error using 1.5 spec.".
Jeremy
On 22/01/14 19:38, Dave Cramer wrote:Hi Jeremy,I saw the PR. ThanksWhat tests fail in 1.5 ?
On 16/01/14 20:33, dmp wrote: > Thomas Kellerer wrote: >> Dave Cramer wrote on 16.01.2014 20:51: >>> >>> Are you sure ? This is in the resultset, so any column names should >>> have come back from the db. >>> Which means that they should come back in lower case anyway. >>> >>> create table foo >>> ( >>> "ID" integer, >>> "Bar" varchar >>> ); >>> >>> select * >>> from foo; >>> >>> Should return ID and Bar as the column names. >>> >>> >>> Exactly! While I haven't actually tested the code. The lines that >>> Jeremy is referring to would seem to suggest that >>> the return columns would be id and bar. >>> >>> Does anyone have time to test this ? >> >> The current driver returns this correctly for a ResultSet. >> >> Here is a "transcript" from running my application on the console >> (easier to copy & paste than a screenshot) >> >> SQL Workbench/J console interface started. >> Enter 'exit' to quit. >> Enter 'WbHelp' for a list of SQL Workbench/J specific commands >> Config directory: C:\Projects\sqlworkbench\conf >> >> SQL> wbconnect {Postgres}/WbTest; >> Connection to "User=thomas, Schema=public, >> URL=jdbc:postgresql://localhost/wbtest" successful >> >> thomas@wbtest/public> create table foo ("ID" integer, "Bar" varchar, >> "FoBaR" varchar); >> Table 'foo' created >> thomas@wbtest/public> insert into foo values (1, 'foo', 'bar'); >> INSERT executed successfully >> 1 row affected >> thomas@wbtest/public> select * from foo; >> ID | Bar | FoBaR >> ---+-----+------ >> 1 | foo | bar >> >> (1 Row) >> SELECT executed successfully >> thomas@wbtest/public> >> >> All the various API calls in DatabaseMetaData do return identifiers in >> the correct case as well. >> > > This is also the type of results that I have observed in the past in > testing. > The tableMetaData.getColumnName(i) will also return back the correct > mixed > case result. Ex. from table below. > > 1 Host > 2 Db > 3 Username > 4 select_priv > > Not only does this apply to columns but also to table names that are > quoted > Normally I use the following table to test such occurrences in my code. > > > DROP TABLE IF EXISTS "keY_tAble2"; > CREATE TABLE "keY_tAble2" ( > "Host" char(60) NOT NULL default '', > "Db" char(64) NOT NULL default '', > "Username" char(16) NOT NULL default '', > Select_priv boolean NOT NULL default TRUE, > PRIMARY KEY ("Host","Db","Username") > ); > > danap. > My testing does confirm the findings by danap and Thomas too. Names are correctly returned mixed. My proposal is for situations where the known case is already all upper or lower. For situations other than that the feature can be ignored. The current driver behaviour will be the default to prevent applications breaking. I don't have as much oversight on all possible use cases as others in the community. I may have been overlooked something. What other concerns are there if any ? Jeremy