Discussion: jdbc driver performance TODO
I'm interested in updating the performance TODO list for the jdbc driver. I want to refresh the TODO list with any new items people are aware of, plus make sure each item has a link to an agreed design if one exists.

Forgive me for the observation, but the current list does seem to be a little out of date:
- Add statement pooling to take advantage of server prepared statements.
- Allow scrollable ResultSets to not fetch all results in one batch.
- Allow refcursor ResultSets to not fetch all results in one batch.
- Allow binary data transfers for all datatypes not just bytea.

Could I shake the tree for any new performance suggestions? Or maybe not new exactly, but just not listed. Lots of detail please....

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
My first vote would go to adding support for COPY. (Where would I find a design doc for this if one exists? I'm sort of curious how this will work.)

My other is more of a question relating to parameterized statements and prepared statements, so I'm sending it in another post since I'm not sure if it qualifies as a TO-DO request.

Thanks,
Bucky

> -----Original Message-----
> From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Simon Riggs
> Sent: Friday, October 27, 2006 6:58 PM
> To: pgsql-jdbc@postgresql.org
> Subject: [JDBC] jdbc driver performance TODO
In my opinion, improvements could be made in the encoding/decoding of the protocol messages (i.e. the PGStream class).
From what I have seen with a profiler, some methods are invoked far more often than necessary, and the numeric encoding is not as efficient as it could be.
That numeric encoding/decoding is used very often, for the protocol itself as well as for data encoding.
Furthermore, it would be interesting not to use the Java BufferedStream implementation, but a driver-specific one, so as to manage and access the buffer directly.
From the driver's point of view, this is the core part to optimize, since everything goes through it (the protocol encoding).
I use a driver I have optimized, and I have submitted these observations to the list a couple of times, but I'm sorry to say I won't be able to contribute more for a long time.
Nevertheless, anyone who looks into these parts of the code could make those optimizations again, as they are quite simple to achieve.
Regards.
Bucky Jordan wrote:
> My first vote would go to adding support for COPY. (Where would I find a design doc for this if one exists? I'm sort of curious how this will work.)
> My other is more of a question relating to parameterized statements and prepared statements, so I'm sending it in another post since I'm not sure if it qualifies as a TO-DO request.
> Thanks, Bucky
On Sat, 28 Oct 2006, Julien Patrouix wrote:

> In my opinion, improvements could be made in the encoding/decoding of the
> protocol messages (i.e. the PGStream class).
> From what I have seen thanks to a profiler some methods are over invoked, and
> the numeric encoding is not as efficient as it could be.
> That numeric encoding/decoding is used very often, even for the protocol
> purposes, not only for data encoding.
>
> Furthermore, it would be interesting not to use the java BufferedStream
> implementation, and use a specific one so as to directly manage and access
> the buffer.

This is what Mikko Tiihonen has done here, and I intend to apply it next week.

http://archives.postgresql.org/pgsql-jdbc/2006-09/msg00163.php

> I use a driver I have optimized, and submitted these observations a couple of
> time to the list, but I'm really sorry I can't contribute more for a long
> time I'm afraid

I've never seen a patch come across this list. Could you retry or point me to an archives entry?

Kris Jurka
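For readers following the thread, the buffer idea discussed above can be sketched roughly as follows. This is only an illustration and is not taken from Mikko Tiihonen's patch or from the driver: the class name DirectBufferInput and the methods readInt4 and readInt2 are hypothetical. It assumes big-endian (network order) integers, and it wraps I/O errors in UncheckedIOException purely to keep the sketch short. The point is that the integer is assembled straight from the buffer array with one bounds check per value, instead of one stream call per byte.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical sketch: an input buffer the decoder can index directly. */
final class DirectBufferInput {
    private final InputStream in;
    private final byte[] buf;
    private int pos;   // index of the next unread byte
    private int limit; // index one past the last valid byte

    DirectBufferInput(InputStream in, int size) {
        this.in = in;
        this.buf = new byte[size];
    }

    /** Makes sure at least n unread bytes are buffered, refilling if needed. */
    private void ensure(int n) {
        if (n > buf.length) {
            throw new IllegalArgumentException("request exceeds buffer size");
        }
        if (limit - pos >= n) {
            return;
        }
        // Move the unread tail to the front, then read until n bytes are present.
        System.arraycopy(buf, pos, buf, 0, limit - pos);
        limit -= pos;
        pos = 0;
        try {
            while (limit < n) {
                int r = in.read(buf, limit, buf.length - limit);
                if (r < 0) {
                    throw new EOFException("stream ended inside a value");
                }
                limit += r;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decodes a 4-byte big-endian integer directly from the buffer. */
    int readInt4() {
        ensure(4);
        int v = (buf[pos] & 0xFF) << 24
              | (buf[pos + 1] & 0xFF) << 16
              | (buf[pos + 2] & 0xFF) << 8
              | (buf[pos + 3] & 0xFF);
        pos += 4;
        return v;
    }

    /** Decodes a 2-byte big-endian integer, returned as 0..65535 in this sketch. */
    int readInt2() {
        ensure(2);
        int v = (buf[pos] & 0xFF) << 8 | (buf[pos + 1] & 0xFF);
        pos += 2;
        return v;
    }
}
```

Whether this is actually faster than a BufferedInputStream in the driver is something only a profiler run can confirm.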
Simon,

Here's an observation about the JDBC driver, but I'm not sure if it's practical to implement. After preparing a statement, the driver still sends out a describe message, either via the sendDescribeStatement() or (most likely) the sendDescribePortal() method calls in org.postgresql.core.v3.QueryExecutorImpl. However, as the statement has been prepared, it's unlikely to change, and so the results of sendDescribeXXX() could be requested once and then cached with the prepared statement.

Of course, if any tables referenced by the query were changed, the prepared statement would be caching the original structure. Although, I'm not sure how much of a problem this would be, as changing a table's data types etc. might cause code to break anyway.

I don't see a direct way to turn off the metadata, perhaps I'm missing something?

David

-----Original Message-----
From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Simon Riggs
Sent: Friday, October 27, 2006 3:58 PM
To: pgsql-jdbc@postgresql.org
Subject: [JDBC] jdbc driver performance TODO
On Tue, 31 Oct 2006, Strong, David wrote:

> Here's an observation about the JDBC driver, but I'm not sure if it's
> practical to implement. After preparing a statement, the driver still
> sends out a describe message either via the sendDescribeStatement () or
> (most likely) the sendDescribePortal () method calls in
> org.postgresql.core.v3.QueryExecutorImpl. However, as the statement has
> been prepared, it's unlikely to change and so the results of the
> sendDescribeXXX () could be requested once and then cached with the
> prepared statement.

I'm not sure how much benefit this will actually produce, but I agree there is duplication of work here.

> Of course, if any tables referenced by the query were changed, the
> prepared statement would be caching the original structure. Although,
> I'm not sure how much of a problem this would be as changing a table's
> data types etc. might cause code to break anyway.

Right now it's not a big deal for the driver because plans don't change, but for 8.3 there are plans to do prepared query invalidation when underlying tables change. At that point we'd need to detect and refetch metadata. I'm not sure how a client would detect this change.

> I don't see a direct way to turn off the metadata, perhaps I'm missing
> something?

Passing the QueryExecutor.QUERY_NO_METADATA flag to the QueryExecutor will prevent the describe message from being sent.

Kris Jurka
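The describe-once idea in this exchange could look something like the sketch below. It is an assumption-laden illustration, not driver code: DescribeCache, fieldsFor and invalidate are hypothetical names, the Function parameter merely stands in for a real Describe round trip to the server, and column metadata is reduced to an array of column names. The invalidate method marks the spot where the refetch mentioned for 8.3 would have to hook in, without saying how the client would learn of the change.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: fetch result metadata once per prepared statement. */
final class DescribeCache {
    // Cached column names, keyed by server-side statement name.
    private final Map<String, String[]> fieldsByStatement = new HashMap<>();
    // Stand-in for a Describe round trip; not a real driver call.
    private final Function<String, String[]> describe;
    private int describeCalls;

    DescribeCache(Function<String, String[]> describe) {
        this.describe = describe;
    }

    /** Returns cached metadata, issuing the describe only on a cache miss. */
    String[] fieldsFor(String statementName) {
        String[] fields = fieldsByStatement.get(statementName);
        if (fields == null) {
            describeCalls++;
            fields = describe.apply(statementName);
            fieldsByStatement.put(statementName, fields);
        }
        return fields;
    }

    /** Drops cached metadata so the next execution describes again. */
    void invalidate(String statementName) {
        fieldsByStatement.remove(statementName);
    }

    /** Number of simulated describe round trips so far. */
    int describeCalls() {
        return describeCalls;
    }
}
```

With this shape, repeated executions of the same statement cost one describe in total, and each invalidation costs one more.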
Kris Jurka <books@ejurka.com> writes:

> Right now it's not a big deal for the driver because plans don't change,
> but for 8.3 there are plans to do prepared query invalidation when
> underlying tables change. At that point we'd need to detect and refetch
> metadata. I'm not sure how a client would detect this change.

We haven't really talked about the semantics of this stuff, but I'm inclined to think that a prepared statement ought to go into some kind of "broken" status where it couldn't be invoked, if a change occurs that would force a change in the output column set. Otherwise you could have situations where a client does Describe Statement followed (almost) immediately by Execute and gets inconsistent results.

I think we really want the auto-replan facility to handle things like addition of a new index or availability of new ANALYZE stats --- having it automatically propagate things like an ALTER COLUMN TYPE seems a good bit more questionable.

regards, tom lane
On Tue, 2006-10-31 at 14:32 -0500, Kris Jurka wrote:

> On Tue, 31 Oct 2006, Strong, David wrote:
>
> > Here's an observation about the JDBC driver, but I'm not sure if it's
> > practical to implement. After preparing a statement, the driver still
> > sends out a describe message either via the sendDescribeStatement () or
> > (most likely) the sendDescribePortal () method calls in
> > org.postgresql.core.v3.QueryExecutorImpl. However, as the statement has
> > been prepared, it's unlikely to change and so the results of the
> > sendDescribeXXX () could be requested once and then cached with the
> > prepared statement.
>
> I'm not sure how much benefit this will actually produce, but I agree
> there is duplication of work here.

So we can optimise it, but only for 8.2 and below until we work out the next steps.

> > Of course, if any tables referenced by the query were changed, the
> > prepared statement would be caching the original structure. Although,
> > I'm not sure how much of a problem this would be as changing a table's
> > data types etc. might cause code to break anyway.
>
> Right now it's not a big deal for the driver because plans don't change,
> but for 8.3 there are plans to do prepared query invalidation when
> underlying tables change. At that point we'd need to detect and refetch
> metadata. I'm not sure how a client would detect this change.

Plus we have the difference between changed plans that change metadata and changed plans that do not change metadata. The latter might well be considered to be more common, though both are fairly rare in the context of the prepared statement.

So we probably can do something with this, but plans haven't really been thought through yet. So, this is a good reminder of something to include in the mix when designs start to be proposed.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
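To make the distinction drawn in the last two messages concrete, here is a small model of it. Nothing here reflects an agreed design or an existing server or driver feature: PreparedHandle, onReplan and the VALID/BROKEN states are hypothetical, and output metadata is simplified to a list of column names. A replan that keeps the output columns (a new index, fresh ANALYZE stats) leaves the handle usable, while a replan that changes them puts it into a "broken" status where it can no longer be invoked.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of a prepared statement that can become "broken". */
final class PreparedHandle {
    enum State { VALID, BROKEN }

    private final List<String> outputColumns;
    private State state = State.VALID;
    private int replans;

    PreparedHandle(List<String> outputColumns) {
        this.outputColumns = Collections.unmodifiableList(new ArrayList<>(outputColumns));
    }

    /** Records a replan; only a changed output column set breaks the handle. */
    void onReplan(List<String> newOutputColumns) {
        replans++;
        if (!outputColumns.equals(newOutputColumns)) {
            state = State.BROKEN;
        }
    }

    /** Refuses to run once the output column set has changed. */
    void execute() {
        if (state == State.BROKEN) {
            throw new IllegalStateException("output columns changed; statement must be re-prepared");
        }
        // A real driver would send the execute request here; omitted in this sketch.
    }

    State state() {
        return state;
    }

    int replans() {
        return replans;
    }
}
```

Under this model, cached describe results stay correct for as long as the handle is VALID, which covers the more common case of plans that change without changing metadata.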