Обсуждение: avoid prepared statements on complex queries?

Поиск
Список
Период
Сортировка

avoid prepared statements on complex queries?

От
Anish Kejariwal
Дата:
I'm running into an issue where a complex query is performing significantly slower if I use a prepared statement.

I'm using:
-postgres 9.0.3
-postgresql-9.0-801.jdbc4.jar  JDBC driver

Some notes about the query:
-I'm calling a postgres function, via: select * from foo(?,?,?,?,?,?,?)
-I'm getting back 4 million results
-the underlying query is quite complex joining several tables, and the core table is partitioned over a couple hundred tables.

Here's the java code:

conn.setAutoCommit(false);
pstmt = conn.prepareStatement("select * from foo(?,?,?,?,?,?,?)");
pstmt.setFetchSize(1000);

//and, then I do the usual thing to set parameters in the prepared statement
stmt.setInt(1, SOME_NUMBER);
pstmt.setArray(2, genepoolConn.createArrayOf("integer", ARRAY_OF_INTEGERS));
//and so on

What I have found:
-if ARRAY_OF_INTEGERS has a length of 3, the query performs fast
-if ARRAY_OF_INTEGERS has a length of 150, then query takes 1200 seconds
-if ARRAY_OF_INTEGERS has a length of 150, and I don't use a prepared statement, the query takes the expected 30 seconds

So, obviously when using a prepared statement, postgres is coming up with the wrong execution plan when the the parameter list is unexpectedly large.  Pretty understandable.

My question: is there a work around to this?  Can I force it not set the execution plan until I bind the variables?

The only reasons I'm using a prepared statement:
-parameterized queries are far easier to work with than building my query via string concatenation
-minimize chances of SQL Injection

If only there was a way to have parameterized queries without using prepared statements....

Thanks!






Re: avoid prepared statements on complex queries?

От
Maciek Sakrejda
Дата:
> If only there was a way to have parameterized queries without using prepared
> statements....

Take a look at the prepareThreshold connection string parameter.
---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com

Re: avoid prepared statements on complex queries?

От
Anish Kejariwal
Дата:
Hi Maciek,

I tried playing around with the prepareThreshold parameter, by setting it in the JDBC connection string.  I tried setting it to zero, which should cause it to not use a prepared statement, right?  Query is still slow.  I also tried setting it to 1 and 3, and in all cases it's slow.  Is there any way to verify if postgres is using a planned query?

I found this old post:
http://archives.postgresql.org/pgsql-jdbc/2008-02/msg00077.php
"However even when plans are not cached, you will still run into the issue that an unnamed statement using out-of-line parameter values may generate a less efficient plan than an unnamed statement using inline parameter values, because there is no way to tell the planner at the protocol level "I am really only ever using this query once, please give me a specific plan for these values and don't worry about generating a plan that is correct for other values too"."

Maybe that's what's happening?

I tried a different tactic: setting the protocolVersion=2.  I can see that forces it to not use a prepared statement, but it seems to ignore that I set the fetch size to 1000 for the prepared statement.  I need to be able to set the fetch size, otherwise I run out of memory since I'm getting back 4 million results.

thanks for your help.

Anish



From: Maciek Sakrejda <msakrejda@truviso.com>
To: Anish Kejariwal <anishkej@yahoo.com>
Cc: "pgsql-jdbc@postgresql.org" <pgsql-jdbc@postgresql.org>
Sent: Tuesday, November 15, 2011 5:31 PM
Subject: Re: [JDBC] avoid prepared statements on complex queries?

> If only there was a way to have parameterized queries without using prepared
> statements....

Take a look at the prepareThreshold connection string parameter.
---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com


Re: avoid prepared statements on complex queries?

От
Maciek Sakrejda
Дата:
>I tried playing around with the prepareThreshold parameter, by setting it in the JDBC connection string.  I tried
settingit to zero, which should cause it to not use a prepared statement, right?  Query is still >slow.  I also tried
settingit to 1 and 3, and in all cases it's slow.  Is there any way to verify if postgres is using a planned query? 

I think 0 should do it, yes (technically, it's using an unnamed
prepared statement rather than not using one at all, but this is the
behavior you're looking for). To verify, you should be able to change
log_min_messages (to debug2?) and see all plan/execute steps in the
server logs. Alternately, if you're not using SSL, you could just fire
up Wireshark and see what's on the wire (it has a built-in PostgreSQL
protocol plugin, so this is pretty straightforward). In either case,
the unnamed prepared statements should have no name (in Prepare, Bind,
and Execute messages); if memory serves, the named ones are something
like S_1, S_2, etc..

> "However even when plans are not cached, you will still run into the issue
> that an unnamed statement using out-of-line parameter values may generate a
> less efficient plan than an unnamed statement using inline parameter values,
> because there is no way to tell the planner at the protocol level "I am
> really only ever using this query once, please give me a specific plan for
> these values and don't worry about generating a plan that is correct for
> other values too"."
> Maybe that's what's happening?

Maybe, although I've never seen that happen and as I understand, the
driver always sends parameters along with the statement when using
unnamed statements, so I'm not sure what would trigger this. Perhaps
Oliver will chime in (he's still active on the list).

You may also want to try an EXPLAIN ANALYZE on your query in both
protocol versions and see what the plan differences are.

---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com

Re: avoid prepared statements on complex queries?

От
Kris Jurka
Дата:

On Wed, 16 Nov 2011, Maciek Sakrejda wrote:

> Maybe, although I've never seen that happen and as I understand, the
> driver always sends parameters along with the statement when using
> unnamed statements, so I'm not sure what would trigger this.

The standard example is "WHERE col LIKE ?".  If the parameter is something
like 'abcdef%' then the query can potentially be converted to a range
scan, but only if knows the parameter value.  When preparing this with an
unnamed statement the server has the parameter value, but cannot assume
that the same unnamed statement won't be used with a different parameter
value later and must come up with a plan that is correct for all possible
parameters even if it's not ideal for the current one.

Kris Jurka

Re: avoid prepared statements on complex queries?

От
Kris Jurka
Дата:

On Tue, 15 Nov 2011, Anish Kejariwal wrote:

> I'm running into an issue where a complex query is performing significantly
> slower if I use a prepared statement.
>
> Some notes about the query:
> -I'm calling a postgres function, via: select * from foo(?,?,?,?,?,?,?)
> -I'm getting back 4 million results
> -the underlying query is quite complex joining several tables, and the core
> table is partitioned over a couple hundred tables.
>
> Here's the java code:
>
> conn.setAutoCommit(false);
> pstmt = conn.prepareStatement("select * from foo(?,?,?,?,?,?,?)");
> pstmt.setFetchSize(1000);
>
> //and, then I do the usual thing to set parameters in the prepared statement
> stmt.setInt(1, SOME_NUMBER);
> pstmt.setArray(2, genepoolConn.createArrayOf("integer", ARRAY_OF_INTEGERS));
> //and so on
>
> So, obviously when using a prepared statement, postgres is coming up with
> the wrong execution plan when the the parameter list is unexpectedly large.

You haven't really shown that.  Unless you issue the same query, manually
interpolating in the query parameters and get a much faster time, all
you've shown is that things get slower with a larger number of values.

What language is your function written in?  Unless it's a sql function
which could be inlined, I wouldn't have expected any difference between
prepared vs inline query execution.

Kris Jurka

Re: avoid prepared statements on complex queries?

От
Anish Kejariwal
Дата:
Hi Kris,

Sorry, if I wasn't clear.  I'm doing everything in JDBC.

If I use a prepared statement, and set my main parameter to be an array of 150 integers, the query takes for ever (1200 seconds).  If I instead skip using a prepared statement by doing the following, the query takes 30 seconds:
Statement stmt = conn.createStatement();
stmt.setFetchSize(1000);
ResultSet rs = stmt.executeQuery(QUERY_STRING);

where QUERY_STRING is the entire query with the 150 integers.

In both cases, I'm calling a SQL function:
create or replace function foo(int,int[],int[],int, boolean, boolean, double precision)
.
.
where
        dataset_id = any($2)
.
.
$$ language 'sql' immutable;

the second parameter is the one where the array will verify anywhere from 1 to 500 integers.

I'm going to try Maciek's suggestions (server logging, and wire shark),  to see if I can verify that when I set ?prepareThreshold=0 in the JDBC string, it is in fact forcing it to be an unnamed prepared statement.

thanks,
Anish



From: Kris Jurka <books@ejurka.com>
To: Anish Kejariwal <anishkej@yahoo.com>
Cc: "pgsql-jdbc@postgresql.org" <pgsql-jdbc@postgresql.org>
Sent: Wednesday, November 16, 2011 1:20 PM
Subject: Re: [JDBC] avoid prepared statements on complex queries?



On Tue, 15 Nov 2011, Anish Kejariwal wrote:

> I'm running into an issue where a complex query is performing significantly
> slower if I use a prepared statement.
>
> Some notes about the query:
> -I'm calling a postgres function, via: select * from foo(?,?,?,?,?,?,?)
> -I'm getting back 4 million results
> -the underlying query is quite complex joining several tables, and the core
> table is partitioned over a couple hundred tables.
>
> Here's the java code:
>
> conn.setAutoCommit(false);
> pstmt = conn.prepareStatement("select * from foo(?,?,?,?,?,?,?)");
> pstmt.setFetchSize(1000);
>
> //and, then I do the usual thing to set parameters in the prepared statement
> stmt.setInt(1, SOME_NUMBER);
> pstmt.setArray(2, genepoolConn.createArrayOf("integer", ARRAY_OF_INTEGERS));
> //and so on
>
> So, obviously when using a prepared statement, postgres is coming up with
> the wrong execution plan when the the parameter list is unexpectedly large.

You haven't really shown that.  Unless you issue the same query, manually
interpolating in the query parameters and get a much faster time, all
you've shown is that things get slower with a larger number of values.

What language is your function written in?  Unless it's a sql function
which could be inlined, I wouldn't have expected any difference between
prepared vs inline query execution.

Kris Jurka

Re: avoid prepared statements on complex queries?

От
Tom Lane
Дата:
Kris Jurka <books@ejurka.com> writes:
> On Wed, 16 Nov 2011, Maciek Sakrejda wrote:
>> Maybe, although I've never seen that happen and as I understand, the
>> driver always sends parameters along with the statement when using
>> unnamed statements, so I'm not sure what would trigger this.

> The standard example is "WHERE col LIKE ?".  If the parameter is something
> like 'abcdef%' then the query can potentially be converted to a range
> scan, but only if knows the parameter value.  When preparing this with an
> unnamed statement the server has the parameter value, but cannot assume
> that the same unnamed statement won't be used with a different parameter
> value later and must come up with a plan that is correct for all possible
> parameters even if it's not ideal for the current one.

BTW, this should be all better in 9.2 ... so if anyone is thinking of
expending lots of effort to fix it on the JDBC side, don't bother.

            regards, tom lane

Re: avoid prepared statements on complex queries?

От
Anish Kejariwal
Дата:
Thanks Tom.  That's great to know this will be addressed in 9.2.

Btw - I turned logging on and verified that by setting by setting prepareThreshold the prepared statement became an unnamed statement.  This did not fix the performance issue I've been seeing.  I'm going to have to resort to constructing the query myself by building the query string....

thanks for everyones help.  

Anish



From: Tom Lane <tgl@sss.pgh.pa.us>
To: Kris Jurka <books@ejurka.com>
Cc: Maciek Sakrejda <msakrejda@truviso.com>; Anish Kejariwal <anishkej@yahoo.com>; "pgsql-jdbc@postgresql.org" <pgsql-jdbc@postgresql.org>
Sent: Wednesday, November 16, 2011 4:35 PM
Subject: Re: [JDBC] avoid prepared statements on complex queries?

Kris Jurka <books@ejurka.com> writes:
> On Wed, 16 Nov 2011, Maciek Sakrejda wrote:
>> Maybe, although I've never seen that happen and as I understand, the
>> driver always sends parameters along with the statement when using
>> unnamed statements, so I'm not sure what would trigger this.

> The standard example is "WHERE col LIKE ?".  If the parameter is something
> like 'abcdef%' then the query can potentially be converted to a range
> scan, but only if knows the parameter value.  When preparing this with an
> unnamed statement the server has the parameter value, but cannot assume
> that the same unnamed statement won't be used with a different parameter
> value later and must come up with a plan that is correct for all possible
> parameters even if it's not ideal for the current one.

BTW, this should be all better in 9.2 ... so if anyone is thinking of
expending lots of effort to fix it on the JDBC side, don't bother.

            regards, tom lane