Обсуждение: Upgrading JDBC driver from 7.3 to 8.4 affected execution plans

Поиск
Список
Период
Сортировка

Upgrading JDBC driver from 7.3 to 8.4 affected execution plans

От
Boris
Дата:
Hi all, I am trying to understand an interesting side effect of our
recent JDBC driver upgrade (7.3 -> 8.4.701). Here is what I am seeing:

We use prepared statements to construct a multi inner/outer join (~ 15
joins, all on primary keys) query which brings back 0 or 1 raw.
With old driver, a query would take single digit milliseconds on
average when executed from java code. With new driver, I noticed that
the load on db server skyrocketed and all those queries take seconds
to execute. The reason is that the optimizer chooses merge joins
instead of nested loops. When I turn merge/hash joins off in
postgresql.conf, all went back to normal, queries are now executed
very fast.
 Another interesting point is that, when I paste the slow query into
psql, and explain/analyze it,  bad plan is chosen (like observed from
the code with new driver).

Any ideas? I'd be happy to provide more details if needed. Is the
issue somehow caused by the fact that 8.4 driver uses real server side
prepared statements vs the old one that didn't?


Cheers,
Boris

Re: Upgrading JDBC driver from 7.3 to 8.4 affected execution plans

От
Oliver Jowett
Дата:
Boris wrote:

>  Another interesting point is that, when I paste the slow query into
> psql, and explain/analyze it,  bad plan is chosen (like observed from
> the code with new driver).
>
> Any ideas? I'd be happy to provide more details if needed. Is the
> issue somehow caused by the fact that 8.4 driver uses real server side
> prepared statements vs the old one that didn't?

If you get the same problem via psql, I doubt it's related to your
driver changes. What changed on the server side?

-O

Re: Upgrading JDBC driver from 7.3 to 8.4 affected execution plans

От
Maciek Sakrejda
Дата:
Could this be related to how planning works with respect to
parameterized queries [1]? If memory serves, a prepared statement is
typically planned once and that plan is used each time the statement
is executed. There can be some issues around this because when you
have a query like "SELECT * FROM foo WHERE bar > ?", the planner has
no idea of the selectivity of that predicate if it has to plan the
query before it knows what the parameters will be. E.g., if it picks
an index scan and then a query is submitted where ? is 10, if 95% of
bars are greater than 10, the executor will waste a lot of time
mucking around with the index where it should just be doing a seq
scan.

I think with prepared statements, the driver typically executes them
in an anonymous portal (i.e., more or less, a prepared statement
context) which means the planner waits until parameters are provided.
However, once prepareThreshold is reached, it uses a named portal,
which means a single parameter-agnostic plan (the benefit is that you
don't have to replan per-query, and planning can be moderately
expensive). The page linked below mentions that things have changed in
the driver in this area. Like Oliver mentioned, I'd take a look at the
server first if you've upgraded that as well, but otherwise, this
could be the right path.

Note that the above only means that there could be differences in how
the 7.3 and 8.4 drivers are causing your query to be planned. If the
old driver you're using is using named portals for statement execution
right off the bat, but that's somehow coming up with better plans
(even though it has less information), something could be wonky with
your planner-related GUCS.

[1]: http://jdbc.postgresql.org/documentation/84/server-prepare.html
---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com

Re: Upgrading JDBC driver from 7.3 to 8.4 affected execution plans

От
Craig Ringer
Дата:
On 04/10/10 11:35, Maciek Sakrejda wrote:
> Could this be related to how planning works with respect to
> parameterized queries [1]? If memory serves, a prepared statement is
> typically planned once and that plan is used each time the statement
> is executed. There can be some issues around this because when you
> have a query like "SELECT * FROM foo WHERE bar > ?", the planner has
> no idea of the selectivity of that predicate if it has to plan the
> query before it knows what the parameters will be. E.g., if it picks
> an index scan and then a query is submitted where ? is 10, if 95% of
> bars are greater than 10, the executor will waste a lot of time
> mucking around with the index where it should just be doing a seq
> scan.

IMO Pg would be well served by supporting something like "EXECUTE WITH
REPLAN" or a PREPARE that doesn't cache the plan, only storing the
parsed statement and handling parameter substitution.

This is in the TODO as "Allow finer control over the caching of prepared
query plans" http://wiki.postgresql.org/wiki/Todo . Perhaps this is
something a few commercial users of Pg might consider getting together
and sponsoring work on?

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

Re: Upgrading JDBC driver from 7.3 to 8.4 affected execution plans

От
Maciek Sakrejda
Дата:
> Not sure what I can do about this behavior though other than switching
> off hash/merge joins, which we did. I looked through the source, and
> preparedThreshold is set to 5 by default in 8.4.

I'd try mucking around with prepareThreshold. Chances are, you'll get
different plans once you hit the threshold, so if you're getting a bad
plan right off the bat, I'd set the threshold to 0 (or maybe 1; I'm
not sure if 0 acts as a "never prepare" value). If you're getting a
bad plan only once you hit the threshold, jack up the threshold to
Integer.MAX_VALUE or something like that.

Based on the page you linked in the subsequent e-mail, the issue may
be different than I originally thought (it looks like the old driver
didn't use server-side prepared statements at all), but I'm almost
certain there's a better way to address this than turning off
hashjoins and mergejoins. To help debug, you can turn up debugging
logging when you kick off the server (something like pg_ctl start ...
-o "-d 2", or via postgresql.conf), and see exactly which statements
are being issued by the driver, then explain them (note that you *can*
explain an EXECUTE). Once you've got that isolated, you may be able to
get additional help on the -performance list.

---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com

Re: Upgrading JDBC driver from 7.3 to 8.4 affected execution plans

От
Boris Partensky
Дата:
Oh, this issue should probably be mentioned in
http://jdbc.postgresql.org/documentation/faq.html#upgradeTo80
document. It's pretty subtle and can bite someone hard.

Boris

On Mon, Oct 4, 2010 at 9:38 AM, Boris Partensky
<boris.partensky@gmail.com> wrote:
> Thanks Maciek, you probably nailed it. Sounds like the problem is
> there somewhere :).
>
>> Like Oliver mentioned, I'd take a look at the
>> server first if you've upgraded that as well, but otherwise, this
>> could be the right path.
>
> Nothing changed server side. Again, I can reproduce the "faulty"
> behavior at will by switching the driver jars (after commenting out
> set enable_hashjoin = false and set enable_mergejoin = false from the
> server conf).
>
> Not sure what I can do about this behavior though other than switching
> off hash/merge joins, which we did. I looked through the source, and
> preparedThreshold is set to 5 by default in 8.4.
>
>
>
> On Sun, Oct 3, 2010 at 11:35 PM, Maciek Sakrejda <msakrejda@truviso.com> wrote:
>> Could this be related to how planning works with respect to
>> parameterized queries [1]? If memory serves, a prepared statement is
>> typically planned once and that plan is used each time the statement
>> is executed. There can be some issues around this because when you
>> have a query like "SELECT * FROM foo WHERE bar > ?", the planner has
>> no idea of the selectivity of that predicate if it has to plan the
>> query before it knows what the parameters will be. E.g., if it picks
>> an index scan and then a query is submitted where ? is 10, if 95% of
>> bars are greater than 10, the executor will waste a lot of time
>> mucking around with the index where it should just be doing a seq
>> scan.
>>
>> I think with prepared statements, the driver typically executes them
>> in an anonymous portal (i.e., more or less, a prepared statement
>> context) which means the planner waits until parameters are provided.
>> However, once prepareThreshold is reached, it uses a named portal,
>> which means a single parameter-agnostic plan (the benefit is that you
>> don't have to replan per-query, and planning can be moderately
>> expensive). The page linked below mentions that things have changed in
>> the driver in this area. Like Oliver mentioned, I'd take a look at the
>> server first if you've upgraded that as well, but otherwise, this
>> could be the right path.
>>
>> Note that the above only means that there could be differences in how
>> the 7.3 and 8.4 drivers are causing your query to be planned. If the
>> old driver you're using is using named portals for statement execution
>> right off the bat, but that's somehow coming up with better plans
>> (even though it has less information), something could be wonky with
>> your planner-related GUCS.
>>
>> [1]: http://jdbc.postgresql.org/documentation/84/server-prepare.html
>> ---
>> Maciek Sakrejda | System Architect | Truviso
>>
>> 1065 E. Hillsdale Blvd., Suite 215
>> Foster City, CA 94404
>> (650) 242-3500 Main
>> www.truviso.com
>>
>

Re: Upgrading JDBC driver from 7.3 to 8.4 affected execution plans

От
Boris Partensky
Дата:
Thanks Maciek, you probably nailed it. Sounds like the problem is
there somewhere :).

> Like Oliver mentioned, I'd take a look at the
> server first if you've upgraded that as well, but otherwise, this
> could be the right path.

Nothing changed server side. Again, I can reproduce the "faulty"
behavior at will by switching the driver jars (after commenting out
set enable_hashjoin = false and set enable_mergejoin = false from the
server conf).

Not sure what I can do about this behavior though other than switching
off hash/merge joins, which we did. I looked through the source, and
preparedThreshold is set to 5 by default in 8.4.



On Sun, Oct 3, 2010 at 11:35 PM, Maciek Sakrejda <msakrejda@truviso.com> wrote:
> Could this be related to how planning works with respect to
> parameterized queries [1]? If memory serves, a prepared statement is
> typically planned once and that plan is used each time the statement
> is executed. There can be some issues around this because when you
> have a query like "SELECT * FROM foo WHERE bar > ?", the planner has
> no idea of the selectivity of that predicate if it has to plan the
> query before it knows what the parameters will be. E.g., if it picks
> an index scan and then a query is submitted where ? is 10, if 95% of
> bars are greater than 10, the executor will waste a lot of time
> mucking around with the index where it should just be doing a seq
> scan.
>
> I think with prepared statements, the driver typically executes them
> in an anonymous portal (i.e., more or less, a prepared statement
> context) which means the planner waits until parameters are provided.
> However, once prepareThreshold is reached, it uses a named portal,
> which means a single parameter-agnostic plan (the benefit is that you
> don't have to replan per-query, and planning can be moderately
> expensive). The page linked below mentions that things have changed in
> the driver in this area. Like Oliver mentioned, I'd take a look at the
> server first if you've upgraded that as well, but otherwise, this
> could be the right path.
>
> Note that the above only means that there could be differences in how
> the 7.3 and 8.4 drivers are causing your query to be planned. If the
> old driver you're using is using named portals for statement execution
> right off the bat, but that's somehow coming up with better plans
> (even though it has less information), something could be wonky with
> your planner-related GUCS.
>
> [1]: http://jdbc.postgresql.org/documentation/84/server-prepare.html
> ---
> Maciek Sakrejda | System Architect | Truviso
>
> 1065 E. Hillsdale Blvd., Suite 215
> Foster City, CA 94404
> (650) 242-3500 Main
> www.truviso.com
>

Re: Upgrading JDBC driver from 7.3 to 8.4 affected execution plans

От
Boris Partensky
Дата:
Thanks, this was very helpful. I will experiment with those settings.

On Mon, Oct 4, 2010 at 12:13 PM, Maciek Sakrejda <msakrejda@truviso.com> wrote:
>> Not sure what I can do about this behavior though other than switching
>> off hash/merge joins, which we did. I looked through the source, and
>> preparedThreshold is set to 5 by default in 8.4.
>
> I'd try mucking around with prepareThreshold. Chances are, you'll get
> different plans once you hit the threshold, so if you're getting a bad
> plan right off the bat, I'd set the threshold to 0 (or maybe 1; I'm
> not sure if 0 acts as a "never prepare" value). If you're getting a
> bad plan only once you hit the threshold, jack up the threshold to
> Integer.MAX_VALUE or something like that.
>
> Based on the page you linked in the subsequent e-mail, the issue may
> be different than I originally thought (it looks like the old driver
> didn't use server-side prepared statements at all), but I'm almost
> certain there's a better way to address this than turning off
> hashjoins and mergejoins. To help debug, you can turn up debugging
> logging when you kick off the server (something like pg_ctl start ...
> -o "-d 2", or via postgresql.conf), and see exactly which statements
> are being issued by the driver, then explain them (note that you *can*
> explain an EXECUTE). Once you've got that isolated, you may be able to
> get additional help on the -performance list.
>
> ---
> Maciek Sakrejda | System Architect | Truviso
>
> 1065 E. Hillsdale Blvd., Suite 215
> Foster City, CA 94404
> (650) 242-3500 Main
> www.truviso.com
>

Re: Upgrading JDBC driver from 7.3 to 8.4 affected execution plans

От
Oliver Jowett
Дата:
Boris Partensky wrote:
> Oh, this issue should probably be mentioned in
> http://jdbc.postgresql.org/documentation/faq.html#upgradeTo80
> document. It's pretty subtle and can bite someone hard.

The thing that confuses me is that you said you also see the problem via
psql - but psql isn't using parameterized queries at all (psql should be
very similar to how the 7.3 driver executes queries, as in the 7.3 era
the driver would do parameter substitution before sending a single query
string to the server)

-O

Re: Upgrading JDBC driver from 7.3 to 8.4 affected execution plans

От
Boris Partensky
Дата:
Oliver, yes. I agree with you, I am still trying to understand that.

On Mon, Oct 4, 2010 at 6:09 PM, Oliver Jowett <oliver@opencloud.com> wrote:
> Boris Partensky wrote:
>> Oh, this issue should probably be mentioned in
>> http://jdbc.postgresql.org/documentation/faq.html#upgradeTo80
>> document. It's pretty subtle and can bite someone hard.
>
> The thing that confuses me is that you said you also see the problem via
> psql - but psql isn't using parameterized queries at all (psql should be
> very similar to how the 7.3 driver executes queries, as in the 7.3 era
> the driver would do parameter substitution before sending a single query
> string to the server)
>
> -O
>