Thread: Prepared statements and suboptimal plans


Prepared statements and suboptimal plans

From: Royce Ausburn
Hi all,

It looks like I've been hit with this well known issue.  I have a complicated query that is intended to run every few minutes, I'm using JDBC's Connection.prepareStatement() mostly for nice parameterisation, but postgres produces a suboptimal plan due to its lack of information when the statement is prepared.

I've been following the mailing list for a few years and I've seen this topic come up a bit.  I've just done a quick google and I'm not quite sure how to fix this short of manually substituting my query parameters into a query string -- avoiding prepared statements… An alternative might be to rewrite the query and hope that the planner's general plan is a bit closer to optimal… but are these my only options?

I notice that the non-prepared-statement plan (both plans are below my sig) estimates 5500 rows output.  I think that's out by a factor of up to 100, suggesting that I might want to increase my statistics and re-analyse… but as I understand the prepared-statement problem, this probably won't help here.  Correct?
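
(If I do go down the statistics route, it'd just be something like this from our JDBC maintenance code -- the column and the target of 1000 are guesses on my part:)

import java.sql.Connection;
import java.sql.Statement;

public class BumpStats {
    /** Raise the per-column statistics target and re-analyse the table. */
    static void bumpBiidStats(Connection conn) throws Exception {
        Statement st = conn.createStatement();
        // 1000 is an arbitrary guess at a larger statistics target
        st.execute("ALTER TABLE eventBinding ALTER COLUMN biid SET STATISTICS 1000");
        st.execute("ANALYZE eventBinding");
        st.close();
    }
}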

We've been worst hit by this query on an 8.3 site.  Another site is running 8.4.  Have there been improvements in this area recently?  Upgrading to 9.0 might be viable for us.

Any tips would be appreciated,

--Royce


test=# PREPARE test (integer) as 
  select 
    sid, 
    role, 
    starttime::date, 
    nasid, importer, 
    max(eventbinding.biid) as biid, 
    sum(bytesin) as bytesin, 
    sum(bytesout) as bytesout,  
    sum(seconds) as seconds, 
    sum(coalesce(pages, 0)) as pages, 
    sum(coalesce(count, 0)) as count,  
    sum(coalesce(rate, 0.0)) as rate, 
    sum(coalesce(bytesSentRate, 0.0)) as bytesSentRate,  
    sum(coalesce(bytesReceivedRate, 0.0)) as bytesReceivedRate, 
    count(*) as entries  
  from billingItem, eventBinding , fqun  
  where eventBinding.biid > $1 and eventBinding.biid = billingItem.biid and fqun.uid = eventBinding.uid  
  group by sid, starttime::date, nasid, importer, role;
PREPARE
test=# explain EXECUTE test(57205899);
                                                              QUERY PLAN                                                               
---------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=12338998.78..13770049.38 rows=18465169 width=148)
   ->  Sort  (cost=12338998.78..12385161.70 rows=18465169 width=148)
         Sort Key: fqun.sid, ((billingitem.starttime)::date), billingitem.nasid, billingitem.importer, eventbinding.role
         ->  Hash Join  (cost=1498473.48..7333418.55 rows=18465169 width=148)
               Hash Cond: (eventbinding.uid = fqun.uid)
               ->  Hash Join  (cost=1496916.06..6916394.83 rows=18465169 width=148)
                     Hash Cond: (billingitem.biid = eventbinding.biid)
                     ->  Seq Scan on billingitem  (cost=0.00..1433087.88 rows=56222688 width=142)
                     ->  Hash  (cost=1175939.45..1175939.45 rows=18465169 width=10)
                           ->  Bitmap Heap Scan on eventbinding  (cost=427409.84..1175939.45 rows=18465169 width=10)
                                 Recheck Cond: (biid > $1)
                                 ->  Bitmap Index Scan on eventbinding_biid_uid_role_idx  (cost=0.00..422793.55 rows=18465169 width=0)
                                       Index Cond: (biid > $1)
               ->  Hash  (cost=943.85..943.85 rows=49085 width=8)
                     ->  Seq Scan on fqun  (cost=0.00..943.85 rows=49085 width=8)
(15 rows)




As a query on the psql command line:

test=# explain 
  select 
    sid, 
    role, 
    starttime::date, 
    nasid, 
    importer, 
    max(eventbinding.biid) as biid, 
    sum(bytesin) as bytesin, 
    sum(bytesout) as bytesout,  
    sum(seconds) as seconds, 
    sum(coalesce(pages, 0)) as pages, 
    sum(coalesce(count, 0)) as count,  
    sum(coalesce(rate, 0.0)) as rate, 
    sum(coalesce(bytesSentRate, 0.0)) as bytesSentRate,  
    sum(coalesce(bytesReceivedRate, 0.0)) as bytesReceivedRate, 
    count(*) as entries  
  from billingItem, eventBinding , fqun  
  where eventBinding.biid > 57205899 and eventBinding.biid = billingItem.biid and fqun.uid = eventBinding.uid  
  group by sid, starttime::date, nasid, importer, role;
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=102496.80..102704.55 rows=5540 width=148)
   ->  Hash Join  (cost=1697.13..102289.05 rows=5540 width=148)
         Hash Cond: (eventbinding.uid = fqun.uid)
         ->  Nested Loop  (cost=139.71..100606.99 rows=5540 width=148)
               ->  Bitmap Heap Scan on eventbinding  (cost=139.71..20547.20 rows=5540 width=10)
                     Recheck Cond: (biid > 57205899)
                     ->  Bitmap Index Scan on eventbinding_biid_uid_role_idx  (cost=0.00..138.33 rows=5540 width=0)
                           Index Cond: (biid > 57205899)
               ->  Index Scan using billingitem_db52003_pkey on billingitem  (cost=0.00..14.44 rows=1 width=142)
                     Index Cond: (billingitem.biid = eventbinding.biid)
         ->  Hash  (cost=943.85..943.85 rows=49085 width=8)
               ->  Seq Scan on fqun  (cost=0.00..943.85 rows=49085 width=8)
(12 rows)

Re: Prepared statements and suboptimal plans

From: Craig Ringer
On 21/09/2011 7:27 AM, Royce Ausburn wrote:
> Hi all,
>
> It looks like I've been hit with this well known issue.  I have
> a complicated query that is intended to run every few minutes, I'm
> using JDBC's Connection.prepareStatement() mostly for nice
> parameterisation, but postgres produces a suboptimal plan due to its
> lack of information when the statement is prepared.
>
> [snip]
>
> We've been worst hit by this query on an 8.3 site.  Another site is
> running 8.4.  Have there been improvements in this area recently?
>  Upgrading to 9.0 might be viable for us.

Tom just mentioned that 9.1 will be able to re-plan parameterized
prepared statements, so this issue will go away. In the meantime you
can only really use the standard workaround of setting the prepare
threshold to 0 to disable server-side prepare, so you can continue to use
JDBC prepared statements and have the driver do the parameter
substitution for you.
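
Something like this, assuming stock PgJDBC (the URL, credentials and query below are placeholders; check the exact threshold semantics against your driver version):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import org.postgresql.PGStatement;

public class NoServerPrepareDemo {
    public static void main(String[] args) throws Exception {
        // Option 1: per connection, via the prepareThreshold URL parameter.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test?prepareThreshold=0",
                "user", "password");

        // Option 2: per statement, via the PgJDBC extension interface.
        PreparedStatement ps = conn.prepareStatement(
                "select max(biid) from eventBinding where biid > ?");
        ((PGStatement) ps).setPrepareThreshold(0);

        ps.setInt(1, 57205899);
        ResultSet rs = ps.executeQuery();
        if (rs.next()) {
            System.out.println("max biid: " + rs.getLong(1));
        }
        rs.close();
        ps.close();
        conn.close();
    }
}

Either way you keep the PreparedStatement API and its parameterisation.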

--
Craig Ringer

Re: Prepared statements and suboptimal plans

From: Royce Ausburn
On 21/09/2011, at 9:39 AM, Craig Ringer wrote:

> On 21/09/2011 7:27 AM, Royce Ausburn wrote:
>> Hi all,
>>
>> It looks like I've been hit with this well known issue.  I have a complicated query that is intended to run every
>> few minutes, I'm using JDBC's Connection.prepareStatement() mostly for nice parameterisation, but postgres produces
>> a suboptimal plan due to its lack of information when the statement is prepared.
>>
>> [snip]
>>
>> We've been worst hit by this query on an 8.3 site.  Another site is running 8.4.  Have there been improvements in
>> this area recently?  Upgrading to 9.0 might be viable for us.
>
> Tom just mentioned that 9.1 will be able to re-plan parameterized prepared statements, so this issue will go away.
> In the meantime you can only really use the standard workaround of setting the prepare threshold to 0 to disable
> server-side prepare, so you can continue to use JDBC prepared statements and have the driver do the parameter
> substitution for you.

Thanks Craig -- that trick helps a lot.

--Royce

Re: Prepared statements and suboptimal plans

From: Tom Lane
Craig Ringer <ringerc@ringerc.id.au> writes:
> On 21/09/2011 7:27 AM, Royce Ausburn wrote:
>> We've been worst hit by this query on an 8.3 site.  Another site is
>> running 8.4.  Have there been improvements in this area recently?
>> Upgrading to 9.0 might be viable for us.

> Tom just mentioned that 9.1 will be able to re-plan parameterized
> prepared statements, so this issue will go away.

9.2, sorry, not 9.1.  We could use some motivated people testing that
aspect of GIT HEAD, though, since I doubt the policy for when to re-plan
is quite ideal yet.

            regards, tom lane

Re: Prepared statements and suboptimal plans

From: Andy Lester

On Sep 20, 2011, at 7:36 PM, Tom Lane wrote:

> 9.2, sorry, not 9.1.  We could use some motivated people testing that
> aspect of GIT HEAD, though, since I doubt the policy for when to re-plan
> is quite ideal yet.

Is motivation and a box enough?  I have motivation, but not knowledge of internals enough to drive testing myself.

xoa

Re: Prepared statements and suboptimal plans

From: Tom Lane
Andy Lester <andy@petdance.com> writes:
> On Sep 20, 2011, at 7:36 PM, Tom Lane wrote:
>> 9.2, sorry, not 9.1.  We could use some motivated people testing that
>> aspect of GIT HEAD, though, since I doubt the policy for when to re-plan
>> is quite ideal yet.

> Is motivation and a box enough?  I have motivation, but not knowledge of internals enough to drive testing myself.

Well, you probably don't need much internals knowledge as long as you
can read a little C.  The crux of what I'm worried about is
choose_custom_plan() in src/backend/utils/cache/plancache.c, which is
really pretty simple: decide whether to use a custom (parameter-aware)
plan or a generic (not-parameter-aware) plan.  You might want to shove
some elog() calls in there for tracing purposes so that you can log what
it did, but then it's just a matter of throwing real-world workloads at
it and seeing if it makes good decisions.

            regards, tom lane

Re: Prepared statements and suboptimal plans

From: Stephen Frost
* Royce Ausburn (royce.ml@inomial.com) wrote:
> > Tom just mentioned that 9.1 will be able to re-plan parameterized prepared statements, so this issue will go away.
> > In the meantime you can only really use the standard workaround of setting the prepare threshold to 0 to disable
> > server-side prepare, so you can continue to use JDBC prepared statements and have the driver do the parameter
> > substitution for you.
>
> Thanks Craig -- that trick helps a lot.

You might also be able to bump up work_mem by a fair bit to get PG to
use a hashagg instead of groupagg/sort, even though its estimate is way
off.  That's what I've done in the past for similar situations and it's
worked well.  I'd recommend increasing it for just this query and then
resetting it (assuming you don't just drop the connection, in which case
you don't need to reset it since a new connection will get the default).
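
From JDBC that's roughly the following (reportSql being your query up-thread, lastBiid its $1 value; the 256MB figure is only an example, size it for your box):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class ReportWithBiggerWorkMem {
    /** Run the big aggregate with a temporarily raised work_mem, then put it back. */
    static void runReport(Connection conn, String reportSql, int lastBiid)
            throws Exception {
        Statement admin = conn.createStatement();
        admin.execute("SET work_mem = '256MB'");   // example value only

        PreparedStatement ps = conn.prepareStatement(reportSql);
        ps.setInt(1, lastBiid);
        ResultSet rs = ps.executeQuery();
        while (rs.next()) {
            // ... consume the aggregated report rows ...
        }
        rs.close();
        ps.close();

        admin.execute("RESET work_mem");           // back to the server default
        admin.close();
    }
}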

    Thanks,

        Stephen


Re: Prepared statements and suboptimal plans

From: Stephen Frost
Tom,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> really pretty simple: decide whether to use a custom (parameter-aware)
> plan or a generic (not-parameter-aware) plan.

Before I go digging into this, I was wondering, is this going to address
our current problem of not being able to use prepared queries and
constraint exclusion...?

    Thanks,

        Stephen


Re: Prepared statements and suboptimal plans

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> really pretty simple: decide whether to use a custom (parameter-aware)
>> plan or a generic (not-parameter-aware) plan.

> Before I go digging into this, I was wondering, is this going to address
> our current problem of not being able to use prepared queries and
> constraint exclusion..?

Possibly ... you didn't say exactly what your problem consists of.

            regards, tom lane