Обсуждение: Simple Join
I'll just start by warning that I'm new-ish to postgresql.
I'm running 8.1 installed from source on a Debian Sarge server.  I have a
simple query that I believe I've placed the indexes correctly for, and I
still end up with a seq scan.  It makes sense, kinda, but it should be able
to use the index to gather the right values.  I do have a production set of
data inserted into the tables, so this is running realistically:
dli=# explain analyze SELECT ordered_products.product_id
dli-# FROM to_ship, ordered_products
dli-# WHERE to_ship.ordered_product_id = ordered_products.id AND
dli-# ordered_products.paid = TRUE AND
dli-# ordered_products.suspended_sub = FALSE;
                                                             QUERY PLAN
                       
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=5126.19..31528.40 rows=20591 width=8) (actual
time=6517.438..25123.115 rows=14367 loops=1)
   Hash Cond: ("outer".ordered_product_id = "inner".id)
   ->  Seq Scan on to_ship  (cost=0.00..11529.12 rows=611612 width=8) (actual
time=393.206..15711.715 rows=611612 loops=1)
   ->  Hash  (cost=4954.79..4954.79 rows=21759 width=16) (actual
time=6076.153..6076.153 rows=18042 loops=1)
         ->  Index Scan using paid_index on ordered_products
(cost=0.00..4954.79 rows=21759 width=16) (actual time=136.472..5966.275
rows=18042 loops=1)
               Index Cond: (paid = true)
               Filter: (paid AND (NOT suspended_sub))
 Total runtime: 25136.190 ms
(8 rows)
This is running on just about the world's slowest server (with a laptop hard
drive to boot), but how can I avoid the seq scan, or in general speed up this
query?
to_ship will have far less tuples than ordered_products, but it's still not
small, as you can see.
			
		Kevin Brown <blargity@gmail.com> writes:
> I'm running 8.1 installed from source on a Debian Sarge server.  I have a
> simple query that I believe I've placed the indexes correctly for, and I
> still end up with a seq scan.  It makes sense, kinda, but it should be able
> to use the index to gather the right values.
I continue to marvel at how many people think that if it's not using an
index it must ipso facto be a bad plan ...
That plan looks perfectly fine to me.  You could try forcing some other
choices by fooling with the planner enable switches (eg set
enable_seqscan = off) but I doubt you'll find much improvement.  There
are too many rows being pulled from ordered_products to make an index
nestloop a good idea.
            regards, tom lane
			
		On Wednesday 14 December 2005 16:47, you wrote: > Kevin Brown <blargity@gmail.com> writes: > > I'm running 8.1 installed from source on a Debian Sarge server. I have a > > simple query that I believe I've placed the indexes correctly for, and I > > still end up with a seq scan. It makes sense, kinda, but it should be > > able to use the index to gather the right values. > > I continue to marvel at how many people think that if it's not using an > index it must ipso facto be a bad plan ... > > That plan looks perfectly fine to me. You could try forcing some other > choices by fooling with the planner enable switches (eg set > enable_seqscan = off) but I doubt you'll find much improvement. There > are too many rows being pulled from ordered_products to make an index > nestloop a good idea. That's fine, so being a postgres novice, as I stated in my original post, what would be the best way to improve performance? Redundant column that's updated via a trigger? I'm asking this list because I'd like to do it right, as opposed to get it done. > regards, tom lane
On 12/14/05, Kevin Brown <blargity@gmail.com> wrote:
> I'll just start by warning that I'm new-ish to postgresql.
>
> I'm running 8.1 installed from source on a Debian Sarge server.  I have a
> simple query that I believe I've placed the indexes correctly for, and I
> still end up with a seq scan.  It makes sense, kinda, but it should be able
> to use the index to gather the right values.  I do have a production set of
> data inserted into the tables, so this is running realistically:
>
what hardware?
> dli=# explain analyze SELECT ordered_products.product_id
> dli-# FROM to_ship, ordered_products
> dli-# WHERE to_ship.ordered_product_id = ordered_products.id AND
> dli-# ordered_products.paid = TRUE AND
> dli-# ordered_products.suspended_sub = FALSE;
>                                                             QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=5126.19..31528.40 rows=20591 width=8) (actual
> time=6517.438..25123.115 rows=14367 loops=1)
>   Hash Cond: ("outer".ordered_product_id = "inner".id)
>   ->  Seq Scan on to_ship  (cost=0.00..11529.12 rows=611612 width=8) (actual
> time=393.206..15711.715 rows=611612 loops=1)
>   ->  Hash  (cost=4954.79..4954.79 rows=21759 width=16) (actual
> time=6076.153..6076.153 rows=18042 loops=1)
>         ->  Index Scan using paid_index on ordered_products
> (cost=0.00..4954.79 rows=21759 width=16) (actual time=136.472..5966.275
> rows=18042 loops=1)
>               Index Cond: (paid = true)
>               Filter: (paid AND (NOT suspended_sub))
>  Total runtime: 25136.190 ms
> (8 rows)
>
show the tables and the indexes for those tables
> This is running on just about the world's slowest server (with a laptop hard
> drive to boot), but how can I avoid the seq scan, or in general speed up this
> query?
>
> to_ship will have far less tuples than ordered_products, but it's still not
> small, as you can see.
>
--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
			
		Kevin Brown wrote:
> I'll just start by warning that I'm new-ish to postgresql.
>
> I'm running 8.1 installed from source on a Debian Sarge server.  I have a
> simple query that I believe I've placed the indexes correctly for, and I
> still end up with a seq scan.  It makes sense, kinda, but it should be able
> to use the index to gather the right values.  I do have a production set of
> data inserted into the tables, so this is running realistically:
>
> dli=# explain analyze SELECT ordered_products.product_id
> dli-# FROM to_ship, ordered_products
> dli-# WHERE to_ship.ordered_product_id = ordered_products.id AND
> dli-# ordered_products.paid = TRUE AND
> dli-# ordered_products.suspended_sub = FALSE;
You scan 600000 rows from to_ship to get about 25000 - so some way to
cut this down would help.
Try out an explicit INNER JOIN which includes the filter info for paid
and suspended_sub in the join condition (you may need indexes on each of
id, paid and suspended_sub, so that the 8.1 optimizer can use a bitmap
scan):
SELECT ordered_products.product_id
FROM to_ship INNER JOIN ordered_products
ON (to_ship.ordered_product_id = ordered_products.id
     AND ordered_products.paid = TRUE      AND
ordered_products.suspended_sub = FALSE);
			
		On Wed, Dec 14, 2005 at 04:03:52PM -0600, Kevin Brown wrote: > -> Index Scan using paid_index on ordered_products > (cost=0.00..4954.79 rows=21759 width=16) (actual time=136.472..5966.275 > rows=18042 loops=1) > Index Cond: (paid = true) > Filter: (paid AND (NOT suspended_sub)) > Total runtime: 25136.190 ms You might want to consider an index on (paid,suspended_sub), not just (paid); it's probably not going to give you any dramatic improvements, but it could help a bit. /* Steinar */ -- Homepage: http://www.sesse.net/
On Wednesday 14 December 2005 17:23, you wrote:
> what hardware?
Via 800 mhz (about equiv to a 300 mhz pentium 2)
128 mb of slow ram
4200 rpm ide hard drive.
Told you it was slow. :-)
This is not the production system.  I don't expect this to be "fast" but
everything else happens in under 2 seconds, so I know I could do this faster.
Especially becaue the information I'm looking for probably just needs some
denormalization, or other such trick to pop right out.  I'm using this system
so I can locate my performance bottlenecks easier, and actually, it's plenty
fast enough except for this one single query.  I don't necessarily want to
optimize the query, more than just get the info faster, so that's why I'm
posting here.
> show the tables and the indexes for those tables
No prob:
CREATE TABLE to_ship
(
  id int8 NOT NULL DEFAULT nextval(('to_ship_seq'::text)::regclass),
  ordered_product_id int8 NOT NULL,
  bounced int4 NOT NULL DEFAULT 0,
  operator_id varchar(20) NOT NULL,
  "timestamp" timestamptz NOT NULL DEFAULT ('now'::text)::timestamp(6) with
time zone,
  CONSTRAINT to_ship_pkey PRIMARY KEY (id),
  CONSTRAINT to_ship_ordered_product_id_fkey FOREIGN KEY (ordered_product_id)
REFERENCES ordered_products (id) ON UPDATE RESTRICT ON DELETE RESTRICT
)
WITHOUT OIDS;
CREATE TABLE ordered_products
(
  id int8 NOT NULL DEFAULT nextval(('ordered_products_seq'::text)::regclass),
  order_id int8 NOT NULL,
  product_id int8 NOT NULL,
  recipient_address_id int8 NOT NULL,
  hide bool NOT NULL DEFAULT false,
  renewal bool NOT NULL DEFAULT false,
  "timestamp" timestamptz NOT NULL DEFAULT ('now'::text)::timestamp(6) with
time zone,
  operator_id varchar(20) NOT NULL,
  suspended_sub bool NOT NULL DEFAULT false,
  quantity int4 NOT NULL DEFAULT 1,
  price_paid numeric NOT NULL,
  tax_paid numeric NOT NULL DEFAULT 0,
  shipping_paid numeric NOT NULL DEFAULT 0,
  remaining_issue_obligation int4 NOT NULL DEFAULT 0,
  parent_product_id int8,
  delivery_method_id int8 NOT NULL,
  paid bool NOT NULL DEFAULT false,
  CONSTRAINT ordered_products_pkey PRIMARY KEY (id),
  CONSTRAINT ordered_products_order_id_fkey FOREIGN KEY (order_id) REFERENCES
orders (id) ON UPDATE RESTRICT ON DELETE RESTRICT,
  CONSTRAINT ordered_products_parent_product_id_fkey FOREIGN KEY
(parent_product_id) REFERENCES ordered_products (id) ON UPDATE RESTRICT ON
DELETE RESTRICT,
  CONSTRAINT ordered_products_recipient_address_id_fkey FOREIGN KEY
(recipient_address_id) REFERENCES addresses (id) ON UPDATE RESTRICT ON DELETE
RESTRICT
)
WITHOUT OIDS;
=== The two indexes that should matter ===
CREATE INDEX ordered_product_id_index
  ON to_ship
  USING btree
  (ordered_product_id);
CREATE INDEX paid_index
  ON ordered_products
  USING btree
  (paid);
ordered_products.id is a primary key, so it should have an implicit index.
			
		On Wednesday 14 December 2005 17:30, Mark Kirkwood wrote:
> You scan 600000 rows from to_ship to get about 25000 - so some way to
> cut this down would help.
Yup.  I'm open to anything too, as this is the only real part of the system
that cares.  So either maintaining a denormalized copy column, or whatever
would be fine.  We're doing far more reads than writes.
> Try out an explicit INNER JOIN which includes the filter info for paid
> and suspended_sub in the join condition (you may need indexes on each of
> id, paid and suspended_sub, so that the 8.1 optimizer can use a bitmap
> scan):
I only had two explicit indexes.  One was on to_ship.ordered_product_id and
the other was on ordered_products.paid.  ordered_products.id is a primary
key.  This is on your query with an index added on suspended_sub:
dli=# explain analyze SELECT ordered_products.product_id
dli-# FROM to_ship INNER JOIN ordered_products
dli-# ON (to_ship.ordered_product_id = ordered_products.id
dli(#      AND ordered_products.paid = TRUE      AND
dli(# ordered_products.suspended_sub = FALSE);
                                                               QUERY PLAN
                         
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=5126.19..31528.40 rows=20591 width=8) (actual
time=4554.190..23519.618 rows=14367 loops=1)
   Hash Cond: ("outer".ordered_product_id = "inner".id)
   ->  Seq Scan on to_ship  (cost=0.00..11529.12 rows=611612 width=8) (actual
time=11.254..15192.042 rows=611612 loops=1)
   ->  Hash  (cost=4954.79..4954.79 rows=21759 width=16) (actual
time=4494.900..4494.900 rows=18042 loops=1)
         ->  Index Scan using paid_index on ordered_products
(cost=0.00..4954.79 rows=21759 width=16) (actual time=72.431..4414.697
rows=18042 loops=1)
               Index Cond: (paid = true)
               Filter: (paid AND (NOT suspended_sub))
 Total runtime: 23532.785 ms
(8 rows)
So what's the best way to performance wiggle this info out of the db?  The
list of values is only about 30 tuples long out of this query, so I was
figuring I could trigger on insert to to_ship to place the value into another
table if it didn't already exist.  I'd rather the writing be slow than the
reading.
			
		Kevin Brown wrote:
>
>
> I only had two explicit indexes.  One was on to_ship.ordered_product_id and
> the other was on ordered_products.paid.  ordered_products.id is a primary
> key.  This is on your query with an index added on suspended_sub:
>
> dli=# explain analyze SELECT ordered_products.product_id
> dli-# FROM to_ship INNER JOIN ordered_products
> dli-# ON (to_ship.ordered_product_id = ordered_products.id
> dli(#      AND ordered_products.paid = TRUE      AND
> dli(# ordered_products.suspended_sub = FALSE);
>                                                                QUERY PLAN
                           
>
---------------------------------------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=5126.19..31528.40 rows=20591 width=8) (actual
> time=4554.190..23519.618 rows=14367 loops=1)
>    Hash Cond: ("outer".ordered_product_id = "inner".id)
>    ->  Seq Scan on to_ship  (cost=0.00..11529.12 rows=611612 width=8) (actual
> time=11.254..15192.042 rows=611612 loops=1)
>    ->  Hash  (cost=4954.79..4954.79 rows=21759 width=16) (actual
> time=4494.900..4494.900 rows=18042 loops=1)
>          ->  Index Scan using paid_index on ordered_products
> (cost=0.00..4954.79 rows=21759 width=16) (actual time=72.431..4414.697
> rows=18042 loops=1)
>                Index Cond: (paid = true)
>                Filter: (paid AND (NOT suspended_sub))
>  Total runtime: 23532.785 ms
> (8 rows)
>
Well - that had no effect at all :-) You don't have and index on
to_ship.ordered_product_id do you? - try adding one (ANALYZE again), and
let use know what happens (you may want to play with SET
enable_seqscan=off as well).
And also, if you are only ever interested in paid = true and
suspended_sub = false, then you can recreate these indexes as partials -
e.g:
CREATE INDEX paid_index ON ordered_products (paid) WHERE paid = true;
CREATE INDEX suspended_sub_index ON ordered_products (suspended_sub)
WHERE suspended_sub = false;
> So what's the best way to performance wiggle this info out of the db?  The
> list of values is only about 30 tuples long out of this query, so I was
> figuring I could trigger on insert to to_ship to place the value into another
> table if it didn't already exist.  I'd rather the writing be slow than the
> reading.
Yeah - all sort of horrible denormalizations are possible :-), hopefully
we can get the original query to work ok, and avoid the need to add code
or triggers to you app.
			
		On Wednesday 14 December 2005 18:36, you wrote: > Well - that had no effect at all :-) You don't have and index on > to_ship.ordered_product_id do you? - try adding one (ANALYZE again), and > let use know what happens (you may want to play with SET > enable_seqscan=off as well). I _DO_ have an index on to_ship.ordered_product_id. It's a btree. > And also, if you are only ever interested in paid = true and > suspended_sub = false, then you can recreate these indexes as partials - > e.g: > > CREATE INDEX paid_index ON ordered_products (paid) WHERE paid = true; > CREATE INDEX suspended_sub_index ON ordered_products (suspended_sub) > WHERE suspended_sub = false; They're currently defined as individuals and I'm depending on the bitmap indexing. > > So what's the best way to performance wiggle this info out of the db? > > The list of values is only about 30 tuples long out of this query, so I > > was figuring I could trigger on insert to to_ship to place the value into > > another table if it didn't already exist. I'd rather the writing be slow > > than the reading. > > Yeah - all sort of horrible denormalizations are possible :-), hopefully > we can get the original query to work ok, and avoid the need to add code > or triggers to you app. That'd be great.
On Thursday 15 December 2005 00:52, you wrote:
> On Wed, 2005-12-14 at 17:47 -0500, Tom Lane wrote:
> > That plan looks perfectly fine to me.  You could try forcing some other
> > choices by fooling with the planner enable switches (eg set
> > enable_seqscan = off) but I doubt you'll find much improvement.  There
> > are too many rows being pulled from ordered_products to make an index
> > nestloop a good idea.
>
> Well, I'm no expert either, but if there was an index on
> ordered_products (paid, suspended_sub, id) it should be mergejoinable
> with the index on to_ship.ordered_product_id, right?  Given the
> conditions on paid and suspended_sub.
>
> If you (Kevin) try adding such an index, ideally it would get used given
> that you're only pulling out a small fraction of the rows in to_ship.
> If it doesn't get used, then I had a similar issue with 8.0.3 where an
> index that was mergejoinable (only because of the restrictions in the
> where clause) wasn't getting picked up.
The following is already there:
CREATE INDEX ordered_product_id_index
  ON to_ship
  USING btree
  (ordered_product_id);
That's why I emailed this list.
> Mitch
>
> Kevin Brown wrote:
> > CREATE TABLE to_ship
> > (
> >   id int8 NOT NULL DEFAULT nextval(('to_ship_seq'::text)::regclass),
> >   ordered_product_id int8 NOT NULL,
> >   bounced int4 NOT NULL DEFAULT 0,
> >   operator_id varchar(20) NOT NULL,
> >   "timestamp" timestamptz NOT NULL DEFAULT ('now'::text)::timestamp(6)
> > with
> > time zone,
> >   CONSTRAINT to_ship_pkey PRIMARY KEY (id),
> >   CONSTRAINT to_ship_ordered_product_id_fkey FOREIGN KEY
> > (ordered_product_id)
> > REFERENCES ordered_products (id) ON UPDATE RESTRICT ON DELETE RESTRICT
> > )
> > WITHOUT OIDS;
> >
> > CREATE TABLE ordered_products
> > (
> >   id int8 NOT NULL DEFAULT
> > nextval(('ordered_products_seq'::text)::regclass),
> >   order_id int8 NOT NULL,
> >   product_id int8 NOT NULL,
> >   recipient_address_id int8 NOT NULL,
> >   hide bool NOT NULL DEFAULT false,
> >   renewal bool NOT NULL DEFAULT false,
> >   "timestamp" timestamptz NOT NULL DEFAULT ('now'::text)::timestamp(6)
> > with
> > time zone,
> >   operator_id varchar(20) NOT NULL,
> >   suspended_sub bool NOT NULL DEFAULT false,
> >   quantity int4 NOT NULL DEFAULT 1,
> >   price_paid numeric NOT NULL,
> >   tax_paid numeric NOT NULL DEFAULT 0,
> >   shipping_paid numeric NOT NULL DEFAULT 0,
> >   remaining_issue_obligation int4 NOT NULL DEFAULT 0,
> >   parent_product_id int8,
> >   delivery_method_id int8 NOT NULL,
> >   paid bool NOT NULL DEFAULT false,
> >   CONSTRAINT ordered_products_pkey PRIMARY KEY (id),
> >   CONSTRAINT ordered_products_order_id_fkey FOREIGN KEY (order_id)
> > REFERENCES
> > orders (id) ON UPDATE RESTRICT ON DELETE RESTRICT,
> >   CONSTRAINT ordered_products_parent_product_id_fkey FOREIGN KEY
> > (parent_product_id) REFERENCES ordered_products (id) ON UPDATE
> > RESTRICT ON
> > DELETE RESTRICT,
> >   CONSTRAINT ordered_products_recipient_address_id_fkey FOREIGN KEY
> > (recipient_address_id) REFERENCES addresses (id) ON UPDATE RESTRICT ON
> > DELETE
> > RESTRICT
> > )
> > WITHOUT OIDS;
> >
> > === The two indexes that should matter ===
> > CREATE INDEX ordered_product_id_index
> >   ON to_ship
> >   USING btree
> >   (ordered_product_id);
> >
> > CREATE INDEX paid_index
> >   ON ordered_products
> >   USING btree
> >   (paid);
> >
> > ordered_products.id is a primary key, so it should have an implicit
> > index.
			
		Kevin Brown wrote: > On Wednesday 14 December 2005 18:36, you wrote: > >>Well - that had no effect at all :-) You don't have and index on >>to_ship.ordered_product_id do you? - try adding one (ANALYZE again), and >>let use know what happens (you may want to play with SET >>enable_seqscan=off as well). > > > I _DO_ have an index on to_ship.ordered_product_id. It's a btree. > Sorry - read right past it! Did you try out enable_seqscan=off? I'm interested to see if we can get 8.1 bitmap anding the three possibly useful columns together on ordered_products and *then* doing the join to to_ship. Cheers Mark
On Thu, 2005-12-15 at 01:48 -0600, Kevin Brown wrote: > > Well, I'm no expert either, but if there was an index on > > ordered_products (paid, suspended_sub, id) it should be mergejoinable > > with the index on to_ship.ordered_product_id, right? Given the > > conditions on paid and suspended_sub. > > > The following is already there: > > CREATE INDEX ordered_product_id_index > ON to_ship > USING btree > (ordered_product_id); > > That's why I emailed this list. I saw that; what I'm suggesting is that that you try creating a 3-column index on ordered_products using the paid, suspended_sub, and id columns. In that order, I think, although you could also try the reverse. It may or may not help, but it's worth a shot--the fact that all of those columns are used together in the query suggests that you might do better with a three-column index on those. With all three columns indexed individually, you're apparently not getting the bitmap plan that Mark is hoping for. I imagine this has to do with the lack of multi-column statistics in postgres, though you could also try raising the statistics target on the columns of interest. Setting enable_seqscan to off, as others have suggested, is also a worthwhile experiment, just to see what you get. Mitch
Mitch Skinner wrote: > I saw that; what I'm suggesting is that that you try creating a 3-column > index on ordered_products using the paid, suspended_sub, and id columns. > In that order, I think, although you could also try the reverse. It may > or may not help, but it's worth a shot--the fact that all of those > columns are used together in the query suggests that you might do better > with a three-column index on those. > > With all three columns indexed individually, you're apparently not > getting the bitmap plan that Mark is hoping for. I imagine this has to > do with the lack of multi-column statistics in postgres, though you > could also try raising the statistics target on the columns of interest. > > Setting enable_seqscan to off, as others have suggested, is also a > worthwhile experiment, just to see what you get. > > Right on. Some of these "coerced" plans may perform much better. If so, we can look at tweaking your runtime config: e.g. effective_cache_size random_page_cost default_statistics_target to see if said plans can be chosen "naturally". cheers Mark
I asked a while back if there were any plans to allow developers to override the optimizer's plan and force certain
plans,and received a fairly resounding "No".  The general feeling I get is that a lot of work has gone into the
optimizer,and by God we're going to use it! 
I think this is just wrong, and I'm curious whether I'm alone in this opinion.
Over and over, I see questions posted to this mailing list about execution plans that don't work out well.  Many times
thereare good answers - add an index, refactor the design, etc. - that yield good results.  But, all too often the
answercomes down to something like this recent one: 
   > Right on. Some of these "coerced" plans may perform
   > much better. If so, we can look at tweaking your runtime
   > config: e.g.
   >
   > effective_cache_size
   > random_page_cost
   > default_statistics_target
   >
   > to see if said plans can be chosen "naturally".
I see this over and over.  Tweak the parameters to "force" a certain plan, because there's no formal way for a
developerto say, "I know the best plan." 
There isn't a database in the world that is as smart as a developer, or that can have insight into things that only a
developercan possibly know.  Here's a real-life example that caused me major headaches.  It's a trivial query, but
Postgrestotally blows it: 
    select * from my_table
     where row_num >= 50000 and row_num < 100000
     and myfunc(foo, bar);
How can Postgres possibly know what "myfunc()" does?  In this example, my_table is about 10 million rows and row_num is
indexed. When the row_num range is less than about 30,000, Postgres (correctly) uses an row_num index scan, then
filtersby myfunc().  But beyond that, it chooses a sequential scan, filtering by myfunc().  This is just wrong.
Postgrescan't possibly know that myfunc() is VERY expensive.  The correct plan would be to switch from index to
filteringon row_num.  Even if 99% of the database is selected by row_num, it should STILL at least filter by row_num
first,and only filter by myfunc() as the very last step. 
How can a database with no ability to override a plan possibly cope with this?
Without the explicit ability to override the plan Postgres generates, these problems dominate our development efforts.
Postgresdoes an excellent job optimizing on 90% of the SQL we write, but the last 10% is nearly impossible to get
right. We spend huge amounts of time on trial-and-error queries, second guessing Postgress, creating unnecessary
temporarytables, sticking in the occasional OFFSET in a subquery to prevent merging layers, and so forth. 
This same application also runs on Oracle, and although I've cursed Oracle's stupid planner many times, at least I can
forceit to do it right if I need to. 
The danger of forced plans is that inexperienced developers tend to abuse them.  So it goes -- the documentation should
beclear that forced plans are always a last resort.   
But there's no getting around the fact that Postgres needs a way for a developer to specify the execution plan.
Craig
			
		"Craig A. James" <cjames@modgraph-usa.com> writes:
> I see this over and over.  Tweak the parameters to "force" a certain
> plan, because there's no formal way for a developer to say, "I know
> the best plan."
I think you've misunderstood those conversations entirely.  The point
is not to force the planner into a certain plan, it is to explore what's
going on with a view to understanding why the planner isn't making a
good choice, and thence hopefully improve the planner in future.  (Now,
that's not necessarily what the user with an immediate problem is
thinking, but that's definitely what the developers are thinking.)
> There isn't a database in the world that is as smart as a developer,
People who are convinced they are smarter than the machine are often
wrong ;-).  If we did put in the nontrivial amount of work needed to
have such a facility, it would probably get abused more often than it
was used correctly.  I'd rather spend the work on making the planner
better.
This discussion has been had before (many times) ... see the -hackers
archives for detailed arguments.  The one that carries the most weight
in my mind is that planner hints embedded in applications will not adapt
to changing circumstances --- the plan that was best when you designed
the code might not be best today.
            regards, tom lane
			
		> select * from my_table where row_num >= 50000 and row_num < 100000 > and myfunc(foo, bar); You just create an index on myfunc(foo, bar) Chris
On 12/15/05, Christopher Kings-Lynne <chriskl@familyhealth.com.au> wrote: > > select * from my_table where row_num >= 50000 and row_num < 100000 > > and myfunc(foo, bar); > > You just create an index on myfunc(foo, bar) > > Chris > only if myfunc(foo, bar) is immutable... -- regards, Jaime Casanova (DBA: DataBase Aniquilator ;)
>>> select * from my_table where row_num >= 50000 and row_num < 100000 >>> and myfunc(foo, bar); >> >>You just create an index on myfunc(foo, bar) > > only if myfunc(foo, bar) is immutable... And if it's not then the best any database can do is to index scan row_num - so still you have no problem. Chris
Tom, >>I see this over and over. Tweak the parameters to "force" a certain >>plan, because there's no formal way for a developer to say, "I know >>the best plan." > > I think you've misunderstood those conversations entirely. The point > is not to force the planner into a certain plan, it is to explore what's > going on with a view to understanding why the planner isn't making a > good choice, and thence hopefully improve the planner in future. No, I understood the conversations very clearly. But no matter how clever the optimizer, it simply can't compete with adeveloper who has knowledge that Postgres *can't* have. The example of a user-written function is obvious. >>There isn't a database in the world that is as smart as a developer, > > People who are convinced they are smarter than the machine are often > wrong ;-). Often, but not always -- as I noted in my original posting. And when the developer is smarter than Postgres, and Postgresmakes the wrong choice, what is the developer supposed to do? This isn't academic -- the wrong plans Postgres makescan be *catastrophic*, e.g. turning a 3-second query into a three-hour query. How about this: Instead of arguing in the abstract, tell me in concrete terms how you would address the very specific exampleI gave, where myfunc() is a user-written function. To make it a little more challenging, try this: myfunc() can behavevery differently depending on the parameters, and sometimes (but not always), the application knows how it will behaveand could suggest a good execution plan. (And before anyone suggests that I rewrite myfunc(), I should explain that it's in the class of NP-complete problems. Thefunction is inherently hard and can't be made faster or more predictable.) The example I raised in a previous thread, of irregular usage, is the same: I have a particular query that I *always* wantto be fast even if it's only used rarely, but the system swaps its tables out of the file-system cache, based on "lowusage", even though the "high usage" queries are low priority. How can Postgres know such things when there's no wayfor me to tell it? The answers from the Postgres community were essentially, "Postgres is smarter than you, let it do its job." Unfortunately,this response completely ignores the reality: Postgres is NOT doing its job, and can't, because it doesn'thave enough information. Craig
Christopher Kings-Lynne wrote: >> select * from my_table where row_num >= 50000 and row_num < 100000 >> and myfunc(foo, bar); > > > You just create an index on myfunc(foo, bar) Thanks, but myfunc() takes parameters (shown here as "foo, bar"), one of which is not a column, it's external and changeswith every query. A function index won't work. Craig
> > Right on. Some of these "coerced" plans may perform > much better. > If so, we can look at tweaking your runtime > > config: e.g. > > > > effective_cache_size > > random_page_cost > > default_statistics_target > > > > to see if said plans can be chosen "naturally". > > I see this over and over. Tweak the parameters to "force" a certain > plan, because there's no formal way for a developer to say, "I know the > best plan." No, this is "fixing your wrongn, inaccurate parameters so that postgresql can choose a better plan". I don't necessarily disagree with your assertion that we need planner hints, but unless you or someone else is willing to submit a patch with the feature it's unlikely to ever be implemented... Chris
Christopher Kings-Lynne wrote: >>>> select * from my_table where row_num >= 50000 and row_num < >>>> 100000 >>>> and myfunc(foo, bar); >>> >>> >>> You just create an index on myfunc(foo, bar) >> >> >> only if myfunc(foo, bar) is immutable... > > > And if it's not then the best any database can do is to index scan > row_num - so still you have no problem. Boy, you picked a *really* bad example ;-) The problem is that Postgres decided to filter on myfunc() *first*, and then filter on row_num, resulting in a query timethat jumped from seconds to hours. And there's no way for me to tell Postgres not to do that! So, "you still have no problem" is exactly wrong, because Postgres picked the wrong plan. Postgres decided that applyingmyfunc() to 10,000,000 rows was a better plan than an index scan of 50,000 row_nums. So I'm screwed. Craig
Craig A. James wrote: > I asked a while back if there were any plans to allow developers to > override the optimizer's plan and force certain plans, and received a > fairly resounding "No". The general feeling I get is that a lot of work > has gone into the optimizer, and by God we're going to use it! > > I think this is just wrong, and I'm curious whether I'm alone in this > opinion. > > Over and over, I see questions posted to this mailing list about > execution plans that don't work out well. Many times there are good > answers - add an index, refactor the design, etc. - that yield good > results. But, all too often the answer comes down to something like > this recent one: > > > Right on. Some of these "coerced" plans may perform > much better. > If so, we can look at tweaking your runtime > > config: e.g. > > > > effective_cache_size > > random_page_cost > > default_statistics_target > > > > to see if said plans can be chosen "naturally". > > I see this over and over. Tweak the parameters to "force" a certain > plan, because there's no formal way for a developer to say, "I know the > best plan." > I hear what you are saying, but to use this fine example - I don't know what the best plan is - these experiments part of an investigation to find *if* there is a better plan, and if so, why Postgres is not finding it. > There isn't a database in the world that is as smart as a developer, or > that can have insight into things that only a developer can possibly > know. That is often true - but the aim is to get Postgres's optimizer closer to developer smartness. After years of using several other database products (some supporting hint type constructs and some not), I have come to believe that hinting (or similar) actually *hinders* the development of a great optimizer. Best wishes Mark
> Boy, you picked a *really* bad example ;-) > > The problem is that Postgres decided to filter on myfunc() *first*, and > then filter on row_num, resulting in a query time that jumped from > seconds to hours. And there's no way for me to tell Postgres not to do > that! Can you paste explain analyze and your effective_cache_size, etc. settings. > So, "you still have no problem" is exactly wrong, because Postgres > picked the wrong plan. Postgres decided that applying myfunc() to > 10,000,000 rows was a better plan than an index scan of 50,000 > row_nums. So I'm screwed. This seems like a case where PostgreSQL's current optimiser should easily know what to do if your config settings are correct and you've been running ANALYZE, so I'd like to see your settings and the explain analyze plan... Chris
Christopher Kings-Lynne wrote: > I don't necessarily disagree with your assertion that we need planner > hints, but unless you or someone else is willing to submit a patch with > the feature it's unlikely to ever be implemented... Now that's an answer I understand and appreciate. Open-source development relies on many volunteers, and I've benefittedfrom it since the early 1980's when emacs and Common Lisp first came to my attention. I've even written a widely-circulatedarticle about open-source development, which some of you may have read: http://www.moonviewscientific.com/essays/software_lifecycle.htm I hope nobody here thinks I'm critical of all the hard work that's been put into Postgres. My hope is to raise the awarenessof this issue in the hope that it's at least put on "the list" for serious consideration. Craig
Mark Kirkwood wrote: > I hear what you are saying, but to use this fine example - I don't know > what the best plan is - these experiments part of an investigation to > find *if* there is a better plan, and if so, why Postgres is not finding > it. > >> There isn't a database in the world that is as smart as a developer, >> or that can have insight into things that only a developer can >> possibly know. > > That is often true - but the aim is to get Postgres's optimizer closer > to developer smartness. What would be cool would be some way the developer could alter the plan, but they way of doing so would strongly encouragethe developer to send the information to this mailing list. Postgres would essentially say, "Ok, you can do that,but we want to know why!" > After years of using several other database products (some supporting > hint type constructs and some not), I have come to believe that hinting > (or similar) actually *hinders* the development of a great optimizer. I agree. It takes the pressure off the optimizer gurus. If the users can just work around every problem, then the optimizercan suck and the system is still usable. Lest anyone think I'm an all-out advocate of overriding the optimizer, I know from first-hand experience what a catastropheit can be. An Oracle hint I used worked fine on my test schema, but the customer's "table" turned out to be aview, and Oracle's optimizer worked well on the view whereas my hint was horrible. Unfortunately, without the hint, Oraclesucked when working on an ordinary table. Hints are dangerous, and I consider them a last resort. Craig
Tom Lane wrote: > This discussion has been had before (many times) ... see the -hackers > archives for detailed arguments. The one that carries the most weight > in my mind is that planner hints embedded in applications will not adapt > to changing circumstances --- the plan that was best when you designed > the code might not be best today. Absolutely right. But what am I supposed to do *today* if the planner makes a mistake? Shut down my web site? Ropes are useful, but you can hang yourself with them. Knives are useful, but you can cut yourself with them. Should weban useful tools because they cause harm to the careless? Craig
Christopher Kings-Lynne wrote: > Can you paste explain analyze and your effective_cache_size, etc. settings. > ... > This seems like a case where PostgreSQL's current optimiser should > easily know what to do if your config settings are correct and you've > been running ANALYZE, so I'd like to see your settings and the explain > analyze plan... I could, but it would divert us from the main topic of this discussion. It's not about that query, which was just an example. It's the larger issue. Tom's earlier response tells the story better than I can: > This discussion has been had before (many times) ... see > the -hackers archives for detailed arguments. If it's "been had before (many times)", and now I'm bringing it up again, then it's clearly an ongoing problem that hasn'tbeen resolved. Craig
Craig A. James wrote: > > What would be cool would be some way the developer could alter the plan, > but they way of doing so would strongly encourage the developer to send > the information to this mailing list. Postgres would essentially say, > "Ok, you can do that, but we want to know why!" > Yeah it would - an implementation I have seen that I like is where the developer can supply the *entire* execution plan with a query. This is complex enough to make casual use unlikely :-), but provides the ability to try out other plans, and also fix that vital query that must run today..... cheers Mark
>> ... This seems like a case where PostgreSQL's current optimiser should >> easily know what to do if your config settings are correct and you've >> been running ANALYZE, so I'd like to see your settings and the explain >> analyze plan... > > I could, but it would divert us from the main topic of this discussion. > It's not about that query, which was just an example. It's the larger > issue. So your main example bad query is possibly just a case of lack of analyze stats and wrong postgresql.conf config? And that's what causes you to shut down your database? Don't you want your problem FIXED? But like I said - no developer is interested in doing planner hints. Possibly you could get a company to sponsor it. Maybe what you want is a statement of "If someone submits a good, working, fully implemented patch that does planner hints, then we'll accept it." Chris
Christopher Kings-Lynne wrote:
> So your main example bad query is possibly just a case of lack of
> analyze stats and wrong postgresql.conf config?  And that's what causes
> you to shut down your database?  Don't you want your problem FIXED?
I'm trying to help by raising a question that I think is important, and have an honest, perhaps vigorous, but
respectful,discussion about it.  I respect everyone's opinion, and I hope you respect mine.  I've been in this business
along time, and I don't raise issues lightly. 
Yes, I want my query fixed.  And I may post it, in a thread with a new title.  In fact, I posted a different query with
essentiallythe same problem a while back and got nothing that helped: 
    http://archives.postgresql.org/pgsql-performance/2005-11/msg00133.php
(I can't help but point out that Tom's response was to suggest a way to fool the optimizer so as to prevent it from
"optimizing"the query.  In other words, he told me a trick that would force a particular plan on the optimizer.  Which
isexactly the point of this discussion.) 
The point is that the particular query is not relevant -- it's the fact that this topic (according to Tom) has been and
continuesto be raised.  This should tell us all something, that it's not going to go away, and that it's a real issue. 
Regards,
Craig
			
		> Yeah it would - an implementation I have seen that I like is where the > developer can supply the *entire* execution plan with a query. This is > complex enough to make casual use unlikely :-), but provides the ability > to try out other plans, and also fix that vital query that must run > today..... So, to move on to the concrete... I'm not familiar with the innards of Postgres except in a theoretical way. Maybe this is a totally naive or dumb question,but I have to ask: How hard would it be to essentially turn off the optimizer? 1. Evaluate WHERE clauses left-to-right. select ... from FOO where A and B and C; This would just apply the criteria left-to-right, first A, then B, then C. If an index was available it would use it, butonly in left-to-right order, i.e. if A had no index but B did, then too bad, you should have written "B and A and C". 2. Evaluate joins left-to-right. select ... from FOO join BAR on (...) join BAZ on (...) where ... This would join FOO to BAR, then join the result to BAZ. The only optimization would be to apply relevant "where" conditionsto each join before processing the next join. 3. Don't flatten sub-selects select ... from (select ... from FOO where ...) as X where ...; This would do the inner select then use the result in the outer select, and wouldn't attempt to flatten the query. Thanks, Craig
On Thu, 15 Dec 2005, Craig A. James wrote: > Mark Kirkwood wrote: >> I hear what you are saying, but to use this fine example - I don't know >> what the best plan is - these experiments part of an investigation to find >> *if* there is a better plan, and if so, why Postgres is not finding it. >> >>> There isn't a database in the world that is as smart as a developer, or >>> that can have insight into things that only a developer can possibly know. >> >> That is often true - but the aim is to get Postgres's optimizer closer to >> developer smartness. > > What would be cool would be some way the developer could alter the plan, but > they way of doing so would strongly encourage the developer to send the > information to this mailing list. Postgres would essentially say, "Ok, you > can do that, but we want to know why!" at the risk of sounding flippent (which is NOT what I intend) I will point out that with the source you can change the optimizer any way you need to :-) that being said, in your example the issue is the cost of the user created function and the fact that postgres doesn't know it's cost. would a resonable answer be to give postgres a way to learn how expensive the call is? a couple ways I could see to do this. 1. store some stats automagicly when the function is called and update the optimization plan when you do an ANALYSE 2. provide a way for a user to explicitly set a cost factor for a function (with a default value that's sane for fairly trivial functions so that it would only have to be set for unuseually expensive functions) now, neither of these will work all the time if a given function is sometimes cheap and sometimes expensive (depending on it's parameters), but in that case I would say that if the application knows that a function will be unusueally expensive under some conditions (and knows what those conditions will be) it may be a reasonable answer to duplicate the function, one copy that it uses most of the time, and a second copy that it uses when it expects it to be expensive. at this point the cost of the function can be set via either of the methods listed above) >> After years of using several other database products (some supporting hint >> type constructs and some not), I have come to believe that hinting (or >> similar) actually *hinders* the development of a great optimizer. > > I agree. It takes the pressure off the optimizer gurus. If the users can > just work around every problem, then the optimizer can suck and the system is > still usable. > > Lest anyone think I'm an all-out advocate of overriding the optimizer, I know > from first-hand experience what a catastrophe it can be. An Oracle hint I > used worked fine on my test schema, but the customer's "table" turned out to > be a view, and Oracle's optimizer worked well on the view whereas my hint was > horrible. Unfortunately, without the hint, Oracle sucked when working on an > ordinary table. Hints are dangerous, and I consider them a last resort. I've been on the linux-kernel mailing list for the last 9 years, and have seen a similar debate rage during that entire time about kernel memory management. overall both of these tend to be conflicts between short-term and long-term benifits. in the short-term the application user wants to be able to override the system to get the best performance _now_ in the long run the system designers don't trust the application programmers to get the hints right and want to figure out the right optimizer plan, even if it takes a lot longer to do so. the key to this balance seems to be to work towards as few controls as possible, becouse the user will get them wrong far more frequently then they get them right, but when you hit a point where there's absolutly no way for the system to figure things out (and it's a drastic difference) provide the application with a way to hint to the system that things are unusueal, but always keep looking for patterns that will let the system detect the need itself even the existing defaults are wrong as frequently as they are right (they were set when hardware was very different then it is today) so some way to gather real-world stats and set the system defaults based on actual hardware performance is really the right way to go (even for things like sequential scan speed that are set in the config file today) David Lang
On Fri, 16 Dec 2005, Mark Kirkwood wrote: > > Right on. Some of these "coerced" plans may perform much better. If so, we > can look at tweaking your runtime config: e.g. > > effective_cache_size > random_page_cost > default_statistics_target > > to see if said plans can be chosen "naturally". Mark, I've seen these config options listed as tweaking targets fairly frequently, has anyone put any thought or effort into creating a test program that could analyse the actual system and set the defaults based on the measured performance? David Lang
On 12/15/05, Craig A. James <cjames@modgraph-usa.com> wrote: > > Yeah it would - an implementation I have seen that I like is where the > > developer can supply the *entire* execution plan with a query. This is > > complex enough to make casual use unlikely :-), but provides the ability > > to try out other plans, and also fix that vital query that must run > > today..... > > So, to move on to the concrete... > > I'm not familiar with the innards of Postgres except in a theoretical way. > Maybe this is a totally naive or dumb question, but I have to ask: How > hard would it be to essentially turn off the optimizer? > > 1. Evaluate WHERE clauses left-to-right. > > select ... from FOO where A and B and C; > > This would just apply the criteria left-to-right, first A, then B, then C. > If an index was available it would use it, but only in left-to-right order, > i.e. if A had no index but B did, then too bad, you should have written "B > and A and C". > pg < 8.1 when you use multi-column indexes do exactly this... but i don't know why everyone wants this... > > 2. Evaluate joins left-to-right. > > select ... from FOO join BAR on (...) join BAZ on (...) where ... > > This would join FOO to BAR, then join the result to BAZ. The only > optimization would be to apply relevant "where" conditions to each join > before processing the next join. > using explicit INNER JOIN syntax and parenthesis > > 3. Don't flatten sub-selects > > select ... from (select ... from FOO where ...) as X where ...; > select ... from (select ... from FOO where ... offset 0) as X where ...; > This would do the inner select then use the result in the outer select, and > wouldn't attempt to flatten the query. > > Thanks, > Craig > what else? -- regards, Jaime Casanova (DBA: DataBase Aniquilator ;)
On 12/15/05, David Lang <dlang@invendra.net> wrote: > On Thu, 15 Dec 2005, Craig A. James wrote: > > > Mark Kirkwood wrote: > >> I hear what you are saying, but to use this fine example - I don't know > >> what the best plan is - these experiments part of an investigation to > find > >> *if* there is a better plan, and if so, why Postgres is not finding it. > >> > >>> There isn't a database in the world that is as smart as a developer, or > >>> that can have insight into things that only a developer can possibly > know. > >> > >> That is often true - but the aim is to get Postgres's optimizer closer to > >> developer smartness. > > > > What would be cool would be some way the developer could alter the plan, > but > > they way of doing so would strongly encourage the developer to send the > > information to this mailing list. Postgres would essentially say, "Ok, > you > > can do that, but we want to know why!" > > at the risk of sounding flippent (which is NOT what I intend) I will point > out that with the source you can change the optimizer any way you need to > :-) > > that being said, in your example the issue is the cost of the user created > function and the fact that postgres doesn't know it's cost. > > would a resonable answer be to give postgres a way to learn how expensive > the call is? > > a couple ways I could see to do this. > > 1. store some stats automagicly when the function is called and update the > optimization plan when you do an ANALYSE > > 2. provide a way for a user to explicitly set a cost factor for a function > (with a default value that's sane for fairly trivial functions so that it > would only have to be set for unuseually expensive functions) > > now, neither of these will work all the time if a given function is > sometimes cheap and sometimes expensive (depending on it's parameters), > but in that case I would say that if the application knows that a function > will be unusueally expensive under some conditions (and knows what those > conditions will be) it may be a reasonable answer to duplicate the > function, one copy that it uses most of the time, and a second copy that > it uses when it expects it to be expensive. at this point the cost of the > function can be set via either of the methods listed above) > > >> After years of using several other database products (some supporting > hint > >> type constructs and some not), I have come to believe that hinting (or > >> similar) actually *hinders* the development of a great optimizer. > > > > I agree. It takes the pressure off the optimizer gurus. If the users can > > just work around every problem, then the optimizer can suck and the system > is > > still usable. > > > > Lest anyone think I'm an all-out advocate of overriding the optimizer, I > know > > from first-hand experience what a catastrophe it can be. An Oracle hint I > > used worked fine on my test schema, but the customer's "table" turned out > to > > be a view, and Oracle's optimizer worked well on the view whereas my hint > was > > horrible. Unfortunately, without the hint, Oracle sucked when working on > an > > ordinary table. Hints are dangerous, and I consider them a last resort. > > I've been on the linux-kernel mailing list for the last 9 years, and have > seen a similar debate rage during that entire time about kernel memory > management. overall both of these tend to be conflicts between short-term > and long-term benifits. > > in the short-term the application user wants to be able to override the > system to get the best performance _now_ > > in the long run the system designers don't trust the application > programmers to get the hints right and want to figure out the right > optimizer plan, even if it takes a lot longer to do so. > > the key to this balance seems to be to work towards as few controls as > possible, becouse the user will get them wrong far more frequently then > they get them right, but when you hit a point where there's absolutly no > way for the system to figure things out (and it's a drastic difference) > provide the application with a way to hint to the system that things are > unusueal, but always keep looking for patterns that will let the system > detect the need itself > > even the existing defaults are wrong as frequently as they are right (they > were set when hardware was very different then it is today) so some way to > gather real-world stats and set the system defaults based on actual > hardware performance is really the right way to go (even for things like > sequential scan speed that are set in the config file today) > > David Lang > there was discussion on this and IIRC the consensus was that could be useful tu give some statistics to user defined functions... i don't if someone is working on this or even if it is doable... -- regards, Jaime Casanova (DBA: DataBase Aniquilator ;)
On Thu, 15 Dec 2005, Craig A. James wrote: > The example I raised in a previous thread, of irregular usage, is the same: I > have a particular query that I *always* want to be fast even if it's only > used rarely, but the system swaps its tables out of the file-system cache, > based on "low usage", even though the "high usage" queries are low priority. > How can Postgres know such things when there's no way for me to tell it? actually, postgres doesn't manage the file-system cache, it deliberatly leaves that up to the OS it is running on to do that job. one (extremely ugly) method that you could use would be to have a program that looks up what files are used to store your high priority tables and then write a trivial program to keep those files in memory (it may be as simple as mmaping the files and then going to sleep, or you may have to read various points through the file to keep them current in the cache, it WILL vary depending on your OS and filesystem in use) oracle goes to extremes with this sort of control, I'm actually mildly surprised that they still run on a host OS and haven't completely taken over the machine (I guess they don't want to have to write device drivers, that's about the only OS code they really want to use, they do their own memory management, filesystem, and user systems), by avoiding areas like this postgres sacrafices a bit of performance, but gains a much broader set of platforms (hardware and OS) that it can run on. and this by itself can result in significant wins (does oracle support Opteron CPU's in 64 bit mode yet? as of this summer it just wasn't an option) David Lang
Craig A. James wrote: > > > Christopher Kings-Lynne wrote: > >>>> select * from my_table where row_num >= 50000 and row_num < > >>>>100000 > >>>> and myfunc(foo, bar); > >>> > >>> > >>>You just create an index on myfunc(foo, bar) > >> > >> > >>only if myfunc(foo, bar) is immutable... > > > > > >And if it's not then the best any database can do is to index scan > >row_num - so still you have no problem. > > Boy, you picked a *really* bad example ;-) > > The problem is that Postgres decided to filter on myfunc() *first*, and > then filter on row_num, resulting in a query time that jumped from seconds > to hours. And there's no way for me to tell Postgres not to do that! Apologies in advance if all of this has been said, or if any of it is wrong. What kind of plan do you get if you eliminate the myfunc(foo, bar) from the query entirely? An index scan or a full table scan? If the latter then (assuming that the statistics are accurate) the reason you want inclusion of myfunc() to change the plan must be the expense of the function, not the expense of the scan (index versus sequential). While the expense of the function isn't, as far as I know, known or used by the planner, that obviously needn't be the case. On the other hand, if the inclusion of the function call changes the plan that is selected from an index scan to a sequential scan, then that, I think, is clearly a bug, since even a zero-cost function cannot make the sequential scan more efficient than an index scan which is already more efficient than the base sequential scan. > So, "you still have no problem" is exactly wrong, because Postgres picked > the wrong plan. Postgres decided that applying myfunc() to 10,000,000 > rows was a better plan than an index scan of 50,000 row_nums. So I'm > screwed. If PostgreSQL is indeed applying myfunc() to 10,000,000 rows, then that is a bug if the function is declared VOLATILE (which is the default if no volatility is specified), because it implies that it's applying the function to rows that don't match the selection condition. From your prior description, it sounds like your function is declared STABLE. For your specific situation, my opinion is that the proper modification to PostgreSQL would be to give it (if it isn't already there) the ability to include the cost of functions in the plan. The cost needn't be something that it automatically measures -- it could be specified at function creation time. -- Kevin Brown kevin@sysexperts.com
Craig A. James wrote: > Hints are dangerous, and I consider them a last resort. If you consider them a last resort, then why do you consider them to be a better alternative than a workaround such as turning off enable_seqscan, when all the other tradeoffs are considered? If your argument is that planner hints would give you finer grained control, then the question is whether you'd rather the developers spend their time implementing planner hints or improving the planner. I'd rather they did the latter, as long as workarounds are available when needed. A workaround will probably give the user greater incentive to report the problem than use of planner hints. -- Kevin Brown kevin@sysexperts.com
Kevin Brown wrote: >>Hints are dangerous, and I consider them a last resort. > > If you consider them a last resort, then why do you consider them to > be a better alternative than a workaround such as turning off > enable_seqscan, when all the other tradeoffs are considered? If I understand enable_seqscan, it's an all-or-nothing affair. Turning it off turns it off for the whole database, right? The same is true of all of the planner-tuning parameters in the postgres conf file. Since the optimizer does a goodjob most of the time, I'd hate to change a global setting like this -- what else would be affected? I could try this,but it would make me nervous to alter the whole system to fix one particular query. > If your argument is that planner hints would give you finer grained > control, then the question is whether you'd rather the developers > spend their time implementing planner hints or improving the planner. I agree 100% -- I'd much prefer a better planner. But when it comes down to a do-or-die situation, you need a hack, somesort of workaround, to get you working *today*. Regards, Craig
Craig A. James wrote: > Kevin Brown wrote: > >>Hints are dangerous, and I consider them a last resort. > > > >If you consider them a last resort, then why do you consider them to > >be a better alternative than a workaround such as turning off > >enable_seqscan, when all the other tradeoffs are considered? > > If I understand enable_seqscan, it's an all-or-nothing affair. Turning it > off turns it off for the whole database, right? The same is true of all > of the planner-tuning parameters in the postgres conf file. Nope. What's in the conf file are the defaults. You can change them on a per-connection basis, via the SET command. Thus, before doing your problematic query: SET enable_seqscan = off; and then, after your query is done, SET enable_seqscan = on; > >If your argument is that planner hints would give you finer grained > >control, then the question is whether you'd rather the developers > >spend their time implementing planner hints or improving the planner. > > I agree 100% -- I'd much prefer a better planner. But when it comes down > to a do-or-die situation, you need a hack, some sort of workaround, to get > you working *today*. And that's why I was asking about workarounds versus planner hints. I expect that the situations in which the planner gets things wrong *and* where there's no workaround are very rare indeed. -- Kevin Brown kevin@sysexperts.com
On Thu, Dec 15, 2005 at 21:41:06 -0800, "Craig A. James" <cjames@modgraph-usa.com> wrote: > > If I understand enable_seqscan, it's an all-or-nothing affair. Turning it > off turns it off for the whole database, right? The same is true of all of You can turn it off just for specific queries. However, it will apply to all joins within a query.
Kevin Brown wrote: >Craig A. James wrote: > > >>Hints are dangerous, and I consider them a last resort. >> >> > >If you consider them a last resort, then why do you consider them to >be a better alternative than a workaround such as turning off >enable_seqscan, when all the other tradeoffs are considered? > > I would like a bit finer degree of control on this - I'd like to be able to tell PG that for my needs, it is never OK to scan an entire table of more than N rows. I'd typically set N to 1,000,000 or so. What I would really like is for my DBMS to give me a little more pushback - I'd like to ask it to run a query, and have it either find a "good" way to run the query, or politely refuse to run it at all. Yes, I know that is an unusual request :-) The context is this - in a busy OLTP system, sometimes a query comes through that, for whatever reason (foolishness on my part as a developer, unexpected use by a user, imperfection of the optimizer, etc.), takes a really long time to run, usually because it table-scans one or more large tables. If several of these happen at once, it can grind an important production system effectively to a halt. I'd like to have a few users/operations get a "sorry, I couldn't find a good way to do that" message, rather than all the users find that their system has effectively stopped working. Kyle Cordes www.kylecordes.com
On 12/16/05, Kyle Cordes <kyle@kylecordes.com> wrote: > Kevin Brown wrote: > > >Craig A. James wrote: > > > > > >>Hints are dangerous, and I consider them a last resort. > >> > >> > > > >If you consider them a last resort, then why do you consider them to > >be a better alternative than a workaround such as turning off > >enable_seqscan, when all the other tradeoffs are considered? > > > > > > I would like a bit finer degree of control on this - I'd like to be able > to tell PG that for my needs, it is never OK to scan an entire table of > more than N rows. I'd typically set N to 1,000,000 or so. What I would > really like is for my DBMS to give me a little more pushback - I'd like > to ask it to run a query, and have it either find a "good" way to run > the query, or politely refuse to run it at all. > > Yes, I know that is an unusual request :-) > > The context is this - in a busy OLTP system, sometimes a query comes > through that, for whatever reason (foolishness on my part as a > developer, unexpected use by a user, imperfection of the optimizer, > etc.), takes a really long time to run, usually because it table-scans > one or more large tables. If several of these happen at once, it can > grind an important production system effectively to a halt. I'd like to > have a few users/operations get a "sorry, I couldn't find a good way to > do that" message, rather than all the users find that their system has > effectively stopped working. > > Kyle Cordes > www.kylecordes.com > > set statement_timeout in postgresql.conf -- Atentamente, Jaime Casanova (DBA: DataBase Aniquilator ;)
Dnia 16-12-2005, pią o godzinie 16:16 +1300, Mark Kirkwood napisał(a): > Craig A. James wrote: > > > > > What would be cool would be some way the developer could alter the plan, > > but they way of doing so would strongly encourage the developer to send > > the information to this mailing list. Postgres would essentially say, > > "Ok, you can do that, but we want to know why!" > > > > Yeah it would - an implementation I have seen that I like is where the > developer can supply the *entire* execution plan with a query. This is > complex enough to make casual use unlikely :-), but provides the ability > to try out other plans, and also fix that vital query that must run > today..... I think you could use SPI for that. There is function SPI_prepare, which prepares plan, and SPI_execute_plan, executing it. These functions are defined in src/backend/executor/spi.c. I think (someone please correct me if I'm wrong) you could prepare plan yourself, instead of taking it from SPI_prepare, and give it to SPI_execute_plan. SPI_prepare calls _SPI_prepare_plan, which parses query and calls pg_analyze_and_rewrite. In your version don't call this function, but provide PostgreSQL with your own plan (not-optimised according to PostrgeSQL, but meeting your criteria). -- Tomasz Rybak <bogomips@post.pl>
Jaime Casanova wrote: >>The context is this - in a busy OLTP system, sometimes a query comes >>through that, for whatever reason (foolishness on my part as a >>developer, unexpected use by a user, imperfection of the optimizer, >>etc.), takes a really long time to run, usually because it table-scans >>one or more large tables. If several of these happen at once, it can >>grind an important production system effectively to a halt. I'd like to >>have a few users/operations get a "sorry, I couldn't find a good way to >>do that" message, rather than all the users find that their system has >>effectively stopped working. > ... > set statement_timeout in postgresql.conf I found it's better to use "set statement_timeout" in the code, rather than setting it globally. Someone else pointed outto me that setting it in postgresql.conf makes it apply to ALL transactions, including VACUUM, ANALYZE and so forth. I put it in my code just around the queries that are "user generated" -- queries that are from users' input. I expect anySQL that I write to finish in a reasonable time ;-). Craig
On Thu, 2005-12-15 at 18:23 -0800, Craig A. James wrote:
> So, "you still have no problem" is exactly wrong, because Postgres picked the wrong plan.  Postgres decided that
applyingmyfunc() to 10,000,000 rows was a better plan than an index scan of 50,000 row_nums.  So I'm screwed. 
FWIW,
The cost_functionscan procedure in costsize.c has the following comment:
        /*
         * For now, estimate function's cost at one operator eval per
function
         * call.  Someday we should revive the function cost estimate
columns in         * pg_proc...
         */
I recognize that you're trying to talk about the issue in general rather
than about this particular example.  However, the example does seem to
me to be exactly the case where the effort might be better spent
improving the optimizer (reviving the function cost estimate columns),
rather than implementing a general hinting facility.  Which one is more
effort?  I don't really know for sure, but cost_functionscan does seem
pretty straightforward.
What percentage of problems raised on this list can be fixed by setting
configuration parameters, adding indexes, increasing statistics, or
re-architecting a crazy schema?  I've only been lurking for a few
months, but it seems like a pretty large fraction.  Of the remainder,
what percentage represent actual useful feedback about what needs
improvement in the optimizer?  A pretty large fraction, I think.
Including your example.
Personally, I think whoever was arguing for selectivity hints in
-hackers recently made a pretty good point, so I'm partly on your side.
Actually, function cost "hints" don't really seem that much different
from selectivity hints, and both seem to me to be slicker solutions
(closer to the right level of abstraction) than a general hint facility.
Mitch
			
		David Lang wrote: > On Fri, 16 Dec 2005, Mark Kirkwood wrote: > >> >> Right on. Some of these "coerced" plans may perform much better. If >> so, we can look at tweaking your runtime config: e.g. >> >> effective_cache_size >> random_page_cost >> default_statistics_target >> >> to see if said plans can be chosen "naturally". > > > Mark, I've seen these config options listed as tweaking targets fairly > frequently, has anyone put any thought or effort into creating a test > program that could analyse the actual system and set the defaults based > on the measured performance? > I am sure this has been discussed before, I found this thread - http://archives.postgresql.org/pgsql-performance/2004-07/msg00189.php but I seem to recall others (but offhand can't find any of them). I think that the real difficultly here is that the construction of the test program is non trivial - for instance, the best test program for tuning *my* workload is my application with its collection of data, but it is probably not a good test program for *anyone else's* workload. cheers Mark
On Thu, Dec 15, 2005 at 09:48:55PM -0800, Kevin Brown wrote: > Craig A. James wrote: > > Kevin Brown wrote: > > >>Hints are dangerous, and I consider them a last resort. > > > > > >If you consider them a last resort, then why do you consider them to > > >be a better alternative than a workaround such as turning off > > >enable_seqscan, when all the other tradeoffs are considered? > > > > If I understand enable_seqscan, it's an all-or-nothing affair. Turning it > > off turns it off for the whole database, right? The same is true of all > > of the planner-tuning parameters in the postgres conf file. > > Nope. What's in the conf file are the defaults. You can change them > on a per-connection basis, via the SET command. Thus, before doing > your problematic query: > > SET enable_seqscan = off; > > and then, after your query is done, > > SET enable_seqscan = on; You can also turn it off inside a transaction and have it only affect that transaction so that you can't accidentally forget to turn it back on (which could seriously hose things up if you're doing this in conjunction with a connection pool). -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Fri, Dec 16, 2005 at 03:31:03PM +1300, Mark Kirkwood wrote: > After years of using several other database products (some supporting > hint type constructs and some not), I have come to believe that hinting > (or similar) actually *hinders* the development of a great optimizer. I don't think you can assume that would hold true for an open-source database. Unlike a commercial database, it's trivially easy to notify developers about a bad query plan. With a commercial database you'd have to open a support ticket and hope they actually use that info to improve the planner. Here you need just send an email to this list and the developers will at least see it, and will usually try and fix the issue. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Fri, Dec 16, 2005 at 04:16:58PM +1300, Mark Kirkwood wrote: > Craig A. James wrote: > > > > >What would be cool would be some way the developer could alter the plan, > >but they way of doing so would strongly encourage the developer to send > >the information to this mailing list. Postgres would essentially say, > >"Ok, you can do that, but we want to know why!" > > > > Yeah it would - an implementation I have seen that I like is where the > developer can supply the *entire* execution plan with a query. This is > complex enough to make casual use unlikely :-), but provides the ability > to try out other plans, and also fix that vital query that must run > today..... Being able to specify an exact plan would also provide for query plan stability; something that is critically important in certain applications. If you have to meet a specific response time requirement for a query, you can't afford to have the optimizer suddenly decide that some other plan might be faster when in fact it's much slower. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Jaime Casanova wrote: >>What I would >>really like is for my DBMS to give me a little more pushback - I'd like >>to ask it to run a query, and have it either find a "good" way to run >>the query, or politely refuse to run it at all. >> >> >set statement_timeout in postgresql.conf > > That is what I am doing now, and it is much better than nothing. But it's not really sufficient, in that it is still quite possible for users repeatedly trying an operation that unexpectedly causes excessive DB usage, to load down the system to the point of failure. In other words, I'd ideally like it to give up right away, not after N seconds of table scanning my 100-million-row tables... and not with a timeout, but with an explicit throwing up of its hands, exasperated, that it could not come up with an efficient way to run my query. Kyle Cordes www.kylecordes.com
"Craig A. James" <cjames@modgraph-usa.com> writes:
> How about this: Instead of arguing in the abstract, tell me in
> concrete terms how you would address the very specific example I gave,
> where myfunc() is a user-written function.  To make it a little more
> challenging, try this: myfunc() can behave very differently depending
> on the parameters, and sometimes (but not always), the application
> knows how it will behave and could suggest a good execution plan.
A word to the wise:
regression=# explain select * from tenk1 where ten > 5 and ten < 9
regression-# and myfunc(unique1,unique2);
                            QUERY PLAN
------------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..533.00 rows=982 width=244)
   Filter: ((ten > 5) AND (ten < 9) AND myfunc(unique1, unique2))
(2 rows)
regression=# explain select * from tenk1 where myfunc(unique1,unique2)
regression-# and ten > 5 and ten < 9;
                            QUERY PLAN
------------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..533.00 rows=982 width=244)
   Filter: (myfunc(unique1, unique2) AND (ten > 5) AND (ten < 9))
(2 rows)
I might have taken your original complaint more seriously if it
weren't so blatantly bogus.  Your query as written absolutely
would not have evaluated myfunc() first, because there was no
reason for the planner to reorder the WHERE list.
            regards, tom lane
			
		On Fri, 16 Dec 2005, Mark Kirkwood wrote: > Craig A. James wrote: > >> >> What would be cool would be some way the developer could alter the plan, >> but they way of doing so would strongly encourage the developer to send the >> information to this mailing list. Postgres would essentially say, "Ok, you >> can do that, but we want to know why!" >> > > Yeah it would - an implementation I have seen that I like is where the > developer can supply the *entire* execution plan with a query. This is > complex enough to make casual use unlikely :-), but provides the ability to > try out other plans, and also fix that vital query that must run today..... hmm, I wonder if this option would have uses beyond the production hacks that are being discussed. specificly developers working on the optimizer (or related things like clustered databases) could use the same hooks to develop and modify the 'optimizer' externally to postgres (doing an explain would let them find the costs that postgres thinks each option has, along with it's reccomendation, but the developer could try different execution plans without having to recompile postgres between runs. and for clustered databases where the data is split between machines this would be a hook that the cluster engine could use to put it's own plan into place without having to modify and recompile) David Lang
> > Yeah it would - an implementation I have seen that I like is where the > > developer can supply the *entire* execution plan with a query. This is > > complex enough to make casual use unlikely :-), but provides the ability > > to try out other plans, and also fix that vital query that must run > > today..... > > Being able to specify an exact plan would also provide for query plan > stability; something that is critically important in certain > applications. If you have to meet a specific response time requirement > for a query, you can't afford to have the optimizer suddenly decide that > some other plan might be faster when in fact it's much slower. Plan stability doesn't mean time response stability... The plan that today is almost instantaneous tomorrow can take hours... -- regards, Jaime Casanova (DBA: DataBase Aniquilator ;)
Tom Lane wrote:
> Kevin Brown <blargity@gmail.com> writes:
> > I'm running 8.1 installed from source on a Debian Sarge server.  I have a
> > simple query that I believe I've placed the indexes correctly for, and I
> > still end up with a seq scan.  It makes sense, kinda, but it should be able
> > to use the index to gather the right values.
>
> I continue to marvel at how many people think that if it's not using an
> index it must ipso facto be a bad plan ...
>
> That plan looks perfectly fine to me.  You could try forcing some other
> choices by fooling with the planner enable switches (eg set
> enable_seqscan = off) but I doubt you'll find much improvement.  There
> are too many rows being pulled from ordered_products to make an index
> nestloop a good idea.
We do have an FAQ item:
     4.6) Why are my queries slow?  Why don't they use my indexes?
--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
			
		On Sat, Dec 17, 2005 at 07:31:40AM -0500, Jaime Casanova wrote: > > > Yeah it would - an implementation I have seen that I like is where the > > > developer can supply the *entire* execution plan with a query. This is > > > complex enough to make casual use unlikely :-), but provides the ability > > > to try out other plans, and also fix that vital query that must run > > > today..... > > > > Being able to specify an exact plan would also provide for query plan > > stability; something that is critically important in certain > > applications. If you have to meet a specific response time requirement > > for a query, you can't afford to have the optimizer suddenly decide that > > some other plan might be faster when in fact it's much slower. > > Plan stability doesn't mean time response stability... > The plan that today is almost instantaneous tomorrow can take hours... Sure, if your underlying data changes that much, but that's often not going to happen in production systems (especially OLTP where this is most important). Of course if you have a proposal for ensuring that a query always finishes in X amount of time, rather than always using the same plan, I'd love to hear it. ;) -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Wed, 2005-12-14 at 17:47 -0500, Tom Lane wrote:
> That plan looks perfectly fine to me.  You could try forcing some other
> choices by fooling with the planner enable switches (eg set
> enable_seqscan = off) but I doubt you'll find much improvement.  There
> are too many rows being pulled from ordered_products to make an index
> nestloop a good idea.
Well, I'm no expert either, but if there was an index on
ordered_products (paid, suspended_sub, id) it should be mergejoinable
with the index on to_ship.ordered_product_id, right?  Given the
conditions on paid and suspended_sub.
If you (Kevin) try adding such an index, ideally it would get used given
that you're only pulling out a small fraction of the rows in to_ship.
If it doesn't get used, then I had a similar issue with 8.0.3 where an
index that was mergejoinable (only because of the restrictions in the
where clause) wasn't getting picked up.
Mitch
Kevin Brown wrote:
> CREATE TABLE to_ship
> (
>   id int8 NOT NULL DEFAULT nextval(('to_ship_seq'::text)::regclass),
>   ordered_product_id int8 NOT NULL,
>   bounced int4 NOT NULL DEFAULT 0,
>   operator_id varchar(20) NOT NULL,
>   "timestamp" timestamptz NOT NULL DEFAULT ('now'::text)::timestamp(6)
> with
> time zone,
>   CONSTRAINT to_ship_pkey PRIMARY KEY (id),
>   CONSTRAINT to_ship_ordered_product_id_fkey FOREIGN KEY
> (ordered_product_id)
> REFERENCES ordered_products (id) ON UPDATE RESTRICT ON DELETE RESTRICT
> )
> WITHOUT OIDS;
>
> CREATE TABLE ordered_products
> (
>   id int8 NOT NULL DEFAULT
> nextval(('ordered_products_seq'::text)::regclass),
>   order_id int8 NOT NULL,
>   product_id int8 NOT NULL,
>   recipient_address_id int8 NOT NULL,
>   hide bool NOT NULL DEFAULT false,
>   renewal bool NOT NULL DEFAULT false,
>   "timestamp" timestamptz NOT NULL DEFAULT ('now'::text)::timestamp(6)
> with
> time zone,
>   operator_id varchar(20) NOT NULL,
>   suspended_sub bool NOT NULL DEFAULT false,
>   quantity int4 NOT NULL DEFAULT 1,
>   price_paid numeric NOT NULL,
>   tax_paid numeric NOT NULL DEFAULT 0,
>   shipping_paid numeric NOT NULL DEFAULT 0,
>   remaining_issue_obligation int4 NOT NULL DEFAULT 0,
>   parent_product_id int8,
>   delivery_method_id int8 NOT NULL,
>   paid bool NOT NULL DEFAULT false,
>   CONSTRAINT ordered_products_pkey PRIMARY KEY (id),
>   CONSTRAINT ordered_products_order_id_fkey FOREIGN KEY (order_id)
> REFERENCES
> orders (id) ON UPDATE RESTRICT ON DELETE RESTRICT,
>   CONSTRAINT ordered_products_parent_product_id_fkey FOREIGN KEY
> (parent_product_id) REFERENCES ordered_products (id) ON UPDATE
> RESTRICT ON
> DELETE RESTRICT,
>   CONSTRAINT ordered_products_recipient_address_id_fkey FOREIGN KEY
> (recipient_address_id) REFERENCES addresses (id) ON UPDATE RESTRICT ON
> DELETE
> RESTRICT
> )
> WITHOUT OIDS;
>
> === The two indexes that should matter ===
> CREATE INDEX ordered_product_id_index
>   ON to_ship
>   USING btree
>   (ordered_product_id);
>
> CREATE INDEX paid_index
>   ON ordered_products
>   USING btree
>   (paid);
>
> ordered_products.id is a primary key, so it should have an implicit
> index.
			
		Mark Kirkwood wrote: > Kevin Brown wrote: > >> I'll just start by warning that I'm new-ish to postgresql. >> >> I'm running 8.1 installed from source on a Debian Sarge server. I >> have a simple query that I believe I've placed the indexes correctly >> for, and I still end up with a seq scan. It makes sense, kinda, but >> it should be able to use the index to gather the right values. I do >> have a production set of data inserted into the tables, so this is >> running realistically: >> >> dli=# explain analyze SELECT ordered_products.product_id >> dli-# FROM to_ship, ordered_products >> dli-# WHERE to_ship.ordered_product_id = ordered_products.id AND >> dli-# ordered_products.paid = TRUE AND >> dli-# ordered_products.suspended_sub = FALSE; > > > You scan 600000 rows from to_ship to get about 25000 - so some way to > cut this down would help. > > Try out an explicit INNER JOIN which includes the filter info for paid > and suspended_sub in the join condition (you may need indexes on each of > id, paid and suspended_sub, so that the 8.1 optimizer can use a bitmap > scan): > > > SELECT ordered_products.product_id > FROM to_ship INNER JOIN ordered_products > ON (to_ship.ordered_product_id = ordered_products.id > AND ordered_products.paid = TRUE AND > ordered_products.suspended_sub = FALSE); It has been a quiet day today, so I took another look at this. If the selectivity of clauses : paid = TRUE suspended_sub = FALSE is fairly high, then rewriting as a subquery might help: SELECT o.product_id FROM ordered_products o WHERE o.paid = TRUE AND o.suspended_sub = FALSE AND EXISTS ( SELECT 1 FROM to_ship s WHERE s.ordered_product_id = o.id ); However it depends on you not needing anything from to_ship in the SELECT list... Cheers Mark