Обсуждение: Which is better, correlated subqueries or joins?
Looking for some general advice on correlated subqueries vs. joins. Which of these plans is likely to perform better. One table is a master record table for entities and their IDs (nv_products), the other represents a transitive closure of parent/child relationships (for a tree) of ID's in the master record table (and so is larger) (ssv_product_children). The query is, in english: for direct children of an ID, return the ones for which isrel is true. I have only a tiny demo table set for which there is only one record matched by the queries below, it's hard to guess at how deep or branchy a production table might be, so I'm trying to develop a general query strategy and learn a thing or two about pgsql. Here's the join: # explain select child_pid from ssv_product_children, nv_products where nv_products.id = ssv_product_children.child_pid and ssv_product_children.pid = 1 and nv_products.isrel = 't'; QUERY PLAN -------------------------------------------------------------------------- Hash Join (cost=1.22..2.47 rows=2 width=8) Hash Cond: ("outer".child_pid = "inner".id) -> Seq Scan on ssv_product_children (cost=0.00..1.18 rows=9 width=4) Filter: (pid = 1) -> Hash (cost=1.21..1.21 rows=4 width=4) -> Seq Scan on nv_products (cost=0.00..1.21 rows=4 width=4) Filter: (isrel = true) (7 rows) Here's the correlated subquery: # explain select child_pid from ssv_product_children where pid = 1 and child_pid = (select nv_products.id from nv_products where nv_products.id = child_pid and isrel = 't'); QUERY PLAN --------------------------------------------------------------------- Seq Scan on ssv_product_children (cost=0.00..18.78 rows=1 width=4) Filter: ((pid = 1) AND (child_pid = (subplan))) SubPlan -> Seq Scan on nv_products (cost=0.00..1.26 rows=1 width=4) Filter: ((id = $0) AND (isrel = true)) (5 rows) Thanks for any advice.
Hello, It always depends on the dataset but you should try an explain analyze on each query. It will tell you which one is more efficient for your particular data. Sincerely, Joshua D. Drake > > Here's the join: > > # explain select child_pid from ssv_product_children, nv_products where > nv_products.id = ssv_product_children.child_pid and > ssv_product_children.pid = 1 and nv_products.isrel = 't'; > QUERY PLAN > -------------------------------------------------------------------------- > Hash Join (cost=1.22..2.47 rows=2 width=8) > Hash Cond: ("outer".child_pid = "inner".id) > -> Seq Scan on ssv_product_children (cost=0.00..1.18 rows=9 width=4) > Filter: (pid = 1) > -> Hash (cost=1.21..1.21 rows=4 width=4) > -> Seq Scan on nv_products (cost=0.00..1.21 rows=4 width=4) > Filter: (isrel = true) > (7 rows) > > > Here's the correlated subquery: > > > # explain select child_pid from ssv_product_children where pid = 1 and > child_pid = (select nv_products.id from nv_products where nv_products.id > = child_pid and isrel = 't'); > QUERY PLAN > --------------------------------------------------------------------- > Seq Scan on ssv_product_children (cost=0.00..18.78 rows=1 width=4) > Filter: ((pid = 1) AND (child_pid = (subplan))) > SubPlan > -> Seq Scan on nv_products (cost=0.00..1.26 rows=1 width=4) > Filter: ((id = $0) AND (isrel = true)) > (5 rows) > > > Thanks for any advice. > > ---------------------------(end of broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/