Обсуждение: Which is better, correlated subqueries or joins?

Поиск
Список
Период
Сортировка

Which is better, correlated subqueries or joins?

От
Jeffrey Tenny
Дата:
Looking for some general advice on correlated subqueries vs. joins.

Which of these plans is likely to perform better.  One table is a master
record table for entities and their IDs (nv_products), the other
represents a transitive closure of parent/child relationships (for a
tree) of ID's in the master record table (and so is larger)
(ssv_product_children).

The query is, in english: for direct children of an ID, return the ones
for which isrel is true.

I have only a tiny demo table set for which there is only one record
matched by the queries below, it's hard to guess at how deep or branchy
a production table might be, so I'm trying to develop a general query
strategy and learn a thing or two about pgsql.


Here's the join:

# explain select child_pid from ssv_product_children, nv_products where
nv_products.id = ssv_product_children.child_pid and
ssv_product_children.pid = 1 and nv_products.isrel = 't';
                                 QUERY PLAN
--------------------------------------------------------------------------
  Hash Join  (cost=1.22..2.47 rows=2 width=8)
    Hash Cond: ("outer".child_pid = "inner".id)
    ->  Seq Scan on ssv_product_children  (cost=0.00..1.18 rows=9 width=4)
          Filter: (pid = 1)
    ->  Hash  (cost=1.21..1.21 rows=4 width=4)
          ->  Seq Scan on nv_products  (cost=0.00..1.21 rows=4 width=4)
                Filter: (isrel = true)
(7 rows)


Here's the correlated subquery:


# explain select child_pid from ssv_product_children where pid = 1 and
child_pid = (select nv_products.id from nv_products where nv_products.id
= child_pid and isrel = 't');
                              QUERY PLAN
---------------------------------------------------------------------
  Seq Scan on ssv_product_children  (cost=0.00..18.78 rows=1 width=4)
    Filter: ((pid = 1) AND (child_pid = (subplan)))
    SubPlan
      ->  Seq Scan on nv_products  (cost=0.00..1.26 rows=1 width=4)
            Filter: ((id = $0) AND (isrel = true))
(5 rows)


Thanks for any advice.

Re: Which is better, correlated subqueries or joins?

От
"Joshua D. Drake"
Дата:
Hello,

It always depends on the dataset but you should try an explain analyze
on each query. It will tell you which one is more efficient for your
particular data.

Sincerely,

Joshua D. Drake


>
> Here's the join:
>
> # explain select child_pid from ssv_product_children, nv_products where
> nv_products.id = ssv_product_children.child_pid and
> ssv_product_children.pid = 1 and nv_products.isrel = 't';
>                                 QUERY PLAN
> --------------------------------------------------------------------------
>  Hash Join  (cost=1.22..2.47 rows=2 width=8)
>    Hash Cond: ("outer".child_pid = "inner".id)
>    ->  Seq Scan on ssv_product_children  (cost=0.00..1.18 rows=9 width=4)
>          Filter: (pid = 1)
>    ->  Hash  (cost=1.21..1.21 rows=4 width=4)
>          ->  Seq Scan on nv_products  (cost=0.00..1.21 rows=4 width=4)
>                Filter: (isrel = true)
> (7 rows)
>
>
> Here's the correlated subquery:
>
>
> # explain select child_pid from ssv_product_children where pid = 1 and
> child_pid = (select nv_products.id from nv_products where nv_products.id
> = child_pid and isrel = 't');
>                              QUERY PLAN
> ---------------------------------------------------------------------
>  Seq Scan on ssv_product_children  (cost=0.00..18.78 rows=1 width=4)
>    Filter: ((pid = 1) AND (child_pid = (subplan)))
>    SubPlan
>      ->  Seq Scan on nv_products  (cost=0.00..1.26 rows=1 width=4)
>            Filter: ((id = $0) AND (isrel = true))
> (5 rows)
>
>
> Thanks for any advice.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>      joining column's datatypes do not match


--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/