Re: Can someone explain the problem with this select
От | Ted Allen |
---|---|
Тема | Re: Can someone explain the problem with this select |
Дата | |
Msg-id | 4575E11F.8000803@blackducksoftware.com обсуждение исходный текст |
Ответ на | Can someone explain the problem with this select (Richard Ray <rray@mstc.state.ms.us>) |
Список | pgsql-sql |
Hey Ray, I'm by no means a guru but here is my simple analysis. In the first query, the 10 "documents" specified 'IN' the in are first selected from the "documents" table. Then, those 10 rows are joined with the "comments" table. In the second query, every row in the "documents" table is joined with the "comments" table, which took forever according to the Explain Analyse. Then, the results of the complete join of those two tables from are compared against the sub-select. I'm guessing (emphasis on guessing) that the query planner chose this approach because it does not know how many rows the sub-select will return so it does the join and then checks those results against that sub-select. Another approach may be to do this. Create Temp table limit_documents as select doc_num from documents limit 10; select * from limit_documents left outer join comments on (limit_documents.doc_num = comments.doc_num); Hope that helps, Ted Richard Ray wrote: > Allow me to demonstrate my pitiful SQL knowledge > I have tables documents and comments > If I run join and list doc_nums the query is quite fast > If I run join and use subselect the query is extremely slow > Can someone offer analysis > > Thanks > Richard > > dcc=# EXPLAIN ANALYZE select doc_num from documents limit 10; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------- > > Limit (cost=0.00..1.01 rows=10 width=13) (actual time=0.013..0.061 > rows=10 loops=1) > -> Seq Scan on documents (cost=0.00..909333.85 rows=9014885 > width=13) (actual time=0.008..0.027 rows=10 loops=1) > Total runtime: 0.125 ms > (3 rows) > > dcc=# > > > > dcc=# EXPLAIN ANALYZE select * from documents left outer join > comments on (documents.doc_num = comments.doc_num) where > documents.doc_num in > ('105364107','105513059','105513095','105513112','105585627','102933195','014650340','014650361','014650362','105419865'); > > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > > Hash Left Join (cost=21.23..61.54 rows=10 width=444) (actual > time=0.507..0.574 rows=10 loops=1) > Hash Cond: ("outer".doc_num = "inner".doc_num) > -> Bitmap Heap Scan on documents (cost=20.03..60.28 rows=10 > width=361) (actual time=0.397..0.432 rows=10 loops=1) > Recheck Cond: ((doc_num = '105364107'::bpchar) OR (doc_num = > '105513059'::bpchar) OR (doc_num = '105513095'::bpchar) OR (doc_num = > '105513112'::bpchar) OR (doc_num = '105585627'::bpchar) OR (doc_num = > '102933195'::bpchar) OR (doc_num = '014650340'::bpchar) OR (doc_num = > '014650361'::bpchar) OR (doc_num = '014650362'::bpchar) OR (doc_num = > '105419865'::bpchar)) > -> BitmapOr (cost=20.03..20.03 rows=10 width=0) (actual > time=0.383..0.383 rows=0 loops=1) > -> Bitmap Index Scan on documents_pkey > (cost=0.00..2.00 rows=1 width=0) (actual time=0.059..0.059 rows=1 > loops=1) > Index Cond: (doc_num = '105364107'::bpchar) > -> Bitmap Index Scan on documents_pkey > (cost=0.00..2.00 rows=1 width=0) (actual time=0.039..0.039 rows=1 > loops=1) > Index Cond: (doc_num = '105513059'::bpchar) > -> Bitmap Index Scan on documents_pkey > (cost=0.00..2.00 rows=1 width=0) (actual time=0.034..0.034 rows=1 > loops=1) > Index Cond: (doc_num = '105513095'::bpchar) > -> Bitmap Index Scan on documents_pkey > (cost=0.00..2.00 rows=1 width=0) (actual time=0.031..0.031 rows=1 > loops=1) > Index Cond: (doc_num = '105513112'::bpchar) > -> Bitmap Index Scan on documents_pkey > (cost=0.00..2.00 rows=1 width=0) (actual time=0.034..0.034 rows=1 > loops=1) > Index Cond: (doc_num = '105585627'::bpchar) > -> Bitmap Index Scan on documents_pkey > (cost=0.00..2.00 rows=1 width=0) (actual time=0.036..0.036 rows=1 > loops=1) > Index Cond: (doc_num = '102933195'::bpchar) > -> Bitmap Index Scan on documents_pkey > (cost=0.00..2.00 rows=1 width=0) (actual time=0.036..0.036 rows=1 > loops=1) > Index Cond: (doc_num = '014650340'::bpchar) > -> Bitmap Index Scan on documents_pkey > (cost=0.00..2.00 rows=1 width=0) (actual time=0.031..0.031 rows=1 > loops=1) > Index Cond: (doc_num = '014650361'::bpchar) > -> Bitmap Index Scan on documents_pkey > (cost=0.00..2.00 rows=1 width=0) (actual time=0.031..0.031 rows=1 > loops=1) > Index Cond: (doc_num = '014650362'::bpchar) > -> Bitmap Index Scan on documents_pkey > (cost=0.00..2.00 rows=1 width=0) (actual time=0.035..0.035 rows=1 > loops=1) > Index Cond: (doc_num = '105419865'::bpchar) > -> Hash (cost=1.16..1.16 rows=16 width=83) (actual > time=0.080..0.080 rows=16 loops=1) > -> Seq Scan on comments (cost=0.00..1.16 rows=16 width=83) > (actual time=0.005..0.037 rows=16 loops=1) > Total runtime: 0.775 ms > (28 rows) > > dcc=# > > > > dcc=# EXPLAIN ANALYZE select * from documents left outer join comments > on (documents.doc_num = comments.doc_num) where documents.doc_num in > (select doc_num from documents limit 10); > > QUERY PLAN > > -------------------------------------------------------------------------------- > > -------------------------------------------------------------------------------- > > -- > Merge IN Join (cost=100000002.19..136154797.93 rows=10 width=654) > (actual time > =23.534..2216180.550 rows=10 loops=1) > Merge Cond: ("outer".doc_num = "inner".doc_num) > -> Merge Left Join (cost=0.00..36129585.92 rows=10083868 > width=654) (actual > time=23.239..2188733.430 rows=6696218 loops=1) > Merge Cond: ("outer".doc_num = "inner".doc_num) > -> Index Scan using documents_pkey on documents > (cost=0.00..35723277. > 60 rows=10083868 width=569) (actual time=6.845..2107300.767 > rows=6695853 loops=1 > ) > -> Index Scan using doc_num_idx on comments > (cost=0.00..377203.50 row > s=311612 width=85) (actual time=16.368..6984.365 rows=243797 loops=1) > -> Sort (cost=100000002.19..100000002.22 rows=10 width=13) > (actual time=0.2 > 52..0.293 rows=10 loops=1) > Sort Key: "IN_subquery".doc_num > -> Limit (cost=100000000.00..100000001.92 rows=10 width=13) > (actual t > ime=0.019..0.128 rows=10 loops=1) > -> Seq Scan on documents > (cost=100000000.00..101940460.68 rows= > 10083868 width=13) (actual time=0.010..0.055 rows=10 loops=1) > Total runtime: 2216180.973 ms > (11 rows) > > dcc=# > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match > -- *Edward Allen* Software Engineer Black Duck Software, Inc. tallen@blackducksoftware.com <mailto:tallen@blackducksoftware.com> T +1.781.891.5100 x133 F +1.781.891.5145 http://www.blackducksoftware.com
В списке pgsql-sql по дате отправления: