Обсуждение: query optimization
Dear list, We have a database schema, which looks the same as the attached script. When filling the tables with data, and skipping analyze on the table (so pg_stats contains no records for table 'a'), the first select in the script runs fast, but after an analyze the planner decides to sequence scan tables b and c, thus making the query much slower. Can somebody help me solving this issue, or tuning our installation to not to use sequence scans in this case? Thanks in advance, Kojedzinszky Richard Euronet Magyarorszag Informatikai Zrt.
Вложения
Richard Kojedzinszky <krichy@tvnetwork.hu> wrote: > tuning our installation to not to use sequence scans in this case? Make sure effective_cache_size is set to the sum of shared_buffers and whatever your OS shows as usable for caching. Try adjusting cost factors: maybe random_page_cost between 1 and 2, and cpu_tuple_cost between 0.03 and 0.05. -Kevin
Richard Kojedzinszky <krichy@tvnetwork.hu> writes: > Dear list, > We have a database schema, which looks the same as the attached script. > When filling the tables with data, and skipping analyze on the table (so > pg_stats contains no records for table 'a'), the first select in the > script runs fast, but after an analyze the planner decides to sequence > scan tables b and c, thus making the query much slower. Can somebody help > me solving this issue, or tuning our installation to not to use sequence > scans in this case? Um ... did you analyze all the tables, or just some of them? I get sub-millisecond runtimes if all four tables have been analyzed, but it does seem to pick lousy plans if, say, only a and b have been analyzed. What you really need for this query structure is the parameterized-path work I've been doing for 9.2; but at least on the exact example given, I'm not seeing that 9.1 is that much worse. regards, tom lane
Tom Lane wrote on 26.04.2012 21:17: > Richard Kojedzinszky<krichy@tvnetwork.hu> writes: >> Dear list, >> We have a database schema, which looks the same as the attached script. > >> When filling the tables with data, and skipping analyze on the table (so >> pg_stats contains no records for table 'a'), the first select in the >> script runs fast, but after an analyze the planner decides to sequence >> scan tables b and c, thus making the query much slower. Can somebody help >> me solving this issue, or tuning our installation to not to use sequence >> scans in this case? > > Um ... did you analyze all the tables, or just some of them? I get > sub-millisecond runtimes if all four tables have been analyzed, but it > does seem to pick lousy plans if, say, only a and b have been analyzed. > Here it's similar to Richard's experience: Before analyzing the four tables, the first statement yields this plan: Merge Left Join (cost=504.89..2634509.91 rows=125000000 width=16) (actual time=0.103..0.108 rows=1 loops=1) Merge Cond: (a.b = b.id) -> Sort (cost=504.89..506.14 rows=500 width=8) (actual time=0.043..0.043 rows=1 loops=1) Sort Key: a.b Sort Method: quicksort Memory: 17kB -> Bitmap Heap Scan on a (cost=12.14..482.47 rows=500 width=8) (actual time=0.028..0.029 rows=1 loops=1) Recheck Cond: (id = 4) -> Bitmap Index Scan on a_idx1 (cost=0.00..12.01 rows=500 width=0) (actual time=0.021..0.021 rows=1 loops=1) Index Cond: (id = 4) -> Materialize (cost=0.00..884002.52 rows=50000000 width=8) (actual time=0.041..0.057 rows=5 loops=1) -> Merge Join (cost=0.00..759002.52 rows=50000000 width=8) (actual time=0.037..0.051 rows=5 loops=1) Merge Cond: (b.id = c.id) -> Index Scan using b_idx1 on b (cost=0.00..4376.26 rows=100000 width=4) (actual time=0.016..0.018 rows=5loops=1) -> Materialize (cost=0.00..4626.26 rows=100000 width=4) (actual time=0.017..0.022 rows=5 loops=1) -> Index Scan using c_idx1 on c (cost=0.00..4376.26 rows=100000 width=4) (actual time=0.014..0.017rows=5 loops=1) Total runtime: 0.209 ms This continues to stay the plan for about 10-15 repetitions, then it turns to this plan Hash Right Join (cost=2701.29..6519.30 rows=1 width=16) (actual time=79.604..299.227 rows=1 loops=1) Hash Cond: (b.id = a.b) -> Hash Join (cost=2693.00..6136.00 rows=100000 width=8) (actual time=79.550..265.251 rows=100000 loops=1) Hash Cond: (b.id = c.id) -> Seq Scan on b (cost=0.00..1443.00 rows=100000 width=4) (actual time=0.011..36.158 rows=100000 loops=1) -> Hash (cost=1443.00..1443.00 rows=100000 width=4) (actual time=79.461..79.461 rows=100000 loops=1) Buckets: 16384 Batches: 1 Memory Usage: 2735kB -> Seq Scan on c (cost=0.00..1443.00 rows=100000 width=4) (actual time=0.010..32.930 rows=100000 loops=1) -> Hash (cost=8.28..8.28 rows=1 width=8) (actual time=0.015..0.015 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Index Scan using a_idx1 on a (cost=0.00..8.28 rows=1 width=8) (actual time=0.010..0.012 rows=1 loops=1) Index Cond: (id = 4) Total runtime: 299.564 ms (I guess autovacuum kicked in, because that the same plan I get when running analyze on all four tables right after populatingthem) And the second one yields this one here (Regardless of analyze or not): QUERY PLAN Nested Loop Left Join (cost=0.00..16.89 rows=1 width=16) (actual time=0.027..0.031 rows=1 loops=1) -> Nested Loop Left Join (cost=0.00..16.57 rows=1 width=12) (actual time=0.020..0.022 rows=1 loops=1) -> Index Scan using a_idx1 on a (cost=0.00..8.28 rows=1 width=8) (actual time=0.011..0.012 rows=1 loops=1) Index Cond: (id = 4) -> Index Scan using b_idx1 on b (cost=0.00..8.28 rows=1 width=4) (actual time=0.004..0.005 rows=1 loops=1) Index Cond: (a.b = id) -> Index Scan using c_idx1 on c (cost=0.00..0.31 rows=1 width=4) (actual time=0.004..0.005 rows=1 loops=1) Index Cond: (b.id = id) Total runtime: 0.104 ms My version: PostgreSQL 9.1.3, compiled by Visual C++ build 1500, 32-bit Running on Windows XP SP3 shared_buffers = 768MB work_mem = 24MB effective_cache_size = 1024MB All other (relevant) settings are on defaults Regards Thomas
Thomas Kellerer <spam_eater@gmx.net> writes: > Tom Lane wrote on 26.04.2012 21:17: >> Um ... did you analyze all the tables, or just some of them? I get >> sub-millisecond runtimes if all four tables have been analyzed, but it >> does seem to pick lousy plans if, say, only a and b have been analyzed. > Here it's similar to Richard's experience: > Before analyzing the four tables, the first statement yields this plan: > [ merge joins ] > This continues to stay the plan for about 10-15 repetitions, then it turns to this plan > [ hash joins ] Hmm. I see it liking the merge-join plan (with minor variations) with or without analyze data, but if just some of the tables have been analyzed, it goes for the hash plan which is a good deal slower. The cost estimates aren't that far apart though. In any case, the only reason the merge join is so fast is that the data is perfectly ordered in each table; on a less contrived example, it could well be a lot slower. > And the second one yields this one here (Regardless of analyze or not): Yeah, the trick there is that it's valid to re-order the joins, since they're both left joins. In git HEAD I get something like this: regression=# explain analyze select * from a left join (b inner join c on b.id = c.id) on a.b = b.id where a.id = 4; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Nested Loop Left Join (cost=0.00..17.18 rows=1 width=16) (actual time=0.024..0.026 rows=1 loops=1) -> Index Scan using a_idx1 on a (cost=0.00..8.38 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=1) Index Cond: (id = 4) -> Nested Loop (cost=0.00..8.80 rows=1 width=8) (actual time=0.011..0.012 rows=1 loops=1) -> Index Only Scan using b_idx1 on b (cost=0.00..8.38 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=1) Index Cond: (id = a.b) Heap Fetches: 1 -> Index Only Scan using c_idx1 on c (cost=0.00..0.41 rows=1 width=4) (actual time=0.004..0.005 rows=1 loops=1) Index Cond: (id = b.id) Heap Fetches: 1 Total runtime: 0.080 ms (11 rows) but 9.1 and older are not smart enough to do it like that when they can't re-order the joins. regards, tom lane
On 04/26/2012 04:08 PM, Tom Lane wrote: > Thomas Kellerer<spam_eater@gmx.net> writes: >> Tom Lane wrote on 26.04.2012 21:17: >>> Um ... did you analyze all the tables, or just some of them? I get >>> sub-millisecond runtimes if all four tables have been analyzed, but it >>> does seem to pick lousy plans if, say, only a and b have been analyzed. >> Here it's similar to Richard's experience: >> Before analyzing the four tables, the first statement yields this plan: >> [ merge joins ] >> This continues to stay the plan for about 10-15 repetitions, then it turns to this plan >> [ hash joins ] > Hmm. I see it liking the merge-join plan (with minor variations) with > or without analyze data, but if just some of the tables have been > analyzed, it goes for the hash plan which is a good deal slower. The > cost estimates aren't that far apart though. In any case, the only > reason the merge join is so fast is that the data is perfectly ordered > in each table; on a less contrived example, it could well be a lot > slower. > It's not so terribly contrived, is it? It's common enough to have tables which are append-only and to join them by something that corresponds to the append order (serial field, timestamp etc.) cheers andrew
Dear Tom, Thanks for your response. I my test cases, until an analyze have been run, the queries run fast. After only a have been analyzed, the query plan changes so that a sequence scan for b and c tables is done, and joining them with 'a' is done within memory. So, my tests are here, with autovacuum turned off: krichy=> \i test.sql DROP TABLE DROP TABLE DROP TABLE DROP TABLE CREATE TABLE CREATE TABLE CREATE TABLE CREATE TABLE INSERT 0 100000 INSERT 0 100000 INSERT 0 100000 INSERT 0 100000 CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX krichy=> explain analyze select * from a left join (b inner join c on b.id = c.id) on a.b = b.id where a.id = 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------ Merge Left Join (cost=504.88..2633669.90 rows=125000000 width=16) (actual time=0.183..0.189 rows=1 loops=1) Merge Cond: (a.b = b.id) -> Sort (cost=504.88..506.13 rows=500 width=8) (actual time=0.073..0.074 rows=1 loops=1) Sort Key: a.b Sort Method: quicksort Memory: 17kB -> Bitmap Heap Scan on a (cost=12.13..482.47 rows=500 width=8) (actual time=0.048..0.048 rows=1 loops=1) Recheck Cond: (id = 1) -> Bitmap Index Scan on a_idx1 (cost=0.00..12.01 rows=500 width=0) (actual time=0.044..0.044 rows=1 loops=1) Index Cond: (id = 1) -> Materialize (cost=0.00..883162.52 rows=50000000 width=8) (actual time=0.103..0.108 rows=2 loops=1) -> Merge Join (cost=0.00..758162.52 rows=50000000 width=8) (actual time=0.100..0.104 rows=2 loops=1) Merge Cond: (b.id = c.id) -> Index Scan using b_idx1 on b (cost=0.00..3956.26 rows=100000 width=4) (actual time=0.050..0.050 rows=2 loops=1) -> Materialize (cost=0.00..4206.26 rows=100000 width=4) (actual time=0.048..0.049 rows=2 loops=1) -> Index Scan using c_idx1 on c (cost=0.00..3956.26 rows=100000 width=4) (actual time=0.046..0.047 rows=2 loops=1) Total runtime: 0.276 ms (16 rows) krichy=> ANALYZE a; ANALYZE krichy=> explain analyze select * from a left join (b inner join c on b.id = c.id) on a.b = b.id where a.id = 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Merge Right Join (cost=8.29..883178.31 rows=500 width=16) (actual time=0.050..0.056 rows=1 loops=1) Merge Cond: (b.id = a.b) -> Merge Join (cost=0.00..758162.52 rows=50000000 width=8) (actual time=0.034..0.038 rows=2 loops=1) Merge Cond: (b.id = c.id) -> Index Scan using b_idx1 on b (cost=0.00..3956.26 rows=100000 width=4) (actual time=0.015..0.016 rows=2 loops=1) -> Materialize (cost=0.00..4206.26 rows=100000 width=4) (actual time=0.015..0.017 rows=2 loops=1) -> Index Scan using c_idx1 on c (cost=0.00..3956.26 rows=100000 width=4) (actual time=0.012..0.013 rows=2 loops=1) -> Materialize (cost=8.29..8.29 rows=1 width=8) (actual time=0.015..0.016 rows=1 loops=1) -> Sort (cost=8.29..8.29 rows=1 width=8) (actual time=0.013..0.013 rows=1 loops=1) Sort Key: a.b Sort Method: quicksort Memory: 17kB -> Index Scan using a_idx1 on a (cost=0.00..8.28 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=1) Index Cond: (id = 1) Total runtime: 0.101 ms (14 rows) krichy=> ANALYZE b; ANALYZE krichy=> explain analyze select * from a left join (b inner join c on b.id = c.id) on a.b = b.id where a.id = 1; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Hash Right Join (cost=2651.29..6419.30 rows=1 width=16) (actual time=83.823..257.890 rows=1 loops=1) Hash Cond: (b.id = a.b) -> Hash Join (cost=2643.00..6036.00 rows=100000 width=8) (actual time=83.790..224.552 rows=100000 loops=1) Hash Cond: (c.id = b.id) -> Seq Scan on c (cost=0.00..1393.00 rows=100000 width=4) (actual time=0.010..35.925 rows=100000 loops=1) -> Hash (cost=1393.00..1393.00 rows=100000 width=4) (actual time=83.752..83.752 rows=100000 loops=1) Buckets: 16384 Batches: 1 Memory Usage: 2344kB -> Seq Scan on b (cost=0.00..1393.00 rows=100000 width=4) (actual time=0.009..36.302 rows=100000 loops=1) -> Hash (cost=8.28..8.28 rows=1 width=8) (actual time=0.012..0.012 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Index Scan using a_idx1 on a (cost=0.00..8.28 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=1) Index Cond: (id = 1) Total runtime: 258.245 ms (13 rows) krichy=> ANALYZE c; ANALYZE krichy=> explain analyze select * from a left join (b inner join c on b.id = c.id) on a.b = b.id where a.id = 1; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Hash Right Join (cost=2651.29..6419.30 rows=1 width=16) (actual time=83.295..255.653 rows=1 loops=1) Hash Cond: (b.id = a.b) -> Hash Join (cost=2643.00..6036.00 rows=100000 width=8) (actual time=83.275..222.373 rows=100000 loops=1) Hash Cond: (b.id = c.id) -> Seq Scan on b (cost=0.00..1393.00 rows=100000 width=4) (actual time=0.010..35.726 rows=100000 loops=1) -> Hash (cost=1393.00..1393.00 rows=100000 width=4) (actual time=83.238..83.238 rows=100000 loops=1) Buckets: 16384 Batches: 1 Memory Usage: 2344kB -> Seq Scan on c (cost=0.00..1393.00 rows=100000 width=4) (actual time=0.009..36.243 rows=100000 loops=1) -> Hash (cost=8.28..8.28 rows=1 width=8) (actual time=0.011..0.011 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Index Scan using a_idx1 on a (cost=0.00..8.28 rows=1 width=8) (actual time=0.008..0.009 rows=1 loops=1) Index Cond: (id = 1) Total runtime: 256.008 ms (13 rows) So after analyzing all the tables, the result changed much, psql uses other plans to do the query, and in my case it is effectively much slower. My configuration file has: work_mem=16MB with this removed, the query goes fast again, but I dont know why the more memory makes psql chose a worse plan. Thanks in advance, Kojedzinszky Richard Euronet Magyarorszag Informatikai Zrt. On Thu, 26 Apr 2012, Tom Lane wrote: > Date: Thu, 26 Apr 2012 15:17:18 -0400 > From: Tom Lane <tgl@sss.pgh.pa.us> > To: Richard Kojedzinszky <krichy@tvnetwork.hu> > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] query optimization > > Richard Kojedzinszky <krichy@tvnetwork.hu> writes: >> Dear list, >> We have a database schema, which looks the same as the attached script. > >> When filling the tables with data, and skipping analyze on the table (so >> pg_stats contains no records for table 'a'), the first select in the >> script runs fast, but after an analyze the planner decides to sequence >> scan tables b and c, thus making the query much slower. Can somebody help >> me solving this issue, or tuning our installation to not to use sequence >> scans in this case? > > Um ... did you analyze all the tables, or just some of them? I get > sub-millisecond runtimes if all four tables have been analyzed, but it > does seem to pick lousy plans if, say, only a and b have been analyzed. > > What you really need for this query structure is the parameterized-path > work I've been doing for 9.2; but at least on the exact example given, > I'm not seeing that 9.1 is that much worse. > > regards, tom lane >