Discussion: Re: [SQL] bad select performance fixed by forbidding hash joins
[PostgreSQL 6.5.0 on i586-pc-linux-gnu, compiled by gcc egcs-2.91.66]

table opset_steps (name text, id int2, ver int2) [1400 rows]
  non-unique index is on (id, ver)
table run_opsets (status int2, id int2, ver int2, run_id int2, seq int2) [17000 rows]
  pkey is (id, seq), second index on (status, id, ver, run_id)
  select count(*) from run_opsets where status=1;  --> 187
  select count(*) from run_opsets where status=3;  --> 10564
table runs (run_name text, run_id int2, status int2) [900 rows]
  pkey is run_name, second index (run_id, status)
  select count(*) from runs where status=1;  --> 68

I have vacuum analyzed all relevant tables.

explain select os.name,r.run_name,ro.status
from opset_steps os, runs r, run_opsets ro
where (ro.status=3 or ro.status=1)
  and ro.opset_id=os.opset_id and ro.run_id=r.run_id
  and ro.opset_ver=os.opset_ver and r.status=1;

Hash Join  (cost=1793.58 rows=14560 width=38)
  ->  Hash Join  (cost=1266.98 rows=14086 width=24)
        ->  Seq Scan on run_opsets ro  (cost=685.51 rows=13903 width=8)
        ->  Hash  (cost=70.84 rows=1389 width=16)
              ->  Seq Scan on opset_steps os  (cost=70.84 rows=1389 width=16)
  ->  Hash  (cost=47.43 rows=374 width=14)
        ->  Seq Scan on runs r  (cost=47.43 rows=374 width=14)

This query takes 16 seconds.  [returns 3126 rows]

On Tue, Jul 20, 1999 at 05:42:20PM -0400, Tom Lane wrote:
> On Tue, 20 Jul 1999 14:56:46 -0400 George Young wrote:
> > ... Is this then the best that postgres can do?  Is there some
> > rephrasing/restructuring of this query that would make it faster?
>
> Hard to say.  The query looks reasonable as it stands ---
> ... You have no restriction clause on opset_steps so all of those
> entries get loaded for hashing; can you provide one?

No.

> The system's plan looks pretty reasonable as well.  It might be that
> a merge join would be faster than a hash join but I wouldn't assume so.
> If you want, you can try forcing the system not to use hashes;
> start psql with environment variable
>     PGOPTIONS="-fh"
> and see what sort of plan and execution time you get.  If that does
> turn out to be a big win it would indicate that the planner is using
> poor cost estimates, which is certainly possible...

Yes!  PGOPTIONS="-fh" made the query time go from 16 seconds to 2 seconds!
Is this a safe thing to leave on permanently, or is there some way to set
PGOPTIONS for just this query?

explain select os.name,r.run_name,ro.status
from opset_steps os, runs r, run_opsets ro
where (ro.status=3 or ro.status=1)
  and ro.opset_id=os.opset_id and ro.run_id=r.run_id
  and ro.opset_ver=os.opset_ver and r.status=1;

Merge Join  (cost=9295.54 rows=14560 width=38)
  ->  Seq Scan  (cost=8676.01 rows=14371 width=22)
        ->  Sort  (cost=8676.01 rows=14371 width=22)
              ->  Merge Join  (cost=1657.30 rows=14371 width=22)
                    ->  Index Scan using run_opsets_pkey on run_opsets ro  (cost=1031.25 rows=13903 width=8)
                    ->  Seq Scan  (cost=154.91 rows=374 width=14)
                          ->  Sort  (cost=154.91 rows=374 width=14)
                                ->  Seq Scan on runs r  (cost=47.43 rows=374 width=14)
  ->  Index Scan using opset_steps_idx_ver_id on opset_steps os  (cost=99.45 rows=1389 width=16)

With PGOPTIONS=-fh, this query takes ~2 seconds!  [returns 3126 rows]

-- 
George Young, Rm. L-204                gry@ll.mit.edu
MIT Lincoln Laboratory
244 Wood St.  Lexington, Massachusetts  02420-9108    (781) 981-2756
George Young <gry@ll.mit.edu> writes:
> Yes!  PGOPTIONS="-fh" made the query time go from 16 seconds to 2 seconds!
> Is this a safe thing to leave on permanently, or is there some way to set
> PGOPTIONS for just this query?

I wouldn't recommend leaving it on as a long-term solution, because
you're hobbling the system for cases where hashjoin *is* the best
method.  AFAIK there is not a SET VARIABLE method for enabling/disabling
plan types on-the-fly, though perhaps one should be added.

The right long-term solution is to figure out why the system is
misestimating the relative costs of the two plans, and fix the cost
estimates.  (The system is estimating that the mergejoin is about 4x
slower than hash; if it's really 8x faster, there is something pretty
broken about the estimate...)

I am interested in looking into this.  If your data is not proprietary,
perhaps you would be willing to send me a database dump so that I can
reproduce the problem exactly?  (If the dump is no more than a few
megabytes, emailing it should be OK.)  No big hurry, since I probably
won't be able to get to it for a week or so anyway.

			regards, tom lane
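The size of the misestimate Tom describes can be checked from the numbers quoted in this thread (a standalone arithmetic sketch; it uses the top-level EXPLAIN costs shown above, which give roughly 5x rather than Tom's rounder "about 4x"):

```python
# Planner's top-level cost estimates from the two EXPLAIN outputs above,
# and the measured wall-clock times George reported.
hash_cost, merge_cost = 1793.58, 9295.54
hash_secs, merge_secs = 16.0, 2.0

est_ratio = merge_cost / hash_cost    # planner: merge looks ~5x more expensive
real_ratio = hash_secs / merge_secs   # reality: hash is 8x slower

# The relative-cost estimate is therefore off by roughly the product.
print(round(est_ratio, 1))               # -> 5.2
print(round(real_ratio, 1))              # -> 8.0
print(round(est_ratio * real_ratio, 1))  # -> 41.5
```

In other words, the planner's ranking of the two plans is wrong by about a factor of forty, which is why Tom calls the estimate "pretty broken" rather than merely imprecise.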
Re: [GENERAL] Re: [SQL] bad select performance fixed by forbidding hash joins
From: Bruce Momjian
> George Young <gry@ll.mit.edu> writes:
> > Yes!  PGOPTIONS="-fh" made the query time go from 16 seconds to 2 seconds!
> > Is this a safe thing to leave on permanently, or is there some way to set
> > PGOPTIONS for just this query?
>
> I wouldn't recommend leaving it on as a long-term solution, because
> you're hobbling the system for cases where hashjoin *is* the best
> method.  AFAIK there is not a SET VARIABLE method for enabling/disabling
> plan types on-the-fly, though perhaps one should be added.

Postgres does have options to prohibit certain join types, so you could
try forcing a certain join type and see what happens.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
A few weeks ago, George Young <gry@ll.mit.edu> complained that the
following query:

> select os.name,r.run_name,ro.status from opset_steps os,runs r,
> run_opsets ro where (ro.status=3 or ro.status=1) and
> ro.opset_id=os.opset_id and ro.run_id=r.run_id and
> ro.opset_ver=os.opset_ver and r.status=1;

had horrible performance when executed via the system's preferred plan:

> Hash Join  (cost=1793.58 rows=14560 width=38)
>   ->  Hash Join  (cost=1266.98 rows=14086 width=24)
>         ->  Seq Scan on run_opsets ro  (cost=685.51 rows=13903 width=8)
>         ->  Hash  (cost=70.84 rows=1389 width=16)
>               ->  Seq Scan on opset_steps os  (cost=70.84 rows=1389 width=16)
>   ->  Hash  (cost=47.43 rows=374 width=14)
>         ->  Seq Scan on runs r  (cost=47.43 rows=374 width=14)

I have looked into this, and it seems that the problem is this: the
innermost hash join between run_opsets and opset_steps is being done on
the join clause ro.opset_ver=os.opset_ver.  In George's data, the
opset_ver columns have only about 14 distinct values, with a very
strong bias towards the values 1, 2, 3.  This means that the vast
majority of the opset_steps entries go into only three hash buckets,
and the vast majority of the probes from run_opsets search one of those
same three buckets, so that most of the run_opsets rows are being
compared to almost a third of the opset_steps rows, not just a small
fraction of them.  Almost all of the runtime of the query is going into
the tuple comparison tests :-(

It seems clear that we want the system not to risk using a hashjoin
unless it has good evidence that the inner table's column has a fairly
flat distribution.  I'm thinking that the right sort of check would be
to check whether the "disbursion" statistic set by VACUUM ANALYZE is
fairly small, maybe 0.01 or less (but not zero, which would suggest
that VACUUM ANALYZE has never been run).
This would roughly correspond to the most common value appearing not
more than 1% of the time, so that we can be sure at least 100 different
hash buckets will be used.  Comments?  Is that value too small?

This change is likely to reduce the optimizer's willingness to use
hashjoins by a *lot*, especially if we make the threshold too small.
If you'd like to see what kind of disbursion numbers you get on your
own data, try something like

	select relname, attname, attdisbursion
	from pg_class, pg_attribute
	where attrelid = pg_class.oid
	  and relkind = 'r' and attnum > 0
	order by relname, attname;

after a vacuum analyze.

			regards, tom lane

PS: George, in the meantime I bet your query would run fine if the
system would only choose the opset_id clause instead of opset_ver to do
the hashjoin with --- opset_id has far better distribution.  I'm
guessing that it thinks the two clauses are equally attractive and is
just choosing whichever one it happens to process first (or last?).
You might try rearranging the order of the WHERE clauses as a stopgap
solution...
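The bucket-skew effect Tom describes can be seen in a short standalone simulation (a sketch, not PostgreSQL's actual hash-join code: the bucket count, value frequencies, and use of Python's `hash` are invented for illustration, while the row counts 1389 and 13903 come from the EXPLAIN output above). Each outer probe must be compared against every inner tuple that landed in its bucket, so a heavily skewed inner join key makes the join degrade toward comparing most probes against a large share of the inner relation:

```python
import random
from collections import Counter

random.seed(42)

N_BUCKETS = 100  # hypothetical hash-table size

def comparisons(inner_keys, outer_keys, n_buckets=N_BUCKETS):
    """Tuple comparisons a simple hash join performs: each outer probe
    is checked against every inner tuple in its bucket."""
    buckets = Counter(hash(k) % n_buckets for k in inner_keys)
    return sum(buckets[hash(k) % n_buckets] for k in outer_keys)

# Skewed key, like opset_ver: ~14 distinct values, strong bias toward 1, 2, 3.
skewed_pool = [1, 2, 3] * 30 + list(range(4, 15))
inner_skewed = [random.choice(skewed_pool) for _ in range(1389)]
outer_skewed = [random.choice(skewed_pool) for _ in range(13903)]

# Flat key, like opset_id: one distinct value per inner row.
inner_flat = list(range(1389))
outer_flat = [random.randrange(1389) for _ in range(13903)]

skewed = comparisons(inner_skewed, outer_skewed)
flat = comparisons(inner_flat, outer_flat)
print(f"skewed key: {skewed} comparisons, flat key: {flat} comparisons")
```

With these invented frequencies the skewed column costs well over an order of magnitude more comparisons than the flat one, which matches Tom's point: the join clause chosen for hashing, not the plan shape, is what made the 16-second run so slow, and a disbursion cutoff (or preferring the opset_id clause) avoids it.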