Обсуждение: That killer 3rd join...
In order to explore some postgres performance options with table collation, I decided to use a little experimental dabase to try out some of the options I saw. What I want to create queries to combine data from 2+ tables into individual rows. So - being a bit of an EQ player, I cobbled together a trivial little database that tries to generate an 'EQ Jewellery' table. It all works fine, and it works fine under MS Access or mysql. But under Postgres, it grinds. It chugs. When I experimented with the database, I found that it only started to do this when I go to a fourth level of join. The database can be found here: http://www.kfs.org/~oliver/jewellery/dbcreate.sql Definition http://www.kfs.org/~oliver/jewellery/insert.sql Insert statements As you'll see - it's a pretty small table. If you do: SELECT * FROM jcombo_query WHERE metal_uid = 1 ; or SELECT * FROM jcombo_query jq, metal_types mt WHERE mt.metal_name = 'Silver' AND mt.metal_uid = jq.metal_uid ; There is no delay. Also doing SELECT * from jcombo_query silv, jcombo_query elec WHERE silv.metal_uid = 1 AND elec.metal_uid = 2 ; Still no delay. But add a third join SELECT * from jcombo_query silv, jcombo_query elec, jcombo_query gold WHERE silv.metal_uid = 1 AND elec.metal_uid = 2 AND gold.metal_uid = 3; Add a fourth: SELECT * from jcombo_query silv, jcombo_query elec, jcombo_query gold, jcombo_query plat WHERE silv.metal_uid = 1 AND elec.metal_uid = 2 AND gold.metal_uid = 3 AND plat.metal_uid = 4 ; And it's painful. So naturally, when I add the join (stone_types.stone_uid): SELECT * from jcombo_query silv, jcombo_query elec, jcombo_query gold, jcombo_query plat, stone_types st WHERE silv.metal_uid = 1 AND silv.stone_uid = st.stone_uid AND elec.metal_uid = 2 AND elec.stone_uid = st.stone_uid AND gold.metal_uid = 3 AND gold.stone_uid = st.stone_uid AND plat.metal_uid = 4 AND plat.stone_uid = st.stone_uid ; It takes way way way too long to come back for such a small database. How can I improve upon this kind of query? Oliver -- If at first you don't succeed, skydiving is not for you...
I'd put the explain output in, but it's just too too long. I think the issue here is that you're actually doing alot more joins than you think you are. (I think it's like 13 or so after the rewriting - 3 for each copy of the view) Your best bet is to put the results of jcombo_query into a temporary table and then join that 4 times rather than the 12 joins it turns out to in your query below. OTOH, I'm not 100% sure what you're trying to get out from this query, I'd have expected that it would be, using these metals on this stone gives you this result, but since the types of jewelery I get are different on the same row of output, I'm a little confused. As a separate issue I can't even do the query listed below on my machine with reasonably current sources, as it fails after a while with an ExecRestrPos: node type 18 not supported. Haven't looked yet... Stephan Szabo sszabo@bigpanda.com On Mon, 4 Sep 2000, Oliver Smith wrote: > In order to explore some postgres performance options with table > collation, I decided to use a little experimental dabase to try out > some of the options I saw. What I want to create queries to combine > data from 2+ tables into individual rows. > > So - being a bit of an EQ player, I cobbled together a trivial little > database that tries to generate an 'EQ Jewellery' table. It all works > fine, and it works fine under MS Access or mysql. But under Postgres, > it grinds. It chugs. > > When I experimented with the database, I found that it only started to > do this when I go to a fourth level of join. > > The database can be found here: > http://www.kfs.org/~oliver/jewellery/dbcreate.sql Definition > http://www.kfs.org/~oliver/jewellery/insert.sql Insert statements > > As you'll see - it's a pretty small table. > > > So naturally, when I add the join (stone_types.stone_uid): > > SELECT * from jcombo_query silv, jcombo_query elec, jcombo_query gold, > jcombo_query plat, stone_types st > WHERE silv.metal_uid = 1 AND silv.stone_uid = st.stone_uid AND > elec.metal_uid = 2 AND elec.stone_uid = st.stone_uid AND > gold.metal_uid = 3 AND gold.stone_uid = st.stone_uid AND > plat.metal_uid = 4 AND plat.stone_uid = st.stone_uid ; > > It takes way way way too long to come back for such a small database. > > How can I improve upon this kind of query?
> OTOH, I'm not 100% sure what you're trying to get out > from this query, I'd have expected that it would be, using > these metals on this stone gives you this result, but since > the types of jewelery I get are different on the same row > of output, I'm a little confused. Each stone gives a specific set of attributes, but combining it with silver & plat gives one type of jewellery, while gold & electrum produce a different type. The reason for including this is so that a chart can be produced which shows what attributes each stone modifies, how much those attributes are modified for each stone+metal combo, and what type of jewellery is produced in the combo. See http://www.kfs.org/~oliver/eq/jewellery.jsp to see the chart itself. As my own side note, on Postgres 7.0.2, I at one point tried creating a view which said CREATE VIEW silver_view AS SELECT * from jcombo_query WHERE metal_uid = 1 ; and so on for elec, gold, plat. And then used these to simplify the main query. This caused Postgres to go away permanently, and I had to manually delete the database. And the original query was so slow, that I decided to, for the time being, do a SELECT * INTO jewellery FROM metals_query ORDER BY stone_uid ; Ol -- If at first you don't succeed, skydiving is not for you...
On Thu, 7 Sep 2000, Oliver Smith wrote: > Each stone gives a specific set of attributes, but combining it with > silver & plat gives one type of jewellery, while gold & electrum > produce a different type. > > The reason for including this is so that a chart can be produced > which shows what attributes each stone modifies, how much those > attributes are modified for each stone+metal combo, and what type > of jewellery is produced in the combo. Ah, I see. > As my own side note, on Postgres 7.0.2, I at one point tried creating > a view which said > > CREATE VIEW silver_view AS > SELECT * from jcombo_query WHERE metal_uid = 1 ; > > and so on for elec, gold, plat. And then used these to simplify the > main query. This caused Postgres to go away permanently, and I had > to manually delete the database. I'll try that when I get home, I'd like to see what explain says in this case (if it even runs). My guess is that it would only make postgres' job more difficult since views are implemented as rewrite rules, this would just add another layer of rewrites that it would have work through. > And the original query was so slow, that I decided to, for the time > being, do a > > SELECT * INTO jewellery FROM metals_query ORDER BY stone_uid ; Yeah, tables for temp storage help, esp if you've got data that you don't need to interpret multiple times (like in the original one, the jcombo_query view getting joined with itself 4 times ends up being 12 joins - rather larger set of joins to try to work with). On my machine, just putting the jcombo_query data in a temporary table and using that rather than the view cut the time down to nearly nothing. In the original form, even just doing explain took like half a minute or something of that sort.
For the record, I have a similar query in a program I developed that aliases the same table multiple times and the performance on that one is terrible as well. In fact, it's the only query of the many that program uses that isn't downright snappy. Short of a table redesign (which I may do anyway), I haven't found a way of eliminating the bottleneck. Here's a snippet of the query... SELECT ... FROM ... review_statuses g, review_statuses h, review_statuses i, review_statuses j, review_statuses k, review_statuses l, review_statuses m, review_statuses n WHERE ... and a.review_status_wsep = g.status_code and a.review_status_dec = h.status_code and a.review_status_mar = i.status_code and a.review_status_jun = j.status_code and b.review_status_wsep = k.status_code and b.review_status_dec = l.status_code and b.review_status_mar = m.status_code and b.review_status_jun = n.status_code ORDER BY ...; Similarly, the EXPLAIN is extraordinarily long, so I haven't included. I've not run this particular query on our Oracle database (when I get a little free time I will), but have run similar types of queries and have never noticed any substantial performance hit, so I suspect that there's an opportunity for optimization here. Incidentally, and I probably should have mentioned it sooner than now, the program of which this query is a part won a CIO Magazine Web Business 50/50 Award this past July (IEPManager). Unfortunately, I gave the magazine lots of information which they reduced to a very short write-up. Included in that information was the fact that the project was build upon and runs on PostgreSQL (wanted to plug the developers' fine work), first v6.5.3 and now v7.0.2 (which is much improved -- many thanks). I had played with the earlier 6.4.x releases but did not find them robust enough for production use (we've been using Oracle since v5.something and, other than some early v6.x problems, have always appreciated its robustness, so our expectations are relatively high). Anyway, we needed to go to production use with IEPManager way faster than we had anticipated and therefore didn't have any budget for the project. So, we planned to start development on PostgreSQL then port to Oracle when funds were available (we're an underfunded not for profit organization). We are so pleased with the performance of PostgreSQL, though, that we've scrapped the porting plans. In fact, we've since done another project on PostgreSQL and look forward to all the great stuff that's planned for future releases (schemas and outer joins, especially). Thanks to all of the core developers for an absolutely superb job! - Bill Stephan Szabo wrote: > OTOH, I'm not 100% sure what you're trying to get out > from this query, I'd have expected that it would be, using > these metals on this stone gives you this result, but since > the types of jewelery I get are different on the same row > of output, I'm a little confused. > > Stephan Szabo > sszabo@bigpanda.com > > On Mon, 4 Sep 2000, Oliver Smith wrote: > > > In order to explore some postgres performance options with table > > collation, I decided to use a little experimental dabase to try out > > some of the options I saw. What I want to create queries to combine > > data from 2+ tables into individual rows. > > > > So - being a bit of an EQ player, I cobbled together a trivial little > > database that tries to generate an 'EQ Jewellery' table. It all works > > fine, and it works fine under MS Access or mysql. But under Postgres, > > it grinds. It chugs. > > > > When I experimented with the database, I found that it only started to > > do this when I go to a fourth level of join. > > > > The database can be found here: > > http://www.kfs.org/~oliver/jewellery/dbcreate.sql Definition > > http://www.kfs.org/~oliver/jewellery/insert.sql Insert statements > > > > As you'll see - it's a pretty small table. > > > > > > So naturally, when I add the join (stone_types.stone_uid): > > > > SELECT * from jcombo_query silv, jcombo_query elec, jcombo_query gold, > > jcombo_query plat, stone_types st > > WHERE silv.metal_uid = 1 AND silv.stone_uid = st.stone_uid AND > > elec.metal_uid = 2 AND elec.stone_uid = st.stone_uid AND > > gold.metal_uid = 3 AND gold.stone_uid = st.stone_uid AND > > plat.metal_uid = 4 AND plat.stone_uid = st.stone_uid ; > > > > It takes way way way too long to come back for such a small database. > > > > How can I improve upon this kind of query? -- Bill Sofko Publisher, Contrariwise
Just played with this a little bit, and I'm not 100% certain whether I'm getting the "desired results", but why not do something like: SELECT stone_name, st.stone_uid, stone_modifies, stone_difficulty, stone_cost, silver.jtyp_name AS silver_name, (silver.metal_cost + stone_cost) AS Expr1, silver.jcombo_stats AS silver_stats, elec.jtyp_name AS electrum_name, (elec.metal_cost + stone_cost) AS Expr2, elec.jcombo_stats AS electrum_stats, gold.jtyp_name AS gold_name, (gold.metal_cost + stone_cost) AS Expr3, gold.jcombo_stats AS gold_stats, plat.jtyp_name AS plat_name, (plat.metal_cost + stone_cost) AS Expr4, plat.jcombo_stats AS plat_stats FROM stone_types st, silver, elec, gold, plat WHERE st.stone_uid = silver.stone_uid AND st.stone_uid = elec.stone_uid AND st.stone_uid = gold.stone_uid AND st.stone_uid = plat.stone_uid ; where 'silver','elec','gold','plat' are each: CREATE VIEW silver AS SELECT metal_name, jtyp_name, metal_cost, jc.stone_uid, mt.metal_uid, jc.jcombo_stats FROM jewellery_types jt, jewellery_combinations jc, metal_types mt WHERE jt.jtyp_uid = jc.jtyp_uid AND jc.metal_uid = mt.metal_uid AND mt.metal_uid = 1; I have to be missing something though, since I'm only getting back 23 results, but they come back quick as anything ... On Mon, 4 Sep 2000, Oliver Smith wrote: > In order to explore some postgres performance options with table > collation, I decided to use a little experimental dabase to try out > some of the options I saw. What I want to create queries to combine > data from 2+ tables into individual rows. > > So - being a bit of an EQ player, I cobbled together a trivial little > database that tries to generate an 'EQ Jewellery' table. It all works > fine, and it works fine under MS Access or mysql. But under Postgres, > it grinds. It chugs. > > When I experimented with the database, I found that it only started to > do this when I go to a fourth level of join. > > The database can be found here: > http://www.kfs.org/~oliver/jewellery/dbcreate.sql Definition > http://www.kfs.org/~oliver/jewellery/insert.sql Insert statements > > As you'll see - it's a pretty small table. > > > If you do: > > SELECT * FROM jcombo_query WHERE metal_uid = 1 ; > or > SELECT * FROM jcombo_query jq, metal_types mt > WHERE mt.metal_name = 'Silver' AND mt.metal_uid = jq.metal_uid ; > > There is no delay. > > Also doing > > SELECT * from jcombo_query silv, jcombo_query elec > WHERE silv.metal_uid = 1 AND elec.metal_uid = 2 ; > > Still no delay. But add a third join > > SELECT * from jcombo_query silv, jcombo_query elec, jcombo_query gold > WHERE silv.metal_uid = 1 AND elec.metal_uid = 2 AND gold.metal_uid = 3; > > Add a fourth: > > SELECT * from jcombo_query silv, jcombo_query elec, jcombo_query gold, > jcombo_query plat > WHERE silv.metal_uid = 1 AND elec.metal_uid = 2 AND gold.metal_uid = 3 AND > plat.metal_uid = 4 ; > > And it's painful. > > So naturally, when I add the join (stone_types.stone_uid): > > SELECT * from jcombo_query silv, jcombo_query elec, jcombo_query gold, > jcombo_query plat, stone_types st > WHERE silv.metal_uid = 1 AND silv.stone_uid = st.stone_uid AND > elec.metal_uid = 2 AND elec.stone_uid = st.stone_uid AND > gold.metal_uid = 3 AND gold.stone_uid = st.stone_uid AND > plat.metal_uid = 4 AND plat.stone_uid = st.stone_uid ; > > It takes way way way too long to come back for such a small database. > > How can I improve upon this kind of query? > > > Oliver > -- > If at first you don't succeed, skydiving is not for you... > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes: > As a separate issue I can't even do the query listed below on my > machine with reasonably current sources, as it fails after a while with > an ExecRestrPos: node type 18 not supported. Haven't looked yet... Yes, there's a nasty little planning bug in 7.0.*: the planner can try to use merge or nestloop joins as the inner input of another mergejoin. That doesn't work, and what's worse, 7.0.* will just silently give wrong answers for such a plan (well, there's a DEBUG message in the postmaster log, but that's little help :-(). Current sources give an executor error instead. I have the planner bug fixed in my own files but it's intertwined with a ton of OUTER JOIN changes I'm not ready to commit yet... regards, tom lane
On Wed, Sep 06, 2000 at 10:52:34PM -0300, The Hermit Hacker wrote: > SELECT > stone_name, st.stone_uid, stone_modifies, stone_difficulty, stone_cost, > silver.jtyp_name AS silver_name, (silver.metal_cost + stone_cost) AS Expr1, > silver.jcombo_stats AS silver_stats, > elec.jtyp_name AS electrum_name, (elec.metal_cost + stone_cost) AS Expr2, > elec.jcombo_stats AS electrum_stats, > gold.jtyp_name AS gold_name, (gold.metal_cost + stone_cost) AS Expr3, > gold.jcombo_stats AS gold_stats, > plat.jtyp_name AS plat_name, (plat.metal_cost + stone_cost) AS Expr4, > plat.jcombo_stats AS plat_stats > FROM > stone_types st, silver, elec, gold, plat > WHERE > st.stone_uid = silver.stone_uid AND > st.stone_uid = elec.stone_uid AND > st.stone_uid = gold.stone_uid AND > st.stone_uid = plat.stone_uid ; > > where 'silver','elec','gold','plat' are each: > > CREATE VIEW silver AS > SELECT > metal_name, jtyp_name, metal_cost, jc.stone_uid, mt.metal_uid, jc.jcombo_stats > FROM > jewellery_types jt, jewellery_combinations jc, metal_types mt > WHERE > jt.jtyp_uid = jc.jtyp_uid AND jc.metal_uid = mt.metal_uid AND mt.metal_uid = 1; > > I have to be missing something though, since I'm only getting back 23 > results, but they come back quick as anything ... 23 results is correct (or, infact, correct with the data I supplied). Hmm - Interesting tho. It hadn't occured to me to drop the 'generic' jcombo_query and replace that with specific views. Infact, what I'd been doing was using CREATE VIEW jcombo_query AS SELECT metal_name, jtyp_name, metal_cost, jc.stone_uid, mt.metal_uid, jc.jcombo_stats FROM jewellery_types jt, jewellery_combinations jc, metal_types mt WHERE jt.jtyp_uid = jc.jtyp_uid AND jc.metal_uid = mt.metal_uid ; and then having the following: CREATE VIEW silver AS SELECT * FROM jcombo_query WHERE metal_uid = 1 ; CREATE VIEW elec AS ... However, when I did that, postgres went away. I'll give your idea a try, it looks quite promising. Oliver -- If at first you don't succeed, skydiving is not for you...
On Thu, 7 Sep 2000, Oliver Smith wrote: > CREATE VIEW silver AS > SELECT * FROM jcombo_query WHERE metal_uid = 1 ; > > CREATE VIEW elec AS > ... > > However, when I did that, postgres went away. I'll give your idea a try, > it looks quite promising. Ya, looked at that ... you were doing a VIEW of a VIEW ... not sure why it blew up taht way, but I'm suspecting it was that tha tblew it up ... BTW, what do you mean by 'went away'? crashed, or just hung there?
On Thu, Sep 07, 2000 at 02:11:38PM +0100, Oliver Smith wrote: > However, when I did that, postgres went away. I'll give your idea a try, > it looks quite promising. I still found this took quite a long time to process the query. So - I created the four views, silver, elec, gold, plat, and a new view, jewellery, which uses those four directly. Here's the explain result: EXPLAIN SELECT * FROM jewellery ORDER BY stone_uid ; NOTICE: QUERY PLAN: Sort (cost=22.93..22.93 rows=1 width=236) -> Nested Loop (cost=11.29..22.92 rows=1 width=236) -> Nested Loop (cost=11.29..21.81 rows=1 width=220) -> Nested Loop (cost=11.29..19.78 rows=1 width=196) -> Nested Loop (cost=11.29..18.67 rows=1 width=180) -> Nested Loop (cost=11.29..17.56 rows=1 width=164) -> Nested Loop (cost=11.29..16.50 rows=1 width=156) -> Nested Loop (cost=11.29..14.47 rows=1 width=132) -> Nested Loop (cost=11.29..13.41 rows=1 width=124) -> Merge Join (cost=11.29..11.38 rows=1 width=100) -> Sort (cost=10.18..10.18 rows=2 width=84) -> Hash Join (cost=3.49..10.16 rows=2 width=84) -> Nested Loop (cost=0.00..3.91 rows=27 width=60) -> Seq Scan on metal_types mt (cost=0.00..1.05rows=1 width=8) -> Materialize (cost=2.59..2.59 rows=27 width=52) -> Nested Loop (cost=0.00..2.59 rows=27width=52) -> Seq Scan on metal_types mt (cost=0.00..1.05rows=1 width=8) -> Seq Scan on stone_types st (cost=0.00..1.27rows=27 width=44) -> Hash (cost=2.08..2.08 rows=108 width=24) -> Seq Scan on jewellery_combinations jc (cost=0.00..2.08rows=108 width=24) -> Sort (cost=1.11..1.11 rows=5 width=16) -> Seq Scan on jewellery_types jt (cost=0.00..1.05 rows=5width=16) -> Index Scan using jewellery_combinations_pkey on jewellery_combinationsjc (cost=0.00..2.01 rows=1 width=24) -> Seq Scan on metal_types mt (cost=0.00..1.05 rows=1 width=8) -> Index Scan using jewellery_combinations_pkey on jewellery_combinations jc (cost=0.00..2.01rows=1 width=24) -> Seq Scan on metal_types mt (cost=0.00..1.05 rows=1 width=8) -> Seq Scan on jewellery_types jt (cost=0.00..1.05 rows=5 width=16) -> Seq Scan on jewellery_types jt (cost=0.00..1.05 rows=5 width=16) -> Index Scan using jewellery_combinations_pkey on jewellery_combinations jc (cost=0.00..2.01 rows=1 width=24) -> Seq Scan on jewellery_types jt (cost=0.00..1.05 rows=5 width=16) EXPLAIN -- If at first you don't succeed, skydiving is not for you...
On Thu, Sep 07, 2000 at 10:23:57AM -0300, The Hermit Hacker wrote: > BTW, what do you mean by 'went away'? crashed, or just hung there? Stopped using CPU, stopped responding to signals (except in the end -9), and when I did finally kill it (the following day, to give it chance to do what it was doing), it left the database the tables had been in flagged as 'in use' so I couldn't touch them =/ Oliver -- If at first you don't succeed, skydiving is not for you...
On Thu, 7 Sep 2000, Oliver Smith wrote: > On Thu, Sep 07, 2000 at 10:23:57AM -0300, The Hermit Hacker wrote: > > BTW, what do you mean by 'went away'? crashed, or just hung there? > > Stopped using CPU, stopped responding to signals (except in the end -9), > and when I did finally kill it (the following day, to give it chance to > do what it was doing), it left the database the tables had been in flagged > as 'in use' so I couldn't touch them =/ Stupid question, but what version of PostgreSQL did you say you were using? *raised eyebrow*
On Thu, Sep 07, 2000 at 04:59:51PM -0300, The Hermit Hacker wrote: > On Thu, 7 Sep 2000, Oliver Smith wrote: > > > On Thu, Sep 07, 2000 at 10:23:57AM -0300, The Hermit Hacker wrote: > > > BTW, what do you mean by 'went away'? crashed, or just hung there? > > > > Stopped using CPU, stopped responding to signals (except in the end -9), > > and when I did finally kill it (the following day, to give it chance to > > do what it was doing), it left the database the tables had been in flagged > > as 'in use' so I couldn't touch them =/ > > Stupid question, but what version of PostgreSQL did you say you were > using? *raised eyebrow* psql (PostgreSQL) 7.0.2 contains readline, history, multibyte support Portions Copyright (c) 1996-2000, PostgreSQL, Inc Portions Copyright (c) 1996 Regents of the University of California Read the file COPYRIGHT or use the command \copyright to see the usage and distribution terms. Oliver -- If at first you don't succeed, skydiving is not for you...
most odd then ... I've had to kill off processes in the past (recent bug report being a good example) and haven't had problems where having to rebuild the database was a requirement ... I can't recall the OS you are running on, but FreeBSD has a 'gcore' function that you can get a core dump of a runnign process with ... assuming you have similar (or are running FreeBSD), I wonder what a core dump would show ... On Fri, 8 Sep 2000, Oliver Smith wrote: > On Thu, Sep 07, 2000 at 04:59:51PM -0300, The Hermit Hacker wrote: > > On Thu, 7 Sep 2000, Oliver Smith wrote: > > > > > On Thu, Sep 07, 2000 at 10:23:57AM -0300, The Hermit Hacker wrote: > > > > BTW, what do you mean by 'went away'? crashed, or just hung there? > > > > > > Stopped using CPU, stopped responding to signals (except in the end -9), > > > and when I did finally kill it (the following day, to give it chance to > > > do what it was doing), it left the database the tables had been in flagged > > > as 'in use' so I couldn't touch them =/ > > > > Stupid question, but what version of PostgreSQL did you say you were > > using? *raised eyebrow* > > psql (PostgreSQL) 7.0.2 > contains readline, history, multibyte support > Portions Copyright (c) 1996-2000, PostgreSQL, Inc > Portions Copyright (c) 1996 Regents of the University of California > Read the file COPYRIGHT or use the command \copyright to see the > usage and distribution terms. > > Oliver > -- > If at first you don't succeed, skydiving is not for you... > Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy Systems Administrator @ hub.org primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org