Обсуждение: snowflaking
Hi all, I need some help optimizing a snowflaked :( db structure. I'm using 7.3.4 at the moment Scenario: I have only a few tables holding real data: e.g. txt,flt,tf,datum these tables only hold 1 column of real data and information what 'virtual' type they are. I have a table containing meta information about 'virtual' tables, i.e. tables that don't really exist in postgresql but have to be joined on the fly via the meta information. At runtime a query is build: --------------------------------------------------------------------------------------------------- SELECT o.id_objekt as id_objekt ,kapsel.id2_objekt as id2_kapsel ,vater.id2_objekt as id2_vater ,anzeige.id_objekt_objekt as id_anzeige ,anzeige.id2_objekt as anzeige ,anzeige.id2_objekt as id2_anzeige ,anzeige_datum_display_von.id_datum as id_anzeige_datum_display_von ,anzeige_datum_display_von.datum_fld as anzeige_datum_display_von ,anzeige_datum_display_bis.id_datum as id_anzeige_datum_display_bis ,anzeige_datum_display_bis.datum_fld as anzeige_datum_display_bis ,anzeige_enabled.id_tf as id_anzeige_enabled ,anzeige_enabled.tf as anzeige_enabled ,headline.id_objekt_objekt as id_headline ,headline.id2_objekt as headline [...MANY MANY MORE...] FROM objekt o JOIN ( SELECT id_objekt, id2_objekt FROM objekt_objekt JOIN meta USING ( id_meta ) WHERE id_meta = 91 ) vater ON ( vater.id_objekt = o.id_objekt ) JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta USING ( id_meta ) WHERE id_meta = 110 ) Kapsel ON ( kapsel.id_objekt = o.id_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta USING ( id_meta ) WHERE id_meta=84 ) anzeige ON ( anzeige.id_objekt=o.id_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_datum, datum_fld FROM datum JOIN meta USING ( id_meta ) WHERE id_meta=73 ) anzeige_datum_display_von ON ( anzeige_datum_display_von.id_objekt=anzeige.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_datum, datum_fld FROM datum JOIN meta USING ( id_meta ) WHERE id_meta=74 ) anzeige_datum_display_bis ON ( anzeige_datum_display_bis.id_objekt=anzeige.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_tf, tf FROM tf JOIN meta USING ( id_meta ) WHERE id_meta=75 ) anzeige_enabled ON ( anzeige_enabled.id_objekt=anzeige.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta USING ( id_meta ) WHERE id_meta=35 ) headline ON ( headline.id_objekt=o.id_objekt ) [...MANY MANY MORE...] ---------------------------------------------------------------------------------------------------- Execution takes approximately 0.05s When I EXPLAIN ANALYZE the query I get: ---------------------------------------------------------------------------------------------------- Hash Join (cost=123.24..212.68 rows=1 width=576) (actual time=11.93..12.50 rows=1 loops=1) Hash Cond: ("outer".id_objekt = "inner".id_objekt) -> Hash Join (cost=118.56..207.99 rows=1 width=552) (actual time=11.24..11.80 rows=1 loops=1) Hash Cond: ("outer".id_objekt = "inner".id_objekt) -> Nested Loop (cost=107.33..196.76 rows=1 width=535) (actual time=10.40..10.94 rows=1 loops=1) Join Filter: ("inner".id_objekt = "outer".id2_objekt) -> Nested Loop (cost=105.41..185.51 rows=1 width=478) (actual time=9.82..10.34 rows=1 loops=1) Join Filter: ("inner".id_objekt = "outer".id2_objekt) -> Nested Loop (cost=96.53..174.35 rows=1 width=421) (actual time=9.12..9.63 rows=1 loops=1) [... MANY MORE COLUMNS LOOKING SIMILAR ...] Total runtime: 17.36 msec (171 rows) ---------------------------------------------------------------------------------------------------- Is there anything I can do to speed it a bit up? We can assume, that a) inserts are done very rarely and b) the number of actual results are very low -- e-Trolley Sayegh & John, Nabil Sayegh Tel.: 0700 etrolley /// 0700 38765539 Fax.: +49 69 8299381-8 PGP : http://www.e-trolley.de
Nabil Sayegh wrote: [...big snowflake query and explain analyze results...] > Is there anything I can do to speed it a bit up? > We can assume, that > a) inserts are done very rarely and > b) the number of actual results are very low One simple idea is to use PREPARE and EXECUTE. This is only useful if you can PREPARE once and then EXECUTE multiple times (i.e. as long as you don't need to reconnect prior to every execution of the query). HTH, Joe
Joe Conway wrote: > Nabil Sayegh wrote: > [...big snowflake query and explain analyze results...] > >> Is there anything I can do to speed it a bit up? >> We can assume, that >> a) inserts are done very rarely and >> b) the number of actual results are very low > > > One simple idea is to use PREPARE and EXECUTE. This is only useful if > you can PREPARE once and then EXECUTE multiple times (i.e. as long as > you don't need to reconnect prior to every execution of the query). Hm, unfortunately my queries (at least the expensive ones) are only executed once per session. If only it could be made persistent, then it would help :( Any other ideas? What if I create views for such queries? Would these be faster (i.e. prepared automatically/persistent)? TFYH -- e-Trolley Sayegh & John, Nabil Sayegh Tel.: 0700 etrolley /// 0700 38765539 Fax.: +49 69 8299381-8 PGP : http://www.e-trolley.de
When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output: ---------------------------------------------------------------------------------------------------- Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1) Hash Cond: ("outer".id_objekt = "inner".id_objekt) -> Hash Join (cost=80.98..208.48 rows=1 width=228) (actual time=10.49..10.94 rows=1 loops=1) Hash Cond: ("outer".id_objekt = "inner".id_objekt) -> Nested Loop (cost=69.75..197.24 rows=1 width=216) (actual time=9.65..10.09 rows=1 loops=1) Join Filter: ("inner".id_objekt = "outer".id2_objekt) -> Nested Loop (cost=67.83..185.99 rows=1 width=204) (actual time=9.17..9.58 rows=1 loops=1) Join Filter: ("inner".id_objekt = "outer".id2_objekt) -> Nested Loop (cost=65.91..174.72 rows=1 width=192) (actual time=8.68..9.08 rows=1 loops=1) Join Filter: ("inner".id_objekt = "outer".id2_objekt) -> Nested Loop (cost=63.99..163.44 rows=1 width=180) (actual time=8.19..8.58 rows=1 loops=1) Join Filter: ("inner".id_objekt = "outer".id2_objekt) -> Hash Join (cost=59.66..149.14 rows=1 width=164) (actual time=7.63..8.00 rows=1 loops=1) Hash Cond: ("outer".id_objekt = "inner".id_objekt) -> Nested Loop (cost=45.61..135.07 rows=1 width=148) (actual time=6.48..6.83 rows=1 [etc etc etc] ---------------------------------------------------------------------------------------------------- Now I tested it with 7.4 to see if gets faster, but guess what? It's about 40 times slower(!): ---------------------------------------------------------------------------------------------------- Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1) Merge Cond: ("outer".id_objekt = "inner".id_objekt) -> Sort (cost=136.19..136.20 rows=2 width=4) (actual time=18.464..18.465 rows=1 loops=1) Sort Key: o.id_objekt -> Hash Left Join (cost=136.01..136.18 rows=2 width=4) (actual time=18.289..18.293 rows=1 loops=1) Hash Cond: ("outer".id_objekt = "inner".id_objekt) -> Merge Left Join (cost=129.95..130.03 rows=2 width=4) (actual time=17.860..17.863 rows=1 loops=1) Merge Cond: ("outer".id2_objekt = "inner".id_objekt) -> Sort (cost=123.05..123.06 rows=2 width=8) (actual time=17.568..17.569 rows=1 loops=1) Sort Key: public.objekt_objekt.id2_objekt -> Hash Left Join (cost=122.77..123.04 rows=2 width=8) (actual time=17.543..17.547 rows=1 loops=1) Hash Cond: ("outer".id2_objekt = "inner".id_objekt) -> Merge Left Join (cost=116.05..116.13 rows=2 width=12) (actual time=16.933..16.936 rows=1 loops=1) Merge Cond: ("outer".id2_objekt = "inner".id_objekt) -> Sort (cost=109.11..109.12 rows=2 width=12) (actual time=16.622..16.623 rows=1 loops=1) [etc etc etc] ---------------------------------------------------------------------------------------------------- Any idea? Additional Information: I'm JOINing exclusively on PKeys/FKeys and the WHERE clause also only uses PKeys (these should have indexes automatically, right?) I did VACUUM ANALYZE on both machines, didn't help. -- e-Trolley Sayegh & John, Nabil Sayegh Tel.: 0700 etrolley /// 0700 38765539 Fax.: +49 69 8299381-8 PGP : http://www.e-trolley.de
On Tue, 6 Apr 2004, Nabil Sayegh wrote: > When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output: > ---------------------------------------------------------------------------------------------------- > Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1) > Now I tested it with 7.4 to see if gets faster, but guess what? > It's about 40 times slower(!): > > ---------------------------------------------------------------------------------------------------- > Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1) Hmm, this actual time number seems about 1/2 the above unless I'm missing something (34 vs 19). Is it possible you missed some lines or something while posting?
Stephan Szabo wrote: > On Tue, 6 Apr 2004, Nabil Sayegh wrote: > > >>When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output: >>---------------------------------------------------------------------------------------------------- >> Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1) > > >>Now I tested it with 7.4 to see if gets faster, but guess what? >>It's about 40 times slower(!): >> >>---------------------------------------------------------------------------------------------------- >> Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1) > > > Hmm, this actual time number seems about 1/2 the above unless I'm missing > something (34 vs 19). Is it possible you missed some lines or something > while posting? I posted only the first few lines (it's repeating). But I found the solution/problem: If I use JOIN instead of LEFT OUTER JOIN it is on 7.4 as fast as on 7.3 (At the moment there are no NULL values, so for the moment I can live without the LEFT OUTER JOIN). But how can it be, that the query with LEFT OUTER JOIN has the same speed on 7.3 but is 40 times slower on 7.4? TFYH -- e-Trolley Sayegh & John, Nabil Sayegh Tel.: 0700 etrolley /// 0700 38765539 Fax.: +49 69 8299381-8 PGP : http://www.e-trolley.de
Nabil Sayegh <postgresql@e-trolley.de> writes: > When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output: > ---------------------------------------------------------------------------------------------------- > Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1) > Now I tested it with 7.4 to see if gets faster, but guess what? > It's about 40 times slower(!): > Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1) You're showing actual time of 18 msec vs 34 msec in 7.3, which doesn't work out to 40 times slower on my calculator. If there's a problem here you're not showing it. regards, tom lane
On Tue, 6 Apr 2004, Nabil Sayegh wrote: > Stephan Szabo wrote: > > On Tue, 6 Apr 2004, Nabil Sayegh wrote: > > > > > >>When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output: > >>---------------------------------------------------------------------------------------------------- > >> Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1) > > > > > >>Now I tested it with 7.4 to see if gets faster, but guess what? > >>It's about 40 times slower(!): > >> > >>---------------------------------------------------------------------------------------------------- > >> Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1) > > > > > > Hmm, this actual time number seems about 1/2 the above unless I'm missing > > something (34 vs 19). Is it possible you missed some lines or something > > while posting? > > I posted only the first few lines (it's repeating). One other possibility is that the planning is taking a really long time. How long do explain (non-analyze) on the two systems take? And how many joins are there precisely?
Tom Lane wrote: > You're showing actual time of 18 msec vs 34 msec in 7.3, which doesn't > work out to 40 times slower on my calculator. If there's a problem here > you're not showing it. Sorry for the confusion with 40 times slower I meant the execution of my app. due to this 1 query. To clarify I modified the query: 7.3 with LEFT OUTER JOIN: ------------------------------------- $ time psql test2 -f query.sql ?column? ---------- (1 row) real 0m0.079s user 0m0.000s sys 0m0.010s ------------------------------------- 7.4. with LEFT OUTER JOIN: ------------------------------------- $ time psql test2 -f query.sql ?column? ---------- (1 Zeile) real 0m3.256s user 0m0.038s sys 0m0.026s 3.256 / 0.079 = 41 (That's what I meant) 7.3 without LEFT OUTER JOIN --------------------------------------- $ time psql test2 -f query.sql ?column? ---------- (1 row) real 0m0.072s = same as with LEFT OUTER JOIN user 0m0.000s sys 0m0.020s --------------------------------------- 7.4 without LEFT OUTER JOIN $ time psql test2 -f query.sql ?column? ---------- (1 Zeile) real 0m0.149s user 0m0.035s sys 0m0.020s ---------------------------------------- The query was (with LEFT OUTER JOIN): SELECT NULL FROM objekt o JOIN ( SELECT id_objekt, id2_objekt FROM objekt_objekt JOIN meta USING ( id_meta ) WHERE id_meta = 91 ) vater ON ( vater.id_objekt = o.id_objekt ) JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta USING ( id_meta ) WHERE id_meta = 110 ) Kapsel ON ( kapsel.id_objekt = o.id_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta USING ( id_meta ) WHERE id_meta=84 ) anzeige ON ( anzeige.id_objekt=o.id_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_datum, datum_fld FROM datum JOIN meta USING ( id_meta ) WHERE id_meta=73 ) anzeige_datum_display_von ON ( anzeige_datum_display_von.id_objekt=anzeige.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_datum, datum_fld FROM datum JOIN meta USING ( id_meta ) WHERE id_meta=74 ) anzeige_datum_display_bis ON ( anzeige_datum_display_bis.id_objekt=anzeige.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_tf, tf FROM tf JOIN meta USING ( id_meta ) WHERE id_meta=75 ) anzeige_enabled ON ( anzeige_enabled.id_objekt=anzeige.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta USING ( id_meta ) WHERE id_meta=35 ) headline ON ( headline.id_objekt=o.id_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta USING ( id_meta ) WHERE id_meta=72 ) headline_CSS ON ( headline_CSS.id_objekt=headline.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE id_meta=16 ) headline_CSS_regeln ON ( headline_CSS_regeln.id_objekt=headline_CSS.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE id_meta=144 ) headline_CSS_hover ON ( headline_CSS_hover.id_objekt=headline_CSS.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE id_meta=71 ) headline_txt ON ( headline_txt.id_objekt=headline.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta USING ( id_meta ) WHERE id_meta=30 ) fliesstext ON ( fliesstext.id_objekt=o.id_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta USING ( id_meta ) WHERE id_meta=72 ) fliesstext_CSS ON ( fliesstext_CSS.id_objekt=fliesstext.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE id_meta=16 ) fliesstext_CSS_regeln ON ( fliesstext_CSS_regeln.id_objekt=fliesstext_CSS.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE id_meta=144 ) fliesstext_CSS_hover ON ( fliesstext_CSS_hover.id_objekt=fliesstext_CSS.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE id_meta=71 ) fliesstext_txt ON ( fliesstext_txt.id_objekt=fliesstext.id2_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_tf, tf FROM tf JOIN meta USING ( id_meta ) WHERE id_meta=34 ) html_translate ON ( html_translate.id_objekt=o.id_objekt ) LEFT OUTER JOIN ( SELECT id_objekt, id_flt, flt FROM flt JOIN meta USING ( id_meta ) WHERE id_meta=33 ) pos_y ON ( pos_y.id_objekt=o.id_objekt ) WHERE o.id_objekt=26377; Again, sorry for the confusion, I hope now I made it a bit clearer. TFYH -- e-Trolley Sayegh & John, Nabil Sayegh Tel.: 0700 etrolley /// 0700 38765539 Fax.: +49 69 8299381-8 PGP : http://www.e-trolley.de
On Tue, 6 Apr 2004, Nabil Sayegh wrote: > Stephan Szabo wrote: > > On Tue, 6 Apr 2004, Nabil Sayegh wrote: > > > > > >>When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output: > >>---------------------------------------------------------------------------------------------------- > >> Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1) > > > > > >>Now I tested it with 7.4 to see if gets faster, but guess what? > >>It's about 40 times slower(!): > >> > >>---------------------------------------------------------------------------------------------------- > >> Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1) > > > > > > Hmm, this actual time number seems about 1/2 the above unless I'm missing > > something (34 vs 19). Is it possible you missed some lines or something > > while posting? > > I posted only the first few lines (it's repeating). Like I said, if that's the first line up there, it's saying that 7.4 is twice as fast as 7.3 (at 19 ms vs 34 ms). We don't know where the other 39.5x the cost is going since it isn't into the plan AFAICS.
Stephan Szabo wrote: > One other possibility is that the planning is taking a really long time. > How long do explain (non-analyze) on the two systems take? And how many > joins are there precisely? Yeah, that seems to be it. It takes ~ (real) 3.776s on 7.4 (real) 0.082s on 7.3 measured with bash/time but shouldn't make a difference So the plan is actually better but takes much longer to ... plan, right? -- e-Trolley Sayegh & John, Nabil Sayegh Tel.: 0700 etrolley /// 0700 38765539 Fax.: +49 69 8299381-8 PGP : http://www.e-trolley.de
Nabil Sayegh <postgresql@e-trolley.de> writes: > Stephan Szabo wrote: >> One other possibility is that the planning is taking a really long time. >> How long do explain (non-analyze) on the two systems take? And how many >> joins are there precisely? > Yeah, that seems to be it. > It takes ~ > (real) 3.776s on 7.4 > (real) 0.082s on 7.3 > measured with bash/time but shouldn't make a difference > So the plan is actually better but takes much longer to ... plan, right? Right. In 7.3 your use of JOIN syntax limited the planner's search for good plans; in 7.4 it doesn't do so (by default anyway). You may care to read http://www.postgresql.org/docs/7.4/static/explicit-joins.html and experiment with altering JOIN_COLLAPSE_LIMIT (and/or reordering the joins in your query). regards, tom lane
On Tue, 6 Apr 2004, Nabil Sayegh wrote: > Stephan Szabo wrote: > > > One other possibility is that the planning is taking a really long time. > > How long do explain (non-analyze) on the two systems take? And how many > > joins are there precisely? > > Yeah, that seems to be it. > It takes ~ > (real) 3.776s on 7.4 > (real) 0.082s on 7.3 > measured with bash/time but shouldn't make a difference > > So the plan is actually better but takes much longer to ... plan, right? Yes. You might want to see if lowering join_collapse_limit helps any, I'd suggest trying 0 first and then incrementing it.
Tom Lane wrote: > Right. In 7.3 your use of JOIN syntax limited the planner's search for > good plans; in 7.4 it doesn't do so (by default anyway). You may care > to read http://www.postgresql.org/docs/7.4/static/explicit-joins.html > and experiment with altering JOIN_COLLAPSE_LIMIT (and/or reordering the > joins in your query). Thx all, I'll try that -- e-Trolley Sayegh & John, Nabil Sayegh Tel.: 0700 etrolley /// 0700 38765539 Fax.: +49 69 8299381-8 PGP : http://www.e-trolley.de