Обсуждение: View prevents index
I have a table that holds bidirectional links between objects, and had foolishly assumed that a view I created to simplify access would be rewritten to use the same indexes as the long version. Today I noticed things were rather slow, and I was disappointed to find out it wasn't working as I had expected: Here's a simplified case. The original query: ita_devel=> EXPLAIN SELECT seq FROM links WHERE id1 = 84 AND id1_type = 'pers' ita_devel-> UNION SELECT seq FROM links WHERE id2 = 84 AND id2_type = 'pers'; NOTICE: QUERY PLAN: Unique (cost=6.06..6.07 rows=1 width=4) -> Sort (cost=6.06..6.06 rows=2 width=4) -> Append (cost=0.00..6.05 rows=2 width=4) -> Subquery Scan *SELECT* 1 (cost=0.00..3.02 rows=1 width=4) -> Index Scan using links_id1 on links (cost=0.00..3.02 rows=1 width=4) -> Subquery Scan *SELECT* 2 (cost=0.00..3.03 rows=1 width=4) -> Index Scan using links_id2 on links (cost=0.00..3.03 rows=1 width=4) EXPLAIN Now in order to avoid repeating that UNION all over the place, I tried this view: ita_devel=> CREATE VIEW flat AS ita_devel-> SELECT seq, id1 AS from_id, id1_type AS from_type, ita_devel-> id2 AS to_id, id2_type AS to_type FROM links ita_devel-> UNION SELECT seq, id2 AS from_id, id2_type AS from_type, ita_devel-> id1 AS to_id, id1_type AS to_type FROM links; CREATE ita_devel=> EXPLAIN SELECT seq FROM flat WHERE from_id = 84 AND from_type = 'pers'; NOTICE: QUERY PLAN: Subquery Scan flat (cost=41.18..48.58 rows=59 width=36) -> Unique (cost=41.18..48.58 rows=59 width=36) -> Sort (cost=41.18..41.18 rows=592 width=36) -> Append (cost=0.00..13.92 rows=592 width=36) -> Subquery Scan *SELECT* 1 (cost=0.00..6.96 rows=296 width=36) -> Seq Scan on links (cost=0.00..6.96 rows=296 width=36) -> Subquery Scan *SELECT* 2 (cost=0.00..6.96 rows=296 width=36) -> Seq Scan on links (cost=0.00..6.96 rows=296 width=36) EXPLAIN The result is the same, but no more index scan. There are very few matching records in the table, so this has a real performance impact. I guess maybe I'm expecting too much magic optimization. Is this something it should be able to figure out? -- Christopher Masto Senior Network Monkey NetMonger Communications chris@netmonger.net info@netmonger.net http://www.netmonger.net Free yourself, free your machine, free the daemon -- http://www.freebsd.org/
Christopher Masto <chris+pg-general@netmonger.net> writes: > I guess maybe I'm expecting too much magic optimization. You're expecting the system to transform (SELECT foo UNION SELECT bar) WHERE condition into (SELECT foo WHERE condition) UNION (SELECT bar WHERE condition) It's not immediately obvious to me that these are equivalent, or perhaps I should say it's not clear under what conditions is the transformation legitimate. regards, tom lane
> Christopher Masto <chris+pg-general@netmonger.net> writes: > > I guess maybe I'm expecting too much magic optimization. > > You're expecting the system to transform > > (SELECT foo UNION SELECT bar) WHERE condition > > into > > (SELECT foo WHERE condition) UNION (SELECT bar WHERE condition) > > It's not immediately obvious to me that these are equivalent, or > perhaps I should say it's not clear under what conditions is the > transformation legitimate. Could you tell me in what cases two of above are not equivalent? Or any specification in the standard that refers to this kind of transformation for views? -- Tatsuo Ishii
On Thu, Jul 12, 2001 at 01:39:07PM +0900, Tatsuo Ishii wrote: > > Christopher Masto <chris+pg-general@netmonger.net> writes: > > > I guess maybe I'm expecting too much magic optimization. > > > > You're expecting the system to transform > > > > (SELECT foo UNION SELECT bar) WHERE condition > > > > into > > > > (SELECT foo WHERE condition) UNION (SELECT bar WHERE condition) > > > > It's not immediately obvious to me that these are equivalent, or > > perhaps I should say it's not clear under what conditions is the > > transformation legitimate. > > Could you tell me in what cases two of above are not equivalent? Or > any specification in the standard that refers to this kind of > transformation for views? The first clearly specifies the WHERE filtering should occur after the UNION, whereas the second is just the opposite. I think transforming the first to the second would be problematic due to the column name matching for the where clause (i.e. the user thinks they're matching on the column names that the result set from the union would generate but if it is transformed behind their back, it could fail). Consider: foo ::= { id::int, blurb::varchar } bar ::= { bar_id::int, data::varchar } Are these equivalent? 1. (SELECT foo.id As "Id", foo.blurb As "Stuff" UNION SELECT bar.bar_id, bar.data) WHERE character_length("Stuff") > 80; 2. (SELECT foo.id As "Id", foo.blurb As "Stuff" WHERE character_length ("Stuff") > 80) UNION (SELECT bar.bar_id As "Id", bar.data As "Stuff" WHERE character_length("Stuff") > 80); I guess the second would fail on the alias, but the first should succeed ?? I can't say anything about the specs, but the parser/planner/optimizer would have to be able to fall back to applying the WHERE after the UNION if it couldn't match up column names by splitting the UNION(s) out. I suppose you could try some magic to alias names by position in the select phrase, but that seems difficult. -- Eric G. Miller <egm2@jps.net>
On Wed, Jul 11, 2001 at 11:00:18PM -0700, Eric G. Miller wrote: > 2. (SELECT foo.id As "Id", foo.blurb As "Stuff" > WHERE character_length ("Stuff") > 80) > UNION > (SELECT bar.bar_id As "Id", bar.data As "Stuff" > WHERE character_length("Stuff") > 80); > > I guess the second would fail on the alias, but the first should > succeed ?? What's wrong with that query. Looks fine to me. > I can't say anything about the specs, but the parser/planner/optimizer > would have to be able to fall back to applying the WHERE after the UNION > if it couldn't match up column names by splitting the UNION(s) out. I > suppose you could try some magic to alias names by position in the > select phrase, but that seems difficult. Seems to me that you need to go through the entire union clause and realias all the output fields to their final names, and check that they don't clash. I think that as long as you don't have any aggregates, you should be fine. I certainly can't think of a counter example. Or put it another way, is: cat a.txt b.txt | grep pattern the same as: cat a.txt | grep pattern ; cat b.txt | grep pattern -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > It would be nice if someone came up with a certification system that > actually separated those who can barely regurgitate what they crammed over > the last few weeks from those who command secret ninja networking powers.
On Thu, Jul 12, 2001 at 04:25:47PM +1000, Martijn van Oosterhout wrote: > On Wed, Jul 11, 2001 at 11:00:18PM -0700, Eric G. Miller wrote: > > 2. (SELECT foo.id As "Id", foo.blurb As "Stuff" > > WHERE character_length ("Stuff") > 80) > > UNION > > (SELECT bar.bar_id As "Id", bar.data As "Stuff" > > WHERE character_length("Stuff") > 80); > > > > I guess the second would fail on the alias, but the first should > > succeed ?? > > What's wrong with that query. Looks fine to me. Well. pgsql doesn't handle column aliases in where clauses now... But in the first example, I would expect the relation formed by the inner union to have columns known as "Id" and "Stuff" so the outer WHERE would have to use those names. What would you do otherwise? Use the names from the first select? Clearly that won't transfer well to the WHERE clause in the second scenario without some name substitution. 1. (SELECT id FROM foo UNION SELECT key FROM bar) WHERE <id|key> = 2 ?? 2. (SELECT id FROM foo WHERE id = 2) UNION (SELECT key FROM bar WHERE key = 2) And, I guess you're right about aggregates as well. I still don't see the constructs as being equivalent due to the name changes that occur in a UNION. -- Eric G. Miller <egm2@jps.net>
Tatsuo Ishii <t-ishii@sra.co.jp> writes: >> It's not immediately obvious to me that these are equivalent, or >> perhaps I should say it's not clear under what conditions is the >> transformation legitimate. > Could you tell me in what cases two of above are not equivalent? It may well be OK, I just want to see a rigorous demonstration of it. It *looks* right, but intuition is frequently misleading. Two points that particularly need thought are (a) what about NULLs --- SQL's three-way boolean logic breaks a lot of other things that seem right intuitively; (b) does the same equivalence hold for UNION ALL, INTERSECT, INTERSECT ALL, EXCEPT, EXCEPT ALL? If you think that it's so obvious as not to require any thought, I offer the following counterexample: ... WHERE random() > 0.5; Pushing down a WHERE like this one *will* change the results. That particular case doesn't bother me, but user-defined functions that access other tables might have history-dependent behavior, too. Do we need to allow for that? regards, tom lane