Обсуждение: View prevents index

Поиск
Список
Период
Сортировка

View prevents index

От
Christopher Masto
Дата:
I have a table that holds bidirectional links between objects, and had
foolishly assumed that a view I created to simplify access would be
rewritten to use the same indexes as the long version.  Today I noticed
things were rather slow, and I was disappointed to find out it wasn't
working as I had expected:

Here's a simplified case.  The original query:

ita_devel=> EXPLAIN SELECT seq FROM links WHERE id1 = 84 AND id1_type = 'pers'
ita_devel->   UNION SELECT seq FROM links WHERE id2 = 84 AND id2_type = 'pers';
NOTICE:  QUERY PLAN:

Unique  (cost=6.06..6.07 rows=1 width=4)
  ->  Sort  (cost=6.06..6.06 rows=2 width=4)
        ->  Append  (cost=0.00..6.05 rows=2 width=4)
              ->  Subquery Scan *SELECT* 1  (cost=0.00..3.02 rows=1 width=4)
                    ->  Index Scan using links_id1 on links  (cost=0.00..3.02 rows=1 width=4)
              ->  Subquery Scan *SELECT* 2  (cost=0.00..3.03 rows=1 width=4)
                    ->  Index Scan using links_id2 on links  (cost=0.00..3.03 rows=1 width=4)

EXPLAIN

Now in order to avoid repeating that UNION all over the place, I tried
this view:

ita_devel=> CREATE VIEW flat AS
ita_devel->         SELECT seq, id1 AS from_id, id1_type AS from_type,
ita_devel->                     id2 AS to_id,   id2_type AS to_type FROM links
ita_devel->   UNION SELECT seq, id2 AS from_id, id2_type AS from_type,
ita_devel->                     id1 AS to_id,   id1_type AS to_type FROM links;
CREATE
ita_devel=> EXPLAIN SELECT seq FROM flat WHERE from_id = 84 AND from_type = 'pers';
NOTICE:  QUERY PLAN:

Subquery Scan flat  (cost=41.18..48.58 rows=59 width=36)
  ->  Unique  (cost=41.18..48.58 rows=59 width=36)
        ->  Sort  (cost=41.18..41.18 rows=592 width=36)
              ->  Append  (cost=0.00..13.92 rows=592 width=36)
                    ->  Subquery Scan *SELECT* 1  (cost=0.00..6.96 rows=296 width=36)
                          ->  Seq Scan on links  (cost=0.00..6.96 rows=296 width=36)
                    ->  Subquery Scan *SELECT* 2  (cost=0.00..6.96 rows=296 width=36)
                          ->  Seq Scan on links  (cost=0.00..6.96 rows=296 width=36)

EXPLAIN

The result is the same, but no more index scan.  There are very few
matching records in the table, so this has a real performance impact.

I guess maybe I'm expecting too much magic optimization.  Is this
something it should be able to figure out?
--
Christopher Masto         Senior Network Monkey      NetMonger Communications
chris@netmonger.net        info@netmonger.net        http://www.netmonger.net

Free yourself, free your machine, free the daemon -- http://www.freebsd.org/

Re: View prevents index

От
Tom Lane
Дата:
Christopher Masto <chris+pg-general@netmonger.net> writes:
> I guess maybe I'm expecting too much magic optimization.

You're expecting the system to transform

    (SELECT foo UNION SELECT bar) WHERE condition

into

    (SELECT foo WHERE condition) UNION (SELECT bar WHERE condition)

It's not immediately obvious to me that these are equivalent, or
perhaps I should say it's not clear under what conditions is the
transformation legitimate.

            regards, tom lane

Re: View prevents index

От
Tatsuo Ishii
Дата:
> Christopher Masto <chris+pg-general@netmonger.net> writes:
> > I guess maybe I'm expecting too much magic optimization.
>
> You're expecting the system to transform
>
>     (SELECT foo UNION SELECT bar) WHERE condition
>
> into
>
>     (SELECT foo WHERE condition) UNION (SELECT bar WHERE condition)
>
> It's not immediately obvious to me that these are equivalent, or
> perhaps I should say it's not clear under what conditions is the
> transformation legitimate.

Could you tell me in what cases two of above are not equivalent? Or
any specification in the standard that refers to this kind of
transformation for views?
--
Tatsuo Ishii

Re: View prevents index

От
"Eric G. Miller"
Дата:
On Thu, Jul 12, 2001 at 01:39:07PM +0900, Tatsuo Ishii wrote:
> > Christopher Masto <chris+pg-general@netmonger.net> writes:
> > > I guess maybe I'm expecting too much magic optimization.
> >
> > You're expecting the system to transform
> >
> >     (SELECT foo UNION SELECT bar) WHERE condition
> >
> > into
> >
> >     (SELECT foo WHERE condition) UNION (SELECT bar WHERE condition)
> >
> > It's not immediately obvious to me that these are equivalent, or
> > perhaps I should say it's not clear under what conditions is the
> > transformation legitimate.
>
> Could you tell me in what cases two of above are not equivalent? Or
> any specification in the standard that refers to this kind of
> transformation for views?

The first clearly specifies the WHERE filtering should occur after the
UNION, whereas the second is just the opposite.  I think transforming
the first to the second would be problematic due to the column name
matching for the where clause (i.e.  the user thinks they're matching on
the column names that the result set from the union would generate but
if it is transformed behind their back, it could fail).

Consider:

foo ::= { id::int,     blurb::varchar }
bar ::= { bar_id::int, data::varchar  }

Are these equivalent?

1. (SELECT foo.id As "Id", foo.blurb As "Stuff"
    UNION
    SELECT bar.bar_id, bar.data)
       WHERE character_length("Stuff") > 80;


2. (SELECT foo.id As "Id", foo.blurb As "Stuff"
       WHERE character_length ("Stuff") > 80)
    UNION
   (SELECT bar.bar_id As "Id", bar.data As "Stuff"
       WHERE character_length("Stuff") > 80);

I guess the second would fail on the alias, but the first should
succeed ??

I can't say anything about the specs, but the parser/planner/optimizer
would have to be able to fall back to applying the WHERE after the UNION
if it couldn't match up column names by splitting the UNION(s) out.  I
suppose you could try some magic to alias names by position in the
select phrase, but that seems difficult.

--
Eric G. Miller <egm2@jps.net>

Re: View prevents index

От
Martijn van Oosterhout
Дата:
On Wed, Jul 11, 2001 at 11:00:18PM -0700, Eric G. Miller wrote:
> 2. (SELECT foo.id As "Id", foo.blurb As "Stuff"
>        WHERE character_length ("Stuff") > 80)
>     UNION
>    (SELECT bar.bar_id As "Id", bar.data As "Stuff"
>        WHERE character_length("Stuff") > 80);
>
> I guess the second would fail on the alias, but the first should
> succeed ??

What's wrong with that query. Looks fine to me.

> I can't say anything about the specs, but the parser/planner/optimizer
> would have to be able to fall back to applying the WHERE after the UNION
> if it couldn't match up column names by splitting the UNION(s) out.  I
> suppose you could try some magic to alias names by position in the
> select phrase, but that seems difficult.

Seems to me that you need to go through the entire union clause and realias
all the output fields to their final names, and check that they don't clash.

I think that as long as you don't have any aggregates, you should be
fine. I certainly can't think of a counter example.

Or put it another way, is:

cat a.txt b.txt | grep pattern

the same as:

cat a.txt | grep pattern ; cat b.txt | grep pattern
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

Re: View prevents index

От
"Eric G. Miller"
Дата:
On Thu, Jul 12, 2001 at 04:25:47PM +1000, Martijn van Oosterhout wrote:
> On Wed, Jul 11, 2001 at 11:00:18PM -0700, Eric G. Miller wrote:
> > 2. (SELECT foo.id As "Id", foo.blurb As "Stuff"
> >        WHERE character_length ("Stuff") > 80)
> >     UNION
> >    (SELECT bar.bar_id As "Id", bar.data As "Stuff"
> >        WHERE character_length("Stuff") > 80);
> >
> > I guess the second would fail on the alias, but the first should
> > succeed ??
>
> What's wrong with that query. Looks fine to me.

Well. pgsql doesn't handle column aliases in where clauses now...

But in the first example, I would expect the relation formed by the
inner union to have columns known as "Id" and "Stuff" so the outer WHERE
would have to use those names.  What would you do otherwise?  Use the
names from the first select?  Clearly that won't transfer well to the
WHERE clause in the second scenario without some name substitution.

1.  (SELECT id FROM foo UNION SELECT key FROM bar) WHERE <id|key> = 2 ??

2.  (SELECT id FROM foo WHERE id = 2)
       UNION
    (SELECT key FROM bar WHERE key = 2)

And, I guess you're right about aggregates as well.  I still don't see
the constructs as being equivalent due to the name changes that occur
in a UNION.

--
Eric G. Miller <egm2@jps.net>

Re: View prevents index

От
Tom Lane
Дата:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> It's not immediately obvious to me that these are equivalent, or
>> perhaps I should say it's not clear under what conditions is the
>> transformation legitimate.

> Could you tell me in what cases two of above are not equivalent?

It may well be OK, I just want to see a rigorous demonstration of it.
It *looks* right, but intuition is frequently misleading.  Two points
that particularly need thought are (a) what about NULLs --- SQL's
three-way boolean logic breaks a lot of other things that seem right
intuitively; (b) does the same equivalence hold for UNION ALL,
INTERSECT, INTERSECT ALL, EXCEPT, EXCEPT ALL?

If you think that it's so obvious as not to require any thought,
I offer the following counterexample:

        ... WHERE random() > 0.5;

Pushing down a WHERE like this one *will* change the results.

That particular case doesn't bother me, but user-defined functions
that access other tables might have history-dependent behavior,
too.  Do we need to allow for that?

            regards, tom lane