Re: [HACKERS] Another reason to redesign querytree representation

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [HACKERS] Another reason to redesign querytree representation
Дата
Msg-id 1711.932567770@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [HACKERS] Another reason to redesign querytree representation  (Thomas Lockhart <lockhart@alumni.caltech.edu>)
Список pgsql-hackers
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> Thomas, what do you think is needed for outer joins?

> The information on the outer
> join can't be carried by the rte since the same table can appear twice
> in an outer join expression:
>
>   select * from t1 left join t2 using (i)
>                 left join t1 on (i = t1.j);

Is that actually a valid query?  Wouldn't you at least need to rename
one or the other appearance of t1?  (Nitpick, probably, but maybe I
am not understanding something...)

> For a query like
>   select * from t1 left join t2 using (i) where t2.j = 3;
> istm that the outer join must be done before the t2 qualification is
> applied, and that another ordering may produce the wrong result.

It's not immediately obvious what the semantics of that ought to be...
but I agree it makes a difference if you eliminate t2 rows before
rather than after the join.

This looks to me like the syntactic notion is that t1-left-join-t2
is already a single table as far as the rest of the SELECT is 
concerned.  (But what happens if they have column names in common,
other than i?)  In which case you're right, join first then apply
the WHERE condition is presumably what's supposed to happen.

If that's the way it works, I think that an RTE describing the joined
table is the natural way to handle it.  Obviously this would not be
a primitive node; it would have to be some kind of structure of nodes.

> I was thinking of having a new qualification node to carry this info,

You would need a qual clause to carry the join condition (t1.i = t2.i
in your first example, i = t1.j in your second).  This would have to
dangle off a node that represents the specially joined tables, I think.

There's no such thing as a "qualification node"; qual clauses are just
expressions that happen to be in WHERE.  If the "specially joined
tables" node isn't in the RTE then I think we need to invent some new
place to put it.  The WHERE expression isn't a natural place for it.

> From what I understand Bruce to say, the planner/optimizer is allowed
> to try all kinds of permutations of plans, choosing the one with the
> lowest cost. But if the info for the join is carried in a
> qualification node, then the planner/optimizer must know that it can't
> reorder the query as freely as it does now.

Yes, the join order would be forced W.R.T. the outer-joined tables,
at least.

The other alternative we should consider is the notion that the parser
outputs are already a multilevel plan structure, where we'd have a whole
lower plan item representing the outer-join table result.  This might
end up being the same thing as above, since quite possibly the RTE would
be the natural place for the upper plan's link to the lower one.

We need to get Jan involved in this, since this sounds like the same
kind of stuff he's been saying is needed for rules.  In fact Jan
probably ought to be leading the discussion, not me...

> A hashjoin method may be possible for queries which are structured as
> a left outer join; other outer joins will need to use the mergejoin
> method.

I don't see why plain ol' nested loop couldn't be used too.  mergejoin
is not always better than nested loop, or even always feasible.  It
requires the availability of sort operators, for one thing.

> Also, some poorly-qualified outer joins reduce to inner joins,
> and perhaps the optimizer can be smart enough to realize this.

OK, now tell me about inner joins...
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Mike Mascari
Дата:
Сообщение: Re: [HACKERS] psql & query string length
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] psql & query string length