Re: [HACKERS] Re: subselects

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: [HACKERS] Re: subselects
Дата
Msg-id 199801121424.JAA02440@candle.pha.pa.us
обсуждение исходный текст
Ответ на Re: [HACKERS] Re: subselects  ("Vadim B. Mikheev" <vadim@sable.krasnoyarsk.su>)
Список pgsql-hackers
>
> Bruce Momjian wrote:
> >
> > We need a new Node structure, call it Sublink:
> >
> >         int     linkType        (IN, NOTIN, ANY, EXISTS, OPERATOR...)
> >         Oid     operator        /* subquery must return single row */
> >         List    *lefthand;      /* parent stuff */
> >         Node    *subquery;      /* represents nodes from parser */
> >         Index   Subindex;       /* filled in to index Query->subqueries */
>
> Ok, I agreed that it's better to have new node and don't put subquery stuff
> into Expr node.
>
> int linkType
>         is one of EXISTS, ANY, ALL, EXPR. EXPR is for the case of expression
>         subqueries (following Sybase naming) which must return single row -
>         (a, b, c) = (subquery).
>         Note again, that there are no linkType for IN and NOTIN here.
>         User' IN and NOT IN must be converted to = ANY and <> ALL by parser.
>
> We need not in Oid operator! In all cases we need in
>
> List *oper
>         list of Oper nodes for each of a, b, c, ... and operator (=, ...)
>         corresponding to data type of a, b, c, ...
>
> List *lefthand
>         is list of Var/Const nodes - representation of (a, b, c, ...)

I see, the opoids would be different for '=' if different variable types
are used in (a,b,c) in (subselect).  Got it.

>
> What is Node *subquery ?
> In optimizer we need either in Subindex (to get subquery from Query->subqueries
> when beeing in Sublink) or in Node *subquery inside Sublink itself.
> BTW, after some thought I don't see how Query->subqueries will be usefull.
> So, may be just add bool hassubqueries to Query (and Query *parentQuery)
> and use Query *subquery in Sublink, but not subindex ?

OK, I originally created it because the parser would have trouble
filling in a List* field in SelectStmt while it was parsing a WHERE
clause.  I decided to just stick the SelectStmt* into Sublink->subquery.

While we are going through the parse output to fill in the Query*, I
thought we should move the actual subquery parse output to a separate
place, and once the Query* was completed, spin through the saved
subquery parse list and stuff Query->subqueries with a list of Query*
for the subqueries.  I thought this would be easier, because we would
then have all the subqueries in a nice list that we can manage easier.

In fact, we can fill Query->subqueries with SelectStmt* as we process
the WHERE clause, then convert them to Query* at the end.

If you would rather keep the subquery Query* entries in the Sublink
structure, we can do that.  The only issue I see is that when you want
to get to them, you have to wade through the WHERE clause to find them.
For example, we will have to run the subquery Query* through the rewrite
system.  Right now, for UNION, I have a nice union List* in Query, and I
just spin through it in postgres.c for each Union query.  If we keep the
subquery Query* inside Sublink, we have to have some logic to go through
and find them.

If we just have an Index in Sublink to the Query->subqueries, we can use
the nth() macro to find them quite easily.

But it is up to you.  I really don't know how you are going to handle
things like:

    select *
    from taba
    where x = 3 and y = 5 and (z=6 or q in (select g from tabb ))

My logic was to break the problem down to single queries as much as
possible, so we would be breaking the problem up into pieces.  Whatever
is easier for you.

>
> >
> > Also, when parsing the subqueries, we need to keep track of correlated
> > references.  I recommend we add a field to the Var structure:
> >
> >         Index   sublevel;       /* range table reference:
> >                                    = 0  current level of query
> >                                    < 0  parent above this many levels
> >                                    > 0  index into subquery list
> >                                  */
> >
> > This way, a Var node with sublevel 0 is the current level, and is true
> > in most cases.  This helps us not have to change much code.  sublevel =
> > -1 means it references the range table in the parent query. sublevel =
> > -2 means the parent's parent. sublevel = 2 means it references the range
> > table of the second entry in Query->subqueries.  Varno and varattno are
> > still meaningful.  Of course, we can't reference variables in the
> > subqueries from the parent in the parser code, but Vadim may want to.
>                                                      ^^^^^^^^^^^^^^^^^
> No. So, just use sublevel >= 0: 0 - current level, 1 - one level up, ...
> sublevel is for optimizer only - executor will not use it.

OK, if you don't need to reference range tables DOWN in subqueries, we
can use positive numbers.

> > When doing a Var lookup in the parser, we look in the current level
> > first, but if not found, if it is a subquery, we can look at the parent
> > and parent's parent to set the sublevel, varno, and varatno properly.
> >
> > We create no phantom range table entries in the subquery, and no phantom
> > target list entries.   We can leave that all for the upper optimizer.
>
> Ok.
>
> Vadim
>


--
Bruce Momjian
maillist@candle.pha.pa.us

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: [HACKERS] Re: subselects
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: [HACKERS] Re: subselects