Обсуждение: Almost there on column aliases

Поиск
Список
Период
Сортировка

Almost there on column aliases

От
Thomas Lockhart
Дата:
I've got most of the regression tests running, but one of the rules
tests has uncovered a problem in my code, at least for a query
involving a merge join.

Could someone run a "-d 99" query using the following from the
regression test (rules.sql):

select rtest_t2.a, rtest_t3.bfrom rtest_t2, rtest_t3where rtest_t2.a = rtest_t3.a;

and send me the query, the rewritten query, and the plan emitted by
the backend (it should be a MERGEJOIN plan)? It might speed up my
rummaging around for the reason for the failure :(

Another possibility is that I submit/commit my patches (there are
quite a few files touched and I *really* want to get them off of my
system and into the tree soon) but I was a bit hesitant to commit
something with a known problem of this nature.

TIA
                      - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Almost there on column aliases

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Could someone run a "-d 99" query using the following from the
> regression test (rules.sql):
> select rtest_t2.a, rtest_t3.b
>  from rtest_t2, rtest_t3
>  where rtest_t2.a = rtest_t3.a;
> and send me the query, the rewritten query, and the plan emitted by
> the backend (it should be a MERGEJOIN plan)? It might speed up my
> rummaging around for the reason for the failure :(

This doesn't look very detailed, is it really what you wanted?

StartTransactionCommand
query: explain
select rtest_t2.a, rtest_t3.bfrom rtest_t2, rtest_t3where rtest_t2.a = rtest_t3.a
parser outputs:

{ QUERY :command 5  :utility ?  :resultRelation 0 :into <> :isPortal false :isBinary false :isTemp false :unionall
false:distinctClause <> :sortClause <> :rtable <> :targetlist <> :qual <> :groupClause <> :havingQual <> :hasAggs false
:hasSubLinksfalse :unionClause <> :intersectClause <> :limitOffset <> :limitCount <> :rowMark <>}
 

after rewriting:
{ QUERY   :command 5    :utility ?    :resultRelation 0   :into <>   :isPortal false   :isBinary false   :isTemp false
:unionall false   :distinctClause <>   :sortClause <>   :rtable <>   :targetlist <>   :qual <>   :groupClause <>
:havingQual<>   :hasAggs false   :hasSubLinks false   :unionClause <>   :intersectClause <>   :limitOffset <>
:limitCount<>   :rowMark <>  }
 

ProcessUtility: explain
select rtest_t2.a, rtest_t3.bfrom rtest_t2, rtest_t3where rtest_t2.a = rtest_t3.a
NOTICE:  QUERY PLAN:

Merge Join  (cost=164.66 rows=10000 width=12) ->  Sort  (cost=69.83 rows=1000 width=8)       ->  Seq Scan on rtest_t3
(cost=20.00rows=1000 width=8) ->  Sort  (cost=69.83 rows=1000 width=4)       ->  Seq Scan on rtest_t2  (cost=20.00
rows=1000width=4)
 

CommitTransactionCommand

> Another possibility is that I submit/commit my patches (there are
> quite a few files touched and I *really* want to get them off of my
> system and into the tree soon) but I was a bit hesitant to commit
> something with a known problem of this nature.

Any changes in backend/optimizer/ ?  I've got a bunch of uncommitted
changes there myself.
        regards, tom lane


Re: [HACKERS] Almost there on column aliases

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> This doesn't look very detailed, is it really what you wanted?

> Hmm. I expected to get a full plan (labeled "plan:"). Did you do the
> query or just an "explain"?

I'm sorry, just did the "explain".  Is this any better?

StartTransactionCommand
query: select rtest_t2.a, rtest_t3.bfrom rtest_t2, rtest_t3where rtest_t2.a = rtest_t3.a
parser outputs:

{ QUERY :command 1  :utility <> :resultRelation 0 :into <> :isPortal false :isBinary false :isTemp false :unionall
false:distinctClause <> :sortClause <> :rtable ({ RTE :relname rtest_t2 :refname rtest_t2 :relid 404330 :inh false
:inFromCltrue :inJoinSet true :skipAcl false} { RTE :relname rtest_t3 :refname rtest_t3 :relid 404340 :inh false
:inFromCltrue :inJoinSet true :skipAcl false}) :targetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 23
:restypmod-1 :resname a :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 1
:vartype23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 1}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 23
:restypmod-1 :resname b :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 2 :varattno 2
:vartype23 :vartypmod -1  :varlevelsup 0 :varnoold 2 :varoattno 2}}) :qual { EXPR :typeOid 16  :opType op :oper { OPER
:opno96 :opid 0 :opresulttype 16 } :args ({ VAR :varno 1 :v!
 
!
arattno 1 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 1} { VAR :varno 2 :varattno 1 :vartype 23
:vartypmod-1  :varlevelsup 0 :varnoold 2 :varoattno 1})} :groupClause <> :havingQual <> :hasAggs false :hasSubLinks
false:unionClause <> :intersectClause <> :limitOffset <> :limitCount <> :rowMark <>}
 

after rewriting:
{ QUERY   :command 1    :utility <>   :resultRelation 0   :into <>   :isPortal false   :isBinary false   :isTemp false
:unionall false   :distinctClause <>   :sortClause <>   :rtable (     { RTE      :relname rtest_t2      :refname
rtest_t2     :relid 404330      :inh false      :inFromCl true      :inJoinSet true      :skipAcl false     }        {
RTE     :relname rtest_t3      :refname rtest_t3      :relid 404340      :inh false      :inFromCl true      :inJoinSet
true     :skipAcl false     }  )     :targetlist (     { TARGETENTRY      :resdom         { RESDOM         :resno 1
   :restype 23         :restypmod -1         :resname a         :reskey 0         :reskeyop 0         :ressortgroupref
0        :resjunk false         }           :expr         { VAR         :varno 1         :varattno 1         :vartype
23        :vartypmod -1          :varlevelsup 0         :varnoold 1         :varoattno 1        }     }        {
TARGETENTRY     :resdom         { RESDOM         :resno 2         :restype 23         :restypmod -1         :resname b
      :reskey 0         :reskeyop 0         :ressortgroupref 0         :resjunk false         }           :expr
{VAR         :varno 2         :varattno 2         :vartype 23         :vartypmod -1          :varlevelsup 0
:varnoold2         :varoattno 2        }     }  )     :qual      { EXPR      :typeOid 16       :opType op      :oper
    { OPER         :opno 96         :opid 0         :opresulttype 16         }           :args (        { VAR
:varno1         :varattno 1         :vartype 23         :vartypmod -1          :varlevelsup 0         :varnoold 1
 :varoattno 1        }              { VAR         :varno 2         :varattno 1         :vartype 23         :vartypmod
-1         :varlevelsup 0         :varnoold 2         :varoattno 1        }     )     }     :groupClause <>
:havingQual<>   :hasAggs false   :hasSubLinks false   :unionClause <>   :intersectClause <>   :limitOffset <>
:limitCount<>   :rowMark <>  }
 

plan:

{ MERGEJOIN :cost 164.658 :rows 10000 :width 12 :state <> :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1
:restype23 :restypmod -1 :resname a :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 65000
:varattno1 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 1}} { TARGETENTRY :resdom { RESDOM :resno 2
:restype23 :restypmod -1 :resname b :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 65001
:varattno1 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 2 :varoattno 2}}) :qpqual <> :lefttree { SORT :cost
69.8289:rows 1000 :width 8 :state <> :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 23 :restypmod -1
:resname<> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 2 :varattno 2 :vartype 23
:vartypmod-1  :varlevelsup 0 :varnoold 2 :varoattno 2}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 23 :restypmod
-1:resname <> :reskey 1 :reskeyop 66 :ressortg!
 
!
roupref 0 :resjunk false } :expr { VAR :varno 2 :varattno 1 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 2
:varoattno1}}) :qpqual <> :lefttree { SEQSCAN :cost 20 :rows 1000 :width 8 :state <> :qptargetlist ({ TARGETENTRY
:resdom{ RESDOM :resno 1 :restype 23 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false
}:expr { VAR :varno 2 :varattno 2 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 2 :varoattno 2}} { TARGETENTRY
:resdom{ RESDOM :resno 2 :restype 23 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false
}:expr { VAR :varno 2 :varattno 1 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 2 :varoattno 1}}) :qpqual <>
:lefttree<> :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :scanrelid 2 } :righttree <> :extprm () :locprm
():initplan <> :nprm 0  :nonameid 0 :keycount 1 } :righttree { SORT :cost 69.8289 :rows 1000 :width 4 :state <>
:qptargetlist({ TARGETENTRY :resdom { RESDOM :resno 1 :!
 
!
restype 23 :restypmod -1 :resname <> :reskey 1 :reskeyop 66 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1
:varattno1 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 1}}) :qpqual <> :lefttree { SEQSCAN :cost
20:rows 1000 :width 4 :state <> :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 23 :restypmod -1
:resname<> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 1 :vartype 23
:vartypmod-1  :varlevelsup 0 :varnoold 1 :varoattno 1}}) :qpqual <> :lefttree <> :righttree <> :extprm () :locprm ()
:initplan<> :nprm 0  :scanrelid 1 } :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :nonameid 0 :keycount 1 }
:extprm() :locprm () :initplan <> :nprm 0  :mergeclauses ({ EXPR :typeOid 16  :opType op :oper { OPER :opno 96 :opid 65
:opresulttype16 } :args ({ VAR :varno 65001 :varattno 2 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 2
:varoattno1} { VAR :varno 65000 :varattno 1 :vartype 2!
 
!
3 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 1})})}

ProcessQuery
CommitTransactionCommand

>> Any changes in backend/optimizer/ ?  I've got a bunch of uncommitted
>> changes there myself.

> Not too much. Though I've got a null pointer problem in executor for
> mergejoins and I'm not certain where it is coming from.

Could easy be a planner shortcoming.  Maybe you should commit so we
can get more eyeballs on the problem.

> Here are the files which have changed in the optimizer/ tree:

> [postgres@golem optimizer]$ cvs -q update .
> M prep/prepunion.c
> M util/clauses.c

> The changes are minor; I'm pretty sure I can remerge if you want to
> commit your stuff (at least if your stuff is isolated to the
> backend/optimizer/ part of the tree).

I know I've tromped on your toes in the past weeks, so I'll wait for
you to commit and then merge.  I have no changes in those two files,
but I do have some in the usual-suspect places like nodes/copyfuncs.
        regards, tom lane


Re: [HACKERS] Almost there on column aliases

От
Thomas Lockhart
Дата:
> > Could someone run a "-d 99" query using the following from the
> > regression test (rules.sql):
> This doesn't look very detailed, is it really what you wanted?

Hmm. I expected to get a full plan (labeled "plan:"). Did you do the
query or just an "explain"?

I'm compiling this way, though I don't think that it matters for this:

$ gcc -I../../include -I../../backend   -O2 -m486 -O2 -g -O0
-DUSE_ASSERT_CHECKING -DENABLE_OUTER_JOINS -DEXEC_MERGEJOINDEBUG -Wall
-Wmissing-prototypes -I..   -c copyfuncs.c -o copyfuncs.o

> Any changes in backend/optimizer/ ?  I've got a bunch of uncommitted
> changes there myself.

Not too much. Though I've got a null pointer problem in executor for
mergejoins and I'm not certain where it is coming from. Here are the
files which have changed in the optimizer/ tree:

[postgres@golem optimizer]$ cvs -q update .
M prep/prepunion.c
M util/clauses.c

The changes are minor; I'm pretty sure I can remerge if you want to
commit your stuff (at least if your stuff is isolated to the
backend/optimizer/ part of the tree).
                    - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [PATCHES] Re: [HACKERS] Almost there on column aliases

От
Thomas Lockhart
Дата:
> > Right. I'm looking forward to advice on the right way to do this. The
> > problem is that the introductory character for list structures is
> > *also* the introductory character for plans, so everything blows
> > chunks if I just call nodeRead() from _readAttr().
> Huh?  '{' introduces a node, '(' introduces a list.  See the comments
> I added (not very long ago :-() in read.c.  My guess is that you are
> either emitting the wrong character or have some sort of error in the
> way you call nodeRead.  Nothing obviously wrong in the patch diffs
> though.

The problem I recall is that paren also introduces a "plan", and if
you call nodeRead() it sees the paren and then complains later because
it expects a node label following the paren.

I probably misdiagnosed the behavior, but in any case I'd be *really*
happy if someone wants to put me out of my misery on this one ;)
                  - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Almost there on column aliases

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> I probably misdiagnosed the behavior, but in any case I'd be *really*
> happy if someone wants to put me out of my misery on this one ;)

At least some of your problems are due to confusing list nodes with the
nodes they point to, as in this example from parse_clause.c:

List *
ListTableAsAttrs(ParseState *pstate, char *table);
List *
ListTableAsAttrs(ParseState *pstate, char *table)
{List *rlist = NULL;List *col;
Attr *attr = expandTable(pstate, table, TRUE);foreach(col, attr->attrs){    Attr *a;    a = makeAttr(table,
strVal((Value*) col));    rlist = lappend(rlist, a);}
 
return rlist;
}

I tried, but failed, to refrain from remarking about the horrible
style of the function declaration --- either it's static (which
looks like the right answer here) or it should be declared in
a header file.  The above method of preventing gcc from telling
you how horrible your style is is just, well, never mind.

The more immediate problem is that you want
    a = makeAttr(table, strVal((Value *) lfirst(col)));

I cleaned up a similar error in ruleutils.c, but am too tired to
fix this one or go digging for more.
        regards, tom lane


Re: [HACKERS] Almost there on column aliases

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> The problem I recall is that paren also introduces a "plan", and if
> you call nodeRead() it sees the paren and then complains later because
> it expects a node label following the paren.

> I probably misdiagnosed the behavior, but in any case I'd be *really*
> happy if someone wants to put me out of my misery on this one ;)

Ah-hah, I see it.  nodeRead() expects that simple Value objects
(T_Integer, T_Float, T_String) will be output without any '{' ... '}'
wrapping.  _outNode() was putting them out with wrapping.  Apparently,
you're the first person in a long time (maybe forever) to try to dump
and reload node structures in which these node types appear outside
the context of a Const node.  (outConst calls outValue directly, without
going through outNode, so the bug didn't appear in that case.)

I've fixed _outNode() to suppress the unwanted wrapper for a Value
and removed the now-unnecessary special-case code for Attr lists.

BTW, the rule regress test is presently failing because I modified
ruleutils.c to dump the Attr list if it is not null, rather than
only if the refname is different from the relname:

*** 992,1008 ****                                  quote_identifier(rte->relname),
inherit_marker(rte));                if (strcmp(rte->relname, rte->ref->relname) != 0)
 
-                 {
-                     List *col;                     appendStringInfo(buf, " %s",
quote_identifier(rte->ref->relname));                    appendStringInfo(buf, " (");
 
!                     foreach (col, rte->ref->attrs)                     {
!                         if (col != lfirst(rte->ref->attrs))                             appendStringInfo(buf, ", ");
!                         appendStringInfo(buf, "%s", strVal(col));                     }                 }
}        }
 
--- 992,1012 ----                                  quote_identifier(rte->relname),
inherit_marker(rte));                if (strcmp(rte->relname, rte->ref->relname) != 0)
appendStringInfo(buf," %s",                                      quote_identifier(rte->ref->relname));
 
+                 if (rte->ref->attrs != NIL)
+                 {
+                     List *col;
+                      appendStringInfo(buf, " (");
!                     foreach(col, rte->ref->attrs)                     {
!                         if (col != rte->ref->attrs)                             appendStringInfo(buf, ", ");
!                         appendStringInfo(buf, "%s",
!                                          quote_identifier(strVal(lfirst(col))));                     }
+                     appendStringInfo(buf, ")");                 }             }         }

While this seems like appropriate logic, a bunch of the rule tests are
now showing long and perfectly content-free lists of attribute names in
reverse-listed FROM clauses.  This bothers me because it implies that
those names are being stored in the querytree that's dumped out to
pg_rewrite, which will be a further crimp in people's ability to write
complex rules.  I think we really don't want to store those nodes if we
don't have to.  Why are we building Attr lists when there's no actual
column aliasing being done?
        regards, tom lane


Re: [HACKERS] Almost there on column aliases

От
Thomas Lockhart
Дата:
> Ah-hah, I see it.  nodeRead() expects that simple Value objects
> (T_Integer, T_Float, T_String) will be output without any '{' ... '}'
> wrapping.  _outNode() was putting them out with wrapping.  Apparently,
> you're the first person in a long time (maybe forever) to try to dump
> and reload node structures in which these node types appear outside
> the context of a Const node.  (outConst calls outValue directly, without
> going through outNode, so the bug didn't appear in that case.)
> I've fixed _outNode() to suppress the unwanted wrapper for a Value
> and removed the now-unnecessary special-case code for Attr lists.

Great. Thanks. And I should have committed my garbage earlier rather
than trying to make it work poorly ;)

> BTW, the rule regress test is presently failing because I modified
> ruleutils.c to dump the Attr list if it is not null, rather than
> only if the refname is different from the relname:
> 
> *** 992,1008 ****
>                                    quote_identifier(rte->relname),
>                                    inherit_marker(rte));
>                   if (strcmp(rte->relname, rte->ref->relname) != 0)
> -                 {
> -                     List *col;
>                       appendStringInfo(buf, " %s",
>                                        quote_identifier(rte->ref->relname));
>                       appendStringInfo(buf, " (");
> !                     foreach (col, rte->ref->attrs)
>                       {
> !                         if (col != lfirst(rte->ref->attrs))
>                               appendStringInfo(buf, ", ");
> !                         appendStringInfo(buf, "%s", strVal(col));
>                       }
>                   }
>               }
>           }
> --- 992,1012 ----
>                                    quote_identifier(rte->relname),
>                                    inherit_marker(rte));
>                   if (strcmp(rte->relname, rte->ref->relname) != 0)
>                       appendStringInfo(buf, " %s",
>                                        quote_identifier(rte->ref->relname));
> +                 if (rte->ref->attrs != NIL)
> +                 {
> +                     List *col;
> +
>                       appendStringInfo(buf, " (");
> !                     foreach(col, rte->ref->attrs)
>                       {
> !                         if (col != rte->ref->attrs)
>                               appendStringInfo(buf, ", ");
> !                         appendStringInfo(buf, "%s",
> !                                          quote_identifier(strVal(lfirst(col))));
>                       }
> +                     appendStringInfo(buf, ")");
>                   }
>               }
>           }

istm that the column aliases (rte->ref->attrs) should not be written out
if the table alias (rte->ref->relname) is not written. And the rules
regression test should be failing anyway, because I didn't update it
since I knew that there was something wrong with those plan strings and
I didn't want to hide that.

> While this seems like appropriate logic, a bunch of the rule tests are
> now showing long and perfectly content-free lists of attribute names in
> reverse-listed FROM clauses.  This bothers me because it implies that
> those names are being stored in the querytree that's dumped out to
> pg_rewrite, which will be a further crimp in people's ability to write
> complex rules.  I think we really don't want to store those nodes if we
> don't have to.  Why are we building Attr lists when there's no actual
> column aliasing being done?

Hmm. Because there are multiple places in the parser which needs to get
at a column name. When handling column aliases, I was having to look up
the actual column names anyway to verify that there were the correct
number of aliases specified (actually, I decided to allow any number of
aliases <= the number of actual columns, filling in with the underlying
column names if an alias was not specified) and so while I had the info
I cached it into the RTE structure for later use.

If I make the ref structure optional, then I have to start returning
lists of columns when working out the new join syntax, and I hated to
keep generating a bunch of temporary lists of things. Also, by making
the ref->refname non-optional in the structure, I could stop checking
for its existance before using either it *or* the true table name; this
cleaned up a bit of the code.
                     - Thomas

-- 
Thomas Lockhart
Caltech/JPL
Interferometry Systems and Technology


Re: [HACKERS] Almost there on column aliases

От
Tom Lane
Дата:
>> BTW, the rule regress test is presently failing because I modified
>> ruleutils.c to dump the Attr list if it is not null, rather than
>> only if the refname is different from the relname:

> istm that the column aliases (rte->ref->attrs) should not be written out
> if the table alias (rte->ref->relname) is not written.

Hmm.  If it's not possible to specify column aliases without specifying
a table-name alias, then that's OK ... but I thought table aliases were
optional.

> And the rules
> regression test should be failing anyway, because I didn't update it
> since I knew that there was something wrong with those plan strings and
> I didn't want to hide that.

The weird thing is that I'm pretty sure the rules test was *passing*
(against the present expected file) last night after I made the change
I just quoted.  It wasn't till after I changed the readfuncs/outfuncs
stuff this morning that I started seeing the long lists of column names
in the rules output.

OTOH, "last night" was about 3AM and I was tired.  Maybe I remember it
wrong.

>> While this seems like appropriate logic, a bunch of the rule tests are
>> now showing long and perfectly content-free lists of attribute names in
>> reverse-listed FROM clauses.  This bothers me because it implies that
>> those names are being stored in the querytree that's dumped out to
>> pg_rewrite, which will be a further crimp in people's ability to write
>> complex rules.  I think we really don't want to store those nodes if we
>> don't have to.  Why are we building Attr lists when there's no actual
>> column aliasing being done?

> Hmm. Because there are multiple places in the parser which needs to get
> at a column name. When handling column aliases, I was having to look up
> the actual column names anyway to verify that there were the correct
> number of aliases specified (actually, I decided to allow any number of
> aliases <= the number of actual columns, filling in with the underlying
> column names if an alias was not specified) and so while I had the info
> I cached it into the RTE structure for later use.

Fair enough, but we don't need those column names any more after the
parse/analyze phase completes, right?  Maybe we could remove the lists
at that time, or at least do so before writing out rule querytrees.

Since we aren't going to have TOAST in 7.0, I'm concerned that the
rule representation not get any more verbose than it is already...
        regards, tom lane


Re: [HACKERS] Almost there on column aliases

От
Thomas Lockhart
Дата:
> > istm that the column aliases (rte->ref->attrs) should not be written out
> > if the table alias (rte->ref->relname) is not written.
> Hmm.  If it's not possible to specify column aliases without specifying
> a table-name alias, then that's OK ... but I thought table aliases were
> optional.

I don't think so (ie a table alias is required if a column alias is
specified), but my SQL books are at home so I can't verify my
recollection.

> Fair enough, but we don't need those column names any more after the
> parse/analyze phase completes, right?  Maybe we could remove the lists
> at that time, or at least do so before writing out rule querytrees.

Possibly. I'm transforming the qualifications on the join clause as the
join clause is transformed (rather than later during the WHERE
transformation) in the hope that the column (and table) names will have
been replaced by attribute numbers and RTE indices. If that is the case,
and if the "correlation names" or aliases are never needed after that,
then we can drop 'em.

Except that we'll possibly need them to get a valid pg_dump of the
rules? Or is an untransformed copy of the original definition kept
around someplace??

> Since we aren't going to have TOAST in 7.0, I'm concerned that the
> rule representation not get any more verbose than it is already...

Right.
                        - Thomas

-- 
Thomas Lockhart
Caltech/JPL
Interferometry Systems and Technology


Re: [HACKERS] Almost there on column aliases

От
Tom Lane
Дата:
Thomas Lockhart <Thomas.G.Lockhart@jpl.nasa.gov> writes:
>> Fair enough, but we don't need those column names any more after the
>> parse/analyze phase completes, right?  Maybe we could remove the lists
>> at that time, or at least do so before writing out rule querytrees.

> Except that we'll possibly need them to get a valid pg_dump of the
> rules? Or is an untransformed copy of the original definition kept
> around someplace??

As far as I can tell without having tried it, you'd still get a correct
dump, although it might look different from the original query because
columns would be referred to by their untransformed names (but that'll
happen anyway, unless you go back and change ruleutil.c's way of looking
up column names).  For example, with current sources:

regression=# create view qq as select a from tenk1 t1 (a);
CREATE 276745 1
regression=# \d qq          View "qq"Attribute |  Type   | Modifier
-----------+---------+----------a         | integer |
View definition: SELECT t1.unique1 AS a FROM tenk1 t1 (a, unique2, two, four, ten, twenty, hundred, thousand,
twothousand,fivethous, tenthous, odd, even, stringu1, stringu2, string4);
 

The only "external" view of the alias is as the column title, and notice
that that's getting enforced by an AS clause independently of any
aliases.  (In the querytree, that title is coming from a refname in the
targetlist entry --- we don't need another copy in the RTE to make it
work.)

BTW, I'm practically certain that I tried this same example last night
and got a rule dump of just

SELECT t1.unique1 AS a FROM tenk1 t1 (a);

which is more like what I would expect.  Did you change the behavior
w.r.t. adding additional columns to the alias list just recently, like
since 11PM EST yesterday?
        regards, tom lane

PS: Am I the only one who thinks that column aliases done this way are
extremely brain-dead?  If you write "FROM table alias (a b c)" then
you've just written a query that depends critically and non-obviously
on which columns are first, second, third in the physical table.
One of the few things I know about good SQL style is that you don't
write INSERT without an explicit column list, because such code will
break (possibly without warning) if you insert/delete/rearrange columns
in the table definition.  This alias facility seems to be just another
method of shooting yourself in the foot with that same bullet...


Re: [HACKERS] Almost there on column aliases

От
Thomas Lockhart
Дата:
> > Except that we'll possibly need them to get a valid pg_dump of the
> > rules? Or is an untransformed copy of the original definition kept
> > around someplace??
> As far as I can tell without having tried it, you'd still get a correct
> dump, although it might look different from the original query because
> columns would be referred to by their untransformed names (but that'll
> happen anyway, unless you go back and change ruleutil.c's way of looking
> up column names).  For example, with current sources:
> View definition: SELECT t1.unique1 AS a
>   FROM tenk1 t1 (a, unique2, two, four, ten, twenty, hundred,
>    thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4);
> The only "external" view of the alias is as the column title, and notice
> that that's getting enforced by an AS clause independently of any
> aliases.  (In the querytree, that title is coming from a refname in the
> targetlist entry --- we don't need another copy in the RTE to make it
> work.)

Well, there are other queries which *do* rely on the column aliases:
 select a, b from t1 ta (a, b, c) natural join t2 tb (a, d);

where the column in the target list called "a" is not allowed to have an
explicit reference to a table name. That is, neither
 select t1.a, b from t1 ta (a, b, c) natural join t2 tb (a, d);

nor
 select t2.a, b from t1 ta (a, b, c) natural join t2 tb (a, d);

are legal SQL, but, for example,
 select a, ta.b from t1 ta (a, b, c) natural join t2 tb (a, d);

is. Not sure how this impacts the rule representation or dump/reload of
views.

> BTW, I'm practically certain that I tried this same example last night
> which is more like what I would expect.  Did you change the behavior
> w.r.t. adding additional columns to the alias list just recently, like
> since 11PM EST yesterday?

Yeah right ;)

I've only committed one set of patches; don't remember what time that
was...

> PS: Am I the only one who thinks that column aliases done this way are
> extremely brain-dead?  If you write "FROM table alias (a b c)" then
> you've just written a query that depends critically and non-obviously
> on which columns are first, second, third in the physical table.
> One of the few things I know about good SQL style is that you don't
> write INSERT without an explicit column list, because such code will
> break (possibly without warning) if you insert/delete/rearrange columns
> in the table definition.  This alias facility seems to be just another
> method of shooting yourself in the foot with that same bullet...

It's required for doing complex join syntax, and is allowed for other
joins as well. But we certainly have got along just fine without it, eh?
                         - Thomas

-- 
Thomas Lockhart
Caltech/JPL
Interferometry Systems and Technology


Re: [HACKERS] Almost there on column aliases

От
Thomas Lockhart
Дата:
> I tried, but failed, to refrain from remarking about the horrible
> style of the function declaration --- either it's static (which
> looks like the right answer here) or it should be declared in
> a header file.  The above method of preventing gcc from telling
> you how horrible your style is is just, well, never mind.

Uh, Tom, it is unused code, and I use this kind of thing while doing
development. I did warn about some crufty stuff, and glad you agree :/
                 - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Almost there on column aliases

От
Thomas Lockhart
Дата:
> >> BTW, the rule regress test is presently failing because I modified
> >> ruleutils.c to dump the Attr list if it is not null, rather than
> >> only if the refname is different from the relname:

I'm currently (2000-02-16 15:40 GMT) seeing the rules test
blank-filling the "bpchar" fields. Do you see that?

> > istm that the column aliases (rte->ref->attrs) should not be written out
> > if the table alias (rte->ref->relname) is not written.
> Hmm.  If it's not possible to specify column aliases without specifying
> a table-name alias, then that's OK ... but I thought table aliases were
> optional.

I've just looked it up in the Date book: table aliases are optional in
general, but column aliases require a table alias. The bnf looks like
 table [ [ AS ] range-variable [ ( column-commalist ) ] ]
                       - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Almost there on column aliases

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> I'm currently (2000-02-16 15:40 GMT) seeing the rules test
> blank-filling the "bpchar" fields. Do you see that?

No ...
        regards, tom lane


Re: [HACKERS] Almost there on column aliases

От
Thomas Lockhart
Дата:
> > I'm currently (2000-02-16 15:40 GMT) seeing the rules test
> > blank-filling the "bpchar" fields. Do you see that?

Hmm. Still seeing it; here is a snippet from a diff of
results/rules.out and expected/rules.out:

...
<  rtest_emp     | rtest_emp_ins   | CREATE RULE rtest_emp_insAS ON INSERT TO rtest_emp DOINSERT INTO rtest_emplog
(ename,who, "action", newsal, oldsal)VALUES (new.ename, getpgusername(),'hired     '::bpchar, new.salary,
'$0.00'::money);
...
>  rtest_emp     | rtest_emp_ins   | CREATE RULE rtest_emp_insAS ON INSERT TO rtest_emp DOINSERT INTO rtest_emplog
(ename,who, "action", newsal, oldsal)VALUES (new.ename, getpgusername(),'hired'::bpchar, new.salary, '$0.00'::money);
 
...

But if you are not seeing it, then perhaps my "make clean install"
isn't sufficient; I'll try a clean checkout sometime...
                   - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Almost there on column aliases

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> I've just looked it up in the Date book: table aliases are optional in
> general, but column aliases require a table alias. The bnf looks like

>   table [ [ AS ] range-variable [ ( column-commalist ) ] ]

OK, but that doesn't really solve my concern about rule bloat, because
if you write "FROM table alias", you'll still get a list of column names
appended to that by the system.

Here is a possible answer that I think would address both our concerns:
keep track of how many column aliases the user actually wrote (without
counting the Attr-list entries added by the system for its internal
convenience), and dump only the user-supplied aliases in rule strings.
This'd be easy enough to do with an extra field in Attr nodes.
It'd not only preserve compactness in the cases we previously handled,
but it'd make the reverse-listed display of rules more like the original
query in cases where the user did write some aliases (but not a full
set).
        regards, tom lane


Re: [HACKERS] Almost there on column aliases

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>>>> I'm currently (2000-02-16 15:40 GMT) seeing the rules test
>>>> blank-filling the "bpchar" fields. Do you see that?

> Hmm. Still seeing it; here is a snippet from a diff of
> results/rules.out and expected/rules.out:

> ...
> <  rtest_emp     | rtest_emp_ins   | CREATE RULE rtest_emp_ins
>  AS ON INSERT TO rtest_emp DO
>  INSERT INTO rtest_emplog (ename, who, "action", newsal, oldsal)
>  VALUES (new.ename, getpgusername(),
>  'hired     '::bpchar, new.salary, '$0.00'::money);
> ...
>> rtest_emp     | rtest_emp_ins   | CREATE RULE rtest_emp_ins
>  AS ON INSERT TO rtest_emp DO
>  INSERT INTO rtest_emplog (ename, who, "action", newsal, oldsal)
>  VALUES (new.ename, getpgusername(),
>  'hired'::bpchar, new.salary, '$0.00'::money);
> ...

Oh, I'm sorry, I *am* seeing that.  I don't think this has anything
to do with your changes; the system's been producing pre-padded
strings in those tests for a while now, at least on good days ;-).
If you look closely you'll see that the padded string has just been
pre-coerced to the length of the char() target field.  I don't think
that's wrong.

The difference is normally masked from causing a comparison failure
in the regress tests because we use diff -w to look for differences.
Probably the expected file was last updated at a time when it wasn't
doing that...
        regards, tom lane


Re: [HACKERS] Almost there on column aliases

От
Thomas Lockhart
Дата:
> >>>> I'm currently (2000-02-16 15:40 GMT) seeing the rules test
> >>>> blank-filling the "bpchar" fields. Do you see that?
> > Hmm. Still seeing it; here is a snippet from a diff of
> > results/rules.out and expected/rules.out:
> Oh, I'm sorry, I *am* seeing that.  I don't think this has anything
> to do with your changes; the system's been producing pre-padded
> strings in those tests for a while now, at least on good days ;-).
> If you look closely you'll see that the padded string has just been
> pre-coerced to the length of the char() target field.  I don't think
> that's wrong.

Ah, right; "bpchar" is "blank padded char". But would there be any
downside to removing those blank pads when doing the transformation
back to a printed query? i.e. if the outnode() functions stripped the
padding? Or maybe at that point there is not enough info to do it?

Seems like an ill-advised char(2000) or two in a table might bollux up
a lot of potential rules (even more than my extraneous column aliases
might ;)
                - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Almost there on column aliases

От
Tom Lane
Дата:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> Ah, right; "bpchar" is "blank padded char". But would there be any
> downside to removing those blank pads when doing the transformation
> back to a printed query? i.e. if the outnode() functions stripped the
> padding? Or maybe at that point there is not enough info to do it?

There's not enough info to know whether trailing spaces were inserted
by the system or given by the user.  I'd be pretty uncomfortable with
trying to make outfuncs.c apply a potentially semantics-changing
transformation like that.  It isn't nearly smart enough to do the right
thing at present, and trying to make it smart enough seems like the
wrong direction to go in.

> Seems like an ill-advised char(2000) or two in a table might bollux up
> a lot of potential rules (even more than my extraneous column aliases
> might ;)

Good point... of course people will be hitting other problems besides
rule length with such things, but...

Perhaps the right answer here is that addition of length-coercion
functions to an INSERT or UPDATE's targetlist entries doesn't belong
in the parser, but should be handled downstream of the rule stuff
--- right before the planner's constant-folding step seems like a
good spot.  Then we wouldn't be paying for either the padding (if
the function got constant-folded out) or the function call (if not)
in the stored rule's querytree.

This would also allow ruleutils.c to get rid of some grotty code it has
to try to hide said functions in the reverse-listed query.  Might make
life a little easier for the rule rewriter too, I dunno.
        regards, tom lane


Re: [HACKERS] Almost there on column aliases

От
Thomas Lockhart
Дата:
> OK, but that doesn't really solve my concern about rule bloat, because
> if you write "FROM table alias", you'll still get a list of column names
> appended to that by the system.
> Here is a possible answer that I think would address both our concerns:
> keep track of how many column aliases the user actually wrote (without
> counting the Attr-list entries added by the system for its internal
> convenience), and dump only the user-supplied aliases in rule strings.
> This'd be easy enough to do with an extra field in Attr nodes.
> It'd not only preserve compactness in the cases we previously handled,
> but it'd make the reverse-listed display of rules more like the original
> query in cases where the user did write some aliases (but not a full
> set).

I put the Attr node into the rte because column aliases need to travel
with the table alias. But I'm not sure how the table alias is actually
used after being transformed back and forth for rules. And I'm not
sure how we would use the column aliases beyond the parser in the
future.

How about if I have the rte->ref Attr node hold *only* the column
aliases specified by the user (which addresses your concern), and then
make a "hidden" Attr node (or list of nodes; see below) which is build
and used in the parser but which is never read or written by the
dump/transformation stuff used for rules. So I'll define a new Attr *
field, say "p_ref" which is used internally but ignored after I'm done
with it. I'm not *certain* this will work: I still have issues
regarding outer join syntax which I'm pretty sure are not addressed by
either the status quo or this new suggestion, but at least with a
"hidden field" I'd have some flexibility to muck around with how it is
defined and used.

Also, the "layered aliases" you can get with outer joins are not
handled yet, and I'm pretty sure that more work needs to be done on
the structures to get it to fly at all. e.g.
 SELECT n, m FROM (t1 ta (a, b) OUTER JOIN t2 tb (a, c)) AS tj (n,
m);

cannot currently be transformed properly for rules given the info
available in the existing structures. This is true because there is no
equivalent query which allows you to specify anything like t1, ta, t2,
or tb in the target list, and there is no way currently to carry along
the "tj (n, m)" info.

Comments on any or all?
                    - Thomas

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California


Re: [HACKERS] Almost there on column aliases

От
Bruce Momjian
Дата:
> cannot currently be transformed properly for rules given the info
> available in the existing structures. This is true because there is no
> equivalent query which allows you to specify anything like t1, ta, t2,
> or tb in the target list, and there is no way currently to carry along
> the "tj (n, m)" info.
> 
> Comments on any or all?

It makes my head hurt.  Is that a comment?

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Almost there on column aliases

От
Thomas Lockhart
Дата:
> It makes my head hurt.  Is that a comment?

:)

-- 
Thomas Lockhart                lockhart@alumni.caltech.edu
South Pasadena, California