Re: [BUGS] Improper const-evaluation of HAVING with grouping sets and subquery pullup

Поиск
Список
Период
Сортировка
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Heikki" == Heikki Linnakangas <hlinnaka@iki.fi> writes:
>  Heikki> Why are there no "summary" rows with NULLs, despite the ROLLUP?

> Here's why it happens: after substituting for the aliases, you have
> GROUP BY g, rollup(g)
> which is equivalent to
> GROUP BY GROUPING SETS ((g,g), (g))
> which is equivalent to
> GROUP BY GROUPING SETS ((g), (g))

I don't think I buy this explanation, because the plan tree doesn't show
any indication that we're actually folding (g,g) to (g):

regression=# EXPLAIN SELECT g as newalias1, g as newalias3
FROM generate_series(1,3) g
GROUP BY newalias1, ROLLUP(newalias3);                                  QUERY PLAN
--------------------------------------------------------------------------------HashAggregate  (cost=15.00..21.50
rows=400width=8)  Hash Key: g, g  Hash Key: g  ->  Function Scan on generate_series g  (cost=0.00..10.00 rows=1000
width=8)
(4 rows)

regression=# EXPLAIN SELECT g as newalias1, g+0 as newalias3
FROM generate_series(1,3) g
GROUP BY newalias1, ROLLUP(newalias3);                                  QUERY PLAN
--------------------------------------------------------------------------------HashAggregate  (cost=17.50..25.00
rows=400width=8)  Hash Key: g, (g + 0)  Hash Key: g  ->  Function Scan on generate_series g  (cost=0.00..12.50
rows=1000width=8) 
(4 rows)

If these behave differently, why does the plan structure look the same?

I think that Heikki's expectation is the correct one, and the reason the
output looks the way it does is that setrefs.c is dropping the ball
somehow and confusing the two "g" references.  The finished plan has two
identical Var references in the Agg node's output tlist:
      :targetlist (         {TARGETENTRY          :expr             {VAR             :varno 65001             :varattno
1            :vartype 23             :vartypmod -1             :varcollid 0             :varlevelsup 0
:varnoold1             :varoattno 1             :location 7            }         :resno 1          :resname newalias1
      :ressortgroupref 1          :resorigtbl 0          :resorigcol 0          :resjunk false         }
{TARGETENTRY         :expr             {VAR             :varno 65001             :varattno 1             :vartype 23
        :vartypmod -1             :varcollid 0             :varlevelsup 0             :varnoold 1
:varoattno1             :location 23            }         :resno 2          :resname newalias3
:ressortgroupref2          :resorigtbl 0          :resorigcol 0          :resjunk false         }      ) 

The two "g" values are correctly distinguished in the FunctionScan's
tlist, however:
         :targetlist (            {TARGETENTRY             :expr                {VAR                :varno 1
   :varattno 1                :vartype 23                :vartypmod -1                :varcollid 0
:varlevelsup0                :varnoold 1                :varoattno 1                :location 7               }
  :resno 1             :resname <>             :ressortgroupref 1             :resorigtbl 0             :resorigcol 0
         :resjunk false            }            {TARGETENTRY             :expr                {VAR
:varno1                :varattno 1                :vartype 23                :vartypmod -1                :varcollid 0
             :varlevelsup 0                :varnoold 1                :varoattno 1                :location 23
    }            :resno 2             :resname <>             :ressortgroupref 2             :resorigtbl 0
:resorigcol0             :resjunk false            }         ) 

We should have used ressortgroupref matching to prevent this, but without
having checked the code right now, I think that we might only apply that
logic to non-Var tlist entries.  If the Agg output tlist had mentioned
column 2 not column 1 of the child node, I bet we'd get the right answer.

Digression: this seems like another example in which the "same" Var can
represent two different values.  I've had a bee in my bonnet for awhile
that we need to stop doing that, but I'm not entirely sure what to do
instead.  In the case of Vars that might go to null because of an outer
join, we could perhaps fix things by not smashing join alias Vars to their
referents (at least, not till much much later in the planner than we do
now).  That doesn't seem to apply here though.  Perhaps the GROUP BY
operation ought to be understood as providing its own set of output Vars?
That would mean creating an RTE to represent GROUP BY, but maybe that's
not awful.

In the nearer term, maybe we can get a back-patchable fix by being
more careful about ressortgroupref matching.
        regards, tom lane


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

В списке pgsql-bugs по дате отправления:

Предыдущее
От: mdevan.r@gmail.com
Дата:
Сообщение: [BUGS] BUG #14860: pg_read_all_stats and pg_stat_replication
Следующее
От: andrey_shvidky@hotmail.com
Дата:
Сообщение: [BUGS] BUG #14861: Handle syntax_error