On 28/10/2023 17:10, Alexander Korotkov wrote:
>>
>> It is a solution. But does it mask the real problem? In my mind, we copy
>> node trees to use them somewhere else or to probe a conjecture. Here, we
>> have two different representations of the same subquery. Setting aside
>> the memory consumption issue, is that correct?
>> It makes sense to apply both options: switch the groups estimation to
>> the subroot targetList and keep one version of the subquery.
>> Attached is the second (combined) version of the change. Here I added
>> assertions to check that root->parse and the incoming query tree are
>> identical.
>
> Andrei, did you read the comment just before the groups estimation, as
> pointed out by Tom [1]?
Yes, and I am a bit confused. Here we use subroot->parse->targetList.
The processed_tlist, where the "varno 0" Vars can appear, is derived from
it but is a different list. As far as I can see, when forming
processed_tlist we build new nodes and don't modify the original
targetList. Am I wrong?
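To make the question concrete, here is a toy model of what I mean. It is
not PostgreSQL code: ToyVar, ToyTlist and the toy_* functions are made-up
names for illustration, and the real planner structures are far richer.
It only sketches the assumption that the "varno 0" list is built as a
separate copy, so the original list keeps its real varnos.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a Var: only the range-table index and column matter. */
typedef struct ToyVar
{
    int varno;
    int varattno;
} ToyVar;

/* Toy stand-in for a target list: a plain array of ToyVar. */
typedef struct ToyTlist
{
    int     len;
    ToyVar *items;
} ToyTlist;

/*
 * Build a NEW list with one "varno 0" entry per source column.
 * The source list is const and is never written to.
 */
static ToyTlist *
toy_make_append_tlist(const ToyTlist *src)
{
    ToyTlist *dst = malloc(sizeof(ToyTlist));

    assert(dst != NULL);
    dst->len = src->len;
    /* Allocate at least one slot so an empty list does not yield NULL. */
    dst->items = calloc((size_t) (src->len > 0 ? src->len : 1),
                        sizeof(ToyVar));
    assert(dst->items != NULL);

    for (int i = 0; i < src->len; i++)
    {
        dst->items[i].varno = 0;        /* the "varno 0" placeholder */
        dst->items[i].varattno = i + 1; /* output column number */
    }
    return dst;
}

/* Return 1 if any entry in the list has varno 0, otherwise 0. */
static int
toy_has_varno_zero(const ToyTlist *tl)
{
    for (int i = 0; i < tl->len; i++)
    {
        if (tl->items[i].varno == 0)
            return 1;
    }
    return 0;
}

/* Free a list returned by toy_make_append_tlist. */
static void
toy_free_tlist(ToyTlist *tl)
{
    free(tl->items);
    free(tl);
}
```

In this model, a caller that runs toy_has_varno_zero() on the original
list never sees the placeholder entries; only the newly built list has
them. Whether the real code path behaves the same way is exactly what I
am asking.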
> * XXX you don't really want to know about this: we do the estimation
> * using the subquery's original targetlist expressions, not the
> * subroot->processed_tlist which might seem more appropriate. The
> * reason is that if the subquery is itself a setop, it may return a
> * processed_tlist containing "varno 0" Vars generated by
> * generate_append_tlist, and those would confuse estimate_num_groups
> * mightily. We ought to get rid of the "varno 0" hack, but that
> * requires a redesign of the parsetree representation of setops, so
> * that there can be an RTE corresponding to each setop's output.
>
> As I understand, it requires much more work to correctly switch the
> groups estimation to subroot targetList.
"Varno 0" is quite an irritating problem that has bitten me many times
before, during the development of the GROUP-BY optimization feature and
elsewhere. I'd be glad to redesign this part of the planner, but I
haven't found an easy way to do that yet.
--
regards,
Andrei Lepikhov
Postgres Professional