I'm fairly sure I thought it wouldn't matter because of the Param de-duplication done in paramassign.c. However, Richard's example shows that's not so, because process_subquery_nestloop_params is picky about the param ID assigned to a particular Var while replace_nestloop_params is not. So flipping the order makes sense. I'd change the comment though, maybe to
/* * Replace any outer-relation variables with nestloop params. * * We must provide nestloop params for both lateral references of * the subquery and outer vars in the scan_clauses. It's better * to assign the former first, because that code path requires * specific param IDs, while replace_nestloop_params can adapt * to the IDs assigned by process_subquery_nestloop_params. * This avoids possibly duplicating nestloop params when the * same Var is needed for both reasons. */
+1. It's much better.
However ... it seems like we're not out of the woods yet. Why is Richard's proposed test case still showing
Seems like there is missing de-duplication logic, or something.
When we collect the cache keys in paraminfo_get_equal_hashops() we search param_info's ppi_clauses as well as innerrel's lateral_vars for outer expressions. We do not perform de-duplication on the collected outer expressions there. In my proposed test case, the same Var 't1.two' appears both in the param_info->ppi_clauses and in the innerrel->lateral_vars, so we see two identical cache keys in the plan. I noticed this before and wondered if we should do de-duplication on the cache keys, but somehow I did not chase this to the ground.