Re: Analyzing bug 8049
От | Tom Lane |
---|---|
Тема | Re: Analyzing bug 8049 |
Дата | |
Msg-id | 738.1367193617@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: Analyzing bug 8049 (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Analyzing bug 8049
(Jim Nasby <jim@nasby.net>)
|
Список | pgsql-hackers |
I wrote: > The only alternative I can see is to make a back-patch that just teaches > get_eclass_for_sort_expr() to compute valid nullable_relids for the sort > expression. That's necessary code in any case, and it would fix the > immediately complained-of bug. The thing that scares me is that it is > now clear that equivclass.c is capable of considering two expressions to > be equivalent when they should not be; that is, I'm afraid there are > variants of the problem that would not be cured by such a limited > back-patch. But maybe I should try to create such an example before > proposing the more invasive approach. In thinking about how to compute valid nullable_relids in get_eclass_for_sort_expr, I realized that it is impractical to do it in the current code structure, because computing nullable_relids requires that the joininfo list has been built by deconstruct_jointree, which hasn't been run yet at the time grouping_planner computes the pathkeys for the ORDER BY and related clauses. So I concluded that we need to postpone computation of those pathkeys lists until after deconstruct_jointree. Working through this, I realized that we could, in fact, simply delay computation of all five of group_pathkeys, window_pathkeys, distinct_pathkeys, sort_pathkeys, and query_pathkeys until the point where we currently do canonicalize_all_pathkeys. There isn't anything that looks at those lists in-between. The only reason they're set up by grouping_planner is separation of concerns --- query_planner doesn't really have any business dealing with group_pathkeys for instance. Furthermore, if we delay computation of those lists, then we can get rid of the notion of non-canonical pathkeys altogether: there is noplace that needs to generate a PathKey at all before this point. So that results in a nice conceptual simplification as well as saving a few cycles. Once I'd finished moving the pathkeys calculations, I was a tad surprised to discover that bug 8049 was gone! The reason for this is that now, if an ORDER BY item matches some expression that has another reason to be in an EquivalenceClass, the ORDER BY item will get matched to a pre-existing EC item and so there will be no opportunity for get_eclass_for_sort_expr() to do the wrong thing. If an ORDER BY item doesn't match anything, then it will be created as a new single-item EquivalenceClass, containing arguably-incorrect NULL em_nullable_relids, but there is nothing that will notice that. I remain suspicious that there are or will someday be cases where we actually need to generate valid em_nullable_relids for an ORDER BY item, but in the absence of a demonstrated bug it doesn't seem real prudent to add a bunch of new code here, either in 9.2 or 9.3. Therefore, I think what we ought to do is apply something like the attached patch to 9.2 and 9.3 and call it a day for now. The difficulty with back-patching this patch is that it involves API changes for query_planner() and make_pathkeys_for_sortclauses(), as well as outright removal of canonicalize_pathkeys(). So there's a risk of breaking third-party code that calls any of those functions. I don't have the slightest hesitation about changing these APIs in 9.3, but back-patching them into 9.2 is a bit more nervous-making. It would be trivial to make make_pathkeys_for_sortclauses() ignore its "canonicalize" parameter in 9.2, or maybe better add an Assert (not a runtime test) that the parameter is "true". And we could turn canonicalize_pathkeys() into a no-op function in 9.2. However, I'm not seeing a better alternative to adding the callback parameter to query_planner(). The difficulty is that in order to compute those pathkey lists, we need access to the "tlist" and "activeWindows" values, which heretofore have just been local in grouping_planner(). We could add fields to PlannerInfo to carry them, and then grouping_planner() could just call some new function in planner.c rather than needing an explicit callback argument. But new fields in PlannerInfo are an ABI break that in my judgment would be more risky in 9.2 than changing query_planner()'s signature --- I think it's somewhat unlikely that any plug-ins are calling query_planner() directly. Thoughts? Anybody know of a counterexample to the idea that no plug-ins call query_planner()? regards, tom lane
Вложения
В списке pgsql-hackers по дате отправления: