Re: Analyzing bug 8049

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Analyzing bug 8049
Дата
Msg-id 738.1367193617@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Analyzing bug 8049  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Analyzing bug 8049  (Jim Nasby <jim@nasby.net>)
Список pgsql-hackers
I wrote:
> The only alternative I can see is to make a back-patch that just teaches
> get_eclass_for_sort_expr() to compute valid nullable_relids for the sort
> expression.  That's necessary code in any case, and it would fix the
> immediately complained-of bug.  The thing that scares me is that it is
> now clear that equivclass.c is capable of considering two expressions to
> be equivalent when they should not be; that is, I'm afraid there are
> variants of the problem that would not be cured by such a limited
> back-patch.  But maybe I should try to create such an example before
> proposing the more invasive approach.

In thinking about how to compute valid nullable_relids in
get_eclass_for_sort_expr, I realized that it is impractical to do it
in the current code structure, because computing nullable_relids
requires that the joininfo list has been built by deconstruct_jointree,
which hasn't been run yet at the time grouping_planner computes the
pathkeys for the ORDER BY and related clauses.  So I concluded that
we need to postpone computation of those pathkeys lists until after
deconstruct_jointree.

Working through this, I realized that we could, in fact, simply delay
computation of all five of group_pathkeys, window_pathkeys,
distinct_pathkeys, sort_pathkeys, and query_pathkeys until the point
where we currently do canonicalize_all_pathkeys.  There isn't anything
that looks at those lists in-between.  The only reason they're set up by
grouping_planner is separation of concerns --- query_planner doesn't
really have any business dealing with group_pathkeys for instance.
Furthermore, if we delay computation of those lists, then we can get rid
of the notion of non-canonical pathkeys altogether: there is noplace
that needs to generate a PathKey at all before this point.  So that
results in a nice conceptual simplification as well as saving a few
cycles.

Once I'd finished moving the pathkeys calculations, I was a tad
surprised to discover that bug 8049 was gone!  The reason for this is
that now, if an ORDER BY item matches some expression that has another
reason to be in an EquivalenceClass, the ORDER BY item will get matched
to a pre-existing EC item and so there will be no opportunity for
get_eclass_for_sort_expr() to do the wrong thing.  If an ORDER BY item
doesn't match anything, then it will be created as a new single-item
EquivalenceClass, containing arguably-incorrect NULL em_nullable_relids,
but there is nothing that will notice that.

I remain suspicious that there are or will someday be cases where we
actually need to generate valid em_nullable_relids for an ORDER BY item,
but in the absence of a demonstrated bug it doesn't seem real prudent to
add a bunch of new code here, either in 9.2 or 9.3.  Therefore, I think
what we ought to do is apply something like the attached patch to 9.2
and 9.3 and call it a day for now.

The difficulty with back-patching this patch is that it involves API
changes for query_planner() and make_pathkeys_for_sortclauses(), as well
as outright removal of canonicalize_pathkeys().  So there's a risk of
breaking third-party code that calls any of those functions.  I don't
have the slightest hesitation about changing these APIs in 9.3, but
back-patching them into 9.2 is a bit more nervous-making.

It would be trivial to make make_pathkeys_for_sortclauses() ignore
its "canonicalize" parameter in 9.2, or maybe better add an Assert
(not a runtime test) that the parameter is "true".  And we could
turn canonicalize_pathkeys() into a no-op function in 9.2.  However,
I'm not seeing a better alternative to adding the callback parameter
to query_planner().  The difficulty is that in order to compute those
pathkey lists, we need access to the "tlist" and "activeWindows" values,
which heretofore have just been local in grouping_planner().  We could
add fields to PlannerInfo to carry them, and then grouping_planner()
could just call some new function in planner.c rather than needing an
explicit callback argument.  But new fields in PlannerInfo are an ABI
break that in my judgment would be more risky in 9.2 than changing
query_planner()'s signature --- I think it's somewhat unlikely that
any plug-ins are calling query_planner() directly.

Thoughts?  Anybody know of a counterexample to the idea that no plug-ins
call query_planner()?

            regards, tom lane


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Remaining beta blockers
Следующее
От: Josh Berkus
Дата:
Сообщение: ALTER DEFAULT PRIVILEGES FOR ROLE is broken