Re: BUG #15795: ERROR: could not find pathkey item to sort
От | Amit Langote |
---|---|
Тема | Re: BUG #15795: ERROR: could not find pathkey item to sort |
Дата | |
Msg-id | da30ee93-1655-2220-f413-29017c7606a5@lab.ntt.co.jp обсуждение исходный текст |
Ответ на | Re: BUG #15795: ERROR: could not find pathkey item to sort (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: BUG #15795: ERROR: could not find pathkey item to sort
(Tom Lane <tgl@sss.pgh.pa.us>)
|
Список | pgsql-bugs |
On 2019/05/08 19:33, Tom Lane wrote: > For the archives, though, this isn't hard to reproduce: > > regression=# create table person(_id int, _actual_type_name text, name text); > CREATE TABLE > regression=# SELECT DISTINCT A._id0 as _id0, A._actual_type_name0 as _actual_type_name0 > FROM ( ( SELECT DISTINCT _id as _id0, _actual_type_name as > _actual_type_name0, name as name0 FROM person ) union all ( > SELECT DISTINCT _id as _id0, _actual_type_name as _actual_type_name0, name > as name0 FROM person)) as A WHERE ( A.name0 = A.name0 ); > ERROR: could not find pathkey item to sort > > Curiously, this only fails for me in 9.6 and 10, not earlier or later > branches. Here are my findings after debugging this on 9.5, 9.6, 10, and 11: As you also said, the problem only seems to occur in 9.6 and 10. It does because the MergeAppend path created for UNION ALL subquery in the outer query's FROM contains one pathkey item too many in its pathkeys field. That's because each of its child subpaths has pathkeys (_id, _actual_type_name, name), whereas the outer query only wants (_id, _actual_type_name). I hoped that convert_subquery_pathkeys() called during child subquery's path creation would've taken the unnecessary "name" out because the outer query doesn't need it, but it didn't. It fails to do so because the subquery's pathkey "name" is being matched successfully to an outer EC installed due to the outer query's WHERE (A.name0 = A.name0); that's via the following code in convert_subquery_pathkeys(): outer_ec = get_eclass_for_sort_expr(root, outer_expr, I checked if and how convert_subquery_pathkeys() determines which of the subquery's pathkeys are useful to the outer query and there does exist a scoring system to evaluate the usefulness of subquery's pathkeys to outer query, but I didn't fully understand it. In any case, the main problem seems to be that convert_subquery_pathkeys() can't keep "name" from appearing in the output pathkeys that it produces. Based on that premise, I added the following code to convert_subquery_pathkeys(): + + if (retvallen == outer_query_keys) + break; which seems to fix the issue. Alternatively, maybe we can apply truncate_useless_pathkeys() to the result of convert_subquery_pathkeys(). The problem doesn't manifest with 9.5 or 11 (even HEAD for that matter), because a sort-based path is not chosen in their case for unique-fying (UNION ALL uses Append, not MergeAppend on cost grounds), so there's no attempt to look for "pathkey item to sort" to begin with. In 11's (and HEAD's) case, even if MergeAppend had won in terms of costing, the problem wouldn't occur at least for this query, because "name" doesn't appear in the outer query's ECs due to the following commit: commit 8ec5429e2f422f4d570d4909507db0d4ca83bbac Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Sun Oct 8 12:23:32 2017 -0400 Reduce "X = X" to "X IS NOT NULL", if it's easy to do so. However it's easy to tweak the query such that "name" *will* end up in outer query's ECs as follows: SELECT DISTINCT A._id0 as _id0, A._actual_type_name0 as _actual_type_name0 FROM ( ( SELECT DISTINCT _id as _id0, _actual_type_name as _actual_type_name0, name as name0 FROM person ) union all ( SELECT DISTINCT _id as _id0, _actual_type_name as _actual_type_name0, name as name0 FROM person)) as A INNER JOIN person A1 ON ( A.name0 = A1.name ); Now, A.name0 and A1.name in the join condition do form a an EC, which does trick convert_subquery_pathkeys() into accepting the child subqueries' "name" key. IOW, if the "fix" I mentioned above is correct, it will have to applied to all the branches, because this seems to be a fundamental problem with convert_subquery_pathkeys(). Thoughts? Thanks, Amit
В списке pgsql-bugs по дате отправления: