Re: pg16: XX000: could not find pathkey item to sort

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: pg16: XX000: could not find pathkey item to sort
Дата
Msg-id CAApHDvpJJigQRW29TppTOPYp+Aui0mtd3MpfRxyKv=N-tB62jQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: pg16: XX000: could not find pathkey item to sort  (Richard Guo <guofenglinux@gmail.com>)
Ответы Re: pg16: XX000: could not find pathkey item to sort
Список pgsql-hackers
On Tue, 19 Sept 2023 at 23:45, Richard Guo <guofenglinux@gmail.com> wrote:
> My first thought about the fix is that we artificially add resjunk
> target entries to parse->targetList for the ordered aggregates'
> arguments that are ORDER BY expressions, as attached.  While this can
> fix the given query, it would cause Assert failure for the query in
> sql/triggers.sql.

> Any thoughts?

Unfortunately, we can't do that as it'll lead to target entries
existing in the GroupAggregate's target list that have been
aggregated.

postgres=# explain verbose SELECT a, COUNT(DISTINCT b) FROM rp GROUP BY a;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Append  (cost=88.17..201.39 rows=400 width=44)
   ->  GroupAggregate  (cost=88.17..99.70 rows=200 width=44)
         Output: rp.a, count(DISTINCT rp.b), rp.b

Your patch adds rp.b as an output column of the GroupAggregate.
Logically, that column cannot exist there as there is no correct
single value of rp.b after aggregation.

I think the fix needs to go into create_agg_path().  The problem is
that for AGG_SORTED we do:

if (aggstrategy == AGG_SORTED)
    pathnode->path.pathkeys = subpath->pathkeys; /* preserves order */

which assumes that all of the columns before the aggregate will be
available after the aggregate.  That likely used to work ok before
1349d2790 as the planner wouldn't have requested any Pathkeys for
columns that were not available below the Agg node.

We can no longer take the subpath pathkey's verbatim. We need to strip
off pathkeys for columns that are not in pathnode's targetlist.

I've attached a patch which adds a new function to pathkeys.c to strip
off any PathKeys in a list that don't have a corresponding item in the
given PathTarget and just return a prefix of the input pathkey list up
until the first expr that can't be found.

I'm concerned that this patch will be too much overhead when creating
paths when a PathKey's EquivalenceClass has a large number of members
from partitioned tables.  I wondered if we should instead just check
if the subpath's pathkeys match root->group_pathkeys and if they do
set the AggPath's pathkeys to list_copy_head(subpath->pathkeys,
root->num_groupby_pathkeys),  that'll be much cheaper, but it just
feels a bit too much like a special case.

David

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Pre-proposal: unicode normalized text
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: CHECK Constraint Deferrable