Обсуждение: ORDER/GROUP BY expression not found in targetlist

Поиск
Список
Период
Сортировка

ORDER/GROUP BY expression not found in targetlist

От
Andres Freund
Дата:
Hi,

trying to reproduce a performance problem I just found:

=# CREATE TABLE twocol(col01 int, col02 int);
=# SELECT DISTINCT col01, col02, col01 FROM twocol ;
ERROR:  XX000: ORDER/GROUP BY expression not found in targetlist
LOCATION:  get_sortgroupref_tle, tlist.c:341

which appears to be a 9.6 regression, presumable fallout from the path
restructuring.

Greetings,

Andres Freund



Re: ORDER/GROUP BY expression not found in targetlist

От
Peter Geoghegan
Дата:
On Wed, May 25, 2016 at 7:12 PM, Andres Freund <andres@anarazel.de> wrote:
>
> =# CREATE TABLE twocol(col01 int, col02 int);
> =# SELECT DISTINCT col01, col02, col01 FROM twocol ;
> ERROR:  XX000: ORDER/GROUP BY expression not found in targetlist
> LOCATION:  get_sortgroupref_tle, tlist.c:341
>
> which appears to be a 9.6 regression, presumable fallout from the path
> restructuring.

It's surprising that SQL Smith didn't catch something with such simple
steps to reproduce.


-- 
Peter Geoghegan



Re: ORDER/GROUP BY expression not found in targetlist

От
Andreas Seltenreich
Дата:
Peter Geoghegan writes:

> On Wed, May 25, 2016 at 7:12 PM, Andres Freund <andres@anarazel.de> wrote:
>>
>> =# CREATE TABLE twocol(col01 int, col02 int);
>> =# SELECT DISTINCT col01, col02, col01 FROM twocol ;
>> ERROR:  XX000: ORDER/GROUP BY expression not found in targetlist
>> LOCATION:  get_sortgroupref_tle, tlist.c:341
>>
>> which appears to be a 9.6 regression, presumable fallout from the path
>> restructuring.
>
> It's surprising that SQL Smith didn't catch something with such simple
> steps to reproduce.

I removed distinct relatively early because it causes a large part of
queries to fail due to it not finding an equality operator it likes.  It
seems to be more picky about the equality operator than, say, joins.
I'm sure it has a good reason to do so?

regression=> select distinct f1 from path_tbl;
ERROR:  could not identify an equality operator for type path
LINE 1: select distinct f1 from path_tbl;

regression=> \do =
-[ RECORD 38 ]-+----------------------------
Schema         | pg_catalog
Name           | =
Left arg type  | path
Right arg type | path
Result type    | boolean
Description    | equal




Re: ORDER/GROUP BY expression not found in targetlist

От
Tom Lane
Дата:
Andreas Seltenreich <seltenreich@gmx.de> writes:
> Peter Geoghegan writes:
>> It's surprising that SQL Smith didn't catch something with such simple
>> steps to reproduce.

> I removed distinct relatively early because it causes a large part of
> queries to fail due to it not finding an equality operator it likes.  It
> seems to be more picky about the equality operator than, say, joins.
> I'm sure it has a good reason to do so?

It's looking for an operator that is known to be semantically equality,
by virtue of being the equality member of a btree or hash opclass.
Type path has no such opclass unfortunately.  But when you write "a = b"
that just looks for an operator named "=".
        regards, tom lane



Re: ORDER/GROUP BY expression not found in targetlist

От
Andreas Seltenreich
Дата:
Tom Lane writes:

> Andreas Seltenreich <seltenreich@gmx.de> writes:
>> Peter Geoghegan writes:
>>> It's surprising that SQL Smith didn't catch something with such simple
>>> steps to reproduce.
>
>> I removed distinct relatively early because it causes a large part of
>> queries to fail due to it not finding an equality operator it likes.  It
>> seems to be more picky about the equality operator than, say, joins.
>> I'm sure it has a good reason to do so?
>
> It's looking for an operator that is known to be semantically equality,
> by virtue of being the equality member of a btree or hash opclass.
> Type path has no such opclass unfortunately.

As do lots of data types in the regression db while still having an
operator providing semantic equivalence.  I was hoping for someone to
question that status quo.  Naively I'd say an equivalence flag is
missing in the catalog that is independent of opclasses.

regards
Andreas



Re: ORDER/GROUP BY expression not found in targetlist

От
Tom Lane
Дата:
Andreas Seltenreich <seltenreich@gmx.de> writes:
> Tom Lane writes:
>> It's looking for an operator that is known to be semantically equality,
>> by virtue of being the equality member of a btree or hash opclass.
>> Type path has no such opclass unfortunately.

> As do lots of data types in the regression db while still having an
> operator providing semantic equivalence.  I was hoping for someone to
> question that status quo.  Naively I'd say an equivalence flag is
> missing in the catalog that is independent of opclasses.

[ shrug... ]  I see little wrong with that status quo.  For this
particular use-case, there are two ways we could implement DISTINCT: one
of them requires sorting, and the other requires hashing.  So you would
need to provide that opclass infrastructure even if there were some other
way of identifying the operator that means equality.

Type path and the other geometric types lack any natural sort order so
it's hard to imagine making a default btree opclass for them.  But a
default hash opclass might not be out of reach, given an exact equality
operator.

Another problem with the geometric types is that long ago somebody
invented "=" operators for most of them that have little to do with what
anyone would consider equality.  The "path = path" operator just compares
whether the paths have the same number of points.  A lot of the other ones
compare areas.  It'd be hard to justify marking any of them as default
equality for the type.
        regards, tom lane



Re: ORDER/GROUP BY expression not found in targetlist

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> trying to reproduce a performance problem I just found:

> =# CREATE TABLE twocol(col01 int, col02 int);
> =# SELECT DISTINCT col01, col02, col01 FROM twocol ;
> ERROR:  XX000: ORDER/GROUP BY expression not found in targetlist
> LOCATION:  get_sortgroupref_tle, tlist.c:341

> which appears to be a 9.6 regression, presumable fallout from the path
> restructuring.

Huh.  The problem is that createplan.c is trying to apply the
physical-tlist optimization to the seqscan underneath the aggregate
node.  That means that the output from the seqscan is just
"col01, col02", which means that col01 can only be decorated with
a single ressortgroupref ... but there are two ressortgrouprefs
for it as far as the groupClause is concerned.  Only one gets applied
to the seqscan's tlist, and then later we fail because we don't find
the other one there.  Conclusions:

* we need to back off the physical-tlist optimization in this case

* the code that transfers sortgroupref labels onto a tlist probably
ought to notice and complain if it's asked to put inconsistent labels
onto the same column.

I'm a little surprised that it's not discarding the third grouping
item as redundant ... but that's probably not something to mess with
right now.  Prior versions don't appear to do that either.
        regards, tom lane



Re: ORDER/GROUP BY expression not found in targetlist

От
Tatsuro Yamada
Дата:
Hi,

I got same error by TPC-H: Q1,4,8,12 and 17.
I've attached results of the queries.

TPC-H (thanks to Tomas Vondra)
  https://github.com/tvondra/pg_tpch

  Datasize
     Scale Factor: 1

PG96beta1
  commit: f721e94b5f360391fc3ffe183bf697a0441e9184

Regards,
Tatsuro Yamada
NTT OSS Center

On 2016/05/27 2:22, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> trying to reproduce a performance problem I just found:
>
>> =# CREATE TABLE twocol(col01 int, col02 int);
>> =# SELECT DISTINCT col01, col02, col01 FROM twocol ;
>> ERROR:  XX000: ORDER/GROUP BY expression not found in targetlist
>> LOCATION:  get_sortgroupref_tle, tlist.c:341
>
>> which appears to be a 9.6 regression, presumable fallout from the path
>> restructuring.
>
> Huh.  The problem is that createplan.c is trying to apply the
> physical-tlist optimization to the seqscan underneath the aggregate
> node.  That means that the output from the seqscan is just
> "col01, col02", which means that col01 can only be decorated with
> a single ressortgroupref ... but there are two ressortgrouprefs
> for it as far as the groupClause is concerned.  Only one gets applied
> to the seqscan's tlist, and then later we fail because we don't find
> the other one there.  Conclusions:
>
> * we need to back off the physical-tlist optimization in this case
>
> * the code that transfers sortgroupref labels onto a tlist probably
> ought to notice and complain if it's asked to put inconsistent labels
> onto the same column.
>
> I'm a little surprised that it's not discarding the third grouping
> item as redundant ... but that's probably not something to mess with
> right now.  Prior versions don't appear to do that either.
>
>             regards, tom lane
>
>



Вложения