Обсуждение: problem with query and group by error

Поиск
Список
Период
Сортировка

problem with query and group by error

От
Susan Cassidy
Дата:
I have a large query:
   SELECT distinct on (srs.scan_run_id) srs.scan_run_id, srs.run_request_number, srs.container_id, srs.manifest_id, srs.scan_system_name_id,
       srs.scan_site_name_id, srs.scan_site_nickname_id, to_char(srs.start_time, 'MM/DD/YY HH24:MI:SS'),
       to_char(srs.stop_time, 'MM/DD/YY HH24:MI:SS'), srs.system_operator,
       srs.system_baseline_configuration_file_version_id, srs.container_contents, srs.container_run_truth_data,
       srs.scan_type_id, sty.scan_type, srs.hardware_version_or_hardware_identifier_id,
       srs.software_version_id, srs.operator_notes,
       to_char(srs.expiration_date, 'MM/DD/YY HH24:MI:SS'), srs.scan_outcome_id,
        to_char(srs.alarm_time, 'MM/DD/YY HH24:MI:SS'), srs.alarm_decision_id, srs.material_detected_id, srs.data_access,
          ssn.scan_system_name, ssn.simulation, ssitenames.scan_site_name, ssitenicknames.scan_site_nickname,
          hvhi.hardware_version_or_hardware_identifier_name, sv.software_version, sc.description
      from scan_run_summary srs left outer join scan_system_names ssn on
             srs.scan_system_name_id = ssn.scan_system_name_id
         left outer join scan_site_names ssitenames on srs.scan_site_name_id = ssitenames.scan_site_name_id
         left outer join scan_site_nicknames ssitenicknames on
           srs.scan_site_nickname_id = ssitenicknames.scan_site_nickname_id
         left outer join hardware_version_or_hardware_identifiers hvhi on srs.hardware_version_or_hardware_identifier_id =
          hvhi.hardware_version_or_hardware_identifier_id
         left outer join software_versions sv on srs.software_version_id = sv.software_version_id
         left outer join scenes sc on srs.container_run_truth_data = sc.scene_id
         left outer join scan_types sty on srs.scan_type_id = sty.scan_type_id
          join scene_thing_instances sti on srs.container_run_truth_data = sti.scene_id
 join scene_things stg on sti.scene_thing_id = stg.scene_thing_id
  group by srs.scan_run_id
;

 
That gives this error:

ERROR:  column "sty.scan_type" must appear in the GROUP BY clause or be used in an aggregate function
LINE 5:        srs.scan_type_id, sty.scan_type, srs.hardware_version...

I don't see why sty.scan_type should be singled out as requiring a group by clause, when there are many other columns specified.

If I add scan_type to the group by, then it gives the same error, but with ssn.scan_system_name. 

Am I going to have to specify all the columns in the group by clause?

I originally had the query without the group by, but I had duplicate rows, so I added a group by to eliminate them.

Thanks,
Susan


Re: problem with query and group by error

От
David Johnston
Дата:
Susan Cassidy-3 wrote
> I have a large query:
>    SELECT distinct on (srs.scan_run_id) srs.scan_run_id,
> srs.run_request_number, srs.container_id, srs.manifest_id,
> srs.scan_system_name_id,
>        srs.scan_site_name_id, srs.scan_site_nickname_id,
> to_char(srs.start_time, 'MM/DD/YY HH24:MI:SS'),
>        to_char(srs.stop_time, 'MM/DD/YY HH24:MI:SS'), srs.system_operator,
>        srs.system_baseline_configuration_file_version_id,
> srs.container_contents, srs.container_run_truth_data,
>        srs.scan_type_id, sty.scan_type,
> srs.hardware_version_or_hardware_identifier_id,
>        srs.software_version_id, srs.operator_notes,
>        to_char(srs.expiration_date, 'MM/DD/YY HH24:MI:SS'),
> srs.scan_outcome_id,
>         to_char(srs.alarm_time, 'MM/DD/YY HH24:MI:SS'),
> srs.alarm_decision_id, srs.material_detected_id, srs.data_access,
>           ssn.scan_system_name, ssn.simulation, ssitenames.scan_site_name,
> ssitenicknames.scan_site_nickname,
>           hvhi.hardware_version_or_hardware_identifier_name,
> sv.software_version, sc.description
>       from scan_run_summary srs left outer join scan_system_names ssn on
>              srs.scan_system_name_id = ssn.scan_system_name_id
>          left outer join scan_site_names ssitenames on
> srs.scan_site_name_id = ssitenames.scan_site_name_id
>          left outer join scan_site_nicknames ssitenicknames on
>            srs.scan_site_nickname_id =
> ssitenicknames.scan_site_nickname_id
>          left outer join hardware_version_or_hardware_identifiers hvhi on
> srs.hardware_version_or_hardware_identifier_id =
>           hvhi.hardware_version_or_hardware_identifier_id
>          left outer join software_versions sv on srs.software_version_id =
> sv.software_version_id
>          left outer join scenes sc on srs.container_run_truth_data =
> sc.scene_id
>          left outer join scan_types sty on srs.scan_type_id =
> sty.scan_type_id
>           join scene_thing_instances sti on srs.container_run_truth_data =
> sti.scene_id
>  join scene_things stg on sti.scene_thing_id = stg.scene_thing_id
>   group by srs.scan_run_id
> ;
>
>
> That gives this error:
>
> ERROR:  column "sty.scan_type" must appear in the GROUP BY clause or be
> used in an aggregate function
> LINE 5:        srs.scan_type_id, sty.scan_type, srs.hardware_version...
>
> I don't see why sty.scan_type should be singled out as requiring a group
> by
> clause, when there are many other columns specified.
>
> If I add scan_type to the group by, then it gives the same error, but with
> ssn.scan_system_name.
>
> Am I going to have to specify all the columns in the group by clause?
>
> I originally had the query without the group by, but I had duplicate rows,
> so I added a group by to eliminate them.
>
> Thanks,
> Susan

Newer releases (not sure which) are capable of identifying a primary key in
a group by and allow you to omit all dependent columns of said primary key.

However, as a general rule, every non-aggregated column must appear in the
GROUP BY.  There is not "default behavior" for columns not appearing in
group by nor that have been aggregated.

David J.





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/problem-with-query-and-group-by-error-tp5793127p5793128.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: problem with query and group by error

От
Emanuel Calvo
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512



El 21/02/14 14:40, Susan Cassidy escribió:
> I have a large query: SELECT distinct on (srs.scan_run_id)
> srs.scan_run_id, srs.run_request_number, srs.container_id,
> srs.manifest_id, srs.scan_system_name_id, srs.scan_site_name_id,
> srs.scan_site_nickname_id, to_char(srs.start_time, 'MM/DD/YY
> HH24:MI:SS'), to_char(srs.stop_time, 'MM/DD/YY HH24:MI:SS'),
> srs.system_operator,
> srs.system_baseline_configuration_file_version_id,
> srs.container_contents, srs.container_run_truth_data,
> srs.scan_type_id, sty.scan_type,
> srs.hardware_version_or_hardware_identifier_id,
> srs.software_version_id, srs.operator_notes,
> to_char(srs.expiration_date, 'MM/DD/YY HH24:MI:SS'),
> srs.scan_outcome_id, to_char(srs.alarm_time, 'MM/DD/YY
> HH24:MI:SS'), srs.alarm_decision_id, srs.material_detected_id,
> srs.data_access, ssn.scan_system_name, ssn.simulation,
> ssitenames.scan_site_name, ssitenicknames.scan_site_nickname,
> hvhi.hardware_version_or_hardware_identifier_name,
> sv.software_version, sc.description from scan_run_summary srs left
> outer join scan_system_names ssn on srs.scan_system_name_id =
> ssn.scan_system_name_id left outer join scan_site_names ssitenames
> on srs.scan_site_name_id = ssitenames.scan_site_name_id left outer
> join scan_site_nicknames ssitenicknames on
> srs.scan_site_nickname_id = ssitenicknames.scan_site_nickname_id
> left outer join hardware_version_or_hardware_identifiers hvhi on
> srs.hardware_version_or_hardware_identifier_id =
> hvhi.hardware_version_or_hardware_identifier_id left outer join
> software_versions sv on srs.software_version_id =
> sv.software_version_id left outer join scenes sc on
> srs.container_run_truth_data = sc.scene_id left outer join
> scan_types sty on srs.scan_type_id = sty.scan_type_id join
> scene_thing_instances sti on srs.container_run_truth_data =
> sti.scene_id join scene_things stg on sti.scene_thing_id =
> stg.scene_thing_id group by srs.scan_run_id ;
>
>
> That gives this error:
>
> ERROR:  column "sty.scan_type" must appear in the GROUP BY clause
> or be used in an aggregate function LINE 5:
> srs.scan_type_id, sty.scan_type, srs.hardware_version...
>
> I don't see why sty.scan_type should be singled out as requiring a
> group by clause, when there are many other columns specified.

Many not always means enough.

>
> If I add scan_type to the group by, then it gives the same error,
> but with ssn.scan_system_name.
>
> Am I going to have to specify all the columns in the group by
> clause?
>

Yes, except those that are inside the aggregate.

http://www.postgresql.org/docs/9.3/static/queries-table-expressions.html

Section 7.2.3

> I originally had the query without the group by, but I had
> duplicate rows, so I added a group by to eliminate them.
>
> Thanks, Susan
>

- --
- --
Emanuel Calvo
Consultant // 2ndQuadrant
Bs. As., Argentina // (GMT-3)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.18 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBCgAGBQJTB5KAAAoJEIBeI/HMagHm5iQQAI8WcPcdUJrfTyensI1oI1ig
2zqq+mLVfgCHnh+9+1AH7eESl7mSqpk3cD6L5FoMNWVsG/5VKZG/vEcgB1IuN8DQ
RPVa0MBAV03HTWX37HXAleyu++vQ8BCIUIgfsmmrWpmJonhssnwW91uHvisl3hXq
dujypLtT6Xcu+0b+jtAwsayX6H5dH7g1ODzU8ofxS6o8SRxe3zCCJIykeK81PR0Q
L55WH30xx1YRPhj48OFrPuvcCRS1M3nhWTTlh5OQ6UCzMMCCmUv2bR2nurqq9gBs
lUx3iB+ra2fnIIZYcZHocMFaWJUOQQ4+dj9LsUxyel8qOLZvIzcNrSrA868XlZNT
IXoWm9IYfKyyZYtHD7PdwLPSZuFYDqW0ll+GMm3/wbaK2NOIW7p8C4/DylIxbUgO
DXkt8y3Hn05UjpfgFCDiOrMeXvdEjlb66aNiIePYmsJWDq6/CF8fj77EXZ3KP6t7
JUJ7YzDRtW99M+GsOYOLjVvMbE7NfS1KUKt/NNKGFsZAJ/TmQlHyFxYThIVYeYq2
FFqp6s1cWYJILhFD150zCZt2DpDt7NmNuczm7gJEb61avUHIZIrTw9VThcTF+Yh0
5YmJmF+wsewKy38jiyHWGRChH5n65NscZDzMO0NpfO8VR1KN4Su/ahMA+GADLpE9
WjLHQHHHRWcoiOCOgiyF
=ighX
-----END PGP SIGNATURE-----


Re: problem with query and group by error

От
Scott Marlowe
Дата:
On Fri, Feb 21, 2014 at 10:40 AM, Susan Cassidy
<susan.cassidy@decisionsciencescorp.com> wrote:
>
> I originally had the query without the group by, but I had duplicate rows,
> so I added a group by to eliminate them.


Have you tried select distinct or select distinct on ()?


Re: problem with query and group by error

От
Susan Cassidy
Дата:
I tried distinct on srs.scan_run_id, which is a primary key, and got an error, but I tried it again just now, and it worked fine.

Thanks for having me try it again.

I had ended up with
   group by srs.scan_run_id, sty.scan_type, ssn.scan_system_name, ssn.simulation, ssitenames.scan_site_name,
   ssitenames.scan_site_name, ssitenicknames.scan_site_nickname,
   hvhi.hardware_version_or_hardware_identifier_name, sv.software_version, sc.description

because I kept adding column names that I got errors on, but this is by no means the whole list.  So, I'm still confused as to why I got the error in the first place.

Someone said something about
 "Yes, except those that are inside the aggregate."
but I don't have an aggregate specified.

Susan



On Fri, Feb 21, 2014 at 10:38 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Fri, Feb 21, 2014 at 10:40 AM, Susan Cassidy
<susan.cassidy@decisionsciencescorp.com> wrote:
>
> I originally had the query without the group by, but I had duplicate rows,
> so I added a group by to eliminate them.


Have you tried select distinct or select distinct on ()?

Re: problem with query and group by error

От
David Johnston
Дата:
Susan Cassidy-3 wrote
> Someone said something about
>  "Yes, except those that are inside the aggregate."
> but I don't have an aggregate specified.

So every column then...

As soon as you add "group by" the rule becomes - every column is either a
group determinate or is aggregated.  If you simply want to remove duplicates
you can write:

Select distinct ... From

No group by clause required and every output column is used to create an
implicit non-duplicated group.

I haven't tried to figure out what you are trying to do with this query so I
don't know which, if either, form is more correct but grouping without
aggregates is unusual and I also find that using distinct is not that common
a need of you have a correctly normalized database.  IOW you should not use
group by or distinct to "make the query work" but only if you
know/understand why doing so is necessary.

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/problem-with-query-and-group-by-error-tp5793127p5793140.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.