Обсуждение: problem with query and group by error
I have a large query:
SELECT distinct on (srs.scan_run_id) srs.scan_run_id, srs.run_request_number, srs.container_id, srs.manifest_id, srs.scan_system_name_id,
srs.scan_site_name_id, srs.scan_site_nickname_id, to_char(srs.start_time, 'MM/DD/YY HH24:MI:SS'),
to_char(srs.stop_time, 'MM/DD/YY HH24:MI:SS'), srs.system_operator,
srs.system_baseline_configuration_file_version_id, srs.container_contents, srs.container_run_truth_data,
srs.scan_type_id, sty.scan_type, srs.hardware_version_or_hardware_identifier_id,
srs.software_version_id, srs.operator_notes,
to_char(srs.expiration_date, 'MM/DD/YY HH24:MI:SS'), srs.scan_outcome_id,
to_char(srs.alarm_time, 'MM/DD/YY HH24:MI:SS'), srs.alarm_decision_id, srs.material_detected_id, srs.data_access,
ssn.scan_system_name, ssn.simulation, ssitenames.scan_site_name, ssitenicknames.scan_site_nickname,
hvhi.hardware_version_or_hardware_identifier_name, sv.software_version, sc.description
from scan_run_summary srs left outer join scan_system_names ssn on
srs.scan_system_name_id = ssn.scan_system_name_id
left outer join scan_site_names ssitenames on srs.scan_site_name_id = ssitenames.scan_site_name_id
left outer join scan_site_nicknames ssitenicknames on
srs.scan_site_nickname_id = ssitenicknames.scan_site_nickname_id
left outer join hardware_version_or_hardware_identifiers hvhi on srs.hardware_version_or_hardware_identifier_id =
hvhi.hardware_version_or_hardware_identifier_id
left outer join software_versions sv on srs.software_version_id = sv.software_version_id
left outer join scenes sc on srs.container_run_truth_data = sc.scene_id
left outer join scan_types sty on srs.scan_type_id = sty.scan_type_id
join scene_thing_instances sti on srs.container_run_truth_data = sti.scene_id
join scene_things stg on sti.scene_thing_id = stg.scene_thing_id
group by srs.scan_run_id
;
That gives this error:SELECT distinct on (srs.scan_run_id) srs.scan_run_id, srs.run_request_number, srs.container_id, srs.manifest_id, srs.scan_system_name_id,
srs.scan_site_name_id, srs.scan_site_nickname_id, to_char(srs.start_time, 'MM/DD/YY HH24:MI:SS'),
to_char(srs.stop_time, 'MM/DD/YY HH24:MI:SS'), srs.system_operator,
srs.system_baseline_configuration_file_version_id, srs.container_contents, srs.container_run_truth_data,
srs.scan_type_id, sty.scan_type, srs.hardware_version_or_hardware_identifier_id,
srs.software_version_id, srs.operator_notes,
to_char(srs.expiration_date, 'MM/DD/YY HH24:MI:SS'), srs.scan_outcome_id,
to_char(srs.alarm_time, 'MM/DD/YY HH24:MI:SS'), srs.alarm_decision_id, srs.material_detected_id, srs.data_access,
ssn.scan_system_name, ssn.simulation, ssitenames.scan_site_name, ssitenicknames.scan_site_nickname,
hvhi.hardware_version_or_hardware_identifier_name, sv.software_version, sc.description
from scan_run_summary srs left outer join scan_system_names ssn on
srs.scan_system_name_id = ssn.scan_system_name_id
left outer join scan_site_names ssitenames on srs.scan_site_name_id = ssitenames.scan_site_name_id
left outer join scan_site_nicknames ssitenicknames on
srs.scan_site_nickname_id = ssitenicknames.scan_site_nickname_id
left outer join hardware_version_or_hardware_identifiers hvhi on srs.hardware_version_or_hardware_identifier_id =
hvhi.hardware_version_or_hardware_identifier_id
left outer join software_versions sv on srs.software_version_id = sv.software_version_id
left outer join scenes sc on srs.container_run_truth_data = sc.scene_id
left outer join scan_types sty on srs.scan_type_id = sty.scan_type_id
join scene_thing_instances sti on srs.container_run_truth_data = sti.scene_id
join scene_things stg on sti.scene_thing_id = stg.scene_thing_id
group by srs.scan_run_id
;
ERROR: column "sty.scan_type" must appear in the GROUP BY clause or be used in an aggregate function
LINE 5: srs.scan_type_id, sty.scan_type, srs.hardware_version...
I don't see why sty.scan_type should be singled out as requiring a group by clause, when there are many other columns specified.
If I add scan_type to the group by, then it gives the same error, but with ssn.scan_system_name.
Am I going to have to specify all the columns in the group by clause?
Am I going to have to specify all the columns in the group by clause?
I originally had the query without the group by, but I had duplicate rows, so I added a group by to eliminate them.
Thanks,
Susan
Susan
Susan Cassidy-3 wrote > I have a large query: > SELECT distinct on (srs.scan_run_id) srs.scan_run_id, > srs.run_request_number, srs.container_id, srs.manifest_id, > srs.scan_system_name_id, > srs.scan_site_name_id, srs.scan_site_nickname_id, > to_char(srs.start_time, 'MM/DD/YY HH24:MI:SS'), > to_char(srs.stop_time, 'MM/DD/YY HH24:MI:SS'), srs.system_operator, > srs.system_baseline_configuration_file_version_id, > srs.container_contents, srs.container_run_truth_data, > srs.scan_type_id, sty.scan_type, > srs.hardware_version_or_hardware_identifier_id, > srs.software_version_id, srs.operator_notes, > to_char(srs.expiration_date, 'MM/DD/YY HH24:MI:SS'), > srs.scan_outcome_id, > to_char(srs.alarm_time, 'MM/DD/YY HH24:MI:SS'), > srs.alarm_decision_id, srs.material_detected_id, srs.data_access, > ssn.scan_system_name, ssn.simulation, ssitenames.scan_site_name, > ssitenicknames.scan_site_nickname, > hvhi.hardware_version_or_hardware_identifier_name, > sv.software_version, sc.description > from scan_run_summary srs left outer join scan_system_names ssn on > srs.scan_system_name_id = ssn.scan_system_name_id > left outer join scan_site_names ssitenames on > srs.scan_site_name_id = ssitenames.scan_site_name_id > left outer join scan_site_nicknames ssitenicknames on > srs.scan_site_nickname_id = > ssitenicknames.scan_site_nickname_id > left outer join hardware_version_or_hardware_identifiers hvhi on > srs.hardware_version_or_hardware_identifier_id = > hvhi.hardware_version_or_hardware_identifier_id > left outer join software_versions sv on srs.software_version_id = > sv.software_version_id > left outer join scenes sc on srs.container_run_truth_data = > sc.scene_id > left outer join scan_types sty on srs.scan_type_id = > sty.scan_type_id > join scene_thing_instances sti on srs.container_run_truth_data = > sti.scene_id > join scene_things stg on sti.scene_thing_id = stg.scene_thing_id > group by srs.scan_run_id > ; > > > That gives this error: > > ERROR: column "sty.scan_type" must appear in the GROUP BY clause or be > used in an aggregate function > LINE 5: srs.scan_type_id, sty.scan_type, srs.hardware_version... > > I don't see why sty.scan_type should be singled out as requiring a group > by > clause, when there are many other columns specified. > > If I add scan_type to the group by, then it gives the same error, but with > ssn.scan_system_name. > > Am I going to have to specify all the columns in the group by clause? > > I originally had the query without the group by, but I had duplicate rows, > so I added a group by to eliminate them. > > Thanks, > Susan Newer releases (not sure which) are capable of identifying a primary key in a group by and allow you to omit all dependent columns of said primary key. However, as a general rule, every non-aggregated column must appear in the GROUP BY. There is not "default behavior" for columns not appearing in group by nor that have been aggregated. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/problem-with-query-and-group-by-error-tp5793127p5793128.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 El 21/02/14 14:40, Susan Cassidy escribió: > I have a large query: SELECT distinct on (srs.scan_run_id) > srs.scan_run_id, srs.run_request_number, srs.container_id, > srs.manifest_id, srs.scan_system_name_id, srs.scan_site_name_id, > srs.scan_site_nickname_id, to_char(srs.start_time, 'MM/DD/YY > HH24:MI:SS'), to_char(srs.stop_time, 'MM/DD/YY HH24:MI:SS'), > srs.system_operator, > srs.system_baseline_configuration_file_version_id, > srs.container_contents, srs.container_run_truth_data, > srs.scan_type_id, sty.scan_type, > srs.hardware_version_or_hardware_identifier_id, > srs.software_version_id, srs.operator_notes, > to_char(srs.expiration_date, 'MM/DD/YY HH24:MI:SS'), > srs.scan_outcome_id, to_char(srs.alarm_time, 'MM/DD/YY > HH24:MI:SS'), srs.alarm_decision_id, srs.material_detected_id, > srs.data_access, ssn.scan_system_name, ssn.simulation, > ssitenames.scan_site_name, ssitenicknames.scan_site_nickname, > hvhi.hardware_version_or_hardware_identifier_name, > sv.software_version, sc.description from scan_run_summary srs left > outer join scan_system_names ssn on srs.scan_system_name_id = > ssn.scan_system_name_id left outer join scan_site_names ssitenames > on srs.scan_site_name_id = ssitenames.scan_site_name_id left outer > join scan_site_nicknames ssitenicknames on > srs.scan_site_nickname_id = ssitenicknames.scan_site_nickname_id > left outer join hardware_version_or_hardware_identifiers hvhi on > srs.hardware_version_or_hardware_identifier_id = > hvhi.hardware_version_or_hardware_identifier_id left outer join > software_versions sv on srs.software_version_id = > sv.software_version_id left outer join scenes sc on > srs.container_run_truth_data = sc.scene_id left outer join > scan_types sty on srs.scan_type_id = sty.scan_type_id join > scene_thing_instances sti on srs.container_run_truth_data = > sti.scene_id join scene_things stg on sti.scene_thing_id = > stg.scene_thing_id group by srs.scan_run_id ; > > > That gives this error: > > ERROR: column "sty.scan_type" must appear in the GROUP BY clause > or be used in an aggregate function LINE 5: > srs.scan_type_id, sty.scan_type, srs.hardware_version... > > I don't see why sty.scan_type should be singled out as requiring a > group by clause, when there are many other columns specified. Many not always means enough. > > If I add scan_type to the group by, then it gives the same error, > but with ssn.scan_system_name. > > Am I going to have to specify all the columns in the group by > clause? > Yes, except those that are inside the aggregate. http://www.postgresql.org/docs/9.3/static/queries-table-expressions.html Section 7.2.3 > I originally had the query without the group by, but I had > duplicate rows, so I added a group by to eliminate them. > > Thanks, Susan > - -- - -- Emanuel Calvo Consultant // 2ndQuadrant Bs. As., Argentina // (GMT-3) -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.18 (Darwin) Comment: GPGTools - http://gpgtools.org Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQIcBAEBCgAGBQJTB5KAAAoJEIBeI/HMagHm5iQQAI8WcPcdUJrfTyensI1oI1ig 2zqq+mLVfgCHnh+9+1AH7eESl7mSqpk3cD6L5FoMNWVsG/5VKZG/vEcgB1IuN8DQ RPVa0MBAV03HTWX37HXAleyu++vQ8BCIUIgfsmmrWpmJonhssnwW91uHvisl3hXq dujypLtT6Xcu+0b+jtAwsayX6H5dH7g1ODzU8ofxS6o8SRxe3zCCJIykeK81PR0Q L55WH30xx1YRPhj48OFrPuvcCRS1M3nhWTTlh5OQ6UCzMMCCmUv2bR2nurqq9gBs lUx3iB+ra2fnIIZYcZHocMFaWJUOQQ4+dj9LsUxyel8qOLZvIzcNrSrA868XlZNT IXoWm9IYfKyyZYtHD7PdwLPSZuFYDqW0ll+GMm3/wbaK2NOIW7p8C4/DylIxbUgO DXkt8y3Hn05UjpfgFCDiOrMeXvdEjlb66aNiIePYmsJWDq6/CF8fj77EXZ3KP6t7 JUJ7YzDRtW99M+GsOYOLjVvMbE7NfS1KUKt/NNKGFsZAJ/TmQlHyFxYThIVYeYq2 FFqp6s1cWYJILhFD150zCZt2DpDt7NmNuczm7gJEb61avUHIZIrTw9VThcTF+Yh0 5YmJmF+wsewKy38jiyHWGRChH5n65NscZDzMO0NpfO8VR1KN4Su/ahMA+GADLpE9 WjLHQHHHRWcoiOCOgiyF =ighX -----END PGP SIGNATURE-----
On Fri, Feb 21, 2014 at 10:40 AM, Susan Cassidy <susan.cassidy@decisionsciencescorp.com> wrote: > > I originally had the query without the group by, but I had duplicate rows, > so I added a group by to eliminate them. Have you tried select distinct or select distinct on ()?
I tried distinct on srs.scan_run_id, which is a primary key, and got an error, but I tried it again just now, and it worked fine.
Thanks for having me try it again.
I had ended up with
group by srs.scan_run_id, sty.scan_type, ssn.scan_system_name, ssn.simulation, ssitenames.scan_site_name,
ssitenames.scan_site_name, ssitenicknames.scan_site_nickname,
hvhi.hardware_version_or_hardware_identifier_name, sv.software_version, sc.description
because I kept adding column names that I got errors on, but this is by no means the whole list. So, I'm still confused as to why I got the error in the first place.Thanks for having me try it again.
I had ended up with
group by srs.scan_run_id, sty.scan_type, ssn.scan_system_name, ssn.simulation, ssitenames.scan_site_name,
ssitenames.scan_site_name, ssitenicknames.scan_site_nickname,
hvhi.hardware_version_or_hardware_identifier_name, sv.software_version, sc.description
"Yes, except those that are inside the aggregate."
On Fri, Feb 21, 2014 at 10:38 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Fri, Feb 21, 2014 at 10:40 AM, Susan Cassidy
<susan.cassidy@decisionsciencescorp.com> wrote:
>
> I originally had the query without the group by, but I had duplicate rows,
> so I added a group by to eliminate them.
Have you tried select distinct or select distinct on ()?
Susan Cassidy-3 wrote > Someone said something about > "Yes, except those that are inside the aggregate." > but I don't have an aggregate specified. So every column then... As soon as you add "group by" the rule becomes - every column is either a group determinate or is aggregated. If you simply want to remove duplicates you can write: Select distinct ... From No group by clause required and every output column is used to create an implicit non-duplicated group. I haven't tried to figure out what you are trying to do with this query so I don't know which, if either, form is more correct but grouping without aggregates is unusual and I also find that using distinct is not that common a need of you have a correctly normalized database. IOW you should not use group by or distinct to "make the query work" but only if you know/understand why doing so is necessary. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/problem-with-query-and-group-by-error-tp5793127p5793140.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.