Discussion: Is there a more elegant way to write this query?...
Hi-
I'm suffering from a performance problem, but when I look at my query, I'm
not convinced that there isn't a better way to handle this in SQL. -So I'm
seeking advice here before I go to the performance list.
I have three tables- case, actor and actor_case_assignment.
As the names imply, actor_case_assignment contains records that assign an
actor to a case. Actors such as attorneys or judges may have many cases,
while the average actor (we hope) only has one.
What I'm trying to do is link these tables to get back a single row per
actor that shows the actor's name, the number of cases that actor is
assigned to, and if they only have one case, I want the number for that
case. This means I have to do some grouping to get the case count, but I'm
then forced to use an aggregate function like max on the other fields. I
hope there's a better way. Any suggestions?
Here's what I'm using right now:
select
    actor.actor_id,
    max(actor.actor_full_name),
    max(case_data.case_public_id),
    max(case_data.case_id),
    count(case_data.case_id) as case_count
from
    actor,
    actor_case_assignment,
    case_data
where
    actor.actor_full_name_uppercase like upper('martin%')
    and actor.actor_id = actor_case_assignment.actor_id
    and case_data.case_id = actor_case_assignment.case_id
group by
    actor.actor_id
order by
    max(actor.actor_full_name),
    case_count desc
limit
    1000;
Thanks! -Nick
---------------------------------------------------------------------
Nick Fankhauser
nickf@doxpop.com Phone 1.765.965.7363 Fax 1.765.962.9788
doxpop - Court records at your fingertips - http://www.doxpop.com/
On Nov 10, 2003, at 1:02 PM, Nick Fankhauser wrote:
> Hi-
>
> I'm suffering from a performance problem, but when I look at my query, I'm
> not convinced that there isn't a better way to handle this in SQL. -So I'm
> seeking advice here before I go to the performance list.
An explain analyze would help.
> What I'm trying to do is link these tables to get back a single row per
> actor that shows the actor's name, the number of cases that actor is
> assigned to, and if they only have one case, I want the number for that
> case. This means I have to do some grouping to get the case count, but I'm
> then forced to use an aggregate function like max on the other fields. I
> hope there's a better way. Any suggestions?
How about:
select
    actor.actor_full_name,
    actor.actor_id,
    s1.ctCases,
    s1.case_id,
    case_data.case_public_id
from
    actor inner join (
        select actor_id, count(*) as ctCases, max(case_id) as case_id
        from actor_case_assignment
        group by actor_id) as s1
    on (actor.actor_id = s1.actor_id)
    left outer join case_data
    on (s1.case_id = case_data.case_id)
limit 1000;
If you don't need the public_id, then you don't even need to join in the case data table.
eric
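To make the derived-table idea concrete, here is a minimal sketch that runs the suggested query shape against an in-memory SQLite database from Python. The table and column names follow the thread, but the sample rows are invented, SQLite stands in for PostgreSQL, and the join to case_data is written with `on` and an added `order by` so the output is deterministic.

```python
import sqlite3

# Hypothetical miniature schema: names follow the thread, rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table actor (actor_id integer primary key, actor_full_name text);
    create table case_data (case_id integer primary key, case_public_id text);
    create table actor_case_assignment (actor_id integer, case_id integer);
    insert into actor values (1, 'Martin, Ann'), (2, 'Martinez, Bob');
    insert into case_data values (10, 'C-10'), (11, 'C-11'), (12, 'C-12');
    insert into actor_case_assignment values (1, 10), (2, 11), (2, 12);
""")

# Aggregate the assignments first, then join the one-row-per-actor result
# to actor and case_data, so no max() is needed on the actor columns.
derived_rows = conn.execute("""
    select a.actor_full_name, a.actor_id, s1.ctCases, s1.case_id, c.case_public_id
    from actor a
    inner join (select actor_id, count(*) as ctCases, max(case_id) as case_id
                from actor_case_assignment
                group by actor_id) as s1 on a.actor_id = s1.actor_id
    left outer join case_data c on s1.case_id = c.case_id
    order by a.actor_full_name
    limit 1000
""").fetchall()

for row in derived_rows:
    print(row)
```

Only the assignment table is grouped here, so the actor name and public case id come from ordinary joins rather than from aggregates.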
Hi Eric-
Thanks for your suggestion.
> An explain analyze would help.
I'll do that (and move the question to the performance list) when I get to the performance question, but at this point, I'm just seeking some help in looking at this from a different angle- I couldn't figure out how to achieve the result I was after without using max(), which felt like a kludge. Your solution gave me the start I needed.
-Nick
> How about:
> select
>     actor.actor_full_name,
>     actor.actor_id,
>     s1.ctCases,
>     s1.case_id,
>     case_data.case_public_id
> from
>     actor inner join (
>         select actor_id, count(*) as ctCases, max(case_id) as case_id
>         from actor_case_assignment
>         group by actor_id) as s1
>     on (actor.actor_id = s1.actor_id)
>     left outer join case_data
>     on (s1.case_id = case_data.case_id)
> limit 1000;
Nick Fankhauser wrote:
> Hi-
>
> I'm suffering from a performance problem, but when I look at my query, I'm
> not convinced that there isn't a better way to handle this in SQL. -So I'm
> seeking advice here before I go to the performance list.
>
> I have three tables- case, actor and actor_case_assignment.
>
> As the names imply, actor_case_assignment contains records that assign an
> actor to a case. Actors such as attorneys or judges may have many cases,
> while the average actor (we hope) only has one.
>
> What I'm trying to do is link these tables to get back a single row per
> actor that shows the actor's name, the number of cases that actor is
> assigned to, and if they only have one case, I want the number for that
> case. This means I have to do some grouping to get the case count, but I'm
> then forced to use an aggregate function like max on the other fields. I
> hope there's a better way. Any suggestions?
>
> Here's what I'm using right now:
>
> select
> actor.actor_id,
> max(actor.actor_full_name),
> max(case_data.case_public_id),
> max(case_data.case_id),
> count(case_data.case_id) as case_count
> from
> actor,
> actor_case_assignment,
> case_data
> where
> actor.actor_full_name_uppercase like upper('martin%')
> and actor.actor_id = actor_case_assignment.actor_id
> and case_data.case_id = actor_case_assignment.case_id
> group by
> actor.actor_id
> order by
> max(actor.actor_full_name),
> case_count desc
> limit
> 1000;
>
>
> Thanks!
> -Nick
select count(aca.case_id) as case_count,
       max(aca.case_id) as max_case_id,
       a.actor_full_name
from actor a, actor_case_assignment aca
where a.actor_id = aca.actor_id
  and a.actor_full_name_uppercase like upper('martin%') -- this field is (should be) indexed
group by a.actor_full_name, a.actor_full_name_uppercase
order by a.actor_full_name_uppercase; -- this field is (should be) indexed
This query will not pick up any actors that are NOT assigned to a case,
but you said in your criteria that all actors are assigned to at least
one case. This query relies on that assumption and eliminates the expense
of an outer join.
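A small sketch of what that assumption means in practice: with an inner join, an actor with no assignment rows disappears from the result, while a left outer join keeps the actor with a count of zero. The sample rows (including the unassigned actor 3) are invented for illustration, and SQLite is used as a stand-in for PostgreSQL.

```python
import sqlite3

# Hypothetical data: actor 3 has no rows in actor_case_assignment.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table actor (actor_id integer primary key, actor_full_name text);
    create table actor_case_assignment (actor_id integer, case_id integer);
    insert into actor values (1, 'Martin, Ann'), (2, 'Martinez, Bob'),
                             (3, 'Martinson, Cy');
    insert into actor_case_assignment values (1, 10), (2, 11), (2, 12);
""")

# Inner join: only actors with at least one assignment survive.
inner_rows = conn.execute("""
    select a.actor_id, count(aca.case_id) as case_count
    from actor a
    join actor_case_assignment aca on aca.actor_id = a.actor_id
    group by a.actor_id
    order by a.actor_id
""").fetchall()

# Left outer join: unassigned actors are kept; count() skips the NULL case_id.
outer_rows = conn.execute("""
    select a.actor_id, count(aca.case_id) as case_count
    from actor a
    left outer join actor_case_assignment aca on aca.actor_id = a.actor_id
    group by a.actor_id
    order by a.actor_id
""").fetchall()

print(inner_rows)
print(outer_rows)
```

If every actor really does have at least one case, both queries return the same rows and the inner join is the cheaper choice.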
Also, you always get max_case_id back. When case_count == 1 it is
the relevant case; when case_count != 1 you can simply ignore
max_case_id.
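Instead of ignoring the value in application code, the same rule can be pushed into the query with a CASE expression. This is only a sketch of one possible variation, not something proposed in the thread: the column alias `only_case_id` and the sample rows are hypothetical, and SQLite stands in for PostgreSQL.

```python
import sqlite3

# Hypothetical data: Ann has one case, Bob has two.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table actor (actor_id integer primary key, actor_full_name text);
    create table actor_case_assignment (actor_id integer, case_id integer);
    insert into actor values (1, 'Martin, Ann'), (2, 'Martinez, Bob');
    insert into actor_case_assignment values (1, 10), (2, 11), (2, 12);
""")

# Return the case id only when it is the actor's sole case, otherwise NULL.
case_rows = conn.execute("""
    select a.actor_full_name,
           count(aca.case_id) as case_count,
           case when count(aca.case_id) = 1 then max(aca.case_id) end as only_case_id
    from actor a
    join actor_case_assignment aca on aca.actor_id = a.actor_id
    group by a.actor_id, a.actor_full_name
    order by a.actor_full_name
""").fetchall()

print(case_rows)
```

The aggregate is still there, but callers no longer have to remember that the id is meaningless for actors with several cases.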
Try running this query with explain to compare against the others.
Also, try adding an index to the actor_full_name_uppercase column, since
you are both searching on it and using it in an order by clause. Indexes will
incur slight performance expenses when inserting and updating.
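As a rough illustration of the index-then-explain workflow, the sketch below creates an index on the uppercase name column and asks the engine for its plan. Several things here are assumptions: SQLite and its `explain query plan` stand in for PostgreSQL's `explain`, the index name `idx_actor_name_upper` and the sample rows are invented, and the prefix match is written as a range so that index use does not depend on engine-specific LIKE rules. The plan text differs between engines and versions, so it is printed rather than relied on.

```python
import sqlite3

# Hypothetical data and index name; column names follow the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table actor (actor_id integer primary key, actor_full_name text,
                        actor_full_name_uppercase text);
    insert into actor values (1, 'Martin, Ann', 'MARTIN, ANN'),
                             (2, 'Martinez, Bob', 'MARTINEZ, BOB'),
                             (3, 'Smith, Dee', 'SMITH, DEE');
    create index idx_actor_name_upper on actor (actor_full_name_uppercase);
""")

# Prefix search for 'MARTIN%' expressed as a half-open range on the indexed column.
query = """
    select actor_full_name
    from actor
    where actor_full_name_uppercase >= 'MARTIN'
      and actor_full_name_uppercase < 'MARTIO'
    order by actor_full_name_uppercase
"""

# The last column of each plan row holds the human-readable description.
plan = [row[-1] for row in conn.execute("explain query plan " + query)]
names = [row[0] for row in conn.execute(query)]

print("\n".join(plan))
print(names)
```

Comparing the printed plan before and after creating the index is the quickest way to see whether the planner actually picks it up.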
cheers.