Обсуждение: count of occurences

Поиск
Список
Период
Сортировка

count of occurences

От
adamcrume@hotmail.com (Adam)
Дата:
I help run a job database and have a table of search records.  I want
a query that will return the top 10 jobs by search frequency.  I'm
familiar with ORDER BY and LIMIT, so I basically need this:

Given a table search_records:
job_num
-------
1
2
2
3
4
4
4

I want a query that will return:
job_num | count
--------+------
1       |1
2       |2
3       |1
4       |3

I tried

select distinct job_num, (select count(*) from search_records j where
j.job_num=k.job_num) from search_records k

but it is horribly slow (it takes several minutes on a table of about
25k rows!).  I assume it scans the entire table for every job_num in
order to count the number of occurences of that job_num, taking order
n^2 time.  Since I can easily use job_num as an index (being integers
from 0 to roughly 400 so far) I could just do a "select * from
search_records" and do the counting in PHP (our HTML pre-processor) in
order n time.  However, I don't know how to do an order n*log(n) sort
in PHP, just n^2, so there would still be an efficiency problem.
I have Postgresql 7.0.3.
Help is of course greatly appreciated.

Re: count of occurrences

От
"Oliver Elphick"
Дата:
Adam wrote:
  >I help run a job database and have a table of search records.  I want
  >a query that will return the top 10 jobs by search frequency.  I'm
  >familiar with ORDER BY and LIMIT, so I basically need this:
  >
  >Given a table search_records:
  >job_num
  >-------
  >1
  >2
  >2
  >3
  >4
  >4
  >4

SELECT job_num, COUNT(*)
  FROM search_records
  GROUP BY job_num
  ORDER BY COUNT(*) DESC
  LIMIT 10;


--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
PGP: 1024R/32B8FAA1: 97 EA 1D 47 72 3F 28 47  6B 7E 39 CC 56 E4 C1 47
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
                 ========================================
     "But without faith it is impossible to please him; for
      he that cometh to God must believe that he is, and
      that he is a rewarder of them that diligently seek
      him."        Hebrews 11:6



Re: count of occurences

От
Andrew Gould
Дата:
You're just missing 'group by', and a little
simplicity.

Try this:

select job_num, count(job_num) as frequency
from search_records
group by job_num
order by frequency desc
limit 10;

Have fun,

Andrew Gould

--- Adam <adamcrume@hotmail.com> wrote:
> I help run a job database and have a table of search
> records.  I want
> a query that will return the top 10 jobs by search
> frequency.  I'm
> familiar with ORDER BY and LIMIT, so I basically
> need this:
>
> Given a table search_records:
> job_num
> -------
> 1
> 2
> 2
> 3
> 4
> 4
> 4
>
> I want a query that will return:
> job_num | count
> --------+------
> 1       |1
> 2       |2
> 3       |1
> 4       |3
>
> I tried
>
> select distinct job_num, (select count(*) from
> search_records j where
> j.job_num=k.job_num) from search_records k
>
> but it is horribly slow (it takes several minutes on
> a table of about
> 25k rows!).  I assume it scans the entire table for
> every job_num in
> order to count the number of occurences of that
> job_num, taking order
> n^2 time.  Since I can easily use job_num as an
> index (being integers
> from 0 to roughly 400 so far) I could just do a
> "select * from
> search_records" and do the counting in PHP (our HTML
> pre-processor) in
> order n time.  However, I don't know how to do an
> order n*log(n) sort
> in PHP, just n^2, so there would still be an
> efficiency problem.
> I have Postgresql 7.0.3.
> Help is of course greatly appreciated.
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster


__________________________________________________
Terrorist Attacks on U.S. - How can you help?
Donate cash, emergency relief information
http://dailynews.yahoo.com/fc/US/Emergency_Information/

Re: count of occurences

От
"Eric Ridge"
Дата:
> I tried
>
> select distinct job_num, (select count(*) from search_records j where
> j.job_num=k.job_num) from search_records k

can't you just do:

select job_num, count(job_num) from search_records group by job_num
order by job_num

http://www.postgresql.org/idocs/index.php?queries.html#QUERIES-GROUP

and creating an index on job_num might improve performace even more, but
I don't know if index scans are used in aggregate functions or not.

eric