Discussion: Approach to extract top records from table based upon aggregate
Hi, I have a table that contains call records. I'm looking to get only
records for users who made the most calls over a particular time duration in
an efficient way.
calls()
time, duration, caller_number, dialed_number
-- query to get top 10 callers
select caller_number, count(1) from calls group by caller_number order by
calls desc limit 10
--my current query to get those callers
select * from call where caller_number in (above query)
It works but I was hoping for something a little more efficient if anyone
has an idea.
Thanks
> Hi, I have a table that contains call records. I'm looking to get only
> records for users who made the most calls over a particular time duration in
> an efficient way.
>
> calls()
>
> time, duration, caller_number, dialed_number
>
> -- query to get top 10 callers
> select caller_number, count(1) from calls group by caller_number order by
> calls desc limit 10
>
> --my current query to get those callers
>
> select * from call where caller_number in (above query)
>
> It works but I was hoping for something a little more efficient if anyone
> has an idea.
I don't think there is anything that is "a little more efficient" (implying, only a bit harder to implement).
You can probably get significantly faster by combining various forms of pre-computation and caching. It is likewise significantly more complex to implement.
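One possible form of such pre-computation, as a rough sketch only: keep a small rollup table with one row per caller per day and read the top callers from it instead of scanning calls. The table name call_counts_daily, the column types (time assumed to be a timestamp, caller_number assumed to be text), and the example dates are all assumptions, not details from the original post.

```sql
-- Hypothetical rollup table: one row per caller per day.
-- Assumption: caller_number is text; adjust to the real column type.
CREATE TABLE call_counts_daily (
    call_day      date   NOT NULL,
    caller_number text   NOT NULL,
    count_calls   bigint NOT NULL,
    PRIMARY KEY (call_day, caller_number)
);

-- Load one finished day, e.g. from a scheduled job.
INSERT INTO call_counts_daily (call_day, caller_number, count_calls)
SELECT c.time::date, c.caller_number, count(*)
FROM calls AS c
WHERE c.time >= date '2015-11-01'
  AND c.time <  date '2015-11-02'
GROUP BY c.time::date, c.caller_number;

-- Top 10 callers over a week, read from the small table.
SELECT caller_number,
       sum(count_calls) AS total_calls
FROM call_counts_daily
WHERE call_day >= date '2015-11-01'
  AND call_day <  date '2015-11-08'
GROUP BY caller_number
ORDER BY total_calls DESC
LIMIT 10;
```

The trade-off is the extra complexity mentioned above: the rollup has to be kept in step with calls, and it only answers questions at day granularity.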
David J.
2015-11-02 19:14 GMT-03:00 droberts <david.roberts@riverbed.com>:
> Hi, I have a table that contains call records. I'm looking to get only
> records for users who made the most calls over a particular time duration in
> an efficient way.
>
> calls()
>
> time, duration, caller_number, dialed_number
>
> -- query to get top 10 callers
> select caller_number, count(1) from calls group by caller_number order by
> calls desc limit 10
>
> --my current query to get those callers
>
> select * from call where caller_number in (above query)
>
> It works but I was hoping for something a little more efficient if anyone
> has an idea.
I think almost every time-based table should be partitioned. Also, depending on your workload, you can create lazy views over the most recent entries in the calls table for a particular time frame.
In this particular case, you will probably want to dig deeper into the underlying design to get the best performance.
If you build a lazy view with that query, you can take the top n rows from it and get fewer callers if you need to (or more if you want to expand the feature).
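As a minimal sketch of the lazy-view idea, assuming "lazy view" can be read as a PostgreSQL materialized view (available since 9.3): the view stores a wider top list over a recent window, and readers take the top n from it. The view name top_callers_recent, the 7-day window, and the limit of 100 are hypothetical choices, and time is assumed to be a timestamp column.

```sql
-- Assumption: "lazy view" is implemented as a materialized view.
-- The name, the 7-day window, and the limit of 100 are hypothetical.
CREATE MATERIALIZED VIEW top_callers_recent AS
SELECT caller_number,
       count(*) AS count_calls
FROM calls
WHERE time >= now() - interval '7 days'
GROUP BY caller_number
ORDER BY count_calls DESC
LIMIT 100;

-- Re-run the stored query whenever fresher numbers are needed.
REFRESH MATERIALIZED VIEW top_callers_recent;

-- Take the top n from the stored result (n = 10 here).
SELECT caller_number, count_calls
FROM top_callers_recent
ORDER BY count_calls DESC
LIMIT 10;
```

The stored result is only as fresh as the last refresh, so how often to refresh depends on how stale the top list is allowed to be.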
Hope it helps,
On Mon, Nov 2, 2015 at 4:14 PM, droberts <david.roberts@riverbed.com> wrote:
> Hi, I have a table that contains call records. I'm looking to get only
> records for users who made the most calls over a particular time duration in
> an efficient way.
>
> calls()
>
> time, duration, caller_number, dialed_number
>
>
>
> -- query to get top 10 callers
> select caller_number, count(1) from calls group by caller_number order by
> calls desc limit 10
>
> --my current query to get those callers
>
> select * from call where caller_number in (above query)
>
>
> It works but I was hoping for something a little more efficient if anyone
> has an idea.
How fast is it running, and how fast do you expect it to run? To make
it faster than that, you're going to have to rethink things a little
bit. For example, you could narrow the search down to a time range,
or maybe you could keep a running tally of the count.
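A rough sketch of the time-range idea: count calls per caller only inside a half-open window, with an index on the time column so the planner has the option of skipping rows outside it. The type of time (assumed to be a timestamp), the example window bounds, and the index name calls_time_idx are assumptions rather than details from the original post.

```sql
-- Assumption: calls.time is a timestamp column; the index name is hypothetical.
CREATE INDEX calls_time_idx ON calls (time);

-- Count calls per caller inside the half-open window [start, end).
SELECT caller_number,
       count(*) AS count_calls
FROM calls
WHERE time >= timestamp '2015-11-01 00:00:00'
  AND time <  timestamp '2015-11-02 00:00:00'
GROUP BY caller_number
ORDER BY count_calls DESC
LIMIT 10;
```

Whether the index is actually used depends on how selective the window is; EXPLAIN ANALYZE on real data would show that.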
This query looks suspicious:
  select caller_number, count(1) from calls group by caller_number order by calls desc limit 10
Here "calls" is the table name, not the count, so you're ordering by the
entire table row, which is almost certainly a mistake. It probably needs
to look like:
select *
from
(
  select
    caller_number,
    count(1) as count_calls
  from calls
  group by caller_number
) q
order by count_calls desc
limit 10;
merlin
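For the original goal (all call rows for the top 10 callers), one possible shape is to join the aggregate back to calls rather than using IN. This is only a sketch with illustrative alias names; whether it beats the IN form depends on the planner and the data, so it is worth comparing both with EXPLAIN ANALYZE.

```sql
-- Sketch: fetch every call row for the 10 most active callers.
-- The names top_callers, c, and t are illustrative.
WITH top_callers AS (
    SELECT caller_number,
           count(*) AS count_calls
    FROM calls
    GROUP BY caller_number
    ORDER BY count_calls DESC
    LIMIT 10
)
SELECT c.*
FROM calls AS c
JOIN top_callers AS t
  ON t.caller_number = c.caller_number;
```

Note that callers tied at the tenth place are cut off arbitrarily by LIMIT 10.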