Re: Create custom aggregate function and custom sfunc

Поиск
Список
Период
Сортировка
От Jasmin Dizdarevic
Тема Re: Create custom aggregate function and custom sfunc
Дата
Msg-id a0eee4d40907030014t1457dc66xfbf0c47fc4d05b0c@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Create custom aggregate function and custom sfunc  (nha <lyondif02@free.fr>)
Список pgsql-sql
Hi,
 
thank you everybody for your help. The classy solution posted by nha works great!
 
Regards
Jasmin

2009/7/3 nha <lyondif02@free.fr>
Hello,

Le 2/07/09 23:21, Greg Stark a écrit :

On Thu, Jul 2, 2009 at 3:48 PM, Jasmin
Dizdarevic<jasmin.dizdarevic@gmail.com>  wrote:
customer ; seg
111 ; L1
111 ; L2
111 ; L1
222 ; L3
222 ; L3
222 ; L2

the result should look like this:

111: L1 - because L1 is higher than L2 and ratio of L1 : L2 is 2 : 1
222: L3 - because L3 is higher than L2 and ratio of L3 : L2 is 2 : 1

Are you just looking for the most frequent seg for each customer?

select distinct on (customer) customer,seg
   from (select customer, seg, count(*) as n from tab group by seg)
  order by customer, n desc

That doesn't give the ratios though.


Some errors would occur with the above query from within the subquery because of a projection on customer column that is not a grouped column (neither an uniquely identified column by seg--seg is assumed not to be a primary key here according to the given examples of value). By the way, the ordered column n is not a member of the projected columns.

Nevertheless, in the same direction as Greg Stark, the following query would approach the target result:

SELECT T5A.customer, T5A.seg
FROM (
       SELECT T2.customer, T2.seg, COUNT(*) AS nb
       FROM cst T2
       GROUP BY T2.customer, T2.seg
) T5A INNER JOIN (
       SELECT T4.customer, MAX(T4.nb) AS maxNb
       FROM (
               SELECT T2B.customer, T2B.seg, COUNT(*) AS nb
               FROM cst T2B
               GROUP BY T2B.customer, T2B.seg
       ) T4
       GROUP BY T4.customer
) T5B
ON T5A.customer = T5B.customer
AND T5A.nb = T5B.maxNb

where 'cst' is the reference table including 'customer' and 'seg' columns. This query considers a join between two tables:
- the first table, aliased T5A, counts the multiplicity for each couple of 'customer' and 'seg';
- the secund table, aliased T5B, reveals the highest multiplicity for each 'customer' with regard to each 'seg' attached to customer.
Then the join only retains the couples of 'customer' and 'seg' whom multiplicity equals the higher for the 'customer' of the current couple.

Each multiplicity may be easily added to the resulting records by spanning the projected columns with T5A.nb column. A sort of ratio may also be added; one aggregated column may first be added to table T4 as SUM(T4.nb) -- that is, sum of multiplicity for each customer; then the resulting join may compute the ratio between the highest multiplicity and the sum of multiplicity for each customer, in addition to the seg (or list of seg) for which the multiplicity is the highest.

Here is the modified query:

SELECT T5A.customer, T5A.seg, T5A.nb/T5B.sumNb
FROM (
       SELECT T2.customer, T2.seg, COUNT(*) AS nb
       FROM cst T2
       GROUP BY T2.customer, T2.seg
) T5A INNER JOIN (
       SELECT T4.customer, MAX(T4.nb) AS maxNb, SUM(T4.nb) AS sumNb
       FROM (
               SELECT T2B.customer, T2B.seg, COUNT(*) AS nb
               FROM cst T2B
               GROUP BY T2B.customer, T2B.seg
       ) T4
       GROUP BY T4.customer
) T5B
ON T5A.customer = T5B.customer
AND T5A.nb = T5B.maxNb

As is, this query may result to multiple couples of customer and seg with the same customer value when many seg relie as many times as many others for the current customer. This "edge effect" may be avoided in many ways depending on the original purpose.

Regards.

--
nha / Lyon / France.





В списке pgsql-sql по дате отправления:

Предыдущее
От: nha
Дата:
Сообщение: Re: Create custom aggregate function and custom sfunc
Следующее
От: Chris Browne
Дата:
Сообщение: Re: Sequences