WINDOW functions - proposed addition weight (dp) for percent_rank, cume_dist

Поиск
Список
Период
Сортировка
От Michael van der Kolff
Тема WINDOW functions - proposed addition weight (dp) for percent_rank, cume_dist
Дата
Msg-id 8233f01e1002081323x36c28c14y7bd7247c5d5187ff@mail.gmail.com
обсуждение исходный текст
Список pgsql-general
Dear all,

I have a bunch of CDRs which I turned into a running list showing how
many were in progress at any one time:

WITH
parameters as (
  SELECT
    (timestamp with time zone '2010-01-19 00:00:01+11') as beginning,
    (timestamp with time zone '2010-01-19 00:00:01+11') + (interval
'24 hours' * 17) as ending
),
call_times as (
  SELECT
    greatest(calldate,beginning) as callstart,
    least(calldate + duration * (interval '1 second'),p.ending) as callend
  FROM
    cdr, parameters p
  where ((calldate BETWEEN beginning AND ending) or ((calldate +
duration * (interval '1 second')) BETWEEN beginning AND ending))
/*    AND dcontext <> 'internalNumbers' */ --uncomment for outgoing
call statistics
),
call_starts as (
  SELECT
    callstart as eventTime,
    1::int8 as numInProgress
  FROM
    call_times
),
call_ends as (
  SELECT
    callend as eventTime,
    -1::int8 as numInProgress
  FROM
    call_times
),
call_times_numInProgress as (
  select period(eventTime, lead(eventTime,1,p.ending) over
byEventTime) as timeSlice, sum(numInProgress) over byEventTime as
callsInProgress
  FROM
    (select eventTime, numInProgress from call_starts union all select
eventTime, numInProgress from call_ends) events, parameters p
  WINDOW byEventTime as (order by eventTime asc)
)
SELECT *, percent_rank() OVER (ORDER BY callsInProgress ASC) AS
percentile FROM call_times_numInProgress ORDER BY callsInProgress DESC

using the asterisk 'standard' definition for CDRs with one extra field
asking whether it's been copied to an accounting program (should do
something more sensible, but it's using firebird for the moment...):
CREATE TABLE cdr
(
  acctid bigserial NOT NULL,
  calldate timestamp with time zone NOT NULL DEFAULT now(),
  clid character varying(45) NOT NULL DEFAULT ''::character varying,
  src character varying(45) NOT NULL DEFAULT ''::character varying,
  dst character varying(45) NOT NULL DEFAULT ''::character varying,
  dcontext character varying(45) NOT NULL DEFAULT ''::character varying,
  channel character varying(45) NOT NULL DEFAULT ''::character varying,
  dstchannel character varying(45) NOT NULL DEFAULT ''::character varying,
  lastapp character varying(45) NOT NULL DEFAULT ''::character varying,
  lastdata character varying(45) NOT NULL DEFAULT ''::character varying,
  duration integer NOT NULL DEFAULT 0,
  billsec integer NOT NULL DEFAULT 0,
  disposition character varying(45) NOT NULL DEFAULT ''::character varying,
  amaflags integer NOT NULL DEFAULT 0,
  accountcode character varying(45) NOT NULL DEFAULT ''::character varying,
  uniqueid character varying(45) NOT NULL DEFAULT ''::character varying,
  "copiedToSPInfo" boolean DEFAULT false,
  CONSTRAINT cdr_pkey PRIMARY KEY (acctid)
)
WITH (
  OIDS=FALSE
);


But percent_rank here simply shows the direct percentile.  What I
would like is to say percent_rank(length(timeSlice)), which would give
the result of the cumulative sum of the length of each period divided
by the sum of the entire period

So what I propose is a minor extension:  Add a weight parameter to
percent_rank/cume_dist/others, which defaults to 1.  Current behaviour
should stay the same, AFAICT...

Cheers,

Michael

В списке pgsql-general по дате отправления:

Предыдущее
От: Oleg Bartunov
Дата:
Сообщение: Re: turning a tsvector without position in a weighted tsvector
Следующее
От: Ivan Sergio Borgonovo
Дата:
Сообщение: Re: turning a tsvector without position in a weighted tsvector