Re: Given a set of daterange, finding the continuous range thatincludes a particular date

Поиск
Список
Период
Сортировка
От Ken Tanzer
Тема Re: Given a set of daterange, finding the continuous range thatincludes a particular date
Дата
Msg-id CAD3a31VXGX-o=9OHEHgD7Ou0iwHnZ6Bnd-dm8FDkokL9-tN8Tw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Given a set of daterange, finding the continuous range thatincludes a particular date  (Viktor Fougstedt <viktor@chalmers.se>)
Список pgsql-general


On Fri, Feb 23, 2018 at 6:42 AM, Viktor Fougstedt <viktor@chalmers.se> wrote:

Hi.

This answer is perhaps useful if I understand your problem correctly. But I might have interpreted it wrongly. :-)


I would probably start with merging intervals so that overlapping and adjacent intervals become single continuous intervals, then select from those merged intervals.

We have an application with a lot of interval handling in PostgreSQL, and we use many variants of algorithms based on one by Itzik Ben Gan which he calls “packing intervals”. The post we started with was an old reader’s challenge from SQL Magazine. It has since been updated when MS-SQL started supporting window functions better (Itzik Ben Gan is a MS-SQL-guy).

Basically, it is a few CTE:s which convert the intervals into “start” (+1) and “stop” (-1) events, then keeps a running sum of these, and finally creates new intervals by noting that the merged (or “packed”) intervals starts with events that had sum=0 before them and stops with events which have sum=0 after them.

It involves both CTE:s and window functions and is quite a beautiful example of SQL, IMHO.

I think it’s best to google it, he does a good job of explaining how it works.

Hope that helps a bit at least.

/Viktor

Hi.  Thanks for the many helpful responses!  Although I didn't end up with exactly what I was looking for, I think I got to something that works good enough for now.  I did it with CTEs, in a way I think similar to what you were suggesting (my printer is out of ink, so I didn't actually get to look at that book yet!)  I ended up having to do 4 passes:

1) Identify the starts and ends of continuous ranges
2) Eliminate the middle-point records (non stops/starts)
3) Merge the stop date in with the starts
4) Eliminate the stops

I couldn't see how to do it in less steps, but if there's a way...

I posted the query along with some sample data below.

Thanks again!

Ken

BEGIN;

CREATE TEMP TABLE sample_data (
client_id INTEGER,
start_date DATE,
end_date DATE,
episode INTEGER -- I won't actually have this info; just a label I added for clarity and testing
);

INSERT INTO sample_data VALUES


(1,'1990-01-01','1990-12-31',0),
(1,'1991-01-01','1991-12-31',0),
(1,'1995-01-01','1995-06-30',1),
(1,'2000-01-01','2000-12-31',2),
(1,'2001-01-01','2001-12-31',2),
(1,'2002-01-01','2002-12-31',2),
(1,'2003-01-01','2003-12-31',2),
(1,'2004-01-01','2004-12-31',2),
(1,'2005-01-01','2005-12-31',2),
(1,'2006-01-01','2006-12-31',2),
(1,'2014-01-01','2014-12-31',3),
(1,'2015-01-01','2015-12-31',3),
(1,'2017-06-30','2017-12-31',4),
(1,'2018-01-01',NULL,4),

(2,'2014-02-01','2015-01-31',0),
(2,'2015-02-01','2015-12-31',0),
(2,'2017-09-30','2018-01-31',1),
(2,'2018-02-01','2018-02-14',1)
;

WITH peek3 AS (
WITH peek2 AS (
WITH peek AS (
SELECT
  client_id,
  episode,
  daterange(start_date,end_date,'[]') AS range,
  COALESCE(daterange(start_date,end_date+1,'[]') &&
  lead(daterange(start_date,end_date,'[]')) OVER (PARTITION BY client_id ORDER BY start_date),false) AS continues,
  COALESCE(daterange(start_date,end_date,'[]') &&
  lag(daterange(start_date,end_date+1,'[]')) OVER (PARTITION BY client_id ORDER BY start_date),false) AS is_continued
FROM
  sample_data
)
SELECT
  *
FROM peek
  WHERE NOT (is_continued AND continues)
)
SELECT client_id,episode,range,
daterange(lower(range),
  CASE WHEN lead(is_continued) OVER (PARTITION BY client_id ORDER BY range) THEN
  lead(upper(range)) OVER (PARTITION BY client_id ORDER BY range)
  ELSE upper(range)
  END) AS full_range
,continues,is_continued
FROM peek2
)
SELECT * FROM peek3
WHERE NOT is_continued
;



--
AGENCY Software  
A Free Software data system
By and for non-profits
(253) 245-3801

learn more about AGENCY or
follow the discussion.

В списке pgsql-general по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: Query error: could not resize shared memory segment
Следующее
От: Ken Tanzer
Дата:
Сообщение: Re: Given a set of daterange, finding the continuous range thatincludes a particular date (aggregates)