Обсуждение: Fastest way / best practice to calculate "next birthdays"

Поиск
Список
Период
Сортировка

Fastest way / best practice to calculate "next birthdays"

От
Hannes Dorbath
Дата:
The subject basically says it all, I'm looking for the fastest
(indexable) way to calculate the next birthdays relative to NOW() from a
dataset of about 1 million users.

I'm currently using a function based index, but leap year handling /
mapping February 29 to February 28 gives me some headaches.

Is there any best practice to do that in PostgreSQL?


--
Best regards,
Hannes Dorbath

Re: Fastest way / best practice to calculate "next birthdays"

От
"Joshua D. Drake"
Дата:
Hannes Dorbath wrote:
> The subject basically says it all, I'm looking for the fastest
> (indexable) way to calculate the next birthdays relative to NOW() from a
> dataset of about 1 million users.
>
> I'm currently using a function based index, but leap year handling /
> mapping February 29 to February 28 gives me some headaches.
>
> Is there any best practice to do that in PostgreSQL?

postgres=# SELECT current_date|| ' a ' || to_char(current_date, 'Day'),

current_date + '1 Year'::interval || ' a '  || to_char(current_date + '1
Year'::interval, 'Day') as next_birthday;
         ?column?        |          next_birthday
------------------------+---------------------------------
  2008-05-04 a Sunday    | 2009-05-04 00:00:00 a Monday

?


Sincerely,

Joshua D. Drake


Re: Fastest way / best practice to calculate "next birthdays"

От
Hannes Dorbath
Дата:
Joshua D. Drake wrote:
> postgres=# SELECT current_date|| ' a ' || to_char(current_date, 'Day'),
> current_date + '1 Year'::interval || ' a '  || to_char(current_date + '1
> Year'::interval, 'Day') as next_birthday;
>         ?column?        |          next_birthday
> ------------------------+---------------------------------
>  2008-05-04 a Sunday    | 2009-05-04 00:00:00 a Monday
>
> ?

Sorry, I think I phrased the question badly. What I'm after basically is:

http://www.depesz.com/index.php/2007/10/26/who-has-birthday-tomorrow/


--
Best regards,
Hannes Dorbath

Re: Fastest way / best practice to calculate "next birthdays"

От
Zoltan Boszormenyi
Дата:
Hannes Dorbath írta:
> Joshua D. Drake wrote:
>> postgres=# SELECT current_date|| ' a ' || to_char(current_date, 'Day'),
>> current_date + '1 Year'::interval || ' a '  || to_char(current_date +
>> '1 Year'::interval, 'Day') as next_birthday;
>>         ?column?        |          next_birthday
>> ------------------------+---------------------------------
>>  2008-05-04 a Sunday    | 2009-05-04 00:00:00 a Monday
>>
>> ?
>
> Sorry, I think I phrased the question badly. What I'm after basically is:
>
> http://www.depesz.com/index.php/2007/10/26/who-has-birthday-tomorrow/

If you define the same functional index as in the above link:

CREATE OR REPLACE FUNCTION indexable_month_day(date) RETURNS TEXT as $BODY$
SELECT to_char($1, 'MM-DD');
$BODY$ language 'sql' IMMUTABLE STRICT;

create table user_birthdate (
  id serial not null primary key,
  birthdate date
);
create index user_birthdate_day_idx on user_birthdate (
indexable_month_day(birthdate) );

Then you can use this query:

select count(*) from user_birthdate where indexable_month_day(birthdate)
 > '02-28' and indexable_month_day(birthdate) <= '03-01';

In a generic and parametrized way:

select * from user_birthdate
where
  indexable_month_day(birthdate) > indexable_month_day(now()::date) and
  indexable_month_day(birthdate) <= indexable_month_day((now() + '1
days'::interval)::date);

This will still use the index and it will work for the poor ones
who have birthday every 4 years, too. Assume, it's 02-08 today, 03-01
the next day.
The now() < X <= now() + 1 day range will find 02-29.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/



Re: Fastest way / best practice to calculate "next birthdays"

От
Shane Ambler
Дата:
Hannes Dorbath wrote:

> Sorry, I think I phrased the question badly. What I'm after basically is:
>
> http://www.depesz.com/index.php/2007/10/26/who-has-birthday-tomorrow/
>

OK So what I came up with is - (the times are from a G4 1.25Ghz)

CREATE TABLE birthdaytest
(
   id serial PRIMARY KEY,
   birthdate date
);


CREATE INDEX idx_bday_month ON birthdaytest
USING btree(extract(month from birthdate));

CREATE INDEX idx_bday_day ON birthdaytest
USING btree(extract(day from birthdate));


insert into birthdaytest (birthdate) values
('1930-01-01'::date+generate_series(0,365*70));

... I repeated this another 15 times to load some data


vacuum analyse birthdaytest;

\timing

select count(*) from birthdaytest;

>  count
> --------
>  408816
> (1 row)
>
> Time: 233.501 ms


select * from birthdaytest
where extract(month from birthdate) = 5
and extract(day from birthdate) between 6 and 12;

>    id   | birthdate
> --------+------------
>     126 | 1930-05-06
>     127 | 1930-05-07
>     128 | 1930-05-08
> ...
> ...
>  408613 | 1999-05-11
>  408614 | 1999-05-12
> (7840 rows)
>
> Time: 211.237 ms


select * from birthdaytest
where extract(month from birthdate) = extract(month from current_date)
and extract(day from birthdate) between extract(day from current_date)
and extract(day from current_date+14);

>    id   | birthdate
> --------+------------
>     125 | 1930-05-05
>     126 | 1930-05-06
>     127 | 1930-05-07
> ...
> ...
>  408619 | 1999-05-17
>  408620 | 1999-05-18
>  408621 | 1999-05-19
> (16800 rows)
>
> Time: 483.915 ms


--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz

Re: Fastest way / best practice to calculate "next birthdays"

От
"er.tejaspatel88@gmail.com"
Дата:
If I have to find upcoming birthdays in current week and the current week
fall into different months - how would you handle that?





--
View this message in context:
http://postgresql.nabble.com/Fastest-way-best-practice-to-calculate-next-birthdays-tp2068398p5849705.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


Re: Fastest way / best practice to calculate "next birthdays"

От
"David G. Johnston"
Дата:
On Monday, May 18, 2015, er.tejaspatel88@gmail.com <er.tejaspatel88@gmail.com> wrote:
If I have to find upcoming birthdays in current week and the current week
fall into different months - how would you handle that?

Extract(week from timestamptz_column)

ISO weeks are not affected by month boundaries but do start on Monday.

David J.

Re: Fastest way / best practice to calculate "next birthdays"

От
Bosco Rama
Дата:
On 05/20/15 20:22, David G. Johnston wrote:
> On Monday, May 18, 2015, er.tejaspatel88@gmail.com <
> er.tejaspatel88@gmail.com> wrote:
>
>> If I have to find upcoming birthdays in current week and the current week
>> fall into different months - how would you handle that?
>>
>
> Extract(week from timestamptz_column)
>
> ISO weeks are not affected by month boundaries but do start on Monday.

There is the year start/end boundary conditions to worry about there.

If the current week covers Dec28-Jan02 then week of year won't help for
a birthday on Jan01 or Jan02 if 'today' is in the Dec portion.  Ditto
for birthday in Dec portion when 'today' is in the Jan portion.

There is probably a better way to do it than what I'm showing here, but
here's an example:

with x as (
   select now() - (extract(dow from now()) || ' days')::interval as
weekstart
)
select to_char(x.weekstart, 'YYYY-MM-DD') as first_day,
        to_char(x.weekstart + '6 days', 'YYYY-MM-DD') as last_day
    from x;

You could probably make some of that into a function that accepts a
timestamptz and generates the two days.  Or even does the compare too.

HTH.

Bosco.


Re: Fastest way / best practice to calculate "next birthdays"

От
"David G. Johnston"
Дата:
On Thursday, May 21, 2015, Bosco Rama <postgres@boscorama.com> wrote:
On 05/20/15 20:22, David G. Johnston wrote:
> On Monday, May 18, 2015, er.tejaspatel88@gmail.com <
> er.tejaspatel88@gmail.com> wrote:
>
>> If I have to find upcoming birthdays in current week and the current week
>> fall into different months - how would you handle that?
>>
>
> Extract(week from timestamptz_column)
>
> ISO weeks are not affected by month boundaries but do start on Monday.

There is the year start/end boundary conditions to worry about there.

If the current week covers Dec28-Jan02 then week of year won't help for
a birthday on Jan01 or Jan02 if 'today' is in the Dec portion.  Ditto
for birthday in Dec portion when 'today' is in the Jan portion.


You need to read the documentation regarding ISO year and ISO week more carefully.  There is no issue with years only ensuring that your definition of week starts with Monday and contains 7 days.  The ISO year for January 1st can be different than the Gregorian year for the same.

David J. 

Re: Fastest way / best practice to calculate "next birthdays"

От
Jim Nasby
Дата:
On 5/21/15 11:15 AM, Bosco Rama wrote:
> You could probably make some of that into a function that accepts a
> timestamptz and generates the two days.

You'll be better off if instead of 2 days it gives you a daterange:
http://www.postgresql.org/docs/9.4/static/rangetypes.html

I don't know about the exact ISO details, but your approach is the
correct one: find the date that the current week started on and then
build a range of [week start, week start + 7 days).

Also, note the use of [ vs ). That is the ONLY correct way to do this if
you're comparing to a timestamp.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: Fastest way / best practice to calculate "next birthdays"

От
Kevin Grittner
Дата:
"er.tejaspatel88@gmail.com" <er.tejaspatel88@gmail.com> wrote:

> If I have to find upcoming birthdays in current week and the
> current week fall into different months - how would you handle
> that?

If you don't need to cross from December into January, I find the
easiest is:

SELECT * FROM person
  WHERE (EXTRACT(MONTH FROM dob), EXTRACT(DAY FROM dob))
          BETWEEN (6, 28) AND (7, 4);

That is logicically the same as:

SELECT * FROM person
  WHERE (EXTRACT(MONTH FROM dob) >= 6
    AND (EXTRACT(MONTH FROM dob) > 6
     OR (EXTRACT(DAY FROM dob) >= 28)))
    AND (EXTRACT(MONTH FROM dob) <= 7
    AND (EXTRACT(MONTH FROM dob) < 7
     OR (EXTRACT(DAY FROM dob) <= 4)));

That's the generalized case; with the months adjacent, this simpler
form is also equivalent:

SELECT * FROM person
  WHERE (EXTRACT(MONTH FROM dob) = 6
    AND EXTRACT(DAY FROM dob) >= 28)
    OR (EXTRACT(MONTH FROM dob) = 7
    AND EXTRACT(DAY FROM dob) <= 4);

The first query I showed is faster than either of the alternatives,
especially if there is an index on dob.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company