Обсуждение: Select "todays" timestamps in an index friendly way
Hi,
I am looking for a way to select all timestamps that are "today" in an
index friendly way. This select should not depend on the concrete value
of "today".
Given a table
create temporary table t (
id SERIAL primary key,
ts timestamp not null default now()
);
with some data
insert into t (ts)
select ts
from generate_series(
'2018-01-01T00:00:01'::timestamp,
'2018-12-31T23:59:59'::timestamp,
'2 minutes')
as ts;
and an index
create index on t (ts, id);
I can of course make an explicit select for `ts` values that are
"today":
select ts, id
from t
where ts >= '2018-10-23T00:00:00'::timestamp
and ts <= '2018-10-23T23:59:59'::timestamp;
This uses an Bitmap Index Scan on `t_ts_id_idx`. Good.
But the where conditions depends on concrete values of "today" which
will not return the intended result if I execute it tomorrow. I will
have to change the where condition. Not good.
I am looking for a way to make the where condition independed of the
date of execution. I can create a function
create function is_today(timestamp) returns boolean as $$
select to_char(now(), 'YYYY-MM-DD') = to_char($1, 'YYYY-MM-DD');
$$ language sql;
that converts the timestamps to text. But using this function
select * from t where is_today(ts);
will not benefit from the existing index. A Seq Scan on `t` will be
used. Not good.
Is there a way to have both: be independed of the concrete value of
"today" *and* use the index on the timestamp column?
Lutz
Hi Lutz.
On Tue, Oct 23, 2018 at 11:38 AM, Lutz Horn <lutz.horn@posteo.de> wrote:
> I am looking for a way to select all timestamps that are "today" in an
> index friendly way. This select should not depend on the concrete value
> of "today".
> Given a table
> create temporary table t (
> id SERIAL primary key,
> ts timestamp not null default now()
> );
>
> with some data
>
> insert into t (ts)
> select ts
> from generate_series(
> '2018-01-01T00:00:01'::timestamp,
> '2018-12-31T23:59:59'::timestamp,
> '2 minutes')
> as ts;
>
> and an index
>
> create index on t (ts, id);
>
> I can of course make an explicit select for `ts` values that are
> "today":
>
> select ts, id
> from t
> where ts >= '2018-10-23T00:00:00'::timestamp
> and ts <= '2018-10-23T23:59:59'::timestamp;
1st remark. Do NOT use closed interval for timestamps. Always use
half-open or you'll run into problems ( i.e., you are going to miss
2018-10-23T23:59:59.25 in that query ). For real like things ( which
timestamps are, they identify a point on the time line ) use half-open
( you can cover a line with non-overlapping half-open segments, not
with closed ones ).
I.e., your query will better be stated as
where ts >= '2018-10-23T00:00:00'::timestamp
and ts < '2018-10-24T00:00:00'::timestamp;
Which, as a nice bonus, can rely on the time part defaulting to 0:
where ts >= '2018-10-23'::timestamp
and ts < '2018-10-24'::timestamp;
and then be expressed in other ways, like
where ts >= '2018-10-23'::timestamp
and ts < ('2018-10-23'::timestamp + '1 day'::interval)
> This uses an Bitmap Index Scan on `t_ts_id_idx`. Good.
>
> But the where conditions depends on concrete values of "today" which
> will not return the intended result if I execute it tomorrow. I will
> have to change the where condition. Not good.
>
> I am looking for a way to make the where condition independed of the
> date of execution. I can create a function
>
> create function is_today(timestamp) returns boolean as $$
> select to_char(now(), 'YYYY-MM-DD') = to_char($1, 'YYYY-MM-DD');
> $$ language sql;
This is not a good way to deal with timestamp values, they are just
numbers, play with them as such. Try using something like
date_trunc('day',now()) = date_trunc('day',$1)
which states your purposes more clearly.
> that converts the timestamps to text. But using this function
>
> select * from t where is_today(ts);
>
> will not benefit from the existing index. A Seq Scan on `t` will be
> used. Not good.
>
> Is there a way to have both: be independed of the concrete value of
> "today" *and* use the index on the timestamp column?
Well, if your definition of today is 'same value as now() when
truncated to days' we can use part of what I've written above,
1st calculate today and tomorrow with same timestamp arithmetic and date_trunc:
select now(), date_trunc('day',now()) as today,
date_trunc('day',now()+'1 day') as tomorrow;
now | today | tomorrow
-------------------------------+------------------------+------------------------
2018-10-23 11:58:01.699407+02 | 2018-10-23 00:00:00+02 | 2018-10-24 00:00:00+02
(1 row)
Then plug that result in your query ( using the half-open technique )
described above:
where ts >= date_trunc('day',now())
and ts < date_trunc('day',now()+'1 day') as tomorrow;
IIRC this should use the index, you can RTFM in case you prefer using
current_timestamp and her cousins, but bear in mind if you use
something like current_date you should convert it to timestamp, not
convert ts to date, to get easy index usage.
Francisco Olarte.
Lutz Horn schrieb am 23.10.2018 um 11:38:
> I can of course make an explicit select for `ts` values that are
> "today":
>
> select ts, id
> from t
> where ts >= '2018-10-23T00:00:00'::timestamp
> and ts <= '2018-10-23T23:59:59'::timestamp;
>
> This uses an Bitmap Index Scan on `t_ts_id_idx`. Good.
>
> But the where conditions depends on concrete values of "today" which
> will not return the intended result if I execute it tomorrow. I will
> have to change the where condition. Not good.
>
> I am looking for a way to make the where condition independed of the
> date of execution. I can create a function
>
> create function is_today(timestamp) returns boolean as $$
> select to_char(now(), 'YYYY-MM-DD') = to_char($1, 'YYYY-MM-DD');
> $$ language sql;
>
> that converts the timestamps to text. But using this function
>
> select * from t where is_today(ts);
>
> will not benefit from the existing index. A Seq Scan on `t` will be
> used. Not good.
>
> Is there a way to have both: be independed of the concrete value of
> "today" *and* use the index on the timestamp column?
I typically use:
where ts >= date '2018-10-23'
and ts < date '2018-10-23' + 1
Hi Francisco,
On Tue, Oct 23, 2018 at 12:05:17PM +0200, Francisco Olarte wrote:
> 1st remark. Do NOT use closed interval for timestamps. Always use
> half-open or you'll run into problems
Good point, thanks.
> where ts >= date_trunc('day',now())
> and ts < date_trunc('day',now()+'1 day') as tomorrow;
>
> IIRC this should use the index
And it does! Thanks!
Lutz
Hi Thomas, On Tue, Oct 23, 2018 at 12:11:55PM +0200, Thomas Kellerer wrote: > I typically use: > > where ts >= date '2018-10-23' > and ts < date '2018-10-23' + 1 But here the date is an explicit value. Francisco reworded my question: > if your definition of today is 'same value as now() when truncated to > days' That's what I am (was, thanks to Francisco) looking for. Lutz
Lutz Horn schrieb am 23.10.2018 um 12:19:
> Hi Thomas,
>
> On Tue, Oct 23, 2018 at 12:11:55PM +0200, Thomas Kellerer wrote:
>> I typically use:
>>
>> where ts >= date '2018-10-23'
>> and ts < date '2018-10-23' + 1
>
> But here the date is an explicit value. Francisco reworded my question:
>
>> if your definition of today is 'same value as now() when truncated to
>> days'
>
> That's what I am (was, thanks to Francisco) looking for.
Then use current_date:
where ts >= current_date
and ts < current_date + 1
Hi
El mar., 23 de oct. de 2018 a la(s) 05:41, Thomas Kellerer (spam_eater@gmx.net) escribió:
Lutz Horn schrieb am 23.10.2018 um 12:19:
> Hi Thomas,
>
> On Tue, Oct 23, 2018 at 12:11:55PM +0200, Thomas Kellerer wrote:
>> I typically use:
>>
>> where ts >= date '2018-10-23'
>> and ts < date '2018-10-23' + 1
>
> But here the date is an explicit value. Francisco reworded my question:
>
>> if your definition of today is 'same value as now() when truncated to
>> days'
>
> That's what I am (was, thanks to Francisco) looking for.
Then use current_date:
where ts >= current_date
and ts < current_date + 1
this is equally valid?
where ts >= current_date
cordialmente:
Hellmuth Vargas
On Tue, Oct 23, 2018 at 3:05 PM, Hellmuth Vargas <hivs77@gmail.com> wrote: ... >> Then use current_date: >> >> where ts >= current_date >> and ts < current_date + 1 > > this is equally valid? > > where ts >= current_date It'is not as the problem was stated. Although ts defaulted to now(), and it is probably defaulted, nothing prohibits him from inserting timestamps in the future. Also, I'll point the table used in the sample ( bigserial+timestamp) does not seem like a real one and the "timestamps in today" pattern is commonly used in calendaring applications, which usually insert appointments in the future and recover this way to print "todays schedule". Francisco Olarte.
On Tue, Oct 23, 2018 at 03:50:14PM +0200, Francisco Olarte wrote: > It'is not as the problem was stated. Although ts defaulted to now(), > and it is probably defaulted, nothing prohibits him from inserting > timestamps in the future. Yes, this table is only used as an example for the technical question. In my real use case there are columns like "due_date" which usually contain future dates inserted by application code. > the "timestamps in today" pattern is commonly used in calendaring > applications, which usually insert appointments in the future and > recover this way to print "todays schedule". Exactly. The application must be able to execute queries like "give me all my tasks due today" without having to use a concrete value for "today". Lutz
On Tue, Oct 23, 2018 at 3:57 PM, Lutz Horn <lutz.horn@posteo.de> wrote: > On Tue, Oct 23, 2018 at 03:50:14PM +0200, Francisco Olarte wrote: >> It'is not as the problem was stated. Although ts defaulted to now(), >> and it is probably defaulted, nothing prohibits him from inserting >> timestamps in the future. > Yes, this table is only used as an example for the technical question. > In my real use case there are columns like "due_date" which usually > contain future dates inserted by application code. If your real table uses dates instead of timestamps modify the code accordingly, they are not the same ( dates are countable, instants in time are not (they are in the computer, with finite precision, but you see the difference )) Although I supose they really are timestamps, or you would have just used "date_column=current_date". >> the "timestamps in today" pattern is commonly used in calendaring >> applications, which usually insert appointments in the future and >> recover this way to print "todays schedule". > Exactly. The application must be able to execute queries like "give me > all my tasks due today" without having to use a concrete value for > "today". Been there, done that. With an IBM 84 ( instructional use. It was, not surprissingly, easier but slower, ). Happy hacking. Francisco Olarte.
On Tue, Oct 23, 2018 at 5:38 AM Lutz Horn <lutz.horn@posteo.de> wrote: > > Hi, > > I am looking for a way to select all timestamps that are "today" in an > index friendly way. This select should not depend on the concrete value > of "today". > Per TFM, https://www.postgresql.org/docs/10/static/datatype-datetime.html on table 8.13, you can use special input values: SELECT * FROM t WHERE ts >= 'today'::timestamp AND ts < 'tomorrow'::timestamp; HTH, -- Mike Rylander | Executive Director | Equinox Open Library Initiative | phone: 1-877-OPEN-ILS (673-6457) | email: miker@equinoxinitiative.org | web: http://equinoxinitiative.org
On 24 October 2018 at 07:14, Mike Rylander <mrylander@gmail.com> wrote: > > On Tue, Oct 23, 2018 at 5:38 AM Lutz Horn <lutz.horn@posteo.de> wrote: > > I am looking for a way to select all timestamps that are "today" in an > > index friendly way. This select should not depend on the concrete value > > of "today". > > Per TFM, https://www.postgresql.org/docs/10/static/datatype-datetime.html > on table 8.13, you can use special input values: > > SELECT * FROM t WHERE ts >= 'today'::timestamp AND ts < 'tomorrow'::timestamp; Of course, you'd need to be careful never to use that in a view or even a PREPAREd statement. Those abbreviations are evaluated when the query is parsed. In those cases, you'd just get the results for whatever day you did CREATE VIEW or PREPARE. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
> create temporary table t (
> id SERIAL primary key,
> ts timestamp not null default now()
> );
* add date( ts ) as a field and index date = now()::date.
* Depending on the amount of data in your table the date
may not be seletive enough to be worth using, at which
point the index may be present and ignored. Only way to
be sure is analyze it.
* Might be worth looking at a partial index using >= 00:00:00
and < 24:00:00 (PG grocks the 2400 notation for "midnight at
the end of today) or
where ts && tsrange( ... 00:00:00, ... 24:00:00, '[)] )
Nice thing about the partial index is that you can create it
on all of the non-ts fields for fast lookup by whatever and
only index the portion for today.
* Think about using a materialized view rather than a temp
table. May prove simpler to query.
--
Steven Lembark 3920 10th Ave South
Workhorse Computing Birmingham, AL 35222
lembark@wrkhors.com +1 888 359 3508