Range Types - efficiency

Поиск
Список
Период
Сортировка
От Chris Browne
Тема Range Types - efficiency
Дата
Msg-id 877hd8zz0c.fsf@cbbrowne.afilias-int.info
обсуждение исходный текст
Ответ на Range Types - representation and alignment  (Jeff Davis <pgsql@j-davis.com>)
Ответы Re: Range Types - efficiency  (Jeff Davis <pgsql@j-davis.com>)
Re: Range Types - efficiency  (Florian Weimer <fweimer@bfk.de>)
Список pgsql-hackers
One of the things I'd particularly like to use range types for is to
make it easier to construct range-related queries.  Classic example is
that of reports that work on date ranges.

I create a table that will have transaction data:

CREATE TABLE some_data (   id serial,   whensit date   -- And it'll have other attributes, but those don't matter
here...
);
CREATE INDEX some_when ON some_data USING btree (whensit);

I then populate it with a bunch of date-based data...

rangetest@localhost->  select count(*), min(whensit), max(whensit) from some_data;count |    min     |    max
-------+------------+------------37440 | 2007-01-01 | 2014-12-27
(1 row)

Here's the traditional way of doing a range-based query on this data:

rangetest@localhost->  explain analyze  select * from some_data where whensit >= '2010-01-01' and whensit <
'2010-02-01';                                                    QUERY PLAN
 

-----------------------------------------------------------------------------------------------------------------------Bitmap
HeapScan on some_data  (cost=12.30..184.23 rows=395 width=8) (actual time=0.064..0.150 rows=390 loops=1)  Recheck Cond:
((whensit>= '2010-01-01'::date) AND (whensit < '2010-02-01'::date))  ->  Bitmap Index Scan on some_when
(cost=0.00..12.21rows=395 width=0) (actual time=0.054..0.054 rows=390 loops=1)        Index Cond: ((whensit >=
'2010-01-01'::date)AND (whensit < '2010-02-01'::date))Total runtime: 0.197 ms
 
(5 rows)

The RangeType-based equivalent is the following:

rangetest@localhost->  explain analyze select * from some_data where '[2010-01-01,2010-02-01)'::daterange @> whensit;
                                          QUERY PLAN
 
---------------------------------------------------------------------------------------------------------Seq Scan on
some_data (cost=0.00..634.00 rows=1 width=8) (actual time=1.045..111.739 rows=390 loops=1)  Filter: ('[ 2010-01-01,
2010-02-01)'::daterange @> whensit)Total runtime: 111.780 ms
 
(3 rows)

This, alas, reverts to a seq scan on the table, rather than restricting
itself to the tuples of interest.

I realize that, after a fashion, I'm using this backwards.  But when I'm
doing temporal stuff, that tends to be the pattern:
- There is a set of temporal configuration, indicating criteria that  are true for particular date ranges
- There is then event data, which has but a single date, but which  needs to be matched against the temporal
configuration.

It sure would be nice to expand that filter into subqueries involving
the two criteria, in much the same fashion that is true today for
BETWEEN.  I imagine that would allow many queries with this kind of
pattern to make use of indexes, making them visibly thousands of times
faster.
-- 
"I have  traveled the  length and breadth  of this country  and talked
with the best people, and can assure you that data processing is a fad
that won't  last out  the year".  --  Business books  editor, Prentice
Hall 1957


В списке pgsql-hackers по дате отправления:

Предыдущее
От: amit sehas
Дата:
Сообщение: query execution question
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: pl/python explicit subtransactions