Обсуждение: encourging bitmap AND

Поиск
Список
Период
Сортировка

encourging bitmap AND

От
Ben
Дата:
hello --

i have a schema similar to the following

create table foo (
 id integer not null,
 val integer not null,
 s integer not null,
 e integer not null
);

create index foo_s_idx on foo using btree (s);
create index foo_e_idx on foo using btree (e);

i want to do queries like

select * from foo where 150 between s and e;

this usually gives me index or bitmap scans on one of the indices, plus a filter for the other condition.  this is not
terriblyefficient as the table is large (billions of rows), but there should only be a few thousand rows with s < k < e
forany k.  the data is id, value, interval (s, e), with s < e, and e - s is "small". 

i am experimenting and would like to see the effect of using a bitmap index "AND" scan using both indices.  as far as i
cantell, there are no easy ways to force or encourage this -- there are no switches like enable_seqscan and such which
forcethe use of bitmap AND, and i don't know how to tell the query planner about the structure of the data (i don't
thinkthis is adequately captured in any of the statistics it generates, i would need multi-column statistics.) 

any clues?

best regards, ben

Re: encourging bitmap AND

От
Tom Lane
Дата:
Ben <midfield@gmail.com> writes:
> i have a schema similar to the following

> create index foo_s_idx on foo using btree (s);
> create index foo_e_idx on foo using btree (e);

> i want to do queries like

> select * from foo where 150 between s and e;

That index structure is really entirely unsuited to what you want to do,
so it's not surprising that the planner isn't impressed with the idea of
a bitmap AND.

I'd suggest setting up something involving a gist index over an
interval-ish datatype.  The PERIOD datatype that Jeff Davis is fooling
with would do what you want --- see
     http://pgfoundry.org/projects/temporal
     http://thoughts.j-davis.com/2009/11/08/temporal-keys-part-2/
If you don't want any add-on parts involved, you could fake it by using
a box or possibly lseg.

            regards, tom lane

Re: encourging bitmap AND

От
Ben
Дата:
On Dec 23, 2010, at 12:52 PM, Tom Lane wrote:

> Ben <midfield@gmail.com> writes:
>> i have a schema similar to the following
>
>> create index foo_s_idx on foo using btree (s);
>> create index foo_e_idx on foo using btree (e);
>
>> i want to do queries like
>
>> select * from foo where 150 between s and e;
>
> That index structure is really entirely unsuited to what you want to do,
> so it's not surprising that the planner isn't impressed with the idea of
> a bitmap AND.
>
> I'd suggest setting up something involving a gist index over an
> interval-ish datatype.  The PERIOD datatype that Jeff Davis is fooling
> with would do what you want --- see
>     http://pgfoundry.org/projects/temporal
>     http://thoughts.j-davis.com/2009/11/08/temporal-keys-part-2/
> If you don't want any add-on parts involved, you could fake it by using
> a box or possibly lseg.

Thanks for the quick response.  I've already played a lot with the PERIOD datatype and GIST, it works pretty good, but
Ifound that the lack of statistics and real selectivity functions hurt me.  I was experimenting with the two column
setupas an alternative, but if you think this is a dead end I'll look elsewhere. 

Best regards, Ben

Re: encourging bitmap AND

От
Marti Raudsepp
Дата:
On Thu, Dec 23, 2010 at 22:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ben <midfield@gmail.com> writes:
>> i have a schema similar to the following
>
>> create index foo_s_idx on foo using btree (s);
>> create index foo_e_idx on foo using btree (e);
>
>> i want to do queries like
>
>> select * from foo where 150 between s and e;
>
> That index structure is really entirely unsuited to what you want to do,
> so it's not surprising that the planner isn't impressed with the idea of
> a bitmap AND.

Why is it unsuited for this query? It expands to (150 < s AND 150 > e)
 which should work nicely with bitmap AND as far as I can tell.

Regards,
Marti

Re: encourging bitmap AND

От
Tom Lane
Дата:
Marti Raudsepp <marti@juffo.org> writes:
> On Thu, Dec 23, 2010 at 22:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That index structure is really entirely unsuited to what you want to do,
>> so it's not surprising that the planner isn't impressed with the idea of
>> a bitmap AND.

> Why is it unsuited for this query? It expands to (150 < s AND 150 > e)
>  which should work nicely with bitmap AND as far as I can tell.

Well, maybe for small values of "nicely".  If you do it like that, then
on average each indexscan will scan about half of its index and return a
bitmap representing about half the rows in the table.  That's an
expensive indexscan, and an expensive bitmap-AND operation, even if the
final number of rows out of the AND is small.  Plus you're at serious
risk that the bitmaps will become lossy, which degrades the performance
of the final bitmap heapscan.

If you're doing interval queries enough to worry about having an index
for them, you really want an indexing structure that is designed to do
interval queries efficiently.

            regards, tom lane

Re: encourging bitmap AND

От
Jim Nasby
Дата:
On Dec 26, 2010, at 11:24 AM, Tom Lane wrote:
> If you're doing interval queries enough to worry about having an index
> for them, you really want an indexing structure that is designed to do
> interval queries efficiently.

BTW, one way to accomplish that is to transform your data into geometric shapes and then index them accordingly. Prior
tothe work Jeff Davis has done on time intervals it was common to treat time as points and ranges as lines or boxes.
Whilewe no longer need to play those games for time, I don't think there's an equivalent for non-time datatypes. 
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net