Обсуждение: btree_gin and ranges

Поиск
Список
Период
Сортировка

btree_gin and ranges

От
Teodor Sigaev
Дата:
Suggested patch adds GIN support contains operator for ranges over scalar column.

It allows more effective GIN scan. Currently, queries like
SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
will be excuted by GIN with two scans: one is from mines infinity to 1 and
another is from -1 to plus infinity. That's because GIN is "generalized" and it
doesn't know a semantics of operation.

With patch it's possible to rewrite query with ranges
SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
and GIN index will support this query with single scan from -1 to 1.

Patch provides index support only for existing range types: int4, int8, numeric,
date and timestamp with and without time zone.



--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Вложения

Re: btree_gin and ranges

От
Marti Raudsepp
Дата:
Hi

On Wed, Oct 22, 2014 at 1:55 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
> With patch it's possible to rewrite query with ranges
> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
> and GIN index will support this query with single scan from -1 to 1.

Shouldn't this be implemented in a more generic manner? An ordinary
btree index could very well support <@ queries too, but your patch
only adds this capability to btree-gin.

The documentation describes btree-gin as providing "GIN operator
classes that implement B-tree equivalent behavior", but now the
behavior diverges.

Regards,
Marti



Re: btree_gin and ranges

От
Teodor Sigaev
Дата:
> Shouldn't this be implemented in a more generic manner? An ordinary
Unfortunately I don't see a way for that. GIN is generalized - and it doesn't 
know semantic. Actually, we could fix first five strategy numbers for BTree's 
strategies, and then we could teach GIN core to use BTRee semantic, but what 
about with already existed operator classes?

>
> The documentation describes btree-gin as providing "GIN operator
> classes that implement B-tree equivalent behavior", but now the
> behavior diverges.

Anyway GIN couldn't be used for ORDER BY clause.


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 



Re: btree_gin and ranges

От
Heikki Linnakangas
Дата:
On 10/22/2014 03:01 PM, Teodor Sigaev wrote:
> Anyway GIN couldn't be used for ORDER BY clause.

which would also be nice to fix...

- Heikki




Re: btree_gin and ranges

От
Heikki Linnakangas
Дата:
On 10/22/2014 03:01 PM, Teodor Sigaev wrote:
> Anyway GIN couldn't be used for ORDER BY clause.

which would also be nice to fix...

- Heikki




Re: btree_gin and ranges

От
Teodor Sigaev
Дата:
> which would also be nice to fix..


Of course, agree. With rbtree usage instead of tidbitmap hash and semantic 
knowledge about operators in GIN...


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 



Re: btree_gin and ranges

От
Heikki Linnakangas
Дата:
On 10/22/2014 01:55 PM, Teodor Sigaev wrote:
> Suggested patch adds GIN support contains operator for ranges over scalar column.
>
> It allows more effective GIN scan. Currently, queries like
> SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
> will be excuted by GIN with two scans: one is from mines infinity to 1 and
> another is from -1 to plus infinity. That's because GIN is "generalized" and it
> doesn't know a semantics of operation.
>
> With patch it's possible to rewrite query with ranges
> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
> and GIN index will support this query with single scan from -1 to 1.
>
> Patch provides index support only for existing range types: int4, int8, numeric,
> date and timestamp with and without time zone.

I started to look at this, but very quickly got carried away into
refactoring away the macros. Defining long functions as macros makes
debugging quite difficult, and it's not nice to read or edit the source
code either.

I propose the attached refactoring (it doesn't include your range-patch
yet, just refactoring the existing code). It turns the bulk of the
macros into static functions. GIN_SUPPORT macro still defines the
datatype-specific functions, but they are now very thin wrappers that
just call the corresponding generic static functions.

It's annoying that we need the concept of a leftmost value, for the <
and <= queries. Without that, we could have the support functions look
up the required datatype information from the type cache, based on the
datatype of the passed argument. Then you could easily use the btree_gin
support functions also for user-defined datatypes. But that needs bigger
changes, and this is a step in the right direction anyway.

- Heikki


Вложения

Re: btree_gin and ranges

От
Michael Paquier
Дата:
On Thu, Dec 18, 2014 at 4:13 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> On 10/22/2014 01:55 PM, Teodor Sigaev wrote:
>>
>> Suggested patch adds GIN support contains operator for ranges over scalar
>> column.
>>
>> It allows more effective GIN scan. Currently, queries like
>> SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
>> will be excuted by GIN with two scans: one is from mines infinity to 1 and
>> another is from -1 to plus infinity. That's because GIN is "generalized"
>> and it
>> doesn't know a semantics of operation.
>>
>> With patch it's possible to rewrite query with ranges
>> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
>> and GIN index will support this query with single scan from -1 to 1.
>>
>> Patch provides index support only for existing range types: int4, int8,
>> numeric,
>> date and timestamp with and without time zone.
>
>
> I started to look at this, but very quickly got carried away into
> refactoring away the macros. Defining long functions as macros makes
> debugging quite difficult, and it's not nice to read or edit the source code
> either.
>
> I propose the attached refactoring (it doesn't include your range-patch yet,
> just refactoring the existing code). It turns the bulk of the macros into
> static functions. GIN_SUPPORT macro still defines the datatype-specific
> functions, but they are now very thin wrappers that just call the
> corresponding generic static functions.
>
> It's annoying that we need the concept of a leftmost value, for the < and <=
> queries. Without that, we could have the support functions look up the
> required datatype information from the type cache, based on the datatype of
> the passed argument. Then you could easily use the btree_gin support
> functions also for user-defined datatypes. But that needs bigger changes,
> and this is a step in the right direction anyway.
I just had a look at the refactoring patch and ISTM that this is a
good step forward in terms of readability. Teodor, I am noticing that
your patch cannot apply once the refactoring is done. Could you rebase
your patch once the refactoring is pushed?s
-- 
Michael



Re: btree_gin and ranges

От
Heikki Linnakangas
Дата:
On 12/22/2014 03:15 AM, Michael Paquier wrote:
> On Thu, Dec 18, 2014 at 4:13 AM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> On 10/22/2014 01:55 PM, Teodor Sigaev wrote:
>>>
>>> Suggested patch adds GIN support contains operator for ranges over scalar
>>> column.
>>>
>>> It allows more effective GIN scan. Currently, queries like
>>> SELECT * FROM test_int4 WHERE i <= 1 and i >= 1
>>> will be excuted by GIN with two scans: one is from mines infinity to 1 and
>>> another is from -1 to plus infinity. That's because GIN is "generalized"
>>> and it
>>> doesn't know a semantics of operation.
>>>
>>> With patch it's possible to rewrite query with ranges
>>> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range
>>> and GIN index will support this query with single scan from -1 to 1.
>>>
>>> Patch provides index support only for existing range types: int4, int8,
>>> numeric,
>>> date and timestamp with and without time zone.
>>
>>
>> I started to look at this, but very quickly got carried away into
>> refactoring away the macros. Defining long functions as macros makes
>> debugging quite difficult, and it's not nice to read or edit the source code
>> either.
>>
>> I propose the attached refactoring (it doesn't include your range-patch yet,
>> just refactoring the existing code). It turns the bulk of the macros into
>> static functions. GIN_SUPPORT macro still defines the datatype-specific
>> functions, but they are now very thin wrappers that just call the
>> corresponding generic static functions.
>>
>> It's annoying that we need the concept of a leftmost value, for the < and <=
>> queries. Without that, we could have the support functions look up the
>> required datatype information from the type cache, based on the datatype of
>> the passed argument. Then you could easily use the btree_gin support
>> functions also for user-defined datatypes. But that needs bigger changes,
>> and this is a step in the right direction anyway.
> I just had a look at the refactoring patch and ISTM that this is a
> good step forward in terms of readability. Teodor, I am noticing that
> your patch cannot apply once the refactoring is done. Could you rebase
> your patch once the refactoring is pushed?s

I've pushed the refactoring patch. Here's a version of Teodor's patch,
rebased over the pushed patch.

Teodor's patch could use some more comments. The
STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are a good idea, but they probably
should go into src/include/access/gin.h so that they can be used in all
compare_partial implementations.

- Heikki

Вложения

Re: btree_gin and ranges

От
Teodor Sigaev
Дата:
> Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN
> macros are a good idea, but they probably should go into
> src/include/access/gin.h so that they can be used in all compare_partial
> implementations.

STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and comments
are improved.

Split patch to two: gin and module


--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Вложения

Re: btree_gin and ranges

От
Bruce Momjian
Дата:
On Fri, Dec 26, 2014 at 01:33:26PM +0300, Teodor Sigaev wrote:
> >Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN
> >macros are a good idea, but they probably should go into
> >src/include/access/gin.h so that they can be used in all compare_partial
> >implementations.
> 
> STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and
> comments are improved.
> 
> Split patch to two: gin and module

Where are we on this patch?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: btree_gin and ranges

От
Alvaro Herrera
Дата:
Teodor Sigaev wrote:
> >Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN
> >macros are a good idea, but they probably should go into
> >src/include/access/gin.h so that they can be used in all compare_partial
> >implementations.
> 
> STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and
> comments are improved.
> 
> Split patch to two: gin and module

Here you forgot to "git add" the two .sql files for the extension.  They
are present in the patch Heikki posted upthread but not here.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services