Обсуждение: btree_gin and ranges
Suggested patch adds GIN support contains operator for ranges over scalar column. It allows more effective GIN scan. Currently, queries like SELECT * FROM test_int4 WHERE i <= 1 and i >= 1 will be excuted by GIN with two scans: one is from mines infinity to 1 and another is from -1 to plus infinity. That's because GIN is "generalized" and it doesn't know a semantics of operation. With patch it's possible to rewrite query with ranges SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range and GIN index will support this query with single scan from -1 to 1. Patch provides index support only for existing range types: int4, int8, numeric, date and timestamp with and without time zone. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Вложения
Hi On Wed, Oct 22, 2014 at 1:55 PM, Teodor Sigaev <teodor@sigaev.ru> wrote: > With patch it's possible to rewrite query with ranges > SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range > and GIN index will support this query with single scan from -1 to 1. Shouldn't this be implemented in a more generic manner? An ordinary btree index could very well support <@ queries too, but your patch only adds this capability to btree-gin. The documentation describes btree-gin as providing "GIN operator classes that implement B-tree equivalent behavior", but now the behavior diverges. Regards, Marti
> Shouldn't this be implemented in a more generic manner? An ordinary Unfortunately I don't see a way for that. GIN is generalized - and it doesn't know semantic. Actually, we could fix first five strategy numbers for BTree's strategies, and then we could teach GIN core to use BTRee semantic, but what about with already existed operator classes? > > The documentation describes btree-gin as providing "GIN operator > classes that implement B-tree equivalent behavior", but now the > behavior diverges. Anyway GIN couldn't be used for ORDER BY clause. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On 10/22/2014 03:01 PM, Teodor Sigaev wrote: > Anyway GIN couldn't be used for ORDER BY clause. which would also be nice to fix... - Heikki
On 10/22/2014 03:01 PM, Teodor Sigaev wrote: > Anyway GIN couldn't be used for ORDER BY clause. which would also be nice to fix... - Heikki
> which would also be nice to fix.. Of course, agree. With rbtree usage instead of tidbitmap hash and semantic knowledge about operators in GIN... -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On 10/22/2014 01:55 PM, Teodor Sigaev wrote: > Suggested patch adds GIN support contains operator for ranges over scalar column. > > It allows more effective GIN scan. Currently, queries like > SELECT * FROM test_int4 WHERE i <= 1 and i >= 1 > will be excuted by GIN with two scans: one is from mines infinity to 1 and > another is from -1 to plus infinity. That's because GIN is "generalized" and it > doesn't know a semantics of operation. > > With patch it's possible to rewrite query with ranges > SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range > and GIN index will support this query with single scan from -1 to 1. > > Patch provides index support only for existing range types: int4, int8, numeric, > date and timestamp with and without time zone. I started to look at this, but very quickly got carried away into refactoring away the macros. Defining long functions as macros makes debugging quite difficult, and it's not nice to read or edit the source code either. I propose the attached refactoring (it doesn't include your range-patch yet, just refactoring the existing code). It turns the bulk of the macros into static functions. GIN_SUPPORT macro still defines the datatype-specific functions, but they are now very thin wrappers that just call the corresponding generic static functions. It's annoying that we need the concept of a leftmost value, for the < and <= queries. Without that, we could have the support functions look up the required datatype information from the type cache, based on the datatype of the passed argument. Then you could easily use the btree_gin support functions also for user-defined datatypes. But that needs bigger changes, and this is a step in the right direction anyway. - Heikki
Вложения
On Thu, Dec 18, 2014 at 4:13 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > On 10/22/2014 01:55 PM, Teodor Sigaev wrote: >> >> Suggested patch adds GIN support contains operator for ranges over scalar >> column. >> >> It allows more effective GIN scan. Currently, queries like >> SELECT * FROM test_int4 WHERE i <= 1 and i >= 1 >> will be excuted by GIN with two scans: one is from mines infinity to 1 and >> another is from -1 to plus infinity. That's because GIN is "generalized" >> and it >> doesn't know a semantics of operation. >> >> With patch it's possible to rewrite query with ranges >> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range >> and GIN index will support this query with single scan from -1 to 1. >> >> Patch provides index support only for existing range types: int4, int8, >> numeric, >> date and timestamp with and without time zone. > > > I started to look at this, but very quickly got carried away into > refactoring away the macros. Defining long functions as macros makes > debugging quite difficult, and it's not nice to read or edit the source code > either. > > I propose the attached refactoring (it doesn't include your range-patch yet, > just refactoring the existing code). It turns the bulk of the macros into > static functions. GIN_SUPPORT macro still defines the datatype-specific > functions, but they are now very thin wrappers that just call the > corresponding generic static functions. > > It's annoying that we need the concept of a leftmost value, for the < and <= > queries. Without that, we could have the support functions look up the > required datatype information from the type cache, based on the datatype of > the passed argument. Then you could easily use the btree_gin support > functions also for user-defined datatypes. But that needs bigger changes, > and this is a step in the right direction anyway. I just had a look at the refactoring patch and ISTM that this is a good step forward in terms of readability. Teodor, I am noticing that your patch cannot apply once the refactoring is done. Could you rebase your patch once the refactoring is pushed?s -- Michael
On 12/22/2014 03:15 AM, Michael Paquier wrote: > On Thu, Dec 18, 2014 at 4:13 AM, Heikki Linnakangas > <hlinnakangas@vmware.com> wrote: >> On 10/22/2014 01:55 PM, Teodor Sigaev wrote: >>> >>> Suggested patch adds GIN support contains operator for ranges over scalar >>> column. >>> >>> It allows more effective GIN scan. Currently, queries like >>> SELECT * FROM test_int4 WHERE i <= 1 and i >= 1 >>> will be excuted by GIN with two scans: one is from mines infinity to 1 and >>> another is from -1 to plus infinity. That's because GIN is "generalized" >>> and it >>> doesn't know a semantics of operation. >>> >>> With patch it's possible to rewrite query with ranges >>> SELECT * FROM test_int4 WHERE i <@ '[-1, 1]'::int4range >>> and GIN index will support this query with single scan from -1 to 1. >>> >>> Patch provides index support only for existing range types: int4, int8, >>> numeric, >>> date and timestamp with and without time zone. >> >> >> I started to look at this, but very quickly got carried away into >> refactoring away the macros. Defining long functions as macros makes >> debugging quite difficult, and it's not nice to read or edit the source code >> either. >> >> I propose the attached refactoring (it doesn't include your range-patch yet, >> just refactoring the existing code). It turns the bulk of the macros into >> static functions. GIN_SUPPORT macro still defines the datatype-specific >> functions, but they are now very thin wrappers that just call the >> corresponding generic static functions. >> >> It's annoying that we need the concept of a leftmost value, for the < and <= >> queries. Without that, we could have the support functions look up the >> required datatype information from the type cache, based on the datatype of >> the passed argument. Then you could easily use the btree_gin support >> functions also for user-defined datatypes. But that needs bigger changes, >> and this is a step in the right direction anyway. > I just had a look at the refactoring patch and ISTM that this is a > good step forward in terms of readability. Teodor, I am noticing that > your patch cannot apply once the refactoring is done. Could you rebase > your patch once the refactoring is pushed?s I've pushed the refactoring patch. Here's a version of Teodor's patch, rebased over the pushed patch. Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are a good idea, but they probably should go into src/include/access/gin.h so that they can be used in all compare_partial implementations. - Heikki
Вложения
> Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN > macros are a good idea, but they probably should go into > src/include/access/gin.h so that they can be used in all compare_partial > implementations. STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and comments are improved. Split patch to two: gin and module -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Вложения
On Fri, Dec 26, 2014 at 01:33:26PM +0300, Teodor Sigaev wrote: > >Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN > >macros are a good idea, but they probably should go into > >src/include/access/gin.h so that they can be used in all compare_partial > >implementations. > > STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and > comments are improved. > > Split patch to two: gin and module Where are we on this patch? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Teodor Sigaev wrote: > >Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN > >macros are a good idea, but they probably should go into > >src/include/access/gin.h so that they can be used in all compare_partial > >implementations. > > STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and > comments are improved. > > Split patch to two: gin and module Here you forgot to "git add" the two .sql files for the extension. They are present in the patch Heikki posted upthread but not here. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services