Обсуждение: Covering GiST indexes

Поиск
Список
Период
Сортировка

Covering GiST indexes

От
Andrey Borodin
Дата:
Hi, hackers!

Looks like we finally have covering indexes! And that's cool!

So I decided to create a thread to discuss covering GiST indexes.
Here's a prototype patch implementing this functionality.
It is quite small (+80 -30) and lacks tests and docs. But it creates a context.

I have two concerns.
First one is about INDEX_AM_RESERVED_BIT.
B-tree uses it as a base for prefix truncation (I'm not quite sure why it is usually called suffix truncation, but this
isa matter for other thread). 
GiST , probably, will not use [pre\su]fix truncation. But I'd like to use that 13th bit to implement intra-page
indexing- a way to improve search within gist page. See [0,1] 

Second, currently including indexes do not allow same attributes in both keys and include parts.
# create index on x using gist(c) include (c);
ERROR:  included columns must not intersect with key columns

But it makes sense for example for geometries like PostGIS. Index keys are truncated to small MBRs while having whole
complexgeometry right in an index could be handy. 

Any feedback will be appreciated.

Best regards, Andrey Borodin.


[0] https://www.postgresql.org/message-id/7780A07B-4D04-41E2-B228-166B41D07EEE%40yandex-team.ru
[1] https://www.postgresql.org/message-id/CAJEAwVE0rrr+OBT-P0gDCtXbVDkBBG_WcXwCBK=GHo4fewu3Yg@mail.gmail.com

Вложения

Re: Covering GiST indexes

От
Teodor Sigaev
Дата:
Interesting work. I don't have a time now to learn deep your patch, so, add it 
to next commitfest, pls. First of all I'd like to see more tests in patch, not 
only CREATE INDEX.


Andrey Borodin wrote:
> Hi, hackers!
> 
> Looks like we finally have covering indexes! And that's cool!
> 
> So I decided to create a thread to discuss covering GiST indexes.
> Here's a prototype patch implementing this functionality.
> It is quite small (+80 -30) and lacks tests and docs. But it creates a context.
> 
> I have two concerns.
> First one is about INDEX_AM_RESERVED_BIT.
> B-tree uses it as a base for prefix truncation (I'm not quite sure why it is usually called suffix truncation, but
thisis a matter for other thread).
 
> GiST , probably, will not use [pre\su]fix truncation. But I'd like to use that 13th bit to implement intra-page
indexing- a way to improve search within gist page. See [0,1]
 
> 
> Second, currently including indexes do not allow same attributes in both keys and include parts.
> # create index on x using gist(c) include (c);
> ERROR:  included columns must not intersect with key columns
> 
> But it makes sense for example for geometries like PostGIS. Index keys are truncated to small MBRs while having whole
complexgeometry right in an index could be handy.
 
> 
> Any feedback will be appreciated.
> 
> Best regards, Andrey Borodin.
> 
> 
> [0] https://www.postgresql.org/message-id/7780A07B-4D04-41E2-B228-166B41D07EEE%40yandex-team.ru
> [1] https://www.postgresql.org/message-id/CAJEAwVE0rrr+OBT-P0gDCtXbVDkBBG_WcXwCBK=GHo4fewu3Yg@mail.gmail.com
>   
> 

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: Covering GiST indexes

От
Aleksander Alekseev
Дата:
Hello Andrey,

> So I decided to create a thread to discuss covering GiST indexes.
> Here's a prototype patch implementing this functionality.  It is quite
> small (+80 -30) and lacks tests and docs. But it creates a context.

I'm glad you got interested in this area. It would be great to have
covering indexes support for GiST as well. Please don't forget to add
the thread to the nearest commitfest entry. I'm looking forward for the
next versions of your patch!

--
Best regards,
Aleksander Alekseev

Вложения

Re: Covering GiST indexes

От
Alexander Korotkov
Дата:
Hi, Andrey!

On Thu, Apr 12, 2018 at 2:00 PM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
Looks like we finally have covering indexes! And that's cool!
 
Thank you!

So I decided to create a thread to discuss covering GiST indexes.
Here's a prototype patch implementing this functionality.
It is quite small (+80 -30) and lacks tests and docs. But it creates a context.

I have two concerns.
First one is about INDEX_AM_RESERVED_BIT.
B-tree uses it as a base for prefix truncation (I'm not quite sure why it is usually called suffix truncation, but this is a matter for other thread).
GiST , probably, will not use [pre\su]fix truncation. But I'd like to use that 13th bit to implement intra-page indexing - a way to improve search within gist page. See [0,1]
 
I also think that GiST isn't going to do suffix truncation of keys, because GiST
is composing keys for every attribute and trying to use them in queries if
even GiST didn't use those attributes in any split.  Depending on data distribution,
key representation, query and so on that keys may appear useful or not.
And GiST doesn't have any legal interface to determine that in advance.

I think that GiST needs another feature: not suffix truncation, but different
attribute representation in leaf pages and internal pages.  For example,
now GiST on points stores boxes in leafs.  That's a great waste of space.
So, we might just have different tuple descriptors for internal and leaf
pages of GiST, which could have both different attribute types and
different counts of attributes.  Thankfully GiST pages don't have high keys
and we don't have to mix tuples of different types on the same page
(unlike B-tree).

Second, currently including indexes do not allow same attributes in both keys and include parts.
# create index on x using gist(c) include (c);
ERROR:  included columns must not intersect with key columns

But it makes sense for example for geometries like PostGIS. Index keys are truncated to small MBRs while having whole complex geometry right in an index could be handy.

The issue discovered in [1] seems to be related.  Thus, after fix provided there when
column is indexed without index-only scan support, then it can't be used in index-only
scan anyway (if even it's indexed another time with index-only scan support).
So, I think we need a better fix for [1] first.

Another thing that could be done for PostGIS geometries is just another opclass which
stores geometries "as is" in leafs.  As I know, geometries contain MBRs inside their
own, so there is no need to store extra MBR.  I think the reason why PostGIS
doesn't have such opclass yet is that geometries are frequently large and easily
can exceed maximum size of index tuple.  The same problem applies to coverting
indexes too.  Possible solution here might be to support external toasted attributes
in covering indexes.  But I think that still requires quite amount of work.


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Covering GiST indexes

От
Peter Geoghegan
Дата:
On Thu, Apr 12, 2018 at 4:00 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> I have two concerns.
> First one is about INDEX_AM_RESERVED_BIT.
> B-tree uses it as a base for prefix truncation (I'm not quite sure why it is usually called suffix truncation, but
thisis a matter for other thread).
 

Since you brought it up, and since I pushed that particular
terminology, I should acknowledge that the original 1977 Bayer paper
on suffix truncation calls a B-Tree with suffix truncation a prefix
B-Tree. However, more recent work seems to consistently refer to the
technique as suffix truncation, while also referring to more advanced
techniques for compressing (not truncating) leaf tuples as prefix
compression.

I suggested suffix truncation because it seemed to be the dominant way
of referring to the technique. And, because it seemed more logical:
the suffix is what gets truncated away.

-- 
Peter Geoghegan


Re: Covering GiST indexes

От
Darafei "Komяpa" Praliaskouski
Дата:
 
Another thing that could be done for PostGIS geometries is just another opclass which
stores geometries "as is" in leafs.  As I know, geometries contain MBRs inside their
own, so there is no need to store extra MBR.  I think the reason why PostGIS
doesn't have such opclass yet is that geometries are frequently large and easily
can exceed maximum size of index tuple.

Geometry datatype layout was designed with TOASTing in mind: most of data is stored in the header, including type, SRID, box and some other flags, so getting just several first bytes tells you a lot. 

PostGIS datasets are often of a mixed binary length: in buildings, for example, it is quite common to have a lot of four corner houses, and just one mapped as a circle, that digitizing software decided to make via 720-point polygon. Since reading it from TOAST for an index would require a seek of some kind, it may be as efficient to just look it up in the table.

This way a lossy decompress function can help with index only scans: up to some binary length, try to store the original geometry in the index. After that, store a shape that has less points in it but covers slightly larger area, plus a flag that it's not precise. There are a lot of ways to generate a covering shape with less points: obvious is a box, less obvious is non axis aligned box, a collection of boxes for a multipart shape, an outer ring for an area with lots of holes, a convex hull or a smallest enclosing k-gon. 

In GIS there is a problem of border of Russia: the country overlaps over 180 meridian and has a complex border shape. if you take a box of it, it spans from -180 to 180. If you query any spot in US or in Europe, you'll have it intersecting with your area, require a full recheck, complete detoast and decompression, and then "no, it's not a thing we need, skip". Allowing anything better than a box would help. If we're allowing a complex shape - we've got the container for it, geometry.

If we're storing geometry in index and original's small, why not allow complete Index Only Scan on it, and let it skip recheck? :)

Darafei Praliaskouski, 
GIS Engineer / Juno Minsk

Re: Covering GiST indexes

От
Alexander Korotkov
Дата:
On Thu, Apr 12, 2018 at 11:20 PM, Darafei "Komяpa" Praliaskouski <me@komzpa.net> wrote:
Another thing that could be done for PostGIS geometries is just another opclass which
stores geometries "as is" in leafs.  As I know, geometries contain MBRs inside their
own, so there is no need to store extra MBR.  I think the reason why PostGIS
doesn't have such opclass yet is that geometries are frequently large and easily
can exceed maximum size of index tuple.

Geometry datatype layout was designed with TOASTing in mind: most of data is stored in the header, including type, SRID, box and some other flags, so getting just several first bytes tells you a lot. 

PostGIS datasets are often of a mixed binary length: in buildings, for example, it is quite common to have a lot of four corner houses, and just one mapped as a circle, that digitizing software decided to make via 720-point polygon. Since reading it from TOAST for an index would require a seek of some kind, it may be as efficient to just look it up in the table.

This way a lossy decompress function can help with index only scans: up to some binary length, try to store the original geometry in the index. After that, store a shape that has less points in it but covers slightly larger area, plus a flag that it's not precise. There are a lot of ways to generate a covering shape with less points: obvious is a box, less obvious is non axis aligned box, a collection of boxes for a multipart shape, an outer ring for an area with lots of holes, a convex hull or a smallest enclosing k-gon. 

In GIS there is a problem of border of Russia: the country overlaps over 180 meridian and has a complex border shape. if you take a box of it, it spans from -180 to 180. If you query any spot in US or in Europe, you'll have it intersecting with your area, require a full recheck, complete detoast and decompression, and then "no, it's not a thing we need, skip". Allowing anything better than a box would help. If we're allowing a complex shape - we've got the container for it, geometry.

If we're storing geometry in index and original's small, why not allow complete Index Only Scan on it, and let it skip recheck? :)

So, as I get the idea is that geometries has very different sizes
from very small to very large.  And it would be nice to store small
geometries in the index "as is".  And then for small geomentries
we can do index only scan and recheck while for large geometries
we have to visit heap for fetching geometry.

That can be done in custom opclass without work in PostgreSQL
core except index only scan.  Right now optimizer expects index
to be always capable to return original datum for some column,
or to be never capable to do this.  It doesn't allow this decision to
be done in runtime.  Allowing this would require patch to PostgreSQL
core.  This patch shouldn't be hard at the executor size.  But
optimizer part of this patch seems hard, because I don't know
how to estimate fraction of index keys, which can be used to
reconstruct original datums.  Probably that would require
GiST compress method to return some statistics...

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Covering GiST indexes

От
Andrey Borodin
Дата:

> 12 апр. 2018 г., в 17:03, Teodor Sigaev <teodor@sigaev.ru> написал(а):
>
> Interesting work. I don't have a time now to learn deep your patch, so, add it to next commitfest, pls.
Thanks! Sure, I'll add it.

> First of all I'd like to see more tests in patch, not only CREATE INDEX.

Here's V2, with basic set of tests.
Currently, I'm investigating what to document and more places to tests.

Also, I do not use generic function index_truncate_tuple(), because it deforms and then forms tuple, but GiST usually
havealready deformed tuple. 

Best regards, Andrey Borodin.

Вложения

Re: Covering GiST indexes

От
Andrey Borodin
Дата:
Hi!

> 13 апр. 2018 г., в 17:36, Andrey Borodin <x4mmm@yandex-team.ru> написал(а):
>
> Here's V2, with basic set of tests.
> Currently, I'm investigating what to document and more places to tests.

Here is v3 version of the patch. I've fixed some comments and added some words to docs.


Best regards, Andrey Borodin.

Вложения

Re: Covering GiST indexes

От
Thomas Munro
Дата:
On Fri, Jul 6, 2018 at 5:27 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> Here is v3 version of the patch. I've fixed some comments and added some words to docs.

Hi again Andrey,

Cfbot reported the following difference (twice) in the
index_including_gist test:

- ERROR: included columns must not intersect with key columns
+ ERROR: relation "tbl_gist_idx" already exists

Well, both of those errors are valid complaints in this case... maybe
the order of checks changed in master since you wrote the patch?
Probably better to use a different name for the index anyway.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Covering GiST indexes

От
Andrey Borodin
Дата:
Hi, Thomas!

29 июля 2018 г., в 14:28, Thomas Munro <thomas.munro@enterprisedb.com> написал(а):

On Fri, Jul 6, 2018 at 5:27 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
Here is v3 version of the patch. I've fixed some comments and added some words to docs.

Hi again Andrey,

Cfbot reported the following difference (twice) in the
index_including_gist test:

- ERROR: included columns must not intersect with key columns
+ ERROR: relation "tbl_gist_idx" already exists

Well, both of those errors are valid complaints in this case... maybe
the order of checks changed in master since you wrote the patch?
Probably better to use a different name for the index anyway.

Thanks! The problem appeared with commit 701fd0b [0] which dropped validation rules checked in failed test.  Here's the patch with fixed tests.

Best regards, Andrey Borodin.

Вложения

Re: Covering GiST indexes

От
Thomas Munro
Дата:
On Sun, Jul 29, 2018 at 10:50 PM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> Thanks! The problem appeared with commit 701fd0b [0] which dropped
> validation rules checked in failed test.  Here's the patch with fixed tests.

Thanks.  I received the attachment.

Just as an FYI, something about the way your mail client encoded it
prevented the mail archives from picking it up (according to someone
who knows much more about these things than me, it's buried in the
"multipart/mixed" part, so not visible to the mail archive if it
extracts only the "text/plain" part, or something like that).  That
didn't happen on any of your other patches.  I have no opinion of
whether that's a bug in the mail archive or an unhelpful mail client
or a non-ideal way to post patches, but you can see here that we don't
have the patch in the usual place:

https://www.postgresql.org/message-id/850AE105-F89A-42E4-AD40-FBC6EA5A5A00%40yandex-team.ru

That explains why cfbot didn't automatically test your new patch, so
it still show as broken here:
http://cfbot.cputube.org/andrey-borodin.html

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Covering GiST indexes

От
Andrey Borodin
Дата:
Thanks, Thomas!

> 30 июля 2018 г., в 3:58, Thomas Munro <thomas.munro@enterprisedb.com> написал(а):
>
> On Sun, Jul 29, 2018 at 10:50 PM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>> Thanks! The problem appeared with commit 701fd0b [0] which dropped
>> validation rules checked in failed test.  Here's the patch with fixed tests.
>
> Thanks.  I received the attachment.
>
> Just as an FYI, something about the way your mail client encoded it
> prevented the mail archives from picking it up

I'll try to investigate this.

Here's patch one more: another attempt to put it into archives.

Best regards, Andrey Borodin.


Вложения

Re: Covering GiST indexes

От
Dmitry Dolgov
Дата:
> On Mon, Jul 30, 2018 at 7:50 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> Thanks, Thomas!
>
> > 30 июля 2018 г., в 3:58, Thomas Munro <thomas.munro@enterprisedb.com> написал(а):
> >
> > On Sun, Jul 29, 2018 at 10:50 PM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> >> Thanks! The problem appeared with commit 701fd0b [0] which dropped
> >> validation rules checked in failed test.  Here's the patch with fixed tests.
> >
> > Thanks.  I received the attachment.
> >
> > Just as an FYI, something about the way your mail client encoded it
> > prevented the mail archives from picking it up
>
> I'll try to investigate this.
>
> Here's patch one more: another attempt to put it into archives.

Hi,

Looks like this time the patch was picked up correctly, but there are some
conflicts with the current master branch:
http://cfbot.cputube.org/patch_20_1615.log
Could you please rebase it one more time?


Re: Covering GiST indexes

От
Andrey Borodin
Дата:
Hi, Dmitry!

> 26 нояб. 2018 г., в 21:20, Dmitry Dolgov <9erthalion6@gmail.com> написал(а):
>
> Looks like this time the patch was picked up correctly, but there are some
> conflicts with the current master branch:
> http://cfbot.cputube.org/patch_20_1615.log
> Could you please rebase it one more time?

Here's rebased version. Thanks!

Best regards, Andrey Borodin.

Вложения

Re: Covering GiST indexes

От
Andreas Karlsson
Дата:
On 11/26/18 5:56 PM, Andrey Borodin wrote:
> Here's rebased version. Thanks!

Here is my review.

= General

The features seems useful and an obvious extension of covering B-trees, 
and a potentially useful feature together with exclusion constraints.

The patch still applies (with git apply -3), compiles and passes the 
test suite.

 From some testing it seems like it works as expected.

= Code

* Have some minor style issues like that there should be spaces around 
|| (in gistcanreturn()) and ? and : (in gistFormTuple()).

* I do not see any need for adding the new parameter to gistFormTuple. 
As far as I can tell isTruncated is always equal to !isleaf.

* The comment below from gistFormTuple() does not look correct. No 
allocation is taking place.

/*
  * Allocate each included attribute.
  */

* Why do you set a supportCollation for the included columns? As far as 
I can tell the collation is only used by the various GiST support 
functions. Also in theory we should be able to skip initializing these 
array entires, but it is probably for the best that we do.

* I think you should check for the attno first in gistcanreturn() to 
take advantage of the short circuiting.

* I am no fan of the tupdesc vs truncTupdesc separation and think that 
it is a potential hazard, but I do not have any better suggestion right now.

* There is no test case for exclusion constraints, and I feel since that 
is one of the more important uses we should probably have at least one 
such test case.

= Minor comments

* I think that "the" should have been kept in the text below.

-        Currently, only the B-tree index access method supports this 
feature.
+        Currently, B-tree and GiST index access methods supports this 
feature.

* I am not a native speaker but I think it should be "use the INCLUDE 
clause" in the diff below, and I think it also should be "without any 
GiST operator class".

+   A GiST index can be covering, i.e. use <literal>INCLUDE</literal> 
clause.
+   Included columns can have data types without GiST operator class. 
Included
+   attributes will be stored uncompressed.

* I think you should write something like "Included attributes always 
support index-only scans." rather than "Included attributes can be 
returned." in the comment for gistcanreturn().

* Why the random noise in the diff below? I think using "(c3) INCLUDE 
(c4)" for both gist and rtree results in a cleaner patch.

  CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
  CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
-CREATE INDEX on tbl USING gist(c3) INCLUDE (c4);
+CREATE INDEX on tbl USING gist(c4) INCLUDE (c3);
  CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
  CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
  CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
-CREATE INDEX on tbl USING rtree(c1, c2) INCLUDE (c3, c4);
+CREATE INDEX on tbl USING rtree(c4) INCLUDE (c3, c1);
  CREATE INDEX on tbl USING btree(c1, c2) INCLUDE (c3, c4);


Re: Covering GiST indexes

От
Andrey Borodin
Дата:
Thank you very much for reviewing the patch!

> 28 янв. 2019 г., в 12:15, Andreas Karlsson <andreas@proxel.se> написал(а):
>
> = Code
>
> * Have some minor style issues like that there should be spaces around || (in gistcanreturn()) and ? and : (in
gistFormTuple()).
Fixed.
>
> * I do not see any need for adding the new parameter to gistFormTuple. As far as I can tell isTruncated is always
equalto !isleaf. 
You are right. I've removed isTruncated parameter.
>
> * The comment below from gistFormTuple() does not look correct. No allocation is taking place.
>
> /*
> * Allocate each included attribute.
> */
Fixed.
>
> * Why do you set a supportCollation for the included columns? As far as I can tell the collation is only used by the
variousGiST support functions. Also in theory we should be able to skip initializing these array entires, but it is
probablyfor the best that we do. 
Removed supportCollation.
>
> * I think you should check for the attno first in gistcanreturn() to take advantage of the short circuiting.
Done.
>
> * I am no fan of the tupdesc vs truncTupdesc separation and think that it is a potential hazard, but I do not have
anybetter suggestion right now. 
B-tree is copying tupdesc every time they truncate tuple. We need tuple truncation a little more often: when we are
doingpage split, we have to form all page tuples, truncated. 
Initially, I've optimized only this case, but this led to prepared tupledesc for truncated tuples.
>
> * There is no test case for exclusion constraints, and I feel since that is one of the more important uses we should
probablyhave at least one such test case. 

Actually, I did not understand this test case. Can you elaborate more on this? How included attributes should
participatein exclude index? What for? 

> = Minor comments
>
> * I think that "the" should have been kept in the text below.
>
> -        Currently, only the B-tree index access method supports this feature.
> +        Currently, B-tree and GiST index access methods supports this feature.
>
Fixed.
> * I am not a native speaker but I think it should be "use the INCLUDE clause" in the diff below, and I think it also
shouldbe "without any GiST operator class". 
>
> +   A GiST index can be covering, i.e. use <literal>INCLUDE</literal> clause.
> +   Included columns can have data types without GiST operator class. Included
> +   attributes will be stored uncompressed.
>
Fixed.
> * I think you should write something like "Included attributes always support index-only scans." rather than
"Includedattributes can be returned." in the comment for gistcanreturn(). 
>
Fixed, but slightly reworded.

> * Why the random noise in the diff below? I think using "(c3) INCLUDE (c4)" for both gist and rtree results in a
cleanerpatch. 
I've used columns with and without opclass in INCLUDE. This led to these seemingly random changes.

PFA v6. Thanks for reviewing!


Best regards, Andrey Borodin.


Вложения

Re: Covering GiST indexes

От
Andreas Karlsson
Дата:
On 1/28/19 7:26 PM, Andrey Borodin wrote:
>> * I am no fan of the tupdesc vs truncTupdesc separation and think that it is a potential hazard, but I do not have
anybetter suggestion right now.
 
> B-tree is copying tupdesc every time they truncate tuple. We need tuple truncation a little more often: when we are
doingpage split, we have to form all page tuples, truncated.
 
> Initially, I've optimized only this case, but this led to prepared tupledesc for truncated tuples.
>>
>> * There is no test case for exclusion constraints, and I feel since that is one of the more important uses we should
probablyhave at least one such test case.
 
> 
> Actually, I did not understand this test case. Can you elaborate more on this? How included attributes should
participatein exclude index? What for?
 

I mean include a table like below among the tests. I feel like this is a 
main use case for INCLUDE.

CREATE TABLE t2 (
   x int4range,
   y int,

   EXCLUDE USING gist (x WITH &&) INCLUDE (y)
);
>> * Why the random noise in the diff below? I think using "(c3) INCLUDE (c4)" for both gist and rtree results in a
cleanerpatch.
 
> I've used columns with and without opclass in INCLUDE. This led to these seemingly random changes.

I mean the diff would be smaller as the below. It also may make sense to 
make both lines "(c3) INCLUDE (c1, c4)".

  CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
  CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
  CREATE INDEX on tbl USING gist(c3) INCLUDE (c4);
  CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
  CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
  CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
-CREATE INDEX on tbl USING rtree(c1, c2) INCLUDE (c3, c4);
+CREATE INDEX on tbl USING rtree(c3) INCLUDE (c4);
  CREATE INDEX on tbl USING btree(c1, c2) INCLUDE (c3, c4);

Andreas


Re: Covering GiST indexes

От
Andrey Borodin
Дата:

> 29 янв. 2019 г., в 7:32, Andreas Karlsson <andreas@proxel.se> написал(а):
>
> On 1/28/19 7:26 PM, Andrey Borodin wrote:
>>> * I am no fan of the tupdesc vs truncTupdesc separation and think that it is a potential hazard, but I do not have
anybetter suggestion right now. 
>> B-tree is copying tupdesc every time they truncate tuple. We need tuple truncation a little more often: when we are
doingpage split, we have to form all page tuples, truncated. 
>> Initially, I've optimized only this case, but this led to prepared tupledesc for truncated tuples.
>>>
>>> * There is no test case for exclusion constraints, and I feel since that is one of the more important uses we
shouldprobably have at least one such test case. 
>> Actually, I did not understand this test case. Can you elaborate more on this? How included attributes should
participatein exclude index? What for? 
>
> I mean include a table like below among the tests. I feel like this is a main use case for INCLUDE.
>
> CREATE TABLE t2 (
>  x int4range,
>  y int,
>
>  EXCLUDE USING gist (x WITH &&) INCLUDE (y)
> );

Thanks for the explanation. Added this as case 6 to index_including_gist.

>>> * Why the random noise in the diff below? I think using "(c3) INCLUDE (c4)" for both gist and rtree results in a
cleanerpatch. 
>> I've used columns with and without opclass in INCLUDE. This led to these seemingly random changes.
>
> I mean the diff would be smaller as the below. It also may make sense to make both lines "(c3) INCLUDE (c1, c4)".
>
> CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
> CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
> CREATE INDEX on tbl USING gist(c3) INCLUDE (c4);
> CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
> CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
> CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
> -CREATE INDEX on tbl USING rtree(c1, c2) INCLUDE (c3, c4);
> +CREATE INDEX on tbl USING rtree(c3) INCLUDE (c4);
> CREATE INDEX on tbl USING btree(c1, c2) INCLUDE (c3, c4);
>

I've took your version of this test and added all variations of included attributes.


PFA v7.

Thanks!

Best regards, Andrey Borodin.


Вложения

Re: Covering GiST indexes

От
Andreas Karlsson
Дата:
Thanks for the new version of the patch. Based on my knowledge of PG 
this is starting to look good, and I have only three small comments below.

I am not 100% a fan of truncTupdesc, but as long as it is well commented 
I think that it is fine.

= Review

* I think it is worth writing a short comment when you create 
truncTupdesc about why this is done.

* Very minor thing: the diff below is pointless churn on a line not 
touched by the patch.

-                        values, isnull, true /* size is currently bogus 
*/ );
+                        values, isnull, true /* size is currently bogus 
*/);

* Another very minor thing: The diff below from gistFormTuple() should 
probably be consistent about brackets.

+           if (isnull[i])
+               compatt[i] = (Datum) 0;
+           else
+           {
+               compatt[i] = attdata[i];
+           }

Andreas


Re: Covering GiST indexes

От
Andreas Karlsson
Дата:
On 29/01/2019 18.00, Andreas Karlsson wrote:
> Thanks for the new version of the patch. Based on my knowledge of PG 
> this is starting to look good, and I have only three small comments below.

Sorry, I missed retree in the tests. Can you fix the below to match the 
gist example?

  CREATE INDEX on tbl USING rtree(c1, c2) INCLUDE (c3, c4);
  NOTICE:  substituting access method "gist" for obsolete method "rtree"
-ERROR:  access method "gist" does not support included columns
+ERROR:  data type integer has no default operator class for access 
method "gist"
+HINT:  You must specify an operator class for the index or define a 
default operator class for the data type.
+CREATE INDEX on tbl USING rtree(c4) INCLUDE (c1, c4);

Andreas


Re: Covering GiST indexes

От
Andrey Borodin
Дата:
Thanks! Here's new version 8

> 29 янв. 2019 г., в 22:00, Andreas Karlsson <andreas@proxel.se> написал(а):
>
> * I think it is worth writing a short comment when you create truncTupdesc about why this is done.
It resulted in not so short comment, but I think it is OK.
>
> * Very minor thing: the diff below is pointless churn on a line not touched by the patch.
>
> -                        values, isnull, true /* size is currently bogus */ );
> +                        values, isnull, true /* size is currently bogus */);
Fixed.
>
> * Another very minor thing: The diff below from gistFormTuple() should probably be consistent about brackets.
>
> +           if (isnull[i])
> +               compatt[i] = (Datum) 0;
> +           else
> +           {
> +               compatt[i] = attdata[i];
> +           }
Fixed.


Also, I've unified gist and r-tree syntax tests for INCLUDE.

Best regards, Andrey Borodin.


Вложения

Re: Covering GiST indexes

От
Andreas Karlsson
Дата:
On 29/01/2019 18.45, Andrey Borodin wrote:
> Also, I've unified gist and r-tree syntax tests for INCLUDE.

Why not just keep it simple? The purpose is not to test the GiST 
indexes, for that we have index_including_gist.sql.

I propose the following.

/*
  * 7. Check various AMs. All but btree and gist must fail.
  */
CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
CREATE INDEX on tbl USING gist(c3) INCLUDE (c1, c4);
CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
CREATE INDEX on tbl USING rtree(c3) INCLUDE (c1, c4);
DROP TABLE tbl;

and

/*
  * 7. Check various AMs. All but btree and gist must fail.
  */
CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
ERROR:  access method "brin" does not support included columns
CREATE INDEX on tbl USING gist(c3) INCLUDE (c1, c4);
CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
ERROR:  access method "spgist" does not support included columns
CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
ERROR:  access method "gin" does not support included columns
CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
ERROR:  access method "hash" does not support included columns
CREATE INDEX on tbl USING rtree(c3) INCLUDE (c1, c4);
NOTICE:  substituting access method "gist" for obsolete method "rtree"
DROP TABLE tbl;


Re: Covering GiST indexes

От
Andrey Borodin
Дата:

> 29 янв. 2019 г., в 23:43, Andreas Karlsson <andreas@proxel.se> написал(а):
>
> On 29/01/2019 18.45, Andrey Borodin wrote:
>> Also, I've unified gist and r-tree syntax tests for INCLUDE.
>
> Why not just keep it simple? The purpose is not to test the GiST indexes, for that we have index_including_gist.sql.
>
> I propose the following.

Indeed, it's quite strange to check that GiST cannot be built without GiST opclass.

PFA patch without redundant checks.

Best regards, Andrey Borodin.




Вложения

Re: Covering GiST indexes

От
Andreas Karlsson
Дата:
On 29/01/2019 19.58, Andrey Borodin wrote:
> PFA patch without redundant checks.

I hope it is ok that I fixed a typo and some wording in the comment, 
plus re-added the btree query which I accidentally removed in my last mail.

I think the current state of the patch is fine, assuming the solution 
with truncTupdesc is deemed to be good enough by the committer, so I 
will mark this as ready for committer.

Thanks for the patch and the speedy fixes!

Andreas

Вложения

Re: Covering GiST indexes

От
Alexander Korotkov
Дата:
On Wed, Jan 30, 2019 at 4:16 AM Andreas Karlsson <andreas@proxel.se> wrote:
>
> On 29/01/2019 19.58, Andrey Borodin wrote:
> > PFA patch without redundant checks.
>
> I hope it is ok that I fixed a typo and some wording in the comment,
> plus re-added the btree query which I accidentally removed in my last mail.
>
> I think the current state of the patch is fine, assuming the solution
> with truncTupdesc is deemed to be good enough by the committer, so I
> will mark this as ready for committer.
>
> Thanks for the patch and the speedy fixes!

I made some adjustments for this patch:
 * Rename tupledesc and tructTupledesc to leafTupdesc and
nonLeafTupdesc correspondingly.  I think this makes things more clear.
 * Some improvements to docs and comments.
 * Revise commit message.

I'm going to push this on no objections.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Вложения

Re: Covering GiST indexes

От
Alexander Korotkov
Дата:
On Fri, Mar 8, 2019 at 11:58 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> I made some adjustments for this patch:
>  * Rename tupledesc and tructTupledesc to leafTupdesc and
> nonLeafTupdesc correspondingly.  I think this makes things more clear.
>  * Some improvements to docs and comments.
>  * Revise commit message.
>
> I'm going to push this on no objections.

Pushed with some more small changes in docs.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Covering GiST indexes

От
Andrey Borodin
Дата:
Hi!

> 10 марта 2019 г., в 13:39, Alexander Korotkov <a.korotkov@postgrespro.ru> написал(а):
>
> Pushed with some more small changes in docs.

That's great, thank you!

Best regards, Andrey Borodin.