Обсуждение: Partitioned tables and covering indexes

Поиск
Список
Период
Сортировка

Partitioned tables and covering indexes

От
Jaime Casanova
Дата:
Hi,

Trying covering indexes on partitioned tables i get this error
"""
postgres=# create index on t1_part (i) include (t);
ERROR:  cache lookup failed for opclass 0
"""

To reproduce:

create table t1_part (i int, t text) partition by hash (i);
create table t1_part_0 partition of t1_part for values with (modulus
2, remainder 0);
create table t1_part_1 partition of t1_part for values with (modulus
2, remainder 1);
insert into t1_part values (1, repeat('abcdefquerty', 20));

create index on t1_part (i) include (t);

-- 
Jaime Casanova                      www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Partitioned tables and covering indexes

От
Amit Langote
Дата:
On 2018/04/10 16:07, Jaime Casanova wrote:
> Hi,
> 
> Trying covering indexes on partitioned tables i get this error
> """
> postgres=# create index on t1_part (i) include (t);
> ERROR:  cache lookup failed for opclass 0
> """
> 
> To reproduce:
> 
> create table t1_part (i int, t text) partition by hash (i);
> create table t1_part_0 partition of t1_part for values with (modulus
> 2, remainder 0);
> create table t1_part_1 partition of t1_part for values with (modulus
> 2, remainder 1);
> insert into t1_part values (1, repeat('abcdefquerty', 20));
> 
> create index on t1_part (i) include (t);

It seems that the bug is caused due to the original IndexStmt that
DefineIndex receives being overwritten when processing the INCLUDE
columns.  Also, the original copy of it should be used when recursing for
defining the index in partitions.

Does the attached fix look correct?  Haven't checked the fix with ATTACH
PARTITION though.

Maybe add this to open items list?

Thanks,
Amit

Вложения

Re: Partitioned tables and covering indexes

От
Alexander Korotkov
Дата:
On Tue, Apr 10, 2018 at 12:47 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2018/04/10 16:07, Jaime Casanova wrote:
> Hi,
>
> Trying covering indexes on partitioned tables i get this error
> """
> postgres=# create index on t1_part (i) include (t);
> ERROR:  cache lookup failed for opclass 0
> """
>
> To reproduce:
>
> create table t1_part (i int, t text) partition by hash (i);
> create table t1_part_0 partition of t1_part for values with (modulus
> 2, remainder 0);
> create table t1_part_1 partition of t1_part for values with (modulus
> 2, remainder 1);
> insert into t1_part values (1, repeat('abcdefquerty', 20));
>
> create index on t1_part (i) include (t);

It seems that the bug is caused due to the original IndexStmt that
DefineIndex receives being overwritten when processing the INCLUDE
columns.  Also, the original copy of it should be used when recursing for
defining the index in partitions.

Does the attached fix look correct?  Haven't checked the fix with ATTACH
PARTITION though.

Attached patch seems to fix the problem.  However, I would rather get
rid of modifying stmt->indexParams.  That seems to be more logical
for me.  Also, it would be good to check some covering indexes on
partitioned tables.  See the attached patch.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Вложения

Re: Partitioned tables and covering indexes

От
Teodor Sigaev
Дата:
>     Does the attached fix look correct?  Haven't checked the fix with ATTACH
>     PARTITION though.
> 
> 
> Attached patch seems to fix the problem.  However, I would rather get
> rid of modifying stmt->indexParams.  That seems to be more logical
> for me.  Also, it would be good to check some covering indexes on
> partitioned tables.  See the attached patch.
Seems right way, do not modify incoming object and do not copy rather large and 
deep nested structure as suggested by Amit.

But it will  be better to have a ATTACH PARTITION test too.

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: Partitioned tables and covering indexes

От
Jaime Casanova
Дата:
On 10 April 2018 at 10:36, Teodor Sigaev <teodor@sigaev.ru> wrote:
>>     Does the attached fix look correct?  Haven't checked the fix with
>> ATTACH
>>     PARTITION though.
>>
>>
>> Attached patch seems to fix the problem.  However, I would rather get
>> rid of modifying stmt->indexParams.  That seems to be more logical
>> for me.  Also, it would be good to check some covering indexes on
>> partitioned tables.  See the attached patch.
>
> Seems right way, do not modify incoming object and do not copy rather large
> and deep nested structure as suggested by Amit.
>
> But it will  be better to have a ATTACH PARTITION test too.
>

the patch worked for me, i also tried some combinations using ATTACH
PARTITION and found no problems

-- 
Jaime Casanova                      www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Partitioned tables and covering indexes

От
Amit Langote
Дата:
Hi.

On 2018/04/11 0:36, Teodor Sigaev wrote:
>>     Does the attached fix look correct?  Haven't checked the fix with
>> ATTACH
>>     PARTITION though.
>>
>>
>> Attached patch seems to fix the problem.  However, I would rather get
>> rid of modifying stmt->indexParams.  That seems to be more logical
>> for me.  Also, it would be good to check some covering indexes on
>> partitioned tables.  See the attached patch.
>
> Seems right way, do not modify incoming object and do not copy rather
> large and deep nested structure as suggested by Amit.

Yeah, Alexander's suggested way of using a separate variable for
indexParams is better.

> But it will  be better to have a ATTACH PARTITION test too.

I have added tests.  Actually, instead of modifying existing tests, I
think it might be better to have a separate section at the end of
indexing.sql to test covering indexes feature for partitioned tables.

Attached find updated patch.

Thanks,
Amit

Вложения

Re: Partitioned tables and covering indexes

От
Teodor Sigaev
Дата:
Thank you, pushed

Amit Langote wrote:
> Hi.
> 
> On 2018/04/11 0:36, Teodor Sigaev wrote:
>>>      Does the attached fix look correct?  Haven't checked the fix with
>>> ATTACH
>>>      PARTITION though.
>>>
>>>
>>> Attached patch seems to fix the problem.  However, I would rather get
>>> rid of modifying stmt->indexParams.  That seems to be more logical
>>> for me.  Also, it would be good to check some covering indexes on
>>> partitioned tables.  See the attached patch.
>>
>> Seems right way, do not modify incoming object and do not copy rather
>> large and deep nested structure as suggested by Amit.
> 
> Yeah, Alexander's suggested way of using a separate variable for
> indexParams is better.
> 
>> But it will  be better to have a ATTACH PARTITION test too.
> 
> I have added tests.  Actually, instead of modifying existing tests, I
> think it might be better to have a separate section at the end of
> indexing.sql to test covering indexes feature for partitioned tables.
> 
> Attached find updated patch.
> 
> Thanks,
> Amit
> 

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: Partitioned tables and covering indexes

От
Teodor Sigaev
Дата:
Patch makes buildfarm almost red, patch is temporary reverted.

Actually, discovered bug is not related to patch except new test faces with it,
problem is: CompareIndexInfo() checks rd_opfamily for equality for all 
attributes, not only for key attribute.

Obviously, CompareIndexInfo() needs more work:
     for (i = 0; i < info1->ii_NumIndexAttrs; i++)
     {
         if (maplen < info2->ii_KeyAttrNumbers[i])

Seems, we can go out from ii_KeyAttrNumbers array.




Amit Langote wrote:
> Hi.
> 
> On 2018/04/11 0:36, Teodor Sigaev wrote:
>>>      Does the attached fix look correct?  Haven't checked the fix with
>>> ATTACH
>>>      PARTITION though.
>>>
>>>
>>> Attached patch seems to fix the problem.  However, I would rather get
>>> rid of modifying stmt->indexParams.  That seems to be more logical
>>> for me.  Also, it would be good to check some covering indexes on
>>> partitioned tables.  See the attached patch.
>>
>> Seems right way, do not modify incoming object and do not copy rather
>> large and deep nested structure as suggested by Amit.
> 
> Yeah, Alexander's suggested way of using a separate variable for
> indexParams is better.
> 
>> But it will  be better to have a ATTACH PARTITION test too.
> 
> I have added tests.  Actually, instead of modifying existing tests, I
> think it might be better to have a separate section at the end of
> indexing.sql to test covering indexes feature for partitioned tables.
> 
> Attached find updated patch.
> 
> Thanks,
> Amit
> 

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: Partitioned tables and covering indexes

От
Teodor Sigaev
Дата:
> Actually, discovered bug is not related to patch except new test faces 
> with it,
> problem is: CompareIndexInfo() checks rd_opfamily for equality for all 
> attributes, not only for key attribute.

Patch attached. But it seems to me, field's names of
IndexInfo structure are a bit confused now:
     int         ii_NumIndexAttrs;   /* total number of columns in index */
     int         ii_NumIndexKeyAttrs;    /* number of key columns in 
index */
     AttrNumber  ii_KeyAttrNumbers[INDEX_MAX_KEYS];


ii_KeyAttrNumbers contains all columns, i.e. it contains 
ii_NumIndexAttrs number of columns, not a ii_NumIndexKeyAttrs number as 
easy to think.

I suggest rename ii_KeyAttrNumbers to ii_AttrNumbers or 
ii_IndexAttrNumbers. Opinions?


>      for (i = 0; i < info1->ii_NumIndexAttrs; i++)
>      {
>          if (maplen < info2->ii_KeyAttrNumbers[i])
> 


-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                       WWW: http://www.sigaev.ru/

Вложения

Re: Partitioned tables and covering indexes

От
Alvaro Herrera
Дата:
Teodor Sigaev wrote:

> Patch attached.

I wonder why this is a problem in opfamilies but not collations.
If we don't compare collations, wouldn't it make more sense to break out
of the loop once the number of keys is reached?

When this code was written, there was no question as to what length the
opfamilies/collations the arrays were; it was obvious that they must be
of the length of the index attributes.  It's not obvious now.  Maybe add
a comment about that?

> But it seems to me, field's names of
> IndexInfo structure are a bit confused now:
>     int         ii_NumIndexAttrs;   /* total number of columns in index */
>     int         ii_NumIndexKeyAttrs;    /* number of key columns in index */
>     AttrNumber  ii_KeyAttrNumbers[INDEX_MAX_KEYS];
> 
> ii_KeyAttrNumbers contains all columns, i.e. it contains ii_NumIndexAttrs
> number of columns, not a ii_NumIndexKeyAttrs number as easy to think.
> 
> I suggest rename ii_KeyAttrNumbers to ii_AttrNumbers or ii_IndexAttrNumbers.
> Opinions?

Yeah, the current situation seems very odd.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Partitioned tables and covering indexes

От
Alexander Korotkov
Дата:
On Wed, Apr 11, 2018 at 10:47 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Teodor Sigaev wrote:

> Patch attached.

I wonder why this is a problem in opfamilies but not collations.
If we don't compare collations, wouldn't it make more sense to break out
of the loop once the number of keys is reached?
 
It appears that INCLUDE columns might have collation defined.
For instance, following query is working:

create index t_s1_s2_idx on t (s1) include (s2 collate "en_US.UTF-8");

However, I don't see any point in defining collations here, because
INCLUDE attributes exist solely for index-only scans.  So, index just
can return value of INCLUDE attribute "as is", no point to do something
with collation.

So, I propose to disable collations for INCLUDE attributes.

When this code was written, there was no question as to what length the
opfamilies/collations the arrays were; it was obvious that they must be
of the length of the index attributes.  It's not obvious now.  Maybe add
a comment about that?

In b3b7f789 Tom have resized one array size from total number of
attributes to number of key attributes.  And I like idea of applying the
same to all other array: make them sized to total number of attributes
with filling of absent attributes with 0.
 
> But it seems to me, field's names of
> IndexInfo structure are a bit confused now:
>     int         ii_NumIndexAttrs;   /* total number of columns in index */
>     int         ii_NumIndexKeyAttrs;    /* number of key columns in index */
>     AttrNumber  ii_KeyAttrNumbers[INDEX_MAX_KEYS];
>
> ii_KeyAttrNumbers contains all columns, i.e. it contains ii_NumIndexAttrs
> number of columns, not a ii_NumIndexKeyAttrs number as easy to think.
>
> I suggest rename ii_KeyAttrNumbers to ii_AttrNumbers or ii_IndexAttrNumbers.
> Opinions?

Yeah, the current situation seems very odd.

+1 for ii_IndexAttrNumbers.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Partitioned tables and covering indexes

От
Peter Geoghegan
Дата:
On Wed, Apr 11, 2018 at 1:58 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> It appears that INCLUDE columns might have collation defined.
> For instance, following query is working:
>
> create index t_s1_s2_idx on t (s1) include (s2 collate "en_US.UTF-8");
>
> However, I don't see any point in defining collations here, because
> INCLUDE attributes exist solely for index-only scans.  So, index just
> can return value of INCLUDE attribute "as is", no point to do something
> with collation.
>
> So, I propose to disable collations for INCLUDE attributes.

Hmm. I'm not sure that that's exactly the right thing to do. We seem
to want to have case-insensitive collations in the future. The fact
that you can spell out collation name in ON CONFLICT's unique index
inference specification suggests this, for example. I think that a
collation is theoretically allowed to affect the behavior of equality,
even though so far we've never tried to make that work for any
collatable datatype.

Maybe the right thing to do is to say that any collation will work
equally well (as will any opclass). Maybe that's the same thing as
what you said, though.

> +1 for ii_IndexAttrNumbers.

+1

-- 
Peter Geoghegan


Re: Partitioned tables and covering indexes

От
Peter Eisentraut
Дата:
On 4/11/18 17:08, Peter Geoghegan wrote:
>> However, I don't see any point in defining collations here, because
>> INCLUDE attributes exist solely for index-only scans.  So, index just
>> can return value of INCLUDE attribute "as is", no point to do something
>> with collation.
>>
>> So, I propose to disable collations for INCLUDE attributes.
> Hmm. I'm not sure that that's exactly the right thing to do. We seem
> to want to have case-insensitive collations in the future. The fact
> that you can spell out collation name in ON CONFLICT's unique index
> inference specification suggests this, for example. I think that a
> collation is theoretically allowed to affect the behavior of equality,
> even though so far we've never tried to make that work for any
> collatable datatype.

But in this case it doesn't even do equality comparison, it just returns
the value.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Partitioned tables and covering indexes

От
Peter Geoghegan
Дата:
On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> But in this case it doesn't even do equality comparison, it just returns
> the value.

That's the idea that I tried to express. The point is that we need to
tell the user that there is no need to worry about it, rather than
that they're wrong to ask about it. Though we should probably actually
just throw an error.

-- 
Peter Geoghegan


Re: Partitioned tables and covering indexes

От
Peter Eisentraut
Дата:
On 4/11/18 17:38, Peter Geoghegan wrote:
> On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> But in this case it doesn't even do equality comparison, it just returns
>> the value.
> 
> That's the idea that I tried to express. The point is that we need to
> tell the user that there is no need to worry about it, rather than
> that they're wrong to ask about it. Though we should probably actually
> just throw an error.

Or maybe it should be the collation of the underlying table columns.
Otherwise the collation returned by an index-only scan would be
different from a table scan, no?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Partitioned tables and covering indexes

От
Teodor Sigaev
Дата:

Peter Geoghegan wrote:
> On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> But in this case it doesn't even do equality comparison, it just returns
>> the value.
> 
> That's the idea that I tried to express. The point is that we need to
> tell the user that there is no need to worry about it, rather than
> that they're wrong to ask about it. Though we should probably actually
> just throw an error.

Any operation over including columns is a deal for filter, not an index, 
so collation will not be used somewhere inside index.

2Alexander: ComputeIndexAttrs() may use whole vectors (for including 
columns too) just to simplify coding, in other places, seems, better to 
have exact size of vectors. Using full-sized vectors could mask a error, 
for exmaple, with setting opfamily to InvalidOid for key column. But if 
you insist on that, I'll modify attached patch to use full-sized vectors 
anywhere.

In attached path I did:
1) fix a bug in CompareIndexInfo() which checks opfamily for including 
column
2) correct size for all vectors
3) Add assertion in various places to be sure that we don't try to use 
including column as key column
4) per Peter Geoghegan idea add a error message if somebody adds options 
to include column instead silently ignore it.


-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                       WWW: http://www.sigaev.ru/

Вложения

Re: Partitioned tables and covering indexes

От
Teodor Sigaev
Дата:
>> That's the idea that I tried to express. The point is that we need to
>> tell the user that there is no need to worry about it, rather than
>> that they're wrong to ask about it. Though we should probably actually
>> just throw an error.
> 
> Or maybe it should be the collation of the underlying table columns.
> Otherwise the collation returned by an index-only scan would be
> different from a table scan, no?
+1, dangerous
-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                       WWW: http://www.sigaev.ru/


Re: Partitioned tables and covering indexes

От
Alexander Korotkov
Дата:
On Thu, Apr 12, 2018 at 1:14 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
That's the idea that I tried to express. The point is that we need to
tell the user that there is no need to worry about it, rather than
that they're wrong to ask about it. Though we should probably actually
just throw an error.

Or maybe it should be the collation of the underlying table columns.
Otherwise the collation returned by an index-only scan would be
different from a table scan, no?
+1, dangerous

I'm OK with collation of included columns to be the same as collation
of underlying table columns.  But I still think we should throw an error
when user is trying to specify his own collation of included columns.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Partitioned tables and covering indexes

От
Alexander Korotkov
Дата:
On Thu, Apr 12, 2018 at 1:14 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
Peter Geoghegan wrote:
On Wed, Apr 11, 2018 at 2:29 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
But in this case it doesn't even do equality comparison, it just returns
the value.

That's the idea that I tried to express. The point is that we need to
tell the user that there is no need to worry about it, rather than
that they're wrong to ask about it. Though we should probably actually
just throw an error.

Any operation over including columns is a deal for filter, not an index, so collation will not be used somewhere inside index.

2Alexander: ComputeIndexAttrs() may use whole vectors (for including columns too) just to simplify coding, in other places, seems, better to have exact size of vectors. Using full-sized vectors could mask a error, for exmaple, with setting opfamily to InvalidOid for key column. But if you insist on that, I'll modify attached patch to use full-sized vectors anywhere.

In attached path I did:
1) fix a bug in CompareIndexInfo() which checks opfamily for including column
2) correct size for all vectors
3) Add assertion in various places to be sure that we don't try to use including column as key column
4) per Peter Geoghegan idea add a error message if somebody adds options to include column instead silently ignore it.

I don't insist on using full-sized vectors everywhere.  Attached patch looks OK for me.  I haven't test it though.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Partitioned tables and covering indexes

От
Teodor Sigaev
Дата:
Thanks to everyone, this part is pushed. I will waiting a bit before pushing 
topic-stater patch to have a time to look on buildfarm.

Alvaro, could you add a comment to CompareIndexInfo() to clarify why it doesn't 
compare indoptions (like DESC/ASC etc)? It doesn't matter for uniqueness of 
index but changes an order and it's not clear at least for me why we don't pay 
attention for that.

> In attached path I did:
> 1) fix a bug in CompareIndexInfo() which checks opfamily for including column
> 2) correct size for all vectors
> 3) Add assertion in various places to be sure that we don't try to use including 
> column as key column
> 4) per Peter Geoghegan idea add a error message if somebody adds options to 
> include column instead silently ignore it.
> 
> 

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: Partitioned tables and covering indexes

От
Teodor Sigaev
Дата:
pushed. Hope, second try will be successful.

Teodor Sigaev wrote:
> Thank you, pushed
> 
> Amit Langote wrote:
>> Hi.
>>
>> On 2018/04/11 0:36, Teodor Sigaev wrote:
>>>>      Does the attached fix look correct?  Haven't checked the fix with
>>>> ATTACH
>>>>      PARTITION though.
>>>>
>>>>
>>>> Attached patch seems to fix the problem.  However, I would rather get
>>>> rid of modifying stmt->indexParams.  That seems to be more logical
>>>> for me.  Also, it would be good to check some covering indexes on
>>>> partitioned tables.  See the attached patch.
>>>
>>> Seems right way, do not modify incoming object and do not copy rather
>>> large and deep nested structure as suggested by Amit.
>>
>> Yeah, Alexander's suggested way of using a separate variable for
>> indexParams is better.
>>
>>> But it will  be better to have a ATTACH PARTITION test too.
>>
>> I have added tests.  Actually, instead of modifying existing tests, I
>> think it might be better to have a separate section at the end of
>> indexing.sql to test covering indexes feature for partitioned tables.
>>
>> Attached find updated patch.
>>
>> Thanks,
>> Amit
>>
> 

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: Partitioned tables and covering indexes

От
Robert Haas
Дата:
On Thu, Apr 12, 2018 at 6:14 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> I'm OK with collation of included columns to be the same as collation
> of underlying table columns.  But I still think we should throw an error
> when user is trying to specify his own collation of included columns.

I agree.  The collation of a table column is just setting a default
for how it gets interpreted in queries, but the collation of an index
column affects the ordering of the index.  For INCLUDE columns, the
latter isn't relevant, so the value has no meaning.  Letting people
set a meaningless value sometimes gets us into trouble (see also the
nearby thread on TABLESPACE settings on partitioned tables).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company