Обсуждение: Report error position in partition bound check

Поиск
Список
Период
Сортировка

Report error position in partition bound check

От
Alexandra Wang
Дата:
Hi,

I'm playing with partitioned tables and found a minor thing with the
error reporting of bounds checking when create partitions.

In function check_new_partition_bound(), there are three places where
we call ereport() with a parser_errposition(pstate, spec->location)
argument.  However, that pstate is a dummy ParseState made from NULL,
so the error message never reports the position of the error in the
source query line.


I have attached a patch to pass in a ParseState to
check_new_partition_bound() to enable the reporting of the error
position. Below is what the error message looks like before and after
applying the patch.

-- Create parent table
create table foo (a int, b date) partition by range (b);

-- Before:
create table foo_part_1 partition of foo for values from (date '2007-01-01') to (date '2006-01-01');
ERROR:  empty range bound specified for partition "foo_part_1"
DETAIL:  Specified lower bound ('2007-01-01') is greater than or equal to upper bound ('2006-01-01').

-- After:
create table foo_part_1 partition of foo for values from (date '2007-01-01') to (date '2006-01-01');
ERROR:  empty range bound specified for partition "foo_part_1"
LINE 1: ...eate table foo_part_1 partition of foo for values from (date...
                                                             ^
DETAIL:  Specified lower bound ('2007-01-01') is greater than or equal to upper bound ('2006-01-01').

Another option is to not pass the parser_errposition() argument at all
to ereport() in this function, since the query is relatively short and
the error message is already descriptive enough.

Alex and Ashwin
Вложения

Re: Report error position in partition bound check

От
Alexandra Wang
Дата:
Forgot to run make installcheck. Here's the new version of the patch that updated the test answer file.

Вложения

Re: Report error position in partition bound check

От
Michael Paquier
Дата:
On Wed, Apr 08, 2020 at 05:15:57PM -0700, Alexandra Wang wrote:
> I have attached a patch to pass in a ParseState to
> check_new_partition_bound() to enable the reporting of the error
> position. Below is what the error message looks like before and after
> applying the patch.
>
> Another option is to not pass the parser_errposition() argument at all
> to ereport() in this function, since the query is relatively short and
> the error message is already descriptive enough.

It depends on the complexity of the relation definition, so adding a
position looks like a good idea to me.  Anyway, even if this looks
like an oversight to me, we are post feature freeze for 13 and that's
an improvement, so this looks like material for PG14 to me.  Are there
more opinions on the matter?

Please note that you forgot to update the regression test output.
--
Michael

Вложения

Re: Report error position in partition bound check

От
Alexandra Wang
Дата:
On Wed, Apr 8, 2020 at 6:11 PM Michael Paquier <michael@paquier.xyz> wrote:
> Please note that you forgot to update the regression test output.

Yep thanks! Please see my previous email for the updated patch.


Re: Report error position in partition bound check

От
Michael Paquier
Дата:
On Wed, Apr 08, 2020 at 08:17:55PM -0700, Alexandra Wang wrote:
> On Wed, Apr 8, 2020 at 6:11 PM Michael Paquier <michael@paquier.xyz> wrote:
> > Please note that you forgot to update the regression test output.
>
> Yep thanks! Please see my previous email for the updated patch.

Thanks, I saw the update.  It looks like my email was a couple of
minutes too late :)

Could you add this patch to the next commit fest [1]?

[1]: https://commitfest.postgresql.org/28/
--
Michael

Вложения

Re: Report error position in partition bound check

От
Ashutosh Bapat
Дата:
Hi Alexandra,
As Michael said it will be considered for the next commitfest. But
from a quick glance, a suggestion.
Instead of passing NULL parsestate from ATExecAttachPartition, pass
make_parsestate(NULL). parse_errorposition() takes care of NULL parse
state input, but it might be safer this way. Better if we could cook
up a parse state with the query text available in
AlterTableUtilityContext available in ATExecCmd().

On Thu, Apr 9, 2020 at 6:36 AM Alexandra Wang <lewang@pivotal.io> wrote:
>
> Forgot to run make installcheck. Here's the new version of the patch that updated the test answer file.
>


-- 
Best Wishes,
Ashutosh Bapat



Re: Report error position in partition bound check

От
Tom Lane
Дата:
While I'm quite on board with providing useful error cursors,
the example cases in this patch don't seem all that useful:

 -- trying to create range partition with empty range
 CREATE TABLE fail_part PARTITION OF range_parted2 FOR VALUES FROM (1) TO (0);
 ERROR:  empty range bound specified for partition "fail_part"
+LINE 1: ...E fail_part PARTITION OF range_parted2 FOR VALUES FROM (1) T...
+                                                             ^
 DETAIL:  Specified lower bound (1) is greater than or equal to upper bound (0).

As best I can tell from these examples, the cursor will always
point at the FROM keyword, making it pretty unhelpful.  It seems
like in addition to getting the query string passed down, you
need to do some work on the code that's actually reporting the
error position.  I'd expect at a minimum that the pointer allows
identifying which column of a multi-column partition key is
giving trouble.  The phrasing of this particular message, for
example, suggests that it ought to point at the "1" expression.

            regards, tom lane



Re: Report error position in partition bound check

От
Amit Langote
Дата:
On Thu, Apr 9, 2020 at 10:51 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Hi Alexandra,
> As Michael said it will be considered for the next commitfest. But
> from a quick glance, a suggestion.
> Instead of passing NULL parsestate from ATExecAttachPartition, pass
> make_parsestate(NULL). parse_errorposition() takes care of NULL parse
> state input, but it might be safer this way. Better if we could cook
> up a parse state with the query text available in
> AlterTableUtilityContext available in ATExecCmd().

+1.  Maybe pass the *context* down to ATExecAttachPartition() from
ATExecCmd() rather than a ParseState.

-- 

Amit Langote
EnterpriseDB: http://www.enterprisedb.com



Re: Report error position in partition bound check

От
Amit Langote
Дата:
On Thu, Apr 9, 2020 at 11:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> While I'm quite on board with providing useful error cursors,
> the example cases in this patch don't seem all that useful:
>
>  -- trying to create range partition with empty range
>  CREATE TABLE fail_part PARTITION OF range_parted2 FOR VALUES FROM (1) TO (0);
>  ERROR:  empty range bound specified for partition "fail_part"
> +LINE 1: ...E fail_part PARTITION OF range_parted2 FOR VALUES FROM (1) T...
> +                                                             ^
>  DETAIL:  Specified lower bound (1) is greater than or equal to upper bound (0).
>
> As best I can tell from these examples, the cursor will always
> point at the FROM keyword, making it pretty unhelpful.  It seems
> like in addition to getting the query string passed down, you
> need to do some work on the code that's actually reporting the
> error position.  I'd expect at a minimum that the pointer allows
> identifying which column of a multi-column partition key is
> giving trouble.  The phrasing of this particular message, for
> example, suggests that it ought to point at the "1" expression.

I agree with that.  Tried that in the attached 0002, although trying
to get the cursor to point to exactly the offending column seems a bit
tough for partition overlap errors.  The patch does allow to single
out which one of the lower and upper bounds is causing the overlap
with an existing partition, which is better than now and seems helpful
enough.

Also, updated Alexandra's patch to incorporate Ashutosh's comment such
that we get the same output with ATTACH PARTITION commands too.

-- 

Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Вложения

Re: Report error position in partition bound check

От
Ashutosh Bapat
Дата:


On Fri, 10 Apr 2020 at 14:31, Amit Langote <amitlangote09@gmail.com> wrote:
On Thu, Apr 9, 2020 at 11:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> While I'm quite on board with providing useful error cursors,
> the example cases in this patch don't seem all that useful:
>
>  -- trying to create range partition with empty range
>  CREATE TABLE fail_part PARTITION OF range_parted2 FOR VALUES FROM (1) TO (0);
>  ERROR:  empty range bound specified for partition "fail_part"
> +LINE 1: ...E fail_part PARTITION OF range_parted2 FOR VALUES FROM (1) T...
> +                                                             ^
>  DETAIL:  Specified lower bound (1) is greater than or equal to upper bound (0).
>
> As best I can tell from these examples, the cursor will always
> point at the FROM keyword, making it pretty unhelpful.  It seems
> like in addition to getting the query string passed down, you
> need to do some work on the code that's actually reporting the
> error position.  I'd expect at a minimum that the pointer allows
> identifying which column of a multi-column partition key is
> giving trouble.  The phrasing of this particular message, for
> example, suggests that it ought to point at the "1" expression.

I agree with that.  Tried that in the attached 0002, although trying
to get the cursor to point to exactly the offending column seems a bit
tough for partition overlap errors.  The patch does allow to single
out which one of the lower and upper bounds is causing the overlap
with an existing partition, which is better than now and seems helpful
enough.

Also, updated Alexandra's patch to incorporate Ashutosh's comment such
that we get the same output with ATTACH PARTITION commands too.

I looked at this briefly. It looks good, but I will review more in the next CF. Do we have entry there yet? To nit-pick: for a multi-key value the ^ points to the first column and the reader may think that that's the problematci column. Should it instead point to ( ? 

--
Best Wishes,
Ashutosh

Re: Report error position in partition bound check

От
Alexandra Wang
Дата:
On Fri, 10 Apr 2020 at 14:31, Amit Langote <amitlangote09@gmail.com> wrote:
> I agree with that.  Tried that in the attached 0002, although trying
> to get the cursor to point to exactly the offending column seems a bit
> tough for partition overlap errors.  The patch does allow to single
> out which one of the lower and upper bounds is causing the overlap
> with an existing partition, which is better than now and seems helpful
> enough.
>
> Also, updated Alexandra's patch to incorporate Ashutosh's comment such
> that we get the same output with ATTACH PARTITION commands too.

Thank you Amit for updating the patches, the cursor looks much helpful now. I
created the commitfest entry https://commitfest.postgresql.org/28/2533/

Re: Report error position in partition bound check

От
Alexandra Wang
Дата:


On Fri, Apr 10, 2020 at 8:37 AM Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com> wrote:
> for a multi-key value the ^
> points to the first column and the reader may think that that's the
> problematci column. Should it instead point to ( ?

I attached a v2 of Amit's 0002 patch to also report the exact column
for the partition overlap errors.


Вложения

Re: Report error position in partition bound check

От
Daniel Gustafsson
Дата:
> On 10 Apr 2020, at 23:50, Alexandra Wang <lewang@pivotal.io> wrote:

> On Fri, Apr 10, 2020 at 8:37 AM Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com
<mailto:ashutosh.bapat@2ndquadrant.com>>wrote: 
> > for a multi-key value the ^
> > points to the first column and the reader may think that that's the
> > problematci column. Should it instead point to ( ?
>
> I attached a v2 of Amit's 0002 patch to also report the exact column
> for the partition overlap errors.

This patch fails to apply to HEAD due to conflicts in the create_table expected
output.  Can you please submit a rebased version?  I'm marking the CF entry
Waiting on Author in the meantime.

cheers ./daniel


Re: Report error position in partition bound check

От
Alexandra Wang
Дата:
> On 2 July 2020, at 06:39, Daniel Gustafsson <daniel@yesql.se> wrote:
> > On 10 Apr 2020, at 23:50, Alexandra Wang <lewang@pivotal.io> wrote:
>
> > On Fri, Apr 10, 2020 at 8:37 AM Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com <mailto:ashutosh.bapat@2ndquadrant.com>> wrote:
> > > for a multi-key value the ^
> > > points to the first column and the reader may think that that's the
> > > problematci column. Should it instead point to ( ?
> >
> > I attached a v2 of Amit's 0002 patch to also report the exact column
> > for the partition overlap errors.
>
> This patch fails to apply to HEAD due to conflicts in the create_table expected
> output.  Can you please submit a rebased version?  I'm marking the CF entry
> Waiting on Author in the meantime.

Thank you Daniel. Here's the rebased patch. I also squashed the two
patches into one so it's easier to review.

-- 
Alexandra Wang
Вложения

Re: Report error position in partition bound check

От
Alexandra Wang
Дата:
> On 2 July 2020, at 06:39, Daniel Gustafsson <daniel@yesql.se> wrote:
> > On 10 Apr 2020, at 23:50, Alexandra Wang <lewang@pivotal.io> wrote:
>
> > On Fri, Apr 10, 2020 at 8:37 AM Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com <mailto:ashutosh.bapat@2ndquadrant.com>> wrote:
> > > for a multi-key value the ^
> > > points to the first column and the reader may think that that's the
> > > problematci column. Should it instead point to ( ?
> >
> > I attached a v2 of Amit's 0002 patch to also report the exact column
> > for the partition overlap errors.
>
> This patch fails to apply to HEAD due to conflicts in the create_table expected
> output.  Can you please submit a rebased version?  I'm marking the CF entry
> Waiting on Author in the meantime.

Thank you Daniel. Here's the rebased patch. I also squashed the two
patches into one so it's easier to review.

--
Alex

Вложения

Re: Report error position in partition bound check

От
Ashutosh Bapat
Дата:


On Fri, 10 Jul 2020 at 23:31, Alexandra Wang <alexandra.wanglei@gmail.com> wrote:


Thank you Daniel. Here's the rebased patch. I also squashed the two
patches into one so it's easier to review.

Thanks for rebasing patch. It applies cleanly still. Here are some comments
@@ -3320,7 +3338,9 @@ make_one_partition_rbound(PartitionKey key, int index, List *datums, bool lower)
  * partition_rbound_cmp
  *
  * Return for two range bounds whether the 1st one (specified in datums1,

I think it's better to reword it as. "For two range bounds decide whether ...

- * kind1, and lower1) is <, =, or > the bound specified in *b2.
+ * kind1, and lower1) is <, =, or > the bound specified in *b2. 0 is returned if
+ * equal and the 1-based index of the first mismatching bound if unequal;
+ * multiplied by -1 if the 1st bound is smaller.

This sentence makes sense after the above correction. I liked this change,
requires very small changes in other parts.

 
 /*
@@ -3495,7 +3518,7 @@ static int
 partition_range_bsearch(int partnatts, FmgrInfo *partsupfunc,
                        Oid *partcollation,
                        PartitionBoundInfo boundinfo,
-                       PartitionRangeBound *probe, bool *is_equal)
+                       PartitionRangeBound *probe, bool *is_equal, int32 *cmpval)

Please update the prologue explaining the new argument. 

After this change, the patch will be ready for a committer.
--
Best Wishes,
Ashutosh

Re: Report error position in partition bound check

От
Michael Paquier
Дата:
On Fri, Sep 04, 2020 at 07:42:27PM +0530, Ashutosh Bapat wrote:
> After this change, the patch will be ready for a committer.

Alexandra, this patch is waiting on author after this review.  Could
you answer to the points raised by Ashutosh and update this patch
accordingly?
--
Michael

Вложения

Re: Report error position in partition bound check

От
Amit Langote
Дата:
Hi Ashutosh,

I had forgotten about this thread, but Michael's ping email brought it
to my attention.

On Fri, Sep 4, 2020 at 11:12 PM Ashutosh Bapat
<ashutosh.bapat@2ndquadrant.com> wrote:
> Thanks for rebasing patch. It applies cleanly still. Here are some comments

Thanks for the review.

> @@ -3320,7 +3338,9 @@ make_one_partition_rbound(PartitionKey key, int index, List *datums, bool lower)
>   * partition_rbound_cmp
>   *
>   * Return for two range bounds whether the 1st one (specified in datums1,
>
> I think it's better to reword it as. "For two range bounds decide whether ...
>
> - * kind1, and lower1) is <, =, or > the bound specified in *b2.
> + * kind1, and lower1) is <, =, or > the bound specified in *b2. 0 is returned if
> + * equal and the 1-based index of the first mismatching bound if unequal;
> + * multiplied by -1 if the 1st bound is smaller.
>
> This sentence makes sense after the above correction. I liked this change,
> requires very small changes in other parts.

+1 to your suggested rewording, although I wrote: "For two range
bounds this decides whether..."

>  /*
> @@ -3495,7 +3518,7 @@ static int
>  partition_range_bsearch(int partnatts, FmgrInfo *partsupfunc,
>                         Oid *partcollation,
>                         PartitionBoundInfo boundinfo,
> -                       PartitionRangeBound *probe, bool *is_equal)
> +                       PartitionRangeBound *probe, bool *is_equal, int32 *cmpval)
>
> Please update the prologue explaining the new argument.

Done.  Actually, I noticed that *is_equal was unused in this
function's only caller.  *cmpval == 0 already gives that, so removed
is_equal parameter.

Attached updated version.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Вложения

Re: Report error position in partition bound check

От
Ashutosh Bapat
Дата:


On Thu, 17 Sep 2020 at 13:06, Amit Langote <amitlangote09@gmail.com> wrote:
Hi Ashutosh,

I had forgotten about this thread, but Michael's ping email brought it
to my attention.

Thanks Amit for addressing comments.

@@ -4256,5 +4256,8 @@ transformPartitionBoundValue(ParseState *pstate, Node *val,
  if (!IsA(value, Const))
  elog(ERROR, "could not evaluate partition bound expression");
 
+ /* Preserve parser location information. */
+ ((Const *) value)->location = exprLocation(val);
+
  return (Const *) value;
 }

This caught my attention and I was wondering whether transformExpr() itself should transfer the location from input expression to the output expression. Some minions of transformExprRecurse() seem to be doing that. The change here may be an indication that some of them are not doing this. In that case may be it's better to find those and fix rather than a white-wash fix here. In what case did we find that location was not set by transformExpr? Sorry for not catching this earlier.

/* New lower bound is certainly >= bound at offet. */
offet/offset? But this comment is implied by the comment just two lines above. So I am not sure it's really needed.

/* Fetch the problem bound from lower datums list. */
This is fetching problematic key value rather than the whole problematic bound. I think the comment would be useful if it explains why cmpval -1 th key is problematic but then that's evident from the prologue of partition_rbound_cmp() so I am not sure if this comment is really required. For example, we aren't adding a comment here
+ overlap_location = ((PartitionRangeDatum *)
+ list_nth(spec->upperdatums, -cmpval - 1))->location;

--
Best Wishes,
Ashutosh

Re: Report error position in partition bound check

От
Amit Langote
Дата:
Thanks Ashutosh.

On Fri, Sep 18, 2020 at 7:33 PM Ashutosh Bapat
<ashutosh.bapat@2ndquadrant.com> wrote:
> Thanks Amit for addressing comments.
>
> @@ -4256,5 +4256,8 @@ transformPartitionBoundValue(ParseState *pstate, Node *val,
>   if (!IsA(value, Const))
>   elog(ERROR, "could not evaluate partition bound expression");
>
> + /* Preserve parser location information. */
> + ((Const *) value)->location = exprLocation(val);
> +
>   return (Const *) value;
>  }
>
> This caught my attention and I was wondering whether transformExpr() itself should transfer the location from input
expressionto the output expression. Some minions of transformExprRecurse() seem to be doing that. The change here may
bean indication that some of them are not doing this. In that case may be it's better to find those and fix rather than
awhite-wash fix here. In what case did we find that location was not set by transformExpr? Sorry for not catching this
earlier.

AFAICS, transformExpr() is fine.  What loses the location value is the
unconditional evaluate_expr() call which generates a fresh Const node,
possibly after evaluating a non-Const expression that is passed to it.
I don't find it very desirable to change evaluate_expr() to accept a
location value, because other callers of it don't seem to care.
Instead, in the updated patch, I have made calling evaluate_expr()
conditional on the expression at hand being a non-Const node and
assign location by hand on return.  If the expression is already
Const, we don't need to update the location field as it should already
be correct.  Though, I did notice that the evaluate_expr() call has an
additional responsibility which is to pass the partition key specified
collation to the bound expression, so we should not fail to update an
already-Const node's collation likewise.

> /* New lower bound is certainly >= bound at offet. */
> offet/offset? But this comment is implied by the comment just two lines above. So I am not sure it's really needed.

Given that cmpval is set all the way in partition_range_bsearch(), I
thought it would help to clarify why this code can assume it must be
>= 0.  It is because a valid offset returned by
partition_range_bsearch() must correspond to a bound that it found to
be <= the probe bound passed to it.

> /* Fetch the problem bound from lower datums list. */
> This is fetching problematic key value rather than the whole problematic bound. I think the comment would be useful
ifit explains why cmpval -1 th key is problematic but then that's evident from the prologue of partition_rbound_cmp()
soI am not sure if this comment is really required. For example, we aren't adding a comment here 
> + overlap_location = ((PartitionRangeDatum *)
> + list_nth(spec->upperdatums, -cmpval - 1))->location;

In the attached updated patch, I have tried to make the code and
comments for different cases consistent.  Please have a look.


--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Вложения

Re: Report error position in partition bound check

От
Ashutosh Bapat
Дата:


On Wed, 23 Sep 2020 at 14:41, Amit Langote <amitlangote09@gmail.com> wrote:
Thanks Ashutosh.

On Fri, Sep 18, 2020 at 7:33 PM Ashutosh Bapat
<ashutosh.bapat@2ndquadrant.com> wrote:
> Thanks Amit for addressing comments.
>
> @@ -4256,5 +4256,8 @@ transformPartitionBoundValue(ParseState *pstate, Node *val,
>   if (!IsA(value, Const))
>   elog(ERROR, "could not evaluate partition bound expression");
>
> + /* Preserve parser location information. */
> + ((Const *) value)->location = exprLocation(val);
> +
>   return (Const *) value;
>  }
>
> This caught my attention and I was wondering whether transformExpr() itself should transfer the location from input expression to the output expression. Some minions of transformExprRecurse() seem to be doing that. The change here may be an indication that some of them are not doing this. In that case may be it's better to find those and fix rather than a white-wash fix here. In what case did we find that location was not set by transformExpr? Sorry for not catching this earlier.

AFAICS, transformExpr() is fine.  What loses the location value is the
unconditional evaluate_expr() call which generates a fresh Const node,
possibly after evaluating a non-Const expression that is passed to it.
I don't find it very desirable to change evaluate_expr() to accept a
location value, because other callers of it don't seem to care.
Instead, in the updated patch, I have made calling evaluate_expr()
conditional on the expression at hand being a non-Const node and
assign location by hand on return.  If the expression is already
Const, we don't need to update the location field as it should already
be correct.  Though, I did notice that the evaluate_expr() call has an
additional responsibility which is to pass the partition key specified
collation to the bound expression, so we should not fail to update an
already-Const node's collation likewise.

Thanks for the detailed explanation. I am not sure whether skipping one evaluate_expr() call for a constant is better or reassigning the location. This looks better than the last patch.


> /* New lower bound is certainly >= bound at offet. */
> offet/offset? But this comment is implied by the comment just two lines above. So I am not sure it's really needed.

Given that cmpval is set all the way in partition_range_bsearch(), I
thought it would help to clarify why this code can assume it must be
>= 0.  It is because a valid offset returned by
partition_range_bsearch() must correspond to a bound that it found to
be <= the probe bound passed to it.

> /* Fetch the problem bound from lower datums list. */
> This is fetching problematic key value rather than the whole problematic bound. I think the comment would be useful if it explains why cmpval -1 th key is problematic but then that's evident from the prologue of partition_rbound_cmp() so I am not sure if this comment is really required. For example, we aren't adding a comment here
> + overlap_location = ((PartitionRangeDatum *)
> + list_nth(spec->upperdatums, -cmpval - 1))->location;

In the attached updated patch, I have tried to make the code and
comments for different cases consistent.  Please have a look.



The comments look okay to me. I don't see a way to keep them short and yet avoid reading the prologue of partition_range_bsearch(). And there is no point in repeating a portion of that prologue at multiple places. So I am fine with these set of comments.

Setting this CF entry as "RFC". Thanks.

--
Best Wishes,
Ashutosh

Re: Report error position in partition bound check

От
Amit Langote
Дата:
On Wed, Sep 23, 2020 at 10:22 PM Ashutosh Bapat
<ashutosh.bapat@2ndquadrant.com> wrote:
> On Wed, 23 Sep 2020 at 14:41, Amit Langote <amitlangote09@gmail.com> wrote:
> Setting this CF entry as "RFC". Thanks.

Great, thanks for your time on this.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



Re: Report error position in partition bound check

От
Tom Lane
Дата:
I looked this over and pushed it with some minor adjustments.

However, while I was looking at it I couldn't help noticing that
transformPartitionBoundValue's handling of collation concerns seems
less than sane.  There are two things bugging me:

1. Why does it care about the expression's collation only when there's
a top-level CollateExpr?  For example, that means we get an error for

regression=# create table p (f1 text collate "C") partition by list(f1);
CREATE TABLE
regression=# create table c1 partition of p for values in ('a' collate "POSIX");
ERROR:  collation of partition bound value for column "f1" does not match partition key collation "C"

but not this:

regression=# create table c2 partition of p for values in ('a' || 'b' collate "POSIX");
CREATE TABLE

Given that we will override the expression's collation with the partition
column's collation anyway, I don't see why we have this check at all,
so my preference is to just rip out the entire stanza beginning with
"if (IsA(value, CollateExpr))".  If we keep it, though, I think it needs
to do something else that is more general.

2. Nothing is doing assign_expr_collations() on the partition expression.
This can trivially be shown to cause problems:

regression=# create table p (f1 bool) partition by list(f1);
CREATE TABLE
regression=# create table cf partition of p for values in ('a' < 'b');
ERROR:  could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.


If we want to rip out the collation mismatch error altogether, then
fixing #2 would just require inserting assign_expr_collations() before
the expression_planner() call.  The other direction that would make
sense to me is to perform assign_expr_collations() after
coerce_to_target_type(), and then to complain if exprCollation()
isn't default and doesn't match the partition collation.  In any
case a specific test for a CollateExpr seems quite wrong.

            regards, tom lane



Re: Report error position in partition bound check

От
Amit Langote
Дата:
On Thu, Sep 24, 2020 at 7:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I looked this over and pushed it with some minor adjustments.

Thank you.

> However, while I was looking at it I couldn't help noticing that
> transformPartitionBoundValue's handling of collation concerns seems
> less than sane.  There are two things bugging me:
>
> 1. Why does it care about the expression's collation only when there's
> a top-level CollateExpr?  For example, that means we get an error for
>
> regression=# create table p (f1 text collate "C") partition by list(f1);
> CREATE TABLE
> regression=# create table c1 partition of p for values in ('a' collate "POSIX");
> ERROR:  collation of partition bound value for column "f1" does not match partition key collation "C"
>
> but not this:
>
> regression=# create table c2 partition of p for values in ('a' || 'b' collate "POSIX");
> CREATE TABLE
>
> Given that we will override the expression's collation with the partition
> column's collation anyway, I don't see why we have this check at all,
> so my preference is to just rip out the entire stanza beginning with
> "if (IsA(value, CollateExpr))".  If we keep it, though, I think it needs
> to do something else that is more general.
>
> 2. Nothing is doing assign_expr_collations() on the partition expression.
> This can trivially be shown to cause problems:
>
> regression=# create table p (f1 bool) partition by list(f1);
> CREATE TABLE
> regression=# create table cf partition of p for values in ('a' < 'b');
> ERROR:  could not determine which collation to use for string comparison
> HINT:  Use the COLLATE clause to set the collation explicitly.
>
>
> If we want to rip out the collation mismatch error altogether, then
> fixing #2 would just require inserting assign_expr_collations() before
> the expression_planner() call.  The other direction that would make
> sense to me is to perform assign_expr_collations() after
> coerce_to_target_type(), and then to complain if exprCollation()
> isn't default and doesn't match the partition collation.  In any
> case a specific test for a CollateExpr seems quite wrong.

I tried implementing that as attached and one test failed:

create table test_part_coll_posix (a text) partition by range (a
collate "POSIX");
...
create table test_part_coll_cast2 partition of test_part_coll_posix
for values from (name 's') to ('z');
+ERROR:  collation of partition bound value for column "a" does not
match partition key collation "POSIX"
+LINE 1: ...ion of test_part_coll_posix for values from (name 's') to ('...

I dug up the discussion which resulted in this test being added and
found that Peter E had opined that this failure should not occur [1].
Maybe that is why I put that half-baked guard consisting of checking
if the erroneous collation comes from an explicit COLLATE clause.  Now
I think maybe giving an error is alright but we should tell in the
DETAIL message what the expression's collation is, like as follows:

create table test_part_coll_cast2 partition of test_part_coll_posix
for values from (name 's') to ('z');
+ERROR:  collation of partition bound value for column "a" does not
match partition key collation "POSIX"
+LINE 1: ...ion of test_part_coll_posix for values from (name 's') to ('...
+                                                             ^
+DETAIL:  The collation of partition bound value is "C".

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/04661508-b6f5-177e-6f6b-c4cb8426b9f0%402ndquadrant.com

Вложения

Re: Report error position in partition bound check

От
Tom Lane
Дата:
[ cc'ing Peter, since his opinion seems to have got us here in the first place ]

Amit Langote <amitlangote09@gmail.com> writes:
> On Thu, Sep 24, 2020 at 7:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> However, while I was looking at it I couldn't help noticing that
>> transformPartitionBoundValue's handling of collation concerns seems
>> less than sane.  There are two things bugging me:
>>
>> 1. Why does it care about the expression's collation only when there's
>> a top-level CollateExpr?  For example, that means we get an error for
>>
>> regression=# create table p (f1 text collate "C") partition by list(f1);
>> CREATE TABLE
>> regression=# create table c1 partition of p for values in ('a' collate "POSIX");
>> ERROR:  collation of partition bound value for column "f1" does not match partition key collation "C"
>>
>> but not this:
>>
>> regression=# create table c2 partition of p for values in ('a' || 'b' collate "POSIX");
>> CREATE TABLE
>>
>> Given that we will override the expression's collation with the partition
>> column's collation anyway, I don't see why we have this check at all,
>> so my preference is to just rip out the entire stanza beginning with
>> "if (IsA(value, CollateExpr))".  If we keep it, though, I think it needs
>> to do something else that is more general.

> I dug up the discussion which resulted in this test being added and
> found that Peter E had opined that this failure should not occur [1].

Well, I agree with Peter to that extent, but my opinion is that *none*
of these cases ought to be errors.  What we're doing here is performing
an implicit assignment-level coercion of the expression to the type of
the column, and changing the collation is allowed as part of that:

regression=# create table foo (f1 text collate "C");
CREATE TABLE
regression=# insert into foo values ('a' COLLATE "POSIX");
INSERT 0 1
regression=# update foo set f1 = 'b' COLLATE "POSIX";
UPDATE 1

So I find it completely inconsistent that the partitioning logic
complains about equivalent cases.  I think we should just rip the
whole thing out, as per the attached draft.  This causes several
regression test results to change, but AFAICS those are only there
to exercise the error tests that I think we should get rid of.

            regards, tom lane

diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 164312d60e..0dc03dd984 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -4183,50 +4183,6 @@ transformPartitionBoundValue(ParseState *pstate, Node *val,
      */
     Assert(!contain_var_clause(value));

-    /*
-     * Check that the input expression's collation is compatible with one
-     * specified for the parent's partition key (partcollation).  Don't throw
-     * an error if it's the default collation which we'll replace with the
-     * parent's collation anyway.
-     */
-    if (IsA(value, CollateExpr))
-    {
-        Oid            exprCollOid = exprCollation(value);
-
-        /*
-         * Check we have a collation iff it is a collatable type.  The only
-         * expected failures here are (1) COLLATE applied to a noncollatable
-         * type, or (2) partition bound expression had an unresolved
-         * collation.  But we might as well code this to be a complete
-         * consistency check.
-         */
-        if (type_is_collatable(colType))
-        {
-            if (!OidIsValid(exprCollOid))
-                ereport(ERROR,
-                        (errcode(ERRCODE_INDETERMINATE_COLLATION),
-                         errmsg("could not determine which collation to use for partition bound expression"),
-                         errhint("Use the COLLATE clause to set the collation explicitly.")));
-        }
-        else
-        {
-            if (OidIsValid(exprCollOid))
-                ereport(ERROR,
-                        (errcode(ERRCODE_DATATYPE_MISMATCH),
-                         errmsg("collations are not supported by type %s",
-                                format_type_be(colType))));
-        }
-
-        if (OidIsValid(exprCollOid) &&
-            exprCollOid != DEFAULT_COLLATION_OID &&
-            exprCollOid != partCollation)
-            ereport(ERROR,
-                    (errcode(ERRCODE_DATATYPE_MISMATCH),
-                     errmsg("collation of partition bound value for column \"%s\" does not match partition key
collation\"%s\"", 
-                            colName, get_collation_name(partCollation)),
-                     parser_errposition(pstate, exprLocation(value))));
-    }
-
     /*
      * Coerce to the correct type.  This might cause an explicit coercion step
      * to be added on top of the expression, which must be evaluated before
@@ -4253,6 +4209,7 @@ transformPartitionBoundValue(ParseState *pstate, Node *val,
      */
     if (!IsA(value, Const))
     {
+        assign_expr_collations(pstate, value);
         value = (Node *) expression_planner((Expr *) value);
         value = (Node *) evaluate_expr((Expr *) value, colType, colTypmod,
                                        partCollation);

Re: Report error position in partition bound check

От
Amit Langote
Дата:
On Fri, Sep 25, 2020 at 12:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [ cc'ing Peter, since his opinion seems to have got us here in the first place ]
>
> Amit Langote <amitlangote09@gmail.com> writes:
> > On Thu, Sep 24, 2020 at 7:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> However, while I was looking at it I couldn't help noticing that
> >> transformPartitionBoundValue's handling of collation concerns seems
> >> less than sane.  There are two things bugging me:
> >>
> >> 1. Why does it care about the expression's collation only when there's
> >> a top-level CollateExpr?  For example, that means we get an error for
> >>
> >> regression=# create table p (f1 text collate "C") partition by list(f1);
> >> CREATE TABLE
> >> regression=# create table c1 partition of p for values in ('a' collate "POSIX");
> >> ERROR:  collation of partition bound value for column "f1" does not match partition key collation "C"
> >>
> >> but not this:
> >>
> >> regression=# create table c2 partition of p for values in ('a' || 'b' collate "POSIX");
> >> CREATE TABLE
> >>
> >> Given that we will override the expression's collation with the partition
> >> column's collation anyway, I don't see why we have this check at all,
> >> so my preference is to just rip out the entire stanza beginning with
> >> "if (IsA(value, CollateExpr))".  If we keep it, though, I think it needs
> >> to do something else that is more general.
>
> > I dug up the discussion which resulted in this test being added and
> > found that Peter E had opined that this failure should not occur [1].
>
> Well, I agree with Peter to that extent, but my opinion is that *none*
> of these cases ought to be errors.  What we're doing here is performing
> an implicit assignment-level coercion of the expression to the type of
> the column, and changing the collation is allowed as part of that:
>
> regression=# create table foo (f1 text collate "C");
> CREATE TABLE
> regression=# insert into foo values ('a' COLLATE "POSIX");
> INSERT 0 1
> regression=# update foo set f1 = 'b' COLLATE "POSIX";
> UPDATE 1
>
> So I find it completely inconsistent that the partitioning logic
> complains about equivalent cases.

My perhaps wrong impression was that the bound expression that is
specified when creating a partition is not as such being *assigned* to
the key column, but now that I think about it some more, that doesn't
matter.

>  I think we should just rip the
> whole thing out, as per the attached draft.  This causes several
> regression test results to change, but AFAICS those are only there
> to exercise the error tests that I think we should get rid of.

Yeah, I can see no other misbehavior resulting from this.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



Re: Report error position in partition bound check

От
Tom Lane
Дата:
Amit Langote <amitlangote09@gmail.com> writes:
> On Fri, Sep 25, 2020 at 12:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Well, I agree with Peter to that extent, but my opinion is that *none*
>> of these cases ought to be errors.  What we're doing here is performing
>> an implicit assignment-level coercion of the expression to the type of
>> the column, and changing the collation is allowed as part of that:
>> 
>> regression=# create table foo (f1 text collate "C");
>> CREATE TABLE
>> regression=# insert into foo values ('a' COLLATE "POSIX");
>> INSERT 0 1
>> regression=# update foo set f1 = 'b' COLLATE "POSIX";
>> UPDATE 1
>> 
>> So I find it completely inconsistent that the partitioning logic
>> complains about equivalent cases.

> My perhaps wrong impression was that the bound expression that is
> specified when creating a partition is not as such being *assigned* to
> the key column, but now that I think about it some more, that doesn't
> matter.

>> I think we should just rip the
>> whole thing out, as per the attached draft.  This causes several
>> regression test results to change, but AFAICS those are only there
>> to exercise the error tests that I think we should get rid of.

> Yeah, I can see no other misbehavior resulting from this.

OK, I'll clean up the regression test cases and push that.

(Although this could be claimed to be a bug, I do not feel
a need to back-patch the behavioral change.)

            regards, tom lane



Re: Report error position in partition bound check

От
Amit Langote
Дата:
On Tue, Sep 29, 2020 at 2:01 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > On Fri, Sep 25, 2020 at 12:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Well, I agree with Peter to that extent, but my opinion is that *none*
> >> of these cases ought to be errors.  What we're doing here is performing
> >> an implicit assignment-level coercion of the expression to the type of
> >> the column, and changing the collation is allowed as part of that:
> >>
> >> regression=# create table foo (f1 text collate "C");
> >> CREATE TABLE
> >> regression=# insert into foo values ('a' COLLATE "POSIX");
> >> INSERT 0 1
> >> regression=# update foo set f1 = 'b' COLLATE "POSIX";
> >> UPDATE 1
> >>
> >> So I find it completely inconsistent that the partitioning logic
> >> complains about equivalent cases.
>
> > My perhaps wrong impression was that the bound expression that is
> > specified when creating a partition is not as such being *assigned* to
> > the key column, but now that I think about it some more, that doesn't
> > matter.
>
> >> I think we should just rip the
> >> whole thing out, as per the attached draft.  This causes several
> >> regression test results to change, but AFAICS those are only there
> >> to exercise the error tests that I think we should get rid of.
>
> > Yeah, I can see no other misbehavior resulting from this.
>
> OK, I'll clean up the regression test cases and push that.

Thanks.

> (Although this could be claimed to be a bug, I do not feel
> a need to back-patch the behavioral change.)

Agreed.  The assign_expr_collations() omission was indeed a bug.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com