Обсуждение: Underscore in positional parameters?

Поиск
Список
Период
Сортировка

Underscore in positional parameters?

От
Erik Wienhold
Дата:
I noticed that we (kind of) accept underscores in positional parameters.
For example, this works:

    => PREPARE p1 AS SELECT $1_2;
    PREPARE
    => EXECUTE p1 (123);
     ?column?
    ----------
     123
    (1 row)

Parameter $1_2 is taken as $1 because in rule {param} in scan.l we get
the parameter number with atol which stops at the underscore.  That's a
regression in faff8f8e47f.  Before that commit, $1_2 resulted in
"ERROR: trailing junk after parameter".

I can't tell which fix is the way to go: (1) accept underscores without
using atol, or (2) just forbid underscores.  Any ideas?

atol can be replaced with pg_strtoint32_safe to handle the underscores.
This also avoids atol's undefined behavior on overflows.  AFAICT,
positional parameters are not part of the SQL standard, so nothing
prevents us from accepting underscores here as well.  The attached patch
does that and also adds a test case.

But reverting {param} to its old form to forbid underscores also makes
sense.  That is:

    param            \${decdigit}+
    param_junk        \${decdigit}+{ident_start}

It seems very unlikely that anybody uses that many parameters and still
cares about readability to use underscores.  But maybe users simply
expect that underscores are valid here as well.

-- 
Erik

Вложения

Re: Underscore in positional parameters?

От
Michael Paquier
Дата:
On Tue, May 14, 2024 at 05:18:24AM +0200, Erik Wienhold wrote:
> Parameter $1_2 is taken as $1 because in rule {param} in scan.l we get
> the parameter number with atol which stops at the underscore.  That's a
> regression in faff8f8e47f.  Before that commit, $1_2 resulted in
> "ERROR: trailing junk after parameter".

Indeed, the behavior of HEAD is confusing.  "1_2" means 12 as a
constant in a query, not 1, but HEAD implies 1 in the context of
PREPARE here.

> I can't tell which fix is the way to go: (1) accept underscores without
> using atol, or (2) just forbid underscores.  Any ideas?

Does the SQL specification tell anything about the way parameters
should be marked?  Not everything out there uses dollar-marked
parameters, so I guess that the answer to my question is no.  My take
is all these cases should be rejected for params, only apply to
numeric and integer constants in the queries.

> atol can be replaced with pg_strtoint32_safe to handle the underscores.
> This also avoids atol's undefined behavior on overflows.  AFAICT,
> positional parameters are not part of the SQL standard, so nothing
> prevents us from accepting underscores here as well.  The attached patch
> does that and also adds a test case.
>
> But reverting {param} to its old form to forbid underscores also makes
> sense.  That is:
>
>     param            \${decdigit}+
>     param_junk        \${decdigit}+{ident_start}
>
> It seems very unlikely that anybody uses that many parameters and still
> cares about readability to use underscores.  But maybe users simply
> expect that underscores are valid here as well.

Adding Dean in CC as the committer of faff8f8e47f, Peter E for the SQL
specification part, and an open item.
--
Michael

Вложения

Re: Underscore in positional parameters?

От
Dean Rasheed
Дата:
On Tue, 14 May 2024 at 07:43, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, May 14, 2024 at 05:18:24AM +0200, Erik Wienhold wrote:
> > Parameter $1_2 is taken as $1 because in rule {param} in scan.l we get
> > the parameter number with atol which stops at the underscore.  That's a
> > regression in faff8f8e47f.  Before that commit, $1_2 resulted in
> > "ERROR: trailing junk after parameter".
>
> Indeed, the behavior of HEAD is confusing.  "1_2" means 12 as a
> constant in a query, not 1, but HEAD implies 1 in the context of
> PREPARE here.
>
> > I can't tell which fix is the way to go: (1) accept underscores without
> > using atol, or (2) just forbid underscores.  Any ideas?
>
> Does the SQL specification tell anything about the way parameters
> should be marked?  Not everything out there uses dollar-marked
> parameters, so I guess that the answer to my question is no.  My take
> is all these cases should be rejected for params, only apply to
> numeric and integer constants in the queries.
>
> Adding Dean in CC as the committer of faff8f8e47f, Peter E for the SQL
> specification part, and an open item.

I'm sure that this wasn't intentional -- I think we just failed to
notice that "param" also uses "decinteger" in the scanner. Taking a
quick look, there don't appear to be any other uses of "decinteger",
so at least it only affects params.

Unless the spec explicitly says otherwise, I agree that we should
reject this, as we used to do, and add a comment saying that it's
intentionally not supported. I can't believe it would ever be useful,
and the current behaviour is clearly broken.

I've moved this to "Older bugs affecting stable branches", since it
came in with v16.

Regards,
Dean



Re: Underscore in positional parameters?

От
Tom Lane
Дата:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On Tue, 14 May 2024 at 07:43, Michael Paquier <michael@paquier.xyz> wrote:
>> On Tue, May 14, 2024 at 05:18:24AM +0200, Erik Wienhold wrote:
>>> Parameter $1_2 is taken as $1 because in rule {param} in scan.l we get
>>> the parameter number with atol which stops at the underscore.  That's a
>>> regression in faff8f8e47f.  Before that commit, $1_2 resulted in
>>> "ERROR: trailing junk after parameter".

> I'm sure that this wasn't intentional -- I think we just failed to
> notice that "param" also uses "decinteger" in the scanner. Taking a
> quick look, there don't appear to be any other uses of "decinteger",
> so at least it only affects params.

> Unless the spec explicitly says otherwise, I agree that we should
> reject this, as we used to do, and add a comment saying that it's
> intentionally not supported. I can't believe it would ever be useful,
> and the current behaviour is clearly broken.

+1, let's put this back the way it was.

            regards, tom lane



Re: Underscore in positional parameters?

От
Erik Wienhold
Дата:
On 2024-05-14 16:40 +0200, Tom Lane wrote:
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> > On Tue, 14 May 2024 at 07:43, Michael Paquier <michael@paquier.xyz> wrote:
> >> On Tue, May 14, 2024 at 05:18:24AM +0200, Erik Wienhold wrote:
> >>> Parameter $1_2 is taken as $1 because in rule {param} in scan.l we get
> >>> the parameter number with atol which stops at the underscore.  That's a
> >>> regression in faff8f8e47f.  Before that commit, $1_2 resulted in
> >>> "ERROR: trailing junk after parameter".
> 
> > I'm sure that this wasn't intentional -- I think we just failed to
> > notice that "param" also uses "decinteger" in the scanner. Taking a
> > quick look, there don't appear to be any other uses of "decinteger",
> > so at least it only affects params.
> 
> > Unless the spec explicitly says otherwise, I agree that we should
> > reject this, as we used to do, and add a comment saying that it's
> > intentionally not supported. I can't believe it would ever be useful,
> > and the current behaviour is clearly broken.
> 
> +1, let's put this back the way it was.

I split the change in two independent patches:

Patch 0001 changes rules param and param_junk to only accept digits 0-9.

Patch 0002 replaces atol with pg_strtoint32_safe in the backend parser
and strtoint in ECPG.  This fixes overflows like:

    => PREPARE p1 AS SELECT $4294967297;  -- same as $1
    PREPARE
    => EXECUTE p1 (123);
     ?column?
    ----------
     123
    (1 row)

    => PREPARE p2 AS SELECT $2147483648;
    ERROR:  there is no parameter $-2147483648
    LINE 1: PREPARE p2 AS SELECT $2147483648;

It now returns this error:

    => PREPARE p1 AS SELECT $4294967297;
    ERROR:  parameter too large at or near $4294967297

    => PREPARE p2 AS SELECT $2147483648;
    ERROR:  parameter too large at or near $2147483648

-- 
Erik

Вложения

Re: Underscore in positional parameters?

От
Michael Paquier
Дата:
On Tue, May 14, 2024 at 10:51:41AM +0100, Dean Rasheed wrote:
> I've moved this to "Older bugs affecting stable branches", since it
> came in with v16.

Oops, thanks for fixing.  I've somewhat missed that b2d47928908d was
in REL_16_STABLE.
--
Michael

Вложения

Re: Underscore in positional parameters?

От
Michael Paquier
Дата:
On Tue, May 14, 2024 at 06:07:51PM +0200, Erik Wienhold wrote:
> I split the change in two independent patches:

The split makes sense to me.

> Patch 0001 changes rules param and param_junk to only accept digits 0-9.

-param            \${decinteger}
-param_junk        \${decinteger}{ident_start}
+/* Positional parameters don't accept underscores. */
+param            \${decdigit}+
+param_junk        \${decdigit}+{ident_start}

scan.l, psqlscan.l and pgc.l are the three files impacted, so that's
good to me.

> Patch 0002 replaces atol with pg_strtoint32_safe in the backend parser
> and strtoint in ECPG.  This fixes overflows like:
>
>     => PREPARE p1 AS SELECT $4294967297;  -- same as $1
>     PREPARE
>
> It now returns this error:
>
>     => PREPARE p1 AS SELECT $4294967297;
>     ERROR:  parameter too large at or near $4294967297

This one is a much older problem, though.  What you are doing is an
improvement, still I don't see a huge point in backpatching that based
on the lack of complaints with these overflows in the yyac paths.

+    if (errno == ERANGE)
+        mmfatal(PARSE_ERROR, "parameter too large");

Knowong that this is working on decdigits, an ERANGE check should be
enough, indeed.
--
Michael

Вложения

Re: Underscore in positional parameters?

От
Peter Eisentraut
Дата:
On 14.05.24 18:07, Erik Wienhold wrote:
> Patch 0001 changes rules param and param_junk to only accept digits 0-9.

I have committed this patch to PG16 and master.

I was a little bit on the fence about what the behavior should be, but I 
checked Perl for comparison:

print 1000;   # ok
print 1_000;  # ok
print $1000;  # ok
print $1_000; # error

So this seems alright.

> Patch 0002 replaces atol with pg_strtoint32_safe in the backend parser
> and strtoint in ECPG.  This fixes overflows like:

Seems like a good idea, but as was said, this is an older issue, so 
let's look at that separately.




Re: Underscore in positional parameters?

От
Michael Paquier
Дата:
On Wed, May 15, 2024 at 01:59:36PM +0200, Peter Eisentraut wrote:
> On 14.05.24 18:07, Erik Wienhold wrote:
>> Patch 0002 replaces atol with pg_strtoint32_safe in the backend parser
>> and strtoint in ECPG.  This fixes overflows like:
>
> Seems like a good idea, but as was said, this is an older issue, so let's
> look at that separately.

Hmm, yeah.  I would be really tempted to fix that now.

Now, it has been this way for ages, and with my RMT hat on (aka I need
to show the example), I'd suggest to wait for when the v18 branch
opens as there is no urgency.  I'm OK to apply it myself at the end,
the patch is a good idea.
--
Michael

Вложения

Re: Underscore in positional parameters?

От
Peter Eisentraut
Дата:
On 16.05.24 01:11, Michael Paquier wrote:
> On Wed, May 15, 2024 at 01:59:36PM +0200, Peter Eisentraut wrote:
>> On 14.05.24 18:07, Erik Wienhold wrote:
>>> Patch 0002 replaces atol with pg_strtoint32_safe in the backend parser
>>> and strtoint in ECPG.  This fixes overflows like:
>>
>> Seems like a good idea, but as was said, this is an older issue, so let's
>> look at that separately.
> 
> Hmm, yeah.  I would be really tempted to fix that now.
> 
> Now, it has been this way for ages, and with my RMT hat on (aka I need
> to show the example), I'd suggest to wait for when the v18 branch
> opens as there is no urgency.  I'm OK to apply it myself at the end,
> the patch is a good idea.

On this specific patch, maybe reword "parameter too large" to "parameter 
number too large".

Also, I was bemused by the use of atol(), which is notoriously 
unportable (sizeof(long)).  So I poked around and found more places that 
might need fixing.  I'm attaching a patch here with annotations too look 
at later.

Вложения

Re: Underscore in positional parameters?

От
Michael Paquier
Дата:
On Thu, May 16, 2024 at 08:41:11AM +0200, Peter Eisentraut wrote:
> On this specific patch, maybe reword "parameter too large" to "parameter
> number too large".

WFM here.

> Also, I was bemused by the use of atol(), which is notoriously unportable
> (sizeof(long)).  So I poked around and found more places that might need
> fixing.  I'm attaching a patch here with annotations too look at later.

Yeah atoXX calls have been funky in the tree for some time.  This
reminds this thread, somewhat:
https://www.postgresql.org/message-id/CALAY4q8be6Qw_2J%3DzOp_v1X-zfMBzvVMkAfmMYv%3DUkr%3D2hPcFQ%40mail.gmail.com

The issue is also that there is no "safe" parsing alternative for 64b
integers in the frontend (as you know long is 32b in Windows, which is
why I'd encourage ripping it out as much as we can).  This may be
better as a complementary of strtoint() in src/common/string.c.  Note
as well strtoint64() in pgbench.c.  I think I have a patch lying
around, actually..
--
Michael

Вложения

Re: Underscore in positional parameters?

От
Erik Wienhold
Дата:
On 2024-05-17 02:06 +0200, Michael Paquier wrote:
> On Thu, May 16, 2024 at 08:41:11AM +0200, Peter Eisentraut wrote:
> > On this specific patch, maybe reword "parameter too large" to "parameter
> > number too large".
> 
> WFM here.

Done in v3.

I noticed this compiler warning with my previous patch:

    scan.l:997:41: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
      997 |                                         ErrorSaveContext escontext = {T_ErrorSaveContext};
          |                                         ^~~~~~~~~~~~~~~~

I thought that I had to factor this out into a function similar to
process_integer_literal (which also uses ErrorSaveContext).  But moving
that declaration to the start of the {param} action was enough in the
end.

While trying out the refactoring, I noticed two small things that can be
fixed as well in scan.l:

* Prototype and definition of addunicode do not match.  The prototype
  uses yyscan_t while the definition uses core_yyscan_t.

* Parameter base of process_integer_literal is unused.

But those should be one a separate thread, right, even for minor fixes?

-- 
Erik

Вложения

Re: Underscore in positional parameters?

От
Alexander Lakhin
Дата:
Hello Erik,

18.05.2024 04:31, Erik Wienhold wrote:
> On 2024-05-17 02:06 +0200, Michael Paquier wrote:
>> On Thu, May 16, 2024 at 08:41:11AM +0200, Peter Eisentraut wrote:
>>> On this specific patch, maybe reword "parameter too large" to "parameter
>>> number too large".
>> WFM here.
> Done in v3.

Thank you for working on this!

I encountered anomalies that you address with this patch too.
And I can confirm that it fixes most cases, but there is another one:
SELECT $300000000 \bind 'foo' \g
ERROR:  invalid memory alloc request size 1200000000

Maybe you would find this worth fixing as well.

Best regards,
Alexander



Re: Underscore in positional parameters?

От
Erik Wienhold
Дата:
On 2024-05-19 07:00 +0200, Alexander Lakhin wrote:
> I encountered anomalies that you address with this patch too.
> And I can confirm that it fixes most cases, but there is another one:
> SELECT $300000000 \bind 'foo' \g
> ERROR:  invalid memory alloc request size 1200000000
> 
> Maybe you would find this worth fixing as well.

Yes, that error message is not great.  In variable_paramref_hook we
check paramno > INT_MAX/sizeof(Oid) when in fact MaxAllocSize/sizeof(Oid)
is the more appropriate limit to avoid that unspecific alloc size error.

Fixed in v4 with a separate patch because it's unrelated to the param
number parsing.  But it fits nicely into the broader issue on the upper
limit for param numbers.  Note that $268435455 is still the largest
possible param number ((2^30-1)/4) and that we just return a more
user-friendly error message for params beyond that limit.

-- 
Erik

Вложения

Re: Underscore in positional parameters?

От
jian he
Дата:
On Sun, May 19, 2024 at 10:43 PM Erik Wienhold <ewie@ewie.name> wrote:
>
> On 2024-05-19 07:00 +0200, Alexander Lakhin wrote:
> > I encountered anomalies that you address with this patch too.
> > And I can confirm that it fixes most cases, but there is another one:
> > SELECT $300000000 \bind 'foo' \g
> > ERROR:  invalid memory alloc request size 1200000000
> >
> > Maybe you would find this worth fixing as well.
>
> Yes, that error message is not great.  In variable_paramref_hook we
> check paramno > INT_MAX/sizeof(Oid) when in fact MaxAllocSize/sizeof(Oid)
> is the more appropriate limit to avoid that unspecific alloc size error.
>
> Fixed in v4 with a separate patch because it's unrelated to the param
> number parsing.  But it fits nicely into the broader issue on the upper
> limit for param numbers.  Note that $268435455 is still the largest
> possible param number ((2^30-1)/4) and that we just return a more
> user-friendly error message for params beyond that limit.
>

hi, one minor issue:

/* Check parameter number is in range */
if (paramno <= 0 || paramno > MaxAllocSize / sizeof(Oid))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_PARAMETER),
errmsg("there is no parameter $%d", paramno),
parser_errposition(pstate, pref->location)));

if paramno <= 0 then "there is no parameter $%d" makes sense to me.

but, if paramno > 0 why not just say, we can only allow MaxAllocSize /
sizeof(Oid) number of parameters.



Re: Underscore in positional parameters?

От
Erik Wienhold
Дата:
On 2024-05-20 03:26 +0200, jian he wrote:
> On Sun, May 19, 2024 at 10:43 PM Erik Wienhold <ewie@ewie.name> wrote:
> >
> > On 2024-05-19 07:00 +0200, Alexander Lakhin wrote:
> > > I encountered anomalies that you address with this patch too.
> > > And I can confirm that it fixes most cases, but there is another one:
> > > SELECT $300000000 \bind 'foo' \g
> > > ERROR:  invalid memory alloc request size 1200000000
> > >
> > > Maybe you would find this worth fixing as well.
> >
> > Yes, that error message is not great.  In variable_paramref_hook we
> > check paramno > INT_MAX/sizeof(Oid) when in fact MaxAllocSize/sizeof(Oid)
> > is the more appropriate limit to avoid that unspecific alloc size error.
> >
> > Fixed in v4 with a separate patch because it's unrelated to the param
> > number parsing.  But it fits nicely into the broader issue on the upper
> > limit for param numbers.  Note that $268435455 is still the largest
> > possible param number ((2^30-1)/4) and that we just return a more
> > user-friendly error message for params beyond that limit.
> >
> 
> hi, one minor issue:
> 
> /* Check parameter number is in range */
> if (paramno <= 0 || paramno > MaxAllocSize / sizeof(Oid))
> ereport(ERROR,
> (errcode(ERRCODE_UNDEFINED_PARAMETER),
> errmsg("there is no parameter $%d", paramno),
> parser_errposition(pstate, pref->location)));
> 
> if paramno <= 0 then "there is no parameter $%d" makes sense to me.
> 
> but, if paramno > 0 why not just say, we can only allow MaxAllocSize /
> sizeof(Oid) number of parameters.

Yes, it makes sense to show the upper bound.  How about a hint such as
"Valid parameters range from $%d to $%d."?

-- 
Erik



Re: Underscore in positional parameters?

От
Tom Lane
Дата:
Erik Wienhold <ewie@ewie.name> writes:
> On 2024-05-20 03:26 +0200, jian he wrote:
>> /* Check parameter number is in range */
>> if (paramno <= 0 || paramno > MaxAllocSize / sizeof(Oid))
>>     ereport(ERROR, ...

> Yes, it makes sense to show the upper bound.  How about a hint such as
> "Valid parameters range from $%d to $%d."?

I kind of feel like this upper bound is ridiculous.  In what scenario
is parameter 250000000 not a mistake, if not indeed somebody trying
to break the system?

The "Bind" protocol message only allows an int16 parameter count,
so rejecting parameter numbers above 32K would make sense to me.

            regards, tom lane



Re: Underscore in positional parameters?

От
Erik Wienhold
Дата:
On 2024-05-20 05:02 +0200, Tom Lane wrote:
> Erik Wienhold <ewie@ewie.name> writes:
> > On 2024-05-20 03:26 +0200, jian he wrote:
> >> /* Check parameter number is in range */
> >> if (paramno <= 0 || paramno > MaxAllocSize / sizeof(Oid))
> >>     ereport(ERROR, ...
> 
> > Yes, it makes sense to show the upper bound.  How about a hint such as
> > "Valid parameters range from $%d to $%d."?
> 
> I kind of feel like this upper bound is ridiculous.  In what scenario
> is parameter 250000000 not a mistake, if not indeed somebody trying
> to break the system?
> 
> The "Bind" protocol message only allows an int16 parameter count,
> so rejecting parameter numbers above 32K would make sense to me.

Agree.  I was already wondering upthread why someone would use that many
parameters.

Out of curiosity, I checked if there might be an even lower limit.  And
indeed, max_stack_depth puts a limit due to some recursive evaluation:

    ERROR:  stack depth limit exceeded
    HINT:  Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's
stackdepth limit is adequate.
 

Attached is the stacktrace for EXECUTE on HEAD (I snipped most of the
recursive frames).

Running \bind, PREPARE, and EXECUTE with following number of parameters
works as expected, although the number varies between releases which is
not ideal IMO.  The commands hit the stack depth limit for #Params+1.

Version            Command  #Params
-----------------  -------  -------
HEAD (18cbed13d5)  \bind    4365
HEAD (18cbed13d5)  PREPARE  8182
HEAD (18cbed13d5)  EXECUTE  4363
16.2               \bind    3968
16.2               PREPARE  6889
16.2               EXECUTE  3966

Those are already pretty large numbers in my view (compared to the 100
parameters that we accept at most for functions).  And I guess nobody
complained about those limits yet, or they just increased
max_stack_depth.

The Python script to generate the test scripts:

    import sys
    n_params = 1 << 16
    if len(sys.argv) > 1:
        n_params = min(n_params, int(sys.argv[1]))
    params = '+'.join(f'${i+1}::int' for i in range(n_params))
    bind_vals = ' '.join('1' for _ in range(n_params))
    exec_vals = ','.join('1' for _ in range(n_params))
    print(fr"SELECT {params} \bind {bind_vals} \g")
    print(f"PREPARE p AS SELECT {params};")
    print(f"EXECUTE p ({exec_vals});")

-- 
Erik

Вложения