Discussion: BUG #9204: truncate_identifier may be called unnecessarily on escaped quoted identifiers

BUG #9204: truncate_identifier may be called unnecessarily on escaped quoted identifiers

From: pythonesque@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      9204
Logged by:          Joshua Yanovski
Email address:      pythonesque@gmail.com
PostgreSQL version: 9.3.2
Operating system:   Ubuntu 12.04
Description:

As in description.  This follows from how these are scanned in scan.l:

    ident = litbuf_udeescape('\\', yyscanner);
    if (yyextra->literallen >= NAMEDATALEN)
        truncate_identifier(ident, yyextra->literallen, true);

Because literallen is the length of the original, still-escaped string, this
does unnecessary work (and reports a misleading notice) when the de-escaped
string is shorter than NAMEDATALEN.

    psql -v 'VERBOSITY=verbose' -c "select
    U&\"abcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcd\\3737\"
    FROM dummy"
    NOTICE:  42622: identifier
    "abcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcd㜷"
    will be truncated to
    "abcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcd㜷"
    LOCATION:  truncate_identifier, scansup.c:195

It is a pretty borderline edge case and doesn't have any serious
consequences, but it does seem like it should be easy to fix without a huge
hit to efficiency, considering that the length can be calculated in constant
time from known information in litbuf_udeescape.
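
To make the length mismatch concrete, here is a minimal standalone sketch rather than the actual scanner code. It assumes NAMEDATALEN is 64 (the default) and that the de-escaped identifier is UTF-8, where U+3737 takes three bytes. The names needs_truncation_reported, needs_truncation_deescaped, raw_literal and deescaped_ident are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NAMEDATALEN 64			/* assumed: PostgreSQL's default value */

/* The identifier from the report as typed: 60 ASCII chars plus "\3737". */
const char	raw_literal[] =
	"abcdefgh" "abcdefgh" "abcdefgh" "abcdefgh"
	"abcdefgh" "abcdefgh" "abcdefgh" "abcd" "\\3737";

/* The same identifier after de-escaping: U+3737 is E3 9C B7 in UTF-8. */
const char	deescaped_ident[] =
	"abcdefgh" "abcdefgh" "abcdefgh" "abcdefgh"
	"abcdefgh" "abcdefgh" "abcdefgh" "abcd" "\xe3\x9c\xb7";

/* Check as reported: looks at the length of the raw, still-escaped literal. */
bool
needs_truncation_reported(size_t literallen)
{
	return literallen >= NAMEDATALEN;
}

/* Hypothetical alternative: looks at the length of the de-escaped result. */
bool
needs_truncation_deescaped(const char *ident)
{
	return strlen(ident) >= NAMEDATALEN;
}
```

Under these assumptions the raw literal is 65 characters long, so the first check fires, while the de-escaped identifier is only 63 bytes, so the second does not. That matches the notice above, where the "truncated" name is identical to the original.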

Re: BUG #9204: truncate_identifier may be called unnecessarily on escaped quoted identifiers

From: Tom Lane
Date:
pythonesque@gmail.com writes:
> As in description.  This follows from how these are scanned in scan.l:

>     ident = litbuf_udeescape('\\', yyscanner);
>     if (yyextra->literallen >= NAMEDATALEN)
>         truncate_identifier(ident, yyextra->literallen, true);

Yeah, that's a bug --- yyextra->literallen is not the thing to use here.
It's just luck that truncate_identifier doesn't fail entirely, since
we're violating its API contract.  Will fix, thanks for reporting it.

            regards, tom lane

Re: BUG #9204: truncate_identifier may be called unnecessarily on escaped quoted identifiers

From: Joshua Yanovski
Date:
There is one other thing I noticed in that area of the code--namely, if
NAMEDATALEN is low enough, an identifier can be truncated down to an empty
identifier, since the check for empty identifier length is done before the
call to truncate_identifier.  But I doubt this will ever be a problem in
practice and there may be other compensatory checks elsewhere.


On Thu, Feb 13, 2014 at 9:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> pythonesque@gmail.com writes:
> > As in description.  This follows from how these are scanned in scan.l:
>
> >     ident = litbuf_udeescape('\\', yyscanner);
> >     if (yyextra->literallen >= NAMEDATALEN)
> >         truncate_identifier(ident, yyextra->literallen, true);
>
> Yeah, that's a bug --- yyextra->literallen is not the thing to use here.
> It's just luck that truncate_identifier doesn't fail entirely, since
> we're violating its API contract.  Will fix, thanks for reporting it.
>
>                         regards, tom lane
>



--
Josh

Re: BUG #9204: truncate_identifier may be called unnecessarily on escaped quoted identifiers

From: Tom Lane
Date:
Joshua Yanovski <pythonesque@gmail.com> writes:
> There is one other thing I noticed in that area of the code--namely, if
> NAMEDATALEN is low enough, an identifier can be truncated down to an empty
> identifier, since the check for empty identifier length is done before the
> call to truncate_identifier.  But I doubt this will ever be a problem in
> practice and there may be other compensatory checks elsewhere.

That'd only be possible if NAMEDATALEN were smaller than the longest
possible multibyte character, which I think is not a case we need to
concern ourselves with.  We currently don't support multibytes longer
than 4 bytes, and even if we do full Unicode somewhere down the line,
it'd still only be 6 bytes.  I can't imagine anyone wanting to run
with NAMEDATALEN less than 16 or so --- even if they tried, it'd likely
not work because of conflicts in the names of built-in functions.
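
The argument above can be sketched with a small standalone example. It is not PostgreSQL's own clipping routine: utf8_seq_len and utf8_cliplen are hypothetical helpers that assume UTF-8 input and only inspect lead bytes. The point is that clipping at a character boundary yields an empty result only when the byte limit is smaller than the first character.

```c
#include <stddef.h>

/*
 * Byte length of the UTF-8 sequence that starts with lead byte c.
 * Simplification for this sketch: unexpected bytes count as length 1.
 */
size_t
utf8_seq_len(unsigned char c)
{
	if (c < 0x80)
		return 1;
	if ((c & 0xE0) == 0xC0)
		return 2;
	if ((c & 0xF0) == 0xE0)
		return 3;
	if ((c & 0xF8) == 0xF0)
		return 4;
	return 1;
}

/*
 * Return the largest prefix length of s (at most len bytes) that is no
 * longer than limit bytes and does not split a multibyte character.
 */
size_t
utf8_cliplen(const char *s, size_t len, size_t limit)
{
	size_t		clipped = 0;

	while (clipped < len)
	{
		size_t		n = utf8_seq_len((unsigned char) s[clipped]);

		/* Stop before a character that would cross the limit or the end. */
		if (clipped + n > limit || clipped + n > len)
			break;
		clipped += n;
	}
	return clipped;
}
```

For a string starting with the three-byte character from the original report, a limit of 2 clips to zero bytes, while any limit of 3 or more keeps at least that character. So an empty identifier would require a byte limit below the widest character the encoding allows.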

            regards, tom lane

Re: BUG #9204: truncate_identifier may be called unnecessarily on escaped quoted identifiers

From: Joshua Yanovski
Date:
Yeah, I agree that it will never be a problem in a real database--just
thought I'd bring it up since it was something I noticed and I couldn't
find any explicit minimum value for it :)  Thanks for fixing this!

--
Josh