Обсуждение: Non-compliant SASLprep implementation for ASCII characters

Поиск
Список
Период
Сортировка

Non-compliant SASLprep implementation for ASCII characters

От
Michael Paquier
Дата:
Hi all,

While reviewing some of the SCRAM code, I have been reminded about the
following bit of code in saslprep.c:
    /*
     * Quick check if the input is pure ASCII.  An ASCII string requires no
     * further processing.
     */
    if (pg_is_ascii(input))
    {
        *output = STRDUP(input);
        if (!(*output))
            goto oom;
        return SASLPREP_SUCCESS;
    }

And after cross-checking that with RFCs 3454 (Stringprep) and 4013
(SASLprep), I got reminded of the fact that this implementation
artifact is wrong because not all ASCII characters are allowed:
- 0x00~0x1F (0~31), control characters, are prohibited.
- 0x7F (127, DEL) is prohibited.

The rest of the ASCII character range is OK.

Another question one may ask is: does making our SCRAM implementation
compliant impact our SCRAM implementation at all?  The answer to this
question is no.  If we are dealing with an ASCII-only password with
prohibited characters, our calls of pg_saslprep() deal with the SCRAM
verifiers generated by CREATE/ALTER role and the SASLprep() calls done
during an exchange the same way: even if we have prohibited ASCII
characters, the bytes are fed as-is to the scram build code.  All our
callers of pg_saslprep() make sure that the same thing happens.

In short, I see no downside in just making our implementation
compliant, which should be actually beneficial for future callers of
this routine, should we have any.  One point can be made for the
efficiency of checking ASCII-only passwords, but the default count of
4096 used for the computation of the SCRAM verifiers outweights that
point by far IMO: the SCRAM computation is more expensive than this
ASCII-only shortcut anyway.

Attached are two patches, that I'd like to propose for this commit
fest:
- 0001 is a test suite that I have been relying on for some time,
introduced as the test module test_saslprep.  One artifact that Heikki
has mentioned to me offline while discussing this tool is that we
could also have a check for the entire range of valid UTF8 codepoints
to make sure that we never return an empty password for all these
codepoints.  This check is slightly expensive (3s on my laptop, which
is not bad still a bit expensive), so I have implemented that as a TAP
test controlled by a PG_TEST_EXTRA.  The only exception for the empty
password case is the nul character, that we disallow in CREATE/ALTER
ROLE.  This test suite also adds a test to cover 390b3cbbb2af with an
incomplete UTF8 sequence, as a nice bonus.
- 0002 is the change to make the implementation compliant, impacting
the tests.  This removes nul from the list of valid cases, and the SQL
tests show the compliant behavior.

Even if we don't do 0002, 0001 shows benefits of its own.

I am adding that to the upcoming CF.
Thanks,
--
Michael

Вложения

Re: Non-compliant SASLprep implementation for ASCII characters

От
Michael Paquier
Дата:
On Fri, Feb 27, 2026 at 12:05:28PM +0900, Michael Paquier wrote:
> - 0001 is a test suite that I have been relying on for some time,
> introduced as the test module test_saslprep.  One artifact that Heikki
> has mentioned to me offline while discussing this tool is that we
> could also have a check for the entire range of valid UTF8 codepoints
> to make sure that we never return an empty password for all these
> codepoints.  This check is slightly expensive (3s on my laptop, which
> is not bad still a bit expensive), so I have implemented that as a TAP
> test controlled by a PG_TEST_EXTRA.  The only exception for the empty
> password case is the nul character, that we disallow in CREATE/ALTER
> ROLE.  This test suite also adds a test to cover 390b3cbbb2af with an
> incomplete UTF8 sequence, as a nice bonus.

While thinking more about this one, I have come up with a smarter
query based on set_byte() to build a full range of byteas for the
ASCII characters to check, leading to this simpler pattern:
SELECT set_byte('\x00'::bytea, 0, a) FROM generate_series(0, 127);

A second thing that I have adjusted is the output for non-printable
characters, using a CASE/WHEN shortcut.  Attached is an updated
version of the patch set with these adjustments.
--
Michael

Вложения