Discussion: Error in calculating length of encoded base64 string


Error in calculating length of encoded base64 string

From
o.tselebrovskiy@postgrespro.ru
Date:
Greetings, everyone!

While working on an extension I found an error in how the length of an
encoded base64 string is calculated.

This error is present in 3 files across all supported versions:

/src/common/base64.c, function pg_b64_enc_len;
/src/backend/utils/adt/encode.c, function pg_base64_enc_len;
/contrib/pgcrypto/pgp-armor.c, function pg_base64_enc_len (copied from 
encode.c).

In all three cases the length is calculated as follows:

(srclen + 2) * 4 / 3; (plus linefeed in latter two cases)

There's also a comment /* 3 bytes will be converted to 4 */

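This formula is wrong: it multiplies by 4 before the integer division by 3,
so the result is not rounded up to a whole number of 4-character groups.
Let's calculate the encoded length for different starting lengths: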

starting length 2: (2 + 2) * 4 / 3 = 5,
starting length 3: (3 + 2) * 4 / 3 = 6,
starting length 4: (4 + 2) * 4 / 3 = 8,
starting length 6: (6 + 2) * 4 / 3 = 10,
starting length 10: (10 + 2) * 4 / 3 = 16,

when it should be 4, 4, 8, 8, 16.

So the suggestion is to change the formula to the correct one:
(srclen + 2) / 3 * 4;

The patch is attached.
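
The arithmetic above can be re-checked with a minimal sketch, assuming C's truncating integer division. The function names b64_enc_len_old, b64_enc_len_new and check_reported_cases are illustrative only; they are not the names used in the PostgreSQL tree, and the linefeed term of the latter two files is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Formula as quoted in the report: multiply first, then truncate. */
size_t
b64_enc_len_old(size_t srclen)
{
	return (srclen + 2) * 4 / 3;
}

/* Proposed formula: round up to whole 3-byte groups, 4 characters each. */
size_t
b64_enc_len_new(size_t srclen)
{
	return (srclen + 2) / 3 * 4;
}

/* Re-checks the five starting lengths listed in the message. */
void
check_reported_cases(void)
{
	static const size_t srclen[] = {2, 3, 4, 6, 10};
	static const size_t old_len[] = {5, 6, 8, 10, 16};
	static const size_t new_len[] = {4, 4, 8, 8, 16};

	for (size_t i = 0; i < sizeof(srclen) / sizeof(srclen[0]); i++)
	{
		assert(b64_enc_len_old(srclen[i]) == old_len[i]);
		assert(b64_enc_len_new(srclen[i]) == new_len[i]);
	}
}
```

Calling check_reported_cases() from a test driver passes silently when both formulas produce the values listed above.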

Oleg Tselebrovskiy, Postgres Pro
Attachments

Re: Error in calculating length of encoded base64 string

From
Tom Lane
Date:
o.tselebrovskiy@postgrespro.ru writes:
> While working on an extension I've found an error in how length of 
> encoded base64 string is calculated;

Yeah, I think you're right.  It's not of huge significance, because
it just overestimates by 1 or 2 bytes, but we might as well get
it right.  Thanks for the report and patch!

            regards, tom lane



Re: Error in calculating length of encoded base64 string

From
Gurjeet Singh
Date:
On Thu, Jun 8, 2023 at 7:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> o.tselebrovskiy@postgrespro.ru writes:
> > While working on an extension I've found an error in how length of
> > encoded base64 string is calculated;
>
> Yeah, I think you're right.  It's not of huge significance, because
> it just overestimates by 1 or 2 bytes, but we might as well get
> it right.  Thanks for the report and patch!

From your commit d98ed080bb

>    This bug is very ancient, dating to commit 79d78bb26 which
>    added encode.c.  (The other instances were presumably copied
>    from there.)  Still, it doesn't quite seem worth back-patching.

Is it worth investing time in trying to unify these 3 occurrences of
base64 length (and possibly other relevant) code to one place? If yes,
I can volunteer for it.

The common code facility under src/common/ did not exist back when
pgcrypto was added, but since it does now, it may be worth making the
others depend on the implementation in src/common/.

Best regards,
Gurjeet
http://Gurje.et



Re: Error in calculating length of encoded base64 string

From
Tom Lane
Date:
Gurjeet Singh <gurjeet@singh.im> writes:
> On Thu, Jun 8, 2023 at 7:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This bug is very ancient, dating to commit 79d78bb26 which
>> added encode.c.  (The other instances were presumably copied
>> from there.)  Still, it doesn't quite seem worth back-patching.

> Is it worth investing time in trying to unify these 3 occurrences of
> base64 length (and possibly other relevant) code to one place? If yes,
> I can volunteer for it.

I wondered about that too.  It seems really silly that we made
a copy in src/common and did not replace the others with calls
to that.

            regards, tom lane



Re: Error in calculating length of encoded base64 string

From
Dagfinn Ilmari Mannsåker
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Gurjeet Singh <gurjeet@singh.im> writes:
>> On Thu, Jun 8, 2023 at 7:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> This bug is very ancient, dating to commit 79d78bb26 which
>>> added encode.c.  (The other instances were presumably copied
>>> from there.)  Still, it doesn't quite seem worth back-patching.
>
>> Is it worth investing time in trying to unify these 3 occurrences of
>> base64 length (and possibly other relevant) code to one place? If yes,
>> I can volunteer for it.
>
> I wondered about that too.  It seems really silly that we made
> a copy in src/common and did not replace the others with calls
> to that.

Also, while we're at it, how about some unit tests that both encode and
calculate the encoded length of strings of various lengths and check
that they match?
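
A rough sketch of what such a test could look like, assuming an encoder that emits no line breaks. The demo_* functions are hypothetical stand-ins written for this example; they are not the PostgreSQL routines and do not use their signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char demo_alphabet[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Length formula under test (the corrected one from this thread). */
size_t
demo_b64_enc_len(size_t srclen)
{
	return (srclen + 2) / 3 * 4;
}

/*
 * Encodes src into dst without line breaks and returns the number of
 * characters written.  dst must hold demo_b64_enc_len(srclen) bytes.
 */
size_t
demo_b64_encode(const unsigned char *src, size_t srclen, char *dst)
{
	char	   *p = dst;
	size_t		i = 0;

	/* Full 3-byte groups become 4 characters each. */
	for (; i + 3 <= srclen; i += 3)
	{
		unsigned	v = ((unsigned) src[i] << 16) |
			((unsigned) src[i + 1] << 8) | src[i + 2];

		*p++ = demo_alphabet[(v >> 18) & 0x3f];
		*p++ = demo_alphabet[(v >> 12) & 0x3f];
		*p++ = demo_alphabet[(v >> 6) & 0x3f];
		*p++ = demo_alphabet[v & 0x3f];
	}

	/* A trailing 1 or 2 bytes still produce 4 characters, padded with '='. */
	if (i < srclen)
	{
		int			two = (i + 1 < srclen);
		unsigned	v = (unsigned) src[i] << 16;

		if (two)
			v |= (unsigned) src[i + 1] << 8;
		*p++ = demo_alphabet[(v >> 18) & 0x3f];
		*p++ = demo_alphabet[(v >> 12) & 0x3f];
		*p++ = two ? demo_alphabet[(v >> 6) & 0x3f] : '=';
		*p++ = '=';
	}
	return (size_t) (p - dst);
}

/* Encodes inputs of every length from 0 to 64 and compares the lengths. */
void
demo_check_lengths(void)
{
	unsigned char src[64];
	char		dst[128];

	memset(src, 'x', sizeof(src));
	for (size_t n = 0; n <= sizeof(src); n++)
		assert(demo_b64_encode(src, n, dst) == demo_b64_enc_len(n));
}
```

With the old formula substituted into demo_b64_enc_len, the assertion in demo_check_lengths would fail at the first length that is not a multiple of 3 above 1, which is the kind of mismatch such a test is meant to catch.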

- ilmari



Re: Error in calculating length of encoded base64 string

From
Alvaro Herrera
Date:
On 2023-Jun-09, Tom Lane wrote:

> Gurjeet Singh <gurjeet@singh.im> writes:
> > On Thu, Jun 8, 2023 at 7:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> This bug is very ancient, dating to commit 79d78bb26 which
> >> added encode.c.  (The other instances were presumably copied
> >> from there.)  Still, it doesn't quite seem worth back-patching.
> 
> > Is it worth investing time in trying to unify these 3 occurrences of
> > base64 length (and possibly other relevant) code to one place? If yes,
> > I can volunteer for it.
> 
> I wondered about that too.  It seems really silly that we made
> a copy in src/common and did not replace the others with calls
> to that.

I looked into this.  It turns out that there is a difference in newline
handling in the other routines compared to what was added for SCRAM,
which doesn't have any (and complains if you supply them).  Peter E
did suggest unifying them at the time:
https://www.postgresql.org/message-id/947b9aff-8fdb-dbf5-a99c-0ffd4523a73f%402ndquadrant.com

We could add a boolean "whitespace" flag to both of
src/common/base64.c's pg_b64_encode() and pg_b64_decode(); with that I
think it could serve the three places that need it.
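
To make the idea concrete for the length calculation only, here is a hypothetical sketch. The function sketch_b64_enc_len and its "whitespace" parameter are invented for illustration and do not exist in PostgreSQL, and the assumption of one linefeed per 76 output characters (57 input bytes) is mine, not something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical unified length estimate.  With whitespace = false it is
 * the plain corrected formula; with whitespace = true it adds room for
 * linefeeds, assuming one per 76 output characters (57 input bytes).
 */
size_t
sketch_b64_enc_len(size_t srclen, bool whitespace)
{
	size_t		len = (srclen + 2) / 3 * 4;

	if (whitespace)
		len += srclen / 57;
	return len;
}

/* A few spot checks of both modes. */
void
sketch_check(void)
{
	assert(sketch_b64_enc_len(3, false) == 4);
	assert(sketch_b64_enc_len(3, true) == 4);
	assert(sketch_b64_enc_len(57, false) == 76);
	assert(sketch_b64_enc_len(57, true) == 77);
}
```

A real unification would also have to settle how the decoder treats embedded whitespace, which this length-only sketch does not address.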

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/