Re: Small patch to improve safety of utf8_to_unicode().
| From | Chao Li |
|---|---|
| Subject | Re: Small patch to improve safety of utf8_to_unicode(). |
| Date | |
| Msg-id | 541F240E-94AD-4D65-9794-7D6C316BC3FF@gmail.com |
| In response to | Small patch to improve safety of utf8_to_unicode(). (Jeff Davis <pgsql@j-davis.com>) |
| List | pgsql-hackers |
> On Dec 13, 2025, at 07:24, Jeff Davis <pgsql@j-davis.com> wrote:
>
> Attached.
>
>
> <v1-0001-Make-utf8_to_unicode-safer.patch>
This patch adds a length check to utf8_to_unicode(), which I think is where the “safety” comes from. Could you please add
a little more detail to the commit message instead of only saying “improve safety”? It also deletes two redundant function
declarations from pg_wchar.h, which may also be worth a quick note in the commit message.
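To make the idea of a length check concrete, here is a minimal sketch of what a length-aware UTF-8 decoder could look like. It is not the code from the attached patch: the function name `sketch_utf8_to_unicode`, the sentinel `SKETCH_INVALID_CODEPOINT`, and the choice to return a sentinel on truncated input are all assumptions made for illustration. It only checks the lead byte and the number of available bytes; it does not validate continuation bytes or reject overlong encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel for "could not decode"; not a PostgreSQL name. */
#define SKETCH_INVALID_CODEPOINT ((uint32_t) 0xFFFFFFFF)

/*
 * Decode one UTF-8 sequence starting at s, reading at most len bytes.
 * Returns the code point, or SKETCH_INVALID_CODEPOINT if the lead byte is
 * not a valid UTF-8 lead byte or the sequence needs more than len bytes.
 */
static uint32_t
sketch_utf8_to_unicode(const unsigned char *s, size_t len)
{
    /* 1-byte sequence: 0xxxxxxx */
    if (len >= 1 && (s[0] & 0x80) == 0)
        return (uint32_t) s[0];

    /* 2-byte sequence: 110xxxxx 10xxxxxx */
    if (len >= 2 && (s[0] & 0xe0) == 0xc0)
        return ((uint32_t) (s[0] & 0x1f) << 6) |
               (uint32_t) (s[1] & 0x3f);

    /* 3-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx */
    if (len >= 3 && (s[0] & 0xf0) == 0xe0)
        return ((uint32_t) (s[0] & 0x0f) << 12) |
               ((uint32_t) (s[1] & 0x3f) << 6) |
               (uint32_t) (s[2] & 0x3f);

    /* 4-byte sequence: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
    if (len >= 4 && (s[0] & 0xf8) == 0xf0)
        return ((uint32_t) (s[0] & 0x07) << 18) |
               ((uint32_t) (s[1] & 0x3f) << 12) |
               ((uint32_t) (s[2] & 0x3f) << 6) |
               (uint32_t) (s[3] & 0x3f);

    /* Empty input, truncated sequence, or invalid lead byte. */
    return SKETCH_INVALID_CODEPOINT;
}
```

In this sketch, a caller that knows how many bytes remain passes that count and gets the sentinel back for a truncated sequence. A caller that has no length information and passes the maximum sequence length (4 bytes for UTF-8) effectively disables the length checks, so it is relying on the input being complete.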
The code changes all look good to me. My only nitpicks are:
1
```
diff --git a/contrib/fuzzystrmatch/daitch_mokotoff.c b/contrib/fuzzystrmatch/daitch_mokotoff.c
index 07f895ae2bf..47bd2814460 100644
--- a/contrib/fuzzystrmatch/daitch_mokotoff.c
+++ b/contrib/fuzzystrmatch/daitch_mokotoff.c
@@ -401,7 +401,8 @@ read_char(const unsigned char *str, int *ix)
/* Decode UTF-8 character to ISO 10646 code point. */
str += *ix;
- c = utf8_to_unicode(str);
+ /* Assume byte sequence has not been broken. */
+ c = utf8_to_unicode(str, MAX_MULTIBYTE_CHAR_LEN);
```
Here we need an empty line above the new comment.
2
```
diff --git a/src/common/wchar.c b/src/common/wchar.c
index a4bc29921de..c113cadf815 100644
--- a/src/common/wchar.c
+++ b/src/common/wchar.c
@@ -661,7 +661,8 @@ ucs_wcwidth(pg_wchar ucs)
static int
pg_utf_dsplen(const unsigned char *s)
{
- return ucs_wcwidth(utf8_to_unicode(s));
+ /* trust that input is not a truncated byte sequence */
+ return ucs_wcwidth(utf8_to_unicode(s, MAX_MULTIBYTE_CHAR_LEN));
}
```
For the new comment, as a code reader, I wonder why we “trust” that. To me, it feels more like we have to trust the input
because we lack length information. I would like this comment to be expanded with a bit more explanation.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/