Thread: [PATCH] Use strchr() to search for a single character

[PATCH] Use strchr() to search for a single character

From:
Dmitry Mityugov
Date:
Code in src/port/pgmkdirp.c uses strstr() to find a single character in 
a string, but strstr() seems to be too generic for this job. Another 
function, strchr(), might be better suited for this purpose, because it 
is optimized to search for exactly one character in a string. In 
addition, if strchr() is used, the compiler doesn't have to generate a 
terminating \0 byte for the substring, producing slightly smaller code. 
I'm attaching the patch.

Regards,
Dmitry
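
A minimal sketch of the substitution described above. It is not the code from src/port/pgmkdirp.c; the function names and the ':' needle are hypothetical and chosen only for illustration. The strstr() needle ":" is a two-byte string literal (the character plus its terminating \0), while the strchr() needle is a single character constant.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Substring search: the needle ":" is stored as ':' followed by '\0'. */
static const char *
find_colon_strstr(const char *s)
{
	return strstr(s, ":");
}

/* Character search: the needle is a plain character constant. */
static const char *
find_colon_strchr(const char *s)
{
	return strchr(s, ':');
}

/* For a non-NUL needle, both calls return the same pointer (or NULL). */
static void
check_same_result(const char *s)
{
	assert(find_colon_strstr(s) == find_colon_strchr(s));
}
```

One edge case differs: strchr(s, '\0') returns a pointer to the terminator, whereas strstr(s, "") returns s itself, so the two are interchangeable only when the needle is not the NUL character.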

Re: [PATCH] Use strchr() to search for a single character

From:
Corey Huinker
Date:


On Sun, Jul 20, 2025 at 6:21 PM Dmitry Mityugov <d.mityugov@postgrespro.ru> wrote:
> Code in src/port/pgmkdirp.c uses strstr() to find a single character in
> a string, but strstr() seems to be too generic for this job. Another
> function, strchr(), might be better suited for this purpose, because it
> is optimized to search for exactly one character in a string. In
> addition, if strchr() is used, the compiler doesn't have to generate a
> terminating \0 byte for the substring, producing slightly smaller code.
> I'm attaching the patch.
>
> Regards,
> Dmitry

Seems like a simple-enough change, not a huge win but probably worth doing.

Using ripgrep to search for 'strstr(.*".")' turns up two similar situations in contrib/fuzzystrmatch/dmetaphone.c, so perhaps we include those.

There's also a match in src/bin/pg_rewind/filemap.c, but that one is a false positive.
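
A rough illustration of why a pattern like this can produce false positives. The unescaped parentheses form a regex group rather than matching literal parentheses, and each `.` matches any character, so the pattern means "strstr, then anything, then a quoted single character somewhere later on the line". The sketch below uses POSIX extended regexes from <regex.h> as a stand-in for ripgrep's regex engine (an assumption; the two agree for this particular pattern). The helper names and sample lines are hypothetical, not taken from the PostgreSQL tree.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the POSIX extended regex `pattern` matches anywhere in `line`. */
static bool
line_matches(const char *pattern, const char *line)
{
	regex_t		re;
	bool		matched;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	matched = (regexec(&re, line, 0, NULL, 0) == 0);
	regfree(&re);
	return matched;
}

/* Made-up sample lines: a true hit, a miss, and a false positive. */
static void
check_pattern_examples(void)
{
	const char *pattern = "strstr(.*\".\")";

	/* single-character needle: the intended kind of hit */
	assert(line_matches(pattern, "if (strstr(s->str, \"W\"))"));
	/* multi-character needle and no other quoted single character: no hit */
	assert(!line_matches(pattern, "if (strstr(path, \"abc\"))"));
	/* multi-character needle, but a later "y" on the same line still matches */
	assert(line_matches(pattern, "c = strstr(a, \"ab\") ? \"y\" : \"n\";"));
}
```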

Re: [PATCH] Use strchr() to search for a single character

From:
Dmitry Mityugov
Date:
Corey Huinker wrote on 2025-07-22 22:42:
> On Sun, Jul 20, 2025 at 6:21 PM Dmitry Mityugov
> <d.mityugov@postgrespro.ru> wrote:
> 
>> Code in src/port/pgmkdirp.c uses strstr() to find a single character in
>> a string, but strstr() seems to be too generic for this job. Another
>> function, strchr(), might be better suited for this purpose, because it
>> is optimized to search for exactly one character in a string. In
>> addition, if strchr() is used, the compiler doesn't have to generate a
>> terminating \0 byte for the substring, producing slightly smaller code.
>> I'm attaching the patch.
>> 
>> Regards,
>> Dmitry
> 
> Seems like a simple-enough change, not a huge win but probably worth
> doing.
> 
> Using ripgrep to search for 'strstr(.*".")' turns up two similar
> situations in contrib/fuzzystrmatch/dmetaphone.c, so perhaps we
> include those.
> 
> There's also a match in src/bin/pg_rewind/filemap.c, but that one is a
> false positive.

Thank you for your attention to this problem. The code in 
contrib/fuzzystrmatch/dmetaphone.c indeed uses several calls to strstr() 
to search for a single character, but it also uses strstr() to search 
for strings that consist of more than a single character on adjacent 
lines, and replacing half of those strstr()s with strchr()s would make 
the code less consistent in my opinion.

More importantly, this code in contrib/fuzzystrmatch/dmetaphone.c
appears to contain a bug. The branch `else if (strstr(s->str, "WITZ"))`
at line 317 can never be taken: any string that contains "WITZ" also
contains "W", so it is already handled at line 311 by
`if (strstr(s->str, "W"))`. This should probably be fixed in a separate
commit.
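
The shadowing described above can be sketched as follows. This is a simplified, hypothetical stand-in, not the actual dmetaphone.c code: the function name and return values are invented, and only the order of the two tests mirrors the description.

```c
#include <assert.h>
#include <string.h>

/* Reports which branch of the if/else-if chain claims the string. */
static const char *
first_matching_branch(const char *str)
{
	if (strstr(str, "W"))
		return "W";
	else if (strstr(str, "WITZ"))
		return "WITZ";		/* unreachable: "WITZ" always contains "W" */
	return "none";
}

/* Any string containing "WITZ" is claimed by the earlier "W" test. */
static void
check_witz_is_shadowed(void)
{
	assert(strcmp(first_matching_branch("HOROWITZ"), "W") == 0);
}
```

Testing the longer needle first would make the second branch reachable; whether that is the right fix depends on what the original algorithm intended.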



Re: [PATCH] Use strchr() to search for a single character

From:
David Rowley
Date:
On Wed, 23 Jul 2025 at 09:34, Dmitry Mityugov <d.mityugov@postgrespro.ru> wrote:
> Thank you for your attention to this problem. The code in
> contrib/fuzzystrmatch/dmetaphone.c indeed uses several calls to strstr()
> to search for a single character, but it also uses strstr() to search
> for strings that consist of more than a single character on adjacent
> lines, and replacing half of those strstr()s with strchr()s would make
> the code less consistent in my opinion.

That depends on what you're making consistent. If the consistency is
that we always use strchr() when the search is for a single char, then
it's not consistent to ignore that one.

Looking at [1], it seems even ancient versions of gcc and clang
rewrite the strstr() into a strchr() call when the search term is a
single char string. So it might not be worth going to any trouble
here.

[1] https://godbolt.org/z/q1xcKdzd7

David



Re: [PATCH] Use strchr() to search for a single character

From:
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
> Looking at [1], it seems even ancient versions of gcc and clang
> rewrite the strstr() into a strchr() call when the search term is a
> single char string. So it might not be worth going to any trouble
> here.

I was wondering if that might be true.  However, your godbolt results
show that MSVC doesn't do this optimization, and the usage in
pgmkdirp.c is inside "#ifdef WIN32", so maybe it's worth fixing there.

I can't get excited about dmetaphone --- that's about as old and
crufty as anything in our tree.  If we were going to touch it the
first thing I'd want to do is get rid of its non-multibyte-safe
upcasing logic (MakeUpper).  The "WITZ" business is kind of
silly-looking, but the question it brings to my mind is whether
the algorithm was mistranscribed; so rather than just delete that
line we should do some research.  If we care, that is.

            regards, tom lane



Re: [PATCH] Use strchr() to search for a single character

From:
David Rowley
Date:
On Wed, 23 Jul 2025 at 10:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > Looking at [1], it seems even ancient versions of gcc and clang
> > rewrite the strstr() into a strchr() call when the search term is a
> > single char string. So it might not be worth going to any trouble
> > here.
>
> I was wondering if that might be true.  However, your godbolt results
> show that MSVC doesn't do this optimization, and the usage in
> pgmkdirp.c is inside "#ifdef WIN32", so maybe it's worth fixing there.

Yeah, I noticed MSVC not doing the rewrite. I didn't notice the
mentioned use case was within an #ifdef WIN32.

I'm currently thinking we should just fix the pgmkdirp.c instance and
call it good.

David
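
For readers following along, a hedged sketch of what a fix of this kind might look like. It is not the committed change: the helper name and the "//host/share" path convention are assumptions made for illustration, and the "#ifdef WIN32" guard mentioned in the thread is omitted so that the sketch compiles on any platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: for a path starting with "//" (a network share
 * prefix), return a pointer to the first '/' after the host name, or NULL
 * if there is none. Paths without that prefix are returned unchanged.
 */
static const char *
skip_network_prefix(const char *path)
{
	if (path[0] == '/' && path[1] == '/')
		return strchr(path + 2, '/');	/* strstr() form: strstr(path + 2, "/") */
	return path;
}

/* Small usage check on a made-up share path. */
static void
check_skip_network_prefix(void)
{
	const char *share = "//host/share";

	assert(skip_network_prefix(share) == share + 6);
}
```

Writing strchr() explicitly avoids relying on the compiler to perform the rewrite, which matters for the one compiler in the comparison that did not do it.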



Re: [PATCH] Use strchr() to search for a single character

From:
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
> I'm currently thinking we should just fix the pgmkdirp.c instance and
> call it good.

+1

            regards, tom lane



Re: [PATCH] Use strchr() to search for a single character

From:
David Rowley
Date:
On Wed, 23 Jul 2025 at 11:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > I'm currently thinking we should just fix the pgmkdirp.c instance and
> > call it good.
>
> +1

Done.

David