Re: Faster StrNCpy

From mark@mark.mielke.cc
Subject Re: Faster StrNCpy
Date
Msg-id 20060929212331.GB30048@mark.mielke.cc
In reply to Re: Faster StrNCpy  (mark@mark.mielke.cc)
Responses Re: Faster StrNCpy  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
If anybody is curious, here are my numbers for an AMD X2 3800+:

$ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected to be slow."' -o x x.c y.c strlcpy.c ; ./x
NONE:        620268 us
MEMCPY:      683135 us
STRNCPY:    7952930 us
STRLCPY:   10042364 us

$ gcc -O3 -std=c99 -DSTRING='"Short sentence."' -o x x.c y.c strlcpy.c ; ./x
NONE:        554694 us
MEMCPY:      691390 us
STRNCPY:    7759933 us
STRLCPY:    3710627 us

$ gcc -O3 -std=c99 -DSTRING='""' -o x x.c y.c strlcpy.c ; ./x
NONE:        631266 us
MEMCPY:      775340 us
STRNCPY:    7789267 us
STRLCPY:     550430 us

Each invocation represents 100 million calls to each of the functions.
Each function accepts a 'dst' and 'src' argument, and assumes that it
is copying 64 bytes from 'src' to 'dst'. The none function does
nothing. The memcpy function calls memcpy(), the strncpy function
calls strncpy(), and the strlcpy function calls the strlcpy() that was
posted from the BSD sources. (GLIBC doesn't have strlcpy() on my
machine.)
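
A minimal sketch of a harness along these lines, for anyone who wants
to reproduce the measurement. It is not the x.c/y.c from this post,
which are not shown here. The names copy_fn, copy_none, copy_memcpy,
copy_strncpy and time_calls_us are hypothetical, and the volatile
function pointer is an assumed single-file substitute for keeping the
copy functions in a separate source file so the calls are not inlined.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

#define COPY_SIZE 64  /* every variant assumes 64-byte buffers */

typedef void (*copy_fn)(char *dst, const char *src);

/* Baseline: measures only call and loop overhead. */
static void copy_none(char *dst, const char *src)
{
    (void)dst;
    (void)src;
}

/* Always copies exactly COPY_SIZE bytes. */
static void copy_memcpy(char *dst, const char *src)
{
    memcpy(dst, src, COPY_SIZE);
}

/* Copies up to the first NUL, then zero-fills the rest of dst. */
static void copy_strncpy(char *dst, const char *src)
{
    strncpy(dst, src, COPY_SIZE);
}

/* Calls fn `calls` times on the same buffers and returns the elapsed
 * CPU time in microseconds, as reported by clock(). */
static double time_calls_us(copy_fn fn, long calls, const char *text)
{
    char src[COPY_SIZE] = {0};
    char dst[COPY_SIZE];
    /* Assumption: reading the pointer through a volatile object on
     * every iteration keeps the compiler from inlining the call. */
    volatile copy_fn call = fn;

    strncpy(src, text, COPY_SIZE - 1);  /* src stays NUL-terminated */

    clock_t start = clock();
    for (long i = 0; i < calls; i++)
        call(dst, src);
    clock_t end = clock();

    return (double)(end - start) * 1e6 / CLOCKS_PER_SEC;
}
```

Calling time_calls_us(copy_memcpy, 100000000L, "Short sentence.") from
main() would give a figure comparable in kind to the MEMCPY rows above.
Absolute numbers will differ by CPU, compiler and flags.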

This makes the overhead of the additional logic clear. memcpy() costs
almost nothing beyond the empty call. strncpy() is always expensive.
strlcpy() is more expensive than memcpy() for both non-empty strings,
and cheaper only in the empty string case.
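
One likely reason for this pattern is the difference in semantics:
strncpy() always writes all n bytes, zero-filling whatever follows the
copied string, while strlcpy() writes only the string and one
terminator, so its cost follows the source length. The sketch below
illustrates the semantics only. sketch_strlcpy is a hypothetical
stand-in, not the BSD source used in the test, and it makes no claim
about that implementation's speed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy()-style semantics: copy at most
 * size - 1 bytes, always NUL-terminate when size > 0, and return the
 * full length of src so the caller can detect truncation. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);  /* the whole source is always scanned */

    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';         /* exactly one terminator, no padding */
    }
    return len;
}

/* Counts the zero bytes in buf[0 .. size-1]. */
static size_t count_zero_bytes(const char *buf, size_t size)
{
    size_t zeros = 0;

    for (size_t i = 0; i < size; i++)
        if (buf[i] == '\0')
            zeros++;
    return zeros;
}
```

With an 8-byte buffer pre-filled with 'X', strncpy(buf, "abc", 8)
leaves five zero bytes after "abc", while sketch_strlcpy(buf, "abc", 8)
writes "abc" plus a single NUL and leaves the last four bytes as 'X'.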

These tests do not properly model the effects of real memory; however,
they do model the effects of cache memory. I would suggest that the
results are exaggerated, but not invalid.

For anybody doubting the none vs. memcpy numbers, I've included the
generated assembly code. I chalk the small difference up entirely to
the CPU fully utilizing its parallelization capability. Although 16
movq instructions are executed, they can be executed fully in parallel.

If anything, this suggests to me that all of these operations are
pretty fast. Are we sure this is a real bottleneck? Even the slowest
operation above, strlcpy() on a very long string, appears to execute
about 10 calls per microsecond. Perhaps my tests are too easy for my
CPU and I need to make it access many different 64-byte blocks? :-)
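
The per-call figures behind that estimate can be worked out from the
tables above. A small sketch of the arithmetic follows. The helper
names ns_per_call, calls_per_us and report are hypothetical, and the
inputs are the elapsed times and the 100 million call count quoted in
this post.

```c
#include <stdio.h>

/* Average cost of one call in nanoseconds. */
static double ns_per_call(double elapsed_us, double calls)
{
    return elapsed_us * 1000.0 / calls;
}

/* Number of calls completed per microsecond. */
static double calls_per_us(double elapsed_us, double calls)
{
    return calls / elapsed_us;
}

/* Prints one row in both units for a label from the tables above. */
static void report(const char *label, double elapsed_us, double calls)
{
    printf("%-8s %8.2f ns/call %8.2f calls/us\n", label,
           ns_per_call(elapsed_us, calls),
           calls_per_us(elapsed_us, calls));
}
```

For the slowest row, report("STRLCPY", 10042364.0, 1e8) works out to
roughly 100 ns per call, or just under 10 calls per microsecond. That
figure still includes the call and loop overhead measured by the NONE
row.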

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   |
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/

