Re: use SSE2 for is_valid_ascii
From | Nathan Bossart |
---|---|
Subject | Re: use SSE2 for is_valid_ascii |
Date | |
Msg-id | 20220810223120.GA1553157@nathanxps13 |
In response to | use SSE2 for is_valid_ascii (John Naylor <john.naylor@enterprisedb.com>) |
Responses | Re: use SSE2 for is_valid_ascii |
List | pgsql-hackers |
On Wed, Aug 10, 2022 at 01:50:14PM +0700, John Naylor wrote:
> Here is an updated patch using the new USE_SSE2 symbol. The style is
> different from the last one in that each stanza has platform-specific
> code. I wanted to try it this way because is_valid_ascii() is already
> written in SIMD-ish style using general purpose registers and bit
> twiddling, so it seemed natural to see the two side-by-side. Sometimes
> they can share the same comment. If we think this is bad for
> readability, I can go back to one block each, but that way leads to
> duplication of code and it's difficult to see what's different for
> each platform, IMO.

This is a neat patch. I don't know that we need an entirely separate
code block for the USE_SSE2 path, but I do think that a little bit of
extra commentary would improve the readability. IMO the existing
comment for the zero accumulator has the right amount of detail.

+        /*
+         * Set all bits in each lane of the error accumulator where input
+         * bytes are zero.
+         */
+        error_cum = _mm_or_si128(error_cum,
+                                 _mm_cmpeq_epi8(chunk, _mm_setzero_si128()));

I wonder if reusing a zero vector (instead of creating a new one every
time) has any noticeable effect on performance.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
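
For illustration, the closing question about reusing a zero vector could look roughly like the sketch below. It is not code from the patch: the function name `chunks_are_valid_ascii`, the `highbit_cum` accumulator, and the requirement that the length be a multiple of 16 are assumptions made to keep the example self-contained. Only `error_cum`, `chunk`, and the intrinsics in the zero-byte check come from the quoted hunk. An optimizing compiler may well hoist the zero vector on its own, so any difference would need to be measured rather than assumed.

```c
#include <assert.h>
#include <emmintrin.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-alone helper (not the function from the patch).
 * Returns true if the input contains no zero bytes and no bytes with
 * the high bit set.  Assumes len is a multiple of 16.
 */
static bool
chunks_are_valid_ascii(const unsigned char *s, size_t len)
{
    /* Created once and reused, instead of once per loop iteration. */
    const __m128i zero = _mm_setzero_si128();
    __m128i     error_cum = _mm_setzero_si128();
    __m128i     highbit_cum = _mm_setzero_si128();

    assert(len % 16 == 0);

    for (size_t i = 0; i < len; i += 16)
    {
        __m128i     chunk = _mm_loadu_si128((const __m128i *) (s + i));

        /*
         * Set all bits in each lane of the error accumulator where input
         * bytes are zero.
         */
        error_cum = _mm_or_si128(error_cum, _mm_cmpeq_epi8(chunk, zero));

        /* Accumulate every input bit so that high bits can be checked later. */
        highbit_cum = _mm_or_si128(highbit_cum, chunk);
    }

    /* _mm_movemask_epi8 gathers the high bit of each of the 16 bytes. */
    if (_mm_movemask_epi8(error_cum) != 0)
        return false;           /* at least one zero byte was seen */
    if (_mm_movemask_epi8(highbit_cum) != 0)
        return false;           /* at least one non-ASCII byte was seen */

    return true;
}
```

Comparing this against a variant that calls `_mm_setzero_si128()` inside the loop, both in the generated assembly and in a benchmark, would be one way to answer the question.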