speed up verifying UTF-8

From: John Naylor
Subject: speed up verifying UTF-8
Msg-id: CAFBsxsHii1-wbwN7vEbpzK03VJJL=EXegJSz6RSXbXZeaUB2jA@mail.gmail.com
In reply to: Re: [POC] verifying UTF-8 using SIMD instructions  (John Naylor <john.naylor@enterprisedb.com>)
Responses: Re: speed up verifying UTF-8  (Heikki Linnakangas <hlinnaka@iki.fi>)
List: pgsql-hackers
For v10, I've split the patch up into two parts. 0001 uses pure C everywhere. This is much smaller and easier to review, and gets us the most bang for the buck. 

One concern Heikki raised upthread is that platforms with poor unaligned-memory access will see a regression. We could easily add an #ifdef to take care of that, but I haven't done so here.

To recap: On ASCII-only input with storage taken out of the picture, profiles of COPY FROM show UTF-8 verification dropping from nearly 10% down to just over 1%. In microbenchmarks found earlier in this thread, this works out to about 7 times faster. On multibyte/mixed input, 0001 is a bit faster, but not really enough to make a difference in COPY performance.
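
For readers skimming the thread, here is a minimal sketch of the general idea behind a pure-C ASCII fast path: test eight bytes at a time for a set high bit, and fall back to a byte loop for the tail. It is not taken from the patch. The function names and the SKETCH_SLOW_UNALIGNED_ACCESS macro are hypothetical, the latter standing in for the kind of #ifdef mentioned above for platforms with poor unaligned access.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical macro, not from the patch: define it on platforms where
 * unaligned loads are slow, to use only the byte-at-a-time loop.
 */
/* #define SKETCH_SLOW_UNALIGNED_ACCESS */

/* Return the length of the leading run of ASCII bytes in s[0..len). */
static size_t
ascii_prefix_len(const unsigned char *s, size_t len)
{
	size_t		i = 0;

#ifndef SKETCH_SLOW_UNALIGNED_ACCESS
	/* Examine 8 bytes per iteration; memcpy avoids a misaligned pointer cast. */
	while (len - i >= sizeof(uint64_t))
	{
		uint64_t	chunk;

		memcpy(&chunk, s + i, sizeof(chunk));

		/* Any byte with its high bit set is non-ASCII; stop at this chunk. */
		if (chunk & UINT64_C(0x8080808080808080))
			break;
		i += sizeof(chunk);
	}
#endif

	/* Finish the tail (or the whole input) one byte at a time. */
	while (i < len && s[i] < 0x80)
		i++;

	return i;
}

/* True if every byte in s[0..len) is ASCII. */
static bool
is_all_ascii(const unsigned char *s, size_t len)
{
	return ascii_prefix_len(s, len) == len;
}
```

In a sketch like this, a caller would hand whatever follows the returned offset to the regular multibyte verifier, which is why purely ASCII input would gain the most and mixed input much less.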

0002 adds the SSE4 implementation on x86-64, and is equally fast on all input, at the cost of greater complexity.

To reflect the split, I've changed the thread subject and the commitfest title.
--