Re: Optimizing COPY with SIMD


From	Neil Conway
Subject	Re: Optimizing COPY with SIMD
Date	June 7, 21:07:36
Msg-id	CAOW5sYaNuci8gNgEPuk0mx2QXi1rJBikmS=dNmR2jpf0K+4svg@mail.gmail.com
In reply to	Re: Optimizing COPY with SIMD (Nathan Bossart <nathandbossart@gmail.com>)
List	pgsql-hackers


On Wed, Jun 5, 2024 at 3:05 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

> For pg_lfind32(), we ended up using an overlapping approach for the
> vectorized case (see commit 7644a73). That appeared to help more than it
> harmed in the many (admittedly branch predictor friendly) tests I ran. I
> wonder if you could do something similar here.

I didn't entirely follow what you are suggesting here -- seems like we would need to do strlen() for the non-SIMD case if we tried to use a similar approach.
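
To make the trade-off concrete, here is a rough sketch of what an overlapping tail can look like for a linear search over an array whose length is known up front. It is not the code from commit 7644a73: `LANES`, `block_contains()` and `lfind32_overlap()` are hypothetical names, and the scalar `block_contains()` only stands in for a real vector compare. The point to notice is that the last block is placed using `nelem`, which is why a NUL-terminated string would need a strlen() before the same trick could be applied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4					/* assumed number of uint32 lanes per vector */

/* Stand-in for one vector compare over exactly LANES elements. */
static bool
block_contains(const uint32_t *p, uint32_t key)
{
	bool		found = false;

	for (int i = 0; i < LANES; i++)
		found |= (p[i] == key);
	return found;
}

/* Returns true if key occurs in base[0 .. nelem-1]. */
static bool
lfind32_overlap(uint32_t key, const uint32_t *base, size_t nelem)
{
	size_t		i = 0;

	/* Too short for even one full block: plain scalar loop. */
	if (nelem < LANES)
	{
		for (; i < nelem; i++)
		{
			if (base[i] == key)
				return true;
		}
		return false;
	}

	/* Full, non-overlapping blocks. */
	for (; i + LANES <= nelem; i += LANES)
	{
		if (block_contains(base + i, key))
			return true;
	}

	/*
	 * Leftover elements: instead of a scalar tail loop, re-check the final
	 * LANES elements. This block overlaps elements already examined, which
	 * is harmless for a membership test.
	 */
	if (i < nelem)
		return block_contains(base + nelem - LANES, key);

	return false;
}
```

With a 10-element array, the main loop covers elements 0..7 and the final block covers elements 6..9, so elements 6 and 7 are simply checked twice.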

> It'd be interesting to see the threshold where your patch starts winning.
> IIUC the vector stuff won't take effect until there are 16 bytes to
> process. If we don't expect attributes to ordinarily be >= 16 bytes, it
> might be worth trying to mitigate this ~3% regression. Maybe we can find
> some other small gains elsewhere to offset it.

For the particular short-strings benchmark I have been using (3 columns with 8-character ASCII strings in each), I suspect the regression is caused by the need to do a strlen(), rather than by the vectorized loop itself: we skip the vectorized loop anyway, because sizeof(Vector8) == 16 on this machine. This would also explain why we see a regression on short strings for text but not CSV: CSV needed to do a strlen() for the non-quoted-string case regardless.

Unfortunately this makes it tricky to make the optimization conditional on the length of the string. I suppose we could play some games where we start with a byte-by-byte loop and then switch over to the vectorized path (and take a strlen()) if we have seen more than, say, sizeof(Vector8) bytes so far. Seems a bit kludgy though.
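
A minimal sketch of that hybrid idea, under several assumptions: `CHUNK` stands in for sizeof(Vector8), `needs_escape()` is a simplified escaping rule rather than the exact one COPY uses, and `chunk_has_special()` is a scalar placeholder for the vector comparison. None of these names come from the patch. The function only locates the first byte that would need an escape, which is enough to show where the strlen() call moves to.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 16				/* stand-in for sizeof(Vector8) */

/* Simplified rule: control characters, backslash and the delimiter. */
static bool
needs_escape(unsigned char c, char delim)
{
	return c < 0x20 || c == '\\' || c == (unsigned char) delim;
}

/* Placeholder for a vector compare over exactly CHUNK bytes. */
static bool
chunk_has_special(const char *p, char delim)
{
	bool		found = false;

	for (int i = 0; i < CHUNK; i++)
		found |= needs_escape((unsigned char) p[i], delim);
	return found;
}

/*
 * Returns the offset of the first byte that needs escaping, or the string
 * length if there is none.
 */
static size_t
first_special_hybrid(const char *s, char delim)
{
	size_t		i = 0;
	size_t		len;

	/* Phase 1: byte-by-byte; short attributes finish here, no strlen(). */
	for (; i < CHUNK; i++)
	{
		if (s[i] == '\0' || needs_escape((unsigned char) s[i], delim))
			return i;
	}

	/* Phase 2: the attribute is long, so pay for strlen() once. */
	len = i + strlen(s + i);
	for (; i + CHUNK <= len; i += CHUNK)
	{
		if (chunk_has_special(s + i, delim))
			break;
	}

	/* Scalar pass over the tail, or over the chunk that was flagged. */
	for (; i < len; i++)
	{
		if (needs_escape((unsigned char) s[i], delim))
			break;
	}
	return i;
}
```

For an 8-character attribute such as "abcdefgh", the function returns from the first loop and never calls strlen(); only attributes of at least CHUNK clean bytes reach the chunked path. Whether the extra branch pays for itself is exactly what the benchmarks would need to show.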

I will do some more benchmarking and report back. For the time being, I'm not inclined to push to get the CopyAttributeOutTextVector() into the tree in its current state, as I agree that the short-attribute case is quite important.

In the meantime, attached is a revised patch series. This uses SIMD to optimize CopyReadLineText in COPY FROM. Performance results:

====

master @ 8fea1bd5411b:

Benchmark 1: ./psql -f /Users/neilconway/copy-from-large-long-strings.sql
Time (mean ± σ): 1.944 s ± 0.013 s [User: 0.001 s, System: 0.000 s]
Range (min … max): 1.927 s … 1.975 s 10 runs

Benchmark 1: ./psql -f /Users/neilconway/copy-from-large-short-strings.sql
Time (mean ± σ): 1.021 s ± 0.017 s [User: 0.002 s, System: 0.001 s]
Range (min … max): 1.005 s … 1.053 s 10 runs

master + SIMD patches:

Benchmark 1: ./psql -f /Users/neilconway/copy-from-large-long-strings.sql
Time (mean ± σ): 1.513 s ± 0.022 s [User: 0.001 s, System: 0.000 s]
Range (min … max): 1.493 s … 1.552 s 10 runs

Benchmark 1: ./psql -f /Users/neilconway/copy-from-large-short-strings.sql
Time (mean ± σ): 1.032 s ± 0.032 s [User: 0.002 s, System: 0.001 s]
Range (min … max): 1.009 s … 1.113 s 10 runs

====

Neil
