Re: Speed up COPY FROM text/CSV parsing using SIMD

From Manni Wood
Subject Re: Speed up COPY FROM text/CSV parsing using SIMD
Msg-id CAKWEB6qa4V+aU5-S_Eq=J2o09xp=3e-iLFVqimB0Zu6iq3GKdw@mail.gmail.com
In response to Re: Speed up COPY FROM text/CSV parsing using SIMD  (KAZAR Ayoub <ma_kazar@esi.dz>)
List pgsql-hackers


On Wed, Dec 24, 2025 at 9:08 AM KAZAR Ayoub <ma_kazar@esi.dz> wrote:
Hello,
Following the same path of optimizing COPY FROM using SIMD, I found that COPY TO can also benefit from it.

I attached a small patch that uses SIMD to skip over ordinary data, advancing until the first special character is found, then falls back to scalar processing for that character and re-enters the SIMD path...
There are two ways to do this:
1) Use SIMD until we find a special character, then continue on the scalar path without re-entering SIMD.
- This gives speedups of 10% to 30%, depending on the share of special characters in the attribute. We don't lose anything here, since it advances with SIMD until it can't (using the previous scripts: 1/3 and 2/3 special chars).

2) Use the SIMD path, switch to the scalar path when we hit a special character, and keep re-entering the SIMD path each time.
- This is equivalent to the COPY FROM story: we'll need to find one heuristic that works for both COPY FROM and COPY TO to reduce the regressions (same regressions: around 20% to 30% with 1/3 and 2/3 special chars).

Something else to note is that the scalar path for COPY TO isn't as heavy as the state machine in COPY FROM.

So if we find the sweet spot for the heuristic, doing the same for COPY TO will be trivial and always beneficial.
Attached is 0004, which is option 1 (SIMD without re-entering); 0005 is option 2.
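
For readers following along without the patches, here is a minimal sketch of how the two options differ. It is not the code in 0004 or 0005. The names (escape_attr, chunk_has_special, CHUNK) are hypothetical, an 8-byte word trick stands in for real vector instructions so the example stays portable, and the escaping rules are a simplified subset of what COPY TO text format does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK 8		/* bytes checked per fast step; a real vector is wider */

/* Nonzero if any byte of v equals c (the classic "has zero byte" trick). */
static inline uint64_t
chunk_has_byte(uint64_t v, unsigned char c)
{
	uint64_t	x = v ^ (0x0101010101010101ULL * c);

	return (x - 0x0101010101010101ULL) & ~x & 0x8080808080808080ULL;
}

/* True if the CHUNK bytes at p contain a character that needs escaping. */
static inline bool
chunk_has_special(const char *p)
{
	uint64_t	v;

	memcpy(&v, p, CHUNK);
	return (chunk_has_byte(v, '\\') | chunk_has_byte(v, '\n') |
			chunk_has_byte(v, '\r') | chunk_has_byte(v, '\t')) != 0;
}

static inline bool
is_special(char c)
{
	return c == '\\' || c == '\n' || c == '\r' || c == '\t';
}

/* Scalar path: write one input byte, escaped if necessary. */
static inline char *
emit_scalar(char *out, char c)
{
	switch (c)
	{
		case '\\': *out++ = '\\'; *out++ = '\\'; break;
		case '\n': *out++ = '\\'; *out++ = 'n'; break;
		case '\r': *out++ = '\\'; *out++ = 'r'; break;
		case '\t': *out++ = '\\'; *out++ = 't'; break;
		default:   *out++ = c; break;
	}
	return out;
}

/*
 * Escape len bytes of in into out (caller provides at least 2 * len bytes).
 * reenter = false: option 1, stay scalar after the first special character.
 * reenter = true:  option 2, retry the fast path after each special character.
 * Returns the number of bytes written.
 */
size_t
escape_attr(const char *in, size_t len, char *out, bool reenter)
{
	size_t		i = 0;
	char	   *o = out;
	bool		fast = true;

	assert(in != NULL && out != NULL);

	while (i < len)
	{
		if (fast && len - i >= CHUNK)
		{
			if (!chunk_has_special(in + i))
			{
				/* Whole chunk is ordinary data: copy it in one step. */
				memcpy(o, in + i, CHUNK);
				o += CHUNK;
				i += CHUNK;
				continue;
			}
			fast = false;	/* a special byte is ahead; go scalar */
		}

		char		c = in[i++];

		o = emit_scalar(o, c);
		if (reenter && is_special(c))
			fast = true;	/* option 2 only: try the fast path again */
	}
	return (size_t) (o - out);
}
```

Both modes produce identical output; only the amount of chunk checking differs. With reenter set, every special character costs one failed chunk check plus a scalar stretch, which is where a heuristic would have to decide whether retrying is worth it.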


Regards,
Ayoub

Hello, Nazir and Ayoub!

Nazir, sorry for the late reply, I am on holiday. :-) I wanted to thank you for the tips on using cpupower to get less variance in my test results.

Ayoub, I suppose it was inevitable the SIMD patch would work for copying out as well as copying in!

I am back at work on 5 Jan 2026, so I will try to carve out time to test this then, using Nazir's tips.

Happy Holidays!

-Manni
--
-- Manni Wood EDB: https://www.enterprisedb.com
