Re: Speed up COPY FROM text/CSV parsing using SIMD
| From | Manni Wood |
|---|---|
| Subject | Re: Speed up COPY FROM text/CSV parsing using SIMD |
| Date | |
| Msg-id | CAKWEB6rLxPVtN4ffZ3CMTL518zhk_BWzzBt6ZE2oUSaErdphxA@mail.gmail.com |
| In response to | Re: Speed up COPY FROM text/CSV parsing using SIMD (KAZAR Ayoub <ma_kazar@esi.dz>) |
| List | pgsql-hackers |
On Wed, Nov 26, 2025 at 5:51 AM KAZAR Ayoub <ma_kazar@esi.dz> wrote:
On Tue, Nov 18, 2025 at 05:20:05PM +0300, Nazir Bilal Yavuz wrote:
> Thanks, done.
I took a look at the v3 patches. Here are my high-level thoughts:
+ /*
+ * Parse data and transfer into line_buf. To get benefit from inlining,
+ * call CopyReadLineText() with the constant boolean variables.
+ */
+ if (cstate->simd_continue)
+ result = CopyReadLineText(cstate, is_csv, true);
+ else
+ result = CopyReadLineText(cstate, is_csv, false);
I'm curious whether this actually generates different code, and if it does,
if it's actually faster. We're already branching on cstate->simd_continue
here.
I've compiled both versions with -O2 and confirmed they generate different code. When simd_continue is passed as a constant to CopyReadLineText, the compiler optimizes out the condition checks from the SIMD path.
A small benchmark on a 1GB+ file shows the expected benefit, a performance improvement of around 6%.
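For readers who want to try the constant-argument specialization outside the PostgreSQL tree, here is a minimal standalone sketch. It is not the patch's code: scan_line() is a hypothetical stand-in for CopyReadLineText(), the 8-byte chunk loop merely imitates a SIMD fast path, and the ALWAYS_INLINE macro assumes GCC or Clang (PostgreSQL itself has pg_attribute_always_inline for this purpose). Because each call site passes a literal true or false, the compiler can emit two specialized bodies and drop the use_fast_path test inside each one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: GCC/Clang attribute; falls back to plain inline elsewhere. */
#if defined(__GNUC__)
#define ALWAYS_INLINE __attribute__((always_inline)) inline
#else
#define ALWAYS_INLINE inline
#endif

/*
 * Hypothetical stand-in for CopyReadLineText(): returns the offset of the
 * first '\n' in buf, or len if there is none.
 */
static ALWAYS_INLINE size_t
scan_line(const char *buf, size_t len, bool use_fast_path)
{
    size_t      i = 0;

    if (use_fast_path)
    {
        /* Imitation of a SIMD loop: skip whole 8-byte chunks with no '\n'. */
        while (i + 8 <= len)
        {
            bool        found = false;

            for (size_t j = 0; j < 8; j++)
                if (buf[i + j] == '\n')
                    found = true;
            if (found)
                break;
            i += 8;
        }
    }

    /* Scalar tail: also the only loop when the fast path is disabled. */
    while (i < len && buf[i] != '\n')
        i++;

    return i;
}

/*
 * Branch once on the runtime flag, then call with compile-time constants so
 * each inlined copy of scan_line() is specialized for its flag value.
 */
size_t
scan_line_dispatch(const char *buf, size_t len, bool use_fast_path)
{
    if (use_fast_path)
        return scan_line(buf, len, true);
    else
        return scan_line(buf, len, false);
}
```

Compiling this with -O2 -S and comparing against a variant that passes use_fast_path straight through is one way to check whether the generated code differs on a given compiler.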
I've attached the assembly outputs in case someone wants to check something else.
Regards,
Ayoub Kazar
Correction to my last post:
I also tried files that alternated lines with no special characters and lines with one-third special characters, thinking I could force the algorithm to continually re-check whether or not it should use SIMD, and therefore force more overhead in the try-SIMD/don't-try-SIMD housekeeping code. The text file was still 20% faster (not 50% faster as I originally stated; that was a typo). The CSV file was still 13% faster.
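The files themselves are not attached to this message, so the following is only a hypothetical sketch of how input with that alternating shape could be generated. The function names, the fixed 'a' filler, and the choice of which byte counts as "special" are all assumptions; the caller picks the special byte for the format under test, and the output is not guaranteed to be loadable COPY input for every choice.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Fill out[0..width-1] with 'a'; if with_specials is set, every third byte
 * becomes `special`. out must have room for width + 1 bytes.
 */
static void
fill_line(char *out, size_t width, bool with_specials, char special)
{
    for (size_t i = 0; i < width; i++)
        out[i] = (with_specials && i % 3 == 2) ? special : 'a';
    out[width] = '\0';
}

/*
 * Write nlines lines to fp, alternating plain lines (even index) with
 * special-heavy lines (odd index). Returns 0 on success, -1 on error.
 */
static int
write_alternating(FILE *fp, size_t nlines, size_t width, char special)
{
    char        line[256];

    if (width >= sizeof(line))
        return -1;

    for (size_t n = 0; n < nlines; n++)
    {
        fill_line(line, width, n % 2 == 1, special);
        if (fprintf(fp, "%s\n", line) < 0)
            return -1;
    }
    return 0;
}
```

For example, write_alternating(fp, 1000000, 90, ',') would produce a million 90-byte lines in which every other line has a comma in one position out of three.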
Also, apologies for posting at the top in my last e-mail.
--
Manni Wood
EDB: https://www.enterprisedb.com