Re: Review/Pull Request: Adding new CRC32C implementation for IBM S390X

From John Naylor
Subject Re: Review/Pull Request: Adding new CRC32C implementation for IBM S390X
Date
Msg-id CANWCAZbU0wz-H5ER-=yOfrxymYsk0sFU1oE+dD=N=La4s-BGwg@mail.gmail.com
In response to RE: Review/Pull Request: Adding new CRC32C implementation for IBM S390X  (Eduard Stefes <Eduard.Stefes@ibm.com>)
List pgsql-hackers
On Tue, May 27, 2025 at 3:24 AM Eduard Stefes <Eduard.Stefes@ibm.com> wrote:
> So I worked on the algorithm to also work on buffers between 16-64
> bytes. Then I ran the performance measurement on two
> dataset[^raw_data_1] [^raw_data_2]. And created two diagrams
> [^attachment].
>
> my findings so far:
>
> - the optimized crc32cvx is faster
> - the sb8 performance is heavily depending on alignment (see the
> ripples every 8 bytes)

To be precise, these all seem 8-byte aligned at a glance, and the
ripple is due to input length.
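
A rough model of why length alone can produce that ripple, assuming the
usual slicing-by-8 shape (leading bytes one at a time until the pointer
is 8-byte aligned, then 8-byte words, then trailing bytes one at a
time). The struct and function names are hypothetical and this is not
the PostgreSQL code, only a sketch of how the work divides:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical breakdown of one CRC call into its three phases. */
typedef struct
{
	size_t		head_bytes;		/* byte-at-a-time steps before alignment */
	size_t		words8;			/* 8-byte table-driven steps */
	size_t		tail_bytes;		/* byte-at-a-time steps after the last word */
} sb8_split;

/*
 * Split a buffer starting at address "addr" with "len" bytes into the
 * phases a slicing-by-8 style loop would go through. Assumed model only.
 */
static sb8_split
sb8_split_work(uintptr_t addr, size_t len)
{
	sb8_split	s;
	size_t		total = len;
	size_t		misalign = (size_t) (addr & 7);

	/* Bytes needed to reach the next 8-byte boundary, capped at len. */
	s.head_bytes = misalign ? 8 - misalign : 0;
	if (s.head_bytes > len)
		s.head_bytes = len;
	len -= s.head_bytes;

	s.words8 = len / 8;
	s.tail_bytes = len % 8;

	/* Every input byte is accounted for in exactly one phase. */
	assert(s.head_bytes + 8 * s.words8 + s.tail_bytes == total);
	return s;
}
```

With an aligned start (addr % 8 == 0), head_bytes is always 0 and
tail_bytes is simply len % 8, so the number of slow byte-wise steps
cycles through 0..7 as the length grows. Under this model that gives a
pattern with a period of 8 bytes even when every buffer is aligned.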

> - the 8 byte ripple is also visible in the vx implementation. As it can
> only perform on 16 or 64 byte chunks, it will still use sb8 for the
> remaining bytes.
> - there is no obvious speed regression in the vx algorithm. Except
> raw_data_2-28 which I assume is a fluke. I am sharing the system with a
> bunch of other devs.
>
>
> I hope this is acceptable as performance measurement. However we
> will setup a dedicated performance test and try to get precise numbers
> without side-effects. But it may take some time until we get to that.

This already looks like a solid improvement at 32 bytes and above -- I
don't think we need less noisy numbers. Also for future reference,
please reply in-line. Thanks!

--
John Naylor
Amazon Web Services


