RE: Popcount optimization using AVX512

From Amonson, Paul D
Subject RE: Popcount optimization using AVX512
Msg-id BL1PR11MB530473FB4E9CBD68C28896F7DC282@BL1PR11MB5304.namprd11.prod.outlook.com
In reply to Re: Popcount optimization using AVX512  (Nathan Bossart <nathandbossart@gmail.com>)
Responses RE: Popcount optimization using AVX512  ("Amonson, Paul D" <paul.d.amonson@intel.com>)
List pgsql-hackers
> -----Original Message-----
> From: Nathan Bossart <nathandbossart@gmail.com>
> Sent: Friday, March 15, 2024 8:06 AM
> To: Amonson, Paul D <paul.d.amonson@intel.com>
> Cc: Andres Freund <andres@anarazel.de>; Alvaro Herrera <alvherre@alvh.no-ip.org>;
> Shankaran, Akash <akash.shankaran@intel.com>; Noah Misch <noah@leadboat.com>;
> Tom Lane <tgl@sss.pgh.pa.us>; Matthias van de Meent <boekewurm+postgres@gmail.com>;
> pgsql-hackers@lists.postgresql.org
> Subject: Re: Popcount optimization using AVX512
>
> Which test suite did you run?  Those numbers seem potentially
> indistinguishable from noise, which probably isn't great for such a large patch
> set.

I ran...
    psql -c "select bitcount(column) from table;"
...in a loop with "column" widths of 84, 4096, 8192, and 16384 containing random data. The DB has 1 million rows. In
the loop, before calling the select, I have code to clear all system caches. If I omit the code to clear system caches,
the margin of error remains the same but the improvement percent changes from 1.2% to 14.6% (much less I/O when cached
data is available).
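
A minimal sketch of how a timing loop like the one described above could be scripted. The helper names (time_query, summarize, improvement_percent), the run count, and the 95% margin formula are all assumptions for illustration; the query string is the placeholder from the message, and the cache-clearing step is left as a caller-supplied hook because the message does not say how caches were cleared.

```python
import statistics
import subprocess
import time

# Placeholder query copied from the message; the real table and column
# names are not given, so substitute your own before running.
QUERY = "select bitcount(column) from table;"


def time_query(query, clear_caches=None):
    """Return wall-clock seconds for one psql run of `query`."""
    if clear_caches is not None:
        # Hypothetical hook: how system caches are cleared is an assumption
        # left to the caller (it is not described in the message).
        clear_caches()
    start = time.perf_counter()
    subprocess.run(["psql", "-c", query], check=True,
                   stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def summarize(samples):
    """Return (mean, rough 95% margin of error) for a list of timings."""
    mean = statistics.mean(samples)
    # 1.96 * standard error of the mean; assumes roughly normal timings.
    margin = 1.96 * statistics.stdev(samples) / len(samples) ** 0.5
    return mean, margin


def improvement_percent(baseline, patched):
    """Percent reduction in run time relative to the baseline."""
    return (baseline - patched) / baseline * 100.0


if __name__ == "__main__":
    runs = [time_query(QUERY) for _ in range(10)]
    mean, margin = summarize(runs)
    print(f"mean {mean:.3f}s +/- {margin:.3f}s over {len(runs)} runs")
```

Running the same loop against a patched and an unpatched build and comparing the two means against their margins is one way to judge whether a 1.2% difference is distinguishable from noise.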

> I ran John Naylor's test_popcount module [0] with the following command on
> an i7-1195G7:
>
>     time psql postgres -c 'select drive_popcount(10000000, 1024)'
>
> Without your patches, this seems to take somewhere around 8.8 seconds.
> With your patches, it takes 0.6 seconds.  (I re-compiled and re-ran the tests a
> couple of times because I had a difficult time believing the amount of
> improvement.)

When I tested the code outside postgres in a micro benchmark, I got 200-300% improvements. Your results are interesting,
as they imply more than 300% improvement. Let me do some research on the benchmark you referenced. However, in all cases
it seems that there is no regression, so should we move forward on merging while I run some more local tests?
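
For scale, the quoted timings (about 8.8 seconds without the patches, 0.6 seconds with them) can be converted to a speedup factor. This is only worked arithmetic on those two numbers. The function names are hypothetical, and reading "improvement" as throughput gain (so that 200-300% corresponds to 3x-4x) is an assumption.

```python
def speedup(baseline_s, patched_s):
    """How many times faster the patched run is."""
    return baseline_s / patched_s


def throughput_gain_percent(baseline_s, patched_s):
    """Throughput gain in percent; 100% means twice as fast."""
    return (baseline_s / patched_s - 1.0) * 100.0


# Timings quoted from the drive_popcount(10000000, 1024) run.
baseline, patched = 8.8, 0.6
print(f"{speedup(baseline, patched):.1f}x")                   # 14.7x
print(f"{throughput_gain_percent(baseline, patched):.0f}%")   # 1367%
```

Under that reading, the test_popcount result is well above the 200-300% seen in the standalone micro benchmark.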

Thanks,
Paul



