On Tue, Jan 30, 2024 at 11:23:57AM +1300, David Rowley wrote:
> On Tue, 30 Jan 2024 at 08:32, Nathan Bossart <nathandbossart@gmail.com> wrote:
>> I'm currently +0.1 for this change. I don't see any huge problem with
>> trimming a few instructions, but I'm dubious there's any measurable impact.
>> However, a cycle saved is a cycle earned...
>
> FWIW, In [1] and subsequent replies, there are several examples of
> benchmarks where various bitmapset functions are sitting high in the
> profiles. So I wouldn't be too surprised if such a small change to the
> WORDNUM and BITNUM macros made a noticeable difference.

Good to know, thanks. If there is indeed demonstrable improvement, I'd
readily adjust my +0.1 to +1.

Following the suggestions, I did a quick test with one of the scripts.
Ubuntu 64 bits
gcc 12.3 64 bits
create table t1 (a int) partition by list(a);
select 'create table t1_'||x||' partition of t1 for values in('||x||');' from generate_series(0,9)x;
test1.sql
select * from t1 where a > 1 and a < 3;
pgbench -U postgres -n -f test1.sql -T 15 postgres
head:
tps = 27983.182940
tps = 28916.903038
tps = 29051.878855
patched:
tps = 27517.301246
tps = 27848.684133
tps = 28669.367300
create table t2 (a int) partition by list(a);
select 'create table t2_'||x||' partition of t2 for values in('||x||');' from generate_series(0,9999)x;
test2.sql
select * from t2 where a > 1 and a < 3;
pgbench -U postgres -n -f test2.sql -T 15 postgres
head:
tps = 27144.044463
tps = 28932.948620
tps = 29299.016248
patched:
tps = 27363.364039
tps = 28588.141586
tps = 28669.367300
To my complete surprise, the patched build is slower.
I can't explain how gcc ends up producing a worse binary from code with
fewer instructions.
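
To put rough numbers on the comparison, the runs above can be averaged.
This is only a minimal sketch in Python: the tps values are copied from
the results above, while the helper names (summarize, relative_change)
and the dictionary layout are my own and purely illustrative.

```python
from statistics import mean

# tps values copied verbatim from the pgbench runs above.
runs = {
    "test1": {
        "head": [27983.182940, 28916.903038, 29051.878855],
        "patched": [27517.301246, 27848.684133, 28669.367300],
    },
    "test2": {
        "head": [27144.044463, 28932.948620, 29299.016248],
        "patched": [27363.364039, 28588.141586, 28669.367300],
    },
}


def summarize(values):
    """Return (mean, min-to-max spread) for a list of tps values."""
    return mean(values), max(values) - min(values)


def relative_change(head, patched):
    """Percentage change of the patched mean relative to the head mean."""
    return (mean(patched) - mean(head)) / mean(head) * 100.0


for name, r in runs.items():
    head_mean, head_spread = summarize(r["head"])
    patched_mean, patched_spread = summarize(r["patched"])
    change = relative_change(r["head"], r["patched"])
    print(
        f"{name}: head {head_mean:.1f} (spread {head_spread:.1f}), "
        f"patched {patched_mean:.1f} (spread {patched_spread:.1f}), "
        f"change {change:+.2f}%"
    )
```

With only three runs per build, the difference between the head and
patched means in both tests is smaller than the min-to-max spread of the
head runs, so more runs would help before reading too much into it.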
best regards,
Ranier Vilela