On 02/01/2020 01:15, John Naylor wrote:
> I wrote:
>
>> Currently, we include the function name string in each FmgrBuiltin
>> struct, whose size is 24 bytes on 64 bit platforms. As far as I can
>> tell, the name is usually unused, so the attached (WIP, untested)
>> patch stores it separately, reducing this struct to 16 bytes.
>>
>> We can go one step further and allocate the names as a single
>> character string, reducing the binary size. It doesn't help much to
>> store offsets, since there are ~40k characters, requiring 32-bit
>> offsets. If we instead compute the offset on the fly from stored name
>> lengths, we can use 8-bit values, saving 19kB of space in the binary
>> over using string pointers.
>
> I tested with the attached C function to make sure
> fmgr_internal_function() still returned the correct answer. I assume
> this is not a performance critical function, but I still wanted to see
> if there was a visible performance regression. I get this when calling
> fmgr_internal_function() 100k times:
>
> master: 833ms
> patch: 886ms
Hmm. I was actually expecting this to slightly speed up
fmgr_internal_function(), now that all the names fit in a smaller amount
of cache. I guess there are more branches or a data dependency or
something now. I'm not too worried about that though. If it mattered, we
should switch to binary searching the array.
> The point of the patch is to increase the likelihood of
> fmgr_isbuiltin() finding the fmgr_builtins[] element in L1 cache. It
> seems harder to put a number on that for a realistic workload, but
> reducing the array size by 1/3 couldn't hurt.
Yeah. Nevertheless, it would be nice to be able to demonstrate the
benefit in some test, at least. It feels hard to justify committing a
performance patch if we can't show the benefit. Otherwise, we should
just try to keep it as simple as possible, to optimize for readability.
A similar approach was actually discussed a couple of years back:
https://www.postgresql.org/message-id/bd13812c-c4ae-3788-5b28-5633beed2929%40iki.fi.
The conclusion then was that it's not worth the trouble or the code
complication. So I think this patch is Rejected, unless you can come up
with a test case that concretely shows the benefit.
- Heikki