Neil Conway <neilc@samurai.com> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> How much did you bloat the code? There are an awful lot of calls to
>> newNode(), so even though it's not all that large, I'd think the
>> multiplier would be nasty.
> The patch increases the executable from 12844452 to 13005244 bytes,
> when compiled with '-pg -g -O2' and without being stripped.

Okay, not as bad as I feared, but still kinda high.
I believe that most of the bloat comes from the MemSet macro; there's
just not much else in newNode(). Now, the reason MemSet expands to
a fair amount of code is its if-then-else case to decide whether to
call memset() or do an inline loop. I've looked at the assembler code
for it on a couple of machines, and the loop proper is only about a
third of the code that gets generated.
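
For readers without the source at hand, here is a rough sketch of the shape of macro being discussed. It is reconstructed from the description in this message, not copied from the tree: the name MemSetSketch, the use of <stdint.h> types, and the value of MEMSET_LOOP_LIMIT are assumptions made for illustration. The point is only that each expansion carries a four-way test plus both the inline loop and the memset() call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT_ALIGN_MASK     (sizeof(int32_t) - 1)
#define MEMSET_LOOP_LIMIT  64	/* assumed threshold, for illustration only */

/*
 * Sketch of a MemSet-style macro (hypothetical name).  The inline word
 * loop is used only when the pointer is int-aligned, the length is a
 * multiple of the word size, the fill value is zero, and the length is
 * small; otherwise we fall back to memset().
 */
#define MemSetSketch(start, val, len) \
	do \
	{ \
		void   *_start = (start); \
		int		_val = (val); \
		size_t	_len = (len); \
		\
		if ((((uintptr_t) _start) & INT_ALIGN_MASK) == 0 && \
			(_len & INT_ALIGN_MASK) == 0 && \
			_val == 0 && \
			_len <= MEMSET_LOOP_LIMIT) \
		{ \
			int32_t *_ptr = (int32_t *) _start; \
			int32_t *_stop = (int32_t *) ((char *) _start + _len); \
			\
			while (_ptr < _stop) \
				*_ptr++ = 0; \
		} \
		else \
			memset(_start, _val, _len); \
	} while (0)
```

With constant val and len, three of the four conditions can be folded at compile time; the test on the address is the one that has to be evaluated at run time.
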
Ideally, we'd like to eliminate the if-test for inlined newNode calls.
That would buy back a lot of the bloat and speed things up still
further.
Now the tests on _val == 0 and _len <= MEMSET_LOOP_LIMIT and _len being
a multiple of 4 are no problem, since _val and _len are compile-time
constants; these will be optimized away. What is not optimized away
(on the compilers I've looked at) is the check for _start being
int-aligned.
A brute-force approach is to say "we know _start is word-aligned because
we just got it from palloc, which guarantees MAXALIGNment". We could
make a variant version of MemSet that omits the alignment check, and use
it here and anywhere else we're sure it's safe.
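
A minimal sketch of what such a variant might look like, under stated assumptions: MemSetAlignedSketch, DemoNode and new_demo_node are hypothetical names, MEMSET_LOOP_LIMIT is an assumed value, and malloc() stands in for palloc() only because it likewise returns suitably aligned memory. The caller takes responsibility for the alignment; nothing here checks it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INT_ALIGN_MASK     (sizeof(int32_t) - 1)
#define MEMSET_LOOP_LIMIT  64	/* assumed threshold, for illustration only */

/*
 * Hypothetical variant with no alignment test.  The caller must guarantee
 * that "start" is int-aligned.  The remaining tests involve only val and
 * len, so they can be folded away when both are compile-time constants.
 */
#define MemSetAlignedSketch(start, val, len) \
	do \
	{ \
		int32_t *_start = (int32_t *) (start); \
		int		_val = (val); \
		size_t	_len = (len); \
		\
		if ((_len & INT_ALIGN_MASK) == 0 && \
			_val == 0 && \
			_len <= MEMSET_LOOP_LIMIT) \
		{ \
			int32_t *_stop = (int32_t *) ((char *) _start + _len); \
			\
			while (_start < _stop) \
				*_start++ = 0; \
		} \
		else \
			memset(_start, _val, _len); \
	} while (0)

/* Hypothetical node type, standing in for a real parse/plan node. */
typedef struct DemoNode
{
	int			tag;
	int			payload[3];
} DemoNode;

/* newNode()-like helper: allocate, zero, set the tag. */
static DemoNode *
new_demo_node(int tag)
{
	DemoNode   *node = malloc(sizeof(DemoNode));

	if (node == NULL)
		return NULL;
	MemSetAlignedSketch(node, 0, sizeof(DemoNode));
	node->tag = tag;
	return node;
}
```

Since sizeof(DemoNode) is a constant multiple of the word size and below the limit, a compiler that folds constant conditions should reduce each call site to just the zeroing loop.
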
A nicer approach would be to somehow make use of the datatype of the
first argument to MemSet. If we could determine at compile time that
it's supposed to point at a type with at least int alignment, then
it'd be possible for the compiler to optimize away this check in a
reasonably safe fashion. I'm not sure if there's a portable way to
do this, though. There's no "alignof()" construct in C :-(.
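
One idiom that might get part of the way there, offered as an untested assumption rather than an answer: put the type after a single char in a probe struct and take offsetof() of that member. DemoNode, DemoNodeAlignProbe and the two macros below are hypothetical names. The C standard only guarantees that the offset is a multiple of the type's alignment, though on the compilers I know of it equals the alignment. Each type needs its own probe declaration, so it is not obvious how to fold this into a generic MemSet that sees only a pointer expression. (Much later revisions of the language, C11 onward, added an _Alignof operator; this sketch assumes it is unavailable.)

```c
#include <stddef.h>

/* Hypothetical node type, for illustration only. */
typedef struct DemoNode
{
	int			tag;
	double		weight;
} DemoNode;

/*
 * Probe struct: the compiler has to insert padding after "pad" so that
 * "member" is properly aligned, so the member's offset reflects the
 * alignment requirement of DemoNode.
 */
struct DemoNodeAlignProbe
{
	char		pad;
	DemoNode	member;
};

/* Integer constant expression: offset of the probed member. */
#define DEMO_NODE_ALIGN \
	offsetof(struct DemoNodeAlignProbe, member)

/* Nonzero if the probe suggests DemoNode pointers are int-aligned. */
#define DEMO_NODE_IS_INT_ALIGNED \
	(DEMO_NODE_ALIGN % sizeof(int) == 0)

/* Example use: a test the compiler can resolve without looking at p. */
static int
demo_node_needs_runtime_check(const DemoNode *p)
{
	(void) p;					/* address is not examined */
	return !DEMO_NODE_IS_INT_ALIGNED;
}
```

This trusts the pointer's declared type, so it is only as safe as the casts that produced the pointer.
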
Any ideas?

			regards, tom lane