> int8 would still pose some overflow risk (at least for int8 input),
> and would likely be no faster than a float8 implementation, since
> both would require palloc().
Right. On 32-bit machines, int8 is likely to be substantially slower,
since the int8 math is done in a library rather than in a single machine
instruction.
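
To make the overflow trade-off concrete, here is a minimal sketch in C. It is not the backend's aggregate code, and the function names are hypothetical. It sums int4 inputs three ways: into an int8 accumulator, into a float8 accumulator, and with a check for whether a plain int4 accumulator would have overflowed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not PostgreSQL source. */

/* Sum int4 values into an int8 (64-bit) accumulator. */
static int64_t sum_int4_as_int8(const int32_t *v, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];            /* 64-bit add; may cost more than one
                                 * instruction on a 32-bit target */
    return acc;
}

/* Sum int4 values into a float8 (double) accumulator. */
static double sum_int4_as_float8(const int32_t *v, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += (double) v[i];   /* exact only while the running total
                                 * fits in the 53-bit mantissa */
    return acc;
}

/* Return 1 if an int4 accumulator would overflow at any step, else 0. */
static int int4_sum_overflows(const int32_t *v, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
    {
        acc += v[i];
        if (acc > INT32_MAX || acc < INT32_MIN)
            return 1;
    }
    return 0;
}
```

With int4 input the int8 accumulator cannot realistically overflow, and the float8 one stays exact until the total passes 2^53. The overflow risk for int8 only comes back with int8 input, as noted in the quoted text.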
> Your test suggests that the performance differential is *at most*
> 2X --- probably much less in real-world situations where the disk
> pages aren't already cached.
Hmm. sum(int4) on the same table takes 1.8 seconds on 7.0.2 (vs 12.5
seconds on the snapshot). But I *am* compiling with asserts turned on for
the other tests (with maybe some other differences too), so maybe it is
not (yet) a fair comparison. Still, that is a pretty big performance
difference for something folks expect to be a fast operation.
- Thomas