On Tue, Mar 31, 2020 at 2:31 AM Andres Freund <andres@anarazel.de> wrote:
> I think the form of lea generated here is among the ones that can only
> be executed on port 1. Whereas e.g. an register+register/immediate add
> can be executed on four different ports.
I looked into slow vs. fast leas, and I think the above are actually
fast because they have only 2 operands (base and index, with no
displacement):
leal (%rdi,%rdi,2), %eax
A 3-op lea would look like this:
leal 42(%rdi,%rdi,8), %ecx
In other words, the scale doesn't count as an operand, although I've
seen a couple of places say that a non-1 scale adds a cycle of latency
on some AMD chips.
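As a rough illustration (my own sketch, not taken from the patch, and the
function names are made up), C like the following will commonly compile
to the two lea forms above with gcc or clang at -O2 on x86-64. The exact
instructions and registers depend on compiler version and flags, so
treat the asm in the comments as what one might expect to see, not a
guarantee:

```c
#include <stdint.h>

/* Hypothetical helpers, named only for illustration. */

/*
 * x * 3: often emitted as a base + index*scale lea with no
 * displacement, e.g.  leal (%rdi,%rdi,2), %eax
 */
uint32_t
times3(uint32_t x)
{
    return x * 3;
}

/*
 * x * 9 + 42: often emitted as base + index*scale + displacement,
 * e.g.  leal 42(%rdi,%rdi,8), %eax
 * This is the 3-operand form that can be slower on some Intel chips.
 */
uint32_t
times9_plus42(uint32_t x)
{
    return x * 9 + 42;
}
```

Compiling with -O2 -S and looking at the output for each function is a
quick way to check which form a given compiler actually picks.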
There is some interesting discussion in these LLVM reviews from 2017
about avoiding slow leas:
https://reviews.llvm.org/D32277
https://reviews.llvm.org/D32352
--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services