But he isn't doing single-row lookups; he's doing large aggregations. If you have to aggregate N rows, doing an O(1) operation on N different occasions is still O(N).
Not that big-O is very useful here anyway. It assumes that either everything fits in RAM (and is already there), or that nothing fits in RAM and everything has to be fetched from disk, even the index root pages, every time it is needed. Tommi is not operating in an environment where the first assumption holds, and no one operates in an environment where the second one holds.
As N increases beyond available RAM, your actual time for a single lookup is going to be a weighted average of two different constant-time operations, one with a small constant (a cache hit) and one with a large constant (a disk read). Big-O notation ignores this nicety and prices every operation at the slower speed, because that is what the weighted average converges to as N gets very large. But real-world systems do not operate at the infinite limit.
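To make the weighted average concrete, here is a minimal sketch. The constants are hypothetical (a ~100 ns cached lookup, a ~5 ms disk read, room for a million rows in cache), not measurements from Tommi's system:

```python
RAM_HIT_NS = 100          # assumed cost of a lookup served from RAM
DISK_MISS_NS = 5_000_000  # assumed cost of a lookup that goes to disk
CACHE_ROWS = 1_000_000    # assumed number of rows that fit in RAM

def expected_lookup_ns(n_rows: int) -> float:
    """Average cost of one lookup when n_rows are accessed uniformly:
    a weighted average of the hit cost and the miss cost."""
    hit_rate = min(1.0, CACHE_ROWS / n_rows)
    return hit_rate * RAM_HIT_NS + (1 - hit_rate) * DISK_MISS_NS

for n in (10_000, 1_000_000, 100_000_000):
    print(n, expected_lookup_ns(n))
```

While N fits in cache the average stays at the fast constant; once N exceeds cache, the average climbs toward the slow constant, which is the limit big-O prices everything at.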
So his runtime could easily behave like N^2 in practice: as he aggregates more rows, each individual row becomes less likely to be a cache hit, so the per-row cost rises along with the row count.
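A quick sketch of that superlinear region, again with hypothetical constants: total cost is N times the per-lookup average, and while the hit rate is falling, doubling N more than doubles the total, even though the curve is O(N) in the far limit:

```python
RAM_HIT_NS = 100          # assumed cached-lookup cost
DISK_MISS_NS = 5_000_000  # assumed disk-lookup cost
CACHE_ROWS = 1_000_000    # assumed rows that fit in RAM

def total_cost_ns(n_rows: int) -> float:
    """Total aggregation cost: N lookups at the blended per-lookup price."""
    hit_rate = min(1.0, CACHE_ROWS / n_rows)
    per_lookup = hit_rate * RAM_HIT_NS + (1 - hit_rate) * DISK_MISS_NS
    return n_rows * per_lookup

# In the transition region, 2x the rows costs much more than 2x the time:
for n in (1_000_000, 2_000_000, 4_000_000):
    print(n, total_cost_ns(n))
```

With these numbers, going from one million rows to two million multiplies the cost by tens of thousands, not by two, which is the kind of blow-up that looks quadratic (or worse) from the outside.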