My reasoning was this: to determine which index block to update (typically one, in both the partitioned and the non-partitioned case), one has to walk the index first. I assumed that this walk costs one additional read I/O on average in the non-partitioned case, because that index has one extra level and its lower levels are not cached (the index is too large for that).
But the "extra level" is up at the top of the tree, where it is well cached, not at the bottom, where it is not.
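
To make that concrete, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption chosen for illustration, not a measurement: a fanout of 100 entries per index block, 100 million keys, 100 equally sized partitions, a cache of 20,000 blocks that fills from the root downwards, and uniformly distributed lookups. The function names are made up for this example.

```python
def level_sizes(num_keys, fanout):
    """Block counts per level of one B-tree, listed from the leaves up to the root."""
    sizes = []
    blocks = -(-num_keys // fanout)  # integer ceiling division
    while True:
        sizes.append(blocks)
        if blocks == 1:
            break
        blocks = -(-blocks // fanout)
    return sizes


def uncached_levels(sizes_bottom_up, cache_blocks):
    """Count levels that do not fit when the cache is filled from the top down.

    Simplifying assumption: a level is either fully cached or not cached at all.
    """
    remaining = cache_blocks
    uncached = 0
    for blocks in reversed(sizes_bottom_up):
        if blocks <= remaining:
            remaining -= blocks
        else:
            uncached += 1
            remaining = 0  # levels below a non-fitting level are treated as uncached
    return uncached


# Illustrative numbers only (all assumed).
NUM_KEYS = 100_000_000
FANOUT = 100
PARTITIONS = 100
CACHE_BLOCKS = 20_000

# One big index over all keys.
global_sizes = level_sizes(NUM_KEYS, FANOUT)

# One small index per partition; the cache is shared, so sum each level across partitions.
per_partition = level_sizes(NUM_KEYS // PARTITIONS, FANOUT)
partitioned_sizes = [PARTITIONS * blocks for blocks in per_partition]

print("levels, non-partitioned:", len(global_sizes))
print("levels, per partition:  ", len(per_partition))
print("uncached levels, non-partitioned:", uncached_levels(global_sizes, CACHE_BLOCKS))
print("uncached levels, partitioned:    ", uncached_levels(partitioned_sizes, CACHE_BLOCKS))
```

Under these assumptions the non-partitioned index is one level taller, but the level it gains is the single root block. The leaf level has the same total number of blocks either way, so the number of levels that fall out of the cache, and hence the read I/O per lookup, comes out the same in this model.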