On 2018-05-16 22:11:22 -0400, Tom Lane wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > On 17 May 2018 at 11:00, Andres Freund <andres@anarazel.de> wrote:
> >> Wonder if we shouldn't just cache an estimated relation size in the
> >> relcache entry till then. For planning purposes we don't need to be
> >> accurate, and usually activity that drastically expands relation size
> >> will trigger relcache activity before long. Currently there's plenty
> >> workloads where the lseeks(SEEK_END) show up pretty prominently.
>
> > While I'm in favour of speeding that up, I think we'd get complaints
> > if we used a stale value.
>
> Yeah, that scares me too. We'd then be in a situation where (arguably)
> any relation extension should force a relcache inval. Not good.
> I do not buy Andres' argument that the value is noncritical, either ---
> particularly during initial population of a table, where the size could
> go from zero to something-significant before autoanalyze gets around
> to noticing.
I don't think every extension needs to force a relcache inval. It'd
instead be perfectly reasonable to define a rule that an inval is
triggered whenever crossing a 10% relation size boundary. Which'll lead
to invalidations for the first few pages, but much less frequently
later.
> I'm a bit skeptical of the idea of maintaining an accurate relation
> size in shared memory, too. AIUI, a lot of the problem we see with
> lseek(SEEK_END) has to do with contention inside the kernel for access
> to the single-point-of-truth where the file's size is kept. Keeping
> our own copy would eliminate kernel-call overhead, which can't hurt,
> but it won't improve the contention angle.
A syscall is several hundred instructions. An unlocked read - which'll
be be sufficient in many cases, given that the value can quickly be out
of date anyway - is a few cycles. Even with a barrier you're talking a
few dozen cycles. So I can't see how it'd not improve the contention.
But the main reason for keeping it in shmem is less the lseek avoidance
- although that's nice, context switches aren't great - but to make
relation extension need far less locking.
Greetings,
Andres Freund