On Thu, Dec 10, 2015 at 1:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Dec 10, 2015 at 1:22 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> We can seq scan the array at relcache build time and invalidate relcache
>>> when we extend. WAL log any extension to a new segment and write the table
>>> to disk at checkpoint.
>
>> Invalidating the relcache when we extend would be extremely expensive,
>
> ... and I think it would be too late anyway, if backends are relying on
> the relcache to tell the truth. You can't require an exclusive lock on
> a rel just to extend it, which means there cannot be a guarantee that
> what a backend has in its relcache will be up to date with current reality.

True.

> I really don't like Robert's proposal of a metapage though. We've got too
> darn many forks per relation already.

Oh, I wasn't thinking of adding a fork, just repurposing block 0 of
the main fork, as we do for some index types.

> It strikes me that this discussion is perhaps conflating two different
> issues. Robert seems to be concerned about how we'd detect (not recover
> from, just detect) filesystem misfeasance in the form of complete loss
> of a non-last segment file. The other issue is a desire to reduce the
> cost of mdnblocks() calls. It may be worth thinking about those two
> things together, but we shouldn't lose sight of these being separate
> goals, assuming that anybody besides Robert thinks that the segment
> file loss issue is worth worrying about.

Don't get me wrong, I'm not willing to expend *any* extra cycles to
notice a problem here. But all things being equal, code that notices
broken stuff is better than code that doesn't.
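
To make the two issues concrete, here is a minimal sketch. It is not the
actual md.c code. It assumes a size probe that walks segment files in order
and stops at the first one that is missing or not full, which is roughly how
the block count gets computed today. SegmentInfo, sketch_nblocks(),
sketch_has_gap() and SKETCH_RELSEG_SIZE are hypothetical names made up for
the illustration, and the segment size is shrunk to 4 blocks for readability.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: blocks per segment, kept tiny so the example is easy to follow. */
#define SKETCH_RELSEG_SIZE 4u

/* Hypothetical stand-in for what a stat()/lseek() probe of one segment reports. */
typedef struct SegmentInfo
{
    bool     exists;    /* is the segment file present on disk? */
    uint32_t nblocks;   /* number of blocks in the file, 0 if absent */
} SegmentInfo;

/*
 * Count blocks by walking segments in order and stopping at the first
 * segment that is missing or not full. The cost grows with the number of
 * segments, and a lost non-last segment just looks like end-of-relation.
 */
uint32_t
sketch_nblocks(const SegmentInfo *segs, size_t nsegs)
{
    uint32_t total = 0;

    for (size_t i = 0; i < nsegs; i++)
    {
        if (!segs[i].exists)
            break;
        total += segs[i].nblocks;
        if (segs[i].nblocks < SKETCH_RELSEG_SIZE)
            break;
    }
    return total;
}

/*
 * Detection only: report whether any segment past the apparent end of the
 * relation still holds blocks. Zero-length leftover segments are not
 * treated as evidence of a problem. Note that this looks at every segment,
 * so it costs more than the plain count above.
 */
bool
sketch_has_gap(const SegmentInfo *segs, size_t nsegs)
{
    bool ended = false;

    for (size_t i = 0; i < nsegs; i++)
    {
        if (ended && segs[i].nblocks > 0)
            return true;
        if (!segs[i].exists || segs[i].nblocks < SKETCH_RELSEG_SIZE)
            ended = true;
    }
    return false;
}
```

With segments {4, missing, 2}, sketch_nblocks() reports 4 blocks and
nothing complains, while sketch_has_gap() does notice the stray third
segment, at the price of probing past the apparent end.
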
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company