Hi,
On 2024-05-16 15:01:31 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2024-05-16 14:50:50 -0400, Tom Lane wrote:
> >> The intention was certainly always that it be atomic. If it isn't
> >> we have got *big* trouble.
>
> > We unfortunately do *know* that on several systems e.g. basebackup can read a
> > partially written control file, while the control file is being
> > updated.
>
> Yeah, but can't we just retry that if we get a bad checksum?
Retry what/where precisely? We can avoid the issue in basebackup.c by taking
ControlFileLock at the right moment - but that doesn't address
pg_start/stop_backup based backups, where an external tool copies the file.
Hence the patch in the referenced thread, which moves to replacing the control
file via atomic rename while base backups are ongoing.
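
To make "replacing via atomic rename" concrete, here is a rough sketch of the
general write-temp-then-rename technique. It is not the patch from the
referenced thread and not PostgreSQL code: replace_file_atomically() and the
".tmp" suffix are made up for illustration, and it assumes a POSIX system where
rename() within one directory replaces the target atomically.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): install new contents for
 * dir/name so that a reader opening the file sees either the complete old
 * version or the complete new one. Returns 0 on success, -1 on failure.
 */
int
replace_file_atomically(const char *dir, const char *name,
                        const void *buf, size_t len)
{
    char        path[1024];
    char        tmppath[1024];
    const char *p = buf;
    size_t      left = len;
    int         fd;
    int         n;

    n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;
    n = snprintf(tmppath, sizeof(tmppath), "%s/%s.tmp", dir, name);
    if (n < 0 || (size_t) n >= sizeof(tmppath))
        return -1;

    /* 1. Write the full new contents to a temporary file. */
    fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    while (left > 0)
    {
        ssize_t     written = write(fd, p, left);

        if (written < 0)
        {
            if (errno == EINTR)
                continue;
            close(fd);
            unlink(tmppath);
            return -1;
        }
        p += written;
        left -= (size_t) written;
    }

    /* 2. Make the temporary file durable before it becomes visible. */
    if (fsync(fd) != 0)
    {
        close(fd);
        unlink(tmppath);
        return -1;
    }
    if (close(fd) != 0)
    {
        unlink(tmppath);
        return -1;
    }

    /* 3. Swap it into place; readers never observe a half-written file. */
    if (rename(tmppath, path) != 0)
    {
        unlink(tmppath);
        return -1;
    }

    /* 4. fsync the directory so the rename itself survives a crash. */
    fd = open(dir, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

The point is that a tool copying the file from outside the server needs no
lock and no retry logic here: whatever it opens is a complete file.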
> What had better be atomic is the write to disk.
That is still true to my knowledge.
> Systems that can't manage POSIX semantics for concurrent reads and writes
> are annoying, but not fatal ...
I think part of the issue is that people don't agree on what POSIX says about
a read that's concurrent with a write... See e.g.
https://utcc.utoronto.ca/~cks/space/blog/unix/WriteNotVeryAtomic
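
For a reader whose code we control, the "retry on bad checksum" idea could
look roughly like the sketch below. Everything in it is an assumption made for
illustration: DemoControlFile is a made-up layout, not the real pg_control
format, and demo_crc32c() is a slow bitwise CRC-32C written out only to keep
the example self-contained. It does nothing for a tool that copies the file
without validating it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout for this sketch: payload followed by its CRC. */
typedef struct DemoControlFile
{
    char        payload[64];
    uint32_t    crc;            /* CRC-32C over payload */
} DemoControlFile;

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t
demo_crc32c(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/*
 * Re-read the whole file until the stored CRC matches the payload, or until
 * max_attempts reads have failed. Returns 0 on a validated read, -1 otherwise.
 */
int
read_control_with_retry(const char *path, DemoControlFile *out,
                        int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++)
    {
        FILE       *f = fopen(path, "rb");
        size_t      n;

        if (f == NULL)
            return -1;
        n = fread(out, 1, sizeof(*out), f);
        fclose(f);

        if (n == sizeof(*out) &&
            demo_crc32c(out->payload, sizeof(out->payload)) == out->crc)
            return 0;

        /* Short or torn read: a real caller would likely pause, then retry. */
    }
    return -1;
}
```
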
Greetings,
Andres Freund