On Tue, Jan 27, 2015 at 10:16:48PM -0500, David Steele wrote:
> This is definitely an edge case. Not only does the file have to be
> modified in the same second *after* rsync has done the copy, but the
> file also has to not be modified in *any other subsequent second* before
> the next incremental backup. If the file is busy enough to have a
> collision with rsync in that second, then it is very likely to be
> modified before the next incremental backup which is generally a day or
> so later. And, of course, the backup where the issue occurs is fine -
> it's the next backup that is invalid.
>
> However, the hot/cold backup scheme as documented does make the race
> condition more likely since the two backups are done in close proximity
> temporally. Ultimately, the most reliable method is to use checksums.
>
> For me the biggest issue is that there is no way to discover if a db is
> consistent no matter how much time/resources you are willing to spend.
> I could live with the idea of the occasional bad backup (since I keep as
> many as possible), but having no way to know whether it is good or not
> is very frustrating. I know data checksums are a step in that
> direction, but they are a long way from providing the optimal solution.
> I've implemented rigorous checksums in PgBackRest but something closer
> to the source would be even better.

Agreed. I have updated the two mentions of rsync in our docs to clarify
this. Thank you.

The patch also includes pg_upgrade doc improvements suggested by comments
from Josh Berkus.

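
For anyone following along, the race David describes can be sketched as
below. This is only an illustration, not what rsync or PgBackRest actually
run: quick_check_same() approximates rsync's default size-plus-mtime
comparison at whole-second granularity, and the file names, the fixed
timestamp, and simulate_race() are all made up for the example.

```python
import hashlib
import os
import shutil
import tempfile


def quick_check_same(src, dst):
    """Approximate a size + mtime comparison at one-second granularity."""
    s, d = os.stat(src), os.stat(dst)
    return s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime)


def checksum_same(src, dst):
    """Compare file contents by SHA-256 digest."""
    def digest(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()
    return digest(src) == digest(dst)


def simulate_race():
    """Copy a file, then change it within the same second without
    changing its size. Returns (quick_check_result, checksum_result)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "relfile")          # hypothetical data file
        dst = os.path.join(tmp, "relfile.backup")   # hypothetical backup copy
        with open(src, "wb") as f:
            f.write(b"AAAA")
        os.utime(src, (1000, 1000))   # pin mtime to a fixed second
        shutil.copy2(src, dst)        # copies contents and mtime
        # Same-size modification landing in the same second as the copy.
        with open(src, "wb") as f:
            f.write(b"BBBB")
        os.utime(src, (1000, 1000))
        return quick_check_same(src, dst), checksum_same(src, dst)


if __name__ == "__main__":
    quick, checksum = simulate_race()
    print("size+mtime says unchanged:", quick)
    print("checksum says unchanged:  ", checksum)
```

The size-plus-mtime check reports the file as unchanged, so the next
incremental copy would skip it, while the checksum comparison notices the
difference. With rsync itself, the --checksum option is the way to request
a content comparison, at the cost of reading every file.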
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +