On Fri, Dec 21, 2012 at 2:28 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> When pg_basebackup copies data files, it does basically this:
>
>> if (lstat(pathbuf, &statbuf) != 0)
>> {
>>     if (errno != ENOENT)
>>         ereport(ERROR,
>>                 (errcode_for_file_access(),
>>                  errmsg("could not stat file or directory \"%s\": %m",
>>                         pathbuf)));
>>
>>     /* If the file went away while scanning, it's no error. */
>>     continue;
>> }
>
>> ...
>> sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf);
>
> There's a race condition there. If the file is removed after the lstat call,
> and before sendFile opens the file, the backup fails with an error. It's a
> fairly tight window, so it's difficult to run into by accident, but by
> putting a breakpoint with a debugger there it's quite easy to reproduce, by
> e.g. doing a VACUUM FULL on the table about to be copied.
>
> A straightforward fix is to allow sendFile() to ignore ENOENT. Patch
> attached.
Looks good to me. Nice spot - don't tell me you actually ran into it
during testing? :)
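
For anyone reading this without the patch open, the idea can be sketched
roughly as below. This is a standalone illustration only, not the attached
patch: send_file_sketch(), its missing_ok flag and its return codes are
hypothetical names made up here, and the real sendFile() reports errors
through ereport() rather than stderr.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for sendFile(), not the PostgreSQL function.
 *
 * Returns 1 if the file was read to the end, 0 if it had already been
 * removed and missing_ok was set, and -1 on any other failure.
 * *bytes_sent receives the number of bytes read.
 */
static int
send_file_sketch(const char *path, bool missing_ok, long *bytes_sent)
{
    char        buf[8192];
    ssize_t     n;
    int         fd;

    *bytes_sent = 0;

    fd = open(path, O_RDONLY);
    if (fd < 0)
    {
        /*
         * The file can vanish between the caller's lstat() and this
         * open(). With missing_ok set, treat that the same way the
         * lstat() check already treats ENOENT: skip the file.
         */
        if (errno == ENOENT && missing_ok)
            return 0;

        fprintf(stderr, "could not open file \"%s\": %s\n",
                path, strerror(errno));
        return -1;
    }

    /* Stand-in for streaming the contents to the client. */
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        *bytes_sent += n;

    close(fd);
    return (n < 0) ? -1 : 1;
}
```

A caller scanning a directory would pass missing_ok = true and simply move
on to the next entry when 0 comes back, so a file removed mid-scan is
handled the same way on both sides of the window.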
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/