Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file
Дата
Msg-id CAB7nPqSYZrd168BTg7uPDWKy3Y2+UGz=pC8jFzawSt54dvAzJw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file
Список pgsql-bugs
On Fri, Aug 19, 2016 at 4:15 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Aug 18, 2016 at 7:07 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> On Thu, Aug 18, 2016 at 2:25 AM, Michael Paquier <michael.paquier@gmail.com>
>> wrote:
>>> On Tue, Aug 16, 2016 at 4:56 PM, Magnus Hagander <magnus@hagander.net>
>>> wrote:
>>> > On Wed, Jul 13, 2016 at 3:10 AM, Michael Paquier
>>> > <michael.paquier@gmail.com>
>>> > wrote:
>> That's pretty much exactly what I said, isn't it? :)
>
> :p
>
>>> lstat() needs a similar treatment, see sendTablespace that calls it.
>>> port.h needs also to have its comments updated.
>>
>> port/win32.h has:
>> /*
>>  * Supplement to <sys/stat.h>.
>>  */
>> #define lstat(path, sb) stat((path), (sb))
>>
>> So I don't think we should need that, no?
>
> Right, this mapping may be caused by the fact that WIN32 uses junction
> points. Is that right?
>
>>> I think that your patch is definitely an improvement, and that we may
>>> have better backpatch it: any extension relying on those calls is
>>> going to fail similarly, so this gives an escape path. To be honest, I
>>> have not thought of GetLastError() and check that with
>>> STATUS_DELETE_PENDING, but that's definitely the right call to ignore
>>> a file that has this status instead of failing with EACCESS.
>>>
>>
>> So, thinking more about that.
>>
>> AllocateFile() calls fopen(). That one is only documented to set errno, and
>> might not always set the value used by GetLastError().
>>
>> Perhaps we should make AllocateFile() on win32 specifically check for
>> EACCESS, and in case we get that error then we do a GetFileAttributesEx() on
>> the same file and see if we get STATUS_DELETE_PENDING or not. If we get
>> STATUS_DELETE_PENDING we rewrite EACCESS to ENOENT and return, if we get
>> anything else we just let it through.
>
> That's a good idea.

So, working more on it I have not been able to reproduce the failure
reported even after running pg_basebackup in loop with a pgbench
running on a single client with this script to create a bunch of
relfilenodes:
insert into hoge values (1);
truncate hoge;
max_file_size and checkpoint_timeout were also set to minimum to
increase checkpoint frequency.

Anyway, I have spent some time eye-balling the patch proposed by
Magnus and I think that it is over-complicated. First I have noticed
here something:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms681382(v=vs.85).aspx
There is an error code ERROR_DELETE_PENDING that we can use to perform
the checks we want, and I am noticing that this is not getting listed
in win32error.c, so we could just map that with ENOENT and we are
basically done: AllocateFile() calls pgwin32_fopen() and stat() calls
win32_safestats, both of them finishing with _dosmaperr() to be sure
that errno is correctly set. This results in the simple patch
attached.

Takatsuka-san, could you try the patch attached and see if the failure
goes away?
--
Michael

Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: BUG #14291: Sequence ID gets modified even for "on conflict" update
Следующее
От: Magnus Hagander
Дата:
Сообщение: Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file