Обсуждение: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

Поиск
Список
Период
Сортировка

BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
harukat@sraoss.co.jp
Дата:
VGhlIGZvbGxvd2luZyBidWcgaGFzIGJlZW4gbG9nZ2VkIG9uIHRoZSB3ZWJz
aXRlOgoKQnVnIHJlZmVyZW5jZTogICAgICAxNDI0MwpMb2dnZWQgYnk6ICAg
ICAgICAgIFRBS0FUU1VLQSBIYXJ1a2EKRW1haWwgYWRkcmVzczogICAgICBo
YXJ1a2F0QHNyYW9zcy5jby5qcApQb3N0Z3JlU1FMIHZlcnNpb246IDkuNmJl
dGEyCk9wZXJhdGluZyBzeXN0ZW06ICAgV2luZG93cwpEZXNjcmlwdGlvbjog
ICAgICAgIAoKDQpwZ19iYXNlYmFja3VwIHNvbWV0aW1lcyBmYWlsZWQgb24g
V2luZG93cyBieSAiUGVybWlzc2lvbiBkZW5pZWQiLg0KVGhpcyBvY2N1cmVk
IGluIFBvc3RncmVTUUwgOS4xLnggYW5kIEkgcmVwcm9kdWNlZCBpdCBpbiA5
LjZiZXRhMi4NCg0KQzpcPnBnX2Jhc2ViYWNrdXAuZXhlIC1oIGxvY2FsaG9z
dCAtVSBwb3N0Z3JlcyANCiAtRCBDOlxkYXRcOTZkYXRwZ19iYXNlYmFja3Vw
OiBjb3VsZCBub3QgZ2V0IGJhY2t1cCBoZWFkZXI6DQogRVJST1I6ICBjb3Vs
ZCBub3Qgc3RhdCBmaWxlIG9yIGRpcmVjdG9yeSAiLi9iYXNlLzE2MzkzLzE2
NDQ0IjogDQogUGVybWlzc2lvbiBkZW5pZWQNCg0KQzpcUHJvZ3JhbSBGaWxl
c1xQb3N0Z3JlU1FMXDkuNlxkYXRhXGJhc2VcMTYzOTM+ZGlyIC9xIDE2NDQ0
DQoNCiBDOlxQcm9ncmFtIEZpbGVzXFBvc3RncmVTUUxcOS42XGRhdGFcYmFz
ZVwxNjM5MyBkaXJlY3RvcnkNCg0KIDIwMTYvMDcvMTIgIDE1OjU2ICAgICAg
ICAgICAgIDAgLi4uICAgICAgICAxNjQ0NA0KDQpEaXIgY29tbWFuZCBzYXlz
IHRoZSBmaWxlIDE2NDQ0J3Mgb3duZXIgaXMgJy4uLicuDQpJdCBtZWFucyBm
aWxlIDE2NDQ0IGlzIGluIFNUQVRVU19ERUxFVEVfUEVORElORy4NClRoZSBv
dGhlciBwcm9jZXNzZXMgY2Fubm90IG9wZW4gaXQsIHNvIGl0IGNhbg0KY2F1
c2UgcGdfYmFzZWJhY2t1cCB0byBmYWlsLg0KDQpJIHRoaW5rIHBnX2Jhc2Vi
YWNrdXAgc2hvdWxkIGlnbm9yZSBmaWxlcyB3aGljaCBpcyBpbiANClNUQVRV
U19ERUxFVEVfUEVORElORy4gQnV0IEkgY2Fubm90IHlldCBmaW5kIGFuIGVh
c3kNCm1ldGhvZCB0byByZWFsaXplIGl0Lg0KCgo=

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Amit Kapila
Дата:
On Tue, Jul 12, 2016 at 2:02 PM,  <harukat@sraoss.co.jp> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      14243
> Logged by:          TAKATSUKA Haruka
> Email address:      harukat@sraoss.co.jp
> PostgreSQL version: 9.6beta2
> Operating system:   Windows
> Description:
>
>
> pg_basebackup sometimes failed on Windows by "Permission denied".
> This occured in PostgreSQL 9.1.x and I reproduced it in 9.6beta2.
>
> C:\>pg_basebackup.exe -h localhost -U postgres
>  -D C:\dat\96datpg_basebackup: could not get backup header:
>  ERROR:  could not stat file or directory "./base/16393/16444":
>  Permission denied
>
> C:\Program Files\PostgreSQL\9.6\data\base\16393>dir /q 16444
>
>  C:\Program Files\PostgreSQL\9.6\data\base\16393 directory
>
>  2016/07/12  15:56             0 ...        16444
>
> Dir command says the file 16444's owner is '...'.
> It means file 16444 is in STATUS_DELETE_PENDING.
> The other processes cannot open it, so it can
> cause pg_basebackup to fail.
>
> I think pg_basebackup should ignore files which is in
> STATUS_DELETE_PENDING. But I cannot yet find an easy
> method to realize it.
>

pg_basebackup tries to backup all the files in data directory and
tablespaces.  Refer Notes section of pg_basebackup documentation.  I
think if it decides to skip a file with some bad status, it might also
lead to a corrupt backup.  I wonder how the file has such a status?
From the name, it looks to be a valid database file.

[1] - https://www.postgresql.org/docs/devel/static/app-pgbasebackup.html

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Alvaro Herrera
Дата:
Amit Kapila wrote:
> On Tue, Jul 12, 2016 at 2:02 PM,  <harukat@sraoss.co.jp> wrote:

> > C:\Program Files\PostgreSQL\9.6\data\base\16393>dir /q 16444
> >
> >  C:\Program Files\PostgreSQL\9.6\data\base\16393 directory
> >
> >  2016/07/12  15:56             0 ...        16444
> >
> > Dir command says the file 16444's owner is '...'.
> > It means file 16444 is in STATUS_DELETE_PENDING.

> pg_basebackup tries to backup all the files in data directory and
> tablespaces.  Refer Notes section of pg_basebackup documentation.  I
> think if it decides to skip a file with some bad status, it might also
> lead to a corrupt backup.  I wonder how the file has such a status?
> From the name, it looks to be a valid database file.

This file is probably being deleted by a checkpoint after some user
transaction marked it for removal (either because the relation was
dropped, or because it got a new relfilenode).  I would say that the
file is not needed for the backup after all and pg_basebackup should
just ignore it.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> On Tue, Jul 12, 2016 at 2:02 PM,  <harukat@sraoss.co.jp> wrote:
>>> It means file 16444 is in STATUS_DELETE_PENDING.

> This file is probably being deleted by a checkpoint after some user
> transaction marked it for removal (either because the relation was
> dropped, or because it got a new relfilenode).  I would say that the
> file is not needed for the backup after all and pg_basebackup should
> just ignore it.

Yeah.  The question is how do we distinguish that from cases that
are less benign.

            regards, tom lane

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Wed, Jul 13, 2016 at 8:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>>> On Tue, Jul 12, 2016 at 2:02 PM,  <harukat@sraoss.co.jp> wrote:
>>>> It means file 16444 is in STATUS_DELETE_PENDING.
>
>> This file is probably being deleted by a checkpoint after some user
>> transaction marked it for removal (either because the relation was
>> dropped, or because it got a new relfilenode).  I would say that the
>> file is not needed for the backup after all and pg_basebackup should
>> just ignore it.
>
> Yeah.  The question is how do we distinguish that from cases that
> are less benign.

Indeed, recovery will be able to handle that case correctly, and the
file should be ignored in the base backup. It seems that
STATUS_DELETE_PENDING has always mapped to ERROR_ACCESS_DENIED in
NTSTATUS [1], which is definitely not the brightest idea ever because
this does not allow us to check if the access to a path is really
denied because of permissions.

Checking directly for STATUS_DELETE_PENDING would be the way to go,
likely with tweaks in basebackup.c. but like Takatsuka-san, I cannot
find anything around that would allow us to check for that.

[1]:
http://stackoverflow.com/questions/6680491/why-does-windows-return-error-access-denied-when-i-try-to-open-a-delete-pended-f
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Wed, Jul 13, 2016 at 10:06 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Checking directly for STATUS_DELETE_PENDING would be the way to go,
> likely with tweaks in basebackup.c. but like Takatsuka-san, I cannot
> find anything around that would allow us to check for that.
>
> [1]:
http://stackoverflow.com/questions/6680491/why-does-windows-return-error-access-denied-when-i-try-to-open-a-delete-pended-f

Actually we may want to tweak stat() to not complain for a file with
such a state.
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
TAKATSUKA Haruka
Дата:
I study this a little more, but I cannot find the way to distinguish
STATUS_DELETE_PENDING state from the other state that causes permission
denied error.

Cmd.exe's "dir /q" may distinguish these, but I cannot know which API
is used.


-----------------
Haruka Takatsuka
harukat@sraoss.co.jp


On Wed, 13 Jul 2016 10:10:37 +0900
Michael Paquier <michael.paquier@gmail.com> wrote:

> On Wed, Jul 13, 2016 at 10:06 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > Checking directly for STATUS_DELETE_PENDING would be the way to go,
> > likely with tweaks in basebackup.c. but like Takatsuka-san, I cannot
> > find anything around that would allow us to check for that.
> >
> > [1]:
http://stackoverflow.com/questions/6680491/why-does-windows-return-error-access-denied-when-i-try-to-open-a-delete-pended-f
>
> Actually we may want to tweak stat() to not complain for a file with
> such a state.
> --
> Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Tatsuo Ishii
Дата:
[Sorry for hanging to wrong message but I cannot get any email from
the list since this morning and I have to create a email from the
archive (posting is ok using the same email account. See my From:).]

Since no one came up with any useful solution for this soon, I think
I need to add this to the TODO list unless there's an objection.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> I study this a little more, but I cannot find the way to distinguish
> STATUS_DELETE_PENDING state from the other state that causes permission
> denied error.
>
> Cmd.exe's "dir /q" may distinguish these, but I cannot know which API
> is used.
>
>
> -----------------
> Haruka Takatsuka
> harukat(at)sraoss(dot)co(dot)jp
> > On Wed, Jul 13, 2016 at 10:06 AM, Michael Paquier
> > <michael.paquier@gmail.com> wrote:
> >> Checking directly for STATUS_DELETE_PENDING would be the way to go,
> >> likely with tweaks in basebackup.c. but like Takatsuka-san, I cannot
> >> find anything around that would allow us to check for that.
> >>
> >> [1]:
http://stackoverflow.com/questions/6680491/why-does-windows-return-error-access-denied-when-i-try-to-open-a-delete-pended-f
> >
> > Actually we may want to tweak stat() to not complain for a file with
> > such a state.
> > --
> > Michael
> >
> >
> > --
> > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-bugs
>

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Wed, Aug 3, 2016 at 2:33 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
> [Sorry for hanging to wrong message but I cannot get any email from
> the list since this morning and I have to create a email from the
> archive (posting is ok using the same email account. See my From:).]
>
> Since no one came up with any useful solution for this soon, I think
> I need to add this to the TODO list unless there's an objection.

The last time I looked for for a way to make the difference between a
permission error and a pending deletion status, I bumped into the
following tool PendMove:
https://technet.microsoft.com/en-us/sysinternals/bb897556.aspx

Well, it is not useful in itself, but, as my colleague Nikhil
Deshpande has pointed out, it becomes more interesting if combined
with ProcessExplorer to see what system calls it does:
https://technet.microsoft.com/en-us/sysinternals/processexplorer.aspx
This may allow us to find a way what are the system APIs called to
make the difference and then use them.

I don't make a priority of this item, but, Takatsuka-san, that's an
idea to dig for I think.
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Magnus Hagander
Дата:
On Wed, Jul 13, 2016 at 3:10 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Jul 13, 2016 at 10:06 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> Checking directly for STATUS_DELETE_PENDING would be the way to go,
> likely with tweaks in basebackup.c. but like Takatsuka-san, I cannot
> find anything around that would allow us to check for that.
>
> [1]: http://stackoverflow.com/questions/6680491/why-does-windows-return-error-access-denied-when-i-try-to-open-a-delete-pended-f

Actually we may want to tweak stat() to not complain for a file with
such a state.


That will still leave is with a race condition won't it? It'll just be much smaller? The file could go into STATUS_DELETE_PENDING between the call to stat() and the call to sendFile()/allocateFile(). 

Maybe that's an acceptable difference? If not, we also need to teach AllocateFile() about the error.

If we're happy with just stat, it looks like we will have to add it to pgwin32_safestat().

I don' t have a working window sbox for testing ATM (yeah yeah, it's on my magic todo), but can someone confirm if calling stat() on such a deleted flie sets the errorcode so it can be checked with GetLastError() *as well* as setting errno? IIRC those functions do, but it needs to be checked. If so, the attached might work? If not, then we need an extra call to GetFileAttributesEx(), or rearrange those calls in a way to trap errors there.


--
Вложения

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Andres Freund
Дата:
On 2016-07-12 19:05:16 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> >> On Tue, Jul 12, 2016 at 2:02 PM,  <harukat@sraoss.co.jp> wrote:
> >>> It means file 16444 is in STATUS_DELETE_PENDING.
>
> > This file is probably being deleted by a checkpoint after some user
> > transaction marked it for removal (either because the relation was
> > dropped, or because it got a new relfilenode).  I would say that the
> > file is not needed for the backup after all and pg_basebackup should
> > just ignore it.
>
> Yeah.  The question is how do we distinguish that from cases that
> are less benign.

One approach would be to rename the file into something indicating it's
being deleted, before actually deleting it.

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Magnus Hagander
Дата:
On Tue, Aug 16, 2016 at 9:54 PM, Andres Freund <andres@anarazel.de> wrote:

> On 2016-07-12 19:05:16 -0400, Tom Lane wrote:
> > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > >> On Tue, Jul 12, 2016 at 2:02 PM,  <harukat@sraoss.co.jp> wrote:
> > >>> It means file 16444 is in STATUS_DELETE_PENDING.
> >
> > > This file is probably being deleted by a checkpoint after some user
> > > transaction marked it for removal (either because the relation was
> > > dropped, or because it got a new relfilenode).  I would say that the
> > > file is not needed for the backup after all and pg_basebackup should
> > > just ignore it.
> >
> > Yeah.  The question is how do we distinguish that from cases that
> > are less benign.
>
> One approach would be to rename the file into something indicating it's
> being deleted, before actually deleting it.
>
>
That would be an option (IIRC if you open with FILE_SHARE_DELETE, it can
also be renamed). But if we can track the delete when we try to open it and
just treat it as "file does not exist", that seems cleaner.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Andres Freund
Дата:
On 2016-08-17 12:19:31 +0200, Magnus Hagander wrote:
> On Tue, Aug 16, 2016 at 9:54 PM, Andres Freund <andres@anarazel.de> wrote:
>
> > On 2016-07-12 19:05:16 -0400, Tom Lane wrote:
> > > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > > >> On Tue, Jul 12, 2016 at 2:02 PM,  <harukat@sraoss.co.jp> wrote:
> > > >>> It means file 16444 is in STATUS_DELETE_PENDING.
> > >
> > > > This file is probably being deleted by a checkpoint after some user
> > > > transaction marked it for removal (either because the relation was
> > > > dropped, or because it got a new relfilenode).  I would say that the
> > > > file is not needed for the backup after all and pg_basebackup should
> > > > just ignore it.
> > >
> > > Yeah.  The question is how do we distinguish that from cases that
> > > are less benign.
> >
> > One approach would be to rename the file into something indicating it's
> > being deleted, before actually deleting it.
> >
> >
> That would be an option (IIRC if you open with FILE_SHARE_DELETE, it can
> also be renamed). But if we can track the delete when we try to open it and
> just treat it as "file does not exist", that seems cleaner.

I'm not sure what exactly you're suggesting. But isn't there the issue
that such an approach will not interoperate with external tools?

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Magnus Hagander
Дата:
On Wed, Aug 17, 2016 at 5:41 PM, Andres Freund <andres@anarazel.de> wrote:

> On 2016-08-17 12:19:31 +0200, Magnus Hagander wrote:
> > On Tue, Aug 16, 2016 at 9:54 PM, Andres Freund <andres@anarazel.de>
> wrote:
> >
> > > On 2016-07-12 19:05:16 -0400, Tom Lane wrote:
> > > > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > > > >> On Tue, Jul 12, 2016 at 2:02 PM,  <harukat@sraoss.co.jp> wrote:
> > > > >>> It means file 16444 is in STATUS_DELETE_PENDING.
> > > >
> > > > > This file is probably being deleted by a checkpoint after some user
> > > > > transaction marked it for removal (either because the relation was
> > > > > dropped, or because it got a new relfilenode).  I would say that
> the
> > > > > file is not needed for the backup after all and pg_basebackup
> should
> > > > > just ignore it.
> > > >
> > > > Yeah.  The question is how do we distinguish that from cases that
> > > > are less benign.
> > >
> > > One approach would be to rename the file into something indicating it's
> > > being deleted, before actually deleting it.
> > >
> > >
> > That would be an option (IIRC if you open with FILE_SHARE_DELETE, it can
> > also be renamed). But if we can track the delete when we try to open it
> and
> > just treat it as "file does not exist", that seems cleaner.
>
> I'm not sure what exactly you're suggesting. But isn't there the issue
> that such an approach will not interoperate with external tools?
>

I'll have to admit I wasn't thinking of external tools. The external tools
would have to learn about pending-delete files themselves. So yeah, if we
care about those then renaming them to something predictable and then
instruct the third party tools to exclude such files would be a more
complete fix.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Tue, Aug 16, 2016 at 4:56 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Wed, Jul 13, 2016 at 3:10 AM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>>
>> On Wed, Jul 13, 2016 at 10:06 AM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>> > Checking directly for STATUS_DELETE_PENDING would be the way to go,
>> > likely with tweaks in basebackup.c. but like Takatsuka-san, I cannot
>> > find anything around that would allow us to check for that.
>> >
>> > [1]:
>> >
http://stackoverflow.com/questions/6680491/why-does-windows-return-error-access-denied-when-i-try-to-open-a-delete-pended-f
>>
>> Actually we may want to tweak stat() to not complain for a file with
>> such a state.
>>
>
> That will still leave is with a race condition won't it? It'll just be much
> smaller? The file could go into STATUS_DELETE_PENDING between the call to
> stat() and the call to sendFile()/allocateFile().

That would only reduce the window. If the file is marked as ready for
deletion after checking with it for stat() and scanned afterwards
we're out, we'd still get a similar error afterwards.

> Maybe that's an acceptable difference? If not, we also need to teach
> AllocateFile() about the error.
>
> If we're happy with just stat, it looks like we will have to add it to
> pgwin32_safestat().

lstat() needs a similar treatment, see sendTablespace that calls it.
port.h needs also to have its comments updated.

> I don' t have a working window sbox for testing ATM (yeah yeah, it's on my
> magic todo), but can someone confirm if calling stat() on such a deleted
> flie sets the errorcode so it can be checked with GetLastError() *as well*
> as setting errno? IIRC those functions do, but it needs to be checked. If
> so, the attached might work? If not, then we need an extra call to
> GetFileAttributesEx(), or rearrange those calls in a way to trap errors
> there.

I think that your patch is definitely an improvement, and that we may
have better backpatch it: any extension relying on those calls is
going to fail similarly, so this gives an escape path. To be honest, I
have not thought of GetLastError() and check that with
STATUS_DELETE_PENDING, but that's definitely the right call to ignore
a file that has this status instead of failing with EACCESS.
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Magnus Hagander
Дата:
On Thu, Aug 18, 2016 at 2:25 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:

> On Tue, Aug 16, 2016 at 4:56 PM, Magnus Hagander <magnus@hagander.net>
> wrote:
> > On Wed, Jul 13, 2016 at 3:10 AM, Michael Paquier <
> michael.paquier@gmail.com>
> > wrote:
> >>
> >> On Wed, Jul 13, 2016 at 10:06 AM, Michael Paquier
> >> <michael.paquier@gmail.com> wrote:
> >> > Checking directly for STATUS_DELETE_PENDING would be the way to go,
> >> > likely with tweaks in basebackup.c. but like Takatsuka-san, I cannot
> >> > find anything around that would allow us to check for that.
> >> >
> >> > [1]:
> >> > http://stackoverflow.com/questions/6680491/why-does-
> windows-return-error-access-denied-when-i-try-to-open-a-delete-pended-f
> >>
> >> Actually we may want to tweak stat() to not complain for a file with
> >> such a state.
> >>
> >
> > That will still leave is with a race condition won't it? It'll just be
> much
> > smaller? The file could go into STATUS_DELETE_PENDING between the call to
> > stat() and the call to sendFile()/allocateFile().
>
> That would only reduce the window. If the file is marked as ready for
> deletion after checking with it for stat() and scanned afterwards
> we're out, we'd still get a similar error afterwards.
>

That's pretty much exactly what I said, isn't it? :)



> > Maybe that's an acceptable difference? If not, we also need to teach
> > AllocateFile() about the error.
> >
> > If we're happy with just stat, it looks like we will have to add it to
> > pgwin32_safestat().
>
> lstat() needs a similar treatment, see sendTablespace that calls it.
> port.h needs also to have its comments updated.
>

port/win32.h has:
/*
 * Supplement to <sys/stat.h>.
 */
#define lstat(path, sb) stat((path), (sb))


So I don't think we should need that, no?



>
> > I don' t have a working window sbox for testing ATM (yeah yeah, it's on
> my
> > magic todo), but can someone confirm if calling stat() on such a deleted
> > flie sets the errorcode so it can be checked with GetLastError() *as
> well*
> > as setting errno? IIRC those functions do, but it needs to be checked. If
> > so, the attached might work? If not, then we need an extra call to
> > GetFileAttributesEx(), or rearrange those calls in a way to trap errors
> > there.
>
> I think that your patch is definitely an improvement, and that we may
> have better backpatch it: any extension relying on those calls is
> going to fail similarly, so this gives an escape path. To be honest, I
> have not thought of GetLastError() and check that with
> STATUS_DELETE_PENDING, but that's definitely the right call to ignore
> a file that has this status instead of failing with EACCESS.
>
>
So, thinking more about that.

AllocateFile() calls fopen(). That one is only documented to set errno, and
might not always set the value used by GetLastError().

Perhaps we should make AllocateFile() on win32 specifically check for
EACCESS, and in case we get that error then we do a GetFileAttributesEx()
on the same file and see if we get STATUS_DELETE_PENDING or not. If we get
STATUS_DELETE_PENDING we rewrite EACCESS to ENOENT and return, if we get
anything else we just let it through.

The race then is if the file "expires", is deleted *and* we create a new
file in between. In that case, we would still return EACCESS and not open
the new file. But that seems like a very narrow case.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Thu, Aug 18, 2016 at 7:07 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Aug 18, 2016 at 2:25 AM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>> On Tue, Aug 16, 2016 at 4:56 PM, Magnus Hagander <magnus@hagander.net>
>> wrote:
>> > On Wed, Jul 13, 2016 at 3:10 AM, Michael Paquier
>> > <michael.paquier@gmail.com>
>> > wrote:
> That's pretty much exactly what I said, isn't it? :)

:p

>> lstat() needs a similar treatment, see sendTablespace that calls it.
>> port.h needs also to have its comments updated.
>
> port/win32.h has:
> /*
>  * Supplement to <sys/stat.h>.
>  */
> #define lstat(path, sb) stat((path), (sb))
>
> So I don't think we should need that, no?

Right, this mapping may be caused by the fact that WIN32 uses junction
points. Is that right?

>> I think that your patch is definitely an improvement, and that we may
>> have better backpatch it: any extension relying on those calls is
>> going to fail similarly, so this gives an escape path. To be honest, I
>> have not thought of GetLastError() and check that with
>> STATUS_DELETE_PENDING, but that's definitely the right call to ignore
>> a file that has this status instead of failing with EACCESS.
>>
>
> So, thinking more about that.
>
> AllocateFile() calls fopen(). That one is only documented to set errno, and
> might not always set the value used by GetLastError().
>
> Perhaps we should make AllocateFile() on win32 specifically check for
> EACCESS, and in case we get that error then we do a GetFileAttributesEx() on
> the same file and see if we get STATUS_DELETE_PENDING or not. If we get
> STATUS_DELETE_PENDING we rewrite EACCESS to ENOENT and return, if we get
> anything else we just let it through.

That's a good idea.

> The race then is if the file "expires", is deleted *and* we create a new
> file in between. In that case, we would still return EACCESS and not open
> the new file. But that seems like a very narrow case.

You mean a file with the same name that gets re-created? Nah, it's not
worth worrying about that. The original complaint here was about a
relfilenode that a checkpoint has removed after it got likely
truncated by a backend and those are unique.
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Fri, Aug 19, 2016 at 4:15 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Aug 18, 2016 at 7:07 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> On Thu, Aug 18, 2016 at 2:25 AM, Michael Paquier <michael.paquier@gmail.com>
>> wrote:
>>> On Tue, Aug 16, 2016 at 4:56 PM, Magnus Hagander <magnus@hagander.net>
>>> wrote:
>>> > On Wed, Jul 13, 2016 at 3:10 AM, Michael Paquier
>>> > <michael.paquier@gmail.com>
>>> > wrote:
>> That's pretty much exactly what I said, isn't it? :)
>
> :p
>
>>> lstat() needs a similar treatment, see sendTablespace that calls it.
>>> port.h needs also to have its comments updated.
>>
>> port/win32.h has:
>> /*
>>  * Supplement to <sys/stat.h>.
>>  */
>> #define lstat(path, sb) stat((path), (sb))
>>
>> So I don't think we should need that, no?
>
> Right, this mapping may be caused by the fact that WIN32 uses junction
> points. Is that right?
>
>>> I think that your patch is definitely an improvement, and that we may
>>> have better backpatch it: any extension relying on those calls is
>>> going to fail similarly, so this gives an escape path. To be honest, I
>>> have not thought of GetLastError() and check that with
>>> STATUS_DELETE_PENDING, but that's definitely the right call to ignore
>>> a file that has this status instead of failing with EACCESS.
>>>
>>
>> So, thinking more about that.
>>
>> AllocateFile() calls fopen(). That one is only documented to set errno, and
>> might not always set the value used by GetLastError().
>>
>> Perhaps we should make AllocateFile() on win32 specifically check for
>> EACCESS, and in case we get that error then we do a GetFileAttributesEx() on
>> the same file and see if we get STATUS_DELETE_PENDING or not. If we get
>> STATUS_DELETE_PENDING we rewrite EACCESS to ENOENT and return, if we get
>> anything else we just let it through.
>
> That's a good idea.

So, working more on it I have not been able to reproduce the failure
reported even after running pg_basebackup in loop with a pgbench
running on a single client with this script to create a bunch of
relfilenodes:
insert into hoge values (1);
truncate hoge;
max_file_size and checkpoint_timeout were also set to minimum to
increase checkpoint frequency.

Anyway, I have spent some time eye-balling the patch proposed by
Magnus and I think that it is over-complicated. First I have noticed
here something:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms681382(v=vs.85).aspx
There is an error code ERROR_DELETE_PENDING that we can use to perform
the checks we want, and I am noticing that this is not getting listed
in win32error.c, so we could just map that with ENOENT and we are
basically done: AllocateFile() calls pgwin32_fopen() and stat() calls
win32_safestats, both of them finishing with _dosmaperr() to be sure
that errno is correctly set. This results in the simple patch
attached.

Takatsuka-san, could you try the patch attached and see if the failure
goes away?
--
Michael

Вложения

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Magnus Hagander
Дата:
On Mon, Aug 22, 2016 at 3:13 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:

> On Fri, Aug 19, 2016 at 4:15 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > On Thu, Aug 18, 2016 at 7:07 PM, Magnus Hagander <magnus@hagander.net>
> wrote:
> >> On Thu, Aug 18, 2016 at 2:25 AM, Michael Paquier <
> michael.paquier@gmail.com>
> >> wrote:
> >>> On Tue, Aug 16, 2016 at 4:56 PM, Magnus Hagander <magnus@hagander.net>
> >>> wrote:
> >>> > On Wed, Jul 13, 2016 at 3:10 AM, Michael Paquier
> >>> > <michael.paquier@gmail.com>
> >>> > wrote:
> >> That's pretty much exactly what I said, isn't it? :)
> >
> > :p
> >
> >>> lstat() needs a similar treatment, see sendTablespace that calls it.
> >>> port.h needs also to have its comments updated.
> >>
> >> port/win32.h has:
> >> /*
> >>  * Supplement to <sys/stat.h>.
> >>  */
> >> #define lstat(path, sb) stat((path), (sb))
> >>
> >> So I don't think we should need that, no?
> >
> > Right, this mapping may be caused by the fact that WIN32 uses junction
> > points. Is that right?
> >
> >>> I think that your patch is definitely an improvement, and that we may
> >>> have better backpatch it: any extension relying on those calls is
> >>> going to fail similarly, so this gives an escape path. To be honest, I
> >>> have not thought of GetLastError() and check that with
> >>> STATUS_DELETE_PENDING, but that's definitely the right call to ignore
> >>> a file that has this status instead of failing with EACCESS.
> >>>
> >>
> >> So, thinking more about that.
> >>
> >> AllocateFile() calls fopen(). That one is only documented to set errno,
> and
> >> might not always set the value used by GetLastError().
> >>
> >> Perhaps we should make AllocateFile() on win32 specifically check for
> >> EACCESS, and in case we get that error then we do a
> GetFileAttributesEx() on
> >> the same file and see if we get STATUS_DELETE_PENDING or not. If we get
> >> STATUS_DELETE_PENDING we rewrite EACCESS to ENOENT and return, if we get
> >> anything else we just let it through.
> >
> > That's a good idea.
>
> So, working more on it I have not been able to reproduce the failure
> reported even after running pg_basebackup in loop with a pgbench
> running on a single client with this script to create a bunch of
> relfilenodes:
> insert into hoge values (1);
> truncate hoge;
> max_file_size and checkpoint_timeout were also set to minimum to
> increase checkpoint frequency.
>
> Anyway, I have spent some time eye-balling the patch proposed by
> Magnus and I think that it is over-complicated. First I have noticed
> here something:
> https://msdn.microsoft.com/en-us/library/windows/desktop/
> ms681382(v=vs.85).aspx
> There is an error code ERROR_DELETE_PENDING that we can use to perform
> the checks we want, and I am noticing that this is not getting listed
> in win32error.c, so we could just map that with ENOENT and we are
> basically done: AllocateFile() calls pgwin32_fopen() and stat() calls
> win32_safestats, both of them finishing with _dosmaperr() to be sure
> that errno is correctly set. This results in the simple patch
> attached.
>

Not having looked in detail, but in pgwin32_safestat(), if the stat() call
fails, we return immediately without calling _dosmaperr(), don't we? So
we're still going to error out there with whatever the default mapping is,
and that's access denied.

It's only if the the stat() call succeeds but the getting of extended
attributes fail that we actually call _dosmaperr().

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Mon, Aug 22, 2016 at 10:17 PM, Magnus Hagander <magnus@hagander.net> wrote:
> Not having looked in detail, but in pgwin32_safestat(), if the stat() call
> fails, we return immediately without calling _dosmaperr(), don't we? So
> we're still going to error out there with whatever the default mapping is,
> and that's access denied.
>
> It's only if the the stat() call succeeds but the getting of extended
> attributes fail that we actually call _dosmaperr().

Meh, you're right. How stupid I am here. So we could just reuse the
first block of your patch when checking for (r < 0), but drop the
second part that complicates GetFileAttributesEx if win32error.c gets
completed for _dosmaperr as my last patch does, right?

By the way, in your patch you really need to
s/STATUS_DELETE_PENDING/ERROR_DELETE_PENDING or compilation just
fails.
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Magnus Hagander
Дата:
On Mon, Aug 22, 2016 at 3:42 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:

> On Mon, Aug 22, 2016 at 10:17 PM, Magnus Hagander <magnus@hagander.net>
> wrote:
> > Not having looked in detail, but in pgwin32_safestat(), if the stat()
> call
> > fails, we return immediately without calling _dosmaperr(), don't we? So
> > we're still going to error out there with whatever the default mapping
> is,
> > and that's access denied.
> >
> > It's only if the the stat() call succeeds but the getting of extended
> > attributes fail that we actually call _dosmaperr().
>
> Meh, you're right. How stupid I am here. So we could just reuse the
> first block of your patch when checking for (r < 0), but drop the
> second part that complicates GetFileAttributesEx if win32error.c gets
> completed for _dosmaperr as my last patch does, right?
>

Yeah, that seems correct. And I agree that it's cleaner to do it in
_dosmaperr() than to do it in the individual locations.



> By the way, in your patch you really need to
> s/STATUS_DELETE_PENDING/ERROR_DELETE_PENDING or compilation just
> fails.
>

Hah. Yeah, that patch was coded-without-compiling as a poc of how to do it.
Would've definitely needed a round of testing before applying :)


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Mon, Aug 22, 2016 at 10:48 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Mon, Aug 22, 2016 at 3:42 PM, Michael Paquier <michael.paquier@gmail.com>
> wrote:
>> By the way, in your patch you really need to
>> s/STATUS_DELETE_PENDING/ERROR_DELETE_PENDING or compilation just
>> fails.
>
> Hah. Yeah, that patch was coded-without-compiling as a poc of how to do it.
> Would've definitely needed a round of testing before applying :)

As long as I look at that, it gives the attached.

Takatsuka-san, others, it would be great if you could test this patch
and see if the problem goes away as I have not been able to reproduce
the bug myself..
--
Michael

Вложения

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Mon, Aug 22, 2016 at 11:00 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Aug 22, 2016 at 10:48 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> On Mon, Aug 22, 2016 at 3:42 PM, Michael Paquier <michael.paquier@gmail.com>
>> wrote:
>>> By the way, in your patch you really need to
>>> s/STATUS_DELETE_PENDING/ERROR_DELETE_PENDING or compilation just
>>> fails.
>>
>> Hah. Yeah, that patch was coded-without-compiling as a poc of how to do it.
>> Would've definitely needed a round of testing before applying :)
>
> As long as I look at that, it gives the attached.
>
> Takatsuka-san, others, it would be great if you could test this patch
> and see if the problem goes away as I have not been able to reproduce
> the bug myself..

I have added an entry in the CF for that:
https://commitfest.postgresql.org/10/737/
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Wed, Aug 24, 2016 at 9:09 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Aug 22, 2016 at 11:00 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Mon, Aug 22, 2016 at 10:48 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>> On Mon, Aug 22, 2016 at 3:42 PM, Michael Paquier <michael.paquier@gmail.com>
>>> wrote:
>>>> By the way, in your patch you really need to
>>>> s/STATUS_DELETE_PENDING/ERROR_DELETE_PENDING or compilation just
>>>> fails.
>>>
>>> Hah. Yeah, that patch was coded-without-compiling as a poc of how to do it.
>>> Would've definitely needed a round of testing before applying :)
>>
>> As long as I look at that, it gives the attached.
>>
>> Takatsuka-san, others, it would be great if you could test this patch
>> and see if the problem goes away as I have not been able to reproduce
>> the bug myself..
>
> I have added an entry in the CF for that:
> https://commitfest.postgresql.org/10/737/

I gave this bug another try and let Windows run for some time but I
cannot reproduce the original failure even with manual checkpointing
and aggressive checkpoint_timeout, while truncating relations heavily
with pgbench to enforce the deletion of relfilenodes. To all, do you
think that having a large relfilenode file matters to trigger this
issue?

Could it be possible to get more testing as well? We won't go far as
long as we have not checked that the issue gets fixed by the proposed
patch in a place where the problem is able to show up.
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Wed, Sep 21, 2016 at 1:17 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> I gave this bug another try and let Windows run for some time but I
> cannot reproduce the original failure even with manual checkpointing
> and aggressive checkpoint_timeout, while truncating relations heavily
> with pgbench to enforce the deletion of relfilenodes. To all, do you
> think that having a large relfilenode file matters to trigger this
> issue?
>
> Could it be possible to get more testing as well? We won't go far as
> long as we have not checked that the issue gets fixed by the proposed
> patch in a place where the problem is able to show up.

I still cannot reproduce the original problem, so there is not much I
can do to validate the patch.. If someone has ways to check that the
fix works as expected that would be nice. For now I am marking this
patch as returned with feedback in the CF app until this happens.
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Magnus Hagander
Дата:
On Sep 29, 2016 9:31 AM, "Michael Paquier" <michael.paquier@gmail.com>
wrote:
>
> On Wed, Sep 21, 2016 at 1:17 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > I gave this bug another try and let Windows run for some time but I
> > cannot reproduce the original failure even with manual checkpointing
> > and aggressive checkpoint_timeout, while truncating relations heavily
> > with pgbench to enforce the deletion of relfilenodes. To all, do you
> > think that having a large relfilenode file matters to trigger this
> > issue?
> >
> > Could it be possible to get more testing as well? We won't go far as
> > long as we have not checked that the issue gets fixed by the proposed
> > patch in a place where the problem is able to show up.
>
> I still cannot reproduce the original problem, so there is not much I
> can do to validate the patch.. If someone has ways to check that the
> fix works as expected that would be nice. For now I am marking this
> patch as returned with feedback in the CF app until this happens.

I agree with not committing it without testing, but I think returned with
feedback will almost guarantee there won't be any in the future. And it's a
bug we should definitely try to get fixed..

Maybe it's a better choice to bounce it to the next commitfest to keep up
the visibility?

/Magnus

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Thu, Sep 29, 2016 at 4:38 PM, Magnus Hagander <magnus@hagander.net> wrote:
> Maybe it's a better choice to bounce it to the next commitfest to keep up
> the visibility?

OK, done this way then.
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Alvaro Herrera
Дата:
Michael Paquier wrote:

> I gave this bug another try and let Windows run for some time but I
> cannot reproduce the original failure even with manual checkpointing
> and aggressive checkpoint_timeout, while truncating relations heavily
> with pgbench to enforce the deletion of relfilenodes. To all, do you
> think that having a large relfilenode file matters to trigger this
> issue?

Hmm, perhaps with a larger file the OS spends more time shredding the
file or something like that?  Perhaps there are filesystem options
involved, for instance.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Thu, Sep 29, 2016 at 11:34 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Michael Paquier wrote:
>
>> I gave this bug another try and let Windows run for some time but I
>> cannot reproduce the original failure even with manual checkpointing
>> and aggressive checkpoint_timeout, while truncating relations heavily
>> with pgbench to enforce the deletion of relfilenodes. To all, do you
>> think that having a large relfilenode file matters to trigger this
>> issue?
>
> Hmm, perhaps with a larger file the OS spends more time shredding the
> file or something like that?  Perhaps there are filesystem options
> involved, for instance.

I have been digging around that, but could not find out any options
that would delay the file deletion after it has been requested. As
pg_basebackup grabs automatically all the files in PGDATA, I have as
well tried to use thousands of dummy files up to 1MB, as well as huge
files ("fsutil file createnew" is your friend), deleting them manually
to force the stat() calls to be unhappy be still I could not reproduce
it... If somebody has better ideas I am open.
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Michael Paquier
Дата:
On Thu, Oct 6, 2016 at 2:20 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> I have been digging around that, but could not find out any options
> that would delay the file deletion after it has been requested. As
> pg_basebackup grabs automatically all the files in PGDATA, I have as
> well tried to use thousands of dummy files up to 1MB, as well as huge
> files ("fsutil file createnew" is your friend), deleting them manually
> to force the stat() calls to be unhappy be still I could not reproduce
> it... If somebody has better ideas I am open.

Extra idea: patch win32_safestat() so as it continues to call stat()
in an infinite loop for a specific file that I manually created in
base/$DBOID until it fails with res < 0. Even that did not bump into
ERROR_DELETE_PENDING :)
--
Michael

Re: BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDING file

От
Kevin Grittner
Дата:
On Thu, Oct 6, 2016 at 12:20 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Sep 29, 2016 at 11:34 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> Michael Paquier wrote:
>>
>>> I gave this bug another try and let Windows run for some time but I
>>> cannot reproduce the original failure even with manual checkpointing
>>> and aggressive checkpoint_timeout, while truncating relations heavily
>>> with pgbench to enforce the deletion of relfilenodes. To all, do you
>>> think that having a large relfilenode file matters to trigger this
>>> issue?
>>
>> Hmm, perhaps with a larger file the OS spends more time shredding the
>> file or something like that?  Perhaps there are filesystem options
>> involved, for instance.
>
> I have been digging around that, but could not find out any options
> that would delay the file deletion after it has been requested. As
> pg_basebackup grabs automatically all the files in PGDATA, I have as
> well tried to use thousands of dummy files up to 1MB, as well as huge
> files ("fsutil file createnew" is your friend), deleting them manually
> to force the stat() calls to be unhappy be still I could not reproduce
> it... If somebody has better ideas I am open.

Any chance that an OS snapshot of the volume was being taken at the
time of the error?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [BUGS] BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDINGfile

От
Magnus Hagander
Дата:


On Thu, Sep 29, 2016 at 9:55 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Thu, Sep 29, 2016 at 4:38 PM, Magnus Hagander <magnus@hagander.net> wrote:
> Maybe it's a better choice to bounce it to the next commitfest to keep up
> the visibility?

OK, done this way then.


So, we're still with zero ability to reproduce this problem enough to know if it's fixed, or if other issues are caused.
 
I'm thinking maybe we should push this patch to master, to at least get a largeish number of runs with it on a bunch of different platforms. No backpatching until we've seen it run there for some time to feel more confident about it.

Thoughts?

--

Re: [BUGS] BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDINGfile

От
Michael Paquier
Дата:
On Sat, Dec 31, 2016 at 9:27 PM, Magnus Hagander <magnus@hagander.net> wrote:
> So, we're still with zero ability to reproduce this problem enough to know
> if it's fixed, or if other issues are caused.
>
> I'm thinking maybe we should push this patch to master, to at least get a
> largeish number of runs with it on a bunch of different platforms. No
> backpatching until we've seen it run there for some time to feel more
> confident about it.
>
> Thoughts?

Yeah, I kind of agree with that. Not reacting with an ENOENT on
win32's stat() if ERROR_DELETE_PENDING is found is definitely wrong,
as is not doing the mapping in win32error.c.
-- 
Michael


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDINGfile

От
Magnus Hagander
Дата:


On Sun, Jan 1, 2017 at 10:53 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Sat, Dec 31, 2016 at 9:27 PM, Magnus Hagander <magnus@hagander.net> wrote:
> So, we're still with zero ability to reproduce this problem enough to know
> if it's fixed, or if other issues are caused.
>
> I'm thinking maybe we should push this patch to master, to at least get a
> largeish number of runs with it on a bunch of different platforms. No
> backpatching until we've seen it run there for some time to feel more
> confident about it.
>
> Thoughts?

Yeah, I kind of agree with that. Not reacting with an ENOENT on
win32's stat() if ERROR_DELETE_PENDING is found is definitely wrong,
as is not doing the mapping in win32error.c.

OK, I've pushed it to master. Let's leave it there for a while and then decide what to do about backpatching. Re-visit in time for the next minor releases perhaps?
 
--

Re: [BUGS] BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDINGfile

От
Michael Paquier
Дата:
On Wed, Jan 4, 2017 at 6:52 PM, Magnus Hagander <magnus@hagander.net> wrote:
> OK, I've pushed it to master. Let's leave it there for a while and then
> decide what to do about backpatching. Re-visit in time for the next minor
> releases perhaps?

Thanks for the commit. Next round of minor releases is planned for
February if my memory does not fail me. We may want to wait an extra
couple of months to the next one. Or not.
-- 
Michael


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14243: pg_basebackup failes by a STATUS_DELETE_PENDINGfile

От
Magnus Hagander
Дата:


On Wed, Jan 4, 2017 at 11:10 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Jan 4, 2017 at 6:52 PM, Magnus Hagander <magnus@hagander.net> wrote:
> OK, I've pushed it to master. Let's leave it there for a while and then
> decide what to do about backpatching. Re-visit in time for the next minor
> releases perhaps?

Thanks for the commit. Next round of minor releases is planned for
February if my memory does not fail me. We may want to wait an extra
couple of months to the next one. Or not.

Correct, Feb 9. So it might be cutting it a bit too close and we should perhaps aim for the May release. 


--