Thread: Pinned files at Windows

Pinned files at Windows

From: Konstantin Knizhnik

Hi, hackers.

There is the following problem with Postgres on Windows: files of a
dropped relation can stay blocked for an arbitrarily long time.
This behavior is caused by two factors:
1. Windows doesn't allow deletion of an open file.
2. A Postgres backend caches open file descriptors, and this cache is not
updated while the backend is idle.

So the problem can be reproduced quite easily: create a table in one
client, then drop it in another client and try to do something with the
relation files.
Segments of the dropped relation are still visible, but any attempt to copy
such a file is rejected.
This state persists until you run some command in the first client.

I wonder if we are going to address this Windows-specific issue?
It will cause problems for file backup utilities, which are not able to
copy these files.
And situations where a backend is idle for a long time are not
so rare.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Pinned files at Windows

From: Konstantin Knizhnik

On 27.05.2019 12:26, Konstantin Knizhnik wrote:
> Hi, hackers.
>
> There is the following problem with Postgres on Windows: files of a
> dropped relation can stay blocked for an arbitrarily long time.
> This behavior is caused by two factors:
> 1. Windows doesn't allow deletion of an open file.
> 2. A Postgres backend caches open file descriptors, and this cache is
> not updated while the backend is idle.
>
> So the problem can be reproduced quite easily: create a table in one
> client, then drop it in another client and try to do something with
> the relation files.
> Segments of the dropped relation are still visible, but any attempt to
> copy such a file is rejected.
> This state persists until you run some command in the first client.
>
> I wonder if we are going to address this Windows-specific issue?
> It will cause problems for file backup utilities, which are not able
> to copy these files.
> And situations where a backend is idle for a long time are not
> so rare.
>

I have investigated the problem further, and it looks like its source is
in the pgwin32_safestat function:

int
pgwin32_safestat(const char *path, struct stat *buf)
{
     int            r;
     WIN32_FILE_ATTRIBUTE_DATA attr;

     r = stat(path, buf);
     if (r < 0)
     {
         if (GetLastError() == ERROR_DELETE_PENDING)
         {
             /*
              * File has been deleted, but is not gone from the filesystem
              * yet. This can happen when some process with
              * FILE_SHARE_DELETE has it open and it will be fully removed
              * once that handle is closed. Meanwhile, we can't open it, so
              * indicate that the file just doesn't exist.
              */
             errno = ENOENT;
             return -1;
         }

         return r;
     }

     if (!GetFileAttributesEx(path, GetFileExInfoStandard, &attr))
     {
         _dosmaperr(GetLastError());
         return -1;
     }

     /*
      * XXX no support for large files here, but we don't do that in
      * general on Win32 yet.
      */
     buf->st_size = attr.nFileSizeLow;

     return 0;
}

Postgres opens files with the FILE_SHARE_DELETE flag, which makes it
possible to unlink an open file.
But unlike on Unix, the file is not actually deleted: you can still see it
with the "dir" command.
And the stat() function doesn't return an error in this case either:

https://stackoverflow.com/questions/27270374/deletefile-or-unlink-calls-succeed-but-doesnt-remove-file
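
For contrast, the Unix behaviour referred to above can be shown with a minimal POSIX-only sketch. The function name and the temporary path are hypothetical and are not Postgres code. The sketch only checks that the name disappears as soon as unlink() returns, while the open descriptor keeps working:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not Postgres code), POSIX only.
 * Creates a file at "path", unlinks it while the descriptor is still open,
 * and checks that the name is gone at once while the descriptor stays usable.
 * Returns 1 if that holds, 0 if it does not, -1 on an unexpected error.
 */
int
unlink_while_open_demo(const char *path)
{
    struct stat by_name;
    struct stat by_fd;
    int         name_gone;
    int         fd_alive;
    int         fd;

    fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, "hello", 5) != 5)
    {
        close(fd);
        unlink(path);
        return -1;
    }

    /* Remove the name while the file is still open. */
    if (unlink(path) != 0)
    {
        close(fd);
        return -1;
    }

    /* The directory entry disappears immediately: stat() by name fails. */
    name_gone = (stat(path, &by_name) < 0 && errno == ENOENT);

    /* The data stays reachable through the descriptor until close(). */
    fd_alive = (fstat(fd, &by_fd) == 0 && by_fd.st_size == 5);

    close(fd);
    return name_gone && fd_alive;
}
```

On Windows the same sequence leaves the name visible until the last handle is closed, which is the difference this thread is about.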

So the first check in pgwin32_safestat (r < 0) does not work at all:
stat() returns 0, but the subsequent call to GetFileAttributesEx
fails with error 5 (ERROR_ACCESS_DENIED).
It seems to me that pgwin32_safestat should be rewritten this way:

int
pgwin32_safestat(const char *path, struct stat *buf)
{
     int            r;
     WIN32_FILE_ATTRIBUTE_DATA attr;

     r = stat(path, buf);
     if (r < 0)
         return r;

     if (!GetFileAttributesEx(path, GetFileExInfoStandard, &attr))
     {
         errno = ENOENT;
         return -1;
     }

     /*
      * XXX no support for large files here, but we don't do that in
      * general on Win32 yet.
      */
     buf->st_size = attr.nFileSizeLow;

     return 0;
}


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Pinned files at Windows

From: Michael Paquier

On Mon, May 27, 2019 at 05:52:13PM +0300, Konstantin Knizhnik wrote:
> Postgres opens files with the FILE_SHARE_DELETE flag, which makes it
> possible to unlink an open file.
> But unlike on Unix, the file is not actually deleted: you can still see
> it with the "dir" command.
> And the stat() function doesn't return an error in this case either:
>
> https://stackoverflow.com/questions/27270374/deletefile-or-unlink-calls-succeed-but-doesnt-remove-file
>
> So the first check in pgwin32_safestat (r < 0) does not work at all:
> stat() returns 0, but the subsequent call to GetFileAttributesEx
> fails with error 5 (ERROR_ACCESS_DENIED).

So you would basically hijack the result of GetFileAttributesEx() so
that any error returned by this function is reported as ENOENT.  Why
would that be a sane idea?  What if, say, a permission error or some
other error is legitimate, but ENOENT is returned instead as you
propose?  The caller would then be confused by an incorrect status.

As you mention, what we did as of 9951741 may not be completely right,
and the reason why it was done this way comes from here:
https://www.postgresql.org/message-id/20160712083220.1426.58667@wrigleys.postgresql.org

Could we instead come up with a reliable way to detect whether a file
is in a deletion-pending state?  Blindly mapping EACCES to ENOENT is
not a solution I think we can rely on (perhaps we could do the check
only after getting ERROR_ACCESS_DENIED from GetLastError() and map to
ENOENT in that case; still, this can be triggered if a virus scanner
holds the file for read, no?).  stat() returning 0 for a file pending
deletion, which will go away physically once the handles still keeping
the file around are closed, is not something I would have imagined is
sane, but that's what we need to deal with...  Windows has a long
history of keeping things compatible, sometimes in its own weird way,
and it seems that we have one such case here, so I cannot imagine that
this behavior is going to change.

Looking around, I found NtCreateFile(), which might be able to report
a proper deletion-pending status, but that's only available in kernel
mode.  Perhaps others have ideas?
--
Michael
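
The narrower mapping floated in the message above (translate to ENOENT only when GetLastError() reports ERROR_ACCESS_DENIED, and report every other failure as what it is) could be sketched roughly as follows. This is an illustration only: the helper name is hypothetical, the small mapping table is an assumption of the sketch and not Postgres's _dosmaperr(), and the Win32 error codes are defined locally so that the snippet builds on any platform.

```c
#include <errno.h>

/* Win32 error codes, defined here only so the sketch builds anywhere. */
#ifndef ERROR_ACCESS_DENIED
#define ERROR_FILE_NOT_FOUND    2UL
#define ERROR_PATH_NOT_FOUND    3UL
#define ERROR_ACCESS_DENIED     5UL
#define ERROR_SHARING_VIOLATION 32UL
#define ERROR_DELETE_PENDING    303UL
#endif

/*
 * Hypothetical helper, not Postgres code: choose an errno for a failed
 * GetFileAttributesEx() call that followed a successful stat().
 * "winerr" is the value GetLastError() reported for that failure.
 */
int
attr_failure_to_errno(unsigned long winerr)
{
    switch (winerr)
    {
        case ERROR_FILE_NOT_FOUND:
        case ERROR_PATH_NOT_FOUND:
        case ERROR_DELETE_PENDING:
            /* The file is unambiguously gone, or going. */
            return ENOENT;

        case ERROR_ACCESS_DENIED:
            /*
             * The guess under discussion: treat this as "delete pending".
             * A virus scanner or a genuine permission problem produces
             * the same code, so this branch can misreport.
             */
            return ENOENT;

        case ERROR_SHARING_VIOLATION:
            /* Someone else holds the file; do not pretend it is missing. */
            return EACCES;

        default:
            /* Unknown failures are reported, not hidden behind ENOENT. */
            return EINVAL;
    }
}
```

Compared with returning ENOENT unconditionally, only the ERROR_ACCESS_DENIED branch still relies on a guess, which is exactly the part questioned above.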

Re: Pinned files at Windows

From: Konstantin Knizhnik

On 29.05.2019 22:20, Michael Paquier wrote:
> On Mon, May 27, 2019 at 05:52:13PM +0300, Konstantin Knizhnik wrote:
>> Postgres opens files with the FILE_SHARE_DELETE flag, which makes it
>> possible to unlink an open file.
>> But unlike on Unix, the file is not actually deleted: you can still see
>> it with the "dir" command.
>> And the stat() function doesn't return an error in this case either:
>>
>> https://stackoverflow.com/questions/27270374/deletefile-or-unlink-calls-succeed-but-doesnt-remove-file
>>
>> So the first check in pgwin32_safestat (r < 0) does not work at all:
>> stat() returns 0, but the subsequent call to GetFileAttributesEx
>> fails with error 5 (ERROR_ACCESS_DENIED).
> So you would basically hijack the result of GetFileAttributesEx() so
> that any error returned by this function is reported as ENOENT.  Why
> would that be a sane idea?  What if, say, a permission error or some
> other error is legitimate, but ENOENT is returned instead as you
> propose?  The caller would then be confused by an incorrect status.

If access to the file is prohibited by a lack of permissions, then stat()
should fail with an error, and this error is returned by the
pgwin32_safestat function.

If the stat() call succeeds, then my assumption is that the only reason
for GetFileAttributesEx to fail is that the file has been deleted, and
returning the ENOENT error code in this case is correct behavior.

>
> As you mention, what we did as of 9951741 may not be completely right,
> and the reason why it was done this way comes from here:
> https://www.postgresql.org/message-id/20160712083220.1426.58667@wrigleys.postgresql.org

Yes, this is the same reason, but the handling of STATUS_DELETE_PENDING
is not correct.
>
> Could we instead come up with a reliable way to detect whether a file
> is in a deletion-pending state?  Blindly mapping EACCES to ENOENT is
> not a solution I think we can rely on (perhaps we could do the check
> only after getting ERROR_ACCESS_DENIED from GetLastError() and map to
> ENOENT in that case; still, this can be triggered if a virus scanner
> holds the file for read, no?).  stat() returning 0 for a file pending
> deletion, which will go away physically once the handles still keeping
> the file around are closed, is not something I would have imagined is
> sane, but that's what we need to deal with...  Windows has a long
> history of keeping things compatible, sometimes in its own weird way,
> and it seems that we have one such case here, so I cannot imagine that
> this behavior is going to change.
>
> Looking around, I found NtCreateFile(), which might be able to report
> a proper deletion-pending status, but that's only available in kernel
> mode.  Perhaps others have ideas?

Sorry, I do not know a better solution.
I have written a small test reproducing the problem. It shows that
if a file is opened with the FILE_SHARE_DELETE flag, then
it is possible to delete it using unlink() (no error is returned), and a
subsequent stat() call on it also succeeds.
But any attempt to open this file for reading/writing or to call
GetFileAttributesEx on it
fails with ERROR_ACCESS_DENIED (not with ERROR_DELETE_PENDING,
which is hidden by the Win32 API).

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
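
The test program itself is not included in the thread. A probe along the lines described could look like the sketch below. Everything here is an assumption of the sketch: the struct, its field names and the function name are hypothetical, DeleteFileA() stands in for unlink(), and on non-Windows builds the function is only a stub that returns -1. The results depend on the Windows version and filesystem, so no expected values are claimed.

```c
#ifdef _WIN32
#include <windows.h>
#include <sys/types.h>
#include <sys/stat.h>
#endif

/* Result of the probe; the field names are made up for this sketch. */
typedef struct
{
    int           delete_ok;    /* DeleteFileA() reported success */
    int           stat_rc;      /* return value of stat() after the delete */
    int           attr_ok;      /* GetFileAttributesExA() reported success */
    unsigned long attr_err;     /* GetLastError() if that call failed */
} PendingDeleteProbe;

/*
 * Hypothetical test helper, not Postgres code. Returns 0 when the probe
 * ran, -1 if it could not run (always the case on non-Windows builds).
 */
int
probe_pending_delete(const char *path, PendingDeleteProbe *out)
{
#ifdef _WIN32
    WIN32_FILE_ATTRIBUTE_DATA attr;
    struct stat st;
    HANDLE      h;

    /* Open with FILE_SHARE_DELETE so the file can be deleted while open. */
    h = CreateFileA(path, GENERIC_READ | GENERIC_WRITE,
                    FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                    NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1;

    /* Delete by name while the handle above is still open. */
    out->delete_ok = (DeleteFileA(path) != 0);

    /* Ask for the file by name through both APIs and record the answers. */
    out->stat_rc = stat(path, &st);
    out->attr_err = 0;
    out->attr_ok = (GetFileAttributesExA(path, GetFileExInfoStandard,
                                         &attr) != 0);
    if (!out->attr_ok)
        out->attr_err = GetLastError();

    /* Closing the last handle is what lets the file go away. */
    CloseHandle(h);
    return 0;
#else
    (void) path;
    (void) out;
    return -1;
#endif
}
```

A caller would print the four fields after the call; the report above corresponds to delete_ok = 1, stat_rc = 0, attr_ok = 0 and attr_err = 5.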




Re: Pinned files at Windows

From: Robert Haas

On Thu, May 30, 2019 at 3:25 AM Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
> If the stat() call succeeds, then my assumption is that the only reason
> for GetFileAttributesEx to fail is that the file has been deleted, and
> returning the ENOENT error code in this case is correct behavior.

In my experience, the assumption "the only possible cause of an error
during X is Y" turns out to be wrong nearly 100% of the time.  Our job
is to report the errors the OS gives us, not guess what they mean.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Pinned files at Windows

From: Konstantin Knizhnik

On 03.06.2019 22:15, Robert Haas wrote:
> On Thu, May 30, 2019 at 3:25 AM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> If the stat() call succeeds, then my assumption is that the only reason
>> for GetFileAttributesEx to fail is that the file has been deleted, and
>> returning the ENOENT error code in this case is correct behavior.
> In my experience, the assumption "the only possible cause of an error
> during X is Y" turns out to be wrong nearly 100% of the time.  Our job
> is to report the errors the OS gives us, not guess what they mean.
>
This is what we are trying to do now:

     r = stat(path, buf);
     if (r < 0)
     {
         if (GetLastError() == ERROR_DELETE_PENDING)
         {
             /*
              * File has been deleted, but is not gone from the filesystem
              * yet. This can happen when some process with
              * FILE_SHARE_DELETE has it open and it will be fully removed
              * once that handle is closed. Meanwhile, we can't open it, so
              * indicate that the file just doesn't exist.
              */
             errno = ENOENT;
             return -1;
         }

         return r;
     }


but without success, because ERROR_DELETE_PENDING is never returned by Win32.
Moreover, stat() never returns an error in this case.



Re: Pinned files at Windows

From: Michael Paquier

On Mon, Jun 03, 2019 at 11:37:30PM +0300, Konstantin Knizhnik wrote:
> but without success, because ERROR_DELETE_PENDING is never returned by Win32.
> Moreover, stat() never returns an error in this case.

Could it be possible to find a reliable way to detect that?
Clobbering errno with an incorrect value is not something we can rely
on, and I am ready to buy that GetFileAttributesEx() can also return
EACCES for some legit cases, like a file it has no access to.  What
if, for example, something is done to the file between the stat() call
and the GetFileAttributesEx() call in pgwin32_safestat(), so that
EACCES is a legit error?
--
Michael

Re: Pinned files at Windows

From: Konstantin Knizhnik

On 04.06.2019 3:18, Michael Paquier wrote:
> On Mon, Jun 03, 2019 at 11:37:30PM +0300, Konstantin Knizhnik wrote:
>> but without success, because ERROR_DELETE_PENDING is never returned by Win32.
>> Moreover, stat() never returns an error in this case.
> Could it be possible to find a reliable way to detect that?
> Clobbering errno with an incorrect value is not something we can rely
> on, and I am ready to buy that GetFileAttributesEx() can also return
> EACCES for some legit cases, like a file it has no access to.  What
> if, for example, something is done to the file between the stat() call
> and the GetFileAttributesEx() call in pgwin32_safestat(), so that
> EACCES is a legit error?

Sorry, I am not a Windows expert, so I do not know whether it is possible
to detect that ERROR_ACCESS_DENIED returned by GetFileAttributesEx is
actually caused by a pending delete.
A situation where file permissions change between the calls to stat()
and GetFileAttributesEx() is certainly possible, but... do you really
seriously consider the probability of this event?
And is there anything critical about returning ENOENT instead of EACCES
in this case?

Actually, the original problem seems to be caused by the assumption that
stat() does not set st_size correctly on Windows:
/*
  * The stat() function in win32 is not guaranteed to update the st_size
  * field when run. So we define our own version that uses the Win32 API
  * to update this field.
  */

I tried to google information about such behavior but didn't find any
references other than the Postgres sources.
I wonder if this problem really exists (at least with more or less
recent versions of Windows)?
And how critical can it be that we get a cached value of the file size?
If we access a file without locking, then it is not correct to talk about
the "actual" file size, is it? The file can be truncated or appended to a
few milliseconds after this call.
If there are places in the Postgres code that rely on stat() returning
the "latest" file size (as of the moment of the stat() call), then that
can be a sign of a possible race condition.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company