Discussion: Pinned files at Windows
Hi, hackers.

There is the following problem with Postgres on Windows: files of a dropped relation can remain blocked for an arbitrarily long time. This behavior is caused by two factors:
1. Windows doesn't allow deletion of an open file.
2. A Postgres backend caches open file descriptors, and this cache is not updated while the backend is idle.

So the problem can be reproduced quite easily: create a table in one client, then drop it in another client and try to do something with the relation files. The segments of the dropped relation are still visible, but any attempt to copy these files is rejected. This state persists until you run some command in the first client.

I wonder if we are going to address this Windows-specific issue? It will cause problems for file backup utilities, which are unable to copy these files. And situations where a backend stays idle for a long time are not so rare.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 27.05.2019 12:26, Konstantin Knizhnik wrote:
> Hi, hackers.
>
> There is the following problem with Postgres on Windows: files of a
> dropped relation can remain blocked for an arbitrarily long time.
> This behavior is caused by two factors:
> 1. Windows doesn't allow deletion of an open file.
> 2. A Postgres backend caches open file descriptors, and this cache is
> not updated while the backend is idle.
>
> So the problem can be reproduced quite easily: create a table in one
> client, then drop it in another client and try to do something with
> the relation files.
> The segments of the dropped relation are still visible, but any attempt
> to copy these files is rejected.
> This state persists until you run some command in the first client.
>
> I wonder if we are going to address this Windows-specific issue?
> It will cause problems for file backup utilities, which are unable to
> copy these files.
> And situations where a backend stays idle for a long time are not so
> rare.
>

I have investigated the problem further, and it looks like its source is the pgwin32_safestat function:

int
pgwin32_safestat(const char *path, struct stat *buf)
{
    int         r;
    WIN32_FILE_ATTRIBUTE_DATA attr;

    r = stat(path, buf);
    if (r < 0)
    {
        if (GetLastError() == ERROR_DELETE_PENDING)
        {
            /*
             * File has been deleted, but is not gone from the filesystem yet.
             * This can happen when some process with FILE_SHARE_DELETE has it
             * open and it will be fully removed once that handle is closed.
             * Meanwhile, we can't open it, so indicate that the file just
             * doesn't exist.
             */
            errno = ENOENT;
            return -1;
        }

        return r;
    }

    if (!GetFileAttributesEx(path, GetFileExInfoStandard, &attr))
    {
        _dosmaperr(GetLastError());
        return -1;
    }

    /*
     * XXX no support for large files here, but we don't do that in general on
     * Win32 yet.
     */
    buf->st_size = attr.nFileSizeLow;

    return 0;
}

Postgres opens files with the FILE_SHARE_DELETE flag, which makes it possible to unlink an open file. But unlike on Unix systems, the file is not actually deleted. You can see it with the "dir" command. And the stat() function also doesn't return an error in this case:

https://stackoverflow.com/questions/27270374/deletefile-or-unlink-calls-succeed-but-doesnt-remove-file

So the first check in pgwin32_safestat (r < 0) does not work at all: stat() returns 0, but the subsequent call to GetFileAttributesEx fails with error 5 (ERROR_ACCESS_DENIED).

It seems to me that pgwin32_safestat should be rewritten this way:

int
pgwin32_safestat(const char *path, struct stat *buf)
{
    int         r;
    WIN32_FILE_ATTRIBUTE_DATA attr;

    r = stat(path, buf);
    if (r < 0)
        return r;

    if (!GetFileAttributesEx(path, GetFileExInfoStandard, &attr))
    {
        errno = ENOENT;
        return -1;
    }

    /*
     * XXX no support for large files here, but we don't do that in general on
     * Win32 yet.
     */
    buf->st_size = attr.nFileSizeLow;

    return 0;
}

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
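A side note on the XXX comment in pgwin32_safestat: only the low 32 bits of the size are copied into st_size. WIN32_FILE_ATTRIBUTE_DATA reports the size as two 32-bit halves, nFileSizeHigh and nFileSizeLow. The following is a minimal sketch of how the two halves combine into one 64-bit value. The helper name combine_file_size is hypothetical and is not part of PostgreSQL, and whether st_size in a given build is wide enough to hold the result is a separate question that this sketch does not address.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not PostgreSQL code): join the high and low 32-bit
 * halves of a Win32 file size into a single 64-bit value.
 */
uint64_t
combine_file_size(uint32_t size_high, uint32_t size_low)
{
    /* Widen to 64 bits before shifting so the high half is not lost. */
    return ((uint64_t) size_high << 32) | (uint64_t) size_low;
}
```

With attr filled in by GetFileAttributesEx, a caller would pass attr.nFileSizeHigh and attr.nFileSizeLow.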
On Mon, May 27, 2019 at 05:52:13PM +0300, Konstantin Knizhnik wrote:
> Postgres opens files with the FILE_SHARE_DELETE flag, which makes it
> possible to unlink an open file.
> But unlike on Unix systems, the file is not actually deleted. You can
> see it with the "dir" command.
> And the stat() function also doesn't return an error in this case:
>
> https://stackoverflow.com/questions/27270374/deletefile-or-unlink-calls-succeed-but-doesnt-remove-file
>
> So the first check in pgwin32_safestat (r < 0) does not work at all:
> stat() returns 0, but the subsequent call to GetFileAttributesEx
> fails with error 5 (ERROR_ACCESS_DENIED).

So you would basically hijack the result of GetFileAttributesEx() so that any error returned by this function is reported as ENOENT. Why would that be a sane idea? What if, say, a permission error or some other error is legitimate, but ENOENT is returned instead as you propose? The caller would then be confused by an incorrect status.

As you mention, what we did as of 9951741 may not be completely right, and the reason why it was done this way comes from here:
https://www.postgresql.org/message-id/20160712083220.1426.58667@wrigleys.postgresql.org

Could we instead come up with a reliable way to detect whether a file is in a deletion-pending state? Blindly mapping EACCES to ENOENT is not a solution I think we can rely on (perhaps we could check only after ERROR_ACCESS_DENIED using GetLastError() and map back to ENOENT in that case; still, this can be triggered if a virus scanner holds the file for read, no?). stat() returning 0 for a file pending deletion, which will go away physically once the handles still keeping the file around are closed, is not something I would have imagined is sane, but that's what we need to deal with... Windows has a long history of keeping things compatible, sometimes in their own weird way, and it seems that we have one here, so I cannot imagine that this behavior is going to change.

Looking around, I have found out about NtCreateFile(), which could be able to report a proper pending-deletion status; still, that's only available in kernel mode. Perhaps others have ideas?
--
Michael
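To make the two options in this message concrete, here is a minimal sketch contrasting the blanket mapping from the proposed patch with the narrower "only after ERROR_ACCESS_DENIED" idea raised above. Both function names are hypothetical and neither is PostgreSQL code. The numeric error codes are the standard winerror.h values, defined locally only so the sketch compiles off Windows. A return value of 0 from the narrow variant is this sketch's own convention for "no override, use the regular error mapping".

```c
#include <assert.h>
#include <errno.h>

/* Win32 error codes (winerror.h values), defined only if not already present. */
#ifndef ERROR_ACCESS_DENIED
#define ERROR_ACCESS_DENIED      5UL
#endif
#ifndef ERROR_SHARING_VIOLATION
#define ERROR_SHARING_VIOLATION  32UL
#endif

/*
 * Hypothetical: blanket mapping. Every failure of the attribute lookup that
 * follows a successful stat() is reported as "file does not exist".
 */
int
blanket_override_errno(unsigned long winerr)
{
    /* Callers pass the code of a failed call, never ERROR_SUCCESS (0). */
    assert(winerr != 0);
    return ENOENT;
}

/*
 * Hypothetical: narrow mapping. Only ERROR_ACCESS_DENIED is treated as a
 * possible pending delete; 0 means "no override, map the error normally".
 * This is still a guess: a real permission problem yields the same code.
 */
int
narrow_override_errno(unsigned long winerr)
{
    assert(winerr != 0);
    if (winerr == ERROR_ACCESS_DENIED)
        return ENOENT;
    return 0;
}
```

Under the blanket variant a sharing violation (32) would also surface as ENOENT, which is the kind of misreport objected to above. The narrow variant avoids that case but cannot tell a pending delete from a genuine access-denied condition.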
On 29.05.2019 22:20, Michael Paquier wrote:
> On Mon, May 27, 2019 at 05:52:13PM +0300, Konstantin Knizhnik wrote:
>> Postgres opens files with the FILE_SHARE_DELETE flag, which makes it
>> possible to unlink an open file.
>> But unlike on Unix systems, the file is not actually deleted. You can
>> see it with the "dir" command.
>> And the stat() function also doesn't return an error in this case:
>>
>> https://stackoverflow.com/questions/27270374/deletefile-or-unlink-calls-succeed-but-doesnt-remove-file
>>
>> So the first check in pgwin32_safestat (r < 0) does not work at all:
>> stat() returns 0, but the subsequent call to GetFileAttributesEx
>> fails with error 5 (ERROR_ACCESS_DENIED).
> So you would basically hijack the result of GetFileAttributesEx() so
> that any error returned by this function is reported as ENOENT. Why
> would that be a sane idea? What if, say, a permission error or some
> other error is legitimate, but ENOENT is returned instead as you
> propose? The caller would then be confused by an incorrect status.

If access to the file is denied because of missing permissions, then stat() should fail with an error, and that error is returned by pgwin32_safestat. If the call to stat() succeeds, then my assumption is that the only reason for GetFileAttributesEx to fail is that the file has been deleted, and returning ENOENT in this case is the correct behavior.

>
> As you mention, what we did as of 9951741 may not be completely right,
> and the reason why it was done this way comes from here:
> https://www.postgresql.org/message-id/20160712083220.1426.58667@wrigleys.postgresql.org

Yes, this is the same reason, but the handling of STATUS_DELETE_PENDING is not correct.

>
> Could we instead come up with a reliable way to detect whether a file
> is in a deletion-pending state? Blindly mapping EACCES to ENOENT is not
> a solution I think we can rely on (perhaps we could check only after
> ERROR_ACCESS_DENIED using GetLastError() and map back to ENOENT in
> that case; still, this can be triggered if a virus scanner holds the
> file for read, no?). stat() returning 0 for a file pending deletion,
> which will go away physically once the handles still keeping the file
> around are closed, is not something I would have imagined is sane, but
> that's what we need to deal with... Windows has a long history of
> keeping things compatible, sometimes in their own weird way, and it
> seems that we have one here, so I cannot imagine that this behavior is
> going to change.
>
> Looking around, I have found out about NtCreateFile(), which could be
> able to report a proper pending-deletion status; still, that's only
> available in kernel mode. Perhaps others have ideas?

Sorry, I do not know a better solution. I have written a small test reproducing the problem, which shows that if a file is opened with the FILE_SHARE_DELETE flag, then it is possible to delete it using unlink() (no error is returned), and a call to stat() for it also succeeds. But any attempt to open this file for reading/writing or to call GetFileAttributesEx fails with ERROR_ACCESS_DENIED (not with ERROR_DELETE_PENDING, which is hidden by the Win32 API).

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
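The test program mentioned in this message was not posted, so the following is only a rough sketch of what such a probe might look like, not the author's actual test. The names probe_result and probe_deleted_open_file are hypothetical. On Windows it opens a file with FILE_SHARE_DELETE, deletes it with DeleteFileA (standing in for unlink()), and records what _stat() and GetFileAttributesExA report while the handle is still open. On other systems it performs the equivalent POSIX steps, where stat() is expected to fail with ENOENT once the name is unlinked. The Windows results may depend on the Windows version and filesystem, so no expected values are claimed for that branch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#include <sys/types.h>
#include <sys/stat.h>
#else
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#endif

typedef struct
{
    int           unlink_rc;    /* 0 if the delete call reported success */
    int           stat_rc;      /* result of stat() after the delete */
    int           stat_errno;   /* errno from stat(), or 0 if it succeeded */
    unsigned long attr_error;   /* Win32 only: GetLastError() if the
                                 * attribute lookup failed, else 0 */
} probe_result;

/* Create a file, delete it while it is still open, then probe it by name. */
int
probe_deleted_open_file(const char *path, probe_result *res)
{
    assert(path != NULL && res != NULL);
    memset(res, 0, sizeof(*res));

#ifdef _WIN32
    WIN32_FILE_ATTRIBUTE_DATA attr;
    struct _stat sb;
    HANDLE      h;

    h = CreateFileA(path, GENERIC_READ | GENERIC_WRITE,
                    FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                    NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1;

    res->unlink_rc = DeleteFileA(path) ? 0 : -1;
    res->stat_rc = _stat(path, &sb);
    res->stat_errno = (res->stat_rc < 0) ? errno : 0;
    if (!GetFileAttributesExA(path, GetFileExInfoStandard, &attr))
        res->attr_error = GetLastError();
    CloseHandle(h);             /* the last handle goes away here */
#else
    struct stat sb;
    int         fd;

    fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    res->unlink_rc = unlink(path);
    res->stat_rc = stat(path, &sb);
    res->stat_errno = (res->stat_rc < 0) ? errno : 0;
    close(fd);
#endif

    return 0;
}
```

A caller would invoke probe_deleted_open_file("probe.tmp", &res) in a writable directory and print the fields. If the behavior described in this message holds, the Windows branch would show stat_rc equal to 0 together with a nonzero attr_error.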
On Thu, May 30, 2019 at 3:25 AM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> If the call to stat() succeeds, then my assumption is that the only
> reason for GetFileAttributesEx to fail is that the file has been
> deleted, and returning ENOENT in this case is the correct behavior.

In my experience, the assumption "the only possible cause of an error during X is Y" turns out to be wrong nearly 100% of the time. Our job is to report the errors the OS gives us, not guess what they mean.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 03.06.2019 22:15, Robert Haas wrote:
> On Thu, May 30, 2019 at 3:25 AM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> If the call to stat() succeeds, then my assumption is that the only
>> reason for GetFileAttributesEx to fail is that the file has been
>> deleted, and returning ENOENT in this case is the correct behavior.
> In my experience, the assumption "the only possible cause of an error
> during X is Y" turns out to be wrong nearly 100% of the time. Our job
> is to report the errors the OS gives us, not guess what they mean.
>

This is what we are trying to do now:

    r = stat(path, buf);
    if (r < 0)
    {
        if (GetLastError() == ERROR_DELETE_PENDING)
        {
            /*
             * File has been deleted, but is not gone from the filesystem yet.
             * This can happen when some process with FILE_SHARE_DELETE has it
             * open and it will be fully removed once that handle is closed.
             * Meanwhile, we can't open it, so indicate that the file just
             * doesn't exist.
             */
            errno = ENOENT;
            return -1;
        }

        return r;
    }

but without success, because ERROR_DELETE_PENDING is never returned by Win32. Moreover, stat() never returns an error in this case.
On Mon, Jun 03, 2019 at 11:37:30PM +0300, Konstantin Knizhnik wrote:
> but without success, because ERROR_DELETE_PENDING is never returned by
> Win32. Moreover, stat() never returns an error in this case.

Could it be possible to find a reliable way to detect that? Clobbering errno with an incorrect value is not something we can rely on, and I am ready to buy that GetFileAttributesEx() can also return EACCES for some legit cases, like a file it has no access to. What if, for example, something is done on a file between the stat() call and the GetFileAttributesEx() call in pgwin32_safestat(), so that EACCES is a legit error?
--
Michael
On 04.06.2019 3:18, Michael Paquier wrote:
> On Mon, Jun 03, 2019 at 11:37:30PM +0300, Konstantin Knizhnik wrote:
>> but without success, because ERROR_DELETE_PENDING is never returned by
>> Win32. Moreover, stat() never returns an error in this case.
> Could it be possible to find a reliable way to detect that?
> Clobbering errno with an incorrect value is not something we can rely
> on, and I am ready to buy that GetFileAttributesEx() can also return
> EACCES for some legit cases, like a file it has no access to. What
> if, for example, something is done on a file between the stat() call
> and the GetFileAttributesEx() call in pgwin32_safestat(), so that
> EACCES is a legit error?

Sorry, I am not a Windows expert, so I do not know whether it is possible to detect that the ERROR_ACCESS_DENIED returned by GetFileAttributesEx is actually caused by a pending delete. The situation where file permissions change between the calls to stat() and GetFileAttributesEx() is certainly possible, but... do you really seriously consider the probability of this event, and is there anything critical if we return ENOENT instead of EACCES in this case?

Actually, the original problem seems to be caused by the assumption that stat() does not correctly set st_size on Windows:

/*
 * The stat() function in win32 is not guaranteed to update the st_size
 * field when run. So we define our own version that uses the Win32 API
 * to update this field.
 */

I tried to google for information about such behavior but didn't find any references other than the Postgres sources. I wonder if this problem really occurs (at least with more or less recent versions of Windows)? And how critical can it be that we get a cached value of the file size? If we access a file without locking, then it is not correct to talk about the "actual" file size, is it? The file can be truncated or appended to a few milliseconds after this call. If there are places in the Postgres code that rely on stat() returning the "latest" file size (current as of the moment of the stat() call), then that can be a sign of a possible race condition.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company