Re: BUG #16161: pg_ctl stop fails sometimes (on Windows)

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BUG #16161: pg_ctl stop fails sometimes (on Windows)
Дата
Msg-id 32374.1576770248@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: BUG #16161: pg_ctl stop fails sometimes (on Windows)  (Alexander Lakhin <exclusion@gmail.com>)
Ответы Re: BUG #16161: pg_ctl stop fails sometimes (on Windows)  (Alexander Lakhin <exclusion@gmail.com>)
Список pgsql-bugs
Alexander Lakhin <exclusion@gmail.com> writes:
> Please look at the patch that implements (2). It makes `vcregress
> recoverycheck` pass (and my demo restart test still passes too).

I think we could be a little smarter here.  If the problem is
STATUS_DELETE_PENDING, doesn't that affect stat() as well?  That is,
if stat() succeeds, we needn't wait, independently of whether it
says S_ISDIR or not?  This seems like a noticeable improvement if
true, because it would mean that ordinary file-permission failures
also need not wait.  We'd still get confused if we get a permission
failure on a containing directory rather than the file itself, but
that I'm prepared to dismiss as an uncommon case.

Entirely-untested patch along this line attached.

BTW ... it's likely that stat() here is actually going to invoke
pgwin32_safestat(), which has its own opinions about this, and
indeed seems to think we'll get ERROR_DELETE_PENDING.  I think
that's harmless here, but it makes me wonder if we should use
a native Windows API instead of going through stat().

> As open(dir) is getting a little more expensive than before, maybe it's
> still worthwhile to patch fsync*(..., isdir,...). I can prepare such a
> patch if so.

I think we should leave that for later, so that the buildfarm can
actually test whatever logic we put in here.

            regards, tom lane

diff --git a/src/port/open.c b/src/port/open.c
index 5165276..24af6f4 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -21,6 +21,7 @@
 
 #include <fcntl.h>
 #include <assert.h>
+#include <sys/stat.h>
 
 
 static int
@@ -137,18 +138,33 @@ pgwin32_open(const char *fileName, int fileFlags,...)
         }
 
         /*
-         * ERROR_ACCESS_DENIED can be returned if the file is deleted but not
-         * yet gone (Windows NT status code is STATUS_DELETE_PENDING).  Wait a
-         * bit and try again, giving up after 1 second (since this condition
-         * should never persist very long).
+         * ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
+         * gone (Windows NT status code is STATUS_DELETE_PENDING).  In that
+         * case we want to wait a bit and try again, giving up after 1 second
+         * (since this condition should never persist very long).  However,
+         * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
+         * so care is needed.  In particular that happens if we try to open a
+         * directory, or of course if there's an actual file-permissions
+         * problem.  To distinguish these cases, try a stat().  In the
+         * delete-pending case, it will either also get STATUS_DELETE_PENDING,
+         * or it will see the file as gone and fail with ENOENT.  In other
+         * cases it will usually succeed.  The only somewhat-likely case where
+         * this coding will uselessly wait is if there's a permissions problem
+         * with a containing directory, which we hope will never happen in any
+         * performance-critical code paths.
          */
         if (err == ERROR_ACCESS_DENIED)
         {
             if (loops < 10)
             {
-                pg_usleep(100000);
-                loops++;
-                continue;
+                struct stat st;
+
+                if (stat(fileName, &st) != 0)
+                {
+                    pg_usleep(100000);
+                    loops++;
+                    continue;
+                }
             }
         }


В списке pgsql-bugs по дате отправления:

Предыдущее
От: PG Bug reporting form
Дата:
Сообщение: BUG #16172: failure of vacuum file truncation can cause permanent data corruption
Следующее
От: Alexander Lakhin
Дата:
Сообщение: Re: BUG #16161: pg_ctl stop fails sometimes (on Windows)