Обсуждение: BUG #16526: pg_test_fsync in v12 doesn't run in Windows

Поиск
Список
Период
Сортировка

BUG #16526: pg_test_fsync in v12 doesn't run in Windows

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      16526
Logged by:          Jeff Janes
Email address:      jeff.janes@gmail.com
PostgreSQL version: 12.3
Operating system:   windows 2019
Description:

When I install PG12 on Windows 10 using the EDB installer, pg_test_fsync
doesn't work.  But I notice Windows 10 is not officially supported, so I
repeated the exercise with EDB installer on a rented box: 
Microsoft Windows Server 2019 with Desktop Experience Locale English AMI
provided by Amazon

And it still doesn't work, see capture from CMD.exe done immediately after
running the EDB installer:

Microsoft Windows [Version 10.0.17763.1282]
(c) 2018 Microsoft Corporation. All rights reserved.

C:\Users\Administrator>"c:\Program
Files\PostgreSQL\12\bin\pg_test_fsync.exe"
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                 pg_test_fsync: error: write failed:
Invalid argument

Note that it abandons the run here it does not just report an error and then
continue.
I'm not in a position to compile on Windows, so I can't run git bisect nor
tell if this is something specific to EDB.  But I suspect this rather has
something to do with commit 0ba06e0bfb8cfd24ff17aca92aa72245ddd6c4d7

On v11, this does work.  Or at least it runs to completion, I guess the
numbers reported might be bogus, which is why the indicated commit was done.


Re: BUG #16526: pg_test_fsync in v12 doesn't run in Windows

От
Michael Paquier
Дата:
On Fri, Jul 03, 2020 at 04:01:46PM +0000, PG Bug reporting form wrote:
> Note that it abandons the run here it does not just report an error and then
> continue.
> I'm not in a position to compile on Windows, so I can't run git bisect nor
> tell if this is something specific to EDB.  But I suspect this rather has
> something to do with commit 0ba06e0bfb8cfd24ff17aca92aa72245ddd6c4d7

I think that the error actually comes from 40cfe86, where we enforce
O_TEXT in our port of open() for Windows so as the switch from WIN32's
open() to our concurrent-safe version remains compatible.  O_TEXT and
FILE_FLAG_NO_BUFFERING are visibly incompatible together, and
enforcing the call of open() in pg_test_fsync to use O_BINARY
(actually PG_BINARY) when testing open_datasync() allows things to go
through.  So we are one call short of _setmode(), but it is more
simple to pass down the flag when opening the fd.  It may actually be
safer in the long run to add the binary flag to all the calls of
open() in pg_test_fsync.  Thoughts?
--
Michael

Вложения

Re: BUG #16526: pg_test_fsync in v12 doesn't run in Windows

От
Michael Paquier
Дата:
On Sat, Jul 04, 2020 at 10:03:56AM +0900, Michael Paquier wrote:
> I think that the error actually comes from 40cfe86, where we enforce
> O_TEXT in our port of open() for Windows so as the switch from WIN32's
> open() to our concurrent-safe version remains compatible.  O_TEXT and
> FILE_FLAG_NO_BUFFERING are visibly incompatible together, and
> enforcing the call of open() in pg_test_fsync to use O_BINARY
> (actually PG_BINARY) when testing open_datasync() allows things to go
> through.  So we are one call short of _setmode(), but it is more
> simple to pass down the flag when opening the fd.  It may actually be
> safer in the long run to add the binary flag to all the calls of
> open() in pg_test_fsync.  Thoughts?

So, I have been looking at that again.  And, I am not completely sure
why the combination of FILE_FLAG_NO_BUFFERING and _setmode(O_TEXT)
would cause this failure, but I suspect that this is caused by the
fact that CRLF gets changed into single LF characters on input (input
buffer uses random data), causing the size of what's expected to be
written to not match with what is actually written.  An additional
factor is visibly our wrapper safe-concurrent wrapper pgwin32_open().

It seems to me that it would be a better idea to just use the binary
mode for all the open() calls in pg_test_fsync in the long run, which
is what the attached patch does as there is no control on the data
written.  This way, we avoid any translation that may happen during
the write.  Any thoughts?
--
Michael

Вложения

Re: BUG #16526: pg_test_fsync in v12 doesn't run in Windows

От
Michael Paquier
Дата:
On Tue, Jul 07, 2020 at 05:25:49PM +0900, Michael Paquier wrote:
> It seems to me that it would be a better idea to just use the binary
> mode for all the open() calls in pg_test_fsync in the long run, which
> is what the attached patch does as there is no control on the data
> written.  This way, we avoid any translation that may happen during
> the write.  Any thoughts?

I would like to get a second opinion on this one, so for now I have
registered this patch in the CF app:
https://commitfest.postgresql.org/29/2640/
--
Michael

Вложения

Re: BUG #16526: pg_test_fsync in v12 doesn't run in Windows

От
Bruce Momjian
Дата:
On Tue, Jul  7, 2020 at 05:25:49PM +0900, Michael Paquier wrote:
> So, I have been looking at that again.  And, I am not completely sure
> why the combination of FILE_FLAG_NO_BUFFERING and _setmode(O_TEXT)
> would cause this failure, but I suspect that this is caused by the
> fact that CRLF gets changed into single LF characters on input (input
> buffer uses random data), causing the size of what's expected to be
> written to not match with what is actually written.  An additional
> factor is visibly our wrapper safe-concurrent wrapper pgwin32_open().
> 
> It seems to me that it would be a better idea to just use the binary
> mode for all the open() calls in pg_test_fsync in the long run, which
> is what the attached patch does as there is no control on the data
> written.  This way, we avoid any translation that may happen during
> the write.  Any thoughts?

Well, pg_test_fsync is testing binary writes, so using TEXT for the
files might add unwanted overhead, so I think binary is the best
approach.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: BUG #16526: pg_test_fsync in v12 doesn't run in Windows

От
Michael Paquier
Дата:
On Mon, Jul 13, 2020 at 08:06:25PM -0400, Bruce Momjian wrote:
> Well, pg_test_fsync is testing binary writes, so using TEXT for the
> files might add unwanted overhead, so I think binary is the best
> approach.

Thanks.  Please note that we have switched frontend tools to use our
concurrent-safe flavor of open() and fopen() as of 0ba06e0 (12 and
newer versions), but 40cfe86 has shown that we have been using the
text mode in pg_test_fsync since forever as Windows' open() uses the
text mode by default if we don't specify _fmode with _setmode():
https://docs.microsoft.com/en-us/cpp/c-runtime-library/fmode?view=vs-2019

For this reason, it seems more sensible to me to not backpatch this
change only down to 12, but actually all the way down to 9.5.
--
Michael

Вложения

Re: BUG #16526: pg_test_fsync in v12 doesn't run in Windows

От
Bruce Momjian
Дата:
On Tue, Jul 14, 2020 at 12:54:58PM +0900, Michael Paquier wrote:
> On Mon, Jul 13, 2020 at 08:06:25PM -0400, Bruce Momjian wrote:
> > Well, pg_test_fsync is testing binary writes, so using TEXT for the
> > files might add unwanted overhead, so I think binary is the best
> > approach.
> 
> Thanks.  Please note that we have switched frontend tools to use our
> concurrent-safe flavor of open() and fopen() as of 0ba06e0 (12 and
> newer versions), but 40cfe86 has shown that we have been using the
> text mode in pg_test_fsync since forever as Windows' open() uses the
> text mode by default if we don't specify _fmode with _setmode():
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/fmode?view=vs-2019
> 
> For this reason, it seems more sensible to me to not backpatch this
> change only down to 12, but actually all the way down to 9.5.

OK.  I am kind of surprised we haven't received bug reports earlier, but
I guess few Windows users run pg_test_fsync, and fewer report problems.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: BUG #16526: pg_test_fsync in v12 doesn't run in Windows

От
Michael Paquier
Дата:
On Tue, Jul 14, 2020 at 10:27:00AM -0400, Bruce Momjian wrote:
> OK.  I am kind of surprised we haven't received bug reports earlier, but
> I guess few Windows users run pg_test_fsync, and fewer report problems.

On 11 and older versions, we use O_TEXT with pg_test_fsync.  This does
not cause directly a failure, but those versions have as other problem
to not be able to handle properly O_DSYNC, leading to tests of
open_fdatasync to show extremely high and incorrect outputs:
https://www.postgresql.org/message-id/1527864509.2475.49.camel@cybertec.at
And it is the concurrent use of O_DSYNC with O_TEXT that's visibly
causing a problem for 12~.

O_DSYNC handling got addressed in v12 with 0ba06e0, and while changing
back branches for ~11 to allow frontends to use our concurrent-safe
version of fopen()/open() would be tempting, I don't think that fixing
one test of pg_test_fsync is a reason enough to potentially risk
breakages for other client tools inside, or even outside, the core
tree.

If there are no objections, I would like to just change the tool to
use the binary mode down to 9.5.  I can see that the text mode is
getting used for all branches, and that's wrong as the aim is to test
binary writes, basically as you said as well upthread.
--
Michael

Вложения

Re: BUG #16526: pg_test_fsync in v12 doesn't run in Windows

От
Bruce Momjian
Дата:
On Wed, Jul 15, 2020 at 03:53:56PM +0900, Michael Paquier wrote:
> If there are no objections, I would like to just change the tool to
> use the binary mode down to 9.5.  I can see that the text mode is
> getting used for all branches, and that's wrong as the aim is to test
> binary writes, basically as you said as well upthread.

+1

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: BUG #16526: pg_test_fsync in v12 doesn't run in Windows

От
Michael Paquier
Дата:
On Wed, Jul 15, 2020 at 12:26:53PM -0400, Bruce Momjian wrote:
> On Wed, Jul 15, 2020 at 03:53:56PM +0900, Michael Paquier wrote:
>> If there are no objections, I would like to just change the tool to
>> use the binary mode down to 9.5.  I can see that the text mode is
>> getting used for all branches, and that's wrong as the aim is to test
>> binary writes, basically as you said as well upthread.
>
> +1

Thanks.  Fixed then as 932f9fb & co.
--
Michael

Вложения