Обсуждение: BUG #17288: PSQL bug with COPY command (Windows)

Поиск
Список
Период
Сортировка

BUG #17288: PSQL bug with COPY command (Windows)

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17288
Logged by:          Dmitry Koval
Email address:      d.koval@postgrespro.ru
PostgreSQL version: 14.1
Operating system:   Windows 10 (21H1)
Description:

Hi,
there is a problem on the REL_14_STABLE branch (on the REL_14_1
branch too). When executing a query under Windows 10 (21H1)

\copy (SELECT 1) TO stdout

PSQL utility prints error 
"could not stat file" (null) ": Invalid argument" and crashes.
There is no error under Ubuntu 20.04 LTS.

Issue source: commit bed90759fcbcd72d4d06969eebab81e47326f9a 
("Fix our Windows stat() emulation to handle file sizes > 4GB.", 
discussion: https://postgr.es/m/15858-9572469fd3b73263@postgresql.org).

In this commit function stat () was replaced to function 
GetFileInformationByHandle() 
(see src/port/win32stat.c, function fileinfo_to_stat()):

    if (!GetFileInformationByHandle(hFile, &fiData))
    {
        _dosmaperr(GetLastError());
        return -1;
    }

Function GetFileInformationByHandle() works for files but can not work
with streams like "stdout".
For "stdout" the GetFileInformationByHandle () function returns an error
(GetLastError () == ERROR_INVALID_FUNCTION), which causes PSQL crash.

Examples of errors processing for function GetFileInformationByHandle()
in other applications:

1) https://github.com/python/cpython/blob/main/Modules/posixmodule.c
2)
https://doxygen.reactos.org/da/d6a/subsystems_2mvdm_2ntvdm_2hardware_2disk_8c_source.html

Quick fix in src/port/win32stat.c: 
in case function GetFileInformationByHandle() returns a specific error
code, call the _stat64() function for this descriptor.
It's not elegant, but it works.

(I'll attach file with patch in next email).

With best regards,
Dmitry Koval.


Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Dmitry Koval
Дата:
Attachments: patch + screenshot with error.
Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Michael Paquier
Дата:
On Wed, Nov 17, 2021 at 01:29:02PM +0300, Dmitry Koval wrote:
> Attachments: patch + screenshot with error.

Gulp.  I can reproduce this problem.

>      if (!GetFileInformationByHandle(hFile, &fiData))
>      {
> -        _dosmaperr(GetLastError());
> +        DWORD        error = GetLastError();
> +
> +        switch (error)
> +        {
> +            case ERROR_INVALID_PARAMETER:
> +            case ERROR_INVALID_FUNCTION:
> +            case ERROR_NOT_SUPPORTED:
> +
> +                /*
> +                 * Object is other than real file (stdout, for example). So
> +                 * need to call _fstat64 in this case. "struct stat" should
> +                 * match "struct __stat64", see "struct stat" definition.
> +                 */
> +                if (fileno >= 0)
> +                    return _fstat64(fileno, (struct __stat64 *) buf);
> +                else if (name)
> +                    return _stat64(name, (struct __stat64 *) buf);
> +        }
> +        _dosmaperr(error);
>          return -1;

Hmm.  _fstat64() and _stat64() have proved to be tricky to work with
and rather unworkable across all the build systems we support
(remember the 2GB file size problem, for example), which is why we
have the existing business of win32stat.c to begin with.  It seems to
me, also, that we could run into issues if we blindly map those error
codes to call a stat() function as fallback.  I am not sure that doing
a blind cast to __stat64 is going to work all the time, either.

I think that we had better never call GetFileInformationByHandle() if
we use a fileno that maps to stdin, stdout or stderr.  The 10000$
question is what to use though.  One option is to let our emulation
code fill in the gap for those three cases with at least st_mode
filled with S_IFIFO, then return 0 as error code :/

Just to be sure, this is the code path in psql's copy.c where we check
that a specified copystream is not a directory, right?
--
Michael

Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Juan José Santamaría Flecha
Дата:
Thanks for reporting.

On Thu, Nov 18, 2021 at 8:40 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Nov 17, 2021 at 01:29:02PM +0300, Dmitry Koval wrote:
> Attachments: patch + screenshot with error.

I think that we had better never call GetFileInformationByHandle() if
we use a fileno that maps to stdin, stdout or stderr.  The 10000$
question is what to use though.  One option is to let our emulation
code fill in the gap for those three cases with at least st_mode
filled with S_IFIFO, then return 0 as error code :/

When the file descriptor is coming from a stream I don't think we have to use GetFileInformationByHandle() or stat() to get any information, but we can directly return the stat information. Please find attached a patch for so.

Regards,

Juan José Santamaría Flecha
Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Dmitry Koval
Дата:
 >_fstat64() and _stat64() have proved to be tricky to work with and 
 >rather unworkable across all the build systems we support

I agree, it is better not use them.

 >I think that we had better never call GetFileInformationByHandle() if
 >we use a fileno that maps to stdin, stdout or stderr.

Probably we should call GetFileInformationByHandle() for case standard
streams stdin/stdout/stderr are redirected to files.
See file src\backend\utils\error\elog.c, for example. It contains
lines:

if (!freopen(OutputFileName, "a", stderr))
if (!freopen(OutputFileName, "a", stdout))

And this command with "stderr" works in PSQL without crash (in
contrast to "stdout"):

\copy (SELECT 1) TO stderr

(it put resullt into file with name "stderr").
We can emulate stats for stdin/stdout/stderr after call
GetFileInformationByHandle().

 >Just to be sure, this is the code path in psql's copy.c where we check 
 >that a specified copystream is not a directory, right?

Yes, fstat() called from file src/bin/psql/copy.c:

/* make sure the specified file is not a directory */
if ((result = fstat(fileno(copystream), &st)) < 0)

I attached new patch version.

With best regards,
Dmitry Koval.
Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Dmitry Koval
Дата:
Three corrections for patch
v2-0001-Fixed-Windows-stat-emulation-working-with-streams.patch:

1) I see in debugger that field "st_mode" for stdout has value
_S_IFCHR (0x2000, 8192);

2) "st_rdev" feild should have the same value as "st_dev";

3) we can not use emulated stat information for stderr after call
"freopen(OutputFileName, "a", stderr);".

With best regards,
Dmitry Koval.



Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Juan José Santamaría Flecha
Дата:

On Thu, Nov 18, 2021 at 2:40 PM Dmitry Koval <d.koval@postgrespro.ru> wrote:
 >I think that we had better never call GetFileInformationByHandle() if
 >we use a fileno that maps to stdin, stdout or stderr.

Probably we should call GetFileInformationByHandle() for case standard
streams stdin/stdout/stderr are redirected to files.
See file src\backend\utils\error\elog.c, for example. It contains
lines:

if (!freopen(OutputFileName, "a", stderr))
if (!freopen(OutputFileName, "a", stdout))

And this command with "stderr" works in PSQL without crash (in
contrast to "stdout"):

\copy (SELECT 1) TO stderr

(it put resullt into file with name "stderr").
We can emulate stats for stdin/stdout/stderr after call
GetFileInformationByHandle().

You are right about freopen(), but stderr works just because it's being parsed as a file, not a stream, by parse_slash_copy().
 
I attached new patch version.

I would keep the memset(buf, 0, sizeof(*buf)) for the members we are not setting.

Regards,

Juan José Santamaría Flecha

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Dmitry Koval
Дата:
 >You are right about freopen(), but stderr works just because it's being
 >parsed as a file, not a stream, by parse_slash_copy().

You are right. I wrote about stderr without inspecting code (

 >I would keep the memset(buf, 0, sizeof(*buf)) for the members we
 >are not setting.

Of course. Original code (src/port/win32stat.c) contains line

memset(buf, 0, sizeof(*buf))

before call GetFileInformationByHandle().

With best regards,
Dmitry Koval.



Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Michael Paquier
Дата:
On Thu, Nov 18, 2021 at 09:48:55PM +0300, Dmitry Koval wrote:
> >You are right about freopen(), but stderr works just because it's being
> >parsed as a file, not a stream, by parse_slash_copy().
>
> You are right. I wrote about stderr without inspecting code (
>
> >I would keep the memset(buf, 0, sizeof(*buf)) for the members we
> >are not setting.
>
> Of course. Original code (src/port/win32stat.c) contains line
>
> memset(buf, 0, sizeof(*buf))
>
> before call GetFileInformationByHandle().

I still think that we should not call GetFileInformationByHandle()
when it comes to an argument that we know will fail, so I would agree
with Juan's approach in v2 to just patch _pgfstat64() rather than
messing with the code paths in charge of checking handles pending for
deletion.  Each fileno is going to be 0, 1 or 2 in those cases, still
it would be better to rely on the result of fileno() as v3 is doing?

It looks like you are right about _S_IFCHR for st_mode, as of this
part of the docs:

https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fstat-fstat32-fstat64-fstati64-fstat32i64-fstat64i32?view=msvc-170
--
Michael

Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Dmitry Koval
Дата:
 >I still think that we should not call GetFileInformationByHandle()
 >when it comes to an argument that we know will fail, ...

Our program calls function freopen() (this function is used in file
src\backend\utils\error\elog.c). So we don't know exactly that
GetFileInformationByHandle() will fails for stdin/stdout/stderr.

Ior illustration, I attached small example (test.c) for this case:
1) GetFileInformationByHandle() for "stdout" works without error;
2) file "1.txt" after running test contains:

Sample string
attributes = 0x2020, file size = 15

 >Each fileno is going to be 0, 1 or 2 in those cases, still
 >it would be better to rely on the result of fileno() as v3 is
 >doing?

I think fileno() is safer. We can call fclose(stdout) and
_fileno(stdout) returns -1 after that. But I don't know:
can OS reuse fileno=1 or not in this case?

With best regards,
Dmitry.
Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Juan José Santamaría Flecha
Дата:

On Fri, Nov 19, 2021 at 9:47 AM Dmitry Koval <d.koval@postgrespro.ru> wrote:
 >I still think that we should not call GetFileInformationByHandle()
 >when it comes to an argument that we know will fail, ...

Our program calls function freopen() (this function is used in file
src\backend\utils\error\elog.c). So we don't know exactly that
GetFileInformationByHandle() will fails for stdin/stdout/stderr.

Ior illustration, I attached small example (test.c) for this case:
1) GetFileInformationByHandle() for "stdout" works without error;
2) file "1.txt" after running test contains:

Sample string
attributes = 0x2020, file size = 15

We can check for redirection without calling GetFileInformationByHandle(), please consider the attached patch.

Regards,

Juan José Santamaría Flecha
Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Dmitry Koval
Дата:
 >We can check for redirection without calling
 >GetFileInformationByHandle(), please consider the attached patch.

There are doubts:

1) we always use system function (GetFinalPathNameByHandleA or
GetFileInformationByHandle); sometimes we use these two calls
GetFinalPathNameByHandleA + GetFileInformationByHandle
together (in freopen() case);

2) function GetFinalPathNameByHandleA is slower than
GetFileInformationByHandle (2-4 times).
Might be exist a cheaper way than GetFinalPathNameByHandleA?

I don't understand why this way better than using one call
GetFileInformationByHandle...

With best regards,
Dmitry Koval.



Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Juan José Santamaría Flecha
Дата:

On Fri, Nov 19, 2021 at 11:45 AM Dmitry Koval <d.koval@postgrespro.ru> wrote:
 >We can check for redirection without calling
 >GetFileInformationByHandle(), please consider the attached patch.

There are doubts:

1) we always use system function (GetFinalPathNameByHandleA or
GetFileInformationByHandle); sometimes we use these two calls
GetFinalPathNameByHandleA + GetFileInformationByHandle
together (in freopen() case);

2) function GetFinalPathNameByHandleA is slower than
GetFileInformationByHandle (2-4 times).
Might be exist a cheaper way than GetFinalPathNameByHandleA?

I don't understand why this way better than using one call
GetFileInformationByHandle...

Personally, I think it is more readable, although a comment should explain what we are checking.

GetFinalPathNameByHandleA() is slower when there is redirection, when there is no redirection it's similar or faster than GetFileInformationByHandle(), and that should be the most common case.

Regards,

Juan José Santamaría Flecha

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Dmitry Koval
Дата:
 >GetFinalPathNameByHandleA() is slower when there is redirection, when
 >there is no redirection it's similar or faster than
 >GetFileInformationByHandle(), and that should be the most common case.

I tested 2 cases:

1) "stdout" is not redirected.
In this case GetFinalPathNameByHandleA() returns error and 
GetFileInformationByHandle() returns error.
Simple test in attachment.
---
Result:
GetFileInformationByHandle: 7, GetFinalPathNameByHandleA: 16

2) "stdout" redirected (need to uncomment 2 lines in test).
Both function works without error.
---
Result (see file "1.txt"):
GetFileInformationByHandle: 34, GetFinalPathNameByHandleA: 120

The difference is not very big, but it is.

With best regards,
Dmitry.
Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Michael Paquier
Дата:
On Fri, Nov 19, 2021 at 12:44:14PM +0100, Juan José Santamaría Flecha wrote:
> GetFinalPathNameByHandleA() is slower when there is redirection, when there
> is no redirection it's similar or faster than GetFileInformationByHandle(),
> and that should be the most common case.

Yeah, the slight performance impact caused by the redirection does not
worry me much, based on the code path of Postgres where this would be
called.  And that's unlikely going to become a bottleneck anyway as
GetFinalPathNameByHandleA() would just be called if we know that we
are with stdin, stdout or stderr thanks to the first part of the "if"
clause.

I like the simplicity behind v4 as it does not interfere with
_pgstat64() while dealing with the redirection case, so it should
address the concerns from Dmitry anyway (right?).  We should really
add a comment explaining why this is handled this way though for the
case of streams, why the case of the redirection matters, and why we
count on a failure of GetFinalPathNameByHandleA().
--
Michael

Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Dmitry Koval
Дата:
 >... We should really
 >add a comment explaining why this is handled this way though for the
 >case of streams, why the case of the redirection matters, and why we
 >count on a failure of GetFinalPathNameByHandleA().

Added a comment for v4 (as draft).
But I don't know English well enough and probably comment should be
corrected.


With best regards,
Dmitry.
Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Juan José Santamaría Flecha
Дата:

On Sat, Nov 20, 2021 at 12:51 PM Dmitry Koval <d.koval@postgrespro.ru> wrote:

Added a comment for v4 (as draft).
But I don't know English well enough and probably comment should be
corrected.

It reads a little too much like pseudocode for my liking. Please consider the attached version.

Regards,

Juan José Santamaría Flecha
Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Dmitry Koval
Дата:
> It reads a little too much like pseudocode for my liking. Please 
> consider the attached version.

This is a good comment, I like it.

With best regards,
Dmitry Koval.



Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Michael Paquier
Дата:
On Mon, Nov 22, 2021 at 02:07:09PM +0300, Dmitry Koval wrote:
> > It reads a little too much like pseudocode for my liking. Please
> > consider the attached version.
>
> This is a good comment, I like it.

        /*
+        * Check if the fileno is a data stream. If so, unless it has been
+        * redirected to a file, getting information by handle will fail. Return
+        * the appropriate stat information instead.
+        */
I would perhaps add an extra note that we emulate the result in the
best way we can.  Anyway, what you have here looks basically enough.

Another thing that has been itching me on this thread is that we have
not been able to detect this problem even if we have tests in
copyselect.sql that cover those paths.  Is that because running
vcregress.pl implies a redirection of stdout so we would never see,
except from a terminal?
--
Michael

Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Dmitry Koval
Дата:
 >Is that because running vcregress.pl implies a redirection of stdout so
 >we would never see, except from a terminal?

Unfortunately, you are.
All my COPY-tests work well with files.
I see a problem in the terminal only ...

With best regards,
Dmitry Koval.



Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Michael Paquier
Дата:
On Wed, Nov 24, 2021 at 10:44:09AM +0300, Dmitry Koval wrote:
> Unfortunately, you are.
> All my COPY-tests work well with files.
> I see a problem in the terminal only ...

While checking all the callers of fstat() in the core code, I have
discovered that this broke a second case when stdout is not
redirected: pg_recvlogical -f -.  In this case, the code would just
bump on EINVAL repeatedly instead of crashing, but equally broken it
was.  There is a TAP test for this case in src/bin/pg_basebackup/, but
it would not trigger the problem per the redirection already done
there.

Anyway, I have run more tests, tweaked slightly the comment, and
applied it down to 14.  Thanks to both of you!
--
Michael

Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Juan José Santamaría Flecha
Дата:

On Thu, Nov 25, 2021 at 4:23 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Nov 24, 2021 at 10:44:09AM +0300, Dmitry Koval wrote:
> Unfortunately, you are.
> All my COPY-tests work well with files.
> I see a problem in the terminal only ...

Anyway, I have run more tests, tweaked slightly the comment, and
applied it down to 14.  Thanks to both of you!

After some experimentation I have been able to test this type of error through a tap test using console input. Looking at the patch I'm not sure it is worth all the extra code, but I'm submitting it for future reference at least.

Regards,

Juan José Santamaría Flecha 
Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Michael Paquier
Дата:
On Fri, Nov 26, 2021 at 01:37:49PM +0100, Juan José Santamaría Flecha wrote:
> through a tap test using console input. Looking at the patch I'm not sure
> it is worth all the extra code, but I'm submitting it for future reference
> at least.

+       eval {
+               $result = IPC::Run::run [ 'psql', '-p', $port, '-c',
"\\copy public.test from stdin", 'postgres' ],
+                       IPC::Run::timeout( 1, name => "stall timeout" );
Using directly psql commands in the tests is not a good idea as it
makes the client-side execution more sensitive to the environment.
src/test/perl/PostgreSQL/Test/Cluster.pm does the same work, in the
wanted way, so it would be better to rely on it (see 384f1ab for one
recent issue).  A test with a minimal timeout also takes time, making
for slower tests on faster machines.

Hmm.  Do you think that a test based on Cluster::psql and stdout would
actually be able to do the work or would the redirection done in the
INIT block of Utils.pm prevent that?  Contrary to the code path of
pg_recvlogical, Cluster::psql would use directly IPC::Run, and not
redirect stdout to the test log file.

I would stick a test in one of the existing test suites, to not create
an extra cluster and reduce its run time.
--
Michael

Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Michael Paquier
Дата:
On Sun, Nov 28, 2021 at 01:15:36PM +0900, Michael Paquier wrote:
> Using directly psql commands in the tests is not a good idea as it
> makes the client-side execution more sensitive to the environment.
> src/test/perl/PostgreSQL/Test/Cluster.pm does the same work, in the
> wanted way, so it would be better to rely on it (see 384f1ab for one
> recent issue).  A test with a minimal timeout also takes time, making
> for slower tests on faster machines.

Or, rather than relying on a small timeout, could we just send some
data to stdin and check the table's contents in return?  That would
make the test faster, while not penalizing the coverage.
--
Michael

Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Juan José Santamaría Flecha
Дата:

On Mon, Nov 29, 2021 at 7:51 AM Michael Paquier <michael@paquier.xyz> wrote:

We know that GetFileInformationByHandle() would be able to work in
the scope of win32stat.c as it is supported down to Windows XP so we
would be covered, so switching to that instead of
GetFinalPathNameByHandleA() would take care of the problem.  Better to
call that only for a standard stream, only for the emulation of
fstat(), in the spirit of the original fix.

Fair enough. Windows XP support is a discussion for another thread.

Regards,

Juan José Santamaría Flecha

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Michael Paquier
Дата:
On Mon, Nov 29, 2021 at 08:53:28AM +0100, Juan José Santamaría Flecha wrote:
> Fair enough. Windows XP support is a discussion for another thread.

Yeah, I would not mind doing something about that on HEAD.  I am not
sure what this implies for the buildfarm, though.
--
Michael

Вложения

Re: BUG #17288: PSQL bug with COPY command (Windows)

От
Michael Paquier
Дата:
On Tue, Nov 30, 2021 at 09:57:23AM +0900, Michael Paquier wrote:
> Yeah, I would not mind doing something about that on HEAD.  I am not
> sure what this implies for the buildfarm, though.

So, this compatibility fix has been applied as of 58651d8.  And based
on the buildfarm members jacana and fairywren, the warning is now
gone.
--
Michael

Вложения