Обсуждение: Re: BUG #15858: could not stat file - over 4GB

Поиск
Список
Период
Сортировка

Re: BUG #15858: could not stat file - over 4GB

От
Juan José Santamaría Flecha
Дата:

On Thu, Sep 17, 2020 at 6:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> writes:
> Thanks for the reminder. Please find attached a rebased version.

(This hasn't shown up on -hackers yet, maybe caught in moderation?)

Thanks for looking into it. Finally, it went through. I will be removing bug-list from now on. 

I took a quick look through this.  I'm not qualified to review the
actual Windows code in win32stat.c, but as far as the way you're
plugging it into the system goes, it looks good and seems to comport
with the discussion so far.

One thing I noticed, which is a pre-existing problem but maybe now
is a good time to consider it, is that we're mapping lstat() to be
exactly stat() on Windows.  That made sense years ago when (we
believed that) Windows didn't have symlinks, but surely it no longer
makes sense.

I will have to take a better look at it, but from a quick look it, all lstat() calls seem to test just if the file exists, and that can be done with a cheap call to GetFileAttributes(). Would a limited (but fast) lstat(), where only st_mode could be informed, be acceptable?

Another more trivial point is that it'd be good to run the new code
through pgindent before committing.

I do not have pgindent in the WIN32 machine, but I will try to for the next version.

Regards,

Juan José Santamaría Flecha

Re: BUG #15858: could not stat file - over 4GB

От
Juan José Santamaría Flecha
Дата:

On Thu, Sep 17, 2020 at 8:47 PM Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> wrote:
On Thu, Sep 17, 2020 at 6:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

One thing I noticed, which is a pre-existing problem but maybe now
is a good time to consider it, is that we're mapping lstat() to be
exactly stat() on Windows.  That made sense years ago when (we
believed that) Windows didn't have symlinks, but surely it no longer
makes sense.

I will have to take a better look at it, but from a quick look it, all lstat() calls seem to test just if the file exists, and that can be done with a cheap call to GetFileAttributes(). Would a limited (but fast) lstat(), where only st_mode could be informed, be acceptable?

After thinking more about this, that approach would be problematic for DELETE_PENDING files. The proposed patch logic is meant to maintain current behaviour, which is not broken for WIN32 symlinks AFAICT.

Regards,

Juan José Santamaría Flecha

Re: BUG #15858: could not stat file - over 4GB

От
Emil Iggland
Дата:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            not tested

I tested the patch at hand, and it performs as expected. Files larger than 4GB can be imported.

Steps: 
0) create a csv-file that is sufficiently big (>4GB), and one that is small. Use these files to test.
1a) Attempt to import the small file using devel-version.
1b) EXPECTED: success, ACTUAL: success
2a) Attempt to import the big file using devel-version.
2b) EXPECTED: failure, ACTUAL: failure
3) Apply patch and build new version
4a) Attempt to import the small file using patched-version.
4b) EXPECTED: success, ACTUAL: success
4a) Attempt to import the big file using patched-version.
4b) EXPECTED: success, ACTUAL: success

The code looks sensible, it is easy to read and follow. The code uses appropriate win32 functions to perform the task.


Code calculates file size using the following method: buf->st_size = ((__int64) fiData.nFileSizeHigh) << 32 |
(__int64)(fiData.nFileSizeLow);
The hard coded constant 32 is fine, nFileSizeHigh is defined as a DWORD in the Win32 API, which is a 32 bit unsigned
integer.There is no need to a dynamic calculation.
 

There are minor "nit-picks" that I would change if it were my code, but do not change the functionality of the code. 

1) 
if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
{
  errno = ENOENT;
  return -1;
}

Here I would call _dosmaperr(GetLastError()) instead, just to take account of the possibility that some other error
occurred. Following this change there are slight inconsistency in the order of "CloseHandle(hFile), errno = ENOENT;
return-1" and "_dosmaperr(GetLastError()); CloseHandle(hFile); return -1". I would prefer consistent ordering, but that
isnot important. 

The new status of this patch is: Ready for Committer

Re: BUG #15858: could not stat file - over 4GB

От
Tom Lane
Дата:
Emil Iggland <emil@iggland.com> writes:
> I tested the patch at hand, and it performs as expected. Files larger than 4GB can be imported.

Thanks for testing!

I'd been expecting one of our Windows-savvy committers to pick this up,
but since nothing has been happening, I took it on myself to push it.
I'll probably regret that :-(

I made a few cosmetic changes, mostly reorganizing comments in a way
that made more sense to me.

            regards, tom lane



Re: BUG #15858: could not stat file - over 4GB

От
Juan José Santamaría Flecha
Дата:

On Fri, Oct 9, 2020 at 10:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Emil Iggland <emil@iggland.com> writes:
> I tested the patch at hand, and it performs as expected. Files larger than 4GB can be imported.

Thanks for testing!
 
  Thanks for testing! +1

I'd been expecting one of our Windows-savvy committers to pick this up,
but since nothing has been happening, I took it on myself to push it.
I'll probably regret that :-(

Thanks for taking care of this. I see no problems in the build farm, but please reach me if I missed something. 

I made a few cosmetic changes, mostly reorganizing comments in a way
that made more sense to me.

I was working on a new version, which was pgindent-friendlier and clearer about reporting an error when 'errno' is not informed. Please find attached a patch with those changes.

Regards,

Juan José Santamaría Flecha 
Вложения

Re: BUG #15858: could not stat file - over 4GB

От
Michael Paquier
Дата:
On Sat, Oct 10, 2020 at 01:31:21PM +0200, Juan José Santamaría Flecha wrote:
> Thanks for taking care of this. I see no problems in the build farm, but
> please reach me if I missed something.

Thanks for continuing your work on this patch.  I see no related
failures in the buildfarm.

-               _dosmaperr(GetLastError());
+               DWORD           err = GetLastError();
+
+               /* report when not ERROR_SUCCESS */
+               if (err == ERROR_FILE_NOT_FOUND || err == ERROR_PATH_NOT_FOUND)
+                       errno = ENOENT;
+               else
+                       _dosmaperr(err);
Why are you changing that?  The original coding is fine, as
_dosmaperr() already maps ERROR_FILE_NOT_FOUND and
ERROR_PATH_NOT_FOUND to ENOENT.

-      _dosmaperr(GetLastError());
+      DWORD           err = GetLastError();
+
       CloseHandle(hFile);
+      _dosmaperr(err);
These parts are indeed incorrect.  CloseHandle() could overwrite
errno.
--
Michael

Вложения

Re: BUG #15858: could not stat file - over 4GB

От
Juan José Santamaría Flecha
Дата:
On Sat, Oct 10, 2020 at 2:24 PM Michael Paquier <michael@paquier.xyz> wrote:

-               _dosmaperr(GetLastError());
+               DWORD           err = GetLastError();
+
+               /* report when not ERROR_SUCCESS */
+               if (err == ERROR_FILE_NOT_FOUND || err == ERROR_PATH_NOT_FOUND)
+                       errno = ENOENT;
+               else
+                       _dosmaperr(err);
Why are you changing that?  The original coding is fine, as
_dosmaperr() already maps ERROR_FILE_NOT_FOUND and
ERROR_PATH_NOT_FOUND to ENOENT.

If the file does not exist there is no need to call _dosmaperr() and log the error. 

-      _dosmaperr(GetLastError());
+      DWORD           err = GetLastError();
+
       CloseHandle(hFile);
+      _dosmaperr(err);
These parts are indeed incorrect.  CloseHandle() could overwrite
errno.

The meaningful error should come from the previous call, and an error from CloseHandle() could mask it. Not sure it makes a difference anyhow.

Regards,

Juan José Santamaría Flecha

Re: BUG #15858: could not stat file - over 4GB

От
Tom Lane
Дата:
=?UTF-8?Q?Juan_Jos=C3=A9_Santamar=C3=ADa_Flecha?= <juanjo.santamaria@gmail.com> writes:
> If the file does not exist there is no need to call _dosmaperr() and log
> the error.

I concur with Michael that it's inappropriate to make an end run around
_dosmaperr() here.  If you think that the DEBUG5 logging inside that
is inappropriate, you should propose removing it outright.

Pushed the rest of this.

(pgindent behaved differently around PFN_NTQUERYINFORMATIONFILE today
than it did yesterday.  No idea why.)

> The meaningful error should come from the previous call, and an error from
> CloseHandle() could mask it. Not sure it makes a difference anyhow.

Would CloseHandle() really touch errno at all?  But this way is
certainly safer, so done.

            regards, tom lane



Re: BUG #15858: could not stat file - over 4GB

От
Juan José Santamaría Flecha
Дата:

On Sat, Oct 10, 2020 at 7:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I concur with Michael that it's inappropriate to make an end run around
_dosmaperr() here.  If you think that the DEBUG5 logging inside that
is inappropriate, you should propose removing it outright.

Pushed the rest of this.

Great, thanks again to everyone who has taken some time to look into this.

Regards,

Juan José Santamaría Flecha

Re: BUG #15858: could not stat file - over 4GB

От
Michael Paquier
Дата:
On Sat, Oct 10, 2020 at 09:00:27PM +0200, Juan José Santamaría Flecha wrote:
> Great, thanks again to everyone who has taken some time to look into this.

We are visibly not completely out of the woods yet, jacana is
reporting a compilation error:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2020-10-10%2018%3A00%3A28
Oct 10 14:04:40
c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/port/win32stat.c:
In function 'fileinfo_to_stat':
Oct 10 14:04:40
c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/port/win32stat.c:151:13:
error: 'BY_HANDLE_FILE_INFORMATION {aka struct
_BY_HANDLE_FILE_INFORMATION}' has no member named 'nFileSizeLowi'; did
you mean 'nFileSizeLow'?
Oct 10 14:04:40       fiData.nFileSizeLowi);
Oct 10 14:04:40              ^~~~~~~~~~~~~
Oct 10 14:04:40              nFileSizeLow

I don't have the time to check MinGW and HEAD now, so that's just a
heads-up.
--
Michael

Вложения

Re: BUG #15858: could not stat file - over 4GB

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> We are visibly not completely out of the woods yet, jacana is
> reporting a compilation error:

Nah, I fixed that hours ago (961e07b8c).  jacana must not have run again
yet.

            regards, tom lane



Re: BUG #15858: could not stat file - over 4GB

От
Michael Paquier
Дата:
On Sat, Oct 10, 2020 at 08:34:48PM -0400, Tom Lane wrote:
> Nah, I fixed that hours ago (961e07b8c).  jacana must not have run again
> yet.

Indeed, thanks.  I have missed one sync here.

+   hFile = CreateFile(name,
+                      GENERIC_READ,
+                      (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE),
+                      &sa,
+                      OPEN_EXISTING,
+                      (FILE_FLAG_NO_BUFFERING | FILE_FLAG_BACKUP_SEMANTICS |
+                       FILE_FLAG_OVERLAPPED),
+                      NULL);
+   if (hFile == INVALID_HANDLE_VALUE)
+   {
+       CloseHandle(hFile);
+       errno = ENOENT;
+       return -1;
+   }
Why are we forcing errno=ENOENT here?  Wouldn't it be correct to use
_dosmaperr(GetLastError()) to get the correct errno?  This code would
for example consider as non-existing a file even if we fail getting it
because of ERROR_SHARING_VIOLATION, which should map to EACCES.  This
case can happen with virus scanners taking a non-share handle on files
being looked at in parallel of this code path.
--
Michael

Вложения

Re: BUG #15858: could not stat file - over 4GB

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> Why are we forcing errno=ENOENT here?  Wouldn't it be correct to use
> _dosmaperr(GetLastError()) to get the correct errno?

Fair question.  Juan, was there some good reason not to look at
GetLastError() in this step?

            regards, tom lane



Re: BUG #15858: could not stat file - over 4GB

От
Juan José Santamaría Flecha
Дата:

On Mon, Oct 12, 2020 at 5:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael@paquier.xyz> writes:
> Why are we forcing errno=ENOENT here?  Wouldn't it be correct to use
> _dosmaperr(GetLastError()) to get the correct errno?

Fair question.  Juan, was there some good reason not to look at
GetLastError() in this step?

Uhm, a good question indeed, forcing errno serves no purpose there.

Regards,

Juan José Santamaría Flecha

Re: BUG #15858: could not stat file - over 4GB

От
Tom Lane
Дата:
=?UTF-8?Q?Juan_Jos=C3=A9_Santamar=C3=ADa_Flecha?= <juanjo.santamaria@gmail.com> writes:
> On Mon, Oct 12, 2020 at 5:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Michael Paquier <michael@paquier.xyz> writes:
>>> Why are we forcing errno=ENOENT here?  Wouldn't it be correct to use
>>> _dosmaperr(GetLastError()) to get the correct errno?

>> Fair question.  Juan, was there some good reason not to look at
>> GetLastError() in this step?

> Uhm, a good question indeed, forcing errno serves no purpose there.

OK, changed.

            regards, tom lane



Re: BUG #15858: could not stat file - over 4GB

От
Michael Paquier
Дата:
On Mon, Oct 12, 2020 at 11:13:38AM -0400, Tom Lane wrote:
> Juan José Santamaría Flecha wrote:
>> Uhm, a good question indeed, forcing errno serves no purpose there.
>
> OK, changed.

Thanks!
--
Michael

Вложения