Thread: Switch buffile.c/h to use pgoff_t

Switch buffile.c/h to use pgoff_t

From
Michael Paquier
Date:
Hi all,
(Added Bryan in CC as he has been looking at this stuff previously.)

As mentioned on this thread and as a part of the quest to remove more
of long in the tree, buffile.c and buffile.h still rely on an
unportable off_t, which is signed 4 bytes on Windows:
https://www.postgresql.org/message-id/0f238ff4-c442-42f5-adb8-01b762c94ca1@gmail.com

Please find attached a patch to do the switch.  I was surprised to see
that the amount of code to adapt was limited, the routines of
buffile.h changed in this commit being used in other places that keep
track of offsets.  Hence these other files just need to do an off_t =>
pgoff_t flip in a couple of structures to be updated, as far as I can
see.

This removes a couple of extra long casts, as well as one comment in
BufFileSeek() that relates to overflows for large offsets, that would
not exist with this switch, which is nice.

Thanks,
--
Michael

Attachments

Re: Switch buffile.c/h to use pgoff_t

From
Chao Li
Date:

> On Dec 19, 2025, at 09:43, Michael Paquier <michael@paquier.xyz> wrote:
>
> Hi all,
> (Added Bryan in CC as he has been looking at this stuff previously.)
>
> As mentioned on this thread and as a part of the quest to remove more
> of long in the tree, buffile.c and buffile.h still rely on an
> unportable off_t, which is signed 4 bytes on Windows:
> https://www.postgresql.org/message-id/0f238ff4-c442-42f5-adb8-01b762c94ca1@gmail.com
>
> Please find attached a patch to do the switch.  I was surprised to see
> that the amount of code to adapt was limited, the routines of
> buffile.h changed in this commit being used in other places that keep
> track of offsets.  Hence these other files just need to do an off_t =>
> pgoff_t flip in a couple of structures to be updated, as far as I can
> see.
>
> This removes a couple of extra long casts, as well as one comment in
> BufFileSeek() that relates to overflows for large offsets, that would
> not exist with this switch, which is nice.
>
> Thanks,
> --
> Michael
> <0001-Switch-buffile.c-h-to-use-portable-pgoff_t.patch>


```
     while (wpos < file->nbytes)
     {
-        off_t        availbytes;
+        pgoff_t        availbytes;
         instr_time    io_start;
         instr_time    io_time;

@@ -524,7 +524,7 @@ BufFileDumpBuffer(BufFile *file)
         bytestowrite = file->nbytes - wpos;
         availbytes = MAX_PHYSICAL_FILESIZE - file->curOffset;

-        if ((off_t) bytestowrite > availbytes)
+        if ((pgoff_t) bytestowrite > availbytes)
             bytestowrite = (int) availbytes;
```

bytestowrite is of type int; it is widened to pgoff_t and then compared with availbytes. If bytestowrite > availbytes,
then availbytes must be within the range of int, so the next assignment “bytestowrite = (int) availbytes” is safe, but
it makes the code harder to read.

Given MAX_PHYSICAL_FILESIZE is just 1G (2^30), why availbytes has to be pgoff_t instead of just int?
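
The reasoning about the cast can be checked with a small stand-alone sketch. This is only an illustration, not PostgreSQL code: int64_t stands in for pgoff_t, and MAX_PHYSICAL_FILESIZE_SKETCH and clamp_bytes_to_write() are hypothetical names. The 2^30 limit is the value mentioned in this message.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumption: mirrors the 1 GB (2^30) segment limit mentioned in the thread. */
#define MAX_PHYSICAL_FILESIZE_SKETCH ((int64_t) 1 << 30)

/*
 * Hypothetical helper: clamp an int byte count to the room left in the
 * current segment. int64_t stands in for pgoff_t.
 */
int
clamp_bytes_to_write(int bytestowrite, int64_t cur_offset)
{
    int64_t availbytes = MAX_PHYSICAL_FILESIZE_SKETCH - cur_offset;

    assert(bytestowrite >= 0 && availbytes >= 0);

    if ((int64_t) bytestowrite > availbytes)
    {
        /* availbytes < bytestowrite <= INT_MAX, so narrowing is lossless. */
        assert(availbytes <= INT_MAX);
        bytestowrite = (int) availbytes;
    }
    return bytestowrite;
}
```

With 100 bytes left in the segment, a request for 8192 bytes is clamped to 100. The narrowing cast is only reached when availbytes is already smaller than an int value.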

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: Switch buffile.c/h to use pgoff_t

From
Bertrand Drouvot
Date:
Hi,

On Fri, Dec 19, 2025 at 11:00:54AM +0800, Chao Li wrote:
> Given MAX_PHYSICAL_FILESIZE is just 1G (2^30), why availbytes has to be pgoff_t instead of just int?

I agree that int would work, but maybe it's using pgoff_t just to be on the safe
side of things should MAX_PHYSICAL_FILESIZE become 2^31 or higher one day?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Switch buffile.c/h to use pgoff_t

From
Michael Paquier
Date:
On Fri, Dec 19, 2025 at 11:00:54AM +0800, Chao Li wrote:
> Given MAX_PHYSICAL_FILESIZE is just 1G (2^30), why availbytes has to
> be pgoff_t instead of just int?

The point of such changes would be to lift this barrier at some point,
which is what the other thread I am mentioning upthread is also
pointing at.  It does not change the fact that this code is currently
not portable as written: off_t can be 4 or 8 bytes depending on the
environment, and pgoff_t exists to be a stable alternative.  This
relates as well to the use of long in the tree, all coming down to
WIN32.
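
To make the portability concern concrete, here is a minimal sketch of what a 4-byte signed offset can and cannot hold. It is an illustration under stated assumptions: int32_t stands in for a 4-byte off_t, int64_t for pgoff_t, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: can this offset be stored in a signed 4-byte type? */
bool
fits_in_4byte_offset(int64_t offset)
{
    return offset >= INT32_MIN && offset <= INT32_MAX;
}

/* Hypothetical helper: narrow an offset, refusing values that would not fit. */
int32_t
narrow_offset_checked(int64_t offset)
{
    assert(fits_in_4byte_offset(offset));
    return (int32_t) offset;
}
```

An offset of 2^30 fits, while 2^31 does not, which is why a limit above the current 1 GB segment size would need an 8-byte type on every platform.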
--
Michael

Attachments

Re: Switch buffile.c/h to use pgoff_t

From
Michael Paquier
Date:
On Fri, Dec 19, 2025 at 02:22:02PM +0900, Michael Paquier wrote:
> The point of such changes would be to lift this barrier at some point,
> which is what the other thread I am mentioning upthread is also
> pointing at.  It does not change the fact that this code is currently
> not portable as written: off_t can be 4 or 8 bytes depending on the
> environment, and pgoff_t exists to be a stable alternative.  This
> relates as well to the use of long in the tree, all coming down to
> WIN32.

Getting rid of a couple more long assumptions while removing one
portability comment from buffile.c is appealing while the change is
not invasive, so applied.
--
Michael

Attachments

Re: Switch buffile.c/h to use pgoff_t

From
Chao Li
Date:

On Fri, Dec 19, 2025 at 1:22 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, Dec 19, 2025 at 11:00:54AM +0800, Chao Li wrote:
>> Given MAX_PHYSICAL_FILESIZE is just 1G (2^30), why availbytes has to
>> be pgoff_t instead of just int?
>
> The point of such changes would be to lift this barrier at some point,
> which is what the other thread I am mentioning upthread is also
> pointing at.  It does not change the fact that this code is currently
> not portable as written: off_t can be 4 or 8 bytes depending on the
> environment, and pgoff_t exists to be a stable alternative.  This
> relates as well to the use of long in the tree, all coming down to
> WIN32.
> --
> Michael

Sorry, I didn’t explain myself clearly earlier. My main point was to avoid the awkward mixed-type casts here:
```
if ((pgoff_t) bytestowrite > availbytes)
    bytestowrite = (int) availbytes;
```

I agree that changing availbytes to int would not be a good choice. Instead, I tried making bytestowrite a pgoff_t, so that the comparison and assignment can be done without casts, while still keeping the code correct if MAX_PHYSICAL_FILESIZE is lifted in the future.

I’ve attached a small patch along these lines. It compiles without warnings, and "make check" passes on my side. What do you think?

Best regards,
==
Chao Li (Evan)
---------------------
HighGo Software Co., Ltd.
Attachments

Re: Switch buffile.c/h to use pgoff_t

From
Michael Paquier
Date:
On Tue, Dec 23, 2025 at 10:59:45AM +0800, Chao Li wrote:
> I’ve attached a small patch along these lines. It compiles without
> warnings, and "make check" passes on my side. What do you think?

I don't think it is right.  bytestowrite is not a file offset, and the
code has been using an int due to BufFile->nbytes.  This leads to a
more confusing result.
--
Michael

Attachments

Re: Switch buffile.c/h to use pgoff_t

From
Chao Li
Date:

On Wed, Dec 24, 2025 at 2:15 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Tue, Dec 23, 2025 at 10:59:45AM +0800, Chao Li wrote:
>> I’ve attached a small patch along these lines. It compiles without
>> warnings, and "make check" passes on my side. What do you think?
>
> I don't think it is right.  bytestowrite is not a file offset, and the
> code has been using an int due to BufFile->nbytes.  This leads to a
> more confusing result.
> --
> Michael

Makes sense, bytestowrite is not a file offset. But then, in the current code, availbytes is not a file offset either, yet it is defined as pgoff_t, which has the same confusion, right? Also, bytestowrite is cast to pgoff_t, which is the same confusion again.

How about using "ssize_t" for both bytestowrite and availbytes? It's still signed, broader than int, and the odd type casts are eliminated. 

In win32_port.h:
```
#ifndef _WIN64
typedef long ssize_t;
#else
typedef __int64 ssize_t;
#endif
```

Best regards,
==
Chao Li (Evan)
---------------------
HighGo Software Co., Ltd.
Attachments

Re: Switch buffile.c/h to use pgoff_t

From
Michael Paquier
Date:
On Wed, Dec 24, 2025 at 05:01:57PM +0800, Chao Li wrote:
> Make sense, bytestowrite is not a file offset. So, in the current code,
> availbytes is not a file offset either, but it is defined as pgoff_t, which
> has the same confusion, right? Also bytestowrite is casted to pgoff_t, it's
> the same confusion again.

Yeah, actually this suggestion makes more sense.  availbytes is a
computation made of a maximal size and an offset, so defining it as an
offset from the start is kind of weird.

Now I don't think that your suggested set of changes could become more
consistent with a few more changes.  For example, what about pos and
nbytes in BufFile?  While ssize_t is more consistent with FileRead()
and FileWrite(), this code is written to care about signedness while
ssize_t has a stricter range per posix, hence could int64 be a better
choice for the whole interface?  int64 is already what we use for
BufFileSize(), which is due to the limit of MAX_PHYSICAL_FILESIZE of
course.
--
Michael

Attachments

Re: Switch buffile.c/h to use pgoff_t

From
Michael Paquier
Date:
On Thu, Dec 25, 2025 at 08:42:04AM +0900, Michael Paquier wrote:
> Now I don't think that your suggested set of changes could become more
> consistent with a few more changes.

Cough.  "I think that your suggested set of changes could become more
consistent with a few more changes."  Cough.
--
Michael

Attachments

Re: Switch buffile.c/h to use pgoff_t

From
Chao Li
Date:

On Thu, Dec 25, 2025 at 7:42 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Wed, Dec 24, 2025 at 05:01:57PM +0800, Chao Li wrote:
>> Make sense, bytestowrite is not a file offset. So, in the current code,
>> availbytes is not a file offset either, but it is defined as pgoff_t, which
>> has the same confusion, right? Also bytestowrite is casted to pgoff_t, it's
>> the same confusion again.
>
> Yeah, actually this suggestion makes more sense.  availbytes is a
> computation made of a maximal size and an offset, so defining it as an
> offset from the start is kind of weird.
>
> Now I think that your suggested set of changes could become more
> consistent with a few more changes.  For example, what about pos and
> nbytes in BufFile?  While ssize_t is more consistent with FileRead()
> and FileWrite(), this code is written to care about signedness while
> ssize_t has a stricter range per posix, hence could int64 be a better
> choice for the whole interface?  int64 is already what we use for
> BufFileSize(), which is due to the limit of MAX_PHYSICAL_FILESIZE of
> course.
> --
> Michael

Yeah, int64 feels a better fit.

WRT. changing pos and nbytes in BufFile to int64, I agree that also makes sense. BufFile is a private struct in buffile.c, though it's indirectly exposed by:
```
typedef struct BufFile BufFile;
```
None of the callers accesses its fields directly. Changing the two fields from int to int64 will bump the structure size by 16 bytes, but it is unlikely that numerous BufFile objects need to be created at once, so the runtime impact should be minimal.
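
The opaque-struct arrangement described above can be sketched as follows. This is a single-file illustration only: SketchFile and its functions are hypothetical names, not the buffile.c API. The point is that callers see only the typedef, so the private field types can change without touching them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* What a header would expose: an incomplete (opaque) type. */
typedef struct SketchFile SketchFile;

/* What the .c file keeps private; field types can change freely. */
struct SketchFile
{
    int64_t pos;     /* next read/write position within the buffer */
    int64_t nbytes;  /* number of valid bytes in the buffer */
};

/* Allocate a zero-initialized object; callers never see its layout. */
SketchFile *
sketch_file_create(void)
{
    SketchFile *file = calloc(1, sizeof(SketchFile));

    assert(file != NULL);
    return file;
}

/* Accessor: the only way callers can read the private field. */
int64_t
sketch_file_nbytes(const SketchFile *file)
{
    return file->nbytes;
}

void
sketch_file_destroy(SketchFile *file)
{
    free(file);
}
```

Code that only holds a SketchFile pointer and calls the accessor is unaffected by whether the fields are int or int64_t.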

In attached v3, I have applied int64 to the 2 struct fields and corresponding local variables. I ran a clean build, no warning was introduced, and "make check" passed.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
Attachments

Re: Switch buffile.c/h to use pgoff_t

From
Michael Paquier
Date:
On Thu, Dec 25, 2025 at 10:40:02AM +0800, Chao Li wrote:
> In attached v3, I have applied int64 to the 2 struct fields and
> corresponding local variables. I ran a clean build, no warning was
> introduced, and "make check" passed.

The advantage of being able to make the code transparently more
pluggable for max large sizes while we already use 8-byte offsets is
kind of nice, I guess.  "availbytes" is a clarification bonus as it is
not an offset per-se.  I can see that you have missed one cast spot, so
adjusted things a bit, then applied the result.
--
Michael

Attachments

Re: Switch buffile.c/h to use pgoff_t

From
Chao Li
Date:

> On Dec 26, 2025, at 07:45, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Dec 25, 2025 at 10:40:02AM +0800, Chao Li wrote:
>> In attached v3, I have applied int64 to the 2 struct fields and
>> corresponding local variables. I ran a clean build, no warning was
>> introduced, and "make check" passed.
>
> The advantage of being able to make the code transparently more
> pluggable for max large sizes while we already use 8-byte offsets is
> kind of nice, I guess.  "availbytes" is a clarification bonus as it is
> not an offset per-se.  I can see that you have missed one cast spot, so
> adjusted things a bit, then applied the result.
> --
> Michael

Thanks a lot for pushing.

WRT the original (int) casts, I intentionally removed them, because file->nbytes, newOffset and file->curOffset are
all signed 64-bit integers now. Compiling with -Wextra won’t produce a warning, so technically, the type cast is no
longer needed.

I saw you changed to:
```
- file->pos = (int) (newOffset - file->curOffset);
+ file->pos = (int64) (newOffset - file->curOffset);

- file->pos = (int) (newOffset - file->curOffset);
+ file->pos = (int64) newOffset - file->curOffset;

- file->nbytes = (int) (newOffset - file->curOffset);
+ file->nbytes = (int64) newOffset - file->curOffset;
```

The latter two places missed (), but that should also work, just a little inconsistent.
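
A cast binds tighter than subtraction, so the form without parentheses casts only the first operand before subtracting. The sketch below illustrates why both spellings give the same value when the operands are already 64-bit, and shows a case with narrower operands where they differ. It assumes int64_t as a stand-in for int64 and pgoff_t, and a platform where int is 32 bits; all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Cast applied to the whole difference, as in the first changed line. */
int64_t
diff_cast_result(int64_t new_offset, int64_t cur_offset)
{
    return (int64_t) (new_offset - cur_offset);
}

/* Cast applied to the first operand only, as in the other two lines. */
int64_t
diff_cast_operand(int64_t new_offset, int64_t cur_offset)
{
    return (int64_t) new_offset - cur_offset;
}

/* Self-check: the two spellings agree for 64-bit operands. */
void
check_same_for_64bit(int64_t new_offset, int64_t cur_offset)
{
    assert(diff_cast_result(new_offset, cur_offset) ==
           diff_cast_operand(new_offset, cur_offset));
}

/* With 32-bit unsigned operands, the subtraction wraps before the cast. */
int64_t
narrow_cast_result(uint32_t a, uint32_t b)
{
    return (int64_t) (a - b);
}

/* Casting the operand first makes the subtraction happen in 64 bits. */
int64_t
narrow_cast_operand(uint32_t a, uint32_t b)
{
    return (int64_t) a - b;
}
```

For the 64-bit case discussed here the cast is a no-op in either position, so the difference between the two forms is purely stylistic.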

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/