Обсуждение: Fix incorrect comments in tuplesort.c

Поиск
Список
Период
Сортировка

Fix incorrect comments in tuplesort.c

От
"cca5507"
Дата:
Hi,

The incorrect comment:

```
/*
 * Initial size of memtuples array.  We're trying to select this size so that
 * array doesn't exceed ALLOCSET_SEPARATE_THRESHOLD and so that the overhead of
 * allocation might possibly be lowered.  However, we don't consider array sizes
 * less than 1024.
 *
 */
#define INITIAL_MEMTUPSIZE Max(1024, \
    ALLOCSET_SEPARATE_THRESHOLD / sizeof(SortTuple) + 1)
```

The "doesn't exceed" are contrary to the macro. Maybe "slightly exceed"?

Attach a small patch. (also remove the empty line in this comment)

--
Regards,
ChangAo Chen

Вложения

Re: Fix incorrect comments in tuplesort.c

От
Chao Li
Дата:

> On Dec 6, 2025, at 22:56, cca5507 <cca5507@qq.com> wrote:
>
> Hi,
>
> The incorrect comment:
>
> ```
> /*
> * Initial size of memtuples array.  We're trying to select this size so that
> * array doesn't exceed ALLOCSET_SEPARATE_THRESHOLD and so that the overhead of
> * allocation might possibly be lowered.  However, we don't consider array sizes
> * less than 1024.
> *
> */
> #define INITIAL_MEMTUPSIZE Max(1024, \
> ALLOCSET_SEPARATE_THRESHOLD / sizeof(SortTuple) + 1)
> ```
>
> The "doesn't exceed" are contrary to the macro. Maybe "slightly exceed"?
>
> Attach a small patch. (also remove the empty line in this comment)
>
> --
> Regards,
> ChangAo Chen
> <v1-0001-Fix-incorrect-comments-in-tuplesort.c.patch>

I guess a native English speaker may read the sentence in a different way than our non-English speakers. So, it’s
betterto wait for an English-speaking committer to decide if this comment needs to be updated. 

But I agree from a non-English speaker’s view, this sentence is a little bit confusing, at least makes us to spend more
timeto understand its real meaning. 

However, if we really want to update the comment, I just feel “slightly” is not be best choice here, because “slightly”
justmeans “not too much”, it doesn’t precisely describe the real meaning of the macro. I would suggest a rephrasing
like:"We choose the smallest number of SortTuple entries whose total size exceeds ALLOCSET_SEPARATE_THRESHOLD, and so
that…" 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: Fix incorrect comments in tuplesort.c

От
"cca5507"
Дата:
Hi Chao,

Thank you for your reply.

I feed the comment to Github Copilot and he says it's incorrect. The "slightly" is just what he suggests.

Your suggestion also LGTM.

--
Regards,
ChangAo Chen

Re: Fix incorrect comments in tuplesort.c

От
"David G. Johnston"
Дата:
On Saturday, December 6, 2025, cca5507 <cca5507@qq.com> wrote:
Hi Chao,

Thank you for your reply.

I feed the comment to Github Copilot and he says it's incorrect. The "slightly" is just what he suggests.

Your suggestion also LGTM.

I don’t think just adding the word “slightly” is a good fix here.  Moving the comment from line 695 here would be better as that explains and gives reference to why the formula is used, close to the formula itself.  It feels like the identical comment at line 757 also is just a bad copy-paste outcome and should be removed and probably replaced with one describing the complicated conditional code it precedes.

David J.

Re: Fix incorrect comments in tuplesort.c

От
"cca5507"
Дата:
Hi,

I find that the initial size of memtuples array must be more than ALLOCSET_SEPARATE_THRESHOLD
is not for lower overhead of allocation, but for a bug fixed in 8ea3e7a75c0d22c41c57f59c8b367059b97d0b66.

Attach a new patch.

--
Regards,
ChangAo Chen

Вложения

Re: Fix incorrect comments in tuplesort.c

От
David Rowley
Дата:
On Sun, 7 Dec 2025 at 21:34, cca5507 <cca5507@qq.com> wrote:
> I find that the initial size of memtuples array must be more than ALLOCSET_SEPARATE_THRESHOLD
> is not for lower overhead of allocation, but for a bug fixed in 8ea3e7a75c0d22c41c57f59c8b367059b97d0b66.
>
> Attach a new patch.

I find the current comment perfectly understandable and the text
you're proposing to be incorrect. What you might be missing is that
with aset.c, a palloc() request above ALLOCSET_SEPARATE_THRESHOLD is
certain to require a malloc() and a dedicated block. Sizes under that
may be able to use an existing AllocBlock. The comment is effectively
explaining that we don't want to make the array big enough so that a
malloc will always be required. You may need to read
AllocSetContextCreateInternal() around where allocChunkLimit is being
set to understand what this is all about.

David



Re: Fix incorrect comments in tuplesort.c

От
"David G. Johnston"
Дата:
On Sun, Dec 7, 2025 at 3:09 PM David Rowley <dgrowleyml@gmail.com> wrote:
The comment is effectively
explaining that we don't want to make the array big enough so that a
malloc will always be required.
 
Doesn't what you are saying contradict both the formula (the +1 post-division) and the comments:

/*
* Initial size of array must be more than ALLOCSET_SEPARATE_THRESHOLD;
* see comments in grow_memtuples().
*/
state->memtupsize = INITIAL_MEMTUPSIZE;
state->memtuples = NULL;

and in grow_memtuples:

* That shouldn't happen because we chose the
* initial array size large enough to ensure that palloc will be treating
* both old and new arrays as separate chunks.

?

David J.

Re: Fix incorrect comments in tuplesort.c

От
David Rowley
Дата:
On Mon, 8 Dec 2025 at 12:10, David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Sun, Dec 7, 2025 at 3:09 PM David Rowley <dgrowleyml@gmail.com> wrote:
>>
>> The comment is effectively
>> explaining that we don't want to make the array big enough so that a
>> malloc will always be required.
>
>
> Doesn't what you are saying contradict both the formula (the +1 post-division) and the comments:
>
> /*
> * Initial size of array must be more than ALLOCSET_SEPARATE_THRESHOLD;
> * see comments in grow_memtuples().
> */

Looks like I put too much faith into what the comment was telling me.
I now agree that the comment is describing the opposite of what the
code is doing. The comment seems to have first appeared in the patch
submitted in [1], and per the "Not sure myself, let's ask the other
Alexander." in [2], indicates that Alexander Kuzmenkov
reverse-engineered the comment from his incorrect understanding of the
code.

With that, I think the v2 patch is almost ok. One small quibble I have is:

+ * in grow_memtuples().  However, we don't consider array sizes
+ * less than 1024.

Using "However" here indicates some exception to what's just been
said, but there is no longer an exception. To write about what the
1024 is for, we might need to reverse engineer what that's for. I
assume it's something like "Clamp at 1024 elements to avoid excessive
reallocs of the array". Or perhaps that without the " of the array"
part.

David

[1] https://www.postgresql.org/message-id/CAPpHfdtKHETXhf062CPvkjpG1wnjQ7rv4uLhZgYQ6VZjwqDYpg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/b45ff523-780b-502b-8d03-2763182aa3c6%40postgrespro.ru



Re: Fix incorrect comments in tuplesort.c

От
Chao Li
Дата:

> On Dec 8, 2025, at 08:28, David Rowley <dgrowleyml@gmail.com> wrote:
>
> + * in grow_memtuples().  However, we don't consider array sizes
> + * less than 1024.
>
> Using "However" here indicates some exception to what's just been
> said, but there is no longer an exception. To write about what the
> 1024 is for, we might need to reverse engineer what that's for. I
> assume it's something like "Clamp at 1024 elements to avoid excessive
> reallocs of the array". Or perhaps that without the " of the array"
> part.

+1 for the “however” concern.

Also, I found 8ea3e7a75c might explain why we want initial memtuples array size to exceed ALLOCSET_SEPARATE_THRESHOLD.
That'sbecause we want to avoid switching between block chunk and separate chunk when growing. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: Fix incorrect comments in tuplesort.c

От
"cca5507"
Дата:
Hi,

> Using "However" here indicates some exception to what's just been
> said, but there is no longer an exception. To write about what the
> 1024 is for, we might need to reverse engineer what that's for. I
> assume it's something like "Clamp at 1024 elements to avoid excessive
> reallocs of the array". Or perhaps that without the " of the array"
> part.

Agreed. I prefer without the " of the array" part.

--
Regards,
ChangAo Chen

Re: Fix incorrect comments in tuplesort.c

От
David Rowley
Дата:
On Mon, 8 Dec 2025 at 15:08, cca5507 <cca5507@qq.com> wrote:
> > Using "However" here indicates some exception to what's just been
> > said, but there is no longer an exception. To write about what the
> > 1024 is for, we might need to reverse engineer what that's for. I
> > assume it's something like "Clamp at 1024 elements to avoid excessive
> > reallocs of the array". Or perhaps that without the " of the array"
> > part.
>
> Agreed. I prefer without the " of the array" part.

OK. Done like that.  Thanks.

David