Обсуждение: Remove useless casting to the same type
Hi hackers, Attached is a patch to $SUBJECT. This is the same kind of idea as 7f798aca1d5 and ef8fe693606, as their presence could cause risks of hiding actual type mismatches in the future or silently discarding qualifiers. I think that it also improves readability. Those have been found with a coccinelle script as: " @@ type T; T E; @@ - (T) E " which removes casts when it's casting to the same type. Note that it generated more that what is in the attached. I chose not to remove some of them (mainly arithmetic ones) to keep the patch focused on what matters here. Thoughts? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
On Mon, Nov 24, 2025 at 10:26:32AM +0000, Bertrand Drouvot wrote: > This is the same kind of idea as 7f798aca1d5 and ef8fe693606, as their presence > could cause risks of hiding actual type mismatches in the future or silently > discarding qualifiers. I think that it also improves readability. Seems reasonable to me. > Note that it generated more that what is in the attached. I chose not to remove > some of them (mainly arithmetic ones) to keep the patch focused on what matters > here. Can you give an example of what you are talking about here? -- nathan
On Mon, Nov 24, 2025 at 2:26 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > Attached is a patch to $SUBJECT. > --- a/src/backend/access/gin/gindatapage.c > +++ b/src/backend/access/gin/gindatapage.c > @@ -607,11 +607,11 @@ dataBeginPlaceToPageLeaf(GinBtree btree, Buffer buf, GinBtreeStack *stack, > > if (append) > elog(DEBUG2, "appended %d new items to block %u; %d bytes (%d to go)", > - maxitems, BufferGetBlockNumber(buf), (int) leaf->lsize, > + maxitems, BufferGetBlockNumber(buf), leaf->lsize, > items->nitem - items->curitem - maxitems); Hm. How do we feel, as a group, about superstitious casts in variadic calls? I don't feel like they're in the same class as the other fixes. Argument for: it's nice to know at a glance that a printf() invocation won't corrupt something horribly, especially if I'm quickly scanning code during a CVE analysis, and especially if the variable is named as if it could maybe be a size_t. Do our compilers warn us well enough now, in practice? Argument against: it takes up precious columns and focuses attention away from other things. Like the fact that (items->nitem - items->curitem - maxitems) is unsigned and printed as signed here. :D Maybe we should make the code compile cleanly under -Wformat-signedness at some point... (not in this patch, not signing you up for that) > @@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record) > > /* extract low and high masks. */ > memcpy(&lowmask, data, sizeof(uint32)); > - highmask = (uint32 *) ((char *) data + sizeof(uint32)); > + highmask = (uint32 *) (data + sizeof(uint32)); I wonder about these, too. I like knowing what the code does without having to check the type of `data`. But then later on we do a `data += sizeof(uint32) * 2`, so you have to check the type anyway, so... I don't know. > +++ b/contrib/fuzzystrmatch/dmetaphone.c > @@ -327,7 +327,7 @@ GetAt(metastring *s, int pos) > if ((pos < 0) || (pos >= s->length)) > return '\0'; > > - return ((char) *(s->str + pos)); > + return *(s->str + pos); Isn't this just s->str[pos]? Ditto for SetAt(), right afterwards. --Jacob
Hi,
On Mon, Nov 24, 2025 at 12:43:28PM -0600, Nathan Bossart wrote:
> On Mon, Nov 24, 2025 at 10:26:32AM +0000, Bertrand Drouvot wrote:
> > This is the same kind of idea as 7f798aca1d5 and ef8fe693606, as their presence
> > could cause risks of hiding actual type mismatches in the future or silently
> > discarding qualifiers. I think that it also improves readability.
>
> Seems reasonable to me.
Thanks for looking at it!
> > Note that it generated more that what is in the attached. I chose not to remove
> > some of them (mainly arithmetic ones) to keep the patch focused on what matters
> > here.
>
> Can you give an example of what you are talking about here?
Things like:
A)
- int k = (int) (targrows * sampler_random_fract(&rstate.randstate));
+ int k = (targrows * sampler_random_fract(&rstate.randstate));
That's a valid cast removal but I'm not sure about the removal added value here.
B)
- sign = (BITVECP) (((char *) sign) + 1);
+ sign = ((sign) + 1);
BITVECP is "typedef char *BITVECP; So that's a valid cast removal but I decided
to keep it for semantic reason.
Same as:
@@ -2277,7 +2277,7 @@ printTrgmColor(StringInfo buf, TrgmColor co)
else if (co == COLOR_BLANK)
appendStringInfoChar(buf, 'b');
else
- appendStringInfo(buf, "%d", (int) co);
+ appendStringInfo(buf, "%d", co);
where TrgmColor is "typedef int TrgmColor;"
C)
- if (((unsigned char *) base)[i] != 0xff)
+ if ((base)[i] != 0xff)
because not safe.
See attached the full list that I decided not to include.
Do you think we should add some of them in the patch? (maybe the ones in
nbtcompare.c for example)
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения
Hi, On Mon, Nov 24, 2025 at 03:18:00PM -0800, Jacob Champion wrote: > On Mon, Nov 24, 2025 at 2:26 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > Attached is a patch to $SUBJECT. > > > --- a/src/backend/access/gin/gindatapage.c > > +++ b/src/backend/access/gin/gindatapage.c > > @@ -607,11 +607,11 @@ dataBeginPlaceToPageLeaf(GinBtree btree, Buffer buf, GinBtreeStack *stack, > > > > if (append) > > elog(DEBUG2, "appended %d new items to block %u; %d bytes (%d to go)", > > - maxitems, BufferGetBlockNumber(buf), (int) leaf->lsize, > > + maxitems, BufferGetBlockNumber(buf), leaf->lsize, > > items->nitem - items->curitem - maxitems); > > Hm. How do we feel, as a group, about superstitious casts in variadic > calls? I don't feel like they're in the same class as the other fixes. > > Argument for: it's nice to know at a glance that a printf() invocation > won't corrupt something horribly, especially if I'm quickly scanning > code during a CVE analysis, and especially if the variable is named as > if it could maybe be a size_t. Do our compilers warn us well enough > now, in practice? > > Argument against: it takes up precious columns and focuses attention > away from other things. Thanks for looking at it! I think that the variadic calls in the patch are related to functions that can benefits from -Wformat. Let's focus on those: with the cast one would need to verify 3 things: variable type, cast and format specifier. Without the cast then only 2 things and the compiler can verify these match via -Wformat warnings. With the cast, the compiler only checks that the cast result matches the format, not whether the cast itself is correct, so I'm in favor of removing the cast, thoughts? > Like the fact that (items->nitem - > items->curitem - maxitems) is unsigned and printed as signed here. :D Nice catch! ;-) > Maybe we should make the code compile cleanly under > -Wformat-signedness at some point... good idea, will give it a try later on outside the context of this patch. > > @@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record) > > > > /* extract low and high masks. */ > > memcpy(&lowmask, data, sizeof(uint32)); > > - highmask = (uint32 *) ((char *) data + sizeof(uint32)); > > + highmask = (uint32 *) (data + sizeof(uint32)); > > I wonder about these, too. I like knowing what the code does without > having to check the type of `data`. But then later on we do a `data += > sizeof(uint32) * 2`, so you have to check the type anyway, so... I > don't know. I think that even with the cast in place, it's good to check the type of data. Not for the line that follows (i.e: "data += sizeof(uint32) * 2") but to check that the cast makes sense and does not hide "wrong" pointer manipulation. So I think that with or without the cast one would need to check. But that feels more natural to check when there is no cast (as we don't assume that someone said "I know what I'm doing"). So I'm in favor of removing the cast, thoughts? > > +++ b/contrib/fuzzystrmatch/dmetaphone.c > > @@ -327,7 +327,7 @@ GetAt(metastring *s, int pos) > > if ((pos < 0) || (pos >= s->length)) > > return '\0'; > > > > - return ((char) *(s->str + pos)); > > + return *(s->str + pos); > > Isn't this just s->str[pos]? Ditto for SetAt(), right afterwards. Yeah, "*(s->str + pos)" is already used in SetAt() and also in IsVowel(). Instead of changing those 3, I'd prefer to keep the current change and keep the patch focus on its intend. We could change those in a dedicated patch afterward if we feel the need. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 25.11.2025 06:46, Bertrand Drouvot wrote: > I think that the variadic calls in the patch are related to functions that > can benefits from -Wformat. Let's focus on those: with the cast one would need > to verify 3 things: variable type, cast and format specifier. > Without the cast then only 2 things and the compiler can verify these match via > -Wformat warnings. > > With the cast, the compiler only checks that the cast result matches the format, > not whether the cast itself is correct, so I'm in favor of removing the cast, > thoughts? > I agree. It's better if we only have the casts in places where we actually want to change the type before printing. Another benefit is that one can directly deduct the type from the log / print statement by looking at the format specifier. -- David Geier