Обсуждение: warning: dereferencing type-punned pointer
Today I compiled PostgreSQL master branch with -fno-strict-aliasing
compile option removed (previous discussions on the $subject [1]). gcc
version is 9.4.0.
There are a few places where $subject warning printed.
In file included from ../../../src/include/nodes/pg_list.h:42,
from ../../../src/include/access/tupdesc.h:19,
from ../../../src/include/access/htup_details.h:19,
from ../../../src/include/access/heaptoast.h:16,
from execExprInterp.c:59:
execExprInterp.c: In function ‘ExecEvalJsonExprPath’:
../../../src/include/nodes/nodes.h:133:29: warning: dereferencing type-punned pointer will break strict-aliasing rules
[-Wstrict-aliasing]
133 | #define nodeTag(nodeptr) (((const Node*)(nodeptr))->type)
| ~^~~~~~~~~~~~~~~~~~~~~~~
../../../src/include/nodes/nodes.h:158:31: note: in expansion of macro ‘nodeTag’
158 | #define IsA(nodeptr,_type_) (nodeTag(nodeptr) == T_##_type_)
| ^~~~~~~
../../../src/include/nodes/miscnodes.h:53:26: note: in expansion of macro ‘IsA’
53 | ((escontext) != NULL && IsA(escontext, ErrorSaveContext) && \
| ^~~
execExprInterp.c:4399:7: note: in expansion of macro ‘SOFT_ERROR_OCCURRED’
4399 | if (SOFT_ERROR_OCCURRED(&jsestate->escontext))
| ^~~~~~~~~~~~~~~~~~~
execExprInterp.c: In function ‘ExecEvalJsonCoercionFinish’:
../../../src/include/nodes/nodes.h:133:29: warning: dereferencing type-punned pointer will break strict-aliasing rules
[-Wstrict-aliasing]
133 | #define nodeTag(nodeptr) (((const Node*)(nodeptr))->type)
| ~^~~~~~~~~~~~~~~~~~~~~~~
../../../src/include/nodes/nodes.h:158:31: note: in expansion of macro ‘nodeTag’
158 | #define IsA(nodeptr,_type_) (nodeTag(nodeptr) == T_##_type_)
| ^~~~~~~
../../../src/include/nodes/miscnodes.h:53:26: note: in expansion of macro ‘IsA’
53 | ((escontext) != NULL && IsA(escontext, ErrorSaveContext) && \
| ^~~
execExprInterp.c:4556:6: note: in expansion of macro ‘SOFT_ERROR_OCCURRED’
4556 | if (SOFT_ERROR_OCCURRED(&jsestate->escontext))
| ^~~~~~~~~~~~~~~~~~~
origin.c: In function ‘StartupReplicationOrigin’:
origin.c:773:16: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
773 | file_crc = *(pg_crc32c *) &disk_state;
| ^~~~~~~~~~~~~~~~~~~~~~~~~
In my understanding from the discussion [1], it would be better to fix
our code to avoid the warning because it *might* point out that there
is something wrong with our code. However the consensus at the time
was, we will not remove -fno-strict-aliasing option for now. It will
take long time before it would happen...
So I think the warnings in ExecEvalJsonExprPath are better fixed
because these are the only places where IsA (nodeTag) macro are used
and the warning is printed. Patch attached.
I am not so sure about StartupReplicationOrigin. Should we fix it now?
For me the code looks sane as long as we keep -fno-strict-aliasing
option. Or maybe better to fix so that someday we could remove the
compiler option?
[1] https://www.postgresql.org/message-id/flat/366.1535731324%40sss.pgh.pa.us#bd93089182d13c79b74593ec70bac435
Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index d8735286c4..387311fdfb 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -4381,6 +4381,7 @@ ExecEvalJsonExprPath(ExprState *state, ExprEvalStep *op,
if (!*op->resnull && jsexpr->use_io_coercion)
{
FunctionCallInfo fcinfo;
+ Node *node;
Assert(jump_eval_coercion == -1);
fcinfo = jsestate->input_fcinfo;
@@ -4396,7 +4397,8 @@ ExecEvalJsonExprPath(ExprState *state, ExprEvalStep *op,
fcinfo->isnull = false;
*op->resvalue = FunctionCallInvoke(fcinfo);
- if (SOFT_ERROR_OCCURRED(&jsestate->escontext))
+ node = (Node *) &jsestate->escontext;
+ if (SOFT_ERROR_OCCURRED(node))
error = true;
}
@@ -4552,8 +4554,9 @@ void
ExecEvalJsonCoercionFinish(ExprState *state, ExprEvalStep *op)
{
JsonExprState *jsestate = op->d.jsonexpr.jsestate;
+ Node *node = (Node *) &jsestate->escontext;
- if (SOFT_ERROR_OCCURRED(&jsestate->escontext))
+ if (SOFT_ERROR_OCCURRED(node))
{
*op->resvalue = (Datum) 0;
*op->resnull = true;
Tatsuo Ishii <ishii@postgresql.org> writes:
> So I think the warnings in ExecEvalJsonExprPath are better fixed
> because these are the only places where IsA (nodeTag) macro are used
> and the warning is printed. Patch attached.
I'm not very thrilled with these changes. It's not apparent why
your compiler is warning about these usages of IsA and not any other
ones, nor is it apparent why these changes suppress the warnings.
(The code's not fundamentally different, so I'd think the underlying
problem is still there, if there really is one at all.)
I'm afraid we'd be playing whack-a-mole to suppress similar warnings
on various compiler versions, with no end result other than making
the code uglier and less consistent.
If we can figure out why the warning is appearing, maybe it'd be
possible to adjust the definition of IsA() to prevent it.
regards, tom lane
On 24.07.24 08:55, Tatsuo Ishii wrote: > origin.c: In function ‘StartupReplicationOrigin’: > origin.c:773:16: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] > 773 | file_crc = *(pg_crc32c *) &disk_state; > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > I am not so sure about StartupReplicationOrigin. Should we fix it now? > For me the code looks sane as long as we keep -fno-strict-aliasing > option. Or maybe better to fix so that someday we could remove the > compiler option? This is basically the textbook example of aliasing violation, isn't it? Wouldn't it be just as simple to do memcpy(&file_crc, &disk_state, sizeof(file_crc));
Peter Eisentraut <peter@eisentraut.org> writes:
> This is basically the textbook example of aliasing violation, isn't it?
> Wouldn't it be just as simple to do
> memcpy(&file_crc, &disk_state, sizeof(file_crc));
+1. Also, it seems thoroughly bizarre to me that this case is handled
before checking for read failure. I'd move the stanza to after the
"if (readBytes < 0)" one.
regards, tom lane
On 24.07.24 16:05, Tom Lane wrote: > I'm not very thrilled with these changes. It's not apparent why > your compiler is warning about these usages of IsA and not any other > ones, I think one difference is that normally IsA is called on a Node * (since you call IsA to decide what to cast it to), but in this case it's called on a pointer that is already of type ErrorSaveContext *. You wouldn't normally need to call IsA on that, but it comes with the SOFT_ERROR_OCCURRED macro. Another difference is that most nodes are dynamically allocated but in this case the ErrorSaveContext object (not a pointer to it) is part of another struct, and so I think some different aliasing rules might apply, but I'm not sure. I think here you could just bypass the SOFT_ERROR_OCCURRED macro: - if (SOFT_ERROR_OCCURRED(&jsestate->escontext)) + if (jsestate->escontext.error_occurred)
Peter Eisentraut <peter@eisentraut.org> writes:
> On 24.07.24 16:05, Tom Lane wrote:
>> I'm not very thrilled with these changes. It's not apparent why
>> your compiler is warning about these usages of IsA and not any other
>> ones,
> I think one difference is that normally IsA is called on a Node * (since
> you call IsA to decide what to cast it to), but in this case it's called
> on a pointer that is already of type ErrorSaveContext *.
Hmm. But there are boatloads of places where we call IsA on a
pointer of type Expr *, or sometimes other things. Why aren't
those triggering the same warning?
> I think here you could just bypass the SOFT_ERROR_OCCURRED macro:
> - if (SOFT_ERROR_OCCURRED(&jsestate->escontext))
> + if (jsestate->escontext.error_occurred)
Perhaps. That's a bit sad because it's piercing a layer of
abstraction. I do not like compiler warnings that can't be
gotten rid of without making the code objectively worse.
regards, tom lane
On 24.07.24 20:09, Tom Lane wrote: > Peter Eisentraut<peter@eisentraut.org> writes: >> On 24.07.24 16:05, Tom Lane wrote: >>> I'm not very thrilled with these changes. It's not apparent why >>> your compiler is warning about these usages of IsA and not any other >>> ones, >> I think one difference is that normally IsA is called on a Node * (since >> you call IsA to decide what to cast it to), but in this case it's called >> on a pointer that is already of type ErrorSaveContext *. > Hmm. But there are boatloads of places where we call IsA on a > pointer of type Expr *, or sometimes other things. Why aren't > those triggering the same warning? It must have to do with the fact that the escontext field in JsonExprState has the object inline, not as a pointer. AIUI, with dynamically allocated objects you have more liberties about what type to interpret them as than with actually declared objects. If you change the member to a pointer - ErrorSaveContext escontext; + ErrorSaveContext *escontext; } JsonExprState; and make the required adjustments elsewhere in the code, the warning goes away. This arrangement would also appear to be more consistent with other executor nodes (e.g., ExprState, ExprEvalStep), so it might be worth it for consistency in any case.
BTW, I tried the same experiment of building without
-fno-strict-aliasing using gcc 11.4.1 (from RHEL9).
I see one more warning than Tatsuo-san did:
In file included from verify_heapam.c:18:
verify_heapam.c: In function ‘check_tuple_attribute’:
../../src/include/access/toast_internals.h:37:11: warning: dereferencing type-punned pointer will break strict-aliasing
rules[-Wstrict-aliasing]
37 | (((toast_compress_header *) (ptr))->tcinfo >> VARLENA_EXTSIZE_BITS)
| ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
verify_heapam.c:1693:24: note: in expansion of macro ‘TOAST_COMPRESS_METHOD’
1693 | cmid = TOAST_COMPRESS_METHOD(&toast_pointer);
| ^~~~~~~~~~~~~~~~~~~~~
This looks a bit messy to fix: we surely don't want to pierce
the abstraction TOAST_COMPRESS_METHOD provides. Perhaps
the toast_pointer local variable could be turned into a union
of struct varatt_external and toast_compress_header, but that
would impose a good deal of notational overhead on the rest
of this function.
The good news is that we get through check-world (although
I didn't try very many build options).
regards, tom lane
Peter Eisentraut <peter@eisentraut.org> writes:
> If you change the member to a pointer
> - ErrorSaveContext escontext;
> + ErrorSaveContext *escontext;
> } JsonExprState;
> and make the required adjustments elsewhere in the code, the warning
> goes away.
> This arrangement would also appear to be more consistent with other
> executor nodes (e.g., ExprState, ExprEvalStep), so it might be worth it
> for consistency in any case.
+1, makes sense to me.
regards, tom lane
> On 24.07.24 20:09, Tom Lane wrote:
>> Peter Eisentraut<peter@eisentraut.org> writes:
>>> On 24.07.24 16:05, Tom Lane wrote:
>>>> I'm not very thrilled with these changes. It's not apparent why
>>>> your compiler is warning about these usages of IsA and not any other
>>>> ones,
>>> I think one difference is that normally IsA is called on a Node *
>>> (since
>>> you call IsA to decide what to cast it to), but in this case it's
>>> called
>>> on a pointer that is already of type ErrorSaveContext *.
>> Hmm. But there are boatloads of places where we call IsA on a
>> pointer of type Expr *, or sometimes other things. Why aren't
>> those triggering the same warning?
>
> It must have to do with the fact that the escontext field in
> JsonExprState has the object inline, not as a pointer. AIUI, with
> dynamically allocated objects you have more liberties about what type
> to interpret them as than with actually declared objects.
I don't agree. I think the compiler just dislike that nodeTag macro's
argument is a pointer created by '&' operator in this case:
#define nodeTag(nodeptr) (((const Node*)(nodeptr))->type)
If we just give a pointer variable either it's type is Node * or
ErrorSaveContext * to nodeTag macro, the compiler becomes happy.
Moreover I think whether the object is inline or not is
irrelevant. Attached is a self contained test case. In the program:
if (IsA(&f, List))
produces the strict aliasing rule violation but
if (IsA(fp, List))
does not. Here "f" is an object defined as:
typedef struct Foo
{
NodeTag type;
int d;
} Foo;
Foo f;
and fp is defined as:
Foo *fp = &f;
$ gcc -Wall -O2 -c strict2.c
strict2.c: In function ‘sub’:
strict2.c:1:29: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
1 | #define nodeTag(nodeptr) (((const Node*)(nodeptr))->type)
| ~^~~~~~~~~~~~~~~~~~~~~~~
strict2.c:2:31: note: in expansion of macro ‘nodeTag’
2 | #define IsA(nodeptr,_type_) (nodeTag(nodeptr) == T_##_type_)
| ^~~~~~~
strict2.c:26:6: note: in expansion of macro ‘IsA’
26 | if (IsA(&f, List))
| ^~~
At top level:
strict2.c:21:12: warning: ‘sub’ defined but not used [-Wunused-function]
21 | static int sub(void)
| ^~~
> If you change the member to a pointer
>
> - ErrorSaveContext escontext;
> + ErrorSaveContext *escontext;
> } JsonExprState;
>
> and make the required adjustments elsewhere in the code, the warning
> goes away.
I think this is not necessary. Just my patch in the upthread is enough.
Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
Sorry, I forgot to attach the file...
> I don't agree. I think the compiler just dislike that nodeTag macro's
> argument is a pointer created by '&' operator in this case:
>
> #define nodeTag(nodeptr) (((const Node*)(nodeptr))->type)
>
> If we just give a pointer variable either it's type is Node * or
> ErrorSaveContext * to nodeTag macro, the compiler becomes happy.
>
> Moreover I think whether the object is inline or not is
> irrelevant. Attached is a self contained test case. In the program:
>
> if (IsA(&f, List))
>
> produces the strict aliasing rule violation but
>
> if (IsA(fp, List))
>
> does not. Here "f" is an object defined as:
>
> typedef struct Foo
> {
> NodeTag type;
> int d;
> } Foo;
>
> Foo f;
>
> and fp is defined as:
>
> Foo *fp = &f;
>
> $ gcc -Wall -O2 -c strict2.c
> strict2.c: In function ‘sub’:
> strict2.c:1:29: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
> 1 | #define nodeTag(nodeptr) (((const Node*)(nodeptr))->type)
> | ~^~~~~~~~~~~~~~~~~~~~~~~
> strict2.c:2:31: note: in expansion of macro ‘nodeTag’
> 2 | #define IsA(nodeptr,_type_) (nodeTag(nodeptr) == T_##_type_)
> | ^~~~~~~
> strict2.c:26:6: note: in expansion of macro ‘IsA’
> 26 | if (IsA(&f, List))
> | ^~~
> At top level:
> strict2.c:21:12: warning: ‘sub’ defined but not used [-Wunused-function]
> 21 | static int sub(void)
> | ^~~
>
>> If you change the member to a pointer
>>
>> - ErrorSaveContext escontext;
>> + ErrorSaveContext *escontext;
>> } JsonExprState;
>>
>> and make the required adjustments elsewhere in the code, the warning
>> goes away.
>
> I think this is not necessary. Just my patch in the upthread is enough.
>
> Best reagards,
> --
> Tatsuo Ishii
> SRA OSS LLC
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
/*
* Minimum definitions copied from PostgreSQL to make the
* test self-contained.
*/
#define nodeTag(nodeptr) (((const Node*)(nodeptr))->type)
#define IsA(nodeptr,_type_) (nodeTag(nodeptr) == T_##_type_)
typedef enum NodeTag
{
T_Invalid = 0,
T_List = 1
} NodeTag;
typedef struct Node
{
NodeTag type;
} Node;
/* Home brew node */
typedef struct Foo
{
NodeTag type;
int d;
} Foo;
static int sub(void)
{
Foo f;
Foo *fp = &f;
f.type = T_List;
/* strict aliasing rule error */
if (IsA(&f, List))
return 1;
/* This is ok */
if (IsA(fp, List))
return 1;
return 0;
}
On 24.07.24 19:48, Tom Lane wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
>> This is basically the textbook example of aliasing violation, isn't it?
>> Wouldn't it be just as simple to do
>
>> memcpy(&file_crc, &disk_state, sizeof(file_crc));
>
> +1. Also, it seems thoroughly bizarre to me that this case is handled
> before checking for read failure. I'd move the stanza to after the
> "if (readBytes < 0)" one.
(older thread)
I committed this fix, since it was simple and obvious.
I also researched again the other warnings mentioned in this thread.
There are also a few new similar ones from commit 44eba8f06e55.
My conclusion is that casting Node * to a different pointer-to-struct is
always an aliasing violation. The reason why the compiler only warns in
a few cases is probably that those are fully within the same translation
unit, while most other ones are across file boundaries that the compiler
cannot analyze reliably. If you dial up the warning verbosity with
-Wstrict-aliasing=1/2 (lower = more warnings but less reliable), you get
more warnings about this.
Unlike what PostgreSQL code appears to assume, there is no rule in C (or
C++) that overlaying similar structs is a valid aliasing. So this was
never correct, but compilers have only gotten more aggressive about this
over time.
One correct way would be to embed the Node struct, like
typedef struct ErrorSaveContext
{
Node node;
bool error_occurred;
bool details_wanted;
ErrorData *error_data;
} ErrorSaveContext;
because you can cast between a struct and one of its field types.
We already do this node embedding for inheritance for other nodes, so
this approach would probably be somewhat natural, but it would obviously
require a bit of adjustment here and there (but maybe not that much).
Another approach is to annotate some fundamental types like Node and
Expr with __attribute__((may_alias)).
Similarly with struct varlena and all its variants and related types.
Anyway, this is a battle for another day. We're definitely going to
have to stick with -fno-strict-aliasing for the foreseeable time.