Обсуждение: [HACKERS] Minor codegen silliness in ExecInterpExpr()
I noticed the following while poking around with perf:
| fcinfo->isnull = false; |b5b: movb $0x0,0x1c(%rdx) |
*op->resvalue = op->d.func.fn_addr(fcinfo); 0.02 | mov 0x8(%rbx),%rcx 1.19 | mov %rdx,%rdi
0.93| mov %rdx,(%rsp) | mov %rcx,0x8(%rsp) 0.01 | callq *0x28(%rbx) 2.17 | mov
0x8(%rsp),%rcx | mov %rax,(%rcx) | *op->resnull = fcinfo->isnull; 1.18 |
mov (%rsp),%rdx 4.32 | mov 0x10(%rbx),%rax 0.06 | movzbl 0x1c(%rdx),%edx 9.14 | mov
%dl,(%rax)
It looks to me like gcc believes it is required to evaluate "op->resvalue"
before invoking the called function, just in case the function somehow has
access to *op and modifies that. We could save a pointless register spill
and reload if there were a temporary variable in there, ie
EEO_CASE(EEOP_FUNCEXPR) { FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ Datum fvalue; fcinfo->isnull = false;
- *op->resvalue = op->d.func.fn_addr(fcinfo);
+ fvalue = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = fvalue; *op->resnull = fcinfo->isnull;
EEO_NEXT(); }
and likewise in the other FUNCEXPR cases.
This is on a rather old gcc, haven't checked on bleeding-edge versions.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
On 2017-09-28 16:21:34 -0400, Tom Lane wrote:
> I noticed the following while poking around with perf:
>
> | fcinfo->isnull = false;
> |b5b: movb $0x0,0x1c(%rdx)
> | *op->resvalue = op->d.func.fn_addr(fcinfo);
> 0.02 | mov 0x8(%rbx),%rcx
> 1.19 | mov %rdx,%rdi
> 0.93 | mov %rdx,(%rsp)
> | mov %rcx,0x8(%rsp)
> 0.01 | callq *0x28(%rbx)
> 2.17 | mov 0x8(%rsp),%rcx
> | mov %rax,(%rcx)
> | *op->resnull = fcinfo->isnull;
> 1.18 | mov (%rsp),%rdx
> 4.32 | mov 0x10(%rbx),%rax
> 0.06 | movzbl 0x1c(%rdx),%edx
> 9.14 | mov %dl,(%rax)
>
> It looks to me like gcc believes it is required to evaluate "op->resvalue"
> before invoking the called function, just in case the function somehow has
> access to *op and modifies that.
Yea, the compiler probably doesn't have much choice besides assuming
that. Might be different if we'd annote function pointers as pure, and
used strict aliasing + perhaps some restrict markers, but ...
> We could save a pointless register spill
> and reload if there were a temporary variable in there, ie
>
> EEO_CASE(EEOP_FUNCEXPR)
> {
> FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
> + Datum fvalue;
>
> fcinfo->isnull = false;
> - *op->resvalue = op->d.func.fn_addr(fcinfo);
> + fvalue = op->d.func.fn_addr(fcinfo);
> + *op->resvalue = fvalue;
> *op->resnull = fcinfo->isnull;
>
> EEO_NEXT();
> }
>
> and likewise in the other FUNCEXPR cases.
>
> This is on a rather old gcc, haven't checked on bleeding-edge versions.
Makes sense. Do you want to make it so, or shall I? I'd personally be
somewhat tempted to keep the branches in sync here...
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2017-09-28 16:21:34 -0400, Tom Lane wrote:
>> We could save a pointless register spill
>> and reload if there were a temporary variable in there,
> Makes sense. Do you want to make it so, or shall I?
I just finished testing a patch, as attached. On my machine (again,
not latest gcc: 4.4.7 on RHEL6 x86_64), it reduces the code size of
execExprInterp.o by a fraction of a percent, and it seems to offer
a slight benefit in "pgbench -S" performance although I'd not put
much stock in that being reproducible.
> I'd personally be
> somewhat tempted to keep the branches in sync here...
I was tempted that way too, but it doesn't apply cleanly to v10,
because of Peter's recent cleanup of function pointer invocation
style. I don't think it's really worth worrying about.
regards, tom lane
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 09abd46..68a1f96 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -501,15 +501,17 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_CASE(EEOP_INNER_SYSVAR)
{
int attnum = op->d.var.attnum;
+ Datum d;
/* these asserts must match defenses in slot_getattr */
Assert(innerslot->tts_tuple != NULL);
Assert(innerslot->tts_tuple != &(innerslot->tts_minhdr));
- /* heap_getsysattr has sufficient defenses against bad attnums */
- *op->resvalue = heap_getsysattr(innerslot->tts_tuple, attnum,
- innerslot->tts_tupleDescriptor,
- op->resnull);
+ /* heap_getsysattr has sufficient defenses against bad attnums */
+ d = heap_getsysattr(innerslot->tts_tuple, attnum,
+ innerslot->tts_tupleDescriptor,
+ op->resnull);
+ *op->resvalue = d;
EEO_NEXT();
}
@@ -517,15 +519,17 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_CASE(EEOP_OUTER_SYSVAR)
{
int attnum = op->d.var.attnum;
+ Datum d;
/* these asserts must match defenses in slot_getattr */
Assert(outerslot->tts_tuple != NULL);
Assert(outerslot->tts_tuple != &(outerslot->tts_minhdr));
/* heap_getsysattr has sufficient defenses against bad attnums */
- *op->resvalue = heap_getsysattr(outerslot->tts_tuple, attnum,
- outerslot->tts_tupleDescriptor,
- op->resnull);
+ d = heap_getsysattr(outerslot->tts_tuple, attnum,
+ outerslot->tts_tupleDescriptor,
+ op->resnull);
+ *op->resvalue = d;
EEO_NEXT();
}
@@ -533,15 +537,17 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_CASE(EEOP_SCAN_SYSVAR)
{
int attnum = op->d.var.attnum;
+ Datum d;
/* these asserts must match defenses in slot_getattr */
Assert(scanslot->tts_tuple != NULL);
Assert(scanslot->tts_tuple != &(scanslot->tts_minhdr));
- /* heap_getsysattr has sufficient defenses against bad attnums */
- *op->resvalue = heap_getsysattr(scanslot->tts_tuple, attnum,
- scanslot->tts_tupleDescriptor,
- op->resnull);
+ /* heap_getsysattr has sufficient defenses against bad attnums */
+ d = heap_getsysattr(scanslot->tts_tuple, attnum,
+ scanslot->tts_tupleDescriptor,
+ op->resnull);
+ *op->resvalue = d;
EEO_NEXT();
}
@@ -641,13 +647,22 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
* As both STRICT checks and function-usage are noticeable performance
* wise, and function calls are a very hot-path (they also back
* operators!), it's worth having so many separate opcodes.
+ *
+ * Note: the reason for using a temporary variable "d", here and in
+ * other places, is that some compilers think "*op->resvalue = f();"
+ * requires them to evaluate op->resvalue into a register before
+ * calling f(), just in case f() is able to modify op->resvalue
+ * somehow. The extra line of code can save a useless register spill
+ * and reload, on architectures without many registers.
*/
EEO_CASE(EEOP_FUNCEXPR)
{
FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ Datum d;
fcinfo->isnull = false;
- *op->resvalue = op->d.func.fn_addr(fcinfo);
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
*op->resnull = fcinfo->isnull;
EEO_NEXT();
@@ -658,6 +673,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
bool *argnull = fcinfo->argnull;
int argno;
+ Datum d;
/* strict function, so check for NULL args */
for (argno = 0; argno < op->d.func.nargs; argno++)
@@ -669,7 +685,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
}
}
fcinfo->isnull = false;
- *op->resvalue = op->d.func.fn_addr(fcinfo);
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
*op->resnull = fcinfo->isnull;
strictfail:
@@ -680,11 +697,13 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
{
FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
PgStat_FunctionCallUsage fcusage;
+ Datum d;
pgstat_init_function_usage(fcinfo, &fcusage);
fcinfo->isnull = false;
- *op->resvalue = op->d.func.fn_addr(fcinfo);
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
*op->resnull = fcinfo->isnull;
pgstat_end_function_usage(&fcusage, true);
@@ -698,6 +717,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
PgStat_FunctionCallUsage fcusage;
bool *argnull = fcinfo->argnull;
int argno;
+ Datum d;
/* strict function, so check for NULL args */
for (argno = 0; argno < op->d.func.nargs; argno++)
@@ -712,7 +732,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
pgstat_init_function_usage(fcinfo, &fcusage);
fcinfo->isnull = false;
- *op->resvalue = op->d.func.fn_addr(fcinfo);
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
*op->resnull = fcinfo->isnull;
pgstat_end_function_usage(&fcusage, true);
@@ -1113,6 +1134,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
if (!op->d.iocoerce.finfo_in->fn_strict || str != NULL)
{
FunctionCallInfo fcinfo_in;
+ Datum d;
fcinfo_in = op->d.iocoerce.fcinfo_data_in;
fcinfo_in->arg[0] = PointerGetDatum(str);
@@ -1120,7 +1142,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
/* second and third arguments are already set up */
fcinfo_in->isnull = false;
- *op->resvalue = FunctionCallInvoke(fcinfo_in);
+ d = FunctionCallInvoke(fcinfo_in);
+ *op->resvalue = d;
/* Should get null result if and only if str is NULL */
if (str == NULL)
@@ -1268,6 +1291,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_CASE(EEOP_ROWCOMPARE_STEP)
{
FunctionCallInfo fcinfo = op->d.rowcompare_step.fcinfo_data;
+ Datum d;
/* force NULL result if strict fn and NULL input */
if (op->d.rowcompare_step.finfo->fn_strict &&
@@ -1279,7 +1303,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
/* Apply comparison function */
fcinfo->isnull = false;
- *op->resvalue = op->d.rowcompare_step.fn_addr(fcinfo);
+ d = op->d.rowcompare_step.fn_addr(fcinfo);
+ *op->resvalue = d;
/* force NULL result if NULL function result */
if (fcinfo->isnull)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017-09-28 18:39:03 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2017-09-28 16:21:34 -0400, Tom Lane wrote: > >> We could save a pointless register spill > >> and reload if there were a temporary variable in there, > > > Makes sense. Do you want to make it so, or shall I? > > I just finished testing a patch, as attached. On my machine (again, > not latest gcc: 4.4.7 on RHEL6 x86_64), it reduces the code size of > execExprInterp.o by a fraction of a percent, and it seems to offer > a slight benefit in "pgbench -S" performance although I'd not put > much stock in that being reproducible. Cool. > + * Note: the reason for using a temporary variable "d", here and in > + * other places, is that some compilers think "*op->resvalue = f();" > + * requires them to evaluate op->resvalue into a register before > + * calling f(), just in case f() is able to modify op->resvalue > + * somehow. The extra line of code can save a useless register spill > + * and reload, on architectures without many registers. I'd remove the "without many registers" bit - that's really more an functioncall ABI question (#caller vs #callee saved registers) than about the actual architecture. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2017-09-28 18:39:03 -0400, Tom Lane wrote:
>> + * Note: the reason for using a temporary variable "d", here and in
>> + * other places, is that some compilers think "*op->resvalue = f();"
>> + * requires them to evaluate op->resvalue into a register before
>> + * calling f(), just in case f() is able to modify op->resvalue
>> + * somehow. The extra line of code can save a useless register spill
>> + * and reload, on architectures without many registers.
> I'd remove the "without many registers" bit - that's really more an
> functioncall ABI question (#caller vs #callee saved registers) than
> about the actual architecture.
Fair enough.
I wondered how pervasive this behavior is. AFAICS it is *not* required
by the C standard; C99 does not say that the left operand of assignment
must be evaluated first, in fact it says that the order of evaluation is
unspecified. But the latest gcc I have at hand (6.4.1 on Fedora 25) still
does it this way. OTOH, Apple's latest edition of clang (LLVM version
9.0.0 (clang-900.0.37)) appears to be just fine with waiting till after
the function call to load op->resvalue. So that's not many data points,
but it does suggest that this is worth fixing, and is not just an artifact
of an old compiler version.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017-09-28 20:01:37 -0400, Tom Lane wrote: > I wondered how pervasive this behavior is. AFAICS it is *not* required > by the C standard; C99 does not say that the left operand of assignment > must be evaluated first, in fact it says that the order of evaluation is > unspecified. FWIW, it's being specificied in C++17 ([1]), which seems to make it not unlikely to end up in C as well. > but it does suggest that this is worth fixing, and is not just an artifact > of an old compiler version. +1, I saw the same in a bleeding edge c++ version. Greetings, Andres Freund [1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0145r3.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 2017-09-28 20:56:57 -0700, Andres Freund wrote: > +1, I saw the same in a bleeding edge c++ version. err, gcc version. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers