Discussion: Potential security risk associated with function call
Hi Hackers,
Recently, I noticed a security risk when calling a function. It's strange but also interesting. E.g.
`array_to_text_null` is a built-in function with 3 args. Normally, the function works well. **BUT**
if we create another version of the `array_to_text_null` function, say `harmful_array_to_string`, but with 2 args:
```
CREATE OR REPLACE FUNCTION harmful_array_to_string(anyarray, text)
RETURNS text
LANGUAGE internal
STABLE PARALLEL SAFE STRICT
AS $function$array_to_text_null$function$;
```
And then we call the new function:
```
postgres=# SELECT harmful_array_to_string(ARRAY[1,2], 'HARMFUL');
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
```
It causes the server to crash.
The reason is that there is an if statement in `array_to_text_null`
```
Datum
array_to_text_null(PG_FUNCTION_ARGS)
{
    ...
    /* NULL null string is passed through as a null pointer */
    if (!PG_ARGISNULL(2))
        null_string = text_to_cstring(PG_GETARG_TEXT_PP(2));
    ...
}
```
to determine whether the 3rd arg is NULL or not. We only pass 2 args to the function, but the
if statement here evaluates to TRUE, so it tries to get the 3rd arg and causes a segmentation fault.
The strange but interesting thing is this: if we change the code to:
```
Datum
array_to_text_null(PG_FUNCTION_ARGS)
{
    ...
    /* NULL null string is passed through as a null pointer */
    if (PG_ARGISNULL(2))
        null_string = text_to_cstring(PG_GETARG_TEXT_PP(2));
    ...
}
```
Will this code work well?
NO! The if statement still evaluates to TRUE, so it still causes a segmentation fault.
Not only `array_to_text_null`: other functions have the same problem, like `array_prepend`. We can
create a function:
```
CREATE OR REPLACE FUNCTION harmful_array_prepend(anycompatible)
RETURNS anycompatiblearray
LANGUAGE internal
IMMUTABLE PARALLEL SAFE
AS $function$array_prepend$function$;
```
to crash the server easily.
This issue can be reproduced when compiled with "-O0". When compiled with "-O2" the server does not crash, but a potential security risk remains because the function still accesses unknown memory.
A simple patch is provided to prevent access to unknown argument memory.
Jet
Halo Tech
On 3/10/26 11:24, Jet wrote:
> Hi Hackers,
>
> Recently, I notice a security risk when calling a function, it's
> strange but also interesting. E.g.
>
> `array_to_text_null` is a bultin function with 3 args. Normally, the
> function is working well. **BUT**
> if we create another version `array_to_text_null` function, say
> `harmful_array_to_string`, but with 2 args:
>

Yikes. This seems really dangerous.

> A simple patch provided to prevent to access unknow args memory.
>

I don't think this patch will cover all cases as the function might do something else with the data instead of checking for NULL, especially if it expects to be called from a function that is defined with RETURNS NULL ON NULL INPUT on the sql side.

My gut reaction would be to limit the creation of functions with language=internal to superusers, but that wouldn't work as it would break CREATE EXTENSION when there are server modules involved.

Maybe all C functions that are able to be used as language=internal needs to explicitly check nargs at the top of the function?

--
Anders Åstrand
Percona
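The nargs check suggested in the message above might look roughly like the following sketch. It deliberately avoids the real PostgreSQL headers: `MockArg`, `MockCallInfo`, `CallStatus`, and `array_to_text_model` are hypothetical stand-ins (loosely modelled on `NullableDatum`, `FunctionCallInfoBaseData`, and the real function) so the example compiles on its own. In an actual extension the argument count is available through `PG_NARGS()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one argument slot. */
typedef struct MockArg
{
    const char *value;
    bool        isnull;
} MockArg;

/* Hypothetical, simplified stand-in for the call-info struct. */
typedef struct MockCallInfo
{
    short   nargs;      /* number of arguments actually passed */
    MockArg args[4];    /* only the first nargs slots are meaningful */
} MockCallInfo;

typedef enum { CALL_OK, CALL_BAD_ARGCOUNT } CallStatus;

/*
 * Model of a 3-argument function that refuses to look at args[2]
 * unless the caller really supplied three arguments.
 */
static CallStatus
array_to_text_model(const MockCallInfo *fcinfo, const char **null_string)
{
    if (fcinfo->nargs != 3)
        return CALL_BAD_ARGCOUNT;   /* real code would raise an error here */

    /* A NULL third argument is passed through as a null pointer. */
    *null_string = fcinfo->args[2].isnull ? NULL : fcinfo->args[2].value;
    return CALL_OK;
}
```

Such a guard only covers the argument count. As noted above, it says nothing about argument types or about strictness assumptions baked into the C code.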
> My gut reaction would be to limit the creation of functions with
> language=internal to superusers, but that wouldn't work as it would
> break CREATE EXTENSION when there are server modules involved.
>
> Maybe all C functions that are able to be used as language=internal
> needs to explicitly check nargs at the top of the function?

Yes, all C functions carry this potential risk, not only language=internal. So limiting the creation of functions with language=internal is not enough.

Jet
Halo Tech
On Tue, 10 Mar 2026 at 11:25, Jet <zhangchenxi@halodbtech.com> wrote:
>
> Hi Hackers,
>
> Recently, I notice a security risk when calling a function, it's strange but also interesting. E.g.
>
> `array_to_text_null` is a bultin function with 3 args. Normally, the function is working well. **BUT**
> if we create another version `array_to_text_null` function, say `harmful_array_to_string`, but with 2 args:
[...]
> And the we call the new function:
[...]
> It will cause the server crash~

Correct. This is expected behaviour: the "internal" and "c" languages are not 'trusted' languages, and therefore only superusers can create functions using these languages. It is the explicit responsibility of the superuser to make sure the functions they create using untrusted languages are correct and execute safely when called by PostgreSQL.

Kind regards,

Matthias van de Meent
On Tue, Mar 10, 2026 at 8:03 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
> Correct. This is expected behaviour: the "internal" and "c" languages
> are not 'trusted' languages, and therefore only superusers can create
> functions using these languages. It is the explicit responsibility of
> the superuser to make sure the functions they create using untrusted
> languages are correct and execute safely when called by PostgreSQL.

Agreed!

In fact, it's pretty much theoretically impossible for this to work any other way. If we wanted to add checks that the expectations of the C code match the actual function definitions, how would we do that? I'm tempted to say we'd have to solve the halting problem (which is impossible, look it up), but the 2026 reality is that someone would just say "deploy an AI agent to check whether the code is safe for the definition," and that might actually work in practical cases, but we're not going to add a call-out to Claude as part of the CREATE FUNCTION statement.

And it's equally impossible to insist that every C function anyone writes must be prepared for an arbitrary number of arguments of arbitrary data types. Even doing that for core functions would be a massive waste of resources. Functions like +(int4,int4) can be called in very tight loops, and even the fact that those functions do overflow checking is a significant performance drain. Doing these kinds of checks to counter hypothetical scenarios would be a poor investment of resources that would make many users unhappy. Besides, even if we did that, we couldn't possibly enforce that out-of-core C code has all of the same checks, or that those checks are correctly coded.

Basically, yeah, being able to call code written directly in C is dangerous, but it's also necessary, because that's how we get reasonable performance.

--
Robert Haas
EDB: http://www.enterprisedb.com
> Correct. This is expected behaviour: the "internal" and "c" languages
> are not 'trusted' languages, and therefore only superusers can create
> functions using these languages.

Yes, you're right, only superusers can create functions in the "internal" and "c" languages.

> It is the explicit responsibility of
> the superuser to make sure the functions they create using untrusted
> languages are correct and execute safely when called by PostgreSQL.

But the question is: how can a superuser know the implementation details of "internal" and "c" functions? They will not know whether the code has !PG_ARGISNULL(...), and may create a harmful function accidentally...

Jet
Halo Tech
On Tuesday, March 10, 2026, Jet <zhangchenxi@halodbtech.com> wrote:
> It is the explicit responsibility of
> the superuser to make sure the functions they create using untrusted
> languages are correct and execute safely when called by PostgreSQL.
> But the question is how can a superuser know the "internal" and "c" functions
> implementation details? He will not know whether the code has !PG_ARGISNULL(...),
> and create a harmful function accidentally...
You describe the fundamental problem/risk of the entire software industry. At least PostgreSQL has chosen a business model where the superuser has the option to read the source code.
David J.
On Tue, 10 Mar 2026 at 17:27, Jet <zhangchenxi@halodbtech.com> wrote:
> > It is the explicit responsibility of
> > the superuser to make sure the functions they create using untrusted
> > languages are correct and execute safely when called by PostgreSQL.
> But the question is how can a superuser know the "internal" and "c" functions
> implementation details? He will not know whether the code has !PG_ARGISNULL(...),
> and create a harmful function accidentally...

I think our global assumption is that superuser is super-wise and knows everything

--
Best regards,
Kirill Reshke
> > > It is the explicit responsibility of
> > > the superuser to make sure the functions they create using untrusted
> > > languages are correct and execute safely when called by PostgreSQL.
> > But the question is how can a superuser know the "internal" and "c" functions
> > implementation details? He will not know whether the code has !PG_ARGISNULL(...),
> > and create a harmful function accidentally...
> I think our global assumption is that superuser is super-wise and
> knows everything

Totally agreed ...

Jet
Halo Tech
> but the 2026 reality is that someone would
> just say "deploy an AI agent to check whether the code is safe for the
> definition," and that might actually work in practical cases, but
> we're not going to add a call-out to Claude as part of the CREATE
> FUNCTION statement.

I noticed the potential problem precisely because I was using Claude to write a simple extension. It worked well in the testing environment, but when I moved the Claude-generated extension to the dev environment, the server crashed.

More and more people will use AI to generate code; that's the trend. But AI will make mistakes and may leave many potential risks. So I suppose that, as the base platform, we should make our best effort to be more robust.

Regards,
Jet
Halo Tech
> On 10 Mar 2026, at 14:09, Jet <zhangchenxi@halodbtech.com> wrote:
>
>> but the 2026 reality is that someone would
>> just say "deploy an AI agent to check whether the code is safe for the
>> definition," and that might actually work in practical cases, but
>> we're not going to add a call-out to Claude as part of the CREATE
>> FUNCTION statement.
> I notice the potential problem just because using Claude to write a simple
> extension. And it works well on testing enviroment. But when take over the
> Claude generated extenion to dev enviroment, the server crashed.
> More and more people will use AI to generate codes, that's the trend, but AI
> will make mistakes, and may leave many potention risks. So I suppose as the
> base platform, we should try our best efforts to make it more robust.

There is no protection strong enough against developers who run generated C code in production that they didn't read, review and test properly.

--
Daniel Gustafsson
On Tue, Mar 10, 2026 at 8:39 AM Kirill Reshke <reshkekirill@gmail.com> wrote:
> I think our global assumption is that superuser is super-wise and
> knows everything

Right, but in case they don't, instead of writing their own CREATE FUNCTION statements, they might want to use CREATE EXTENSION, thus depending on the wisdom of the extension provider in lieu of their own.

In ~30 years as a PostgreSQL user and developer, I've only written a relatively small number of CREATE FUNCTION ... LANGUAGE c/internal statements myself, and they've all been either for an extension or for some kind of development exercise. There's no real reason to go around writing random such statements that are completely broken just for fun.

By the way, if you think this is a fun way to break your database, try running "DELETE FROM pg_proc" sometime. Do not, under any circumstances, do this in a PostgreSQL instance that you ever want to use for anything ever again. I actually think we should have more guardrails against this kind of direct system catalog modification than we do -- like you have to set a GUC saying "yes, I know I'm potentially about to break everything really badly" before you can write to the system catalogs. The example that started this thread is essentially unpreventable, because we need CREATE FUNCTION to be possible and we need the superuser to tell us what the C code is expecting, but the number of people who go tinkering with catalog contents manually without fully understanding the consequences seems to be much larger than I would have thought, even if the tinkering is usually less dramatic than this example.

--
Robert Haas
EDB: http://www.enterprisedb.com
> Right, but in case they don't, instead of writing their own CREATE
> FUNCTION statements, they might want to use CREATE EXTENSION, thus
> depending on the wisdom of the extension provider in lieu of their
> own.
>
> In ~30 years as a PostgreSQL user and developer, I've only written a
> relatively small number of CREATE FUNCTION ... LANGUAGE c/internal
> statements myself, and they've all been either for an extension or for
> some kind of development exercise. There's no real reason to go around
> writing random such statements that are completely broken just for
> fun.

I don't think it's just for fun. People may prefer to use an EXTENSION, but the problem is that the EXTENSION may have been written by a person who doesn't have full extension-development skills, or who has no coding experience at all and only uses AI. That is exactly the case in which I noticed the problem: AI does all the work, and in most cases it works well but leaves potential risks. Will the end user really study the whole EXTENSION code? I am sure most of them will not. And AI will take over most of the coding work; that is what is happening...

Regards,
Jet
Halo Tech
On Tue, Mar 10, 2026 at 10:05 AM Jet <zhangchenxi@halodbtech.com> wrote:
> I don't think it just for fun. People may prefer to use EXTENSION, but the
> problem is may the EXTENSION was written by a person who don't have full
> skills with extension developing or even without any code experience but only
> using AI. Just in the case I notice the problem. AI doing all the things and on
> most cases it works well but leave potential risks. Will the end user really to
> study the whole EXTENSION code? I can ensure most of them will not. And AI
> will take over to do the most of coding works, that iss what happening...

Sure, but what do you propose to do about it? As I have already said, there's no realistic way for PostgreSQL itself to know what the correct function definition is.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, 10 Mar 2026 at 13:26, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Mar 10, 2026 at 8:03 AM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> > Correct. This is expected behaviour: the "internal" and "c" languages
> > are not 'trusted' languages, and therefore only superusers can create
> > functions using these languages. It is the explicit responsibility of
> > the superuser to make sure the functions they create using untrusted
> > languages are correct and execute safely when called by PostgreSQL.
>
> Agreed!
>
> In fact, it's pretty much theoretically impossible for this to work
> any other way. If we wanted to add checks that the expectations of the
> C code match the actual function definitions, how would we do that?
> I'm tempted to say we'd have to solve the halting problem (which is
> impossible, look it up), but the 2026 reality is that someone would
> just say "deploy an AI agent to check whether the code is safe for the
> definition," and that might actually work in practical cases, but
> we're not going to add a call-out to Claude as part of the CREATE
> FUNCTION statement.

Tangent: I think it could be possible to make extensions (and PG itself) generate more extensive pg_finfo records that contain sufficient information to describe the functions' expected SQL calling signature(s), which PG could then check and verify when the function is catalogued (e.g. through lanvalidator). E.g. "this function has 2 PG calling signatures: a volatile function with 2 non-null arguments, or an immutable function with 3 non-null arguments". Registrations which conflict with the exposed definition could then raise a warning to expose the difference. This would make the gap between C code and SQL code that needs to be bridged by manual superuser validation a bit smaller.

I won't claim it's trivial, but I do think it might be a worthwhile time investment, and extensions could benefit here, too, as such metadata could be used to validate and/or generate parts of extension's install/upgrade scripts.

(And, whilst this is not on my personal todo list, it's definitely on my wishlist; so do with the idea what you would like).

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)
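To make the idea above a little more concrete, here is a minimal sketch of what such a record and its check might look like. Nothing here exists in PostgreSQL: `ExpectedSignature`, `ExtendedFinfo`, and `declaration_matches` are hypothetical names, and the record carries only the three properties mentioned in the example (argument count, non-null arguments, volatility).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one SQL signature the C code accepts. */
typedef struct ExpectedSignature
{
    int  nargs;         /* number of SQL arguments */
    bool strict;        /* C code assumes no argument is NULL */
    bool is_volatile;   /* volatility the C code was written for */
} ExpectedSignature;

/* Hypothetical extended info record exported next to the C symbol. */
typedef struct ExtendedFinfo
{
    const ExpectedSignature *signatures;
    size_t                   nsignatures;
} ExtendedFinfo;

/*
 * Returns true if the properties declared in CREATE FUNCTION match
 * at least one signature exported by the C code.  A validator could
 * emit a warning when this returns false.
 */
static bool
declaration_matches(const ExtendedFinfo *info,
                    int nargs, bool strict, bool is_volatile)
{
    for (size_t i = 0; i < info->nsignatures; i++)
    {
        const ExpectedSignature *sig = &info->signatures[i];

        if (sig->nargs == nargs &&
            sig->strict == strict &&
            sig->is_volatile == is_volatile)
            return true;
    }
    return false;
}
```

With the two signatures from the example (volatile with 2 non-null arguments, immutable with 3 non-null arguments), a declaration with 2 arguments marked immutable would be flagged as a mismatch.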
On Tue, Mar 10, 2026 at 09:23:50AM -0400, Robert Haas wrote:
> [...]. The example that started this thread is
> essentially unpreventable, because we need CREATE FUNCTION to be
> possible and we need the superuser to tell us what the C code is
> expecting, but the number of people who go tinkering with catalog
> contents manually without fully understanding the consequences seems
> to be much larger than I would have thought, even if the tinkering is
> usually less dramatic than this example.

If DWARF is available you could always get the C function's prototype from that, and sanity-check it. But DWARF really bloats shared objects, and it's not universal, so it's not a good solution.

C is just a crappy language. You play with fire, you best know what you're doing -- that's a reasonable policy. And since PG is written in C, and users do have C-coded extensions here and there, playing with fire has to be supported.

It'd be clever if there was at least a standard for a subset of DWARF that provides just the types information (but not, e.g., stack unwinding) so that we could have some sort of standard reflection support in C. That would be for the C standards committee.

Nico
--
On Tue, 10 Mar 2026 at 17:19, Nico Williams <nico@cryptonector.com> wrote:
>
> On Tue, Mar 10, 2026 at 09:23:50AM -0400, Robert Haas wrote:
> > [...]. The example that started this thread is
> > essentially unpreventable, because we need CREATE FUNCTION to be
> > possible and we need the superuser to tell us what the C code is
> > expecting, but the number of people who go tinkering with catalog
> > contents manually without fully understanding the consequences seems
> > to be much larger than I would have thought, even if the tinkering is
> > usually less dramatic than this example.
>
> If DWARF is available you could always get the C function's
> prototype from that, and sanity-check it. But DWARF really bloats
> shared objects, and it's not universal, so it's not a good solution.

Even with DWARF analysis it wouldn't help for C-language SQL functions, as their signature is fixed: their one and only argument is always just a FunctionCallInfo aka FunctionCallInfoBaseData*. That struct then contains the actual arguments/argument count/nullability info.

Also note that the "c" language here effectively only means "dynamically loaded symbol using standard C linking with the platform's C calling convention": PostgreSQL doesn't compile the functions from sources. Any language that compiles to a binary that links with such symbols should work; e.g. C++ and Rust are both using this mechanism despite the "c" name used for the language.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)
On Tue, Mar 10, 2026 at 12:19 PM Nico Williams <nico@cryptonector.com> wrote:
> On Tue, Mar 10, 2026 at 09:23:50AM -0400, Robert Haas wrote:
> > [...]. The example that started this thread is
> > essentially unpreventable, because we need CREATE FUNCTION to be
> > possible and we need the superuser to tell us what the C code is
> > expecting, but the number of people who go tinkering with catalog
> > contents manually without fully understanding the consequences seems
> > to be much larger than I would have thought, even if the tinkering is
> > usually less dramatic than this example.
> If DWARF is available you could always get the C function's
> prototype from that, and sanity-check it. But DWARF really bloats
> shared objects, and it's not universal, so it's not a good solution.
> C is just a crappy language. You play with fire, you best know what
> you're doing -- that's a reasonable policy. And since PG is written in
> C, and users do have C-coded extensions here and there, playing with
> fire has to be supported.

I'm really tired of "C" bashing. The C programming language is a tool. Effective and powerful tools can be dangerous, but the productivity is worth it. The problem isn't the "C" language, it is with people who try to program in C but do not know C.

> It'd be clever if there was at least a standard for a subset of DWARF
> that provides just the types information (but not, e.g., stack
> unwinding) so that we could have some sort of standard reflection
> support in C. That would be for the C standards committee.

Why do we need that?

> Nico
> --
I have written a lot of PostgreSQL extensions and I think the interface as it is is pretty good. You do need to be careful and check your inputs.

One of the things we could do is create a set of macros and/or post-processing that could do something like C++ style name mangling for C functions. I have had to do this manually in the past, but maybe we can create a process that will scan a source file for metadata in comments and create the SQL function declarations based on the number of parameters and hook them up to the correct C functions. Something like that. It's an incremental mitigation.

Malicious remapping of SQL functions to bad C functions is not preventable if you have the permissions to do so.
Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> Tangent: I think it could be possible to make extensions (and PG
> itself) generate more extensive pg_finfo records that contain
> sufficient information to describe the functions' expected SQL calling
> signature(s), which PG could then check and verify when the function
> is catalogued (e.g. through lanvalidator).
I think that'd be a lot of work with little result other than to
change what sort of manual validation you have to do. Today, you
have to check "does the function's actual C code match the SQL
definition?". But with this, you'd have to check "does the function's
actual C code match the pg_finfo record?". I'm not seeing a huge win
there.
Many many years ago when we first designed V1 function call protocol,
I had the idea that we could write a tool that inspects C code like
    Datum
    int42pl(PG_FUNCTION_ARGS)
    {
        int32 arg1 = PG_GETARG_INT32(0);
        int16 arg2 = PG_GETARG_INT16(1);
        int32 result;
and automatically derives (or at least cross-checks against) the SQL
definition. And we probably still could write such a tool. But
there's a large fraction of the code base where no attention was paid
to following that layout, and/or one C function was made to handle
several signatures by writing conditional logic to fetch the
arguments. Maybe you could get an AI tool to disentangle such logic,
but how much you wanna trust the results?
regards, tom lane
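A very rough sketch of the kind of inspection tool described above, limited to one thing: scanning source text for `PG_GETARG_<TYPE>(<n>)` and reporting the smallest argument count the code appears to expect. The function name `min_args_expected` is hypothetical, and the scan is purely textual, so it is defeated by exactly the cases mentioned above (arguments fetched conditionally, or one C function serving several signatures).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan C source text for PG_GETARG_<TYPE>(<n>) and return the largest
 * literal index seen plus one, i.e. the minimum number of arguments
 * the code appears to expect.  Returns 0 if no such call is found.
 */
static int
min_args_expected(const char *source)
{
    static const char prefix[] = "PG_GETARG_";
    const char *p = source;
    int         max_index = -1;

    while ((p = strstr(p, prefix)) != NULL)
    {
        char *end;
        long  idx;

        p += sizeof(prefix) - 1;    /* skip past "PG_GETARG_" */

        /* skip the type suffix, e.g. INT32 or TEXT_PP */
        while (isalnum((unsigned char) *p) || *p == '_')
            p++;
        if (*p != '(')
            continue;               /* not a macro call, keep scanning */

        /* accept only a plain integer literal followed by ')' */
        idx = strtol(p + 1, &end, 10);
        if (end != p + 1 && *end == ')' && idx > max_index)
            max_index = (int) idx;
    }
    return max_index + 1;
}
```

Applied to the `int42pl` fragment above, this reports 2. A cross-check could then compare that number with the argument count in the SQL definition.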
On Tue, Mar 10, 2026 at 19:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> > Tangent: I think it could be possible to make extensions (and PG
> > itself) generate more extensive pg_finfo records that contain
> > sufficient information to describe the functions' expected SQL calling
> > signature(s), which PG could then check and verify when the function
> > is catalogued (e.g. through lanvalidator).
> I think that'd be a lot of work with little result other than to
> change what sort of manual validation you have to do. Today, you
> have to check "does the function's actual C code match the SQL
> definition?". But with this, you'd have to check "does the function's
> actual C code match the pg_finfo record?". I'm not seeing a huge win
> there.
> Many many years ago when we first designed V1 function call protocol,
> I had the idea that we could write a tool that inspects C code like
>     Datum
>     int42pl(PG_FUNCTION_ARGS)
>     {
>         int32 arg1 = PG_GETARG_INT32(0);
>         int16 arg2 = PG_GETARG_INT16(1);
>         int32 result;
> and automatically derives (or at least cross-checks against) the SQL
> definition. And we probably still could write such a tool. But
> there's a large fraction of the code base where no attention was paid
> to following that layout, and/or one C function was made to handle
> several signatures by writing conditional logic to fetch the
> arguments. Maybe you could get an AI tool to disentangle such logic,
> but how much you wanna trust the results?

FmgrInfo holds fn_oid, so maybe it can hold proallargtypes too, and then some assertion can be injected into the PG_GETARG_X macros without a massive slowdown.

Regards
Pavel

> regards, tom lane
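One way to picture the suggestion above is the following sketch, which uses mock types rather than the real fmgr structures. `MockFmgrInfo`, `MockCallInfo`, `arg_matches`, and `MOCK_GETARG_INT32` are all hypothetical; the `argtypes` field stands in for a cached copy of the declared argument types, which the real `FmgrInfo` does not carry today. The constants 23 and 25 mirror the `pg_type` OIDs of `int4` and `text`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TypeId;                    /* stand-in for an Oid */
enum { TYPE_INT4 = 23, TYPE_TEXT = 25 };    /* mirrors int4 and text OIDs */

/* Hypothetical: lookup info extended with the declared argument types. */
typedef struct MockFmgrInfo
{
    short         nargs;
    const TypeId *argtypes;
} MockFmgrInfo;

typedef struct MockCallInfo
{
    const MockFmgrInfo *flinfo;
    intptr_t            args[4];
} MockCallInfo;

/* True if slot n exists and was declared with the expected type. */
static bool
arg_matches(const MockCallInfo *fcinfo, int n, TypeId expected)
{
    return n >= 0 &&
           n < fcinfo->flinfo->nargs &&
           fcinfo->flinfo->argtypes[n] == expected;
}

/*
 * Checked accessor.  The assertion is compiled out when NDEBUG is
 * defined, so production builds would pay nothing for it.
 */
#define MOCK_GETARG_INT32(fcinfo, n) \
    (assert(arg_matches((fcinfo), (n), TYPE_INT4)), \
     (int32_t) (fcinfo)->args[(n)])
```

In an assert-enabled build, fetching slot 2 of a function declared with two arguments, or fetching a `text` slot as `int32`, would trip the assertion instead of reading unrelated memory.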
Hi,
On 2026-03-10 14:08:45 -0400, Tom Lane wrote:
> Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> > Tangent: I think it could be possible to make extensions (and PG
> > itself) generate more extensive pg_finfo records that contain
> > sufficient information to describe the functions' expected SQL calling
> > signature(s), which PG could then check and verify when the function
> > is catalogued (e.g. through lanvalidator).
>
> I think that'd be a lot of work with little result other than to
> change what sort of manual validation you have to do. Today, you
> have to check "does the function's actual C code match the SQL
> definition?". But with this, you'd have to check "does the function's
> actual C code match the pg_finfo record?". I'm not seeing a huge win
> there.
If we were to do this, I'd assume it'd be something vaguely like
PG_DEFINE_C_FUNCTION(funcname, {argtype1, argtype2}, returntype)
{
...
}
Where PG_DEFINE_C_FUNCTION() would evaluate to an extended version of
PG_FUNCTION_INFO_V1() that also declared argument types and also emitted the
function definition. So there hopefully would be less of a chance of a
mismatch... Then the CREATE FUNCTION could verify that, if present, the
additional information present in the finfo matches the SQL signature.
FWIW, I think we're going to eventually need a more optimized function call
protocol for the most common cases (small number of arguments, no SRF, perhaps
requiring them to be strict, ...). If you look at profiles of queries that do
stuff like aggregate transition invocations or WHERE clause evaluation as part
of a large seqscan, moving things into and out of FunctionCallInfo really adds
up. We spend way more on that than e.g. evaluating an int4lt or int8inc.
Greetings,
Andres Freund
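The macro idea above could be sketched as follows, again with mock types so that it compiles standalone. `MOCK_DEFINE_C_FUNCTION`, `MockFuncInfo`, `MockType`, `MockDatum`, and `MockCallInfo` are hypothetical, and the argument order differs from the message (return type first, then the argument types as variadic arguments) because a brace-enclosed list containing commas cannot be passed as a single macro argument.

```c
#include <assert.h>
#include <stddef.h>

typedef long MockDatum;     /* stand-in for Datum */

typedef struct MockCallInfo
{
    short     nargs;
    MockDatum args[4];
} MockCallInfo;

typedef enum { T_INT2, T_INT4, T_TEXT } MockType;

/* Hypothetical metadata record emitted next to each function. */
typedef struct MockFuncInfo
{
    const char     *name;
    MockType        rettype;
    size_t          nargs;
    const MockType *argtypes;
} MockFuncInfo;

/*
 * Emits the argument-type table, the metadata record, and the head of
 * the function definition from a single declaration, so the metadata
 * and the function cannot drift apart by accident.
 */
#define MOCK_DEFINE_C_FUNCTION(name, rettype, ...) \
    static const MockType name##_argtypes[] = { __VA_ARGS__ }; \
    static const MockFuncInfo name##_info = { \
        #name, rettype, \
        sizeof(name##_argtypes) / sizeof(name##_argtypes[0]), \
        name##_argtypes \
    }; \
    static MockDatum name(MockCallInfo *fcinfo)

/* int4 + int2 -> int4, declared and defined in one place. */
MOCK_DEFINE_C_FUNCTION(int42pl, T_INT4, T_INT4, T_INT2)
{
    return fcinfo->args[0] + fcinfo->args[1];
}
```

A `CREATE FUNCTION` check could then compare `int42pl_info.nargs` and `int42pl_info.argtypes` with the SQL signature. This sketch does not handle functions with zero arguments, since an empty initializer list is not valid in older C standards.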
Hi
On Tue, Mar 10, 2026 at 20:56, Andres Freund <andres@anarazel.de> wrote:
> Hi,
> On 2026-03-10 14:08:45 -0400, Tom Lane wrote:
> > Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> > > Tangent: I think it could be possible to make extensions (and PG
> > > itself) generate more extensive pg_finfo records that contain
> > > sufficient information to describe the functions' expected SQL calling
> > > signature(s), which PG could then check and verify when the function
> > > is catalogued (e.g. through lanvalidator).
> >
> > I think that'd be a lot of work with little result other than to
> > change what sort of manual validation you have to do. Today, you
> > have to check "does the function's actual C code match the SQL
> > definition?". But with this, you'd have to check "does the function's
> > actual C code match the pg_finfo record?". I'm not seeing a huge win
> > there.
> If we were to do this, I'd assume it'd be something vaguely like
> PG_DEFINE_C_FUNCTION(funcname, {argtype1, argtype2}, returntype)
> {
> ...
> }
> Where PG_DEFINE_C_FUNCTION() would evaluate to an extended version of
> PG_FUNCTION_INFO_V1() that also declared argument types and also emitted the
> function definition. So there hopefully would be less of a chance of a
> mismatch... Then the CREATE FUNCTION could verify that, if present, the
> additional information present in the finfo matches the SQL signature.
> FWIW, I think we're going to eventually need a more optimized function call
> protocol for the most common cases (small number of arguments, no SRF, perhaps
> requiring them to be strict, ...). If you look at profiles of queries that do
> stuff like aggregate transition invocations or WHERE clause evaluation as part
> of a large seqscan, moving things into and out FunctionCallInfo really adds
> up. We spend way more on that than e.g. evaluating an int4lt or int8inc.

Maybe a vector executor and vector instructions can be a solution, as an alternative (not a substitute).
The fmgr overhead per row is high, but for a call with a batch of 1000 rows it can be minimal.

Regards
Pavel

> Greetings,
> Andres Freund