Discussion: Potential security risk associated with function call


Potential security risk associated with function call

From

"Jet"

Date:

10 March, 13:24:47

Hi Hackers,

Recently, I noticed a security risk when calling a function. It's strange but also interesting. For example:

`array_to_text_null` is a built-in function with 3 args. Normally, the function works well. **BUT**

if we create another version of the `array_to_text_null` function, say `harmful_array_to_string`, but with 2 args:

```
CREATE OR REPLACE FUNCTION harmful_array_to_string(anyarray, text)
RETURNS text
LANGUAGE internal
STABLE PARALLEL SAFE STRICT
AS $function$array_to_text_null$function$;
```

And then we call the new function:

```
postgres=# SELECT harmful_array_to_string(ARRAY[1,2], 'HARMFUL');
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
```

It will cause the server to crash.

The reason is that there is an if statement in `array_to_text_null`

```
Datum
array_to_text_null(PG_FUNCTION_ARGS)
{
    ...
    /* NULL null string is passed through as a null pointer */
    if (!PG_ARGISNULL(2))
        null_string = text_to_cstring(PG_GETARG_TEXT_PP(2));
    ...
}
```

that determines whether the 3rd arg is NULL or not. We only pass 2 args to the function, but the
if condition here evaluates to true, so the function tries to fetch the 3rd arg, which causes the segmentation fault.

The strange but interesting thing is this: if we change the code to:

```
Datum
array_to_text_null(PG_FUNCTION_ARGS)
{
    ...
    /* NULL null string is passed through as a null pointer */
    if (PG_ARGISNULL(2))
        null_string = text_to_cstring(PG_GETARG_TEXT_PP(2));
    ...
}
```

Will this code work well?

NO! The if condition still evaluates to true, so it still causes the segmentation fault.
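One way to picture why neither form of the test can be relied on: the call frame only carries meaningful data for the arguments the SQL declaration supplies, so the slot for a third argument holds whatever happens to be in memory. The following is a minimal standalone sketch, not PostgreSQL code. `MiniArg`, `MiniCallInfo`, `unchecked_isnull`, `checked_isnull` and `make_two_arg_frame` are hypothetical names, and the fixed four-slot array exists only so that the sketch itself never reads out of bounds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for a call frame; not the real fmgr structures. */
typedef struct MiniArg
{
    uintptr_t value;    /* pointer-sized payload, similar in spirit to a Datum */
    bool      isnull;   /* per-argument NULL flag */
} MiniArg;

typedef struct MiniCallInfo
{
    short   nargs;      /* number of arguments the caller really passed */
    MiniArg args[4];    /* fixed capacity keeps the sketch in defined behaviour */
} MiniCallInfo;

/* Unchecked style: trusts that slot n was filled in by the caller. */
bool
unchecked_isnull(const MiniCallInfo *fc, int n)
{
    return fc->args[n].isnull;
}

/* Defensive variant: a slot at or beyond nargs is reported as NULL. */
bool
checked_isnull(const MiniCallInfo *fc, int n)
{
    if (n < 0 || n >= fc->nargs)
        return true;
    return fc->args[n].isnull;
}

/* Builds a two-argument frame whose third slot holds stale bytes. */
MiniCallInfo
make_two_arg_frame(bool stale_isnull)
{
    MiniCallInfo fc = {0};

    fc.nargs = 2;
    fc.args[0].value = 1;
    fc.args[1].value = 2;
    /* Leftover data from some earlier use of the same memory. */
    fc.args[2].value = (uintptr_t) 0xDEADBEEF;
    fc.args[2].isnull = stale_isnull;
    return fc;
}
```

With `make_two_arg_frame(false)` and `make_two_arg_frame(true)`, `unchecked_isnull(&fc, 2)` simply echoes the stale flag, so flipping the condition in the caller does not make the read any safer. `checked_isnull(&fc, 2)` returns true in both cases because it consults `nargs` first.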

This is not limited to `array_to_text_null`. Other functions have the same problem. For `array_prepend`, for example, we can
create a function:

```
CREATE OR REPLACE FUNCTION harmful_array_prepend(anycompatible)
RETURNS anycompatiblearray
LANGUAGE internal
IMMUTABLE PARALLEL SAFE
AS $function$array_prepend$function$;
```

to crash the server easily.

This issue can be reproduced when compiled with "-O0". When compiled with "-O2", the server does not crash, but a potential security risk remains because the function accesses unknown memory.

A simple patch is attached to prevent access to unknown argument memory.

Jet

Halo Tech

Attachments

0001-fix-potential-funccall-leakrisk.patch

Re: Potential security risk associated with function call

From

"Anders Åstrand"

Date:

10 March, 13:50:42

On 3/10/26 11:24, Jet wrote:
> Hi Hackers,
>
> Recently, I noticed a security risk when calling a function. It's
> strange but also interesting. For example:
>
> `array_to_text_null` is a built-in function with 3 args. Normally, the
> function works well. **BUT**
> if we create another version of the `array_to_text_null` function, say
> `harmful_array_to_string`, but with 2 args:
>

Yikes. This seems really dangerous.

> A simple patch is attached to prevent access to unknown argument memory.

I don't think this patch will cover all cases as the function might do
something else with the data instead of checking for NULL, especially if
it expects to be called from a function that is defined with RETURNS
NULL ON NULL INPUT on the sql side.

My gut reaction would be to limit the creation of functions with
language=internal to superusers, but that wouldn't work as it would
break CREATE EXTENSION when there are server modules involved.

Maybe all C functions that are able to be used as language=internal
need to explicitly check nargs at the top of the function?

-- 
Anders Åstrand
Percona

Re: Potential security risk associated with function call

From

"Jet"

Date:

10 March, 14:50:07

> My gut reaction would be to limit the creation of functions with
> language=internal to superusers, but that wouldn't work as it would
> break CREATE EXTENSION when there are server modules involved.
>
> Maybe all C functions that are able to be used as language=internal
> need to explicitly check nargs at the top of the function?

Yes, all C functions carry this potential risk, not only those with language=internal.
So limiting the creation of functions with language=internal is not enough.

Jet
Halo Tech

Re: Potential security risk associated with function call

From

Matthias van de Meent

Date:

10 March, 15:02:52

On Tue, 10 Mar 2026 at 11:25, Jet <zhangchenxi@halodbtech.com> wrote:
>
> Hi Hackers,
>
> Recently, I noticed a security risk when calling a function. It's strange but also interesting. For example:
>
> `array_to_text_null` is a built-in function with 3 args. Normally, the function works well. **BUT**
> if we create another version of the `array_to_text_null` function, say `harmful_array_to_string`, but with 2 args:
[...]
> And then we call the new function:
[...]
> It will cause the server to crash.

Correct. This is expected behaviour: the "internal" and "c" languages
are not 'trusted' languages, and therefore only superusers can create
functions using these languages. It is the explicit responsibility of
the superuser to make sure the functions they create using untrusted
languages are correct and execute safely when called by PostgreSQL.

Kind regards,

Matthias van de Meent

Re: Potential security risk associated with function call

From

Robert Haas

Date:

10 March, 15:26:29

On Tue, Mar 10, 2026 at 8:03 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> Correct. This is expected behaviour: the "internal" and "c" languages
> are not 'trusted' languages, and therefore only superusers can create
> functions using these languages. It is the explicit responsibility of
> the superuser to make sure the functions they create using untrusted
> languages are correct and execute safely when called by PostgreSQL.

Agreed!

In fact, it's pretty much theoretically impossible for this to work
any other way. If we wanted to add checks that the expectations of the
C code match the actual function definitions, how would we do that?
I'm tempted to say we'd have to solve the halting problem (which is
impossible, look it up), but the 2026 reality is that someone would
just say "deploy an AI agent to check whether the code is safe for the
definition," and that might actually work in practical cases, but
we're not going to add a call-out to Claude as part of the CREATE
FUNCTION statement. And it's equally impossible to insist that every C
function anyone writes must be prepared for an arbitrary number of
arguments of arbitrary data types. Even doing that for core functions
would be a massive waste of resources. Functions like +(int4,int4) can
be called in very tight loops, and even the fact that those functions
do overflow checking is a significant performance drain. Doing these
kinds of checks to counter hypothetical scenarios would be a poor
investment of resources that would make many users unhappy. Besides,
even if we did that, we couldn't possibly enforce that out-of-core C
code has all of the same checks, or that those checks are correctly
coded.

Basically, yeah, being able to call code written directly in C is
dangerous, but it's also necessary, because that's how we get
reasonable performance.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Potential security risk associated with function call

From

"Jet"

Date:

10 March, 15:27:33

> Correct. This is expected behaviour: the "internal" and "c" languages
> are not 'trusted' languages, and therefore only superusers can create
> functions using these languages.

Yes, you're right, only superusers can create "internal" and "c" language functions.

> It is the explicit responsibility of
> the superuser to make sure the functions they create using untrusted
> languages are correct and execute safely when called by PostgreSQL.

But the question is: how can a superuser know the implementation details of
"internal" and "c" functions? They will not know whether the code has !PG_ARGISNULL(...),
and may create a harmful function accidentally...

Jet
Halo Tech

Re: Potential security risk associated with function call

From

"David G. Johnston"

Date:

10 March, 15:37:13

On Tuesday, March 10, 2026, Jet <zhangchenxi@halodbtech.com> wrote:

> > It is the explicit responsibility of
> > the superuser to make sure the functions they create using untrusted
> > languages are correct and execute safely when called by PostgreSQL.
> But the question is how can a superuser know the "internal" and "c" functions
> implementation details? He will not know whether the code has !PG_ARGISNULL(...),
> and create a harmful function accidentally...

You describe the fundamental problem/risk of the entire software industry. At least PostgreSQL has chosen a business model where the superuser has the option to read the source code.

David J.

Re: Potential security risk associated with function call

From

Kirill Reshke

Date:

10 March, 15:39:28

On Tue, 10 Mar 2026 at 17:27, Jet <zhangchenxi@halodbtech.com> wrote:

> > It is the explicit responsibility of
> > the superuser to make sure the functions they create using untrusted
> > languages are correct and execute safely when called by PostgreSQL.
> But the question is how can a superuser know the "internal" and "c" functions
> implementation details? He will not know whether the code has !PG_ARGISNULL(...),
> and create a harmful function accidentally...

I think our global assumption is that superuser is super-wise and
knows everything

-- 
Best regards,
Kirill Reshke

Re: Potential security risk associated with function call

From

"Jet"

Date:

10 March, 15:44:46

> > > It is the explicit responsibility of
> > > the superuser to make sure the functions they create using untrusted
> > > languages are correct and execute safely when called by PostgreSQL.
> > But the question is how can a superuser know the "internal" and "c" functions
> > implementation details? He will not know whether the code has !PG_ARGISNULL(...),
> > and create a harmful function accidentally...

> I think our global assumption is that superuser is super-wise and
> knows everything

Totally agreed ...

Jet
Halo Tech

Re: Potential security risk associated with function call

From

"Jet"

Date:

10 March, 16:09:52

> but the 2026 reality is that someone would
> just say "deploy an AI agent to check whether the code is safe for the
> definition," and that might actually work in practical cases, but
> we're not going to add a call-out to Claude as part of the CREATE
> FUNCTION statement.

I noticed the potential problem precisely because I was using Claude to write a simple
extension. It worked well in the testing environment, but when I moved the
Claude-generated extension to the dev environment, the server crashed.
More and more people will use AI to generate code; that's the trend. But AI
will make mistakes and may leave many potential risks. So I suppose that, as the
base platform, we should make our best effort to make it more robust.

Regards,
Jet
Halo Tech

Re: Potential security risk associated with function call

From

Daniel Gustafsson

Date:

10 March, 16:13:24

> On 10 Mar 2026, at 14:09, Jet <zhangchenxi@halodbtech.com> wrote:
>
>> but the 2026 reality is that someone would
>> just say "deploy an AI agent to check whether the code is safe for the
>> definition," and that might actually work in practical cases, but
>> we're not going to add a call-out to Claude as part of the CREATE
>> FUNCTION statement.
> I noticed the potential problem precisely because I was using Claude to write a simple
> extension. It worked well in the testing environment, but when I moved the
> Claude-generated extension to the dev environment, the server crashed.
> More and more people will use AI to generate code; that's the trend. But AI
> will make mistakes and may leave many potential risks. So I suppose that, as the
> base platform, we should make our best effort to make it more robust.

There is no protection strong enough against developers who run generated C
code in production that they didn't read, review and test properly.

--
Daniel Gustafsson

Re: Potential security risk associated with function call

From

Robert Haas

Date:

10 March, 16:23:50

On Tue, Mar 10, 2026 at 8:39 AM Kirill Reshke <reshkekirill@gmail.com> wrote:
> I think our global assumption is that superuser is super-wise and
> knows everything

Right, but in case they don't, instead of writing their own CREATE
FUNCTION statements, they might want to use CREATE EXTENSION, thus
depending on the wisdom of the extension provider in lieu of their
own.

In ~30 years as a PostgreSQL user and developer, I've only written a
relatively small number of CREATE FUNCTION ... LANGUAGE c/internal
statements myself, and they've all been either for an extension or for
some kind of development exercise. There's no real reason to go around
writing random such statements that are completely broken just for
fun.

By the way, if you think this is a fun way to break your database, try
running "DELETE FROM pg_proc" sometime. Do not, under any
circumstances, do this in a PostgreSQL instance that you ever want to
use for anything ever again. I actually think we should have more
guardrails against this kind of direct system catalog modification
than we do -- like you have to set a GUC saying "yes, I know I'm
potentially about to break everything really badly" before you can
write to the system catalogs. The example that started this thread is
essentially unpreventable, because we need CREATE FUNCTION to be
possible and we need the superuser to tell us what the C code is
expecting, but the number of people who go tinkering with catalog
contents manually without fully understanding the consequences seems
to be much larger than I would have thought, even if the tinkering is
usually less dramatic than this example.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Potential security risk associated with function call

From

"Jet"

Date:

10 March, 17:05:01

> Right, but in case they don't, instead of writing their own CREATE
> FUNCTION statements, they might want to use CREATE EXTENSION, thus
> depending on the wisdom of the extension provider in lieu of their
> own.
>
> In ~30 years as a PostgreSQL user and developer, I've only written a
> relatively small number of CREATE FUNCTION ... LANGUAGE c/internal
> statements myself, and they've all been either for an extension or for
> some kind of development exercise. There's no real reason to go around
> writing random such statements that are completely broken just for
> fun.

I don't think it is just for fun. People may prefer to use an EXTENSION, but the
problem is that the EXTENSION may have been written by a person who lacks
extension development skills, or who has no coding experience at all and only
uses AI. That is exactly the case in which I noticed the problem. AI does all the work, and in
most cases it works well, but it leaves potential risks. Will the end user really
study the whole EXTENSION code? I am sure most of them will not. And AI
will take over most of the coding work; that is what is happening...

Regards,
Jet
Halo Tech

Re: Potential security risk associated with function call

From

Robert Haas

Date:

10 March, 18:22:48

On Tue, Mar 10, 2026 at 10:05 AM Jet <zhangchenxi@halodbtech.com> wrote:
> I don't think it is just for fun. People may prefer to use an EXTENSION, but the
> problem is that the EXTENSION may have been written by a person who lacks
> extension development skills, or who has no coding experience at all and only
> uses AI. That is exactly the case in which I noticed the problem. AI does all the work, and in
> most cases it works well, but it leaves potential risks. Will the end user really
> study the whole EXTENSION code? I am sure most of them will not. And AI
> will take over most of the coding work; that is what is happening...

Sure, but what do you propose to do about it? As I have already said,
there's no realistic way for PostgreSQL itself to know what the
correct function definition is.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Potential security risk associated with function call

From

Matthias van de Meent

Date:

10 March, 19:06:41

On Tue, 10 Mar 2026 at 13:26, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Mar 10, 2026 at 8:03 AM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> > Correct. This is expected behaviour: the "internal" and "c" languages
> > are not 'trusted' languages, and therefore only superusers can create
> > functions using these languages. It is the explicit responsibility of
> > the superuser to make sure the functions they create using untrusted
> > languages are correct and execute safely when called by PostgreSQL.
>
> Agreed!
>
> In fact, it's pretty much theoretically impossible for this to work
> any other way. If we wanted to add checks that the expectations of the
> C code match the actual function definitions, how would we do that?
> I'm tempted to say we'd have to solve the halting problem (which is
> impossible, look it up), but the 2026 reality is that someone would
> just say "deploy an AI agent to check whether the code is safe for the
> definition," and that might actually work in practical cases, but
> we're not going to add a call-out to Claude as part of the CREATE
> FUNCTION statement.

Tangent: I think it could be possible to make extensions (and PG
itself) generate more extensive pg_finfo records that contain
sufficient information to describe the functions' expected SQL calling
signature(s), which PG could then check and verify when the function
is catalogued (e.g. through lanvalidator).
E.g. "this function has 2 PG calling signatures: a volatile function
with 2 non-null arguments, or an immutable function with 3 non-null
arguments". Registrations which conflict with the exposed definition
could then raise a warning to expose the difference. This would make
the gap between C code and SQL code that needs to be bridged by manual
superuser validation a bit smaller.

I won't claim it's trivial, but I do think it might be a worthwhile
time investment, and extensions could benefit here, too, as such
metadata could be used to validate and/or generate parts of
extensions' install/upgrade scripts.

(And, whilst this is not on my personal todo list, it's definitely on
my wishlist; so do with the idea what you would like).
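To make the idea a little more concrete, here is a rough sketch of what such signature metadata and its check could look like. Nothing here is an existing PostgreSQL structure or API: `SigInfo`, `FuncMeta`, `declaration_matches` and the sample data are hypothetical, and a real validator would presumably raise a warning rather than return a flag. The sample data encodes the example above: a volatile function with 2 non-null arguments, or an immutable function with 3 non-null arguments.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one SQL signature a C function accepts. */
typedef struct SigInfo
{
    int  nargs;         /* number of SQL arguments expected */
    bool strict;        /* true if the C code assumes no NULL arguments */
    char volatility;    /* 'i' immutable, 's' stable, 'v' volatile */
} SigInfo;

/* Hypothetical extended info record: a symbol plus its accepted signatures. */
typedef struct FuncMeta
{
    const char    *symbol;
    const SigInfo *sigs;
    size_t         nsigs;
} FuncMeta;

/*
 * Returns true if the SQL-level declaration matches at least one of the
 * signatures the C code claims to support.
 */
bool
declaration_matches(const FuncMeta *meta, const char *symbol,
                    int nargs, bool strict, char volatility)
{
    if (strcmp(meta->symbol, symbol) != 0)
        return false;

    for (size_t i = 0; i < meta->nsigs; i++)
    {
        const SigInfo *s = &meta->sigs[i];

        if (s->nargs == nargs && s->strict == strict &&
            s->volatility == volatility)
            return true;
    }
    return false;
}

/* Sample record: volatile with 2 args, or immutable with 3 args. */
static const SigInfo example_sigs[] = {
    {2, true, 'v'},
    {3, true, 'i'},
};

const FuncMeta example_meta = {"example_func", example_sigs, 2};
```

A declaration of `example_func` with 2 strict arguments marked volatile matches, while the same arity marked immutable does not, which is the kind of mismatch that could be surfaced when the function is catalogued.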

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

Re: Potential security risk associated with function call

From

Nico Williams

Date:

10 March, 19:19:43

On Tue, Mar 10, 2026 at 09:23:50AM -0400, Robert Haas wrote:
>                        [...]. The example that started this thread is
> essentially unpreventable, because we need CREATE FUNCTION to be
> possible and we need the superuser to tell us what the C code is
> expecting, but the number of people who go tinkering with catalog
> contents manually without fully understanding the consequences seems
> to be much larger than I would have thought, even if the tinkering is
> usually less dramatic than this example.

If DWARF is available you could always get the C function's
prototype from that, and sanity-check it.  But DWARF really bloats
shared objects, and it's not universal, so it's not a good solution.

C is just a crappy language.  You play with fire, you best know what
you're doing -- that's a reasonable policy.  And since PG is written in
C, and users do have C-coded extensions here and there, playing with
fire has to be supported.

It'd be clever if there was at least a standard for a subset of DWARF
that provides just the types information (but not, e.g., stack
unwinding) so that we could have some sort of standard reflection
support in C.  That would be for the C standards committee.

Nico
--

Re: Potential security risk associated with function call

From

Matthias van de Meent

Date:

10 March, 19:54:12

On Tue, 10 Mar 2026 at 17:19, Nico Williams <nico@cryptonector.com> wrote:
>
> On Tue, Mar 10, 2026 at 09:23:50AM -0400, Robert Haas wrote:
> >                        [...]. The example that started this thread is
> > essentially unpreventable, because we need CREATE FUNCTION to be
> > possible and we need the superuser to tell us what the C code is
> > expecting, but the number of people who go tinkering with catalog
> > contents manually without fully understanding the consequences seems
> > to be much larger than I would have thought, even if the tinkering is
> > usually less dramatic than this example.
>
> If DWARF is available you could always get the C function's
> prototype from that, and sanity-check it.  But DWARF really bloats
> shared objects, and it's not universal, so it's not a good solution.

Even with DWARF analysis it wouldn't help for C-language SQL
functions, as their signature is fixed: their one and only argument is
always just a FunctionCallInfo aka FunctionCallInfoBaseData*. That
struct then contains the actual arguments/argument count/nullability
info.

Also note that the "c" language here effectively only means
"dynamically loaded symbol using standard C linking with the
platform's C calling convention": PostgreSQL doesn't compile the
functions from sources. Any language that compiles to a binary that
links with such symbols should work; e.g. C++ and Rust are both using
this mechanism despite the "c" name used for the language.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

Re: Potential security risk associated with function call

From

Mark Woodward

Date:

10 March, 20:42:58

On Tue, Mar 10, 2026 at 12:19 PM Nico Williams <nico@cryptonector.com> wrote:

> On Tue, Mar 10, 2026 at 09:23:50AM -0400, Robert Haas wrote:
> > [...]. The example that started this thread is
> > essentially unpreventable, because we need CREATE FUNCTION to be
> > possible and we need the superuser to tell us what the C code is
> > expecting, but the number of people who go tinkering with catalog
> > contents manually without fully understanding the consequences seems
> > to be much larger than I would have thought, even if the tinkering is
> > usually less dramatic than this example.
>
> If DWARF is available you could always get the C function's
> prototype from that, and sanity-check it. But DWARF really bloats
> shared objects, and it's not universal, so it's not a good solution.
>
> C is just a crappy language. You play with fire, you best know what
> you're doing -- that's a reasonable policy. And since PG is written in
> C, and users do have C-coded extensions here and there, playing with
> fire has to be supported.

I'm really tired of "C" bashing. The C programming language is a tool. Effective and powerful tools can be dangerous, but the productivity is worth it. The problem isn't the "C" language, it is with people who try to program in C but do not know C.

> It'd be clever if there was at least a standard for a subset of DWARF
> that provides just the types information (but not, e.g., stack
> unwinding) so that we could have some sort of standard reflection
> support in C. That would be for the C standards committee.

Why do we need that?

> Nico
> --

Re: Potential security risk associated with function call

From

Mark Woodward

Date:

10 March, 20:58:07

I have written a lot of PostgreSQL extensions and I think the interface as it is is pretty good. You do need to be careful and check your inputs. One of the things we could do is create a set of macros and/or a post-processing step that does something like C++-style name mangling for C functions. I have had to do this manually in the past, but maybe we can create a process that scans a source file for metadata in comments, creates the SQL function declarations based on the number of parameters, and hooks them up to the correct C functions. Something like that. It's an incremental mitigation. Malicious remapping of SQL functions to bad C functions is not preventable if you have the permissions to do so.


Re: Potential security risk associated with function call

From

Tom Lane

Date:

10 March, 21:08:45

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> Tangent: I think it could be possible to make extensions (and PG
> itself) generate more extensive pg_finfo records that contain
> sufficient information to describe the functions' expected SQL calling
> signature(s), which PG could then check and verify when the function
> is catalogued (e.g. through lanvalidator).

I think that'd be a lot of work with little result other than to
change what sort of manual validation you have to do.  Today, you
have to check "does the function's actual C code match the SQL
definition?".  But with this, you'd have to check "does the function's
actual C code match the pg_finfo record?".  I'm not seeing a huge win
there.

Many many years ago when we first designed V1 function call protocol,
I had the idea that we could write a tool that inspects C code like

Datum
int42pl(PG_FUNCTION_ARGS)
{
    int32        arg1 = PG_GETARG_INT32(0);
    int16        arg2 = PG_GETARG_INT16(1);
    int32        result;

and automatically derives (or at least cross-checks against) the SQL
definition.  And we probably still could write such a tool.  But
there's a large fraction of the code base where no attention was paid
to following that layout, and/or one C function was made to handle
several signatures by writing conditional logic to fetch the
arguments.  Maybe you could get an AI tool to disentangle such logic,
but how much you wanna trust the results?
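As a rough illustration of the simple case only, a scanner for the conventional layout could look like the sketch below. It is a toy, not an existing tool: `infer_arg_count` and `SKETCH_MAX_ARGS` are hypothetical names, it only sees literal indexes in `PG_GETARG_<TYPE>(<n>)` fetches, and it has no answer for functions that fetch arguments conditionally or serve several signatures.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Upper bound on plausible indexes; assumed to mirror the default FUNC_MAX_ARGS. */
#define SKETCH_MAX_ARGS 100

/*
 * Scans C source text for PG_GETARG_<TYPE>(<n>) fetches and returns the
 * argument count implied by the highest literal index seen (0 if none).
 * Fetches with a non-literal index are skipped.
 */
int
infer_arg_count(const char *src)
{
    static const char prefix[] = "PG_GETARG_";
    const char *p = src;
    int         count = 0;

    while ((p = strstr(p, prefix)) != NULL)
    {
        const char *q = p + strlen(prefix);

        /* Skip the type suffix, e.g. INT32 or TEXT_PP. */
        while (isalnum((unsigned char) *q) || *q == '_')
            q++;

        if (*q == '(' && isdigit((unsigned char) q[1]))
        {
            char *end;
            long  idx = strtol(q + 1, &end, 10);

            if (*end == ')' && idx < SKETCH_MAX_ARGS && idx >= count)
                count = (int) idx + 1;
        }

        /* Continue searching after the text just examined. */
        p = q;
    }
    return count;
}
```

Fed the two fetch lines from the `int42pl` snippet above, this reports 2 arguments. Fed the `PG_GETARG_TEXT_PP(2)` line from `array_to_text_null`, it reports 3, which a cross-check could compare with the number of arguments in the SQL declaration.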

            regards, tom lane

Re: Potential security risk associated with function call

From

Pavel Stehule

Date:

10 March, 22:21:04

On Tue, 10 Mar 2026 at 19:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> > Tangent: I think it could be possible to make extensions (and PG
> > itself) generate more extensive pg_finfo records that contain
> > sufficient information to describe the functions' expected SQL calling
> > signature(s), which PG could then check and verify when the function
> > is catalogued (e.g. through lanvalidator).
>
> I think that'd be a lot of work with little result other than to
> change what sort of manual validation you have to do. Today, you
> have to check "does the function's actual C code match the SQL
> definition?". But with this, you'd have to check "does the function's
> actual C code match the pg_finfo record?". I'm not seeing a huge win
> there.
>
> Many many years ago when we first designed V1 function call protocol,
> I had the idea that we could write a tool that inspects C code like
>
> Datum
> int42pl(PG_FUNCTION_ARGS)
> {
>     int32        arg1 = PG_GETARG_INT32(0);
>     int16        arg2 = PG_GETARG_INT16(1);
>     int32        result;
>
> and automatically derives (or at least cross-checks against) the SQL
> definition. And we probably still could write such a tool. But
> there's a large fraction of the code base where no attention was paid
> to following that layout, and/or one C function was made to handle
> several signatures by writing conditional logic to fetch the
> arguments. Maybe you could get an AI tool to disentangle such logic,
> but how much you wanna trust the results?

FmgrInfo holds fn_oid, so maybe it could hold proallargtypes too, and then some assertions could be injected into the PG_GETARG_X macros without massive slowdown.
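A very rough sketch of what such an assertion could look like, using a mock frame rather than the real fmgr structures. `TypedCallInfo`, `arg_matches`, `SKETCH_GETARG_INT32` and `sketch_add` are hypothetical names. The two OID constants use the values PostgreSQL assigns to `int2` and `int4`, and the check compiles away when `NDEBUG` is defined, which is the assumption behind the "no massive slowdown" part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Type OIDs as assigned in PostgreSQL's catalogs (int2 = 21, int4 = 23). */
#define SKETCH_INT2OID 21
#define SKETCH_INT4OID 23

/* Hypothetical frame that also carries the declared SQL argument types. */
typedef struct TypedCallInfo
{
    short               nargs;      /* number of arguments passed */
    const unsigned int *argtypes;   /* declared type OID per argument */
    const intptr_t     *values;     /* argument payloads */
} TypedCallInfo;

/* True if argument n exists and was declared with the expected type. */
bool
arg_matches(const TypedCallInfo *fc, int n, unsigned int expected_oid)
{
    return n >= 0 && n < fc->nargs && fc->argtypes[n] == expected_oid;
}

/* Checked fetch: the assert disappears when NDEBUG is defined. */
#define SKETCH_GETARG_INT32(fc, n) \
    (assert(arg_matches((fc), (n), SKETCH_INT4OID)), \
     (int32_t) (fc)->values[(n)])

/* Example user of the checked macro: adds two int4 arguments. */
int32_t
sketch_add(const TypedCallInfo *fc)
{
    return SKETCH_GETARG_INT32(fc, 0) + SKETCH_GETARG_INT32(fc, 1);
}
```

In an assert-enabled build, calling `sketch_add` on a frame declared with only one argument, or with an `int2` in the second position, would trip the assertion instead of reading a slot that was never filled.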

Regards

Pavel

> regards, tom lane

Re: Potential security risk associated with function call

From

Andres Freund

Date:

10 March, 22:56:14

Hi,

On 2026-03-10 14:08:45 -0400, Tom Lane wrote:
> Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> > Tangent: I think it could be possible to make extensions (and PG
> > itself) generate more extensive pg_finfo records that contain
> > sufficient information to describe the functions' expected SQL calling
> > signature(s), which PG could then check and verify when the function
> > is catalogued (e.g. through lanvalidator).
> 
> I think that'd be a lot of work with little result other than to
> change what sort of manual validation you have to do.  Today, you
> have to check "does the function's actual C code match the SQL
> definition?".  But with this, you'd have to check "does the function's
> actual C code match the pg_finfo record?".  I'm not seeing a huge win
> there.

If we were to do this, I'd assume it'd be something vaguely like

PG_DEFINE_C_FUNCTION(funcname, {argtype1, argtype2}, returntype)
{
    ...
}

Where PG_DEFINE_C_FUNCTION() would evaluate to an extended version of
PG_FUNCTION_INFO_V1() that also declared argument types and also emitted the
function definition.  So there hopefully would be less of a chance of a
mismatch...  Then the CREATE FUNCTION could verify that, if present, the
additional information present in the finfo matches the SQL signature.
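A toy version of that shape, to show the mechanics only. This is not a proposed PostgreSQL macro: `SKETCH_DEFINE_FUNCTION`, `SketchFuncInfo`, `SketchCallInfo` and `sketch_signature_ok` are hypothetical, the return type is hard-coded to `long` for brevity, and the argument types are passed variadically because a brace-enclosed list containing commas would be split into separate macro arguments by the preprocessor. The value 23 is the OID PostgreSQL assigns to `int4`.

```c
#include <stddef.h>

/* Hypothetical info record emitted next to each function definition. */
typedef struct SketchFuncInfo
{
    const char         *name;
    unsigned int        rettype;    /* declared return type OID */
    const unsigned int *argtypes;   /* declared argument type OIDs */
    size_t              nargs;
} SketchFuncInfo;

/* Hypothetical, heavily simplified call frame. */
typedef struct SketchCallInfo
{
    short       nargs;
    const long *values;
} SketchCallInfo;

/*
 * Emits the type metadata and the function header in one place, so the
 * two cannot drift apart. Needs at least one argument type.
 */
#define SKETCH_DEFINE_FUNCTION(fname, ret, ...) \
    static const unsigned int fname##_argtypes[] = { __VA_ARGS__ }; \
    const SketchFuncInfo fname##_info = { \
        #fname, (ret), fname##_argtypes, \
        sizeof(fname##_argtypes) / sizeof(fname##_argtypes[0]) \
    }; \
    long fname(const SketchCallInfo *fcinfo)

/* int4 + int4 -> int4, with the signature stated exactly once. */
SKETCH_DEFINE_FUNCTION(sketch_int4pl, 23, 23, 23)
{
    return fcinfo->values[0] + fcinfo->values[1];
}

/* What a CREATE FUNCTION-time check might compare with the SQL declaration. */
int
sketch_signature_ok(const SketchFuncInfo *info, size_t sql_nargs,
                    const unsigned int *sql_argtypes)
{
    if (info->nargs != sql_nargs)
        return 0;
    for (size_t i = 0; i < sql_nargs; i++)
    {
        if (info->argtypes[i] != sql_argtypes[i])
            return 0;
    }
    return 1;
}
```

Here `sketch_signature_ok(&sketch_int4pl_info, ...)` accepts a declaration with two `int4` arguments and rejects one with a single argument, which is the kind of mismatch behind the crash that started the thread.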

FWIW, I think we're going to eventually need a more optimized function call
protocol for the most common cases (small number of arguments, no SRF, perhaps
requiring them to be strict, ...). If you look at profiles of queries that do
stuff like aggregate transition invocations or WHERE clause evaluation as part
of a large seqscan, moving things into and out of FunctionCallInfo really adds
up. We spend way more on that than e.g. evaluating an int4lt or int8inc.

Greetings,

Andres Freund

Re: Potential security risk associated with function call

From

Pavel Stehule

Date:

10 March, 23:18:02

On Tue, 10 Mar 2026 at 20:56, Andres Freund <andres@anarazel.de> wrote:

> Hi,
>
> On 2026-03-10 14:08:45 -0400, Tom Lane wrote:
> > Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> > > Tangent: I think it could be possible to make extensions (and PG
> > > itself) generate more extensive pg_finfo records that contain
> > > sufficient information to describe the functions' expected SQL calling
> > > signature(s), which PG could then check and verify when the function
> > > is catalogued (e.g. through lanvalidator).
> >
> > I think that'd be a lot of work with little result other than to
> > change what sort of manual validation you have to do. Today, you
> > have to check "does the function's actual C code match the SQL
> > definition?". But with this, you'd have to check "does the function's
> > actual C code match the pg_finfo record?". I'm not seeing a huge win
> > there.
>
> If we were to do this, I'd assume it'd be something vaguely like
>
> PG_DEFINE_C_FUNCTION(funcname, {argtype1, argtype2}, returntype)
> {
>     ...
> }
>
> Where PG_DEFINE_C_FUNCTION() would evaluate to an extended version of
> PG_FUNCTION_INFO_V1() that also declared argument types and also emitted the
> function definition. So there hopefully would be less of a chance of a
> mismatch... Then the CREATE FUNCTION could verify that, if present, the
> additional information present in the finfo matches the SQL signature.
>
> FWIW, I think we're going to eventually need a more optimized function call
> protocol for the most common cases (small number of arguments, no SRF, perhaps
> requiring them to be strict, ...). If you look at profiles of queries that do
> stuff like aggregate transition invocations or WHERE clause evaluation as part
> of a large seqscan, moving things into and out of FunctionCallInfo really adds
> up. We spend way more on that than e.g. evaluating an int4lt or int8inc.

Maybe a vector executor and vector instructions could be a solution, as an alternative (not a substitute).

The fmgr overhead per row is high, but for a call with a batch of 1000 rows it can be minimal.

Regards

Pavel

> Greetings,
>
> Andres Freund
