Discussion: Cleaning up PREPARE query strings?
Hi,
Currently, prepared statements store the whole query string that was submitted
by the client at the time of the PREPARE, as-is. This is usually fine, but if
that query was a multi-statement query string it can lead to a waste of memory.
There are some patterns that are more likely to incur such overhead; mine is
an application with a fixed set of prepared statements that are sent at
connection start using a single query to avoid extra round trips.
One naive example of the outcome is as follows:
#= PREPARE s1 AS SELECT 1\; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE
relname = $1\; PREPARE s3(int, int) AS SELECT $1 + $2;
PREPARE
PREPARE
PREPARE
=# SELECT name, statement FROM pg_prepared_statements ;
name | statement
------+----------------------------------------------------------------------------
s1 | PREPARE s1 AS SELECT 1; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1; PREPARE s3(int, int) AS SELECT $1 + $2;
s2 | PREPARE s1 AS SELECT 1; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1; PREPARE s3(int, int) AS SELECT $1 + $2;
s3 | PREPARE s1 AS SELECT 1; PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1; PREPARE s3(int, int) AS SELECT $1 + $2;
(3 rows)
The more prepared statements you have, the bigger the waste. This is also not
particularly readable for people who want to rely on the pg_prepared_statements
view, as you need to parse the query again yourself to figure out which part
is the associated query.
I assume that some other patterns could lead to other kinds of problems. For
instance, if the query string includes a prepared statement and some DML, it
could lead an automated program to replay both the PREPARE and the DML when
only the PREPARE was intended.
I'm attaching a POC patch to fix that behavior by teaching PREPARE to clean the
passed query text the same way as pg_stat_statements does. Since it relies on
the location saved during parsing, the overhead should be minimal, and only
present when some space can actually be saved. Note that I first tried to have
the cleanup done in CreateCachedPlan so that it's done everywhere, including
things like the extended protocol, but this led to too many issues so I ended
up doing it for an explicit PREPARE statement only.
With this patch applied, the above scenario gives this new output:
=# SELECT name, statement FROM pg_prepared_statements ;
name | statement
------+----------------------------------------------------
s1 | PREPARE s1 AS SELECT 1
s2 | PREPARE s2(text) AS SELECT oid FROM pg_class WHERE+
| relname = $1
s3 | PREPARE s3(int, int) AS SELECT $1 + $2
(3 rows)
One possible issue is that any comment present at the beginning of the query
text would be discarded. I'm not sure if that's something used by e.g.
pg_hint_plan, but if it is, it's always possible to put the comment in front of
the SELECT (or other actual first keyword) rather than the PREPARE itself to
preserve it.
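For readers unfamiliar with the mechanism referenced here: pg_stat_statements isolates one statement from a multi-statement string using the location and length the parser recorded, then trims surrounding whitespace and the trailing semicolon. Below is a stand-alone approximation of that behavior; the function name and exact trimming rules are illustrative, not the backend's actual CleanQuerytext():

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical stand-in for the backend's CleanQuerytext(): given the full
 * source string and a raw statement's location/length, return a pointer to
 * the statement with leading whitespace skipped, and shrink *len so that
 * trailing whitespace and the terminating semicolon are excluded.
 */
static const char *
clean_querytext(const char *query, int *location, int *len)
{
    if (*location >= 0)
    {
        /* confine ourselves to the indicated statement */
        query += *location;
        if (*len <= 0)
            *len = (int) strlen(query);
    }
    else
        *len = (int) strlen(query);

    /* skip leading whitespace */
    while (*len > 0 && isspace((unsigned char) *query))
    {
        query++;
        (*len)--;
    }

    /* trim trailing whitespace and the statement-terminating semicolon */
    while (*len > 0 && (isspace((unsigned char) query[*len - 1]) ||
                        query[*len - 1] == ';'))
        (*len)--;

    return query;
}
```

Called with the location/length the parser recorded for the second statement of a multi-statement string, this yields a pointer/length covering only that statement, which is the behavior the patch reuses for PREPARE.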
Attachments
Julien Rouhaud <rjuju123@gmail.com> writes:
> I'm attaching a POC patch to fix that behavior by teaching PREPARE to clean the
> passed query text the same way as pg_stat_statements.
This patch invalidates all the location fields in the parsed query:
they could not be used to generate sane error cursors referencing the
truncated string. I'm not sure how many places try to generate such
errors post-parsing, but it's more than zero, and I've long had
ambitions of trying to extend that substantially (e.g., allowing
execution-time errors from functions to point at the relevant function
call).
Certainly the patch could be extended to update all those fields,
but that increases its complexity very significantly. I doubt
that it's worth it. My reaction to your example is more like
"if that bothers you, don't do it that way".
regards, tom lane
Hi,

I think this is a good idea.

> Julien Rouhaud <rjuju123@gmail.com> writes:
> > I'm attaching a POC patch to fix that behavior by teaching PREPARE to clean the
> > passed query text the same way as pg_stat_statements.
>
> This patch invalidates all the location fields in the parsed query:
> they could not be used to generate sane error cursors referencing the
> truncated string.

I am not sure why we need to update the location of the rawstmt at all.
CreateCachedPlan does not seem to care about the location and length of query
strings, AFAICT.

```
rawstmt->stmt_location = 0;
```

This is somewhat tangential, but I am now also wondering if CleanQuerytext
itself should not mess with the supplied statement location and length, but
instead send back the location and length of the isolated query text within
the multi-statement query separately. From

```
extern const char *CleanQuerytext(const char *query, int *location, int *len);
```

to

```
extern const char *CleanQuerytext(const char *query, int location, int len,
                                  int *new_location, int *new_len);
```

It looks wrong that CleanQuerytext is modifying any of the original locations
and lengths.

--
Sami Imseih
Amazon Web Services (AWS)
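A minimal sketch of the non-mutating signature suggested above, as a stand-alone function. This is illustrative only: the trimming rules approximate rather than reproduce the backend's CleanQuerytext, and the function name is made up.

```c
#include <ctype.h>
#include <string.h>

/*
 * Sketch of the proposed API change: instead of mutating the caller's
 * location/len, report the isolated statement's position within the full
 * source string through separate out-parameters.
 */
static const char *
clean_querytext_const(const char *query, int location, int len,
                      int *new_location, int *new_len)
{
    const char *stmt = query;

    if (location >= 0)
    {
        stmt += location;
        if (len <= 0)
            len = (int) strlen(stmt);
    }
    else
        len = (int) strlen(query);

    /* skip leading whitespace */
    while (len > 0 && isspace((unsigned char) *stmt))
    {
        stmt++;
        len--;
    }

    /* trim trailing whitespace and the terminating semicolon */
    while (len > 0 && (isspace((unsigned char) stmt[len - 1]) ||
                       stmt[len - 1] == ';'))
        len--;

    /* tell the caller where the cleaned text sits in the original string */
    *new_location = (int) (stmt - query);
    *new_len = len;
    return stmt;
}
```

The caller's original location/len stay untouched, so anything that still needs the raw statement span (error cursors, pg_stat_statements) can keep using them.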
Hi,

On Wed, Dec 24, 2025 at 11:21:00AM -0500, Tom Lane wrote:
> Julien Rouhaud <rjuju123@gmail.com> writes:
> > I'm attaching a POC patch to fix that behavior by teaching PREPARE to clean the
> > passed query text the same way as pg_stat_statements.
>
> This patch invalidates all the location fields in the parsed query:
> they could not be used to generate sane error cursors referencing the
> truncated string. I'm not sure how many places try to generate such
> errors post-parsing, but it's more than zero, and I've long had
> ambitions of trying to extend that substantially (e.g., allowing
> execution-time errors from functions to point at the relevant function
> call).

Ah, I wasn't aware of that. After a quick look I'm assuming it's all the
callers of executor_errposition()? There aren't a lot of them, and I've
unfortunately been unable to hit any of them. Every scenario I tried was
always caught at planning time. Do you have any example of how to exercise
them?

> Certainly the patch could be extended to update all those fields,
> but that increases its complexity very significantly. I doubt
> that it's worth it. My reaction to your example is more like
> "if that bothers you, don't do it that way".

I agree that updating the location fields in the whole parse tree is a
non-starter, but I don't think that it's the only option. If it's indeed the
cases going through executor_errposition(), they rely on having the executor
state at hand to retrieve the query string. We could simply have an extra
"query string offset" field stored there that would always be 0 except when
PREPARE changes the input query string, and then apply that offset when doing
the location-to-character-number calculation. And that offset should already
be returned by CleanQuerytext(), so it wouldn't add extra complication.
I was looking a bit more here, and I found that this patch breaks if
pg_stat_statements is enabled.
```
postgres=# SELECT 'bingo'\; PREPARE q1 AS SELECT 1 AS a \; SELECT 42;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
!?>
```
This is because the raw statement location is set to 0 with the patch, which
breaks generate_normalized_query when trying to deal with a multi-command
statement. So, I don't think the way v1 is doing things will actually work.
I am thinking that storing the statement length and location in the entry
of prepared_queries may be the better option. Having that info, the
cleaned up query text can be generated on the fly whenever
pg_prepared_statement is called.
We may not need to call CleanQuerytext every time; however, it does have the
benefit of removing leading and trailing whitespace and the terminating
semicolon.
What do you think?
--
Sami Imseih
Amazon Web Services (AWS)
Hi,

On Tue, Jan 13, 2026 at 02:24:51PM -0600, Sami Imseih wrote:
> I was looking a bit more here, and I found that this patch breaks if
> pg_stat_statements is enabled.
>
> ```
> postgres=# SELECT 'bingo'\; PREPARE q1 AS SELECT 1 AS a \; SELECT 42;
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> The connection to the server was lost. Attempting reset: Failed.
> !?>
> ```
>
> This is because the raw statement location is set to 0 with the patch, which
> breaks generate_normalized_query when trying to deal with a multi-command
> statement. So, I don't think the way v1 is doing things will actually work.

That's very strange. I'm pretty sure that I tried that, since the earlier
approach I mentioned, doing it in CreateCachedPlan(), had that exact problem.

Also, src/test/recovery/t/027_stream_regress.pl does run the full regression
tests with pg_stat_statements enabled, and it doesn't fail.

I'm not sure what is different in your case.
> That's very strange. I'm pretty sure that I tried that, since the earlier
> approach I mentioned, doing it in CreateCachedPlan(), had that exact problem.
>
> Also, src/test/recovery/t/027_stream_regress.pl does run the full regression
> tests with pg_stat_statements enabled, and it doesn't fail.
>
> I'm not sure what is different in your case.

The regression tests succeed because by the time the multi-statement command
is executed, the queries involved have already been tracked by
pg_stat_statements, so they don't need to get stored again, and there is no
reason to go through generate_normalized_query. In this case, SELECT $1 is
already tracked. To repro the crash, just reset pg_stat_statements beforehand.

```
diff --git a/src/test/regress/sql/prepare.sql b/src/test/regress/sql/prepare.sql
index 0e7fe44725e..51421357f26 100644
--- a/src/test/regress/sql/prepare.sql
+++ b/src/test/regress/sql/prepare.sql
@@ -4,6 +4,9 @@
 SELECT name, statement, parameter_types, result_types FROM
 pg_prepared_statements;
 
+CREATE EXTENSION pg_stat_statements;
+SELECT pg_stat_statements_reset();
+
 SELECT 'bingo'\; PREPARE q1 AS SELECT 1 AS a \; SELECT 42;
 EXECUTE q1;
```

For this test, we should either reset statements before, or construct a
statement for the test that has not been tracked. We can maybe do that with a
query against a newly created table in prepare.sql. I like the latter more.

--
Sami Imseih
Amazon Web Services (AWS)
Hi,

On Wed, Jan 14, 2026 at 09:56:21AM -0600, Sami Imseih wrote:
> > That's very strange. I'm pretty sure that I tried that, since the earlier
> > approach I mentioned, doing it in CreateCachedPlan(), had that exact
> > problem.
> >
> > Also, src/test/recovery/t/027_stream_regress.pl does run the full
> > regression tests with pg_stat_statements enabled, and it doesn't fail.
> >
> > I'm not sure what is different in your case.
>
> The regression tests succeed because by the time the multi-statement command
> is executed, the queries involved have already been tracked by
> pg_stat_statements, so they don't need to get stored again, and there is no
> reason to go through generate_normalized_query. In this case, SELECT $1 is
> already tracked.

I see. I'm not sure why my first prototype used to crash with that exact same
problem in 027_stream_regress but not this one, but anyway.

I only had a quick look at the code, but unless I'm missing something it's
mostly an oversight, as I pass "new_query" to CreateCachedPlan() but still pass
the original query string to pg_analyze_and_rewrite_varparams(). Using
"new_query" there too should fix the problem?

In an earlier message you wrote:

> I am thinking that storing the statement length and location in the entry
> of prepared_queries may be the better option. Having that info, the
> cleaned up query text can be generated on the fly whenever
> pg_prepared_statement is called.

I don't like this approach, as it has more execution-time overhead, but also
doesn't fix at all the memory waste in the prepared statements cache.

> To repro the crash, just reset pg_stat_statements beforehand.
>
> ```
> diff --git a/src/test/regress/sql/prepare.sql b/src/test/regress/sql/prepare.sql
> index 0e7fe44725e..51421357f26 100644
> --- a/src/test/regress/sql/prepare.sql
> +++ b/src/test/regress/sql/prepare.sql
> @@ -4,6 +4,9 @@
>
>  SELECT name, statement, parameter_types, result_types FROM
>  pg_prepared_statements;
>
> +CREATE EXTENSION pg_stat_statements;
> +SELECT pg_stat_statements_reset();
> +
>  SELECT 'bingo'\; PREPARE q1 AS SELECT 1 AS a \; SELECT 42;
>  EXECUTE q1;
> ```
>
> For this test, we should either reset statements before, or construct a
> statement for the test that has not been tracked. We can maybe
> do that with a query against a newly created table in prepare.sql.
> I like the latter more.

FWIW I think that this should belong to pg_stat_statements testing, not the
main regression tests. This would also ensure that we see consistent results
in some other scenarios.
> I only had a quick look at the code, but unless I'm missing something it's
> mostly an oversight, as I pass "new_query" to CreateCachedPlan() but still
> pass the original query string to pg_analyze_and_rewrite_varparams(). Using
> "new_query" there too should fix the problem?

That would be a fix, but that seems way too invasive to me. Not sure if it may
break other things.

> > I am thinking that storing the statement length and location in the entry
> > of prepared_queries may be the better option. Having that info, the
> > cleaned up query text can be generated on the fly whenever
> > pg_prepared_statement is called.
>
> I don't like this approach, as it has more execution-time overhead, but also
> doesn't fix at all the memory waste in the prepared statements cache.

ISTM that your proposal will actually use more memory because
pstate->p_sourcetext does not get free'd, but we must now allocate space for a
new "cleaned" query.

As far as execution overhead, this will only be paid when you query
pg_prepared_statement, and we can even optimize this a bit by only cleaning
once and reusing the results.

> FWIW I think that this should belong to pg_stat_statements testing, not the
> main regression tests. This would also ensure that we see consistent results
> in some other scenarios.

I agree.

--
Sami Imseih
Amazon Web Services (AWS)
Hi,

On Thu, Jan 15, 2026 at 01:13:36PM -0600, Sami Imseih wrote:
> > I only had a quick look at the code, but unless I'm missing something it's
> > mostly an oversight, as I pass "new_query" to CreateCachedPlan() but still
> > pass the original query string to pg_analyze_and_rewrite_varparams().
> > Using "new_query" there too should fix the problem?
>
> That would be a fix, but that seems way too invasive to me. Not sure if it
> may break other things.
>
> > > I am thinking that storing the statement length and location in the
> > > entry of prepared_queries may be the better option. Having that info,
> > > the cleaned up query text can be generated on the fly whenever
> > > pg_prepared_statement is called.
> >
> > I don't like this approach, as it has more execution-time overhead, but
> > also doesn't fix at all the memory waste in the prepared statements cache.
>
> ISTM that your proposal will actually use more memory because
> pstate->p_sourcetext does not get free'd, but we must now allocate
> space for a new "cleaned" query.

I'm not sure that I understand. Sure, we allocate a new temporary buffer for
the cleaned-up query string, but this will be freed soon. The query string
stored in the prepared statement will stay in memory as long as the prepared
statement exists, so any cleanup can actually save a lot of memory.

I'm attaching a v2 that fixes the crash you described earlier, along with some
minimal regression tests in pg_stat_statements to cover it.

> As far as execution overhead, this will only be paid when you query
> pg_prepared_statement, and we can even optimize this a bit by only
> cleaning once and reusing the results.

Actually, I thought that there was a requirement to have a zero-terminated
string in the API to transform C strings to text datums, but there is not.
It's actually slightly faster if you know the size, as there is one less
strlen() call in that case. This could be done in both cases though.

> > FWIW I think that this should belong to pg_stat_statements testing, not
> > the main regression tests. This would also ensure that we see consistent
> > results in some other scenarios.
>
> I agree.

As mentioned, basic tests were added in v2.
Attachments
> > ISTM that your proposal will actually use more memory because
> > pstate->p_sourcetext does not get free'd, but we must now allocate
> > space for a new "cleaned" query.
>
> I'm not sure that I understand. Sure, we allocate a new temporary buffer for
> the cleaned-up query string, but this will be freed soon. The query string
> stored in the prepared statement will stay in memory as long as the prepared
> statement exists, so any cleanup can actually save a lot of memory.
My point is that pstate->p_sourcetext doesn't get freed just because we're not
referencing it from the CachedPlanSource in prepared_queries. Instead, with
multi-statement strings, prepared_queries now references a newly allocated
string, new_query, which will be around until DEALLOCATE.
```
+ tmp = palloc(rawstmt->stmt_len + 1);
+ strlcpy(tmp, cleaned, rawstmt->stmt_len + 1);
+
+ new_query = tmp;
```
So this patch always increases memory usage for multi-statement strings;
we have both the original multi-statement string and the cleaned up copy
around.
--
Sami Imseih
Amazon Web Services (AWS)
Hi,

On Sun, Jan 18, 2026 at 10:16:16AM -0600, Sami Imseih wrote:
> > > ISTM that your proposal will actually use more memory because
> > > pstate->p_sourcetext does not get free'd, but we must now allocate
> > > space for a new "cleaned" query.
> >
> > I'm not sure that I understand. Sure, we allocate a new temporary buffer
> > for the cleaned-up query string, but this will be freed soon. The query
> > string stored in the prepared statement will stay in memory as long as
> > the prepared statement exists, so any cleanup can actually save a lot of
> > memory.
>
> My point is that pstate->p_sourcetext doesn't get freed just because we're
> not referencing it from the CachedPlanSource in prepared_queries. Instead,
> with multi-statement strings, prepared_queries now references a newly
> allocated string, new_query, which will be around until DEALLOCATE.
>
> ```
> + tmp = palloc(rawstmt->stmt_len + 1);
> + strlcpy(tmp, cleaned, rawstmt->stmt_len + 1);
> +
> + new_query = tmp;
> ```
>
> So this patch always increases memory usage for multi-statement strings;
> we have both the original multi-statement string and the cleaned up copy
> around.

I think that you have a misunderstanding of how memory contexts work in
postgres. Both pstate->p_sourcetext and new_query will be freed when their
underlying memory context is reset, whether or not they're individually
freed, so they won't survive until the next EXECUTE or DEALLOCATE. It's
actually usually worse to issue individual pfree() calls, as it's just
unneeded overhead.

CreateCachedPlan() creates a dedicated memory context to save all its
information, including a new copy of the passed query string, and that context
will eventually be linked to a permanent one so that it can survive the
original query's execution lifetime. So with this patch we have an opportunity
to allocate less data in the permanent memory context, which as I said only
requires a transient extra allocation for the relevant portion of the current
query string.
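A toy region allocator can illustrate the memory-context behavior described above: allocations belong to a context, and resetting the context releases everything it owns at once, so individual frees are unnecessary. This is only a sketch of the idea; the names are invented and PostgreSQL's actual MemoryContext implementation is far more elaborate.

```c
#include <stdlib.h>

/* Header prepended to every allocation so the context can find it later. */
typedef struct Chunk
{
    struct Chunk *next;
} Chunk;

/* A toy "memory context": a linked list of allocations it owns. */
typedef struct ToyContext
{
    Chunk *chunks;
    int    nallocs;
} ToyContext;

/* Allocate memory owned by ctx; freed en masse by toy_reset(). */
static void *
toy_alloc(ToyContext *ctx, size_t size)
{
    Chunk *c = malloc(sizeof(Chunk) + size);

    c->next = ctx->chunks;
    ctx->chunks = c;
    ctx->nallocs++;
    return c + 1;               /* usable memory starts after the header */
}

/* Release every allocation owned by ctx in one pass. */
static void
toy_reset(ToyContext *ctx)
{
    while (ctx->chunks)
    {
        Chunk *next = ctx->chunks->next;

        free(ctx->chunks);
        ctx->chunks = next;
    }
    ctx->nallocs = 0;
}
```

In this analogy, strings like p_sourcetext and new_query live in a short-lived context that gets reset when the query finishes, while the CachedPlanSource's copy lives in a long-lived context that survives until DEALLOCATE.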
> I think that you have a misunderstanding of how memory context works in
> postgres.
No, I understand how memory contexts work, but I skipped over the fact that
the cleaned-up query text is copied into the CachedPlanSource's memory context
during CreateCachedPlan:
```
plansource->query_string = pstrdup(query_string);
```
which is long-lived and will remain until DEALLOCATE, while
pstate->p_sourcetext and new_query are in the current execution's memory
context.
The patch is indeed better from a memory perspective. My apologies.
However, the error reporting does break with the patch. Notice how with the
patch the cursor for the error reporting shifts incorrectly. This is due to
the fact that rawstmt->stmt_location/length are no longer representative of
the original query text.
## unpatched
```
postgres=# SELECT 1 \; PREPARE stmt AS SELECT nonexistent_column
FROM users\; SELECT 2;
?column?
----------
1
(1 row)
ERROR: relation "users" does not exist
LINE 1: ... ; PREPARE stmt AS SELECT nonexistent_column FROM users; SEL...
^
```
## patched
```
postgres=# SELECT 1 \; PREPARE stmt AS SELECT nonexistent_column
FROM users\; SELECT 2;
?column?
----------
1
(1 row)
ERROR: relation "users" does not exist
LINE 1: ...LECT 1 ; PREPARE stmt AS SELECT nonexistent_column FROM u...
^
```
--
Sami Imseih
Amazon Web Services (AWS)
Hi,

On Mon, Jan 19, 2026 at 12:43:53AM -0600, Sami Imseih wrote:
> However, the error reporting does break with the patch. Notice how with the
> patch the cursor for the error reporting shifts incorrectly. This is due to
> the fact that rawstmt->stmt_location/length are no longer representative of
> the original query text.
>
> ## unpatched
> ```
> ERROR:  relation "users" does not exist
> LINE 1: ... ; PREPARE stmt AS SELECT nonexistent_column FROM users; SEL...
>                                                              ^
> ```
>
> ## patched
> ```
> ERROR:  relation "users" does not exist
> LINE 1: ...LECT 1 ; PREPARE stmt AS SELECT nonexistent_column FROM u...
>                                                              ^
> ```

This was already reported by Tom Lane in his first message, although his
complaint was about execution-time error reporting while this happens during
parse analysis. However, I think that the exact same approach can be used to
fix both: either updating the position of every single element (which no one
wants) or teaching the executor (and evidently the planstate) about a new
"query offset" so that parser_errposition and executor_errposition report the
correct location. I'm still waiting to hear whether the latter would be
acceptable or not before implementing it.

Note that I wasn't able to hit the execution-time error, but with the
parse-analysis-time error I can at least have some regression tests, so thanks
a lot!

In the meantime, the cfbot shows that a rebase is needed, so v3 attached.
Attachments
> This was already reported by Tom Lane in his first message, although his
> complaint was about execution-time error reporting while this happens during
> parse analysis. However, I think that the exact same approach can be used to
> fix both: either updating the position of every single element (which no one
> wants) or teaching the executor (and evidently the planstate) about a new
> "query offset" so that parser_errposition and executor_errposition report
> the correct location.

Maybe this is what you also mean, but I think it should be enough to just tell
Query about the updated location = 0, and we don't need to touch anything else
(the raw parsetree or the copy of it in plansource). This will ensure that the
error tracking relies on the original raw parsetree, while consumers of the
updated statement query location and lengths get the updated values based on
the cleaned-up query text that is stored in prepared statements.

I tested the below with the tests you have in v3 so far and it works. I also
tested the error position and it looks correct.

Not sure if I am missing something.
```
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -27,6 +27,7 @@
 #include "commands/prepare.h"
 #include "funcapi.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/queryjumble.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_collate.h"
 #include "parser/parse_expr.h"
@@ -64,6 +65,8 @@ PrepareQuery(ParseState *pstate, PrepareStmt *stmt,
     Oid        *argtypes = NULL;
     int         nargs;
     List       *query_list;
+    const char *new_query;
+    ListCell   *lc;
 
     /*
      * Disallow empty-string statement name (conflicts with protocol-level
@@ -82,12 +85,31 @@ PrepareQuery(ParseState *pstate, PrepareStmt *stmt,
     rawstmt->stmt = stmt->query;
     rawstmt->stmt_location = stmt_location;
     rawstmt->stmt_len = stmt_len;
+    new_query = pstate->p_sourcetext;
+
+    /*
+     * If we have a multi-statement string, confine the query to the query we
+     * care about.
+     */
+    if (stmt_location >= 0)
+    {
+        const char *cleaned;
+        char       *tmp;
+
+        cleaned = CleanQuerytext(pstate->p_sourcetext,
+                                 &rawstmt->stmt_location,
+                                 &rawstmt->stmt_len);
+
+        tmp = palloc(rawstmt->stmt_len + 1);
+        strlcpy(tmp, cleaned, rawstmt->stmt_len + 1);
+
+        new_query = tmp;
+    }
 
     /*
      * Create the CachedPlanSource before we do parse analysis, since it needs
      * to see the unmodified raw parse tree.
      */
-    plansource = CreateCachedPlan(rawstmt, pstate->p_sourcetext,
+    plansource = CreateCachedPlan(rawstmt, new_query,
                                   CreateCommandTag(stmt->query));
 
     /* Transform list of TypeNames to array of type OIDs */
@@ -119,6 +141,17 @@ PrepareQuery(ParseState *pstate, PrepareStmt *stmt,
     query_list = pg_analyze_and_rewrite_varparams(rawstmt,
                                                   pstate->p_sourcetext,
                                                   &argtypes, &nargs, NULL);
+
+    /*
+     * Set the stmt location to 0 and update stmt_len, since at this point we
+     * only have the query we care about in case of a multi-statement string.
+     */
+    foreach(lc, query_list)
+    {
+        Query      *query = lfirst_node(Query, lc);
+
+        if (query->stmt_location > 0)
+            query->stmt_location = 0;
+    }
```
--
Sami Imseih
Amazon Web Services (AWS)
Hi,

On Mon, Jan 26, 2026 at 04:48:59PM -0600, Sami Imseih wrote:
> > This was already reported by Tom Lane in his first message, although his
> > complaint was about execution-time error reporting while this happens
> > during parse analysis. However, I think that the exact same approach can
> > be used to fix both: either updating the position of every single element
> > (which no one wants) or teaching the executor (and evidently the
> > planstate) about a new "query offset" so that parser_errposition and
> > executor_errposition report the correct location.
>
> Maybe this is what you also mean, but I think it should be enough to just
> tell Query about the updated location = 0, and we don't need to touch
> anything else (the raw parsetree or the copy of it in plansource). This will
> ensure that the error tracking relies on the original raw parsetree, while
> consumers of the updated statement query location and lengths get the
> updated values based on the cleaned-up query text that is stored in prepared
> statements.
>
> I tested the below with the tests you have in v3 so far and it works. I also
> tested the error position and it looks correct.
>
> Not sure if I am missing something.

It's hard to know what changed exactly, as you show the diff against master
rather than against the patch, but IIUC the problem will remain for
execution-time error location reporting, as reported before. I think that it
will still be wrong, and may now also point to an offset that isn't in the
query text. I still haven't been able to hit that code path, so I can't really
investigate more.

In general I feel like it would be cleaner to have the same "location vs query
offset" handling for both parse analysis and execution, but I'm not opposed to
this approach if it gets the feature accepted.