Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()

Поиск
Список
Период
Сортировка
От Chao Li
Тема Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Дата
Msg-id BFBCB564-1AA4-4686-B539-92B63B807E19@gmail.com
обсуждение исходный текст
Ответ на Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Ответы Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Re: lsyscache: free IndexAmRoutine objects returned by GetIndexAmRoutineByAmId()
Список pgsql-hackers

> On Dec 30, 2025, at 04:04, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
>
> On Mon, 29 Dec 2025 at 20:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Matthias van de Meent <boekewurm+postgres@gmail.com> writes
>>> Here's my patch that does this, pulled from [0]. It was originally
>>> built to reduce the per-index catcache overhead; as using static const
>>> pointers reduces the data allocated into the "index info" context by
>>> 264 bytes.
>>
>> Cool, I forgot you'd already been poking at this.  The patch
>> is kinda long, but not as bad as I feared, and it all looks
>> quite mechanical.  It lacks documentation updates though.
>
> Attached v2, in which I've updated the one place where I could find
> IndexAmRoutine's allocation policy being described, and in which I've
> also updated InitIndexAmRoutine with the suggested changes of your
> other mail. Thanks!
>
> Kind regards,
>
> Matthias van de Meent
> Databricks (https://www.databricks.com)
> <v2-0001-Stop-allocating-one-IndexAmRoutine-for-every-inde.patch>


I’m glad that my finding helped move this patch forward. After reviewing v2, I think my patch can be completely
supersededby yours, which is fine with me. I have a couple of comments on v2. 

1 - amapi.c
```
/*
* GetIndexAmRoutine - call the specified access method handler routine to get
* its IndexAmRoutine struct, which will be palloc'd in the caller's context.
*
* Note that if the amhandler function is built-in, this will not involve
* any catalog access. It's therefore safe to use this while bootstrapping
* indexes for the system catalogs. relcache.c relies on that.
*/
const IndexAmRoutine *
GetIndexAmRoutine(Oid amhandler)
{
Datum datum;
IndexAmRoutine *routine;

```

* The function header comment needs an update, it still talks about “palloc”, now it should say something like
“returnedstructure should not be free-ed”. 
* The local variable “routine” now can be “const” as well.

2 - relcache.c
```
 InitIndexAmRoutine(Relation relation)
 {
-    IndexAmRoutine *cached,
-               *tmp;
+    MemoryContext    oldctx;

     /*
-     * Call the amhandler in current, short-lived memory context, just in case
-     * it leaks anything (it probably won't, but let's be paranoid).
+     * We formerly specified that the amhandler should return a
+     * palloc'd struct.  That's now deprecated in favor of returning
+     * a pointer to a static struct, but to avoid completely breaking
+     * old external AMs, run the amhandler in the relation's rd_indexcxt.
      */
-    tmp = GetIndexAmRoutine(relation->rd_amhandler);
-
-    /* OK, now transfer the data into relation's rd_indexcxt. */
-    cached = (IndexAmRoutine *) MemoryContextAlloc(relation->rd_indexcxt,
-                                                   sizeof(IndexAmRoutine));
-    memcpy(cached, tmp, sizeof(IndexAmRoutine));
-    relation->rd_indam = cached;
-
-    pfree(tmp);
+    oldctx = MemoryContextSwitchTo(relation->rd_indexcxt);
+    relation->rd_indam = GetIndexAmRoutine(relation->rd_amhandler);
+    MemoryContextSwitchTo(oldctx);
 }
 ```

I understand that switching to rd_indexcxt is intended to mitigate the risk that external index AMs might still palloc
thereturned IndexAmRoutine. 

However, I don’t see a way to actually enforce external AMs to always return a static pointer. In particular, if an AM
switchesto a different memory context internally, the MemoryContextSwitchTo() here would not help. 

I’m not sure whether we want to go further than the current approach, but it seems that fully eliminating the risk
wouldrequire detecting dynamically allocated results and copying them into rd_indexcxt, which doesn’t appear easy or
robustto implement in practice. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







В списке pgsql-hackers по дате отправления: