Обсуждение: VACUUM FULL, CLUSTER, and REPACK block on other sessions' temp tables

Поиск
Список
Период
Сортировка

VACUUM FULL, CLUSTER, and REPACK block on other sessions' temp tables

От
Jim Jones
Дата:
Hi,

While testing another patch [1], I noticed that REPACK is blocked when a
temporary table is locked in another session. It also turns out that the
same behaviour occurs with VACUUM FULL and CLUSTER:

== session 1 ==

$ psql postgres
psql (19devel)
Type "help" for help.

postgres=# CREATE TEMPORARY TABLE tmp (id int);
CREATE TABLE
postgres=# BEGIN;
LOCK TABLE tmp IN SHARE MODE;
BEGIN
LOCK TABLE
postgres=*#

== session 2 ==

$ psql postgres
psql (19devel)
Type "help" for help.

postgres=# REPACK;
^CCancel request sent
ERROR:  canceling statement due to user request
CONTEXT:  waiting for AccessExclusiveLock on relation 38458 of database 5
postgres=# VACUUM FULL;
^CCancel request sent
ERROR:  canceling statement due to user request
CONTEXT:  waiting for AccessExclusiveLock on relation 38458 of database 5

Skipping temporary relations in get_tables_to_repack() and
get_all_vacuum_rels() before they're appended to the list seems to do
the trick -- see attached draft.

I can reproduce the same behaviour with CLUSTER and VACUUM FULL in
PG14-PG18. I took a quick look at the code in PG17 and PG18 and the fix
appears to be straightforward, but before I start working on it, I'd
like to hear your thoughts. Is it worth the effort?

Best, Jim

1 - https://www.postgresql.org/message-id/13637.1774342137%40localhost
Вложения

Re: VACUUM FULL, CLUSTER, and REPACK block on other sessions' temp tables

От
Chao Li
Дата:

> On Mar 24, 2026, at 23:35, Jim Jones <jim.jones@uni-muenster.de> wrote:
>
> Hi,
>
> While testing another patch [1], I noticed that REPACK is blocked when a
> temporary table is locked in another session. It also turns out that the
> same behaviour occurs with VACUUM FULL and CLUSTER:
>
> == session 1 ==
>
> $ psql postgres
> psql (19devel)
> Type "help" for help.
>
> postgres=# CREATE TEMPORARY TABLE tmp (id int);
> CREATE TABLE
> postgres=# BEGIN;
> LOCK TABLE tmp IN SHARE MODE;
> BEGIN
> LOCK TABLE
> postgres=*#
>
> == session 2 ==
>
> $ psql postgres
> psql (19devel)
> Type "help" for help.
>
> postgres=# REPACK;
> ^CCancel request sent
> ERROR:  canceling statement due to user request
> CONTEXT:  waiting for AccessExclusiveLock on relation 38458 of database 5
> postgres=# VACUUM FULL;
> ^CCancel request sent
> ERROR:  canceling statement due to user request
> CONTEXT:  waiting for AccessExclusiveLock on relation 38458 of database 5
>
> Skipping temporary relations in get_tables_to_repack() and
> get_all_vacuum_rels() before they're appended to the list seems to do
> the trick -- see attached draft.
>
> I can reproduce the same behaviour with CLUSTER and VACUUM FULL in
> PG14-PG18. I took a quick look at the code in PG17 and PG18 and the fix
> appears to be straightforward, but before I start working on it, I'd
> like to hear your thoughts. Is it worth the effort?
>
> Best, Jim
>
> 1 -
https://www.postgresql.org/message-id/13637.1774342137%40localhost<v1-0001-Skip-other-sessions-temp-tables-in-REPACK-CLUSTER.patch>

I think skipping temp tables of another session is reasonable, because anyway they are not accessible from the current
session,though visible via pg_class. 

Looking at the patch:
```
+            /* Skip temp relations belonging to other sessions */
+            if (class->relpersistence == RELPERSISTENCE_TEMP &&
+                isOtherTempNamespace(class->relnamespace))
```

It uses isOtherTempNamespace(), but I noticed that the header comment of the function says:
```
 * isOtherTempNamespace - is the given namespace some other backend's
* temporary-table namespace (including temporary-toast-table namespaces)?
*
* Note: for most purposes in the C code, this function is obsolete. Use
* RELATION_IS_OTHER_TEMP() instead to detect non-local temp relations.
```

Then looking at RELATION_IS_OTHER_TEMP():
```
#define RELATION_IS_OTHER_TEMP(relation) \
    ((relation)->rd_rel->relpersistence == RELPERSISTENCE_TEMP && \
     !(relation)->rd_islocaltemp)
```

It takes a relation as parameter and check relation->rd_islocaltemp, however in the context of this patch, we have only
Form_pg_class.

Then checking how rd_islocaltemp is set:
```
        case RELPERSISTENCE_TEMP:
            if (isTempOrTempToastNamespace(relation->rd_rel->relnamespace))
            {
                relation->rd_backend = ProcNumberForTempRelations();
                relation->rd_islocaltemp = true;
            }
            else
            {
                /*
                 * If it's a temp table, but not one of ours, we have to use
                 * the slow, grotty method to figure out the owning backend.
                 *
                 * Note: it's possible that rd_backend gets set to
                 * MyProcNumber here, in case we are looking at a pg_class
                 * entry left over from a crashed backend that coincidentally
                 * had the same ProcNumber we're using.  We should *not*
                 * consider such a table to be "ours"; this is why we need the
                 * separate rd_islocaltemp flag.  The pg_class entry will get
                 * flushed if/when we clean out the corresponding temp table
                 * namespace in preparation for using it.
                 */
                relation->rd_backend =
                    GetTempNamespaceProcNumber(relation->rd_rel->relnamespace);
                Assert(relation->rd_backend != INVALID_PROC_NUMBER);
                relation->rd_islocaltemp = false;
            }
            break;
```

It uses isTempOrTempToastNamespace(relation->rd_rel->relnamespace) to decide relation->rd_islocaltemp.

So, I think this patch should also use "!isTempOrTempToastNamespace(classForm->relnamespace)" instead of
isOtherTempNamespace(class->relnamespace).I tried that locally, and it works for me. 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: VACUUM FULL, CLUSTER, and REPACK block on other sessions' temp tables

От
Jim Jones
Дата:
Hi Chao,

Thanks for the thorough review.

On 25/03/2026 02:55, Chao Li wrote:
> It uses isTempOrTempToastNamespace(relation->rd_rel->relnamespace) to decide relation->rd_islocaltemp.
> 
> So, I think this patch should also use "!isTempOrTempToastNamespace(classForm->relnamespace)" instead of
isOtherTempNamespace(class->relnamespace).I tried that locally, and it works for me.
 

I agree. In get_all_vacuum_rels() and the pg_class scan branch of
get_tables_to_repack(), relpersistence == RELPERSISTENCE_TEMP already
establishes the relation is a temp table, so
!isTempOrTempToastNamespace() alone is sufficient to identify
other-session temp tables.

In the usingindex path of get_tables_to_repack(), Form_pg_class is not
available, so there is no relpersistence to use as a pre-filter. The
explicit isAnyTempNamespace() check is required to avoid incorrectly
skipping permanent tables. This is pretty much what
isOtherTempNamespace() does internally -- the only change is inlining it
to avoid the obsolete wrapper.

* Skip temp relations belonging to other sessions */
{
  Oid nsp = get_rel_namespace(index->indrelid);

  if (!isTempOrTempToastNamespace(nsp) && isAnyTempNamespace(nsp))
  {
    UnlockRelationOid(index->indrelid, AccessShareLock);
    continue;
  }
}


v2 attached.

Thanks!

Best, Jim
Вложения

Re: VACUUM FULL, CLUSTER, and REPACK block on other sessions' temp tables

От
Jim Jones
Дата:
I forgot to create a CF entry. Here it is:
https://commitfest.postgresql.org/patch/6616/

Best, Jim



Re: VACUUM FULL, CLUSTER, and REPACK block on other sessions' temp tables

От
Zsolt Parragi
Дата:
Hello!

Shouldn't the patch also include a tap test to verify that the change
works / fails without it?

+ /* Skip temp relations belonging to other sessions */
+ {
+ Oid nsp = get_rel_namespace(index->indrelid);
+
+ if (!isTempOrTempToastNamespace(nsp) && isAnyTempNamespace(nsp))
+ {

Doesn't this result in several repeated syscache lookups?

There's already a SearchSysCacheExsists1 directly above this, then a
get_rel_namespace, then an isAnyTempNamespace. While this probably
isn't performance critical, this should be doable with a single
SearchSysCache1(RELOID...) and then a few conditions, similarly to the
else branch below this?



Re: VACUUM FULL, CLUSTER, and REPACK block on other sessions' temp tables

От
Jim Jones
Дата:
Hi

On 25/03/2026 21:38, Zsolt Parragi wrote:
> Shouldn't the patch also include a tap test to verify that the change
> works / fails without it?

Definitely. I just didn't want to invest much time on tests before
getting feedback on the issue itself.

> + /* Skip temp relations belonging to other sessions */
> + {
> + Oid nsp = get_rel_namespace(index->indrelid);
> +
> + if (!isTempOrTempToastNamespace(nsp) && isAnyTempNamespace(nsp))
> + {
> 
> Doesn't this result in several repeated syscache lookups?
> 
> There's already a SearchSysCacheExsists1 directly above this, then a
> get_rel_namespace, then an isAnyTempNamespace. While this probably
> isn't performance critical, this should be doable with a single
> SearchSysCache1(RELOID...) and then a few conditions, similarly to the
> else branch below this?

You're right. Although it is not performance critical we can solve it
with a single SearchSysCache1.

PFA v3 with the improved fix (0001) and tests (0002).

Thanks for the review!

Best, Jim
Вложения

Re: VACUUM FULL, CLUSTER, and REPACK block on other sessions' temp tables

От
Chao Li
Дата:

> On Mar 26, 2026, at 07:00, Jim Jones <jim.jones@uni-muenster.de> wrote:
>
> Hi
>
> On 25/03/2026 21:38, Zsolt Parragi wrote:
>> Shouldn't the patch also include a tap test to verify that the change
>> works / fails without it?
>
> Definitely. I just didn't want to invest much time on tests before
> getting feedback on the issue itself.
>
>> + /* Skip temp relations belonging to other sessions */
>> + {
>> + Oid nsp = get_rel_namespace(index->indrelid);
>> +
>> + if (!isTempOrTempToastNamespace(nsp) && isAnyTempNamespace(nsp))
>> + {
>>
>> Doesn't this result in several repeated syscache lookups?
>>
>> There's already a SearchSysCacheExsists1 directly above this, then a
>> get_rel_namespace, then an isAnyTempNamespace. While this probably
>> isn't performance critical, this should be doable with a single
>> SearchSysCache1(RELOID...) and then a few conditions, similarly to the
>> else branch below this?
>
> You're right. Although it is not performance critical we can solve it
> with a single SearchSysCache1.
>
> PFA v3 with the improved fix (0001) and tests (0002).
>
> Thanks for the review!
>
> Best,
Jim<v3-0001-Skip-other-sessions-temp-tables-in-REPACK-CLUSTER.patch><v3-0002-Test-VACUUM-FULL-CLUSTER-and-REPACK-with-locked-t.patch>

I don't think such a TAP test is necessary.

One reason is that, if we look at the check right above the new one:
```
        /*
         * We include partitioned tables here; depending on which operation is
         * to be performed, caller will decide whether to process or ignore
         * them.
         */
        if (classForm->relkind != RELKIND_RELATION &&
            classForm->relkind != RELKIND_MATVIEW &&
            classForm->relkind != RELKIND_PARTITIONED_TABLE)
            continue;
```

I don't see a test specifically for that check either. So I don't think we need a test for every individual path.

Second, based on [1] and [2], I got the impression that adding new tests is not always welcome considering overall test
runtime.Anyway, maybe I’m wrong, let the committers judge that. 

[1] https://postgr.es/m/mtkrkkcn2tlhytumitpch5ubxiprv2jzvprf5r5m3mjeczvq4q@p6wkzbfxuyv2 <https://postgr.es/m/>
[2] https://postgr.es/m/1449781.1773948276@sss.pgh.pa.us

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: VACUUM FULL, CLUSTER, and REPACK block on other sessions' temp tables

От
Antonin Houska
Дата:
Chao Li <li.evan.chao@gmail.com> wrote:

> I don't think such a TAP test is necessary.

+1

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: VACUUM FULL, CLUSTER, and REPACK block on other sessions' temp tables

От
Jim Jones
Дата:
On 26/03/2026 12:25, Antonin Houska wrote:
> Chao Li <li.evan.chao@gmail.com> wrote:
> 
>> I don't think such a TAP test is necessary.
> +1

I've kept the tests in a separate file so the committer can easily skip
them if needed.

Thanks for the feedback, everyone!

Best, Jim