Thread: global temporary table (GTT) - are there some ideas how to implement it?


global temporary table (GTT) - are there some ideas how to implement it?

From: Pavel Stehule
Date:
Hi,

last week there was a discussion on LinkedIn about porting an Oracle application to Postgres.
I am sure that a lot of the temporary-table usage in such applications is unnecessary, based on the long history of ported applications - Sybase (MSSQL) -> Oracle -> Postgres - but global temporary tables are still an interesting feature, and the impossibility of using GTTs is a real problem for a lot of users.

One of the issues of this port is probably temporary tables. It is probably a common issue - because PostgreSQL doesn't support global temporary tables, and any workaround has a significant problem with bloating of some system catalog tables - pg_attribute, pg_class, pg_depend, pg_shdepend.
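The bloat is easy to reproduce in a plain session: every CREATE TEMP TABLE inserts rows into the catalogs, and those rows become dead tuples when the table is dropped. A minimal illustration (requires a running PostgreSQL instance; table names are arbitrary):

```sql
-- Each iteration inserts rows into pg_class, pg_attribute, pg_depend, ...
-- On DROP those rows become dead tuples; with thousands of short-lived
-- sessions doing this, the catalogs bloat faster than autovacuum keeps up.
DO $$
BEGIN
  FOR i IN 1..1000 LOOP
    EXECUTE 'CREATE TEMP TABLE t (a int, b text)';
    EXECUTE 'DROP TABLE t';
  END LOOP;
END $$;

-- Observe the accumulated dead tuples in the catalogs:
SELECT relname, n_dead_tup
FROM pg_stat_sys_tables
WHERE relname IN ('pg_class', 'pg_attribute', 'pg_depend');
```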

The implementation has two parts. One can be relatively "simple" - using local storage for a persistent table.

The second is almost impossible - storing metadata that cannot be shared, like relpages, reltuples, and pg_statistic rows. We also want to support views like pg_stats for global temp tables, and if possible without bigger changes.

Some years ago there were some implementations based on memory caches. They didn't work well, because Postgres has no concept of session-persistent caches of catalog data that survive a cache invalidation signal.

I think this problem can be reduced to the implementation of the pg_statistic table. If we can support GTT behavior for pg_statistic, we can support GTTs generally.

pg_statistic could (in the future) be a partitioned table - one partition for common tables, and one partition for global temporary tables. The partition for global temporary tables could itself be a GTT. There could also be a GTT partition for currently used local temporary tables (this pattern could fix the bloating related to usage of local temporary tables).
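As a sketch of the idea - purely hypothetical syntax, since PostgreSQL has no GLOBAL TEMPORARY modifier and pg_statistic is not partitionable today, and the partition key column is invented - the catalog layout could look roughly like:

```sql
-- Hypothetical sketch only: neither partitioning pg_statistic nor a
-- GLOBAL TEMPORARY partition exists in PostgreSQL today.
-- "starelkind" is an invented partition-key column (table kind).
CREATE TABLE pg_statistic (...)
    PARTITION BY LIST (starelkind);

-- Shared statistics for ordinary persistent tables.
CREATE TABLE pg_statistic_regular
    PARTITION OF pg_statistic FOR VALUES IN ('r');

-- Per-session statistics: shared definition, backend-private storage.
CREATE GLOBAL TEMPORARY TABLE pg_statistic_gtt
    PARTITION OF pg_statistic FOR VALUES IN ('g');
```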

I am not sure whether the proposed design is implementable - it requires partitioning of system tables at some very low level.

Does anybody have some ideas on this topic?

Regards

Pavel


Re: global temporary table (GTT) - are there some ideas how to implement it?

From: Konstantin Knizhnik
Date:
On 12/01/2026 7:51 AM, Pavel Stehule wrote:
> Hi,
>
> last week there was a discussion on LinkedIn about porting an Oracle
> application to Postgres.
> I am sure that a lot of the temporary-table usage in such applications
> is unnecessary, based on the long history of ported applications -
> Sybase (MSSQL) -> Oracle -> Postgres - but global temporary tables are
> still an interesting feature, and the impossibility of using GTTs is a
> real problem for a lot of users.
>
> One of the issues of this port is probably temporary tables. It is
> probably a common issue - because PostgreSQL doesn't support global
> temporary tables, and any workaround has a significant problem with
> bloating of some system catalog tables - pg_attribute, pg_class,
> pg_depend, pg_shdepend.
>
> The implementation has two parts. One can be relatively "simple" -
> using local storage for a persistent table.
>
> The second is almost impossible - storing metadata that cannot be
> shared, like relpages, reltuples, and pg_statistic rows. We also want
> to support views like pg_stats for global temp tables, and if possible
> without bigger changes.
>
> Some years ago there were some implementations based on memory caches.
> They didn't work well, because Postgres has no concept of
> session-persistent caches of catalog data that survive a cache
> invalidation signal.
>
> I think this problem can be reduced to the implementation of the
> pg_statistic table. If we can support GTT behavior for pg_statistic,
> we can support GTTs generally.
>
> pg_statistic could (in the future) be a partitioned table - one
> partition for common tables, and one partition for global temporary
> tables. The partition for global temporary tables could itself be a
> GTT. There could also be a GTT partition for currently used local
> temporary tables (this pattern could fix the bloating related to usage
> of local temporary tables).
>
> I am not sure whether the proposed design is implementable - it
> requires partitioning of system tables at some very low level.
>
> Does anybody have some ideas on this topic?
>
> Regards
>
> Pavel
>
>

Hi,

7 years ago I proposed an Oracle-like solution for temp tables (shared 
metadata, private data):
https://www.postgresql.org/message-id/flat/73954ab7-44d3-b37b-81a3-69bdcbb446f7%40postgrespro.ru
You also participated in that discussion, and one of the concerns was 
exactly this problem with statistics.

I do not completely understand how partitioning of system tables can 
solve this problem.
Do you propose that each backend has its own (private) partition?
It seems to be impossible, and it can cause even worse catalog bloating 
(instead of one temp table we would have to create temp partitions for 
multiple system tables).




Re: global temporary table (GTT) - are there some ideas how to implement it?

From: Pavel Stehule
Date:
Hi

I do not completely understand how partitioning of system tables can
solve this problem.
Do you propose that each backend has its own (private) partition?
It seems to be impossible, and it can cause even worse catalog bloating
(instead of one temp table we would have to create temp partitions for
multiple system tables).

Yes - but what is important: each backend has its own global temp partitions. Then almost all metadata will be static, and only the statistics-related part will be temporary.

The backend-related partitions have to be global temp tables, not local temp tables (only then does it make sense).

The main target is removing bloat from the system catalog, and that is impossible without storing the system data in a GTT.




 

Re: global temporary table (GTT) - are there some ideas how to implement it?

From: Konstantin Knizhnik
Date:


On 12/01/2026 8:57 AM, Pavel Stehule wrote:
Hi

I do not completely understand how partitioning of system tables can
solve this problem.
Do you propose that each backend has its own (private) partition?
It seems to be impossible, and it can cause even worse catalog bloating
(instead of one temp table we would have to create temp partitions for
multiple system tables).

Yes - but what is important: each backend has its own global temp partitions. Then almost all metadata will be static, and only the statistics-related part will be temporary.

The backend-related partitions have to be global temp tables, not local temp tables (only then does it make sense).

The main target is removing bloat from the system catalog, and that is impossible without storing the system data in a GTT.

So do I understand correctly that at least pg_class and pg_statistic will have two partitions, "global" and "local", and the last one will act as a GTT: shared metadata but private data? That can work for pg_statistic, because it is actually only needed by the particular backend. 
But it will not work for pg_class, because all backends should observe information about the table, while each of them should see its own relpages/reltuples. So we need some kind of CoW here. I am not sure that partitioning can solve this problem...

Also, even if we manage to somehow solve this problem and provide a private version of pg_statistic/pg_class, it is not clear who will fill this data, and when.
Autovacuum does not process temp tables. There was an autoanalyze extension that allows running analyze on demand inside the backend itself, but I am not sure that it can be considered a general approach for GTT. An alternative is to store GTT data in shared buffers rather than in local backend memory.
It would also solve other problems with temporary tables, e.g. connection pooling. 
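For context: today the only reliable way to get statistics for a temporary table is a manual ANALYZE inside the owning session, since autovacuum skips temp relations. An illustrative session (table and column names are arbitrary):

```sql
-- Temp tables are invisible to autovacuum, so statistics must be
-- gathered manually in the owning session.
CREATE TEMP TABLE orders_tmp AS
SELECT g AS id, g % 10 AS status FROM generate_series(1, 100000) g;

-- Without this, the planner falls back on default estimates.
ANALYZE orders_tmp;

EXPLAIN SELECT * FROM orders_tmp WHERE status = 3;
```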

Re: global temporary table (GTT) - are there some ideas how to implement it?

From: Pavel Stehule
Date:


po 12. 1. 2026 v 10:16 odesílatel Konstantin Knizhnik <knizhnik@garret.ru> napsal:


On 12/01/2026 8:57 AM, Pavel Stehule wrote:
Hi

I do not completely understand how partitioning of system tables can
solve this problem.
Do you propose that each backend has its own (private) partition?
It seems to be impossible, and it can cause even worse catalog bloating
(instead of one temp table we would have to create temp partitions for
multiple system tables).

Yes - but what is important: each backend has its own global temp partitions. Then almost all metadata will be static, and only the statistics-related part will be temporary.

The backend-related partitions have to be global temp tables, not local temp tables (only then does it make sense).

The main target is removing bloat from the system catalog, and that is impossible without storing the system data in a GTT.

So do I understand correctly that at least pg_class and pg_statistic will have two partitions, "global" and "local", and the last one will act as a GTT: shared metadata but private data? That can work for pg_statistic, because it is actually only needed by the particular backend. 
But it will not work for pg_class, because all backends should observe information about the table, while each of them should see its own relpages/reltuples. So we need some kind of CoW here. I am not sure that partitioning can solve this problem...


In this design, all possibly-unshared fields should be moved to a new table - and that table should be partitioned, with a possible GTT partition. 


Also, even if we manage to somehow solve this problem and provide a private version of pg_statistic/pg_class, it is not clear who will fill this data, and when.

It can be the same as it is now for temporary tables - they are empty until somebody runs a vacuum. 

The storage of a GTT will be local, so autovacuum is not possible.

Autovacuum does not process temp tables now - and I didn't propose changing that.

The proposed design doesn't try to solve any "automatic" vacuum or analyze of temporary objects. 

The lifecycle of temporary tables is usually less than one minute, so a cron-based design cannot work well, and I don't think using GTTs can help there. 

Missing ANALYZE on temporary tables is probably a common issue, as far as I can say (I have seen it as a customer issue several times). But it needs a different design than the current autovacuum/autoanalyze.
Maybe the planner could raise a warning when trying to process tables without statistics (or the planner could force ANALYZE on temp tables). I don't know. That is a different question - for now I expect that users are able to run ANALYZE manually.
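One possible building block for such a warning already exists: a session can check from the standard pg_stat_all_tables view whether its temp tables have ever been analyzed (illustrative query; the schemaname pattern matches the per-backend pg_temp_N schemas):

```sql
-- Temp tables of this session that have never been analyzed.
-- pg_stat_all_tables exposes per-session counters for temp relations.
SELECT relname
FROM pg_stat_all_tables
WHERE schemaname LIKE 'pg_temp%'
  AND last_analyze IS NULL
  AND last_autoanalyze IS NULL;
```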

Another question is whether we need to worry about choosing local or shared storage. Implementing GTTs is not a one-month game, and if Postgres moves from processes to threads, then the question of choosing local or shared storage loses its sense.

It would also solve other problems with temporary tables, e.g. connection pooling. 

That is another (but interesting) question - how sessions and connections can be shared. I think there are the same issues as with local temporary tables - so we don't need to open this question now.
At this moment I expect that the content of a GTT doesn't survive DISCARD TEMP - and DISCARD TEMP can be implemented just by throwing away the local storage and local buffers.
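For reference, DISCARD TEMP already drops all local temporary tables of a session; under the proposed semantics a GTT would keep its shared definition and lose only the session's private contents (illustrative session; the GTT part is hypothetical):

```sql
-- Existing behavior: DISCARD TEMP drops local temp tables entirely.
CREATE TEMP TABLE scratch (x int);
INSERT INTO scratch VALUES (1);
DISCARD TEMP;   -- "scratch" is gone: definition and data

-- Proposed GTT behavior (hypothetical): only the session-private data
-- would vanish; the shared table definition would remain for reuse.
```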