Re: Global temporary tables

From: Pavel Stehule
Subject: Re: Global temporary tables
Date:
Msg-id: CAFj8pRAej_Jpw3S5nnJYQZAwCa8MYMY1-3q0-n64eeqFFJykRw@mail.gmail.com
In reply to: Re: Global temporary tables  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List: pgsql-hackers


On Mon, 23 Sep 2019 at 9:57, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:


On 20.09.2019 19:43, Pavel Stehule wrote:

1. I do not need them at all.
2. Eliminate catalog bloating.
3. Mostly needed for compatibility with Oracle (simplify porting,...).
4. Parallel query execution.
5. Can be used at replica.
6. More efficient use of resources (primarily memory).

There can be another point important for the cloud. Inside some clouds there are usually two types of disks - persistent (slow) and ephemeral (fast). We used temp tables effectively there because we moved the temp tablespace to the ephemeral disks.
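
For illustration, a minimal sketch of that setup, assuming a hypothetical mount point /mnt/ephemeral for the fast disk:

    -- tablespace placed on the fast but non-durable ephemeral disk (example path)
    CREATE TABLESPACE fast_ephemeral LOCATION '/mnt/ephemeral/pg_temp';

    -- route temporary tables and temp files of all sessions to that tablespace
    ALTER SYSTEM SET temp_tablespaces = 'fast_ephemeral';
    SELECT pg_reload_conf();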

Yes, I have already heard this argument and I agree with it.
I just want to note two things:
1. My assumption is that in most cases the data of a temporary table can fit in memory (certainly if we do not limit it by temp_buffers = 8MB but store it in shared buffers), so there is no need to write it to persistent media at all (see the sketch below).
2. Global temp tables do not replace local temp tables, which are accessed through local buffers. So if you want to use temporary storage, you will always have a way to do it.
The question is whether we need to support two kinds of global temp tables (with shared or private buffers) or just implement one of them.
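
For context, local temporary tables go through session-local buffers whose size is capped by temp_buffers (8MB by default), and the setting can only be changed before the first temp-table access in a session - a minimal sketch:

    -- raise the per-session cap for local temp table buffers (default 8MB);
    -- this must run before the session touches any temporary table
    SET temp_buffers = '256MB';
    CREATE TEMP TABLE scratch(id int, payload text);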

That is valid only for OLTP. The OLAP world is totally different. Moreover, if all users use temporary tables - and you have to plan for that, which is one reason for global temp tables - then you need to multiply the size by max_connections.

It is hard to say what is best from the implementation perspective, but it would be unfortunate if global temporary tables had different performance characteristics and configuration than local temporary tables.


I am missing one point in your list - developer comfort. Using temp tables is just much more comfortable - you don't need to create them again and again. Thanks to this behaviour it is possible to reduce @2, and @3 can be a nice side effect. If you reduce @2 to zero, then @5 should be possible without anything else.

Sorry, I do not completely understand your point here.
You can use a normal (permanent) table and you will not have to create it again and again. It is also possible to use it for storing temporary data - you just need to truncate the table when the data is not needed any more.
Certainly you cannot use the same table in more than one backend. That is the main advantage of temp tables - you get per-session storage and do not have to worry about possible name conflicts.
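
As a trivial illustration of the difference (table names are hypothetical): every session can create a temp table with the same name and sees only its own rows, whereas the permanent-table workaround shares one table and needs explicit cleanup:

    -- each session that runs this gets its own private table, invisible to others
    CREATE TEMP TABLE session_scratch(id int, val text);
    INSERT INTO session_scratch VALUES (1, 'visible only in this session');

    -- the permanent-table workaround instead shares one table across backends
    -- and has to be emptied explicitly when the data is no longer needed
    TRUNCATE shared_scratch;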

You use temporary tables because you know you will never share the data between sessions. I don't remember any situation where I designed temp tables with a different schema for different sessions.

Using a regular table as a global temp table is not effective - you work with large tables, you need to use DELETE, .. - so you cannot use a classic table as effectively as temp tables.


On the other hand, there are many cases where the format of temporary data is not statically known: it is determined dynamically during program execution.
In this case a local temp table provides the most convenient mechanism for working with such data.
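
A minimal sketch of what "determined dynamically" can mean in practice, assuming a hypothetical PL/pgSQL helper that builds a scratch table from a column list known only at run time:

    -- hypothetical helper: the column list is only known at run time,
    -- so the temp table has to be created dynamically
    CREATE OR REPLACE FUNCTION make_scratch(cols text) RETURNS void AS $$
    BEGIN
        EXECUTE format('CREATE TEMP TABLE scratch_data (%s)', cols);
    END;
    $$ LANGUAGE plpgsql;

    SELECT make_scratch('id int, metric numeric, label text');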

This is why I think that we need to have both local and global temp tables.

Also I do not agree with your statement "If you reduce @2 to zero, then @5 should be possible without anything else".
In the solution implemented by Aleksander Alekseev the metadata of temporary tables is kept in memory and does not affect the catalog at all.
But they still cannot be used at a replica.
There are still some serious problems which need to be fixed to enable it:
allow insert/update/delete statements in read-only transactions, somehow assign XIDs to them, and implement savepoints and rollback of such transactions.
All of this was done in the last version of my patch.
Yes, it does not depend on whether we are using shared or private buffers for temporary tables. The same approach can be implemented for both of them.
The question is whether we really need temp tables at a replica and, if so, whether we need full transaction support for them, including rollbacks and subtransactions.

Temporary tables (of any type) on a replica are an interesting feature that opens up some possibilities. Some queries cannot be optimized as a whole and have to be split: intermediate results are stored in temporary tables, analysed (to get correct statistics), maybe indexed, and after that the calculation can continue. Today you can do this only on the master. Moreover, on a hot standby the data is read-only and there is no direct impact on the master (production), so you can run heavier calculations there. And temporary tables are a well-known technique for fixing estimation errors.
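
A sketch of the pattern meant here, with hypothetical table and column names:

    -- materialize the intermediate result that the planner misestimates
    CREATE TEMP TABLE step1 AS
        SELECT customer_id, sum(amount) AS total
        FROM orders
        GROUP BY customer_id;

    -- give the planner real statistics and an access path
    ANALYZE step1;
    CREATE INDEX ON step1 (customer_id);

    -- continue the calculation with correct estimates
    SELECT c.name, s.total
    FROM customers c
    JOIN step1 s USING (customer_id);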

I don't think subtransactions, transactions and rollbacks are necessary for these tables. On the other hand, without them it is a half-cooked feature and it can look pretty strange in the Postgres environment.

I am very happy about how much work you are doing in this area; I did not have the courage to start this job. But I don't think this work can be reduced to just a few supported scenarios - and I hope a correct implementation is possible, although it is not simple work.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
