Re: Global temporary tables

From Konstantin Knizhnik
Subject Re: Global temporary tables
Date
Msg-id 8b4bdfa6-7a90-95e2-1fd0-243767e5b357@postgrespro.ru
In response to Re: Global temporary tables (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: Global temporary tables (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-hackers


On 08.08.2019 5:40, Craig Ringer wrote:
On Tue, 6 Aug 2019 at 16:32, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
New version of the patch with several fixes is attached.
Many thanks to Roman Zharkov for testing.

FWIW I still don't understand your argument with regards to using shared_buffers for temp tables having connection pooling benefits. Are you assuming the presence of some other extension in your extended  version of PostgreSQL ? In community PostgreSQL a temp table's contents in one backend will not be visible in another backend. So if your connection pooler in transaction pooling mode runs txn 1 on backend 42 and it populates temp table X, then the pooler runs the same app session's txn 2 on backend 45, the contents of temp table X are not visible anymore.

Certainly, here I mean the built-in connection pooler, which is not currently present in Postgres
but is part of PgPRO-EE; there is also my patch for vanilla at the commitfest:
https://commitfest.postgresql.org/24/2067


Can you explain? Because AFAICS so long as temp table contents are backend-private there's absolutely no point ever using shared buffers for their contents.

Sure, there is no such problem with temporary tables now.
There is another problem: you cannot use temporary tables with any existing connection pooler (pgbouncer, ...) at a pooling level other than session, unless the temporary table is used within a single transaction (see the sketch below).
One of the advantages of a built-in connection pooler is that it can provide session semantics (GUCs, prepared statements, temporary tables, ...) with a limited number of backends (smaller than the number of sessions).
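
To make the failure mode concrete, here is a minimal libpq sketch (connection parameters, port and table name are invented; it assumes pgbouncer or a similar pooler running in transaction pooling mode on port 6432):

#include <stdio.h>
#include <libpq-fe.h>

/* run one statement and report errors, ignoring the result otherwise */
static void
run(PGconn *conn, const char *sql)
{
    PGresult *res = PQexec(conn, sql);

    if (PQresultStatus(res) != PGRES_COMMAND_OK &&
        PQresultStatus(res) != PGRES_TUPLES_OK)
        fprintf(stderr, "%s -> %s", sql, PQerrorMessage(conn));
    PQclear(res);
}

int
main(void)
{
    /* assumed: pooler in transaction pooling mode listening on 6432 */
    PGconn *conn = PQconnectdb("host=localhost port=6432 dbname=postgres");

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    /* transaction 1: may be served by backend A */
    run(conn, "BEGIN");
    run(conn, "CREATE TEMP TABLE session_data(i int)");
    run(conn, "INSERT INTO session_data VALUES (1)");
    run(conn, "COMMIT");

    /*
     * transaction 2: may be served by a different backend B, where the
     * temp table does not exist, so this will likely fail with
     * ERROR: relation "session_data" does not exist
     */
    run(conn, "SELECT * FROM session_data");

    PQfinish(conn);
    return 0;
}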

In PgPRO-EE this problem was solved by binding a session to a backend, i.e. one backend can manage multiple sessions,
but a session cannot migrate to another backend. The drawback of such a solution is obvious: one long-living transaction can block the transactions of all other sessions scheduled to this backend.
The ability to migrate a session to another backend is one of the obvious solutions to this problem, but the main show-stopper for it is temporary tables.
This is why I consider moving temporary tables to shared buffers to be a very important step.

In the vanilla version of the built-in connection pooler the situation is slightly different.
Right now, if a client uses temporary tables without the "ON COMMIT DROP" clause, the backend is marked as "tainted" and is pinned to this session.
So it is effectively excluded from the connection pool and serves only this session. Once again: if I were able to access temporary table data from another backend, there would be no need to mark the backend as tainted in this case.
Certainly it also requires shared metadata. And here we come to the concept of global temp tables (if we forget for a moment that global temp tables were "invented" long ago by Oracle and many other DBMSes :)

Perhaps you mean that in a connection pooling case, each backend may land up filling up temp buffers with contents from *multiple different temp tables*? If so, sure, I get that, but the answer there seems to be to improve eviction and memory accounting, not make backends waste precious shared_buffers space on non-shareable data.

Anyhow, I strongly suggest you simplify the feature to add the basic global temp table feature so the need to change pg_class, pg_attribute etc to use temp tables is removed, but separate changes to temp table memory handling etc into a follow-up patch. That'll make it smaller and easier to review and merge too. The two changes are IMO logically quite separate anyway.

I agree that they are separate.
But even if we forget about the built-in connection pooler, don't you think that the possibility of using parallel query plans for temporary tables is by itself strong enough motivation to access global temp tables through shared buffers
(while still supporting a private page pool for local temp tables)? So both approaches (shared vs. private buffers) have their pros and cons. This is why it seems reasonable to support both of them and let the user make the choice most suitable for the concrete application. Certainly it is possible to provide both "global shared temp tables" and "global private temp tables", but IMHO that is overkill.
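
As an illustration of the parallel query argument, a rough sketch (connection string and table names are invented; the cost GUCs are forced to zero only so that even tiny tables are considered for parallelism): an ordinary table gets a Gather + Parallel Seq Scan, while the temp table does not, because scans of temporary tables are parallel restricted and their pages live only in the backend-local buffer pool.

#include <stdio.h>
#include <libpq-fe.h>

/* print the EXPLAIN output of one statement (error handling omitted) */
static void
explain(PGconn *conn, const char *sql)
{
    PGresult *res = PQexec(conn, sql);

    printf("%s\n", sql);
    for (int i = 0; i < PQntuples(res); i++)
        printf("    %s\n", PQgetvalue(res, i, 0));
    PQclear(res);
}

int
main(void)
{
    PGconn *conn = PQconnectdb("dbname=postgres");  /* assumed local server */

    /* make the planner consider parallel plans even for tiny tables */
    PQclear(PQexec(conn, "SET parallel_setup_cost = 0"));
    PQclear(PQexec(conn, "SET parallel_tuple_cost = 0"));
    PQclear(PQexec(conn, "SET min_parallel_table_scan_size = 0"));

    PQclear(PQexec(conn, "CREATE TABLE regular_t AS SELECT generate_series(1,1000) AS i"));
    PQclear(PQexec(conn, "CREATE TEMP TABLE temp_t AS SELECT generate_series(1,1000) AS i"));

    explain(conn, "EXPLAIN SELECT count(*) FROM regular_t");  /* Gather + Parallel Seq Scan */
    explain(conn, "EXPLAIN SELECT count(*) FROM temp_t");     /* plain Seq Scan, no workers */

    PQclear(PQexec(conn, "DROP TABLE regular_t"));
    PQfinish(conn);
    return 0;
}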

Come to think of it, I think connection poolers might benefit from an extension to the DISCARD command, say "DISCARD TEMP_BUFFERS", which evicts temp table buffers from memory *without* dropping the temp tables. If they're currently in-memory tuplestores they'd be written out and evicted. That way a connection pooler could "clean" the backend, at the cost of some probably pretty cheap buffered writes to the system buffer cache. The kernel might not even bother to write out the buffercache and it won't be forced to do so by fsync, checkpoints, etc, nor will the writes go via WAL so such evictions could be pretty cheap - and if not under lots of memory pressure the backend could often read the temp table back in from system buffer cache without disk I/O.

Yes, this is one of the possible solutions for session migration. But frankly speaking, flushing local buffers on each session reschedule does not seem to be such a good solution, even if the OS file cache is large enough and the flushed buffers are still present in memory (they will be written to disk in that case anyway, even though the temp table data is not intended to be persisted).

That's my suggestion for how to solve your pooler problem, assuming I've understood it correctly.

Along these lines I suggest adding the following to DISCARD at some point, obviously not as part of your patch:

* DISCARD TEMP_BUFFERS
* DISCARD SHARED_BUFFERS
* DISCARD TEMP_FILES
* DISCARD CATALOG_CACHE
* DISCARD HOLD_CURSORS
* DISCARD ADVISORY_LOCKS

where obviously DISCARD SHARED_BUFFERS would be superuser-only and evict only clean buffers.

(Also, if we extend DISCARD, let's also allow it to be written as DISCARD (LIST, OF, THINGS, TO, DISCARD) so that we can make the syntax extensible for plugins in future).

Thoughts?

Would DISCARD TEMP_BUFFERS meet your needs?

Actually, I have already implemented a DropLocalBuffers function (three lines of code :) for yet another Postgres extension which is not yet included even in PgPRO-EE, SnapFS (support for database snapshots):

void
DropLocalBuffers(void)
{
     RelFileNode rnode;
     rnode.relNode = InvalidOid; /* drop all local buffers */
     DropRelFileNodeAllLocalBuffers(rnode);
}
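
One remark about this snippet: stock DropRelFileNodeAllLocalBuffers() compares the whole RelFileNode with RelFileNodeEquals(), so passing InvalidOid only works as a "drop everything" wildcard together with a small change in localbuf.c. Roughly, the matching rule it assumes looks like the hypothetical helper below (the name is purely illustrative, not part of any patch):

#include "postgres.h"
#include "storage/buf_internals.h"
#include "storage/relfilenode.h"

/*
 * Illustrative helper: the buffer-matching rule assumed by DropLocalBuffers()
 * above. relNode == InvalidOid acts as a wildcard matching every local
 * buffer; any other value keeps the existing exact-match behaviour of
 * DropRelFileNodeAllLocalBuffers().
 */
static inline bool
LocalBufferMatches(const BufferTag *tag, RelFileNode rnode)
{
    if (rnode.relNode == InvalidOid)
        return true;            /* wildcard: drop all local buffers */
    return RelFileNodeEquals(tag->rnode, rnode);
}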


I do not think that we need such a command at the user level (i.e. a corresponding SQL command).
But, as I already wrote above, I do not consider flushing all buffers on session reschedule to be an acceptable solution.
Moreover, just flushing buffers is not enough: there is still some smgr state associated with the relation which is local to the backend.
In any case we have to make some changes to be able to access temporary data from another backend, even if the data has been flushed to the file system.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
