Обсуждение: Prepare Transaction support for ON COMMIT DROP temporary tables
Hi, Please find attached a patch to enable support for temporary tables in prepared transactions when ON COMMIT DROP has been specified. The comment in the existing code around this idea reads: * Don't allow PREPARE TRANSACTION if we've accessed a temporary table in * this transaction. [ ... ] * XXX In principle this could be relaxed to allow some useful special * cases, such as a temp table created and dropped all within the * transaction. That seems to require much more bookkeeping though. In the attached patch I have added this paragraph, and of course the implementation of it: * A special case of this situation is using ON COMMIT DROP, where the * call to PreCommit_on_commit_actions() is then responsible for * performing the DROP table within the transaction and before we get * here. Regards, -- dim
Вложения
On 28/12/2018 12:46, Dimitri Fontaine wrote: > Hi, > > Please find attached a patch to enable support for temporary tables in > prepared transactions when ON COMMIT DROP has been specified. The comments I made on IRC have been addressed in this version of the patch, so it looks good to me. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Hi Dimitri On 2018-Dec-28, Dimitri Fontaine wrote: > Please find attached a patch to enable support for temporary tables in > prepared transactions when ON COMMIT DROP has been specified. Glad to see you submitting patches again. I suggest to add in your regression tests a case where the prepared transaction commits, and ensuring that the temp table is gone from catalogs. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Glad to see you submitting patches again. Thanks! > I suggest to add in your regression tests a case where the prepared > transaction commits, and ensuring that the temp table is gone from > catalogs. Please find such a revision attached. Regards, -- dim
Вложения
On Fri, Dec 28, 2018 at 08:32:15PM +0100, Dimitri Fontaine wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> I suggest to add in your regression tests a case where the prepared >> transaction commits, and ensuring that the temp table is gone from >> catalogs. > > Please find such a revision attached. Being able to relax a bit the case is better than nothing, so that's nice to see incremental improvements. Thanks Dimitri. I just had a very quick glance, so that's far from being a detailed review, but could it be possible to add test cases involving inheritance trees and/or partitions if that makes sense? The ON COMMIT action handling is designed to make such cases work properly. -- Michael
Вложения
Michael Paquier <michael@paquier.xyz> writes: > Being able to relax a bit the case is better than nothing, so that's > nice to see incremental improvements. Thanks Dimitri. I'm afraid that this patch is likely to be, if not completely broken, at least very much less useful than one could wish by the time we get done closing the holes discussed in this other thread: https://www.postgresql.org/message-id/flat/5d910e2e-0db8-ec06-dd5f-baec420513c3%40imap.cc For instance, if we're going to have to reject the case where the session's temporary schema was created during the current transaction, then that puts a very weird constraint on whether this case works. Also, even without worrying about new problems that that discussion may lead to, I don't think that the patch works as-is. The function every_on_commit_is_on_commit_drop() does what it says, but that is NOT sufficient to conclude that every temp table the transaction has touched is on-commit-drop. This logic will successfully reject cases with on-commit-delete-rows temp tables, but not cases where the temp table(s) lack any ON COMMIT spec at all. regards, tom lane
Hi, Tom Lane <tgl@sss.pgh.pa.us> writes: > I'm afraid that this patch is likely to be, if not completely broken, > at least very much less useful than one could wish by the time we get > done closing the holes discussed in this other thread: > > https://www.postgresql.org/message-id/flat/5d910e2e-0db8-ec06-dd5f-baec420513c3%40imap.cc Thanks for the review here Tom, and for linking with the other discussion (Álvaro did that too, thanks!). I've been reviewing it too. I didn't think about the pg_temp_NN namespaces in my approach, and I think it might be possible to make it work, but it's getting quite involved now. One idea would be that if every temp table in the session belongs to the transaction, and their namespace too (we could check through pg_depend that the namespace doesn't contain anything else beside the transaction's tables), then we could dispose of the temp schema and on-commit-drop tables at PREPARE COMMIT time. Otherwise, as before, prevent the transaction to be a 2PC one. > For instance, if we're going to have to reject the case where the > session's temporary schema was created during the current transaction, > then that puts a very weird constraint on whether this case works. Yeah. The goal of my approach is to transparently get back temp table support in 2PC when that makes sense, which is most use cases I've been confronted to. We use 2PC in Citus, and it would be nice to be able to use transaction local temp tables on worker nodes when doing data injestion and roll-ups. > Also, even without worrying about new problems that that discussion > may lead to, I don't think that the patch works as-is. The function > every_on_commit_is_on_commit_drop() does what it says, but that is > NOT sufficient to conclude that every temp table the transaction has > touched is on-commit-drop. This logic will successfully reject cases > with on-commit-delete-rows temp tables, but not cases where the temp > table(s) lack any ON COMMIT spec at all. Thanks! I missed that the lack of ON COMMIT spec would have that impact in the code. We could add tracking of that I suppose, and will have a look at how to implement it provided that the other points find an acceptable solution. Regards, -- dim
On Mon, Jan 14, 2019 at 07:41:18PM +0100, Dimitri Fontaine wrote: > Thanks for the review here Tom, and for linking with the other > discussion (Álvaro did that too, thanks!). I've been reviewing it > too. If you can look at the patch, reviews are welcome. There are quite a couple of patterns I spotted on the way. > I didn't think about the pg_temp_NN namespaces in my approach, and I > think it might be possible to make it work, but it's getting quite > involved now. > > One idea would be that if every temp table in the session belongs to the > transaction, and their namespace too (we could check through pg_depend > that the namespace doesn't contain anything else beside the > transaction's tables), then we could dispose of the temp schema and > on-commit-drop tables at PREPARE COMMIT time. Hm. A strong assumption that we rely on in the code is that the temporary namespace drop only happens when the session ends, so you would need to complicate the logic so as the namespace is created in a given transaction, which is something that can be done (at least that's what my patch on the other thread adds control for), and that no other objects than ON COMMIT tables are created, which is more tricky to track (still things would get weird with a LOCK on ON COMMIT DROP tables?). The root of the problem is that the objects' previous versions would still be around between the PREPARE TRANSACTION and COMMIT PREPARED, and that both queries can be perfectly run transparently across multiple sessions. Back in the time, one thing that we did in Postgres-XC was to enforce 2PC to not be used and use a direct commit instead of failing, which was utterly wrong. Postgres-XL may be reusing some of that :( > Yeah. The goal of my approach is to transparently get back temp table > support in 2PC when that makes sense, which is most use cases I've been > confronted to. We use 2PC in Citus, and it would be nice to be able to > use transaction local temp tables on worker nodes when doing data > injestion and roll-ups. You have not considered the case of inherited tables and partitioned mixing ON COMMIT actions of different types as well. For inherited tables this does not matter much I think, perhaps for partitions it does (see tests in 52ea6a8, which you would need to mix with 2PC). -- Michael
Вложения
On Mon, Jan 14, 2019 at 1:41 PM Dimitri Fontaine <dimitri@citusdata.com> wrote: > One idea would be that if every temp table in the session belongs to the > transaction, and their namespace too (we could check through pg_depend > that the namespace doesn't contain anything else beside the > transaction's tables), then we could dispose of the temp schema and > on-commit-drop tables at PREPARE COMMIT time. Why not just drop any on-commit-drop tables at PREPARE TRANSACTION time and leave the schema alone? If there are any temp tables touched by the transaction which are not on-commit-drop then we'd have to fail, but as long as all the tables we've got are on-commit-drop then it seems fine to just nuke them at PREPARE time. Such tables must've been created in the current transaction, because otherwise the creating transaction aborted and they're gone for that reason, or it committed and they're gone because they're on-commit-drop. And regardless of whether the transaction we are preparing goes on to commit or abort, those tables will be gone afterwards for the same reasons. So there doesn't in this case seem to be any reason to keep them around until the transaction's fate is known. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 16/01/2019 17:44, Robert Haas wrote: > On Mon, Jan 14, 2019 at 1:41 PM Dimitri Fontaine <dimitri@citusdata.com> wrote: >> One idea would be that if every temp table in the session belongs to the >> transaction, and their namespace too (we could check through pg_depend >> that the namespace doesn't contain anything else beside the >> transaction's tables), then we could dispose of the temp schema and >> on-commit-drop tables at PREPARE COMMIT time. > > Why not just drop any on-commit-drop tables at PREPARE TRANSACTION > time and leave the schema alone? If there are any temp tables touched > by the transaction which are not on-commit-drop then we'd have to > fail, but as long as all the tables we've got are on-commit-drop then > it seems fine to just nuke them at PREPARE time. Such tables must've > been created in the current transaction, because otherwise the > creating transaction aborted and they're gone for that reason, or it > committed and they're gone because they're on-commit-drop. And > regardless of whether the transaction we are preparing goes on to > commit or abort, those tables will be gone afterwards for the same > reasons. So there doesn't in this case seem to be any reason to keep > them around until the transaction's fate is known. Isn't that what happens already? PrepareTransaction() calls PreCommit_on_commit_actions() from what I can tell. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Fri, Jan 18, 2019 at 4:50 AM Vik Fearing <vik.fearing@2ndquadrant.com> wrote: > Isn't that what happens already? PrepareTransaction() calls > PreCommit_on_commit_actions() from what I can tell. Huh. Well, in that case, I'm not sure I understand we really need to do beyond removing the error checks for the case where all tables are on-commit-drop. It could be useful to do something about the issue with pg_temp creation that Tom linked to in the other thread. But even if you didn't do that, it'd be pretty easy to work around this in application code -- just issue a dummy CREATE TEMP TABLE .. ON COMMIT DROP statement the first time you use a connection, so that the temp schema definitely exists. So I'm not sure I'd view that as a blocker for this patch, even though it's kind of a sucky limitation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Jan 18, 2019 at 10:39:46AM -0500, Robert Haas wrote: > Huh. Well, in that case, I'm not sure I understand we really need to > do beyond removing the error checks for the case where all tables are > on-commit-drop. I have not looked at the patch in details, but we should really be careful that if we do that the namespace does not remain behind when performing such transactions so as it cannot be dropped. On my very recent lookups of this class of problems you can easily finish by blocking a backend from shutting down when dropping its temporary schema, with the client, say psql, already able to disconnect. So as long as the 2PC transaction is not COMMIT PREPARED the backend-side wait will not be able to complete, blocking a backend slot in shared memory. PREPARE TRANSACTION is very close to a simple commit in terms of its semantics, while COMMIT PREPARED is just here to finish releasing resources. > It could be useful to do something about the issue with pg_temp > creation that Tom linked to in the other thread. But even if you > didn't do that, it'd be pretty easy to work around this in application > code -- just issue a dummy CREATE TEMP TABLE .. ON COMMIT DROP > statement the first time you use a connection, so that the temp schema > definitely exists. So I'm not sure I'd view that as a blocker for > this patch, even though it's kind of a sucky limitation. That's not really user-friendly, still workable. Or you could just call current_schema() ;) -- Michael
Вложения
On Sat, Jan 19, 2019 at 10:39:43AM +0900, Michael Paquier wrote: > I have not looked at the patch in details, but we should really be > careful that if we do that the namespace does not remain behind when > performing such transactions so as it cannot be dropped. On my very > recent lookups of this class of problems you can easily finish by > blocking a backend from shutting down when dropping its temporary > schema, with the client, say psql, already able to disconnect. So as > long as the 2PC transaction is not COMMIT PREPARED the backend-side > wait will not be able to complete, blocking a backend slot in shared > memory. PREPARE TRANSACTION is very close to a simple commit in terms > of its semantics, while COMMIT PREPARED is just here to finish > releasing resources. I have been looking at this patch, which conflicts on HEAD by the way (Sorry!) still it is easy enough to get rid of the conflict, and from what I can see it does not completely do its job. Simply take the following example: =# begin; BEGIN =# create temp table aa (a int ) on commit drop; CREATE TABLE =# prepare transaction 'ad'; PREPARE TRANSACTION =# \q This causes the client to think that the session is finished, but if you look closer at the backend it is still pending to close until the transaction is COMMIT PREPARED: michael 22126 0.0 0.0 218172 15788 ? Ss 15:59 0:00 postgres: michael michael [local] idle waiting Here is a backtrace: #7 0x00005616900d0462 in LockAcquireExtended (locktag=0x7ffdd6bd5390, lockmode=8, sessionLock=false, dontWait=false, reportMemoryError=true, locallockp=0x0) at lock.c:1050 #8 0x00005616900cf9ab in LockAcquire (locktag=0x7ffdd6bd5390, lockmode=8, sessionLock=false, dontWait=false) at lock.c:713 #9 0x00005616900ced07 in LockDatabaseObject (classid=2615, objid=16385, objsubid=0, lockmode=8) at lmgr.c:934 #10 0x000056168fd8cace in AcquireDeletionLock (object=0x7ffdd6bd5414, flags=0) at dependency.c:1389 #11 0x000056168fd8b398 in performDeletion (object=0x7ffdd6bd5414, behavior=DROP_CASCADE, flags=29) at dependency.c:325 #12 0x000056168fda103a in RemoveTempRelations (tempNamespaceId=16385) at namespace.c:4142 #13 0x000056168fda106d in RemoveTempRelationsCallback (code=0, arg=0) at namespace.c:4161 If you really intend to drop any trace of the objects at PREPARE phase, that does not seem completely impossible to me, still you would also need handling for the case where the temp table created also creates the temporary schema for the session. -- Michael
Вложения
On Mon, Jan 28, 2019 at 04:06:11PM +0900, Michael Paquier wrote: > If you really intend to drop any trace of the objects at PREPARE > phase, that does not seem completely impossible to me, still you would > also need handling for the case where the temp table created also > creates the temporary schema for the session. More work needs to be done, and the patch has problems, so I am marking this patch as returned with feedback. -- Michael