Обсуждение: SELECT FOR UPDATE leaks relation refcounts
Poking into Oliver's report of "RelationClearRelation: relation 21645 modified while in use", I find that the culprit is the following code in execMain.c's InitPlan(): foreach(l, parseTree->rowMark) { rm = lfirst(l); relid = rt_fetch(rm->rti, rangeTable)->relid; relation = heap_open(relid, RowShareLock); if (!(rm->info & ROW_MARK_FOR_UPDATE)) continue; erm = (execRowMark *) palloc(sizeof(execRowMark)); erm->relation= relation; erm->rti = rm->rti; sprintf(erm->resname, "ctid%u", rm->rti); estate->es_rowMark= lappend(estate->es_rowMark, erm); } That heap_open() call has no corresponding heap_close() anywhere, so every SELECT FOR UPDATE leaves the relation's refcount one higher than it was. This didn't use to be a huge problem, other than that the rel would be permanently locked into the backend's relcache. (I think an attempt to DROP the table later in the session would have caused trouble, though.) However, I just committed changes in the relcache that assume that zero refcount is trustworthy, and it's those changes that are spitting up. It's easy enough to add code to EndPlan that goes through the estate->es_rowMark list to close the rels that had ROW_MARK_FOR_UPDATE set. But if that bit wasn't set, the above code opens the rel and then forgets about it completely. Is that a bug? If not, I guess we need another data structure to keep track of the non-ROW_MARK_FOR_UPDATE rels through execution. (EndPlan doesn't currently get the parsetree as a parameter, so it can't just duplicate the above loop --- though passing it the parsetree might be one possible solution.) I don't understand SELECT FOR UPDATE enough to know what is going on here. But it seems darn peculiar to open a rel and then not keep any reference to the rel for later use. Anybody know how this works? regards, tom lane
> -----Original Message----- > From: owner-pgsql-hackers@postgreSQL.org > [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Tom Lane > > Poking into Oliver's report of "RelationClearRelation: relation 21645 > modified while in use", I find that the culprit is the following > code in execMain.c's InitPlan(): > > foreach(l, parseTree->rowMark) > { > rm = lfirst(l); > relid = rt_fetch(rm->rti, rangeTable)->relid; > relation = heap_open(relid, RowShareLock); > if (!(rm->info & ROW_MARK_FOR_UPDATE)) > continue; > erm = (execRowMark *) palloc(sizeof(execRowMark)); > erm->relation = relation; > erm->rti = rm->rti; > sprintf(erm->resname, "ctid%u", rm->rti); > estate->es_rowMark = lappend(estate->es_rowMark, erm); > } > > That heap_open() call has no corresponding heap_close() anywhere, > so every SELECT FOR UPDATE leaves the relation's refcount one higher > than it was. This didn't use to be a huge problem, other than that the > rel would be permanently locked into the backend's relcache. (I think > an attempt to DROP the table later in the session would have caused > trouble, though.) However, I just committed changes in the relcache > that assume that zero refcount is trustworthy, and it's those changes > that are spitting up. > > It's easy enough to add code to EndPlan that goes through the > estate->es_rowMark list to close the rels that had ROW_MARK_FOR_UPDATE > set. But if that bit wasn't set, the above code opens the rel and then > forgets about it completely. Is that a bug? If not, I guess we need Seems its a bug though I'm not sure. Is there anything wrong with inserting heap_close(relation, NoLock) immediately before 'continue;' ? Regards. Hiroshi Inoue Inoue@tpf.co.jp
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: >> It's easy enough to add code to EndPlan that goes through the >> estate->es_rowMark list to close the rels that had ROW_MARK_FOR_UPDATE >> set. But if that bit wasn't set, the above code opens the rel and then >> forgets about it completely. Is that a bug? If not, I guess we need > Seems its a bug though I'm not sure. I looked over the code that works with rowmarks and decided it is a bug. There are just two action flag bits for the executor to worry about, ROW_MARK_FOR_UPDATE and ROW_ACL_FOR_UPDATE. The first makes the execution-time stuff actually happen, while the second causes a suitable permissions check to be applied before execution. In a simple SELECT FOR UPDATE situation, both bits will be set. The only way that the ROW_MARK_FOR_UPDATE bit can get unset is if the SELECT FOR UPDATE command references a view --- in that case, the rewriter clears the ROW_MARK_FOR_UPDATE bit on the view's rowmark entry, and adds rowmark entries with only ROW_MARK_FOR_UPDATE set for the tables referenced by the view. As far as I can see, this is correct behavior: the permissions check should be applied to the view, not the referenced tables, but actual execution happens in the referenced tables and doesn't touch the view at all. Therefore, it's unnecessary --- and perhaps actually wrong --- for InitPlan to be grabbing a RowShareLock on the view. So, I've rearranged the InitPlan code to not open the rel at all when ROW_MARK_FOR_UPDATE is clear, and I've added code in EndPlan to traverse the estate->es_rowMark list and heap_close the opened rels (specifying NoLock, so that the RowShareLock is held till commit). This seems to solve Oliver's problem, and the regress tests still pass, so I committed it a little while ago. > Is there anything wrong with inserting heap_close(relation, NoLock) > immediately before 'continue;' ? We can do that if it turns out my analysis is wrong and RowShareLock should indeed be grabbed on views as well as their underlying tables. regards, tom lane
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Thursday, February 03, 2000 10:00 AM > > This seems to solve Oliver's problem, and the regress tests still pass, > so I committed it a little while ago. > > > Is there anything wrong with inserting heap_close(relation, NoLock) > > immediately before 'continue;' ? > > We can do that if it turns out my analysis is wrong and RowShareLock > should indeed be grabbed on views as well as their underlying tables. > I couldn't judge whether the following current behavior has some meaning or not. Let v be a view; Session-1 begin; lock table v in exclusive mode; (I don't know what this means) Session-2 begin; select * from v for update; (blocked by Session-1) Regards. Hiroshi Inoue Inoue@tpf.co.jp
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > I couldn't judge whether the following current behavior has some meaning > or not. > Let v be a view; > lock table v in exclusive mode; (I don't know what this means) Good question ... but it seems to me that it has to mean grabbing exclusive lock on the table(s) referred to by v. Otherwise, if client A locks the view and client B locks the underlying table directly, they'll both pass the lock and be able to access/modify the underlying table at the same time. That can't be right. The rewriter correctly passes SELECT FOR UPDATE locking from the view to the referenced tables, but I'm not sure whether it is bright enough to do the same for LOCK statements. (Jan?) regards, tom lane
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > > I couldn't judge whether the following current behavior has some meaning > > or not. > > > Let v be a view; > > > lock table v in exclusive mode; (I don't know what this means) > > Good question ... but it seems to me that it has to mean grabbing > exclusive lock on the table(s) referred to by v. Otherwise, if > client A locks the view and client B locks the underlying table > directly, they'll both pass the lock and be able to access/modify > the underlying table at the same time. That can't be right. > > The rewriter correctly passes SELECT FOR UPDATE locking from the > view to the referenced tables, but I'm not sure whether it is > bright enough to do the same for LOCK statements. (Jan?) Isn't LOCK TABLE a utility statement? So it doesn't go through the rewriter. The LOCK code would have to do the correct locking of the underlying tables. And not to forget cascaded viewsor possible subselects. Actually LockTableCommand() in command.c doesn't do it. It simply locks the view relation, what's definitely wrong. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
> > The rewriter correctly passes SELECT FOR UPDATE locking from the > > view to the referenced tables, but I'm not sure whether it is > > bright enough to do the same for LOCK statements. (Jan?) > > Isn't LOCK TABLE a utility statement? So it doesn't go > through the rewriter. > > The LOCK code would have to do the correct locking of the > underlying tables. And not to forget cascaded views or > possible subselects. > > Actually LockTableCommand() in command.c doesn't do it. It > simply locks the view relation, what's definitely wrong. > Added to TODO: * Disallow LOCK on view -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026