Обсуждение: Writeable CTEs and empty relations
Hi, While playing around with another issue with the patch, I came across the following: => create table foo(a int); CREATE TABLE => with t as (insert into foo values(0)) select * from foo;a --- (0 rows) I traced this down to heapam.c, which has this: /** return null immediately if relation is empty*/if (scan->rs_nblocks == 0){ Assert(!BufferIsValid(scan->rs_cbuf)); tuple->t_data = NULL; return;} and /* * Determine the number of blocks we have to scan. * * It is sufficient to do this once at scan start, since any tuples added * while the scan is in progress will be invisible to my snapshot anyway. * (That is not true when using a non-MVCC snapshot. However, we couldn't * guarantee to return tuples added after scan start anyway, since they * might go into pages we already scanned. To guarantee consistent * results for a non-MVCC snapshot, the caller must hold some higher-level * lock that ensures the interesting tuple(s) won't change.) */ scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd); This doesn't exactly work anymore since we modify the snapshot after calling ExecInitScan(). I'm not really familiar with this part of the code, so I'm asking: is there a simple enough way around this? Would updating scan->rs_nblocks before scanning the first tuple be OK? Regards, Marko Tiikkaja
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: > I traced this down to heapam.c, which has this: > ... > This doesn't exactly work anymore since we modify the snapshot after > calling ExecInitScan(). So don't do that. The executor is generally entitled to assume that parameters given to ExecutorStart are correct. In particular, changing the snapshot after the query has started to run seems to me to ensure all sorts of inconsistent and undesirable behavior. regards, tom lane
On 2010-02-09 01:09 +0200, Tom Lane wrote: > Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: >> I traced this down to heapam.c, which has this: >> ... >> This doesn't exactly work anymore since we modify the snapshot after >> calling ExecInitScan(). > > So don't do that. The executor is generally entitled to assume that > parameters given to ExecutorStart are correct. In particular, changing > the snapshot after the query has started to run seems to me to ensure > all sorts of inconsistent and undesirable behavior. You do remember that the whole patch in its current form depends on modifying the snapshot? Regards, Marko Tiikkaja
On 2010-02-08 21:30 +0200, I wrote: > This doesn't exactly work anymore since we modify the snapshot after > calling ExecInitScan(). I'm not really familiar with this part of the > code, so I'm asking: is there a simple enough way around this? Would > updating scan->rs_nblocks before scanning the first tuple be OK? I've looked at this some more, and the problem is a lot bigger than I originally thought. We'd basically be forced to do another initscan() before starting a new scan after the snapshot changed. One way to accomplish this would be that ExecutePlan() would leave a flag in EState whenever the scan nodes need to reinit. Does this sound completely unacceptable? Regards, Marko Tiikkaja
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: > On 2010-02-08 21:30 +0200, I wrote: >> This doesn't exactly work anymore since we modify the snapshot after >> calling ExecInitScan(). I'm not really familiar with this part of the >> code, so I'm asking: is there a simple enough way around this? Would >> updating scan->rs_nblocks before scanning the first tuple be OK? > I've looked at this some more, and the problem is a lot bigger than I > originally thought. We'd basically be forced to do another initscan() > before starting a new scan after the snapshot changed. One way to > accomplish this would be that ExecutePlan() would leave a flag in EState > whenever the scan nodes need to reinit. > Does this sound completely unacceptable? You still haven't explained why it's a good idea to change the snapshot after the executor has started. Right at the moment I'm prepared to reject the patch on that ground alone. regards, tom lane
On 2010-02-10 02:19 +0200, Tom Lane wrote: > Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: >> Does this sound completely unacceptable? > > You still haven't explained why it's a good idea to change the snapshot > after the executor has started. Right at the moment I'm prepared to > reject the patch on that ground alone. The patch only touches the snapshot's curcid. That's needed to allow the queries see the tuples of the DML WITHs ran before that particular query. Regards, Marko Tiikkaja
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: > On 2010-02-10 02:19 +0200, Tom Lane wrote: >> You still haven't explained why it's a good idea to change the snapshot >> after the executor has started. Right at the moment I'm prepared to >> reject the patch on that ground alone. > The patch only touches the snapshot's curcid. That's needed to allow > the queries see the tuples of the DML WITHs ran before that particular > query. ... and they will also see, for example, tuples inserted by triggers fired by those queries. I thought the plan for this feature was to store and re-read the RETURNING data, not to go back and re-read the underlying tables. regards, tom lane
On 2010-02-10 03:20 +0200, Tom Lane wrote: > Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: >> On 2010-02-10 02:19 +0200, Tom Lane wrote: >>> You still haven't explained why it's a good idea to change the snapshot >>> after the executor has started. Right at the moment I'm prepared to >>> reject the patch on that ground alone. > >> The patch only touches the snapshot's curcid. That's needed to allow >> the queries see the tuples of the DML WITHs ran before that particular >> query. > > ... and they will also see, for example, tuples inserted by triggers > fired by those queries. I thought the plan for this feature was to > store and re-read the RETURNING data, not to go back and re-read the > underlying tables. I originally suggested this approach about four months ago. During this whole time you haven't expressed any concerns about this, but suddenly it's a reason to reject the patch? Anyway, if we want to avoid modifying the snapshot, we can't bump the CID between queries. I haven't thought this through, but it seems like this could work. However, none of the WITH queries can see the previous statement's tuples, which means that someone may be suprised when they try to modify the same tuples twice just to discover that only the first modification took place. I don't see this as a huge problem though. This will also solve the problem I started this thread for and makes the patch smaller, simpler and even a bit prettier. Unless someone sees a problem with this approach, I'm going to make the change. Regards, Marko Tiikkaja
On Wed, Feb 10, 2010 at 5:05 AM, Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> wrote: > On 2010-02-10 03:20 +0200, Tom Lane wrote: >> Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: >>> On 2010-02-10 02:19 +0200, Tom Lane wrote: >>>> You still haven't explained why it's a good idea to change the snapshot >>>> after the executor has started. Right at the moment I'm prepared to >>>> reject the patch on that ground alone. >> >>> The patch only touches the snapshot's curcid. That's needed to allow >>> the queries see the tuples of the DML WITHs ran before that particular >>> query. >> >> ... and they will also see, for example, tuples inserted by triggers >> fired by those queries. I thought the plan for this feature was to >> store and re-read the RETURNING data, not to go back and re-read the >> underlying tables. > > I originally suggested this approach about four months ago. During this > whole time you haven't expressed any concerns about this, but suddenly > it's a reason to reject the patch? > > Anyway, if we want to avoid modifying the snapshot, we can't bump the > CID between queries. I haven't thought this through, but it seems like > this could work. However, none of the WITH queries can see the previous > statement's tuples, which means that someone may be suprised when they > try to modify the same tuples twice just to discover that only the first > modification took place. I don't see this as a huge problem though. I think that we agreed previously that running all the DML queries to completion before the main query, bumping the CID after each one, and saving the outputs, was the only workable approach. http://archives.postgresql.org/pgsql-hackers/2009-10/msg00566.php http://archives.postgresql.org/pgsql-hackers/2009-10/msg00614.php If the executor has buried in it the assumption that the snapshot can't change after startup, then does that mean that we need to start up and shut down the executor for each subquery? http://archives.postgresql.org/pgsql-hackers/2009-11/msg01964.php ...Robert
On 2010-02-10 21:59 +0200, Robert Haas wrote: > On Wed, Feb 10, 2010 at 5:05 AM, Marko Tiikkaja > <marko.tiikkaja@cs.helsinki.fi> wrote: >> On 2010-02-10 03:20 +0200, Tom Lane wrote: >>> Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: >>>> On 2010-02-10 02:19 +0200, Tom Lane wrote: >>>>> You still haven't explained why it's a good idea to change the snapshot >>>>> after the executor has started. Right at the moment I'm prepared to >>>>> reject the patch on that ground alone. >>> >>>> The patch only touches the snapshot's curcid. That's needed to allow >>>> the queries see the tuples of the DML WITHs ran before that particular >>>> query. >>> >>> ... and they will also see, for example, tuples inserted by triggers >>> fired by those queries. I thought the plan for this feature was to >>> store and re-read the RETURNING data, not to go back and re-read the >>> underlying tables. >> >> I originally suggested this approach about four months ago. During this >> whole time you haven't expressed any concerns about this, but suddenly >> it's a reason to reject the patch? >> >> Anyway, if we want to avoid modifying the snapshot, we can't bump the >> CID between queries. I haven't thought this through, but it seems like >> this could work. However, none of the WITH queries can see the previous >> statement's tuples, which means that someone may be suprised when they >> try to modify the same tuples twice just to discover that only the first >> modification took place. I don't see this as a huge problem though. > > I think that we agreed previously that running all the DML queries to > completion before the main query, bumping the CID after each one, and > saving the outputs, was the only workable approach. > > http://archives.postgresql.org/pgsql-hackers/2009-10/msg00566.php Hmm. Yeah. Didn't think about that :-( > If the executor has buried in it the assumption that the snapshot > can't change after startup, then does that mean that we need to start > up and shut down the executor for each subquery? Then I guess that's pretty much the only option. Regards, Marko Tiikkaja
Robert Haas <robertmhaas@gmail.com> writes: > If the executor has buried in it the assumption that the snapshot > can't change after startup, then does that mean that we need to start > up and shut down the executor for each subquery? Yes, I think so. That's the way it's always worked in the past; see for example PortalRunMulti() and ProcessQuery(). I think trying to change that is a high-risk, low-reward activity. This probably means that the planner output for queries involving writeable CTEs has to be a separate PlannedStmt per such CTE. regards, tom lane
On 2010-02-10 23:57 +0200, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> If the executor has buried in it the assumption that the snapshot >> can't change after startup, then does that mean that we need to start >> up and shut down the executor for each subquery? > > Yes, I think so. That's the way it's always worked in the past; > see for example PortalRunMulti() and ProcessQuery(). I think trying > to change that is a high-risk, low-reward activity. > > This probably means that the planner output for queries involving > writeable CTEs has to be a separate PlannedStmt per such CTE. I'm looking at this, but I can't think of any good way of associating the tuplestores from PortalRunMulti() with the correct CTEs. Any ideas? Regards, Marko Tiikkaja
On 2010-02-11 00:50 +0200, Marko Tiikkaja wrote: > On 2010-02-10 23:57 +0200, Tom Lane wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> If the executor has buried in it the assumption that the snapshot >>> can't change after startup, then does that mean that we need to start >>> up and shut down the executor for each subquery? >> >> Yes, I think so. That's the way it's always worked in the past; >> see for example PortalRunMulti() and ProcessQuery(). I think trying >> to change that is a high-risk, low-reward activity. >> >> This probably means that the planner output for queries involving >> writeable CTEs has to be a separate PlannedStmt per such CTE. > > I'm looking at this, but I can't think of any good way of associating > the tuplestores from PortalRunMulti() with the correct CTEs. Any ideas? Ok, what about the following: - after planning the original query, standard_planner() goes through the list of top-levelCTEs and assigns a running number for each of the DML WITHs, in the order they will be executed and returns a list of PlannedStmts. all necessary statements wi have a flag signaling that the result should be stored in a tuplestore.- the portal keeps the list of tuplestores around and passes that list to every query through PlannedStmt. it keeps on executing the statements until it finds a PlannedStmt that doesn't want its results stored anywhere and thenfrees the list of tuplestores Does this sound reasonable? And more importantly, am I going to be wasting my time implementing this? The 15th isn't that far away.. Regards, Marko Tiikkaja
On Wed, Feb 10, 2010 at 6:25 PM, Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> wrote: > On 2010-02-11 00:50 +0200, Marko Tiikkaja wrote: >> On 2010-02-10 23:57 +0200, Tom Lane wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: >>>> If the executor has buried in it the assumption that the snapshot >>>> can't change after startup, then does that mean that we need to start >>>> up and shut down the executor for each subquery? >>> >>> Yes, I think so. That's the way it's always worked in the past; >>> see for example PortalRunMulti() and ProcessQuery(). I think trying >>> to change that is a high-risk, low-reward activity. >>> >>> This probably means that the planner output for queries involving >>> writeable CTEs has to be a separate PlannedStmt per such CTE. >> >> I'm looking at this, but I can't think of any good way of associating >> the tuplestores from PortalRunMulti() with the correct CTEs. Any ideas? > > Ok, what about the following: > - after planning the original query, standard_planner() goes through > the list of top-level CTEs and assigns a running number for each of > the DML WITHs, in the order they will be executed and returns a > list of PlannedStmts. all necessary statements wi have a flag > signaling that the result should be stored in a tuplestore. > - the portal keeps the list of tuplestores around and passes that > list to every query through PlannedStmt. it keeps on executing > the statements until it finds a PlannedStmt that doesn't want its > results stored anywhere and then frees the list of tuplestores Wouldn't you'd want to stop when you ran out of PlannedStmts, not when you hit one that didn't need its results stored? What happens if there's an unreferenced CTE in the middle of the list? > Does this sound reasonable? And more importantly, am I going to be > wasting my time implementing this? The 15th isn't that far away.. I have to admit I've been starting to have a feeling over the last couple of days that this patch isn't going to make it for 9.0... but obviously I'm in no position to guarantee anything one way or the other. Please do keep us up to date on your plans, though. Thanks, ...Robert
On 2010-02-11 01:58 +0200, Robert Haas wrote: > On Wed, Feb 10, 2010 at 6:25 PM, Marko Tiikkaja > <marko.tiikkaja@cs.helsinki.fi> wrote: >> On 2010-02-11 00:50 +0200, Marko Tiikkaja wrote: >> Ok, what about the following: >> - after planning the original query, standard_planner() goes through >> the list of top-level CTEs and assigns a running number for each of >> the DML WITHs, in the order they will be executed and returns a >> list of PlannedStmts. all necessary statements wi have a flag >> signaling that the result should be stored in a tuplestore. >> - the portal keeps the list of tuplestores around and passes that >> list to every query through PlannedStmt. it keeps on executing >> the statements until it finds a PlannedStmt that doesn't want its >> results stored anywhere and then frees the list of tuplestores > > Wouldn't you'd want to stop when you ran out of PlannedStmts, not when > you hit one that didn't need its results stored? What happens if > there's an unreferenced CTE in the middle of the list? Right, of course. >> Does this sound reasonable? And more importantly, am I going to be >> wasting my time implementing this? The 15th isn't that far away.. > > I have to admit I've been starting to have a feeling over the last > couple of days that this patch isn't going to make it for 9.0... but > obviously I'm in no position to guarantee anything one way or the > other. Please do keep us up to date on your plans, though. Unfortunately, so have I. I'll take a shot at implementing this, but if I come across any problems, I have to give up. Regards, Marko Tiikkaja
On 2010-02-11 02:13 +0200, I wrote: > On 2010-02-11 01:58 +0200, Robert Haas wrote: >> I have to admit I've been starting to have a feeling over the last >> couple of days that this patch isn't going to make it for 9.0... but >> obviously I'm in no position to guarantee anything one way or the >> other. Please do keep us up to date on your plans, though. > > Unfortunately, so have I. I'll take a shot at implementing this, but if > I come across any problems, I have to give up. I'm going to have to disappoint a bunch of people and give up. :-( At this point, I've already used a huge amount of my time and energy to work on this patch during this commit fest, and I simply can't continue like this. There's only 4 days left anyway, and I don't feel confident enough to spend a lot of time during the next couple of days just to get one more shot. I don't even have any kind of certainty that there would've been a shot, even if I finished the patch on time. Thanks everyone for helping out and reviewing this, especially Robert and Tom. Regards, Marko Tiikkaja
On 2010-02-11 03:44 +0200, I wrote: > I'm going to have to disappoint a bunch of people and give up. :-( Btw. would it make sense to apply the WITH-on-top-of-DML part of this patch? At least to me, this seems useful because you can write a RECURSIVE SELECT and then use UPDATE .. FROM or DELETE .. USING on that CTE. Regards, Marko Tiikkaja
On Thu, Feb 11, 2010 at 8:46 AM, Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> wrote: > On 2010-02-11 03:44 +0200, I wrote: >> I'm going to have to disappoint a bunch of people and give up. :-( > > Btw. would it make sense to apply the WITH-on-top-of-DML part of this > patch? At least to me, this seems useful because you can write a > RECURSIVE SELECT and then use UPDATE .. FROM or DELETE .. USING on that CTE. Hmm, that's a thought. Can you split out just that part? ...Robert
On Thu, 11 Feb 2010 10:53:22 -0500, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Feb 11, 2010 at 8:46 AM, Marko Tiikkaja > <marko.tiikkaja@cs.helsinki.fi> wrote: >> On 2010-02-11 03:44 +0200, I wrote: >>> I'm going to have to disappoint a bunch of people and give up. :-( >> >> Btw. would it make sense to apply the WITH-on-top-of-DML part of this >> patch? At least to me, this seems useful because you can write a >> RECURSIVE SELECT and then use UPDATE .. FROM or DELETE .. USING on that >> CTE. > > Hmm, that's a thought. Can you split out just that part? Here's the patch. It's the same as the stuff in writeable CTE patches, but I added regression tests. Regards, Marko Tiikkaja
Вложения
On Thu, 11 Feb 2010 19:28:28 +0200, I wrote: > On Thu, 11 Feb 2010 10:53:22 -0500, Robert Haas <robertmhaas@gmail.com> > wrote: >> On Thu, Feb 11, 2010 at 8:46 AM, Marko Tiikkaja >> <marko.tiikkaja@cs.helsinki.fi> wrote: >>> On 2010-02-11 03:44 +0200, I wrote: >>>> I'm going to have to disappoint a bunch of people and give up. :-( >>> >>> Btw. would it make sense to apply the WITH-on-top-of-DML part of this >>> patch? At least to me, this seems useful because you can write a >>> RECURSIVE SELECT and then use UPDATE .. FROM or DELETE .. USING on that >>> CTE. >> >> Hmm, that's a thought. Can you split out just that part? > > Here's the patch. It's the same as the stuff in writeable CTE patches, but > I added regression tests. Whoops. The reference section in docs still had some traces of writeable CTEs. Updated patch attached. Regards, Marko Tiikkaja
Вложения
On Thu, Feb 11, 2010 at 12:35 PM, Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> wrote: > On Thu, 11 Feb 2010 19:28:28 +0200, I wrote: >> On Thu, 11 Feb 2010 10:53:22 -0500, Robert Haas <robertmhaas@gmail.com> >> wrote: >>> On Thu, Feb 11, 2010 at 8:46 AM, Marko Tiikkaja >>> <marko.tiikkaja@cs.helsinki.fi> wrote: >>>> On 2010-02-11 03:44 +0200, I wrote: >>>>> I'm going to have to disappoint a bunch of people and give up. :-( >>>> >>>> Btw. would it make sense to apply the WITH-on-top-of-DML part of this >>>> patch? At least to me, this seems useful because you can write a >>>> RECURSIVE SELECT and then use UPDATE .. FROM or DELETE .. USING on that >>>> CTE. >>> >>> Hmm, that's a thought. Can you split out just that part? >> >> Here's the patch. It's the same as the stuff in writeable CTE patches, > but >> I added regression tests. > > Whoops. The reference section in docs still had some traces of writeable > CTEs. Updated patch attached. This looks simple and useful. I haven't tested it, but if it's really this easy, we should definitely do it. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > This looks simple and useful. I haven't tested it, but if it's really > this easy, we should definitely do it. I should be out from under the window functions patch tomorrow, will look at this one then. regards, tom lane
On Fri, Feb 12, 2010 at 12:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> This looks simple and useful. I haven't tested it, but if it's really >> this easy, we should definitely do it. > > I should be out from under the window functions patch tomorrow, > will look at this one then. Cool, thanks. ...Robert
Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: >> Here's the patch. It's the same as the stuff in writeable CTE patches, >> but I added regression tests. > Whoops. The reference section in docs still had some traces of writeable > CTEs. Updated patch attached. I spent some time playing with this but concluded that it's not committable. I ran into two significant problems: 1. In an INSERT statement, it's already possible to attach a WITH to the contained statement, ie INSERT INTO foo WITH ... SELECT ... INSERT INTO foo WITH ... VALUES (...) and the patch wasn't doing anything nice with the case where one tries to put WITH at both places: WITH ... INSERT INTO foo WITH ... VALUES (...) (The SELECT case actually works, mostly, but the VALUES one doesn't.) I thought about just concat'ing the two WITH lists but this introduces various strange corner cases; in particular when one list is marked RECURSIVE and the other isn't there's no way to avoid surprising behavior. However, since the option for an inner WITH already does everything you would want, we could just forget about adding outer WITH for INSERT. The attached modified patch does that. 2. None of the cases play nicely with NEW or OLD references in a rule. For example, regression=# create temp table x(f1 int); CREATE TABLE regression=# create temp table y(f2 int); CREATE TABLE regression=# create rule r2 as on update to x do instead with t as (select old.*) update y set f2 = t.f1 from t; CREATE RULE regression=# update x set f1 = f1+1; ERROR: bogus local parameter passed to WITH query regression=# I don't see any very nice way to fix this. The problem is that the NEW or OLD reference is treated as though it were a relation of the main query (the UPDATE in this case), which is something that's not valid to reference in a WITH query. I'm afraid that it might not be possible to fix it without significant changes in the way rules work (and consequent compatibility issues). We could possibly put in some hack to disallow OLD/NEW references in the WITH queries, but that got past my threshold of ugliness, so I'm not going to commit it without further discussion. Attached is the patch as I had it before giving up (sans documentation since I'd not gotten to that yet). The main other change from what you submitted was adding ruleutils.c support. regards, tom lane Index: src/backend/nodes/copyfuncs.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/nodes/copyfuncs.c,v retrieving revision 1.461 diff -c -r1.461 copyfuncs.c *** src/backend/nodes/copyfuncs.c 12 Feb 2010 17:33:20 -0000 1.461 --- src/backend/nodes/copyfuncs.c 13 Feb 2010 00:49:41 -0000 *************** *** 2279,2284 **** --- 2279,2285 ---- COPY_NODE_FIELD(usingClause); COPY_NODE_FIELD(whereClause); COPY_NODE_FIELD(returningList); + COPY_NODE_FIELD(withClause); return newnode; } *************** *** 2293,2298 **** --- 2294,2300 ---- COPY_NODE_FIELD(whereClause); COPY_NODE_FIELD(fromClause); COPY_NODE_FIELD(returningList); + COPY_NODE_FIELD(withClause); return newnode; } Index: src/backend/nodes/equalfuncs.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/nodes/equalfuncs.c,v retrieving revision 1.382 diff -c -r1.382 equalfuncs.c *** src/backend/nodes/equalfuncs.c 12 Feb 2010 17:33:20 -0000 1.382 --- src/backend/nodes/equalfuncs.c 13 Feb 2010 00:49:41 -0000 *************** *** 899,904 **** --- 899,905 ---- COMPARE_NODE_FIELD(usingClause); COMPARE_NODE_FIELD(whereClause); COMPARE_NODE_FIELD(returningList); + COMPARE_NODE_FIELD(withClause); return true; } *************** *** 911,916 **** --- 912,918 ---- COMPARE_NODE_FIELD(whereClause); COMPARE_NODE_FIELD(fromClause); COMPARE_NODE_FIELD(returningList); + COMPARE_NODE_FIELD(withClause); return true; } Index: src/backend/parser/analyze.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/parser/analyze.c,v retrieving revision 1.401 diff -c -r1.401 analyze.c *** src/backend/parser/analyze.c 12 Feb 2010 22:48:56 -0000 1.401 --- src/backend/parser/analyze.c 13 Feb 2010 00:49:41 -0000 *************** *** 282,287 **** --- 282,294 ---- qry->commandType = CMD_DELETE; + /* process the WITH clause independently of all else */ + if (stmt->withClause) + { + qry->hasRecursive = stmt->withClause->recursive; + qry->cteList = transformWithClause(pstate, stmt->withClause); + } + /* set up range table with just the result rel */ qry->resultRelation = setTargetTable(pstate, stmt->relation, interpretInhOption(stmt->relation->inhOpt), *************** *** 380,386 **** * Must get write lock on INSERT target table before scanning SELECT, else * we will grab the wrong kind of initial lock if the target table is also * mentioned in the SELECT part. Note that the target table is not added ! * to the joinlist or namespace. */ qry->resultRelation = setTargetTable(pstate, stmt->relation, false, false, ACL_INSERT); --- 387,394 ---- * Must get write lock on INSERT target table before scanning SELECT, else * we will grab the wrong kind of initial lock if the target table is also * mentioned in the SELECT part. Note that the target table is not added ! * to the joinlist or namespace (and hence it won't affect processing ! * of the contained statement). */ qry->resultRelation = setTargetTable(pstate, stmt->relation, false, false, ACL_INSERT); *************** *** 1730,1735 **** --- 1738,1750 ---- qry->commandType = CMD_UPDATE; pstate->p_is_update = true; + /* process the WITH clause independently of all else */ + if (stmt->withClause) + { + qry->hasRecursive = stmt->withClause->recursive; + qry->cteList = transformWithClause(pstate, stmt->withClause); + } + qry->resultRelation = setTargetTable(pstate, stmt->relation, interpretInhOption(stmt->relation->inhOpt), true, Index: src/backend/parser/gram.y =================================================================== RCS file: /cvsroot/pgsql/src/backend/parser/gram.y,v retrieving revision 2.708 diff -c -r2.708 gram.y *** src/backend/parser/gram.y 12 Feb 2010 17:33:20 -0000 2.708 --- src/backend/parser/gram.y 13 Feb 2010 00:49:41 -0000 *************** *** 429,435 **** %type <boolean> xml_whitespace_option %type <node> common_table_expr ! %type <with> with_clause %type <list> cte_list %type <list> window_clause window_definition_list opt_partition_clause --- 429,435 ---- %type <boolean> xml_whitespace_option %type <node> common_table_expr ! %type <with> with_clause opt_with_clause %type <list> cte_list %type <list> window_clause window_definition_list opt_partition_clause *************** *** 7141,7154 **** * *****************************************************************************/ ! DeleteStmt: DELETE_P FROM relation_expr_opt_alias using_clause where_or_current_clause returning_clause { DeleteStmt *n = makeNode(DeleteStmt); ! n->relation = $3; ! n->usingClause = $4; ! n->whereClause = $5; ! n->returningList = $6; $$ = (Node *)n; } ; --- 7141,7155 ---- * *****************************************************************************/ ! DeleteStmt: opt_with_clause DELETE_P FROM relation_expr_opt_alias using_clause where_or_current_clause returning_clause { DeleteStmt *n = makeNode(DeleteStmt); ! n->relation = $4; ! n->usingClause = $5; ! n->whereClause = $6; ! n->returningList = $7; ! n->withClause = $1; $$ = (Node *)n; } ; *************** *** 7203,7220 **** * *****************************************************************************/ ! UpdateStmt: UPDATE relation_expr_opt_alias SET set_clause_list from_clause where_or_current_clause returning_clause { UpdateStmt *n = makeNode(UpdateStmt); ! n->relation = $2; ! n->targetList = $4; ! n->fromClause = $5; ! n->whereClause = $6; ! n->returningList = $7; $$ = (Node *)n; } ; --- 7204,7222 ---- * *****************************************************************************/ ! UpdateStmt: opt_with_clause UPDATE relation_expr_opt_alias SET set_clause_list from_clause where_or_current_clause returning_clause { UpdateStmt *n = makeNode(UpdateStmt); ! n->relation = $3; ! n->targetList = $5; ! n->fromClause = $6; ! n->whereClause = $7; ! n->returningList = $8; ! n->withClause = $1; $$ = (Node *)n; } ; *************** *** 7556,7561 **** --- 7558,7569 ---- } ; + opt_with_clause: + with_clause { $$ = $1; } + | /*EMPTY*/ { $$ = NULL; } + ; + + into_clause: INTO OptTempTableName { Index: src/backend/utils/adt/ruleutils.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/utils/adt/ruleutils.c,v retrieving revision 1.321 diff -c -r1.321 ruleutils.c *** src/backend/utils/adt/ruleutils.c 12 Feb 2010 17:33:20 -0000 1.321 --- src/backend/utils/adt/ruleutils.c 13 Feb 2010 00:49:42 -0000 *************** *** 3351,3356 **** --- 3351,3359 ---- RangeTblEntry *rte; ListCell *l; + /* Insert the WITH clause if given */ + get_with_clause(query, context); + /* * Start the query with UPDATE relname SET */ *************** *** 3432,3437 **** --- 3435,3443 ---- StringInfo buf = context->buf; RangeTblEntry *rte; + /* Insert the WITH clause if given */ + get_with_clause(query, context); + /* * Start the query with DELETE FROM relname */ Index: src/include/nodes/parsenodes.h =================================================================== RCS file: /cvsroot/pgsql/src/include/nodes/parsenodes.h,v retrieving revision 1.429 diff -c -r1.429 parsenodes.h *** src/include/nodes/parsenodes.h 12 Feb 2010 17:33:20 -0000 1.429 --- src/include/nodes/parsenodes.h 13 Feb 2010 00:49:42 -0000 *************** *** 906,911 **** --- 906,912 ---- List *usingClause; /* optional using clause for more tables */ Node *whereClause; /* qualifications */ List *returningList; /* list of expressions to return */ + WithClause *withClause; /* WITH clause */ } DeleteStmt; /* ---------------------- *************** *** 920,925 **** --- 921,927 ---- Node *whereClause; /* qualifications */ List *fromClause; /* optional from clause for more tables */ List *returningList; /* list of expressions to return */ + WithClause *withClause; /* WITH clause */ } UpdateStmt; /* ---------------------- Index: src/test/regress/expected/with.out =================================================================== RCS file: /cvsroot/pgsql/src/test/regress/expected/with.out,v retrieving revision 1.13 diff -c -r1.13 with.out *** src/test/regress/expected/with.out 22 Nov 2009 05:20:41 -0000 1.13 --- src/test/regress/expected/with.out 13 Feb 2010 00:49:42 -0000 *************** *** 738,743 **** --- 738,820 ---- (54 rows) -- + -- WITH on top of a DML statement + -- + CREATE TEMPORARY TABLE y (a INTEGER); + INSERT INTO y SELECT generate_series(1, 10); + INSERT INTO y + WITH t AS ( + SELECT a FROM y + ) + SELECT a+20 FROM t RETURNING *; + a + ---- + 21 + 22 + 23 + 24 + 25 + 26 + 27 + 28 + 29 + 30 + (10 rows) + + WITH t AS ( + SELECT a FROM y + ) + UPDATE y SET a = y.a-10 FROM t WHERE y.a > 20 AND t.a = y.a RETURNING y.a; + a + ---- + 11 + 12 + 13 + 14 + 15 + 16 + 17 + 18 + 19 + 20 + (10 rows) + + WITH RECURSIVE t(a) AS ( + SELECT 11 + UNION ALL + SELECT a+1 FROM t WHERE a < 50 + ) + DELETE FROM y USING t WHERE t.a = y.a RETURNING y.a; + a + ---- + 11 + 12 + 13 + 14 + 15 + 16 + 17 + 18 + 19 + 20 + (10 rows) + + SELECT * FROM y; + a + ---- + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + (10 rows) + + -- -- error cases -- -- INTERSECT *************** *** 774,781 **** ERROR: recursive reference to query "x" must not appear within its non-recursive term LINE 1: WITH RECURSIVE x(n) AS (SELECT n FROM x UNION ALL SELECT 1) ^ - CREATE TEMPORARY TABLE y (a INTEGER); - INSERT INTO y SELECT generate_series(1, 10); -- LEFT JOIN WITH RECURSIVE x(n) AS (SELECT a FROM y WHERE a = 1 UNION ALL --- 851,856 ---- Index: src/test/regress/sql/with.sql =================================================================== RCS file: /cvsroot/pgsql/src/test/regress/sql/with.sql,v retrieving revision 1.11 diff -c -r1.11 with.sql *** src/test/regress/sql/with.sql 9 Sep 2009 03:32:52 -0000 1.11 --- src/test/regress/sql/with.sql 13 Feb 2010 00:49:42 -0000 *************** *** 339,344 **** --- 339,371 ---- SELECT * FROM z; -- + -- WITH on top of a DML statement + -- + + CREATE TEMPORARY TABLE y (a INTEGER); + INSERT INTO y SELECT generate_series(1, 10); + + INSERT INTO y + WITH t AS ( + SELECT a FROM y + ) + SELECT a+20 FROM t RETURNING *; + + WITH t AS ( + SELECT a FROM y + ) + UPDATE y SET a = y.a-10 FROM t WHERE y.a > 20 AND t.a = y.a RETURNING y.a; + + WITH RECURSIVE t(a) AS ( + SELECT 11 + UNION ALL + SELECT a+1 FROM t WHERE a < 50 + ) + DELETE FROM y USING t WHERE t.a = y.a RETURNING y.a; + + SELECT * FROM y; + + -- -- error cases -- *************** *** 364,372 **** WITH RECURSIVE x(n) AS (SELECT n FROM x UNION ALL SELECT 1) SELECT * FROM x; - CREATE TEMPORARY TABLE y (a INTEGER); - INSERT INTO y SELECT generate_series(1, 10); - -- LEFT JOIN WITH RECURSIVE x(n) AS (SELECT a FROM y WHERE a = 1 --- 391,396 ----
On Fri, Feb 12, 2010 at 8:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Marko Tiikkaja <marko.tiikkaja@cs.helsinki.fi> writes: >>> Here's the patch. It's the same as the stuff in writeable CTE patches, >>> but I added regression tests. > >> Whoops. The reference section in docs still had some traces of writeable >> CTEs. Updated patch attached. > > I spent some time playing with this but concluded that it's not > committable. I ran into two significant problems: > > 1. In an INSERT statement, it's already possible to attach a WITH to > the contained statement, ie > INSERT INTO foo WITH ... SELECT ... > INSERT INTO foo WITH ... VALUES (...) > and the patch wasn't doing anything nice with the case where one tries > to put WITH at both places: > WITH ... INSERT INTO foo WITH ... VALUES (...) > (The SELECT case actually works, mostly, but the VALUES one doesn't.) > I thought about just concat'ing the two WITH lists but this introduces > various strange corner cases; in particular when one list is marked > RECURSIVE and the other isn't there's no way to avoid surprising > behavior. However, since the option for an inner WITH already does > everything you would want, we could just forget about adding outer WITH > for INSERT. The attached modified patch does that. That seems reasonable, though we might want to document that somehow for posterity. By the way, are we going to support WITH ... TABLE? > 2. None of the cases play nicely with NEW or OLD references in a rule. > For example, > > regression=# create temp table x(f1 int); > CREATE TABLE > regression=# create temp table y(f2 int); > CREATE TABLE > regression=# create rule r2 as on update to x do instead > with t as (select old.*) update y set f2 = t.f1 from t; > CREATE RULE > regression=# update x set f1 = f1+1; > ERROR: bogus local parameter passed to WITH query > regression=# > > I don't see any very nice way to fix this. The problem is that the > NEW or OLD reference is treated as though it were a relation of the > main query (the UPDATE in this case), which is something that's not > valid to reference in a WITH query. I'm afraid that it might not > be possible to fix it without significant changes in the way rules > work (and consequent compatibility issues). > > We could possibly put in some hack to disallow OLD/NEW references in > the WITH queries, but that got past my threshold of ugliness, so > I'm not going to commit it without further discussion. On the face of it it's not obvious to me why you wouldn't just do that. If it's not valid to reference them there, then just don't let it happen (now comes the part where you tell me why it's not that simple). ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Feb 12, 2010 at 8:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> We could possibly put in some hack to disallow OLD/NEW references in >> the WITH queries, but that got past my threshold of ugliness, so >> I'm not going to commit it without further discussion. > On the face of it it's not obvious to me why you wouldn't just do > that. If it's not valid to reference them there, then just don't let > it happen (now comes the part where you tell me why it's not that > simple). Well, there's no obvious-to-the-user reason why it shouldn't work. If we hack it, then an example like I gave will give a "no such table" error, and I'll bet long odds we'll get bug reports about it. (Now maybe we could suppress the bug reports if we could get it to print something more like "NEW can't be referenced in WITH", but doing that seems significantly harder --- the way that I can see to do it would be to not have NEW/OLD in the namespace while parsing WITH, and that would lead to a pretty stupid error message.) regards, tom lane
On Fri, Feb 12, 2010 at 9:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, Feb 12, 2010 at 8:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> We could possibly put in some hack to disallow OLD/NEW references in >>> the WITH queries, but that got past my threshold of ugliness, so >>> I'm not going to commit it without further discussion. > >> On the face of it it's not obvious to me why you wouldn't just do >> that. If it's not valid to reference them there, then just don't let >> it happen (now comes the part where you tell me why it's not that >> simple). > > Well, there's no obvious-to-the-user reason why it shouldn't work. > If we hack it, then an example like I gave will give a "no such > table" error, and I'll bet long odds we'll get bug reports about it. > > (Now maybe we could suppress the bug reports if we could get it to > print something more like "NEW can't be referenced in WITH", but > doing that seems significantly harder --- the way that I can see > to do it would be to not have NEW/OLD in the namespace while parsing > WITH, and that would lead to a pretty stupid error message.) Oh, I get it. I thought you were saying that it was semantically nonsensical, but now I think you're saying that there's some implementation artifact that prevents us from supporting it. Personally, I don't use NEW and OLD in UPDATE queries, so it wouldn't bother me to just disallow them, but (1) I might be in the minority and (2) even if I'm not, the ugliness factor is something to consider.So I'm not sure what the right thing to do is, butwe need to make up our mind... If we do decide to hold off on this, then we should probably have a discussion of what the right way to fix the rule issues is for 9.1. If there's no non-ugly way to do this, then we might as well go ahead and do something ugly. If there is a non-ugly way to fix it, we should figure out what it is so that those who may be interested in having this feature can see about coding it up. ...Robert