Обсуждение: BUG #8870: PL/PgSQL, SELECT .. INTO and the number of result columns
The following bug has been logged on the website: Bug reference: 8870 Logged by: Marko Tiikkaja Email address: marko@joh.to PostgreSQL version: 9.3.2 Operating system: OS X Description: I looked more closely into a gripe from Stephen Frost, which lead me into discovering this behaviour: =# create function f() returns void as -# $$declare $# f1 int; $# begin $# select 1, 2 into f1; $# end $# $$ language plpgsql; CREATE FUNCTION =# select f(); f --- (1 row) Which directly contradicts this statement in the docs (http://www.postgresql.org/docs/9.3/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-SQL-ONEROW): "If a row or a variable list is used as target, the query's result columns must exactly match the structure of the target as to number and data types, or else a run-time error occurs." This seems to have been broken by commit 8bb3c8fe5413103c30ba8d1bcb3a1f8e46bddf3d. I strongly believe the documentation is right in this case, and the behaviour ought to change.
* marko@joh.to (marko@joh.to) wrote: > I looked more closely into a gripe from Stephen Frost, which lead me into > discovering this behaviour: Well, as I got mentioned here.. The specific gripe I was bitching about on IRC goes thusly: =3D*# do $$ declare f1 int; declare f2 int; declare f3 int; begin select 1, 2, 3 into f1, f2 f3; end $$ language plpgsql; DO The issue above being that when you have a long list of columns being selected into variables, it can be easy to forget a comma and then debugging can be very painful. > Which directly contradicts this statement in the docs > (http://www.postgresql.org/docs/9.3/static/plpgsql-statements.html#PLPGSQ= L-STATEMENTS-SQL-ONEROW): >=20 > "If a row or a variable list is used as target, the query's result columns > must exactly match the structure of the target as to number and data type= s, > or else a run-time error occurs." Agreed. > This seems to have been broken by commit > 8bb3c8fe5413103c30ba8d1bcb3a1f8e46bddf3d. I strongly believe the > documentation is right in this case, and the behaviour ought to change. Yeah, I don't entirely follow the logic in that commit. It's strongly discouraged to use 'SELECT * INTO', for pretty obvious reasons. We certainly have no sympathy for users using 'SELECT *' from other languages when the columns get rearranged underneath them (due to a DROP COLUMN or similar). Thanks, Stephen
marko@joh.to writes: > =# create function f() returns void as > -# $$declare > $# f1 int; > $# begin > $# select 1, 2 into f1; > $# end > $# $$ language plpgsql; > CREATE FUNCTION > =# select f(); > f > --- > (1 row) > This seems to have been broken by commit > 8bb3c8fe5413103c30ba8d1bcb3a1f8e46bddf3d. Did you really test against a pre-2001 plpgsql? The comments in that commit are correct as far as they go: we *must* be willing to accept tuples with more or fewer attributes than the row has fields, or unpleasant things will happen with code that is correct on its face, if the table it's fetching from has inheritance children and/or has been subject to ALTER TABLE. ISTM what you're really asking for is a compile time (not run-time) check that the number of INTO targets matches the nominal number of output columns of the query. Not sure how difficult that is, but it'd be more likely to be successful than messing with the run-time behavior in exec_move_row. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > ISTM what you're really asking for is a compile time (not run-time) check > that the number of INTO targets matches the nominal number of output > columns of the query. Not sure how difficult that is, but it'd be more > likely to be successful than messing with the run-time behavior in > exec_move_row. For my 2c- a compile-time check would be even better than a run-time one in this case as it would provide pl/pgsql authors notice that they've goof'd something up. Regardless, I don't see how we can claim to be respecting what the documentation calls for here and it's causing pain for our users. I'm on the fence about if we can feel comfortable back-patching such a change, but I do feel we should figure out a way to fix it for 9.4. Thanks, Stephen
On 1/19/14, 12:59 AM, Tom Lane wrote: >> This seems to have been broken by commit >> 8bb3c8fe5413103c30ba8d1bcb3a1f8e46bddf3d. > > Did you really test against a pre-2001 plpgsql? No, I can't compile it. It just *looks* like the behaviour was broken by that commit. Sorry for the confusion. > ISTM what you're really asking for is a compile time (not run-time) check > that the number of INTO targets matches the nominal number of output > columns of the query. Not sure how difficult that is, but it'd be more > likely to be successful than messing with the run-time behavior in > exec_move_row. Please correct me if I'm wrong, but as far as I can tell we can't do that check at compile time because we don't do parse analysis. And making that happen every time CREATE FUNCTION is called.. I don't see that happening. But making this check at run time doesn't seem too hard. How broken does the attached look? It passes all the existing regression tests, at least.. Regards, Marko Tiikkaja
Вложения
On 1/19/14, 1:50 AM, I wrote: > But making this check at run time doesn't seem too hard. How broken > does the attached look? A bit broken, now that I look at it in detail. In particular, I could see someone doing something like: CREATE TABLE lotsofcolumns(f1 int, f2 int, f3 int, ..); DECLARE f1 int; f2 int; BEGIN SELECT * INTO f1, f2 FROM lotsofcolumns; I can't say I think this is a good idea, but not sure breaking this case is worth it either. In bug #8893 there was some discussion which I interpreted to mean that we could improve this a bit by checking the number of returned columns during compile time if there's no * in the target list. Attached is a crude patch attempting to do that, which appears to be working. Any thoughts? Regards, Marko Tiikkaja
Вложения
Marko Tiikkaja <marko@joh.to> writes: > On 1/19/14, 1:50 AM, I wrote: >> But making this check at run time doesn't seem too hard. How broken >> does the attached look? > A bit broken, now that I look at it in detail. In particular, I could > see someone doing something like: > CREATE TABLE lotsofcolumns(f1 int, f2 int, f3 int, ..); > DECLARE > f1 int; > f2 int; > BEGIN > SELECT * INTO f1, f2 FROM lotsofcolumns; > I can't say I think this is a good idea, but not sure breaking this case > is worth it either. Um, I thought the whole point was to complain about that. If this isn't a mistake, how can you consistently maintain the other one is? > In bug #8893 there was some discussion which I interpreted to mean that > we could improve this a bit by checking the number of returned columns > during compile time if there's no * in the target list. Attached is a > crude patch attempting to do that, which appears to be working. Any > thoughts? Ick. I thought you wanted to do this at first execution, anyway, not in pl_gram.y. regards, tom lane
On 1/22/14, 3:48 AM, Tom Lane wrote: > Marko Tiikkaja <marko@joh.to> writes: >> SELECT * INTO f1, f2 FROM lotsofcolumns; > >> I can't say I think this is a good idea, but not sure breaking this case >> is worth it either. > > Um, I thought the whole point was to complain about that. If this isn't a > mistake, how can you consistently maintain the other one is? I'm sure you can see a difference between explicitly listing the wrong number of columns and using * to ignore trailing columns. That said, this *should* be an error according to the documentation, so it's probably not the end of the world if we break it. >> In bug #8893 there was some discussion which I interpreted to mean that >> we could improve this a bit by checking the number of returned columns >> during compile time if there's no * in the target list. Attached is a >> crude patch attempting to do that, which appears to be working. Any >> thoughts? > > Ick. I thought you wanted to do this at first execution, anyway, not in > pl_gram.y. Catching these errors at compile time would be ideal. If we also want to catch the * cases we could also check the result structure before the first execution. Regards, Marko Tiikkaja
2014/1/22 Marko Tiikkaja <marko@joh.to> > On 1/22/14, 3:48 AM, Tom Lane wrote: > > Marko Tiikkaja <marko@joh.to> writes: >> >>> SELECT * INTO f1, f2 FROM lotsofcolumns; >>> >> >> I can't say I think this is a good idea, but not sure breaking this case >>> is worth it either. >>> >> >> Um, I thought the whole point was to complain about that. If this isn't a >> mistake, how can you consistently maintain the other one is? >> > > I'm sure you can see a difference between explicitly listing the wrong > number of columns and using * to ignore trailing columns. That said, this > *should* be an error according to the documentation, so it's probably not > the end of the world if we break it. > > > In bug #8893 there was some discussion which I interpreted to mean that >>> we could improve this a bit by checking the number of returned columns >>> during compile time if there's no * in the target list. Attached is a >>> crude patch attempting to do that, which appears to be working. Any >>> thoughts? >>> >> >> Ick. I thought you wanted to do this at first execution, anyway, not in >> pl_gram.y. >> > > Catching these errors at compile time would be ideal. If we also want to > catch the * cases we could also check the result structure before the first > execution. > It is exactly it what is done by plpgsql_check_function - without introduction new dependencies and without possible performance impacts. Regards Pavel > > > Regards, > Marko Tiikkaja > > > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-bugs >
So, should I clean up either (or both?) of these patches? Personally I think we ought to both check for the easy cases during compilation and for the harder cases during run-time. Regards, Marko Tiikkaja
On 2/2/14, 5:48 PM, I wrote: > So, should I clean up either (or both?) of these patches? Well, I went ahead and did that anyway. Attached is the cleaned up version of the one that looks at the raw parse trees and rejects obviously buggy code. I'll start working on the run-time version soonish, unless someone wants to shoot me down before that. This might be too late for 9.4 (though one could argue that this is a bug fix), but that's all right. Regards, Marko Tiikkaja