Обсуждение: Bogus attribute-number range checks in spi.c
The various exported functions in spi.c that take an attribute number are not consistent about how they range-check it --- some test against tupdesc->natts and some against HeapTupleHeaderGetNatts(tuple). (Look for references to SPI_ERROR_NOATTRIBUTE to see what I'm talking about.) I'm thinking that the former is correct and the latter is simply wrong. There are two possible cases: * tupdesc has more columns than the tuple does. This is possible after ALTER TABLE ADD COLUMN, for example. The correct interpretation in this situation is that the extra columns exist but are NULL. Throwing an error is not correct. The code perhaps thinks it's protecting heap_getattr against an out-of-range attnum, but heap_getattr is supposed to take care of itself that way. * tupdesc has fewer columns than the tuple does. I think this can happen in certain inheritance cases --- we might be inspecting a child tuple using a parent's tupdesc. Whether it's possible or not, it's simply wrong for the code to use the larger number, as that would result in accessing off the end of the tupdesc's attribute array. So I think this needs to be made consistent, and furthermore that it's a backpatchable bug fix. Comments? regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > * tupdesc has more columns than the tuple does. This is possible after > ALTER TABLE ADD COLUMN, for example. The correct interpretation in > this situation is that the extra columns exist but are NULL. Throwing > an error is not correct. The code perhaps thinks it's protecting > heap_getattr against an out-of-range attnum, but heap_getattr is > supposed to take care of itself that way. Shouldn't this be failing then? If something like this does fail then definitely back-patchable++. postgres=# create table t (i integer); CREATE TABLE postgres=# insert into t values (0); INSERT 0 1 postgres=# alter table t add j integer; ALTER TABLE postgres=# create function x() returns integer as 'declare v integer; begin select j from t where i = 0 into v; return v;end' language plpgsql; CREATE FUNCTION postgres=# select x();x --- (1 row) > * tupdesc has fewer columns than the tuple does. I think this can > happen in certain inheritance cases --- we might be inspecting a child > tuple using a parent's tupdesc. Whether it's possible or not, it's > simply wrong for the code to use the larger number, as that would result > in accessing off the end of the tupdesc's attribute array. There are some comments in the source about cases like this but I've never understood how it can happen. Children are supposed to have a superset of the parent's columns. Does it depend on the parent having had dropped columns but not the child? But then wouldn't the Append node have had to do some magic to map the columns correctly meaning you wouldn't be looking at the physical tuple any more? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Gregory Stark <stark@enterprisedb.com> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> * tupdesc has more columns than the tuple does. This is possible after >> ALTER TABLE ADD COLUMN, for example. The correct interpretation in >> this situation is that the extra columns exist but are NULL. Throwing >> an error is not correct. > Shouldn't this be failing then? If something like this does fail then > definitely back-patchable++. [ pokes around ... ] The difference between correct and incorrect behavior here is that it is correct for SPI_getvalue and SPI_getbinval to return NULL for added columns, but they are incorrect to also set SPI_result to SPI_ERROR_NOATTRIBUTE. However, so far as I can see none of the callers in our CVS bother to check SPI_result :-(. So there is no visible failure in any test case using our code. You'd need a third-party module that was actually paying attention to the documented error-reporting convention. Maybe that means it's not worth back-patching, but it still looks like a bug to me. regards, tom lane
Gregory Stark <stark@enterprisedb.com> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> * tupdesc has fewer columns than the tuple does. I think this can >> happen in certain inheritance cases --- we might be inspecting a child >> tuple using a parent's tupdesc. > There are some comments in the source about cases like this but I've never > understood how it can happen. Children are supposed to have a superset of the > parent's columns. Does it depend on the parent having had dropped columns but > not the child? But then wouldn't the Append node have had to do some magic to > map the columns correctly meaning you wouldn't be looking at the physical > tuple any more? Sorry, I missed this part of your response. I dug around a bit and found that the comments suggesting this can happen date back to 7.1 days, and arose from situations wherein a function declared to take a table rowtype "foo" gets invoked on a tuple from an inheritance child "foo_child". Now, while ignoring added columns in the child is certainly *necessary* in this case, it is not *sufficient*. The child's columns corresponding to the parent's might have different column numbers. So really you need to map the child's tuples through a conversion operation. We do have the infrastructure for that now, which we did not have in 7.1 --- look at ConvertRowtypeExpr and associated code. Bottom line is that the concern about tupdesc natts < tuple natts might be obsolete. Nonetheless, it's wrong or at least uselessly fragile for the code to be assuming that the case cannot happen; and if it were to happen, the correct limit to use is the tupdesc's not the tuple's. The tupdesc always defines what the calling code is expecting to see. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Gregory Stark <stark@enterprisedb.com> writes: >> Tom Lane <tgl@sss.pgh.pa.us> writes: >>> * tupdesc has more columns than the tuple does. This is possible after >>> ALTER TABLE ADD COLUMN, for example. The correct interpretation in >>> this situation is that the extra columns exist but are NULL. Throwing >>> an error is not correct. > >> Shouldn't this be failing then? If something like this does fail then >> definitely back-patchable++. > > [ pokes around ... ] The difference between correct and incorrect > behavior here is that it is correct for SPI_getvalue and SPI_getbinval > to return NULL for added columns, but they are incorrect to also set > SPI_result to SPI_ERROR_NOATTRIBUTE. However, so far as I can see > none of the callers in our CVS bother to check SPI_result :-(. So there > is no visible failure in any test case using our code. You'd need a > third-party module that was actually paying attention to the documented > error-reporting convention. I do see several checks against SPI_ERROR_NOATTRIBUTE. I'm not sure what context they're in though. pl_exec.c:3606 and pl_exec.c:3940 I also see tsearch (and tsearch2 and fti) and checking for that error. And several of the EDB oracle compatibility modules. It seems likely there are other third party modules which check for this as well. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB'sPostgreSQL training!
Gregory Stark <stark@enterprisedb.com> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> [ pokes around ... ] The difference between correct and incorrect >> behavior here is that it is correct for SPI_getvalue and SPI_getbinval >> to return NULL for added columns, but they are incorrect to also set >> SPI_result to SPI_ERROR_NOATTRIBUTE. However, so far as I can see >> none of the callers in our CVS bother to check SPI_result :-(. > I do see several checks against SPI_ERROR_NOATTRIBUTE. I'm not sure what > context they're in though. pl_exec.c:3606 and pl_exec.c:3940 Those all seem to be checking SPI_fnumber calls. The calls of SPI_getvalue and SPI_getbinval just assume they cannot get a failure... regards, tom lane