Re: 8.0.0beta5 FailedAssertion (Crash) when casting composite types
От | Tom Lane |
---|---|
Тема | Re: 8.0.0beta5 FailedAssertion (Crash) when casting composite types |
Дата | |
Msg-id | 19420.1102286530@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | 8.0.0beta5 FailedAssertion (Crash) when casting composite types (kris.shannon@gmail.com) |
Ответы |
Re: 8.0.0beta5 FailedAssertion (Crash) when casting composite types
|
Список | pgsql-hackers |
kris.shannon@gmail.com writes: > template2=# CREATE TABLE base (i integer); > template2=# CREATE TABLE derived () INHERITS (base); > template2=# INSERT INTO derived (i) VALUES (0); > template2=# SELECT derived::base FROM derived; > TRAP: FailedAssertion The cause of this failure is that parse_coerce.c thinks that a child table's rowtype is binary-compatible with its parent's rowtype: if (typeInheritsFrom(inputTypeId, targetTypeId)) { /* * Input class type is a subclass of target, so nothingto do --- * except relabel the type. This is binary compatibility for * complex types. */ return (Node *) makeRelabelType((Expr *) node, targetTypeId, -1, cformat); } As of 8.0 this isn't really true anymore, because a tuple being used as a rowtype Datum is supposed to carry its rowtype OID in a field of the tuple header. A proper coercion would require replacing the type OID in the header. In a larger sense, however, this code has been wrong since day one. A child table must have all the same columns its parent does, but it need not have them in the same column positions. This can happen after ALTER TABLE ADD COLUMN on the parent, and it can happen even without ALTER TABLE by use of multiple inheritance --- columns inherited from a non-first parent are very unlikely to be in the same positions as they are in that parent. If the columns aren't in the same positions then the rowtypes certainly aren't binary-compatible. So at least some flavors of the bug go back a long time, probably to Berkeley days. For instance, this dumps core in all versions back to at least 7.1 (though you need to adjust the syntax before 7.4): regression=# create table p1(ff1 int); CREATE TABLE regression=# create table p2(f1 text); CREATE TABLE regression=# create function p2text(p2) returns text as 'select $1.f1' language sql; CREATE FUNCTION regression=# create table c1(f3 int) inherits(p1,p2); CREATE TABLE regression=# insert into c1 values(123456789,'hi', 42); INSERT 589036 1 regression=# select p2text(c1.*) from c1; server closed the connection unexpectedly (BTW, "select p2text(p2.*) from p2" works, at least in the last couple versions, because the planner understands that it must convert child rowtypes to match the parent. But that path isn't taken in Kris' example.) The Really Clean And Correct fix to this, IMHO, would be to invent a new expression node type that represents coercing a rowtype expression to a different rowtype. Execution of this node would disassemble and reassemble the tuple Datum, using code not too much different from execJunk.c, to produce the right column order and the right rowtype OID label. However that seems much too large a change for post-RC. (You could also argue that it requires an initdb, though I'd take the position that it doesn't because there are no working databases that would be affected.) A less clean but much more localized fix would be to cause ExecEvalExpr to do that work when it finds a RelabelType node whose output type is a composite type. (We could arrange to lookup the pg_type entry only once per query, during ExecInitExpr, so the performance hit on normal uses of RelabelType wouldn't be too bad.) A rough estimate is that this would require about 100 lines of new code in execQual.c, much of which could be adapted from other places. Another possibility is to #ifdef out the code in parse_coerce.c that thinks it can coerce one rowtype to another this way, causing attempts to do rowtype coercions to fail with a "can't convert type" kind of error. I've tried this and it does not seem to break any of the regression tests, but I'm afraid that it will break cases that people are depending on in practice. Still, given that we are in RC, maybe this is the only acceptable answer for 8.0. If I were called on to close the hole in 7.4 and before, I'd say this is the only reasonable quick-fix for those branches, for sure. Choice #4 is to do nothing, figuring that a bug that has lasted this long can wait another release cycle to be fixed. But in today's security atmosphere I doubt that will be considered acceptable. I definitely intend to do the "clean" fix in 8.1 after we branch, but I am unsure what to do in 8.0, or in the back branches. Comments? regards, tom lane
В списке pgsql-hackers по дате отправления: