Re: [SQL] Update problem I cannot work out
От | Tom Lane |
---|---|
Тема | Re: [SQL] Update problem I cannot work out |
Дата | |
Msg-id | 13436.925663017@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: [SQL] Update problem I cannot work out (M Simms <grim@argh.demon.co.uk>) |
Список | pgsql-sql |
M Simms <grim@argh.demon.co.uk> writes: >> Create function new_function(value_data_type) >> 'Select sum(t1.var1) from t1 where t1.var2 = $1' LANGUAGE >> 'sql'; >> >> update t2 set var1=t2.var1+new_function(t2.var2); > My reply to this is the question, that as each function contains a > select, is this technically a single pass. I think you are right: this way will repeat the sub-select for each tuple in t2. Now if you have an index on t1.var2, the system could exploit it to pull out just the desired tuples in each sub-select, so the total work wouldn't really be O(M*N). But it still looks slow. If I knew t1 would be large, I think I'd do it with a temporary table: SELECT var2, sum(var1) FROM t1 GROUP BY var2 INTO temptable;UPDATE t2 SET var1 = t2.var1 + temptable.sum WHERE t2.var2= temptable.var2;DROP TABLE temptable; (In 6.5 you can use a temporary table to avoid needing explicit DROP, but that's just a convenience.) Perhaps there's a way to accomplish the same thing with a subselect, rather than explicitly creating a table, but I don't know how. This is more like O(N log N) in the size of each table, assuming that the join is done with mergejoin. Actually, if there's not a huge number of distinct values of var2, then temptable should be small enough for a hashjoin, which'd mean that there's no need to sort t2 at all. BTW, I think your original try update t2 set var1=t2.var1+t1.var1 where t2.var2=t1.var1; should have worked; it's probably a bug that it doesn't work. IMHO the semantics ought to be that the addition is repeated for each t1 tuple that matches the current t2 tuple ... but I haven't read the SQL spec. regards, tom lane
В списке pgsql-sql по дате отправления: