Re: UPDATE runs slow in a transaction
От | Viktor Rosenfeld |
---|---|
Тема | Re: UPDATE runs slow in a transaction |
Дата | |
Msg-id | A7DBBEA9-4E49-4290-B63D-549D0FA7E986@informatik.hu-berlin.de обсуждение исходный текст |
Ответ на | Re: UPDATE runs slow in a transaction ("Pavel Stehule" <pavel.stehule@gmail.com>) |
Список | pgsql-general |
Yes, I either run begin; ... lots of other stuff in a script explain analyze update ... (called from the same script) rollback; or ... lots of other stuff in a script (same as above) explain analyze update ... Cheers, Viktor Am 16.07.2008 um 16:58 schrieb Pavel Stehule: > this is strange. > > what means "run under transaction"? > > you did exactly statements in psql console: > begin; > explain analyze select ... > commit? > > regards > Pavel Stehule > > 2008/7/16 Viktor Rosenfeld <rosenfel@informatik.hu-berlin.de>: >> Hi Pavel, >> >> thanks for the advice on how to uncorrelate the query. I must >> admit I >> didn't know about the UPDATE ... SET ... FROM ... syntax. >> >> Now the UPDATE runs in an acceptable time inside a transaction, >> however the >> query plan still differs when I run it outside. >> >> Outside a transaction: >> >> >> QUERY PLAN >> -------------------------------------------------------------------------------------------------------------------------------------------- >> Merge Join (cost=0.00..11481.84 rows=65756 width=1087) (actual >> time=0.151..323.856 rows=65756 loops=1) >> Merge Cond: (_struct.id = tmp.id) >> -> Index Scan using "_PK_struct" on _struct (cost=0.00..7084.50 >> rows=98149 width=1083) (actual time=0.028..137.463 rows=32300 >> loops=1) >> -> Index Scan using idx_tmp__id on tmp (cost=0.00..3330.02 >> rows=65756 >> width=12) (actual time=0.115..58.601 rows=65756 loops=1) >> Total runtime: 2905.580 ms >> >> This looks like an optimal plan and average run time over 5 runs is >> 2660 ms. >> >> Inside a transaction: >> >> >> QUERY PLAN >> -------------------------------------------------------------------------------------------------------------------------------------------- >> Merge Join (cost=7427.63..16159.84 rows=65756 width=1087) (actual >> time=315.570..574.075 rows=65756 loops=1) >> Merge Cond: (_struct.id = tmp.id) >> -> Index Scan using "_PK_struct" on _struct (cost=0.00..7500.50 >> rows=98149 width=1083) (actual time=0.020..129.915 rows=32300 >> loops=1) >> -> Sort (cost=7427.63..7592.02 rows=65756 width=12) (actual >> time=315.538..333.359 rows=65756 loops=1) >> Sort Key: tmp.id >> Sort Method: quicksort Memory: 4617kB >> -> Seq Scan on tmp (cost=0.00..2165.56 rows=65756 width=12) >> (actual time=10.070..37.411 rows=65756 loops=1) >> Trigger for constraint _FK_struct_2_collection: time=1105.892 >> calls=32300 >> Trigger for constraint _FK_struct_2_text: time=1468.009 calls=32300 >> Total runtime: 4955.784 ms >> >> Again, the planner does not use the index on tmp (id) although I >> put an >> "ANALYZE tmp" right before the UPDATE. Average run time over 5 >> runs is 4610 >> ms. >> >> Thanks, >> Viktor >> >> Am 16.07.2008 um 15:33 schrieb Pavel Stehule: >> >>> Hello >>> >>> my advice is little bit offtopic, I am sorry. Why you use correlated >>> subquery? Your update statement should be >>> >>> update _struct set left_token = tmp.left_token from tmp where >>> _struct.id = tmp.id; >>> >>> send output of explain analyze statement, please. etc >>> explain analyze UPDATE _struct SET left_token = (SELECT DISTINCT >>> left_token FROM tmp WHERE _struct.id = tmp.id) >>> >>> regards >>> Pavel Stehule >>> >>> >>> >>> 2008/7/16 Viktor Rosenfeld <rosenfel@informatik.hu-berlin.de>: >>>> >>>> Hi Tom, >>>> Postgres is indeed selecting a bad plan. Turns out that the >>>> index I >>>> created >>>> to speed up the UPDATE isn't used inside a transaction block. >>>> Here's the plan for "UPDATE _struct SET left_token = (SELECT >>>> DISTINCT >>>> left_token FROM tmp WHERE _struct.id = tmp.id)" outside of a >>>> transaction: >>>> QUERY PLAN >>>> >>>> >>>> ------------------------------------------------------------------------------------------- >>>> Seq Scan on _struct (cost=0.00..826643.13 rows=98149 width=1083) >>>> SubPlan >>>> -> Unique (cost=8.38..8.40 rows=1 width=4) >>>> -> Sort (cost=8.38..8.39 rows=4 width=4) >>>> Sort Key: tmp.left_token >>>> -> Index Scan using idx_tmp__id on tmp >>>> (cost=0.00..8.34 >>>> rows=4 width=4) >>>> Index Cond: ($0 = id) >>>> And inside a transaction: >>>> QUERY PLAN >>>> >>>> >>>> --------------------------------------------------------------------------------------- >>>> Seq Scan on _struct (cost=100000000.00..3230175260746.00 >>>> rows=32300 >>>> width=70) >>>> SubPlan >>>> -> Unique (cost=100002329.99..100002330.01 rows=1 width=4) >>>> -> Sort (cost=100002329.99..100002330.00 rows=4 width=4) >>>> Sort Key: tmp.left_token >>>> -> Seq Scan on tmp (cost=100000000.00..100002329.95 >>>> rows=4 width=4) >>>> Filter: ($0 = id) >>>> The high cost of the seqscan on tmp are because I tried disabling >>>> sequential >>>> scans inside the transaction to force an index scan, which Postgres >>>> decided >>>> to ignore in this case. >>>> Putting an ANALYZE tmp and ANALYZE _struct right before the >>>> UPDATE didn't >>>> help either. (Also shouldn't the creation of an index on tmp >>>> (id) take >>>> care >>>> of analyzing that column?) >>>> Thanks, >>>> Viktor >>>> Am 14.07.2008 um 20:52 schrieb Tom Lane: >>>> >>>> Viktor Rosenfeld <rosenfel@informatik.hu-berlin.de> writes: >>>> >>>> the script below runs very fast when executed alone. But when I >>>> call >>>> >>>> it from within a transaction block it's so slow that I have to >>>> abort >>>> >>>> it after a while. Specifically the second-to-last UPDATE seems to >>>> >>>> take forever within a transaction while it completes in about 3 >>>> >>>> seconds outside a transaction. >>>> >>>> Since the table you're working on was just created in the same >>>> transaction, there's been no opportunity for autovacuum to run an >>>> ANALYZE on it; that's probably preventing selection of a good plan. >>>> Try throwing in an "ANALYZE tmp" after you load the table. >>>> >>>> regards, tom lane >>>> >>>> -- >>>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >>>> To make changes to your subscription: >>>> http://www.postgresql.org/mailpref/pgsql-general >>>> >>>> >> >>
В списке pgsql-general по дате отправления: