Re: UPDATE runs slow in a transaction
От | Viktor Rosenfeld |
---|---|
Тема | Re: UPDATE runs slow in a transaction |
Дата | |
Msg-id | 06F6D3BB-BAA3-4C7A-9B3C-99039F55E34B@informatik.hu-berlin.de обсуждение исходный текст |
Ответ на | Re: UPDATE runs slow in a transaction ("Pavel Stehule" <pavel.stehule@gmail.com>) |
Ответы |
Re: UPDATE runs slow in a transaction
("Pavel Stehule" <pavel.stehule@gmail.com>)
Re: UPDATE runs slow in a transaction ("Pavel Stehule" <pavel.stehule@gmail.com>) |
Список | pgsql-general |
Hi Pavel, thanks for the advice on how to uncorrelate the query. I must admit I didn't know about the UPDATE ... SET ... FROM ... syntax. Now the UPDATE runs in an acceptable time inside a transaction, however the query plan still differs when I run it outside. Outside a transaction: QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=0.00..11481.84 rows=65756 width=1087) (actual time=0.151..323.856 rows=65756 loops=1) Merge Cond: (_struct.id = tmp.id) -> Index Scan using "_PK_struct" on _struct (cost=0.00..7084.50 rows=98149 width=1083) (actual time=0.028..137.463 rows=32300 loops=1) -> Index Scan using idx_tmp__id on tmp (cost=0.00..3330.02 rows=65756 width=12) (actual time=0.115..58.601 rows=65756 loops=1) Total runtime: 2905.580 ms This looks like an optimal plan and average run time over 5 runs is 2660 ms. Inside a transaction: QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=7427.63..16159.84 rows=65756 width=1087) (actual time=315.570..574.075 rows=65756 loops=1) Merge Cond: (_struct.id = tmp.id) -> Index Scan using "_PK_struct" on _struct (cost=0.00..7500.50 rows=98149 width=1083) (actual time=0.020..129.915 rows=32300 loops=1) -> Sort (cost=7427.63..7592.02 rows=65756 width=12) (actual time=315.538..333.359 rows=65756 loops=1) Sort Key: tmp.id Sort Method: quicksort Memory: 4617kB -> Seq Scan on tmp (cost=0.00..2165.56 rows=65756 width=12) (actual time=10.070..37.411 rows=65756 loops=1) Trigger for constraint _FK_struct_2_collection: time=1105.892 calls=32300 Trigger for constraint _FK_struct_2_text: time=1468.009 calls=32300 Total runtime: 4955.784 ms Again, the planner does not use the index on tmp (id) although I put an "ANALYZE tmp" right before the UPDATE. Average run time over 5 runs is 4610 ms. Thanks, Viktor Am 16.07.2008 um 15:33 schrieb Pavel Stehule: > Hello > > my advice is little bit offtopic, I am sorry. Why you use correlated > subquery? Your update statement should be > > update _struct set left_token = tmp.left_token from tmp where > _struct.id = tmp.id; > > send output of explain analyze statement, please. etc > explain analyze UPDATE _struct SET left_token = (SELECT DISTINCT > left_token FROM tmp WHERE _struct.id = tmp.id) > > regards > Pavel Stehule > > > > 2008/7/16 Viktor Rosenfeld <rosenfel@informatik.hu-berlin.de>: >> Hi Tom, >> Postgres is indeed selecting a bad plan. Turns out that the index >> I created >> to speed up the UPDATE isn't used inside a transaction block. >> Here's the plan for "UPDATE _struct SET left_token = (SELECT DISTINCT >> left_token FROM tmp WHERE _struct.id = tmp.id)" outside of a >> transaction: >> QUERY PLAN >> >> ------------------------------------------------------------------------------------------- >> Seq Scan on _struct (cost=0.00..826643.13 rows=98149 width=1083) >> SubPlan >> -> Unique (cost=8.38..8.40 rows=1 width=4) >> -> Sort (cost=8.38..8.39 rows=4 width=4) >> Sort Key: tmp.left_token >> -> Index Scan using idx_tmp__id on tmp >> (cost=0.00..8.34 >> rows=4 width=4) >> Index Cond: ($0 = id) >> And inside a transaction: >> QUERY PLAN >> >> --------------------------------------------------------------------------------------- >> Seq Scan on _struct (cost=100000000.00..3230175260746.00 rows=32300 >> width=70) >> SubPlan >> -> Unique (cost=100002329.99..100002330.01 rows=1 width=4) >> -> Sort (cost=100002329.99..100002330.00 rows=4 width=4) >> Sort Key: tmp.left_token >> -> Seq Scan on tmp (cost=100000000.00..100002329.95 >> rows=4 width=4) >> Filter: ($0 = id) >> The high cost of the seqscan on tmp are because I tried disabling >> sequential >> scans inside the transaction to force an index scan, which Postgres >> decided >> to ignore in this case. >> Putting an ANALYZE tmp and ANALYZE _struct right before the UPDATE >> didn't >> help either. (Also shouldn't the creation of an index on tmp (id) >> take care >> of analyzing that column?) >> Thanks, >> Viktor >> Am 14.07.2008 um 20:52 schrieb Tom Lane: >> >> Viktor Rosenfeld <rosenfel@informatik.hu-berlin.de> writes: >> >> the script below runs very fast when executed alone. But when I call >> >> it from within a transaction block it's so slow that I have to abort >> >> it after a while. Specifically the second-to-last UPDATE seems to >> >> take forever within a transaction while it completes in about 3 >> >> seconds outside a transaction. >> >> Since the table you're working on was just created in the same >> transaction, there's been no opportunity for autovacuum to run an >> ANALYZE on it; that's probably preventing selection of a good plan. >> Try throwing in an "ANALYZE tmp" after you load the table. >> >> regards, tom lane >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general >> >>
В списке pgsql-general по дате отправления: