Re: how to efficiently update tuple in many-to-many relationship?
От | Merlin Moncure |
---|---|
Тема | Re: how to efficiently update tuple in many-to-many relationship? |
Дата | |
Msg-id | b42b73150704091408s7fc6b644sf5f732f4e49409e1@mail.gmail.com обсуждение исходный текст |
Ответ на | how to efficiently update tuple in many-to-many relationship? (Drew Wilson <drewmwilson@gmail.com>) |
Список | pgsql-performance |
On 4/9/07, Drew Wilson <drewmwilson@gmail.com> wrote: > I have 2 tables (A,B) joined in a many-to-many relationship via a > join table ("membership"), where updating table A based on table B > takes a very long time. > > Tables A and B have oid primary keys (a_id and b_id respectively). > The join table, "membership", has its own oid primary key > (membership_id), as well as foreign keys "a_id" and "b_id". > > A SELECT query across all 3 tables takes 12 seconds. > "SELECT count(*) FROM a JOIN membership USING(a_id) JOIN b USING > (b_id) WHERE b.is_public = true" > > But a simple UPDATE using the same SELECT query takes 30 minutes to > an hour. > "UPDATE A set is_public=true WHERE a_id IN (SELECT count(*) FROM a > JOIN membership USING(a_id) JOIN b USING(b_id) WHERE b.is_public = > true)". > > What am I doing wrong here? I'm not sure how to diagnose this further. > > Here's the output from explain: > db=# EXPLAIN SELECT a_id FROM a JOIN membership USING(a_id) JOIN b > USING(b_id) WHERE b.is_public = true; > ------------------------------------------------------------------------ > ----------------------------------------- > Hash Join (cost=167154.78..173749.48 rows=51345 width=4) > Hash Cond: (a.a_id = membership.a_id) > -> Function Scan on a (cost=0.00..12.50 rows=1000 width=4) > -> Hash (cost=144406.48..144406.48 rows=1819864 width=4) > -> Hash Join (cost=417.91..144406.48 rows=1819864 width=4) > Hash Cond: (membership.b_id = b.b_id) > -> Seq Scan on membership (cost=0.00..83623.83 > rows=4818983 width=8) > -> Hash (cost=348.52..348.52 rows=5551 width=4) > -> Index Scan using b_is_public on b > (cost=0.00..348.52 rows=5551 width=4) > Index Cond: (is_public = true) > Filter: is_public > (11 rows) > > > db=# EXPLAIN UPDATE a SET is_public = true WHERE a_id IN > ( SELECT a_id FROM a JOIN membership USING(a_id) JOIN b USING > (b_id) WHERE b.is_public = true); > ------------------------------------------------------------------------ > ----------------------------------------- > hash in join (cost=281680.17..370835.63 rows=1819864 width=90) > hash cond: (public.a.a_id = public.a.a_id) > -> seq scan on a (cost=0.00..47362.09 rows=2097309 width=90) > -> hash (cost=258931.87..258931.87 rows=1819864 width=8) > -> hash join (cost=73996.36..258931.87 rows=1819864 width=8) > hash cond: (membership.a_id = public.a.a_id) > -> hash join (cost=417.91..144406.48 rows=1819864 > width=4) > hash cond: (membership.b_id = b.b_id) > -> seq scan on membership > (cost=0.00..83623.83 rows=4818983 width=8) > -> hash (cost=348.52..348.52 rows=5551 width=4) > -> index scan using > loc_submission_is_public on b (cost=0.00..348.52 rows=5551 width=4) > index cond: (is_public = true) > filter: is_public > -> hash (cost=47362.09..47362.09 rows=2097309 width=4) > -> seq scan on a (cost=0.00..47362.09 > rows=2097309 width=4) why don't you rewrite your update statement to use joins (joins > where exists > where in)? WHERE a_id IN (SELECT count(*) FROM a the above looks wrong maybe? merlin
В списке pgsql-performance по дате отправления:
Предыдущее
От: Tom LaneДата:
Сообщение: Re: join to view over custom aggregate seems like it should be faster
Следующее
От: "Merlin Moncure"Дата:
Сообщение: Re: join to view over custom aggregate seems like it should be faster