Re: Extremely Slow Cascade Delete Operation
| От | Craig Ringer | 
|---|---|
| Тема | Re: Extremely Slow Cascade Delete Operation | 
| Дата | |
| Msg-id | 4B592EC6.6080709@postnewspapers.com.au обсуждение исходный текст | 
| Ответ на | Re: Extremely Slow Cascade Delete Operation (Yan Cheng Cheok <yccheok@yahoo.com>) | 
| Ответы | Re: Extremely Slow Cascade Delete Operation | 
| Список | pgsql-general | 
Yan Cheng Cheok wrote: > I try to create a following simple scenario, to demonstrate cascade delete is rather slow in PostgreSQL. > > Can anyone help me to confirm? Is this my only machine problem, or every PostgreSQL users problem? > > I create 1 lot. > every lot is having 10000 unit > every unit is having 100 measurement. 101 measurements per unit by the looks. But it doesn't much matter. test=> CREATE INDEX fk_unit_id_idx ON measurement (fk_unit_id); CREATE INDEX Time: 3072.635 ms Now suddenly everything is much faster: test=> delete from lot; DELETE 1 Time: 8066.140 ms Before that index creation, every deletion of a unit required a seqscan of `measurement' to find referenced measurements. At 200ms apiece, it would've taken about half an hour to `delete from lot' on my machine, and smaller deletes took a proportional amount of time (ie 20s for 100 units). Now it takes 8 seconds to delete the lot. You just forgot to create an index on one of the foreign key relationships that you do a cascade delete on. BTW, Pg doesn't force you to do this because sometimes you'd prefer to wait. For example, you might do the deletes very rarely, and not way to pay the cost of maintaining the index the rest of the time. (What I was personally surprised by is that it's no faster to DELETE FROM measurement; directly than to delete via LOT. I would've expected a seqscan delete of the table to be MUCH faster than all the index-hopping required to delete via lot. I guess the reason there's no real difference is because the whole dataset fits in cache, so there's no seek penalty. ) AFAIK, Pg isn't clever enough to batch foreign key deletes together and then plan them as a single operation. That means it can't use something other than a bunch of little index lookups where doing a sequential scan or a hash join might be faster. Adding this facility would certainly be an "interesting" project. Most of the time, though, you get on fine using index-based delete cascading, and you can generally pre-delete rows using a join on those rare occasions it is a problem. -- Craig Ringer
В списке pgsql-general по дате отправления: