Re: Very long deletion time on a 200 GB database
От | Shaun Thomas |
---|---|
Тема | Re: Very long deletion time on a 200 GB database |
Дата | |
Msg-id | 4F466B4B.4080702@peak6.com обсуждение исходный текст |
Ответ на | Very long deletion time on a 200 GB database ("Reuven M. Lerner" <reuven@lerner.co.il>) |
Список | pgsql-performance |
On 02/23/2012 02:39 AM, Reuven M. Lerner wrote: > I should note that my primary concern is available RAM. The database, as > I wrote, is about 200 GB in size, and PostgreSQL is reporting (according > to Windows) use of about 5 GB RAM, plus another 25 GB of virtual memory. O_o That... that would probably swap just constantly. No end. Just swap all day long. But maybe not. Please tell us the values for these settings: * shared_buffers * work_mem * maintenance_work_mem * checkpoint_segments * checkpoint_timeout It also wouldn't be a bad idea to see how many concurrent connections there are, because that may determine how much memory all the backends are consuming. In any case, if it's actually using 25GB of virtual memory, any command you run that doesn't happen to be in cache, will just immediately join a giant logjam. > I've told the Windows folks on this project that virtual memory kills a > database, and that it shouldn't surprise us to have horrible performance > if the database and operating system are both transferring massive > amounts of data back and forth. But there doesn't seem to be a good way > to handle this You kinda can, by checking those settings and sanitizing them. If they're out of line, or too large, they'll create the need for more virtual memory. Having the virtual memory there isn't necessarily bad, but using it is. > DELETE FROM B > WHERE r_id IN (SELECT R.id > FROM R, B > WHERE r.end_date < (NOW() - (interval '1 day' * 30)) > AND r.id = b.r_id Just to kinda help you out syntactically, have you ever tried a DELETE FROM ... USING? You can also collapse your interval notation. DELETE FROM B USING R WHERE R.id = B.r_id AND R.end_date < CURRENT_DATE - INTERVAL '30 days'; But besides that, the other advise you've received is sound. Since your select->truncate->insert attempt was also slow, I suspect you're having problems with foreign key checks, and updating the index trees. Maintaining an existing index can be multiples slower than filling an empty table and creating the indexes afterwards. So far as your foreign keys, if any of the child tables don't have an index on the referring column, your delete performance will be atrocious. You also need to make sure the types of the columns are identical. Even a numeric/int difference will be enough to render an index unusable. We have a 100GB *table* with almost 200M rows and even deleting from that in many of our archive tests doesn't take anywhere near 9 hours. But I *have* seen a delete take that long when we had a numeric primary key, and an integer foreign key. Even a handful of records can cause a nested loop sequence scan, which will vastly inflate delete time. -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604 312-444-8534 sthomas@peak6.com ______________________________________________ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
В списке pgsql-performance по дате отправления: