Обсуждение: Big delete on big table... now what?
Hi list. I have a table with over 30 million rows. Performance was dropping steadily so I moved old data not needed online to an historic table. Now the table has about 14 million rows. I don't need the disk space returned to the OS but I do need to improve performance. Will a plain vacuum do or is a vacuum full necessary? ¿Would a vacuum full improve performance at all? Thanks for your hindsight. Regards, Fernando.
>>> "Fernando Hevia" <fhevia@ip-tel.com.ar> wrote: > I have a table with over 30 million rows. Performance was dropping steadily > so I moved old data not needed online to an historic table. Now the table > has about 14 million rows. I don't need the disk space returned to the OS > but I do need to improve performance. Will a plain vacuum do or is a vacuum > full necessary? > *Would a vacuum full improve performance at all? If this database can be out of production for long enough to run it (possibly a few hours, depending on hardware, configuration, table width, indexes) your best option might be to CLUSTER and ANALYZE the table. It gets more complicated if you can't tolerate down-time. -Kevin
"Fernando Hevia" <fhevia@ip-tel.com.ar> wrote: > > Hi list. > > I have a table with over 30 million rows. Performance was dropping steadily > so I moved old data not needed online to an historic table. Now the table > has about 14 million rows. I don't need the disk space returned to the OS > but I do need to improve performance. Will a plain vacuum do or is a vacuum > full necessary? > ¿Would a vacuum full improve performance at all? If you can afford the downtime on that table, cluster would be best. If not, do the normal vacuum and analyze. This is unlikely to improve the performance much (although it may shrink the table _some_) but regular vacuum will keep performance from getting any worse. You can also reindex pretty safely. Any queries that run during the reindex will just have to do so without the indexes. Longer term, if you remove smaller groups of rows more frequently, you'll probably be able to maintain performance and table bloat at a reasonable level with normal vacuum. -- Bill Moran Collaborative Fusion Inc. wmoran@collaborativefusion.com Phone: 412-422-3463x4023
"Bill Moran" <wmoran@collaborativefusion.com> writes: > "Fernando Hevia" <fhevia@ip-tel.com.ar> wrote: >> >> Hi list. >> >> I have a table with over 30 million rows. Performance was dropping steadily >> so I moved old data not needed online to an historic table. Now the table >> has about 14 million rows. I don't need the disk space returned to the OS >> but I do need to improve performance. Will a plain vacuum do or is a vacuum >> full necessary? >> ¿Would a vacuum full improve performance at all? > > If you can afford the downtime on that table, cluster would be best. > > If not, do the normal vacuum and analyze. This is unlikely to improve > the performance much (although it may shrink the table _some_) but > regular vacuum will keep performance from getting any worse. Note that CLUSTER requires enough space to store the new and the old copies of the table simultaneously. That's the main reason for VACUUM FULL to still exist. There is also the option of doing something like (assuming id is already an integer -- ie this doesn't actually change the data): ALTER TABLE x ALTER id TYPE integer USING id; which will rewrite the whole table. This is effectively the same as CLUSTER except it doesn't order the table according to an index. It will still require enough space to hold two copies of the table but it will be significantly faster. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
> Gregory Stark <start@enterprisedb.com> writes: > > "Bill Moran" <wmoran@collaborativefusion.com> writes: > > > "Fernando Hevia" <fhevia@ip-tel.com.ar> wrote: > >> Hi list. > >> I have a table with over 30 million rows. Performance was dropping > >> steadily so I moved old data not needed online to an > historic table. > >> Now the table has about 14 million rows. I don't need the > disk space > >> returned to the OS but I do need to improve performance. > Will a plain > >> vacuum do or is a vacuum full necessary? > >> ¿Would a vacuum full improve performance at all? > > > > If you can afford the downtime on that table, cluster would be best. > > > > If not, do the normal vacuum and analyze. This is unlikely > to improve > > the performance much (although it may shrink the table _some_) but > > regular vacuum will keep performance from getting any worse. > > Note that CLUSTER requires enough space to store the new and > the old copies of the table simultaneously. That's the main > reason for VACUUM FULL to still exist. > > There is also the option of doing something like (assuming id > is already an integer -- ie this doesn't actually change the data): > > ALTER TABLE x ALTER id TYPE integer USING id; > > which will rewrite the whole table. This is effectively the > same as CLUSTER except it doesn't order the table according > to an index. It will still require enough space to hold two > copies of the table but it will be significantly faster. > Yes, I can afford a downtime on Sunday. Actually the clustering option would help since most of our slow queries use the same index. Thanks Bill and Gregory for the advice. Regards, Fernando.