Обсуждение: Idea for vacuuming
I'm running a 8.0 database. I have a very large log table that is rarely updated or deleted from. The nightly vacuum does not know this, and spends a lot of time on it, and all its indexes. My RFE: When vacuuming a table, pg should try to vacuum the primary key first. If that results in 0 recovered entries, then assume the table has no updates/deletes and skip the rest of that table. I'm picking the primary key here, but any index that indexes each row of the table will do. Maybe it should just pick the smallest index that indexes each row of the table.
On Jun 22, 2006, at 7:12 PM, Joseph Shraibman wrote: > I'm running a 8.0 database. I have a very large log table that is > rarely updated or deleted from. The nightly vacuum does not know > this, and spends a lot of time on it, and all its indexes. > > My RFE: When vacuuming a table, pg should try to vacuum the primary > key first. If that results in 0 recovered entries, then assume the > table has no updates/deletes and skip the rest of that table. I'm > picking the primary key here, but any index that indexes each row > of the table will do. Maybe it should just pick the smallest index > that indexes each row of the table. *shrug* It's kinda hard to get excited about that when running autovacuum (or pg_autovacuum in the case of 8.0) would be a much better solution. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
I like to make sure the vacuum takes place during off peak times, which is why I don't use autovacuum. Jim Nasby wrote: > On Jun 22, 2006, at 7:12 PM, Joseph Shraibman wrote: >> I'm running a 8.0 database. I have a very large log table that is >> rarely updated or deleted from. The nightly vacuum does not know >> this, and spends a lot of time on it, and all its indexes. >> >> My RFE: When vacuuming a table, pg should try to vacuum the primary >> key first. If that results in 0 recovered entries, then assume the >> table has no updates/deletes and skip the rest of that table. I'm >> picking the primary key here, but any index that indexes each row of >> the table will do. Maybe it should just pick the smallest index that >> indexes each row of the table. > > *shrug* It's kinda hard to get excited about that when running > autovacuum (or pg_autovacuum in the case of 8.0) would be a much better > solution. > -- > Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com > Pervasive Software http://pervasive.com work: 512-231-6117 > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend >
Jim Nasby <decibel@decibel.org> writes: > > My RFE: When vacuuming a table, pg should try to vacuum the primary key > > first. If that results in 0 recovered entries, then assume the table has no > > updates/deletes and skip the rest of that table. That makes no sense. Vacuum starts by scanning the table itself, not the indexes. It only goes to the indexes after it has found tuples that need cleaning up. There's nothing to look at in the indexes that would tell it whether there are any tuples to clean up. -- greg
The verbose output shows the table being vacuumed last. Maybe it changed after 8.0 Greg Stark wrote: > Jim Nasby <decibel@decibel.org> writes: > >>> My RFE: When vacuuming a table, pg should try to vacuum the primary key >>> first. If that results in 0 recovered entries, then assume the table has no >>> updates/deletes and skip the rest of that table. > > That makes no sense. Vacuum starts by scanning the table itself, not the > indexes. It only goes to the indexes after it has found tuples that need > cleaning up. There's nothing to look at in the indexes that would tell it > whether there are any tuples to clean up. >
On Fri, Jun 23, 2006 at 02:00:38PM -0400, Joseph Shraibman wrote: > I like to make sure the vacuum takes place during off peak times, which > is why I don't use autovacuum. FWIW, now that there's vacuum_cost_delay that's usually not a very good strategy. If you have anywhere close to enough load during they day that runing vacuum would impact things then you're going to end up with tables and indexes that are horribly bloated because they're not being vacuumed enough. You'll probably get better overall performance by using autovacuum and setting vacuum_cost_delay to about 20, depending on your hardware. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461