Discussion: Index bloat, reindex weekly, suggestions etc?
Good day,

So I've been running 8.3 for a few months now and things seem good. I also note there are some bug fixes and you are up to 8.3.4 now. Reading the release notes, I don't see that I'm affected by anything, but please tell me if I should plan an upgrade to 8.3.4.

The real issue is my index growth and my requirement for weekly re-indexing, which blocks and is therefore more or less a manual process in a live production environment (fail over, changing VIPs, etc.).

After reindex:
My data     3.1G  /data/cls
My indexes  1.5G  /data/clsindex

After 2 weeks:
My data     3.4G  /data/cls
My indexes  6.8G  /data/clsindex

This is 2 weeks, and throughout those 2 weeks I can see performance degrading. In fact I just bumped the memory footprint of my server from 8 GB to 32 GB, as my entire PostgreSQL DB is now 15 GB. This has helped tremendously; everything ends up in memory and there is almost zero read access from disk (that's great). However, as my DB grows I can't expect to manage it by adding terabytes of memory to the system, so something is not being done efficiently.

We are testing with fillfactor, but that just seems to allocate space and doesn't really decrease my index size on disk. It does seem to help performance a bit, but not a ton.

Where should I start looking? What type of information should I provide in order for me to help you, help me?

This is a transactional DB with over 800K updates happening each day; tons of updates and deletes. Autovacuum is on, and I do an ANALYZE each night after the dump.

Thanks
Tory
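A quick way to see which relations are actually bloating is to ask the catalogs for on-disk sizes directly, rather than watching the directory totals. A sketch, assuming a hypothetical table named `cls` standing in for the real one:

```sql
-- Size of a table and each of its indexes, largest first
-- ('cls' is a placeholder; substitute the real table name).
SELECT relname,
       pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE relname = 'cls'
   OR oid IN (SELECT indexrelid
              FROM pg_index
              WHERE indrelid = 'cls'::regclass)
ORDER BY pg_relation_size(oid) DESC;
```

Running this right after a REINDEX and again a week later shows which individual indexes account for the growth, which helps target fillfactor experiments.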
>>> "Tory M Blue" <tmblue@gmail.com> wrote:

> tell me if I should plan upgrades to 8.3.4..

It's a good idea. It should be painless -- drop in and restart.

> After Reindex
> My Indexes
> 1.5G /data/clsindex
>
> After 2 weeks
> My indexes
> 6.8G /data/clsindex
>
> This is 2 weeks. and throughout those 2 weeks I can see performance
> degrading. In fact I just bumped the memory footprint of my server to
> 32gigs from 8gigs

Make sure that effective_cache_size reflects this.

> This is a transactional DB with over 800K updates happening each day.
> tons of updates and deletes. autovac is on and I do a analyze each
> night after the dump

Make sure that you run it as VACUUM ANALYZE VERBOSE and look at the last few lines; you might need to increase your free space map allocations.

-Kevin
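For context, the free space map Kevin mentions is configured in postgresql.conf in the 8.3 era (it became automatic in 8.4); the values below are illustrative only, not a recommendation:

```
# postgresql.conf (PostgreSQL 8.3 -- FSM is manually sized here)
max_fsm_pages = 1087500      # must cover the page slots that
                             # VACUUM VERBOSE reports as required
max_fsm_relations = 1000     # number of tables/indexes tracked
```

After running `VACUUM ANALYZE VERBOSE;`, the final DETAIL lines report how many page slots are in use versus the configured limit; if "required" exceeds the limit, dead space goes untracked and bloat accumulates.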
On Fri, Oct 17, 2008 at 9:30 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>>>> "Tory M Blue" <tmblue@gmail.com> wrote:
>
>> tell me if i should plan upgrades to 8.3.4..
>
> It's a good idea. It should be painless -- drop in and restart.
>
> Make sure that effective_cache_size reflects this.

Thanks Kevin

DETAIL: A total of 501440 page slots are in use (including overhead).
501440 page slots are required to track all free space.
Current limits are: 1087500 page slots, 430 relations, using 6401 kB.
VACUUM

That looks fine.

And effective_cache_size: no, I didn't change that, as I wanted to see what the results were with just adding memory, and in all honesty it was night and day without changing that param. What will modifying that param accomplish?

effective_cache_size = 7GB (so no, not changed, and our performance is 50x better with just the memory upgrade)

BTW running Linux.
shared_buffers = 600MB (have 300 connections specified)

Thanks
Tory
On Fri, Oct 17, 2008 at 10:47 AM, Tory M Blue <tmblue@gmail.com> wrote:
> DETAIL: A total of 501440 page slots are in use (including overhead).
> 501440 page slots are required to track all free space.
> Current limits are: 1087500 page slots, 430 relations, using 6401 kB.
> VACUUM
>
> that looks fine

That's still a lot of dead space. You might do better with more aggressive autovacuum settings.
On Fri, Oct 17, 2008 at 10:02 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Fri, Oct 17, 2008 at 10:47 AM, Tory M Blue <tmblue@gmail.com> wrote:
>> DETAIL: A total of 501440 page slots are in use (including overhead).
>> 501440 page slots are required to track all free space.
>> Current limits are: 1087500 page slots, 430 relations, using 6401 kB.
>> VACUUM
>>
>> that looks fine
>
> That's still a lot of dead space. You might do better with more
> aggressive autovacuum settings.

Hmmm, autovacuum is on by default and it appears I've left most of the default settings in play, but since I don't get any FSM warnings, things seem to be running cleanly.

autovacuum_max_workers = 5            # max number of autovacuum subprocesses
autovacuum_vacuum_threshold = 1000    # min number of row updates before vacuum
autovacuum_analyze_threshold = 2000   # min number of row updates before analyze

In fact there is a vacuum running right now :)

Thanks Scott!
Tory
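In the spirit of Scott's suggestion, more aggressive autovacuum settings might look like the fragment below. These values are illustrative only (they move the thresholds back toward the 8.3 defaults rather than the raised values quoted above), not tuned recommendations:

```
# postgresql.conf -- more aggressive autovacuum (8.3 parameter names)
autovacuum_vacuum_threshold = 50        # vacuum sooner than at 1000 changed rows
autovacuum_vacuum_scale_factor = 0.1    # ...or after ~10% of a table changes
autovacuum_analyze_threshold = 50
autovacuum_analyze_scale_factor = 0.05
autovacuum_naptime = 1min               # how often the launcher wakes up
```

Raising the thresholds to 1000/2000, as in the config quoted above, means small but hot tables can accumulate a lot of dead rows before autovacuum ever touches them.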
>>> "Tory M Blue" <tmblue@gmail.com> wrote:

> DETAIL: A total of 501440 page slots are in use (including overhead).
> 501440 page slots are required to track all free space.
> Current limits are: 1087500 page slots, 430 relations, using 6401 kB.

As already pointed out, that's a lot of free space. You don't use VACUUM FULL on this database, do you? That would keep the data relatively tight but seriously bloat indexes, which is consistent with your symptoms. VACUUM FULL should not be used routinely; it is basically for recovery from serious heap bloat when you don't have space for another copy of the data, and it should usually be followed by a REINDEX to clean up the index bloat it causes.

> And effective_cache, ya I didn't change that as I wanted to see how
> the results were with just adding memory and in all honesty it was
> night and day without changing that param, what will modifying that
> param accomplish?

It may allow PostgreSQL to pick more efficient plans for some of your queries. Be honest with it about the available resources and give it the chance. In particular, you may see fewer queries resorting to sequential scans of entire tables.

-Kevin
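Concretely, effective_cache_size is only a planner hint (it allocates no memory), so raising it after a RAM upgrade is cheap. A common rule of thumb, an assumption here rather than anything measured on this system, is 50-75% of RAM on a dedicated database server:

```
# postgresql.conf -- planner's estimate of OS + shared_buffers cache.
# With 32 GB installed and the DB fitting in memory, the 7GB setting
# above understates reality; ~24GB (75% of RAM) is a typical choice.
effective_cache_size = 24GB
```

A larger value makes the planner more willing to choose index scans, since it assumes the pages the index touches are likely already cached.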
On Fri, Oct 17, 2008 at 10:35 AM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
> As already pointed out, that's a lot of free space. You don't use
> VACUUM FULL on this database, do you? That would keep the data
> relatively tight but seriously bloat indexes, which is consistent with
> your symptoms. VACUUM FULL should not be used routinely, it is
> basically for recovery from serious heap bloat when you don't have
> space for another copy of the data, and it should usually be followed
> by a REINDEX to clean up the index bloat it causes.

Interesting. I do run:

# vacuum and analyze each db before dumping
psql $DB -c 'vacuum analyze verbose'

every night before I dump. Is that causing some issues that I'm not aware of?

Tory
>>> "Tory M Blue" <tmblue@gmail.com> wrote:

> psql $DB -c 'vacuum analyze verbose'

That's fine; you're not using the FULL option.

> every night before I dump

Another thing that could cause bloat is a long-running transaction. Check for that. If your database is being updated during a pg_dump or pg_dumpall, that would count as a long-running transaction. You might possibly want to look at going to the PITR backup technique for your regular backups.

-Kevin
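The long-running-transaction check can be done from the catalogs. A sketch using the 8.3-era column names of pg_stat_activity (`procpid` and `current_query`; the one-hour cutoff is an arbitrary example):

```sql
-- Sessions whose current transaction has been open more than an hour.
-- VACUUM cannot reclaim rows still visible to any of these transactions.
SELECT procpid, usename, xact_start, current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND xact_start < now() - interval '1 hour'
ORDER BY xact_start;
```

A nightly pg_dump will show up here for as long as the dump runs, which is why dead rows created during the dump window survive that night's vacuum.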
2008/10/17 Tory M Blue <tmblue@gmail.com>
> The real issue is my index growth and my requirement for weekly
> re-indexing (which blocks and therefore is more or less a manual
> process in a live production environment (fail over, changing vips
> etc).
BTW: Can't you simply recreate the indexes online? Since PostgreSQL accepts multiple indexes with the same definition, this may look like:
1) create index concurrently index_alt
2) analyze index_alt
3) drop index_orig
Both index_alt and index_orig have the same definition.
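Spelled out in SQL, the online rebuild might look like this. Table and index names are hypothetical, and note that ANALYZE targets the table (statistics belong to tables, not indexes), a small adjustment to step 2 above:

```sql
-- 1) Build a duplicate index without blocking writers.
CREATE INDEX CONCURRENTLY cls_foo_idx_new ON cls (foo);

-- 2) Refresh planner statistics for the parent table.
ANALYZE cls;

-- 3) Drop the bloated original, then take over its name.
DROP INDEX cls_foo_idx;
ALTER INDEX cls_foo_idx_new RENAME TO cls_foo_idx;
```

The CREATE INDEX CONCURRENTLY step takes longer than a plain CREATE INDEX and requires two table scans, but it avoids the exclusive lock that makes REINDEX impractical on a live system. One caveat: if the concurrent build fails it leaves an INVALID index behind that must be dropped manually.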
On Fri, Oct 17, 2008 at 11:00 PM, Віталій Тимчишин <tivv00@gmail.com> wrote:
>
> 2008/10/17 Tory M Blue <tmblue@gmail.com>
>>
>> The real issue is my index growth and my requirement for weekly
>> re-indexing (which blocks and therefore is more or less a manual
>> process in a live production environment (fail over, changing vips
>> etc).
>
> BTW: Can't you simply recreate indexes online? Since postgresql accepts
> multiple indexes of same definition, this may look like:
> 1) create index concurrently index_alt
> 2) analyze index_alt
> 3) drop index_orig
> Both index_alt and index_orig having same definition

Sorry for the late response. After some testing, this is in fact a solution that will and does work. It's really much simpler than doing switchovers and taking systems down to reindex.

Thanks for the clue stick over the head. I know we looked into this in the past and it was a non-starter, but it seems to be working just fine!

Thanks
Tory