Discussion: Large pgstat.stat file causes I/O storm
Hello all,

I have a ~150GB server, containing two databases that are active in
mostly read mode. I have noticed lately that the global/pgstat.stat
file is somewhere around 1MB fresh after a restart, but at some point
it balloons to 74MB in size for no apparent reason, after a few hours
of uptime. Needless to say, having the stats collector dump 74MB of
stuff to disk on its every loop takes a big bite out of the I/O
capabilities of this box.

Looking at all the other replicas I have of this database (which are
under a more lightweight read load), the pgstat.stat file is again
rather small in size. Am I right to assume that a 74MB pgstat.stat
file is not normal - and what might have caused it?

Thanks,
Cristian

--
Cristian Gafton
rPath, Inc.
On Tue, 29 Jan 2008, Cristian Gafton wrote:

> I have a ~150GB server, containing two databases that are active in
> mostly read mode. I have noticed lately that the global/pgstat.stat
> file is somewhere around 1MB fresh after a restart, but at some point
> it balloons to 74MB in size for no apparent reason, after a few hours
> of uptime. Needless to say, having the stats collector dump 74MB of
> stuff to disk on its every loop takes a big bite out of the I/O
> capabilities of this box.

Of course, I left out the most important thing - this is PostgreSQL
8.2.6 on x86_64.

> Looking at all the other replicas I have of this database (which are
> under a more lightweight read load), the pgstat.stat file is again
> rather small in size. Am I right to assume that a 74MB pgstat.stat
> file is not normal - and what might have caused it?

Cristian

--
Cristian Gafton
rPath, Inc.
Cristian Gafton <gafton@rpath.com> writes:
> On Tue, 29 Jan 2008, Cristian Gafton wrote:
>> I have a ~150GB server, containing two databases that are active in
>> mostly read mode. I have noticed lately that the global/pgstat.stat
>> file is somewhere around 1MB fresh after a restart, but at some point
>> it balloons to 74MB in size for no apparent reason, after a few hours
>> of uptime. Needless to say, having the stats collector dump 74MB of
>> stuff to disk on its every loop takes a big bite out of the I/O
>> capabilities of this box.

> Of course, I left out the most important thing - this is PostgreSQL
> 8.2.6 on x86_64.

Hmm ... do you have autovacuum enabled?  If not, what's the vacuuming
policy on that box?  I'm wondering if this is triggered by something
deciding to vacuum or analyze a bunch of otherwise-unused tables, and
thereby causing stats entries to be created for those tables.

You could investigate by comparing the contents of the stats views
before and after the file balloons.  I would expect to see a lot more
rows afterward, and the key question is exactly what non-null activity
is recorded in the extra rows.

			regards, tom lane
On Tue, 29 Jan 2008, Tom Lane wrote:

>> On Tue, 29 Jan 2008, Cristian Gafton wrote:
>>> I have a ~150GB server, containing two databases that are active in
>>> mostly read mode. [...]

>> Of course, I left out the most important thing - this is PostgreSQL
>> 8.2.6 on x86_64.
>
> Hmm ... do you have autovacuum enabled?  If not, what's the vacuuming
> policy on that box?  I'm wondering if this is triggered by something
> deciding to vacuum or analyze a bunch of otherwise-unused tables, and
> thereby causing stats entries to be created for those tables.

Autovacuum is disabled, since the database is mostly read-only. There
is a "vacuumdb -a -z" running nightly on the box. However, the
application that queries it does a lot of work with temporary tables -
would those bloat the stats at all?

> You could investigate by comparing the contents of the stats views
> before and after the file balloons.  I would expect to see a lot more
> rows, and the key is exactly what non-null activity is recorded in
> the extra rows.

Any one of the stats views in particular? Currently all of the stats_*
flags are set to "on".

Thanks,
Cristian

--
Cristian Gafton
rPath, Inc.
Cristian Gafton <gafton@rpath.com> writes:
> Autovacuum is disabled, since the database is mostly read-only. There
> is a "vacuumdb -a -z" running nightly on the box. However, the
> application that queries it does a lot of work with temporary tables -
> would those bloat the stats at all?

Conceivably, if you mean a lot of short-lived tables rather than a lot
of operations on a few tables.  However, I'd think that would result in
a steady accumulation of stats entries, not a sudden jump as you seemed
to describe.

			regards, tom lane
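Tom's accumulation scenario can be sketched as a toy model (plain Python, not PostgreSQL source; the entry size, helper names, and data structures here are all invented for illustration). The collector keeps one entry per table OID it has ever heard about, and dropping the table does not remove the entry, so churning through short-lived temp tables grows the stats file even though the live table count stays flat:

```python
# Toy model: the stats collector keeps a hash of per-table entries keyed
# by OID, and dropping a table does NOT remove its entry -- only a later
# vacuum-driven sweep does.  All names and sizes here are illustrative.

ENTRY_SIZE = 168          # rough per-table entry size, made up

stats = {}                # collector's table hash: oid -> counters
live_tables = set()       # tables that still have a pg_class row
next_oid = 16384

def create_temp_table():
    global next_oid
    oid = next_oid
    next_oid += 1
    live_tables.add(oid)
    stats[oid] = {"scans": 1}     # first access creates a stats entry
    return oid

def drop_temp_table(oid):
    live_tables.discard(oid)      # the pg_class row goes away...
    # ...but stats[oid] is left behind in the collector's hash

for _ in range(10_000):           # the app churns through temp tables
    drop_temp_table(create_temp_table())

print(f"live tables: {len(live_tables)}")
print(f"stats entries: {len(stats)} (~{len(stats) * ENTRY_SIZE} bytes)")
```

With zero live tables left, the model still holds 10,000 dead entries, which is the "steady accumulation" pattern Tom describes.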
On Tue, 29 Jan 2008, Tom Lane wrote:

> Cristian Gafton <gafton@rpath.com> writes:
>> Autovacuum is disabled, since the database is mostly read-only. There
>> is a "vacuumdb -a -z" running nightly on the box. However, the
>> application that queries it does a lot of work with temporary tables -
>> would those bloat the stats at all?
>
> Conceivably, if you mean a lot of short-lived tables rather than a lot
> of operations on a few tables.  However, I'd think that would result in
> a steady accumulation of stats entries, not a sudden jump as you seemed
> to describe.

We are churning through a bunch of short-lived temp tables. Since I
reported the problem, the pgstat file is now sitting at 85MB, yet the
pg_stat* tables barely have any entries in them:

                             count(*)
    pg_stats                      298
    pg_statistic                  298
    pg_stat_all_indexes            76
    pg_stat_all_tables             76
    pg_statio_all_tables           56
    pg_statio_all_indexes          76

Is there a way to inspect the pgstat file and see what's in it that is
taking all this space? (It's not the space that bothers me; it's the
fact that the statistics collector has to dump 85MB of stuff to disk
once a second...)

Thanks,
Cristian

--
Cristian Gafton
rPath, Inc.
Cristian Gafton <gafton@rpath.com> writes:
> We are churning through a bunch of short-lived temp tables.

I think that's probably the root of the problem ...

> Since I reported the problem, the pgstat file is now sitting at 85MB,
> yet the pg_stat* tables barely have any entries in them:
> [counts snipped]

Those views are joins against pg_class, so only tables that have live
pg_class rows can possibly show up there.  You could try remembering
the OIDs of some temp tables and probing the underlying
pg_stat_get_xxx() functions to see if there are stats-table entries
for them.

(Pokes around in the code...)  I think the problem here is that the
only active mechanism for flushing dead stats-table entries is
pgstat_vacuum_tabstat(), which is invoked by a VACUUM command or an
autovacuum.  Once-a-day VACUUM isn't gonna cut it for you under those
circumstances.  What you might do is just issue a VACUUM on some
otherwise-uninteresting small table, once an hour or however often you
need to keep the stats file bloat to a reasonable level.

There is a pgstat_drop_relation() function to tell the stats collector
to drop a single table entry, but it's not being called from anyplace.
We probably ought to try a bit harder to make that work.  The problem
is described here:

2007-07-08 18:23  tgl

	* src/: backend/postmaster/pgstat.c, backend/storage/smgr/smgr.c,
	include/pgstat.h (REL8_1_STABLE), backend/postmaster/pgstat.c,
	backend/storage/smgr/smgr.c, include/pgstat.h (REL8_2_STABLE),
	backend/postmaster/pgstat.c, backend/storage/smgr/smgr.c,
	include/pgstat.h:

	Remove the pgstat_drop_relation() call from smgr_internal_unlink(),
	because we don't know at that point which relation OID to tell
	pgstat to forget.  The code was passing the relfilenode, which is
	incorrect, and could possibly cause some other relation's stats to
	be zeroed out.  While we could try to clean this up, it seems much
	simpler and more reliable to let the next invocation of
	pgstat_vacuum_tabstat() fix things; which indeed is how it worked
	before I introduced the buggy code into 8.1.3 and later :-(.
	Problem noticed by Itagaki Takahiro, fix is per subsequent
	discussion.

			regards, tom lane
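The cleanup mechanism Tom describes can be sketched the same way (a hedged toy model, not the real pgstat.c logic; the function and structures are invented for illustration): pgstat_vacuum_tabstat() walks the collector's hash and drops every entry whose OID no longer has a live pg_class row, which is why a VACUUM in the affected database is expected to shrink the file:

```python
# Toy sketch of what pgstat_vacuum_tabstat() does: prune every stats
# entry whose table no longer exists.  Data structures are illustrative.

# 5000 accumulated entries, but only three tables still in pg_class:
stats = {oid: {"scans": 1} for oid in range(16384, 16384 + 5000)}
live = {16384, 16385, 16386}

def vacuum_tabstat(stats, live):
    """Prune entries for dropped tables, as a VACUUM command triggers."""
    for oid in list(stats):
        if oid not in live:
            del stats[oid]

vacuum_tabstat(stats, live)
print(f"entries after sweep: {len(stats)}")
```

After the sweep only the live tables' entries remain, so the next write of pgstat.stat is small again; Tom's hourly-VACUUM suggestion simply runs this sweep more often than once a day.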
On Tue, 29 Jan 2008, Tom Lane wrote:

> (Pokes around in the code...)  I think the problem here is that the
> only active mechanism for flushing dead stats-table entries is
> pgstat_vacuum_tabstat(), which is invoked by a VACUUM command or an
> autovacuum.  Once-a-day VACUUM isn't gonna cut it for you under those
> circumstances.  What you might do is just issue a VACUUM on some
> otherwise-uninteresting small table, once an hour or however often you
> need to keep the stats file bloat to a reasonable level.

I just ran a vacuumdb -a on the box - the pgstat file is still >90MB in
size. If vacuum is supposed to clean up the cruft from pgstat, then I
don't know if we're looking at the right cruft - I expected the pgstat
file to shrink and the I/O storm to subside, but that is not happening
after vacuum.

I will try to instrument the application to record the OIDs of the temp
tables it creates and investigate from that angle, but in the meantime,
is there any way to reset the stats collector altogether? Could this be
a corrupt stats file that gets read and written right back on every
loop without any sort of validation?

Thanks,
Cristian

--
Cristian Gafton
rPath, Inc.
Cristian Gafton <gafton@rpath.com> writes:
> I just ran a vacuumdb -a on the box - the pgstat file is still >90MB
> in size. If vacuum is supposed to clean up the cruft from pgstat, then
> I don't know if we're looking at the right cruft - I expected the
> pgstat file to shrink and the I/O storm to subside, but that is not
> happening after vacuum.

Hmph ... I did a simple test here involving creating a lot of temp
tables, and indeed it made the stats file bigger, but the size went
right down again after vacuuming.  Is it possible that the vacuumdb
failed to connect to the particular database in which the temp tables
are coming and going?

> I will try to instrument the application to record the OIDs of the
> temp tables it creates and investigate from that angle, but in the
> meantime, is there any way to reset the stats collector altogether?
> Could this be a corrupt stats file that gets read and written right
> back on every loop without any sort of validation?

There's stats_reset_on_server_start (sp?), and I think 8.2 also has a
stats-reset function.  But what might be more interesting is to pull
the file-reading function out of pgstat.c and dump out the stats file
in readable form to see what the heck is in there.  (If you decide to
try resetting the stats, I'd suggest saving a copy of the stats file
first for possible analysis later.)

I have the beginnings of such a program lying about, which I'll attach
--- note that it was last used for 8.1 and might require some tweaks
for 8.2, and that you'd need to flesh it out a lot if you want details
about individual entries instead of just a count.

			regards, tom lane

/*
 * dumpstat --- simple standalone program to read and analyze a PG stats
 * file.  Based on pgstat_read_statsfile() from 8.1 sources.
 *
 * Currently works with either 8.0 or 8.1 formats depending on which
 * headers it is compiled against.
 */
#include "postgres.h"
#include "pgstat.h"

int
main(int argc, char **argv)
{
	PgStat_StatDBEntry dbbuf;
	PgStat_StatTabEntry tabbuf;
	PgStat_StatBeEntry beentry;
	FILE	   *fpin;
	int32		format_id;
	int			maxbackends = 0;
	int			havebackends = 0;
	int			havedbs = 0;
	int			havetabs = 0;

	/*
	 * Try to open the status file. If it doesn't exist, the backends
	 * simply return zero for anything and the collector simply starts
	 * from scratch with empty counters.
	 */
	if ((fpin = fopen(argv[1], "rb")) == NULL)
	{
		perror(argv[1]);
		return 1;
	}

	/*
	 * Verify it's of the expected format.
	 */
#ifdef PGSTAT_FILE_FORMAT_ID
	if (fread(&format_id, 1, sizeof(format_id), fpin) != sizeof(format_id) ||
		format_id != PGSTAT_FILE_FORMAT_ID)
	{
		fprintf(stderr, "corrupted pgstat.stat file\n");
		goto done;
	}
#endif

	/*
	 * We found an existing collector stats file. Read it and put all
	 * the hashtable entries into place.
	 */
	for (;;)
	{
		switch (fgetc(fpin))
		{
				/*
				 * 'D'	A PgStat_StatDBEntry struct describing a database
				 * follows. Subsequently, zero to many 'T' entries will
				 * follow until a 'd' is encountered.
				 */
			case 'D':
				if (fread(&dbbuf, 1, sizeof(dbbuf), fpin) != sizeof(dbbuf))
				{
					fprintf(stderr, "corrupted pgstat.stat file\n");
					goto done;
				}
				havedbs++;
				break;

				/*
				 * 'd'	End of this database.
				 */
			case 'd':
				break;

				/*
				 * 'T'	A PgStat_StatTabEntry follows.
				 */
			case 'T':
				if (fread(&tabbuf, 1, sizeof(tabbuf), fpin) != sizeof(tabbuf))
				{
					fprintf(stderr, "corrupted pgstat.stat file\n");
					goto done;
				}
				havetabs++;
				break;

				/*
				 * 'M'	The maximum number of backends to expect follows.
				 */
			case 'M':
				if (fread(&maxbackends, 1, sizeof(maxbackends), fpin) !=
					sizeof(maxbackends))
				{
					fprintf(stderr, "corrupted pgstat.stat file\n");
					goto done;
				}
				if (maxbackends == 0)
					goto done;
				break;

				/*
				 * 'B'	A PgStat_StatBeEntry follows.
				 */
			case 'B':
				/*
				 * Read it directly into the table.
				 */
				if (fread(&beentry, 1, sizeof(PgStat_StatBeEntry), fpin) !=
					sizeof(PgStat_StatBeEntry))
				{
					fprintf(stderr, "corrupted pgstat.stat file\n");
					goto done;
				}
				havebackends++;
				break;

				/*
				 * 'E'	The EOF marker of a complete stats file.
				 */
			case 'E':
				goto done;

			default:
				fprintf(stderr, "corrupted pgstat.stat file at %ld\n",
						ftell(fpin) - 1);
				goto done;
		}
	}

done:
	fclose(fpin);

	printf("found %d backends of %d (%ld bytes)\n",
		   havebackends, maxbackends,
		   havebackends * (long) sizeof(PgStat_StatBeEntry));
	printf("%d databases (%ld bytes)\n",
		   havedbs, havedbs * (long) sizeof(PgStat_StatDBEntry));
	printf("%d tables (%ld bytes)\n",
		   havetabs, havetabs * (long) sizeof(PgStat_StatTabEntry));

	return 0;
}
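The tagged record layout that dumpstat.c walks can also be sketched in a few lines of Python (a hedged illustration, not a working pgstat.stat reader: the format id value and the entry sizes below are placeholders, since the real sizes come from the PgStat_Stat*Entry structs in the server headers the C program is compiled against):

```python
# Sketch of the 8.1/8.2-era stats file layout: an optional format id,
# then tagged fixed-size records ('D' database entry, 'T' table entry,
# 'd' end-of-database, 'E' end-of-file).  Sizes and the format id are
# made-up placeholders; a real reader must match the server's structs.
import io
import struct

DB_ENTRY_SIZE = 32    # placeholder for sizeof(PgStat_StatDBEntry)
TAB_ENTRY_SIZE = 16   # placeholder for sizeof(PgStat_StatTabEntry)

def write_toy_statsfile(n_tables):
    buf = io.BytesIO()
    buf.write(struct.pack("<i", 0x0BADF00D))      # fake format id
    buf.write(b"D" + b"\0" * DB_ENTRY_SIZE)       # one database entry
    for _ in range(n_tables):
        buf.write(b"T" + b"\0" * TAB_ENTRY_SIZE)  # one entry per table
    buf.write(b"d")                               # end of database
    buf.write(b"E")                               # end of file
    return buf.getvalue()

def count_entries(data):
    f = io.BytesIO(data)
    f.read(4)                                     # skip format id
    dbs = tabs = 0
    while True:
        tag = f.read(1)
        if tag == b"D":
            f.read(DB_ENTRY_SIZE)
            dbs += 1
        elif tag == b"T":
            f.read(TAB_ENTRY_SIZE)
            tabs += 1
        elif tag == b"d":
            pass                                  # database terminator
        elif tag in (b"E", b""):
            break                                 # normal end of file
        else:
            raise ValueError("corrupted stats file")
    return dbs, tabs

dbs, tabs = count_entries(write_toy_statsfile(5000))
print(f"{dbs} databases, {tabs} tables")
```

The point of counting this way is the same as in Tom's program: each dead temp table contributes one fixed-size 'T' record, so the table count times the entry size accounts directly for the file's bloat.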