Обсуждение: Database size stays constant but disk space keeps shrinking -- postgres 9.1
Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Greg Williamson
Дата:
Dear list, I have a postgres database, 9.1.3, which shows a fairly constant amount of space used by postgres, but total disk space keesshrinking. If I restart postgres the space on my file system returns. This cluster is replicated to another; that custer does not show this problem. There are two main databases on this cluster, both using logging. One is used to accumulate some stats on our systen. The other is a slimmed-down version of our production database, which gets recreated hourly by a shell script which pullsdata from remote servers, does a pg_dump of the resulting 3 gig database, and then drops it. I posted a message about ths a few days and got no responses. I am hoping for better luck this time as this is a seriousissue,not relayed t vacuuming or routine admin tasks which may or may not be happening. Thanks for any help / advice / things to look for. Greg a few details: db11:5432=# select version(); version -------------------------------------------------------------------------------------------------------------- PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit Says one of our sysads: "These are single disk systems. single no n-raid sata. Using xfs as the filesystem like the rest of the dbs."
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Steve Crawford
Дата:
On 09/27/2012 03:05 PM, Greg Williamson wrote: > Dear list, > > I have a postgres database, 9.1.3, which shows a fairly constant amount of space used by postgres, but total disk spacekees shrinking. > If I restart postgres the space on my file system returns. > > This cluster is replicated to another; that custer does not show this problem. > > There are two main databases on this cluster, both using logging. One is used to accumulate some stats on our systen. > > The other is a slimmed-down version of our production database, which gets recreated hourly by a shell script which pullsdata from remote servers, does a pg_dump of the resulting 3 gig database, and then drops it. > > I posted a message about ths a few days and got no responses. I am hoping for better luck this time as this is a seriousissue,not relayed t vacuuming or routine admin tasks which may or may not be happening. > > No answer but some questions. How are you determining space used by PostgreSQL? From the OS (i.e. du on PostgreSQL's directories or similar) or from PostgreSQL itself (pg_ system tables, \dt+, etc.)? What method of replication are you using? Do you have any indication if the issue is only related to which database is causing the issue? Can you stop activity on one of them and see if the problem persists? Have you checked to see if there are any processes that have open handles to deleted files (lsof -X | grep deleted). Deleted files won't show up in du but won't release their disk space until the process exits. Perhaps a script or scripts, even one of your hourly ones, that terminate when the server restarts? You could save the output of lsof and ps immediately before and after a restart and compare them. Let us know what you find. Cheers, Steve
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Micky Gough
Дата:
Hi Greg,
On Fri, Sep 28, 2012 at 8:05 AM, Greg Williamson <gwilliamson39@yahoo.com> wrote:
I have a postgres database, 9.1.3, which shows a fairly constant amount of space used by postgres, but total disk space kees shrinking.
If I restart postgres the space on my file system returns.
It sounds like (maybe?) Postgres is writing to a file that has been deleted. You won't be able to see the file using ls, so it's hard to track. Restarting postgres will release the lock on the inode and the filesystem flushes it.
You can check for deleted files that are being held open using lsof |grep deleted.
Micky
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Dinesh Bhandary
Дата:
You might also want to check to see if temp files are growing out of bounds. Thanks. Dinesh On 9/27/2012 3:05 PM, Greg Williamson wrote: > Dear list, > > I have a postgres database, 9.1.3, which shows a fairly constant amount of space used by postgres, but total disk spacekees shrinking. > If I restart postgres the space on my file system returns. > > This cluster is replicated to another; that custer does not show this problem. > > There are two main databases on this cluster, both using logging. One is used to accumulate some stats on our systen. > > The other is a slimmed-down version of our production database, which gets recreated hourly by a shell script which pullsdata from remote servers, does a pg_dump of the resulting 3 gig database, and then drops it. > > I posted a message about ths a few days and got no responses. I am hoping for better luck this time as this is a seriousissue,not relayed t vacuuming or routine admin tasks which may or may not be happening. > > Thanks for any help / advice / things to look for. > > Greg > > a few details: > db11:5432=# select version(); > version > -------------------------------------------------------------------------------------------------------------- > PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit > > Says one of our sysads: > "These are single disk systems. single no > n-raid sata. Using xfs as the filesystem like the rest of the dbs." > >
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Greg Williamson
Дата:
Steve (and others who replied): ----- Original Message ----- > From: Steve Crawford <scrawford@pinpointresearch.com> > To: Greg Williamson <gwilliamson39@yahoo.com> > Cc: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org> > Sent: Thursday, September 27, 2012 3:48 PM > Subject: Re: [ADMIN] Database size stays constant but disk space keeps shrinking -- postgres 9.1 > > On 09/27/2012 03:05 PM, Greg Williamson wrote: >> Dear list, >> >> I have a postgres database, 9.1.3, which shows a fairly constant amount of > space used by postgres, but total disk space kees shrinking. >> If I restart postgres the space on my file system returns. >> >> This cluster is replicated to another; that custer does not show this > problem. >> >> There are two main databases on this cluster, both using logging. One is > used to accumulate some stats on our systen. >> >> The other is a slimmed-down version of our production database, which gets > recreated hourly by a shell script which pulls data from remote servers, does a > pg_dump of the resulting 3 gig database, and then drops it. >> >> I posted a message about ths a few days and got no responses. I am hoping > for better luck this time as this is a serious issue,not relayed t vacuuming or > routine admin tasks which may or may not be happening. >> >> > No answer but some questions. > > How are you determining space used by PostgreSQL? From the OS (i.e. du on > PostgreSQL's directories or similar) or from PostgreSQL itself (pg_ system > tables, \dt+, etc.)? > df -h and the postgres internal functions that show size of relations. > What method of replication are you using? streaming replication w/ synchronous commit > > Do you have any indication if the issue is only related to which database is > causing the issue? Can you stop activity on one of them and see if the problem > persists? If I suspend the build / drop database process the shrinking of available disk space slows dramatically. > > Have you checked to see if there are any processes that have open handles to > deleted files (lsof -X | grep deleted). Deleted files won't show up in du > but won't release their disk space until the process exits. Perhaps a script > or scripts, even one of your hourly ones, that terminate when the server > restarts? You could save the output of lsof and ps immediately before and after > a restart and compare them. > lsof -X | grep deleted | wc -l shows: 835 such files. A couple: postgres 2540 postgres 50u REG 8,3 409600 93429 /var/lib/postgresql/9.1/main/base/2789 200/11816 (deleted) postgres 2540 postgres 51u REG 8,3 18112512 49694570 /var/lib/postgresql/9.1/main/base/2789 200/2791679 (deleted) <...> while ls shows: root@db11:~# ls /var/lib/postgresql/9.1/main/base/ | more 1 11938 11946 1418400 16387 16392 16402 16424 16449 2047839 pgsql_tmp We've a planned restart scheduled soon which will let me find any scripts that might be keeping things open, and I'll reviewmy script that creates / populates / drops the database hourly. This does not seem to be related to temp tables. Something in the script, perhaps ... Further suggestions and / or questions welcome. Thanks ! Greg W.
Greg Williamson <gwilliamson39@yahoo.com> writes: >> Have you checked to see if there are any processes that have open handles to >> deleted files (lsof -X | grep deleted). > lsof -X | grep deleted | wc -l > shows: 835 such files. > A couple: > postgres 2540 postgres 50u REG 8,3 409600 93429 /var/lib/postgresql/9.1/main/base/2789 > 200/11816 (deleted) > postgres 2540 postgres 51u REG 8,3 18112512 49694570 /var/lib/postgresql/9.1/main/base/2789 > 200/2791679 (deleted) > <...> So, which processes are holding these open, and what are they doing exactly? Let's see output from ps and pg_stat_activity, maybe even attach to them with gdb and get stack traces. > We've a planned restart scheduled soon which will let me find any > scripts that might be keeping things open, A restart will destroy all the evidence, so let's not be in a hurry to do that before we've identified what's happening. regards, tom lane
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Greg Williamson
Дата:
Tom -- ----- Original Message ----- > From: Tom Lane <tgl@sss.pgh.pa.us> > To: Greg Williamson <gwilliamson39@yahoo.com> > Cc: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org> > Sent: Thursday, September 27, 2012 7:14 PM > Subject: Re: [ADMIN] Database size stays constant but disk space keeps shrinking -- postgres 9.1 > >G reg Williamson <gwilliamson39@yahoo.com> writes: >>> Have you checked to see if there are any processes that have open > handles to >>> deleted files (lsof -X | grep deleted). > >> lsof -X | grep deleted | wc -l > >> shows: 835 such files. > >> A couple: >> postgres 2540 postgres 50u REG 8,3 409600 > 93429 /var/lib/postgresql/9.1/main/base/2789 >> 200/11816 (deleted) >> postgres 2540 postgres 51u REG 8,3 18112512 > 49694570 /var/lib/postgresql/9.1/main/base/2789 >> 200/2791679 (deleted) >> <...> > > So, which processes are holding these open, and what are they doing > exactly? Let's see output from ps and pg_stat_activity, maybe even > attach to them with gdb and get stack traces. > >> We've a planned restart scheduled soon which will let me find any >> scripts that might be keeping things open, > > A restart will destroy all the evidence, so let's not be in a hurry > to do that before we've identified what's happening. > > regards, tom lane > Thanks for the suggestions -- I'll post back when I have more info. Many of these do not seem to have a link to any identifiableprocess that is still running, but some do and they have pointed me away from the hourly drop / rebuild, at leastfor now. Looks like the stats database may be the issue. Greg W.
Greg Williamson <gwilliamson39@yahoo.com> writes: >>> postgres 2540 postgres 50u REG 8,3 409600 93429 /var/lib/postgresql/9.1/main/base/2789200/11816(deleted) >>> postgres 2540 postgres 51u REG 8,3 18112512 49694570 /var/lib/postgresql/9.1/main/base/2789200/2791679(deleted) > Thanks for the suggestions -- I'll post back when I have more info. Many of these do not seem to have a link to any identifiableprocess that is still running, but some do and they have pointed me away from the hourly drop / rebuild, at leastfor now. Looks like the stats database may be the issue. BTW, looking at that again --- the filenames appear to be ordinary tables in database 2789200, but there is something mighty odd about the first one: 11816 is an OID that should only be handed out during initdb. And in 9.1 what it would be handed out to is pg_shdescription. Now it's not impossible that pg_shdescription's original table file would get deleted: a VACUUM FULL or CLUSTER on that catalog would do it. But AFAICS there is no situation in which that relfilenode number would appear in a regular database --- it should be under the global/ subdirectory of $PGDATA. So unless you miscopied that filename, there is something odd going on here above and beyond the problem of open file descriptors not getting closed. Do you have any nonstandard maintenance practices in this installation, such as doing database-wide VACUUM FULL every so often? regards, tom lane
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Greg Williamson
Дата:
Tom -- ----- Original Message ----- > From: Tom Lane <tgl@sss.pgh.pa.us> > To: Greg Williamson <gwilliamson39@yahoo.com> > Cc: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org> > Sent: Thursday, September 27, 2012 7:55 PM > Subject: Re: [ADMIN] Database size stays constant but disk space keeps shrinking -- postgres 9.1 > >G reg Williamson <gwilliamson39@yahoo.com> writes: >>>> postgres 2540 postgres 50u REG 8,3 > 409600 93429 /var/lib/postgresql/9.1/main/base/2789200/11816 (deleted) >>>> postgres 2540 postgres 51u REG 8,3 > 18112512 49694570 /var/lib/postgresql/9.1/main/base/2789200/2791679 (deleted) > >> Thanks for the suggestions -- I'll post back when I have more info. > Many of these do not seem to have a link to any identifiable process that is > still running, but some do and they have pointed me away from the hourly drop / > rebuild, at least for now. Looks like the stats database may be the issue. > > BTW, looking at that again --- the filenames appear to be ordinary > tables in database 2789200, but there is something mighty odd about the > first one: 11816 is an OID that should only be handed out during initdb. > And in 9.1 what it would be handed out to is pg_shdescription. Now it's > not impossible that pg_shdescription's original table file would get > deleted: a VACUUM FULL or CLUSTER on that catalog would do it. But > AFAICS there is no situation in which that relfilenode number would > appear in a regular database --- it should be under the global/ > subdirectory of $PGDATA. So unless you miscopied that filename, there > is something odd going on here above and beyond the problem of open > file descriptors not getting closed. Nope -- a sim-ple copy that seems to have gotten all of the data output. > Do you have any nonstandard > maintenance practices in this installation, such as doing database-wide > VACUUM FULL every so often? None that I know of -- logs don't show any. and there are none on cronjobs. I'm asking the developers behind this app --they may be doing something strange. I'll post back as they answer. Thanks hugely, Greg W.
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Steve Crawford
Дата:
On 09/27/2012 07:01 PM, Greg Williamson wrote: > Steve (and others who replied): > ... > The other is a slimmed-down version of our production database, which gets >> recreated hourly by a shell script which pulls data from remote servers, does a >> pg_dump of the resulting 3 gig database, and then drops it. >> ... Could you explain this process in more detail? Are you creating a new database, reading in data, dumping then dropping the whole database or just manipulating tables within an existing database? >> Have you checked to see if there are any processes that have open handles to >> deleted files (lsof -X | grep deleted). Deleted files won't show up in du >> but won't release their disk space until the process exits. Perhaps a script >> or scripts, even one of your hourly ones, that terminate when the server >> restarts? You could save the output of lsof and ps immediately before and after >> a restart and compare them. >> > lsof -X | grep deleted | wc -l > > shows: 835 such files. > > A couple: > postgres 2540 postgres 50u REG 8,3 409600 93429 /var/lib/postgresql/9.1/main/base/2789 > 200/11816 (deleted) > postgres 2540 postgres 51u REG 8,3 18112512 49694570 /var/lib/postgresql/9.1/main/base/2789 > 200/2791679 (deleted) > <...>... I'll leave it to you and Tom to puzzle over the the postgres-related open files. Meanwhile, I'm a bit curious about the other 800+ and whether they are associated with scripts or processes that are connected to PostgreSQL. First, what is the output of "select * from pg_stat_activity;"? Are there connections you don't expect to see? If you force any of them closed (after checking with anyone who may be impacted), do you see any file handles released or disk-space freed? Second, do any of the processes associated with the other open-but-deleted files relate to programs or scripts that connect to PostgreSQL? Next time you do a restart, do any of the processes exit or do any of the deleted files get closed? I'm wondering if you have processes that connect to PostgreSQL which terminate and release their file-handles when PG is restarted. Cheers, Steve
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Dinesh Bhandary
Дата:
We had a situation where external sort was creating a humungous temp file and the space was reclaimed when the process was completed. Thanks. Dinesh On 9/28/2012 8:59 AM, Steve Crawford wrote: > On 09/27/2012 07:01 PM, Greg Williamson wrote: >> Steve (and others who replied): >> ... >> The other is a slimmed-down version of our production database, >> which gets >>> recreated hourly by a shell script which pulls data from remote >>> servers, does a >>> pg_dump of the resulting 3 gig database, and then drops it. >>> ... > > Could you explain this process in more detail? Are you creating a new > database, reading in data, dumping then dropping the whole database or > just manipulating tables within an existing database? > >>> Have you checked to see if there are any processes that have open >>> handles to >>> deleted files (lsof -X | grep deleted). Deleted files won't show up >>> in du >>> but won't release their disk space until the process exits. Perhaps >>> a script >>> or scripts, even one of your hourly ones, that terminate when the >>> server >>> restarts? You could save the output of lsof and ps immediately >>> before and after >>> a restart and compare them. >> lsof -X | grep deleted | wc -l >> >> shows: 835 such files. >> >> A couple: >> postgres 2540 postgres 50u REG 8,3 >> 409600 93429 /var/lib/postgresql/9.1/main/base/2789 >> 200/11816 (deleted) >> postgres 2540 postgres 51u REG 8,3 18112512 >> 49694570 /var/lib/postgresql/9.1/main/base/2789 >> 200/2791679 (deleted) >> <...>... > I'll leave it to you and Tom to puzzle over the the postgres-related > open files. Meanwhile, I'm a bit curious about the other 800+ and > whether they are associated with scripts or processes that are > connected to PostgreSQL. > > First, what is the output of "select * from pg_stat_activity;"? Are > there connections you don't expect to see? If you force any of them > closed (after checking with anyone who may be impacted), do you see > any file handles released or disk-space freed? > > Second, do any of the processes associated with the other > open-but-deleted files relate to programs or scripts that connect to > PostgreSQL? Next time you do a restart, do any of the processes exit > or do any of the deleted files get closed? > > I'm wondering if you have processes that connect to PostgreSQL which > terminate and release their file-handles when PG is restarted. > > Cheers, > Steve > > >
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Greg Williamson
Дата:
Steve -- ----- Original Message ----- > From: Steve Crawford <scrawford@pinpointresearch.com> > To: Greg Williamson <gwilliamson39@yahoo.com> > Cc: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org> > Sent: Friday, September 28, 2012 8:59 AM > Subject: Re: [ADMIN] Database size stays constant but disk space keeps shrinking -- postgres 9.1 > > On 09/27/2012 07:01 PM, Greg Williamson wrote: >> Steve (and others who replied): >> ... >> The other is a slimmed-down version of our production database, which > gets >>> recreated hourly by a shell script which pulls data from remote > servers, does a >>> pg_dump of the resulting 3 gig database, and then drops it. >>> ... > > Could you explain this process in more detail? Are you creating a new database, > reading in data, dumping then dropping the whole database or just manipulating > tables within an existing database? > We create the database, and then use shell scripts to get a pg_dump -s of a production server and use that to create tables,indexes, etc. Then a sequence of psql commands retrieves data from production, usually a subset of whatever table is being grabbed basedon a slimmed down set of userids of interest, recency, etc. Mostly these get loaded as is i nto the tables in the newdatabase; occasionally we build a temp table and do some simple joins to get the final results. When all these commands are done, we pg_dump the new database, compress it and make a tar file, and finally issue a DROPDATABASE command. Takes about 20 minutes and runs once an hour. >>> Have you checked to see if there are any processes that have open > handles to >>> deleted files (lsof -X | grep deleted). Deleted files won't show up > in du >>> but won't release their disk space until the process exits. Perhaps > a script >>> or scripts, even one of your hourly ones, that terminate when the > server >>> restarts? You could save the output of lsof and ps immediately before > and after >>> a restart and compare them. >>> >> lsof -X | grep deleted | wc -l >> >> shows: 835 such files. >> >> A couple: >> postgres 2540 postgres 50u REG 8,3 409600 > 93429 /var/lib/postgresql/9.1/main/base/2789 >> 200/11816 (deleted) >> postgres 2540 postgres 51u REG 8,3 18112512 > 49694570 /var/lib/postgresql/9.1/main/base/2789 >> 200/2791679 (deleted) >> <...>... > I'll leave it to you and Tom to puzzle over the the postgres-related open > files. Meanwhile, I'm a bit curious about the other 800+ and whether they > are associated with scripts or processes that are connected to PostgreSQL. > These all seem to be from two places -- repmgr (transient ) and this stats application. > First, what is the output of "select * from pg_stat_activity;"? Are > there connections you don't expect to see? If you force any of them closed > (after checking with anyone who may be impacted), do you see any file handles > released or disk-space freed? > Nothing unexpected, lots of IDLE connections (20-30, depending). We just tried a round of closing half of the stat applicationconnections and it didn't seem to make a big difference. > Second, do any of the processes associated with the other open-but-deleted files > relate to programs or scripts that connect to PostgreSQL? Next time you do a > restart, do any of the processes exit or do any of the deleted files get closed? > repmgr is now also a suspect, although from what I can see it keeps things in that state for nly a short while. But we areinvestigating further. Everythng else connects via pgbouncer, so we are also wondering if the tcp_keepalive we added might be hurting us. > I'm wondering if you have processes that connect to PostgreSQL which > terminate and release their file-handles when PG is restarted. > > Cheers, > Steve > Thanks for questions and the time -- still digging into this. Greg
Greg Williamson <gwilliamson39@yahoo.com> writes: >> I'll leave it to you and Tom to puzzle over the the postgres-related open >> files. Meanwhile, I'm a bit curious about the other 800+ and whether they >> are associated with scripts or processes that are connected to PostgreSQL. > These all seem to be from two places -- repmgr (transient ) and this stats application. Do you have any characterization yet of which deleted files are being held open by which processes? In particular I'm wondering if the held-open deleted files are in a recently-dropped database, and whether they are being held open by regular backends or one of the background processes such as bgwriter, and if the former what are those backends doing exactly. It's entirely expected that recently-deleted files might be held open for a little while, but there are mechanisms that are supposed to prevent them from being held open indefinitely. I'm guessing that your usage pattern might be tripping over some gap in those mechanisms, but we don't have enough info yet to speculate about what. BTW, I now think that my question about the unexpected OID value shown for one deleted file may have been a red herring --- it seems not implausible that lsof was just lying to you. It has to do some guesswork to reconstruct file paths for deleted files, and I don't think it's always right about that. regards, tom lane
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Dinesh Bhandary
Дата:
I'd also keep track of tmp directory while building. I hope it is not creating a huge external sort file. By building you meant to say pg_restore, I assume. How big is your maintenance_work_mem? Index creation pools memory from this buffer. n Sep 27, 2012 07:01 PM, Greg Williamson <gwilliamson39@yahoo.com> wrote: >Steve (and others who replied): > > >----- Original Message ----- >>From: Steve Crawford <scrawford@pinpointresearch.com> >>To: Greg Williamson <gwilliamson39@yahoo.com> >>Cc: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org> >>Sent: Thursday, September 27, 2012 3:48 PM >>Subject: Re: [ADMIN] Database size stays constant but disk space keeps >>shrinking -- postgres 9.1 >> >>On 09/27/2012 03:05 PM, Greg Williamson wrote: >>>Dear list, >>> >>>I have a postgres database, 9.1.3, which shows a fairly constant >>>amount of >>space used by postgres, but total disk space kees shrinking. >>>If I restart postgres the space on my file system returns. >>> >>>This cluster is replicated to another; that custer does not show this >>problem. >>> >>>There are two main databases on this cluster, both using logging. One >>>is >>used to accumulate some stats on our systen. >>> >>>The other is a slimmed-down version of our production database, which >>>gets >>recreated hourly by a shell script which pulls data from remote >>servers, does a >>pg_dump of the resulting 3 gig database, and then drops it. >>> >>>I posted a message about ths a few days and got no responses. I am >>>hoping >>for better luck this time as this is a serious issue,not relayed t >>vacuuming or >>routine admin tasks which may or may not be happening. >>> >>> >>No answer but some questions. >> >>How are you determining space used by PostgreSQL? From the OS (i.e. du >>on >>PostgreSQL's directories or similar) or from PostgreSQL itself (pg_ >>system >>tables, \dt+, etc.)? >> > >df -h and the postgres internal functions that show size of relations. > >>What method of replication are you using? > > >streaming replication w/ synchronous commit > >> >>Do you have any indication if the issue is only related to which >>database is >>causing the issue? Can you stop activity on one of them and see if the >>problem >>persists? > > >If I suspend the build / drop database process the shrinking of >available disk space slows dramatically. > >> >>Have you checked to see if there are any processes that have open >>handles to >>deleted files (lsof -X | grep deleted). Deleted files won't show up in >>du >>but won't release their disk space until the process exits. Perhaps a >>script >>or scripts, even one of your hourly ones, that terminate when the >>server >>restarts? You could save the output of lsof and ps immediately before >>and after >>a restart and compare them. >> > >lsof -X | grep deleted | wc -l > >shows: 835 such files. > >A couple: >postgres 2540 postgres 50u REG 8,3 409600 > 93429 /var/lib/postgresql/9.1/main/base/2789 >200/11816 (deleted) >postgres 2540 postgres 51u REG 8,3 18112512 >49694570 /var/lib/postgresql/9.1/main/base/2789 >200/2791679 (deleted) ><...> > >while ls shows: >root@db11:~# ls /var/lib/postgresql/9.1/main/base/ | more >1 >11938 >11946 >1418400 > 16387 >16392 >16402 >16424 >16449 >2047839 > pgsql_tmp > >We've a planned restart scheduled soon which will let me find any >scripts that might be keeping things open, and I'll review my script >that creates / populates / drops the database hourly. > > >This does not seem to be related to temp tables. Something in the >script, perhaps ... > >Further suggestions and / or questions welcome. > >Thanks ! > >Greg W. > > > >-- >Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) >To make changes to your subscription: >http://www.postgresql.org/mailpref/pgsql-admin > >
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Greg Williamson
Дата:
Dinesh -- ----- Original Message ----- > From: Dinesh Bhandary <bhandary@iii.com> > To: Greg Williamson <gwilliamson39@yahoo.com> > Cc: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org> > Sent: Thursday, September 27, 2012 7:19 PM > Subject: Re: [ADMIN] Database size stays constant but disk space keeps shrinking -- postgres 9.1 > > > I'd also keep track of tmp directory while building. I hope it is not > creating a huge external sort file. By building you meant to say > pg_restore, I assume. Actually everything is done a single SQL commands; the DDL from the source gets run on the new database, and then we do aseries of SQL commands. The tmp space doesn't seem to get hit as there are only a few sorts being run as we reduce a listof "users from" and "users to" to a single list of unique user ids. > How big is your maintenance_work_mem? Index creation pools memory from > this buffer. maintenance_work_mem = 1500MB# min 1MB Which seems to be enough for this task set. postgres@db11:~/9.1/main$ ls -lrt base/pgsql_tmp total 0 Something malign in one of our applications, perhaps interacting with this hourly drop. Still working with the developersto see what they are doing "under the hood." Thanks for the suggestions! Greg
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Greg Williamson
Дата:
I've done some more testing and the problem seems to be repmgr itself. A few details below... ----- Original Message ----- > From: Greg Williamson <gwilliamson39@yahoo.com> > To: Tom Lane <tgl@sss.pgh.pa.us> > Cc: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org> > Sent: Thursday, September 27, 2012 7:23 PM > Subject: Re: [ADMIN] Database size stays constant but disk space keeps shrinking -- postgres 9.1 > >T om -- > > ----- Original Message ----- >> From: Tom Lane <tgl@sss.pgh.pa.us> >> To: Greg Williamson <gwilliamson39@yahoo.com> >> Cc: "pgsql-admin@postgresql.org" > <pgsql-admin@postgresql.org> >> Sent: Thursday, September 27, 2012 7:14 PM >> Subject: Re: [ADMIN] Database size stays constant but disk space keeps > shrinking -- postgres 9.1 >> >> G reg Williamson <gwilliamson39@yahoo.com> writes: >>>> Have you checked to see if there are any processes that have open >> handles to >>>> deleted files (lsof -X | grep deleted). >> >>> lsof -X | grep deleted | wc -l >> >>> shows: 835 such files. >> >>> A couple: >>> postgres 2540 postgres 50u REG 8,3 409600 > >> 93429 /var/lib/postgresql/9.1/main/base/2789 >>> 200/11816 (deleted) >>> postgres 2540 postgres 51u REG 8,3 18112512 > >> 49694570 /var/lib/postgresql/9.1/main/base/2789 >>> 200/2791679 (deleted) >>> <...> >> >> So, which processes are holding these open, and what are they doing >> exactly? Let's see output from ps and pg_stat_activity, maybe even >> attach to them with gdb and get stack traces. >> >>> We've a planned restart scheduled soon which will let me find any >>> scripts that might be keeping things open, >> >> A restart will destroy all the evidence, so let's not be in a hurry >> to do that before we've identified what's happening. >> >> regards, tom lane >> > > Thanks for the suggestions -- I'll post back when I have more info. Many of > these do not seem to have a link to any identifiable process that is still > running, but some do and they have pointed me away from the hourly drop / > rebuild, at least for now. Looks like the stats database may be the issue. > > Greg W. I turned off the cronjob that did the hourly database create / drop and am still leaking disk space, but a but slower --only lost 2 gigs overnight. While running this process I see these data directories: postgres@db11:~$ ls -lrt 9.1/main/base total 200 drwx------ 2 postgres postgres 6 2012-09-21 16:36 pgsql_tmp drwx------ 2 postgres postgres 8192 2012-10-01 00:26 16387 drwx------ 2 postgres postgres 16384 2012-10-01 00:26 1418400 drwx------ 2 postgres postgres 8192 2012-10-01 00:26 2047839 drwx------ 2 postgres postgres 8192 2012-10-01 00:26 11946 drwx------ 2 postgres postgres 8192 2012-10-01 00:27 16449 drwx------ 2 postgres postgres 8192 2012-10-01 00:27 16392 drwx------ 2 postgres postgres 8192 2012-10-01 00:27 16402 drwx------ 2 postgres postgres 8192 2012-10-01 00:27 11938 drwx------ 2 postgres postgres 8192 2012-10-01 00:27 1 drwx------ 2 postgres postgres 8192 2012-10-01 08:17 16424 drwx------ 2 postgres postgres 32768 2012-10-01 19:20 3171846 When it is done (note the last directory is now gone): postgres@db11:~$ ls -lrt 9.1/main/base total 140 drwx------ 2 postgres postgres 6 2012-09-21 16:36 pgsql_tmp drwx------ 2 postgres postgres 8192 2012-10-01 00:26 16387 drwx------ 2 postgres postgres 16384 2012-10-01 00:26 1418400 drwx------ 2 postgres postgres 8192 2012-10-01 00:26 2047839 drwx------ 2 postgres postgres 8192 2012-10-01 00:26 11946 drwx------ 2 postgres postgres 8192 2012-10-01 00:27 16449 drwx------ 2 postgres postgres 8192 2012-10-01 00:27 16392 drwx------ 2 postgres postgres 8192 2012-10-01 00:27 16402 drwx------ 2 postgres postgres 8192 2012-10-01 00:27 11938 drwx------ 2 postgres postgres 8192 2012-10-01 00:27 1 drwx------ 2 postgres postgres 8192 2012-10-01 08:17 16424 When I run lsof -X and grep for deleted files I see these 4 new entries added since the last database create/drop: ase/3167420/3169915 (deleted) postgres 21116 postgres 66u REG 8,3 19709952 136501576 /var/lib/postgresql/9.1/main/base/3171846/3174279(deleted) postgres 21116 postgres 67u REG 8,3 15450112 136501574 /var/lib/postgresql/9.1/main/base/3171846/3174278(deleted) postgres 21116 postgres 68u REG 8,3 28344320 136410873 /var/lib/postgresql/9.1/main/base/3171846/3172541(deleted) postgres 21116 postgres 69u REG 8,3 82452480 144333458 /var/lib/postgresql/9.1/main/base/3171846/3174341(deleted) root@db11:~# root@db11:~# ps auxww | grep 21116 postgres 21116 0.0 0.1 100416 32332 ? Ss 00:26 0:16 postgres: repmgr repmgr 199.9.xxx.yyy(45239) idle root 25755 0.0 0.0 6440 840 pts/2 S+ 19:38 0:00 grep --color=auto 21116 ====== With the database create/drop suspended we still see a steady accumulation of dead file descriptors, but at a slower rate. < /dev/sda3 67G 28G 39G 42% / --- > /dev/sda3 67G 29G 38G 44% / Other than abandoning repmgr I don't see a solution. I've posted this to the repmgr discussion group but have had zero responses(and, frankly, am not holding my breath). If anyone has any suggestions I'm all ears. Thanks for the bandwidth! Greg W.
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Gabriele Bartolini
Дата:
Hi Greg, On Tue, 2 Oct 2012 15:02:23 -0700 (PDT), Greg Williamson <gwilliamson39@yahoo.com> wrote: > Other than abandoning repmgr I don't see a solution. I've posted this > to the repmgr discussion group but have had zero responses (and, > frankly, am not holding my breath). If you are 100% sure it is repmgr ... :) I am not 100% sure it is ... Under normal circumstances (once the standby has been cloned), repmgr simply controls the status of a standby server communicating with a master through the streaming replication protocol. As any other standby would do. Cheers, Gabriele -- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support Gabriele.Bartolini@2ndQuadrant.it - www.2ndQuadrant.it
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Jaime Casanova
Дата:
On Wed, Oct 3, 2012 at 6:41 AM, Gabriele Bartolini <Gabriele.Bartolini@2ndquadrant.it> wrote: > Hi Greg, > > > On Tue, 2 Oct 2012 15:02:23 -0700 (PDT), Greg Williamson > <gwilliamson39@yahoo.com> wrote: > >> Other than abandoning repmgr I don't see a solution. I've posted this >> to the repmgr discussion group but have had zero responses (and, >> frankly, am not holding my breath). > > > If you are 100% sure it is repmgr ... :) > > I am not 100% sure it is ... Under normal circumstances (once the standby > has been cloned), repmgr simply controls the status of a standby server > communicating with a master through the streaming replication protocol. As > any other standby would do. > more to the point... if this is happening on master (where you can create and drop databases), then repmgrd has no reason to be running there... and very limited subset of commands (master register, cluster show, cluster cleanup) can be run on master -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Greg Williamson
Дата:
Gabriele -- ----- Original Message ----- > From: Gabriele Bartolini <Gabriele.Bartolini@2ndQuadrant.it> > To: Greg Williamson <gwilliamson39@yahoo.com> > Cc: pgsql-admin@postgresql.org > Sent: Wednesday, October 3, 2012 4:41 AM > Subject: Re: [ADMIN] Database size stays constant but disk space keeps shrinking -- postgres 9.1 > > Hi Greg, > > On Tue, 2 Oct 2012 15:02:23 -0700 (PDT), Greg Williamson > <gwilliamson39@yahoo.com> wrote: > >> Other than abandoning repmgr I don't see a solution. I've posted > this >> to the repmgr discussion group but have had zero responses (and, >> frankly, am not holding my breath). > > If you are 100% sure it is repmgr ... :) > > I am not 100% sure it is ... Under normal circumstances (once the standby has > been cloned), repmgr simply controls the status of a standby server > communicating with a master through the streaming replication protocol. As any > other standby would do. > Indeed -- stopping repmgr has lef to a continued accumulation of the dead but not gone file handles, but almost all are nowowned by the application, with only a few held by WAL shipping and the like. So repmgr was just getting the blame, unfairly. Am working on the code itself now. Thanks for the response! Greg W.
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Jaime Casanova
Дата:
On Wed, Oct 3, 2012 at 4:53 PM, Greg Williamson <gwilliamson39@yahoo.com> wrote: > > Indeed -- stopping repmgr has lef to a continued accumulation of the dead but not gone file handles, but almost all arenow owned by the application, with only a few held by WAL shipping and the like. So repmgr was just getting the blame,unfairly. > > Am working on the code itself now. > Just in case, it seems like http://archives.postgresql.org/pgsql-committers/2012-10/msg00194.php is a fix to your problem. It was commited a few minutes ago by Tom Lane so you can patch your instalation or wait until the next postgresql minor version to be released -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157
Re: Database size stays constant but disk space keeps shrinking -- postgres 9.1
От
Greg Williamson
Дата:
fantastic !
thanks.
greg
From: Jaime Casanova <jaime@2ndquadrant.com>
To: Greg Williamson <gwilliamson39@yahoo.com>
Cc: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org>
Sent: Wednesday, October 17, 2012 10:05 AM
Subject: Re: [ADMIN] Database size stays constant but disk space keeps shrinking -- postgres 9.1
On Wed, Oct 3, 2012 at 4:53 PM, Greg Williamson <gwilliamson39@yahoo.com> wrote:
>
> Indeed -- stopping repmgr has lef to a continued accumulation of the dead but not gone file handles, but almost all are now owned by the application, with only a few held by WAL shipping and the like. So repmgr was just getting the blame, unfairly.
>
> Am working on the code itself now.
>
Just in case, it seems like
http://archives.postgresql.org/pgsql-committers/2012-10/msg00194.php
is a fix to your problem. It was commited a few minutes ago by Tom
Lane so you can patch your instalation or wait until the next
postgresql minor version to be released
--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157