Обсуждение: Disk space consumed by pk not returned after vacuum or reindex
Hello,
I am running into a problem on a RHEL3 systems, running PostgreSQL 7.4
We have a database, which consistently consumes more and more of the disk space in it's lvol until it reaches 100%. So far, we have tried to run a full vacuum on the database, with limited success. Eventually, we had to drop and reload the database with the same data inside. It brought the disk usage down to 73%. It then began to creep once more toward 100%. After some research, I was able to use the pg_class catalog to find that the items which are expanding the quickest are primary key (btree) indexes. I attempted to run a REINDEX on one of the tables with the pk taking up the largest amount of space. The usage according to pg_class dropped dramatically, however the disk space was not returned to the system. So I attempted another full vacuum afterwards, and still nothing was returned to the system. These tables are updated extremely frequently (although their total number of rows is close to constant), which is my guess as to why the pk indexes increase so rapidly in terms of their disk usage. Unfortunately, PostgreSQL knowledge is limited, and I was wondering if anyone had experienced something similar / knows what else we can do to return this disk space back to the system?
Thank you in advance for any/all help!
Kevin
I am running into a problem on a RHEL3 systems, running PostgreSQL 7.4
We have a database, which consistently consumes more and more of the disk space in it's lvol until it reaches 100%. So far, we have tried to run a full vacuum on the database, with limited success. Eventually, we had to drop and reload the database with the same data inside. It brought the disk usage down to 73%. It then began to creep once more toward 100%. After some research, I was able to use the pg_class catalog to find that the items which are expanding the quickest are primary key (btree) indexes. I attempted to run a REINDEX on one of the tables with the pk taking up the largest amount of space. The usage according to pg_class dropped dramatically, however the disk space was not returned to the system. So I attempted another full vacuum afterwards, and still nothing was returned to the system. These tables are updated extremely frequently (although their total number of rows is close to constant), which is my guess as to why the pk indexes increase so rapidly in terms of their disk usage. Unfortunately, PostgreSQL knowledge is limited, and I was wondering if anyone had experienced something similar / knows what else we can do to return this disk space back to the system?
Thank you in advance for any/all help!
Kevin
Вложения
On Wed, Sep 13, 2006 at 09:53:16 -0400, Kevin Johnson <Kevin.Johnson@noaa.gov> wrote: > > We have a database, which consistently consumes more and more of the > disk space in it's lvol until it reaches 100%. So far, we have tried to > run a full vacuum on the database, with limited success. Eventually, we > had to drop and reload the database with the same data inside. It > brought the disk usage down to 73%. It then began to creep once more > toward 100%. After some research, I was able to use the pg_class > catalog to find that the items which are expanding the quickest are > primary key (btree) indexes. I attempted to run a REINDEX on one of the > tables with the pk taking up the largest amount of space. The usage > according to pg_class dropped dramatically, however the disk space was > not returned to the system. So I attempted another full vacuum > afterwards, and still nothing was returned to the system. These tables > are updated extremely frequently (although their total number of rows is > close to constant), which is my guess as to why the pk indexes increase > so rapidly in terms of their disk usage. Unfortunately, PostgreSQL > knowledge is limited, and I was wondering if anyone had experienced > something similar / knows what else we can do to return this disk space > back to the system? This is possibly index bloat due to new keys always being larger than existing keys. This was fixed in later releases. There is still some potential for bloat due to fragmentation, but I believe that has a constant bound. You might try using the cluster command. I think that will both clean up the indexes and remove dead rows and do it faster than using a vacuum full and reindexing. The downside is that the table will be unavailable during the cluster which might be a deal breaker for you.
Thank you for the suggestion, Bruno. The clustering did the trick in reducing the current disk usage, however eventually the disk space get consumed once more. I fear that we may just need to update the version of Postgres to help alleviate index bloat!
Bruno Wolff III wrote:
Bruno Wolff III wrote:
On Wed, Sep 13, 2006 at 09:53:16 -0400, Kevin Johnson <Kevin.Johnson@noaa.gov> wrote:We have a database, which consistently consumes more and more of the disk space in it's lvol until it reaches 100%. So far, we have tried to run a full vacuum on the database, with limited success. Eventually, we had to drop and reload the database with the same data inside. It brought the disk usage down to 73%. It then began to creep once more toward 100%. After some research, I was able to use the pg_class catalog to find that the items which are expanding the quickest are primary key (btree) indexes. I attempted to run a REINDEX on one of the tables with the pk taking up the largest amount of space. The usage according to pg_class dropped dramatically, however the disk space was not returned to the system. So I attempted another full vacuum afterwards, and still nothing was returned to the system. These tables are updated extremely frequently (although their total number of rows is close to constant), which is my guess as to why the pk indexes increase so rapidly in terms of their disk usage. Unfortunately, PostgreSQL knowledge is limited, and I was wondering if anyone had experienced something similar / knows what else we can do to return this disk space back to the system?This is possibly index bloat due to new keys always being larger than existing keys. This was fixed in later releases. There is still some potential for bloat due to fragmentation, but I believe that has a constant bound. You might try using the cluster command. I think that will both clean up the indexes and remove dead rows and do it faster than using a vacuum full and reindexing. The downside is that the table will be unavailable during the cluster which might be a deal breaker for you.
-- Kevin Johnson Raytheon AWIPS Sr. Systems Engineer NWS Network Control Facility p: 301.713.9362x325 f: 301.713.1905
I didn't see you mention what version you're running; index bloat shouldn't be a big issue in 7.4 and above. You also didn't mention how often you're vacuuming the table. If you don't vacuum the table frequently enough, you're going to get bloat, plain and simple. On Oct 5, 2006, at 11:24 AM, Kevin Johnson wrote: > Thank you for the suggestion, Bruno. The clustering did the trick > in reducing the current disk usage, however eventually the disk > space get consumed once more. I fear that we may just need to > update the version of Postgres to help alleviate index bloat! > > Bruno Wolff III wrote: >> On Wed, Sep 13, 2006 at 09:53:16 -0400, Kevin Johnson >> <Kevin.Johnson@noaa.gov> wrote: >>> We have a database, which consistently consumes more and more of >>> the disk space in it's lvol until it reaches 100%. So far, we >>> have tried to run a full vacuum on the database, with limited >>> success. Eventually, we had to drop and reload the database with >>> the same data inside. It brought the disk usage down to 73%. It >>> then began to creep once more toward 100%. After some research, I >>> was able to use the pg_class catalog to find that the items which >>> are expanding the quickest are primary key (btree) indexes. I >>> attempted to run a REINDEX on one of the tables with the pk >>> taking up the largest amount of space. The usage according to >>> pg_class dropped dramatically, however the disk space was not >>> returned to the system. So I attempted another full vacuum >>> afterwards, and still nothing was returned to the system. These >>> tables are updated extremely frequently (although their total >>> number of rows is close to constant), which is my guess as to why >>> the pk indexes increase so rapidly in terms of their disk usage. >>> Unfortunately, PostgreSQL knowledge is limited, and I was >>> wondering if anyone had experienced something similar / knows >>> what else we can do to return this disk space back to the system? >> This is possibly index bloat due to new keys always being larger >> than existing keys. This was fixed in later releases. There is >> still some potential for bloat due to fragmentation, but I believe >> that has a constant bound. You might try using the cluster >> command. I think that will both clean up the indexes and remove >> dead rows and do it faster than using a vacuum full and >> reindexing. The downside is that the table will be unavailable >> during the cluster which might be a deal breaker for you. -- Jim Nasby jim@nasby.net EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) -- Jim Nasby jimn@enterprisedb.com EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
You are correct, it is 7.4 we are running on a RHEL3 system.
The database itself is vacuumed via cron 6 times a day.
Jim Nasby wrote:
The database itself is vacuumed via cron 6 times a day.
Jim Nasby wrote:
I didn't see you mention what version you're running; index bloat shouldn't be a big issue in 7.4 and above. You also didn't mention how often you're vacuuming the table. If you don't vacuum the table frequently enough, you're going to get bloat, plain and simple.
On Oct 5, 2006, at 11:24 AM, Kevin Johnson wrote:Thank you for the suggestion, Bruno. The clustering did the trick in reducing the current disk usage, however eventually the disk space get consumed once more. I fear that we may just need to update the version of Postgres to help alleviate index bloat!--
Bruno Wolff III wrote:On Wed, Sep 13, 2006 at 09:53:16 -0400, Kevin Johnson <Kevin.Johnson@noaa.gov> wrote:We have a database, which consistently consumes more and more of the disk space in it's lvol until it reaches 100%. So far, we have tried to run a full vacuum on the database, with limited success. Eventually, we had to drop and reload the database with the same data inside. It brought the disk usage down to 73%. It then began to creep once more toward 100%. After some research, I was able to use the pg_class catalog to find that the items which are expanding the quickest are primary key (btree) indexes. I attempted to run a REINDEX on one of the tables with the pk taking up the largest amount of space. The usage according to pg_class dropped dramatically, however the disk space was not returned to the system. So I attempted another full vacuum afterwards, and still nothing was returned to the system. These tables are updated extremely frequently (although their total number of rows is close to constant), which is my guess as to why the pk indexes increase so rapidly in terms of their disk usage. Unfortunately, PostgreSQL knowledge is limited, and I was wondering if anyone had experienced something similar / knows what else we can do to return this disk space back to the system?This is possibly index bloat due to new keys always being larger than existing keys. This was fixed in later releases. There is still some potential for bloat due to fragmentation, but I believe that has a constant bound. You might try using the cluster command. I think that will both clean up the indexes and remove dead rows and do it faster than using a vacuum full and reindexing. The downside is that the table will be unavailable during the cluster which might be a deal breaker for you.
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
--
Jim Nasby jimn@enterprisedb.com
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
-- Kevin Johnson Raytheon AWIPS Sr. Systems Engineer NWS Network Control Facility p: 301.713.9362x325 f: 301.713.1905
Вложения
On Oct 6, 2006, at 5:39 AM, Kevin Johnson wrote: > You are correct, it is 7.4 we are running on a RHEL3 system. > The database itself is vacuumed via cron 6 times a day. > > Jim Nasby wrote: >> I didn't see you mention what version you're running; index bloat >> shouldn't be a big issue in 7.4 and above. You also didn't mention >> how often you're vacuuming the table. If you don't vacuum the >> table frequently enough, you're going to get bloat, plain and simple. For a heavy update load, even 6 times a day may not be enough. Also, remember that the size you get after a REINDEX will be unreasonably low, because there will be very little free space in the index. That's fine if you won't be updating the table much, but otherwise the index will have no choice but to grow. -- Jim Nasby jimn@enterprisedb.com EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Fri, Oct 06, 2006 at 06:39:40 -0400, Kevin Johnson <Kevin.Johnson@noaa.gov> wrote: > You are correct, it is 7.4 we are running on a RHEL3 system. > The database itself is vacuumed via cron 6 times a day. The other thing you need to watch out for is that your FSM setting is large enough to recover all of the dead space. If your vacuum runs aren't recovering as much dead space as is being created on average, then you are going to have dead space accumalate.