Discussion: Cleaning up large objects
Hi,

I have a problem with the cleanup of large objects. My database consists of (mainly) one table that holds a date and an associated large object id. A large object can be as big as 2 megabytes. Every hour, I have a small method that removes all entries that are older than a certain date:

private void cleanup(Connection dbConn, Date deleteTo) throws SQLException {
    try {
        dbConn.setAutoCommit(false);
        dbConn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
        LargeObjectManager lobj = ((PGConnection) dbConn).getLargeObjectAPI();

        // fetch the LOB oids of all expired rows
        PreparedStatement queryStmt = dbConn.prepareStatement(
            "SELECT Values FROM MyTable WHERE From_Date < ?");
        queryStmt.setTimestamp(1, new Timestamp(deleteTo.getTime())); // JDBC parameters are 1-based
        ResultSet rs = queryStmt.executeQuery();
        try {
            // delete each large object individually
            while (rs.next()) {
                int oid = rs.getInt(1);
                try {
                    lobj.delete(oid);
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            }
        } finally {
            rs.close();
        }

        // then remove the rows themselves
        PreparedStatement deleteStmt = dbConn.prepareStatement(
            "DELETE FROM MyTable WHERE From_Date < ?");
        deleteStmt.setTimestamp(1, new Timestamp(deleteTo.getTime()));
        deleteStmt.executeUpdate();

        dbConn.commit();
    } catch (SQLException e) {
        dbConn.rollback();
        throw e;
    }
}

This program also seems to work; that is, I don't get any exceptions from it. Also, every night I run the "vacuum" command:

/usr/bin/vacuumdb -a -z

But: the database keeps growing. The table MyTable does seem to lose its entries, but the disk usage stays high. I am quite sure that the disk usage mainly comes from the LOBs, since the free disk space shrinks at exactly the rate I expect from the LOBs I put in.

What is wrong with my approach that it does not free the disk space?

Regards

Ole
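P.S. For comparison, here is an untested sketch of a variant that pushes the LOB deletion to the server with the built-in lo_unlink() function instead of looping over the rows on the client (the method name is made up; it assumes, as above, that the Values column holds the large object oid, and it uses the same java.sql imports as cleanup()):

// Sketch only: unlink expired LOBs server-side, then delete the rows,
// all in one transaction. lo_unlink() is invoked once per matching row.
private void cleanupWithUnlink(Connection dbConn, Date deleteTo) throws SQLException {
    try {
        dbConn.setAutoCommit(false);

        PreparedStatement unlinkStmt = dbConn.prepareStatement(
            "SELECT lo_unlink(Values) FROM MyTable WHERE From_Date < ?");
        unlinkStmt.setTimestamp(1, new Timestamp(deleteTo.getTime()));
        unlinkStmt.executeQuery().close(); // result is just one '1' per unlinked LOB
        unlinkStmt.close();

        PreparedStatement deleteStmt = dbConn.prepareStatement(
            "DELETE FROM MyTable WHERE From_Date < ?");
        deleteStmt.setTimestamp(1, new Timestamp(deleteTo.getTime()));
        deleteStmt.executeUpdate();
        deleteStmt.close();

        dbConn.commit();
    } catch (SQLException e) {
        dbConn.rollback();
        throw e;
    }
}

This saves one client round trip per LOB, but it would not change the disk-space behaviour described above.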
I forgot to mention some things: I use PostgreSQL 7.3.2, with the JDBC driver 7.4.

When I execute "vacuumdb -a -v -z", I get

INFO:  --Relation pg_catalog.pg_largeobject--
INFO:  Index pg_largeobject_loid_pn_index: Pages 41027; Tuples 731708: Deleted 20.
        CPU 3.95s/0.55u sec elapsed 100.14 sec.
INFO:  Removed 320 tuples in 75 pages.
        CPU 0.02s/0.01u sec elapsed 1.61 sec.
INFO:  Pages 1768741: Changed 14814, Empty 0; Tup 731708: Vac 320, Keep 104534, UnUsed 9314229.
        Total CPU 169.45s/9.47u sec elapsed 737.33 sec.
INFO:  Analyzing pg_catalog.pg_largeobject

320 tuples looks like twice the number of large objects I just removed. So I guess that /something/ works. However, no space is freed. Why?

Ole
"Ole Streicher" <ole-usenet-spam@gmx.net> writes: > What is wrong with my approach that it does not free the disk space? Plain vacuum is not designed to "free" disk space, at least not in the sense of returning it to the operating system. What it's designed to do is record free disk space within table files in the "free space map" (FSM) so that it can be re-used for future row creation. To aggressively compact table files and return space to the OS, you need vacuum full, which is a great deal slower and requires exclusive table locks. If you see space leakage in a database that should have a reasonably constant total size, the reason is usually that you don't have the FSM parameters set large enough to remember all the free space. Check your settings in postgresql.conf and increase if needed. (Note that an increase requires a postmaster restart to take effect.) You might also want to think about vacuuming more often than once a day, so that space can be recycled into the FSM sooner. When you have a periodic bulk delete process, it's not a bad idea to vacuum the table that had the deletes immediately after each deletion run. Depending on how far behind the eight-ball you are, you may need a pass of vacuum full to get the DB back down to a reasonable size. BTW, "cluster" can serve as a substitute for vacuum full, since it also compacts out dead space. It can be significantly faster than vacuum full when there's a lot of space to be reclaimed. regards, tom lane
On Tue, 10 Aug 2004, Ole Streicher wrote:
> When I execute "vacuumdb -a -v -z", I get
> However, no space is freed. Why?

Without -f (FULL), vacuum will not actually free space, but just mark it for later reuse. If you are seeing a growing rate over time that you think should hit a steady state, perhaps your free space map settings aren't large enough.

Kris Jurka
Hi Tom, > "Ole Streicher" <ole-usenet-spam@gmx.net> writes: > To aggressively compact table files and return space to the OS, you need > vacuum full, which is a great deal slower and requires exclusive table > locks. OK; this freed the space while running for about four hours. > If you see space leakage in a database that should have a reasonably > constant total size, the reason is usually that you don't have the FSM > parameters set large enough to remember all the free space. Check your > settings in postgresql.conf and increase if needed. (Note that an > increase requires a postmaster restart to take effect.) What is a reasonable value for these settings in my case? I write about 5 MBytes per hour (measured by the decrease of free disk space over some time) to 157 LOBs (mostly by appending to existing LOBs), where the data remain for about three months. This makes about 11 Gigabytes of needed Disk space. But my disk went out of space at about 15 Gigabytes. I have still no idea which value to put there; the (commented out) line in the cfg file is #max_fsm_pages = 10000 # min 1000, fsm is free space map, ~6 bytes > You might also want to think about vacuuming more often than once a day, > so that space can be recycled into the FSM sooner. When you have a > periodic bulk delete process, it's not a bad idea to vacuum the table > that had the deletes immediately after each deletion run. My problem here is that vaccum takes always a few minutes (except in cases when the database was not touched at all), and during that time the system load is quite high. That disturbs the "normal" data storage processes on the same server. Is it possible to run the vacuum with a lower priority (while all other postgres queries keeping the normal priority)? It does not help just to decrease the priority of the vacuum command since it is the postmaster thread that makes the load and not the command script). Regards Ole -- NEU: WLAN-Router f�r 0,- EUR* - auch f�r DSL-Wechsler! GMX DSL = superg�nstig & kabellos http://www.gmx.net/de/go/dsl
"Ole Streicher" <ole-usenet-spam@gmx.net> writes: > What is a reasonable value for these settings in my case? I'd try bumping up max_fsm_pages to a million or so. That would definitely be enough for a 10Gb database. You could probably get away with less but I'm not sure it's worth worrying about. If you are running 7.4 then a "vacuum verbose" will give you some data (at the very end of its lengthy printout) about FSM consumption, which you could use if you want to size more carefully. > Is it possible to run the vacuum with a lower priority (while all other > postgres queries keeping the normal priority)? No. In 8.0 there will be some tuning parameters that will help with that, but not in current releases... regards, tom lane