Обсуждение: vacuum vs pg_repack for clearing bloat?
Our app makes extensive use of temp tables, and this causes a significant amount of bloat that can often only be cleared with a manual vacuum process. We're looking for a better way that doesn't involve locking, we found pg_repack and pg_reorg and were wondering if anybody here could weigh in on using this instead of using vacuum? pg_repack: https://github.com/reorg/pg_repack pg_reorg http://reorg.projects.pgfoundry.org/ Thanks in advance, Ben
Lists <lists@benjamindsmith.com> writes: > Our app makes extensive use of temp tables, and this causes a > significant amount of bloat that can often only be cleared with a manual > vacuum process. We're looking for a better way that doesn't involve > locking, we found pg_repack and pg_reorg and were wondering if anybody > here could weigh in on using this instead of using vacuum? A temp table is only accessible to the owning process, so if you're hoping for vacuuming of it to happen silently in background, you'll be sadly disappointed. The speed advantage of a temp table come exactly from not having to worry about concurrent access, so this isn't a tradeoff that can easily be adjusted. regards, tom lane
On Wed, Jan 15, 2014 at 04:09:28PM -0800, Lists wrote: > Our app makes extensive use of temp tables, and this causes a > significant amount of bloat that can often only be cleared with a Note what Tom Lane said, but why do you have bloat that can only be cleared by vacuum? Why not drop them or whatever (I presume they're not going away because your connection is long lived)? They're supposed to be temporary, after all: cheap and disposable. A -- Andrew Sullivan ajs@crankycanuck.ca
On 1/15/2014 4:09 PM, Lists wrote: > Our app makes extensive use of temp tables, and this causes a > significant amount of bloat that can often only be cleared with a > manual vacuum process. whats the persistence of these temporary tables? by design, they are meant for relatively short lifespan uses, and as Tom said are only accessible by the connection that created them, and when you're done with them you should drop them. now, if you mean 'temporary table' in another sense, like regular tables you're using for storing temporary data, how are you cleaning this 'temporary data' out of them? truncate should free the disk space they use where delete won't. -- john r pierce 37N 122W somewhere on the middle of the left coast
On 01/15/2014 04:24 PM, Tom Lane wrote: > Lists <lists@benjamindsmith.com> writes: >> Our app makes extensive use of temp tables, and this causes a >> significant amount of bloat that can often only be cleared with a manual >> vacuum process. We're looking for a better way that doesn't involve >> locking, we found pg_repack and pg_reorg and were wondering if anybody >> here could weigh in on using this instead of using vacuum? > A temp table is only accessible to the owning process, so if you're hoping > for vacuuming of it to happen silently in background, you'll be sadly > disappointed. The speed advantage of a temp table come exactly from not > having to worry about concurrent access, so this isn't a tradeoff that can > easily be adjusted. > > regards, tom lane Tom, The process(es) creating the temp tables are not persistent, so the issue isn't trying to clean up bloat from a long running process, it's clearing out the cruft that results from creating temp tables, loading a bunch of data, then dropping the table, either explicitly or when the connection is terminated. This causes PG disk usage to climb without causing any change in pg_dump output. I was wondering if anybody else had used either of these projects (pg_repack or pg_reorg, though reorg seems to be unsupported) and if so, how successful they had been. -Ben
Lists <lists@benjamindsmith.com> writes: > The process(es) creating the temp tables are not persistent, so the > issue isn't trying to clean up bloat from a long running process, it's > clearing out the cruft that results from creating temp tables, loading a > bunch of data, then dropping the table, either explicitly or when the > connection is terminated. This causes PG disk usage to climb without > causing any change in pg_dump output. Oh, you're worried about system catalog bloat from lots of temp tables? I'd have thought that autovacuum could manage that, though you might need to dial up its aggressiveness. Not sure that things like pg_reorg can safely be applied to system catalogs. (That's not to say it wouldn't work, but I'd sure test it on disposable installations first.) regards, tom lane
On 1/15/2014 5:37 PM, Lists wrote: > it's clearing out the cruft that results from creating temp tables, > loading a bunch of data, then dropping the table, there shoudl be zero cruft. when the table is dropped, it should pooferate. -- john r pierce 37N 122W somewhere on the middle of the left coast
On Wed, Jan 15, 2014 at 05:37:27PM -0800, Lists wrote: > it's clearing out the cruft that results from creating temp tables, > loading a bunch of data, then dropping the table, either explicitly > or when the connection is terminated. This causes PG disk usage to > climb without causing any change in pg_dump output. Oh. You need to up your autovacuum settings for the system catalog tables. I built a system that did this sort of thing. If your autovacuum settings are correct, this will stabilize. A -- Andrew Sullivan ajs@crankycanuck.ca