Обсуждение: Toast space grows

Поиск
Список
Период
Сортировка

Toast space grows

От
"Pavel Rotek"
Дата:
Hello,

i have problem with following table...

create table dataaction (
   id INT4 not null,
   log text,
   primary key (id)
);

It is the table for storing results of long running jobs. The log attribute takes approximately 5MB for one row (there is about 300 rows). My problem is, that table dataaction takes after restoring about 1,5G, but in few days grows to 79G(Toast space)... Vacuum on the table doesn't finish.

Where can be the problem?? How to solve it, how to shrink toast space??

Any help will be appretiated.

Kind regards,

Pavel Rotek

Re: Toast space grows

От
Richard Huxton
Дата:
Pavel Rotek wrote:
> Hello,
>
> i have problem with following table...
>
> create table dataaction (
>    id INT4 not null,
>    log text,
>    primary key (id)
> );
>
> It is the table for storing results of long running jobs. The log attribute
> takes approximately 5MB for one row (there is about 300 rows). My problem
> is, that table dataaction takes after restoring about 1,5G, but in few days
> grows to 79G(Toast space)...

1. What is happening with this table - just inserts, lots of updates?

2. What does SELECT sum(length(log)) FROM dataaction; show?

 > Vacuum on the table doesn't finish.

A plain vacuum doesn't finish, or vacuum full doesn't finish?

--
   Richard Huxton
   Archonet Ltd

Re: Toast space grows

От
"Pavel Rotek"
Дата:
we just restored it to free 70G :-(

There are inserts and few updates (but what do you mean with update?? committed update??, because there are many updates of the log attribute in trasaction, we do periodical flush during transaction), sum takes approximately 1,2G, and i mean vacuum full (but there is no lock on the table when running vacuum full). I haven't try plain vacuum.

2008/3/7, Richard Huxton <dev@archonet.com>:
Pavel Rotek wrote:
> Hello,
>
> i have problem with following table...
>
> create table dataaction (
>    id INT4 not null,
>    log text,
>    primary key (id)
> );
>
> It is the table for storing results of long running jobs. The log attribute
> takes approximately 5MB for one row (there is about 300 rows). My problem
> is, that table dataaction takes after restoring about 1,5G, but in few days
> grows to 79G(Toast space)...


1. What is happening with this table - just inserts, lots of updates?

2. What does SELECT sum(length(log)) FROM dataaction; show?


  > Vacuum on the table doesn't finish.


A plain vacuum doesn't finish, or vacuum full doesn't finish?


--
   Richard Huxton
   Archonet Ltd

Re: Toast space grows

От
Bill Moran
Дата:
In response to "Pavel Rotek" <pavel.rotek@gmail.com>:
>
> There are inserts and few updates (but what do you mean with update??

He means adding or changing data in the table.

> committed update??, because there are many updates of the log attribute in
> trasaction, we do periodical flush during transaction)

Are you saying you have long-running transactions?  How long does a
single transaction take?

Transactions prevent vacuum from being able to clean up.  Long running
transactions tend to render vacuum ineffective.

>, sum takes
> approximately 1,2G, and i mean vacuum full (but there is no lock on the
> table when running vacuum full). I haven't try plain vacuum.

Don't do vacuum full on this table.  Do frequent vacuums.  The table will
bloat some, but not 10x the required size, once you find a reasonable
frequency for vacuums.  You might find it practical to manually vacuum
this table from your application after insert and update operations.

>
> 2008/3/7, Richard Huxton <dev@archonet.com>:
> >
> > Pavel Rotek wrote:
> > > Hello,
> > >
> > > i have problem with following table...
> > >
> > > create table dataaction (
> > >    id INT4 not null,
> > >    log text,
> > >    primary key (id)
> > > );
> > >
> > > It is the table for storing results of long running jobs. The log
> > attribute
> > > takes approximately 5MB for one row (there is about 300 rows). My
> > problem
> > > is, that table dataaction takes after restoring about 1,5G, but in few
> > days
> > > grows to 79G(Toast space)...
> >
> >
> > 1. What is happening with this table - just inserts, lots of updates?
> >
> > 2. What does SELECT sum(length(log)) FROM dataaction; show?
> >
> >
> >   > Vacuum on the table doesn't finish.
> >
> >
> > A plain vacuum doesn't finish, or vacuum full doesn't finish?
> >
> >
> > --
> >    Richard Huxton
> >    Archonet Ltd
> >
>


--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

****************************************************************
IMPORTANT: This message contains confidential information and is
intended only for the individual named. If the reader of this
message is not an intended recipient (or the individual
responsible for the delivery of this message to an intended
recipient), please be advised that any re-use, dissemination,
distribution or copying of this message is prohibited. Please
notify the sender immediately by e-mail if you have received
this e-mail by mistake and delete this e-mail from your system.
E-mail transmission cannot be guaranteed to be secure or
error-free as information could be intercepted, corrupted, lost,
destroyed, arrive late or incomplete, or contain viruses. The
sender therefore does not accept liability for any errors or
omissions in the contents of this message, which arise as a
result of e-mail transmission.
****************************************************************

Re: Toast space grows

От
"Pavel Rotek"
Дата:


2008/3/7, Bill Moran <wmoran@collaborativefusion.com>:
In response to "Pavel Rotek" <pavel.rotek@gmail.com>:

>
> There are inserts and few updates (but what do you mean with update??


He means adding or changing data in the table.

I understand, but i don't have deep understanding of mechanism, that affects toast space. I don't know if toast space is affected only with commited transactions or also with uncommitted transactions. This is the reason why i asked.

> committed update??, because there are many updates of the log attribute in
> trasaction, we do periodical flush during transaction)


Are you saying you have long-running transactions?  How long does a
single transaction take?

Transactions prevent vacuum from being able to clean up.  Long running
transactions tend to render vacuum ineffective.

No i do not mean long running transactions... Update of log entry (update of row in dataaction) is performed in series of short transactions, but during short transaction there is a lot of change log value, flush, change log value ,flush ..... change log value, flush actions (flush means perform flush operation via JDBC driver). I'm not sure if this flush affects toast space... Maybe this is the reason.

>, sum takes
> approximately 1,2G, and i mean vacuum full (but there is no lock on the
> table when running vacuum full). I haven't try plain vacuum.


Don't do vacuum full on this table.  Do frequent vacuums.  The table will
bloat some, but not 10x the required size, once you find a reasonable
frequency for vacuums.  You might find it practical to manually vacuum
this table from your application after insert and update operations.

I perform autovacuum daily.

>
> 2008/3/7, Richard Huxton <dev@archonet.com>:
> >
> > Pavel Rotek wrote:
> > > Hello,
> > >
> > > i have problem with following table...
> > >
> > > create table dataaction (
> > >    id INT4 not null,
> > >    log text,
> > >    primary key (id)
> > > );
> > >
> > > It is the table for storing results of long running jobs. The log
> > attribute
> > > takes approximately 5MB for one row (there is about 300 rows). My
> > problem
> > > is, that table dataaction takes after restoring about 1,5G, but in few
> > days
> > > grows to 79G(Toast space)...
> >
> >
> > 1. What is happening with this table - just inserts, lots of updates?
> >
> > 2. What does SELECT sum(length(log)) FROM dataaction; show?
> >
> >
> >   > Vacuum on the table doesn't finish.
> >
> >
> > A plain vacuum doesn't finish, or vacuum full doesn't finish?
> >
> >
> > --
> >    Richard Huxton
> >    Archonet Ltd
> >
>



--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

****************************************************************
IMPORTANT: This message contains confidential information and is
intended only for the individual named. If the reader of this
message is not an intended recipient (or the individual
responsible for the delivery of this message to an intended
recipient), please be advised that any re-use, dissemination,
distribution or copying of this message is prohibited. Please
notify the sender immediately by e-mail if you have received
this e-mail by mistake and delete this e-mail from your system.
E-mail transmission cannot be guaranteed to be secure or
error-free as information could be intercepted, corrupted, lost,
destroyed, arrive late or incomplete, or contain viruses. The
sender therefore does not accept liability for any errors or
omissions in the contents of this message, which arise as a
result of e-mail transmission.
****************************************************************

Re: Toast space grows

От
Alvaro Herrera
Дата:
Pavel Rotek escribió:
> 2008/3/7, Bill Moran <wmoran@collaborativefusion.com>:

> > Don't do vacuum full on this table.  Do frequent vacuums.  The table will
> > bloat some, but not 10x the required size, once you find a reasonable
> > frequency for vacuums.  You might find it practical to manually vacuum
> > this table from your application after insert and update operations.
>
> I perform autovacuum daily.

Sorry, this sentence makes no sense.  Do you mean that you set
autovacuum_naptime=1 day?  If so, that's a bad idea -- you should let
autovacuum run far more frequently.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Toast space grows

От
Tom Lane
Дата:
"Pavel Rotek" <pavel.rotek@gmail.com> writes:
> No i do not mean long running transactions... Update of log entry (update of
> row in dataaction) is performed in series of short transactions, but during
> short transaction there is a lot of change log value, flush, change log
> value ,flush ..... change log value, flush actions (flush means perform
> flush operation via JDBC driver). I'm not sure if this flush affects toast
> space... Maybe this is the reason.

You mean that you build up the 5MB log entry by adding a few lines at a
time?  That's going to consume horrid amounts of toast space, because
each time you add a few lines, an entire new toasted field value is
created.

If you have to do it that way, you'll need very frequent vacuums on this
table (not vacuum full, as noted already) to keep the toast space from
bloating too much.  And make sure you've got max_fsm_pages set high
enough.

If you can restructure your code a bit, it might be better to accumulate
log values in a short-lived table and only store the final form of a log
entry into the main table.

            regards, tom lane

Re: Toast space grows

От
Matthew
Дата:
> "Pavel Rotek" <pavel.rotek@gmail.com> writes:
>> No i do not mean long running transactions... Update of log entry (update of
>> row in dataaction) is performed in series of short transactions, but during
>> short transaction there is a lot of change log value, flush, change log
>> value ,flush ..... change log value, flush actions (flush means perform
>> flush operation via JDBC driver). I'm not sure if this flush affects toast
>> space... Maybe this is the reason.

On Fri, 7 Mar 2008, Tom Lane wrote:
> You mean that you build up the 5MB log entry by adding a few lines at a
> time?  That's going to consume horrid amounts of toast space, because
> each time you add a few lines, an entire new toasted field value is
> created.

Moreover, what is the point of flushing data to Postgres without
committing the transaction, if you're only going to overwrite the data
later. You don't get any level of protection for your data just by
flushing it to Postgres - you have to commit the transaction for that to
happen. In my opinion, you should just be generating the log entry in
memory entirely, and then flushing it in a transaction commit when it's
finished, since you're obviously holding it all in memory all the time
anyway.

> If you have to do it that way, you'll need very frequent vacuums on this
> table (not vacuum full, as noted already) to keep the toast space from
> bloating too much.  And make sure you've got max_fsm_pages set high
> enough.

Agreed, this is kind of the worst-case-scenario for table bloat.

Matthew

--
Now the reason people powdered their faces back then was to change the values
"s" and "n" in this equation here.                 - Computer science lecturer

Re: Toast space grows

От
"Pavel Rotek"
Дата:


2008/3/7, Tom Lane <tgl@sss.pgh.pa.us>:
"Pavel Rotek" <pavel.rotek@gmail.com> writes:
> No i do not mean long running transactions... Update of log entry (update of
> row in dataaction) is performed in series of short transactions, but during
> short transaction there is a lot of change log value, flush, change log
> value ,flush ..... change log value, flush actions (flush means perform
> flush operation via JDBC driver). I'm not sure if this flush affects toast
> space... Maybe this is the reason.


You mean that you build up the 5MB log entry by adding a few lines at a
time?  That's going to consume horrid amounts of toast space, because
each time you add a few lines, an entire new toasted field value is
created.

well, this will be the main problem... But... do uncomitted trasactions affect toast space?

If you have to do it that way, you'll need very frequent vacuums on this
table (not vacuum full, as noted already) to keep the toast space from
bloating too much.  And make sure you've got max_fsm_pages set high
enough.
 
i'll set max_fsm_pages to 1 000 000. It should be enough and set autovacuum_naptime to 10 minutes. May it be?

If you can restructure your code a bit, it might be better to accumulate
log values in a short-lived table and only store the final form of a log
entry into the main table.

I'll try to refactor the code... My application do following thing... long running jobs (for example long imports) are broken into series of short transactions to store snapshot of current state of long running job. Short transaction consist of
(begin tx, load previous log, do business action, append new log, flush, do business action, append new log, flush, ... do business action, append new log, flush, commit tx). Is it enough to avoid multiple "append new log, flush" in one short transaction and keep log changes for short transaction in the buffer (only one update of log attribute at the end of transaction)? From your answer probably not, but i ask for sure, it will be less work. Or store logs for each one partial transaction and concat all at the end of long running job??

                        regards, tom lane

Re: Toast space grows

От
Matthew
Дата:
On Fri, 7 Mar 2008, Pavel Rotek wrote:
> well, this will be the main problem... But... do uncomitted trasactions
> affect toast space?

I think the demonstrated answer to this is yes.

> (begin tx, load previous log, do business action, append new log, flush, do
> business action, append new log, flush, ... do business action, append new
> log, flush, commit tx).

If all you're doing is appending to the end of the log, why don't you make
each "append" a new row in a table. Instead of building massive rows, use
the database for what it was designed for, and have many smaller
independent rows.

Matthew

--
"To err is human; to really louse things up requires root
 privileges."                 -- Alexander Pope, slightly paraphrased

Re: Toast space grows

От
"Pavel Rotek"
Дата:


2008/3/7, Matthew <matthew@flymine.org>:
> "Pavel Rotek" <pavel.rotek@gmail.com> writes:
>> No i do not mean long running transactions... Update of log entry (update of
>> row in dataaction) is performed in series of short transactions, but during
>> short transaction there is a lot of change log value, flush, change log
>> value ,flush ..... change log value, flush actions (flush means perform
>> flush operation via JDBC driver). I'm not sure if this flush affects toast
>> space... Maybe this is the reason.


On Fri, 7 Mar 2008, Tom Lane wrote:
> You mean that you build up the 5MB log entry by adding a few lines at a
> time?  That's going to consume horrid amounts of toast space, because
> each time you add a few lines, an entire new toasted field value is
> created.


Moreover, what is the point of flushing data to Postgres without
committing the transaction, if you're only going to overwrite the data
later. You don't get any level of protection for your data just by
flushing it to Postgres - you have to commit the transaction for that to
happen. In my opinion, you should just be generating the log entry in
memory entirely, and then flushing it in a transaction commit when it's
finished, since you're obviously holding it all in memory all the time
anyway.

Because i use kind of hibrid access to work with data in database (both hibernate and plain JDBC queries shares the same connection) and when i want to see data saved via hibernate in JDBC queries, i have to do flush of hibernate session... :-(

> If you have to do it that way, you'll need very frequent vacuums on this
> table (not vacuum full, as noted already) to keep the toast space from
> bloating too much.  And make sure you've got max_fsm_pages set high
> enough.


Agreed, this is kind of the worst-case-scenario for table bloat.

Matthew

--
Now the reason people powdered their faces back then was to change the values
"s" and "n" in this equation here.                 - Computer science lecturer


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: Toast space grows

От
"Pavel Rotek"
Дата:


2008/3/7, Matthew <matthew@flymine.org>:
On Fri, 7 Mar 2008, Pavel Rotek wrote:
> well, this will be the main problem... But... do uncomitted trasactions
> affect toast space?


I think the demonstrated answer to this is yes.


> (begin tx, load previous log, do business action, append new log, flush, do
> business action, append new log, flush, ... do business action, append new
> log, flush, commit tx).


If all you're doing is appending to the end of the log, why don't you make
each "append" a new row in a table. Instead of building massive rows, use
the database for what it was designed for, and have many smaller
independent rows.

Matthew

Because I modify existing application, where logic is already given :-(. If no other way exists, i'll have to do refactoring...

--
"To err is human; to really louse things up requires root
  privileges."                 -- Alexander Pope, slightly paraphrased

Re: Toast space grows

От
Tom Lane
Дата:
"Pavel Rotek" <pavel.rotek@gmail.com> writes:
> 2008/3/7, Tom Lane <tgl@sss.pgh.pa.us>:
>> You mean that you build up the 5MB log entry by adding a few lines at a
>> time?  That's going to consume horrid amounts of toast space, because
>> each time you add a few lines, an entire new toasted field value is
>> created.

> well, this will be the main problem... But... do uncomitted trasactions
> affect toast space?

Sure.  Where do you think the data goes?  It's gotta be stored
someplace.  Every UPDATE operation that changes a toasted field will
consume space for a fresh copy of that field, whether it ever commits or
not.  You need VACUUM to reclaim the space eaten by no-longer-accessible
copies.

            regards, tom lane

Re: Toast space grows

От
Bill Moran
Дата:
In response to "Pavel Rotek" <pavel.rotek@gmail.com>:

> 2008/3/7, Tom Lane <tgl@sss.pgh.pa.us>:

[snip]

> > If you have to do it that way, you'll need very frequent vacuums on this
> > table (not vacuum full, as noted already) to keep the toast space from
> > bloating too much.  And make sure you've got max_fsm_pages set high
> > enough.
>
> i'll set max_fsm_pages to 1 000 000. It should be enough and set
> autovacuum_naptime to 10 minutes. May it be?

No.  Keep naptime at 1 minute.  If it comes around and there's nothing
to do, the overhead is minimal.  If you set the naptime too high, it might
have too much to do on the next cycle and then it'll bog things down.
Also, it only checks 1 database per cycle, so setting it to 10 minutes
means a _minimum_ of 40 minutes between checks (because you have a template0,
template1, postgres, and your database minimum)

Also, keep an eye on your database bloat to ensure the various
autovacuum_*_scale_factor and related settings are appropriate.
It's been found that these are often not aggressive enough for
good maintenance.  If you see bloat even with autovacuum running,
reduce those values.

Personally, I'd recommend running a MRTG graph that graphs the size
of this table so you can easily watch to see if your config tweaks
are getting the job done or not.  And remember that _some_ bloat is
expected and normal for operation.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

Re: Toast space grows

От
"Pavel Rotek"
Дата:
Thanks to all for time and valuable help,

Pavel Rotek

2008/3/7, Bill Moran <wmoran@collaborativefusion.com>:
In response to "Pavel Rotek" <pavel.rotek@gmail.com>:


> 2008/3/7, Tom Lane <tgl@sss.pgh.pa.us>:


[snip]


> > If you have to do it that way, you'll need very frequent vacuums on this
> > table (not vacuum full, as noted already) to keep the toast space from
> > bloating too much.  And make sure you've got max_fsm_pages set high
> > enough.
>
> i'll set max_fsm_pages to 1 000 000. It should be enough and set
> autovacuum_naptime to 10 minutes. May it be?


No.  Keep naptime at 1 minute.  If it comes around and there's nothing
to do, the overhead is minimal.  If you set the naptime too high, it might
have too much to do on the next cycle and then it'll bog things down.
Also, it only checks 1 database per cycle, so setting it to 10 minutes
means a _minimum_ of 40 minutes between checks (because you have a template0,
template1, postgres, and your database minimum)

Also, keep an eye on your database bloat to ensure the various
autovacuum_*_scale_factor and related settings are appropriate.
It's been found that these are often not aggressive enough for
good maintenance.  If you see bloat even with autovacuum running,
reduce those values.

Personally, I'd recommend running a MRTG graph that graphs the size
of this table so you can easily watch to see if your config tweaks
are getting the job done or not.  And remember that _some_ bloat is
expected and normal for operation.


--

Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023