Обсуждение: db size growing out of control when using clustered Jackrabbit
Hello. I'm hoping someone has seen this before.
We are trying to use Postgres Plus v9.1.3 as the Persistence Manager in Jackrabbit (Apache JCR) clustering (http://wiki.apache.org/jackrabbit/Clustering).
Whenever the JCR is under load, the ws_bundle TOAST table in the repository schema, grows out of control !
Some of my team members maintain that this problem doesn't occur with MySQL, but I would rather stay with Postgres if possible...
Thanks.
We are trying to use Postgres Plus v9.1.3 as the Persistence Manager in Jackrabbit (Apache JCR) clustering (http://wiki.apache.org/jackrabbit/Clustering).
Whenever the JCR is under load, the ws_bundle TOAST table in the repository schema, grows out of control !
Some of my team members maintain that this problem doesn't occur with MySQL, but I would rather stay with Postgres if possible...
Thanks.
On 07/23/2012 02:13 PM, Gary Webster wrote: > Hello. I'm hoping someone has seen this before. > > We are trying to use Postgres Plus v9.1.3 as the Persistence Manager in > Jackrabbit (Apache JCR) clustering > (http://wiki.apache.org/jackrabbit/Clustering). > Whenever the JCR is under load, the ws_bundle TOAST table in the > repository schema, grows out of control ! > > Some of my team members maintain that this problem doesn't occur with > MySQL, but I would rather stay with Postgres if possible... I don't really know anything about jackrabbit but generally this problem presents when you have a lot of transactions that are idle. Meaning, you have transactions that are just open, doing nothing. This will present a problem with routine maintenance. Under load you can check your process list to see if you have long running transactions that are idle ( idle in transaction ). If you do, you have a code problem not a postgres problem and it is presenting itself through bloat. Note: IDLE is fine. It is specifically IDLE IN TRANSACTION that is a problem. Sincerely, Joshua D. Drake > > Thanks. > -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579
On 07/24/2012 05:13 AM, Gary Webster wrote:
Hello. I'm hoping someone has seen this before.
We are trying to use Postgres Plus v9.1.3 as the Persistence Manager in Jackrabbit (Apache JCR) clustering (http://wiki.apache.org/jackrabbit/Clustering).
Whenever the JCR is under load, the ws_bundle TOAST table in the repository schema, grows out of control !
Has somebody disabled autovacuum or set it to barely run at all?
Try setting autovacuum to very aggressively vacuum the problem table(s).
--
Craig Ringer
Hello.
Thanks for the response.
Autovacuum is set VERY aggressive.
However, it does not help with the ws_bundle Toast table.
A manual _full_ vacuum (not recommended?) does do the deed.
However, it often gives this error:
ERROR: missing chunk number 0 for toast value 639113 in pg_toast_533386
BTW, how bad is that? Corruption?
Thanks for the response.
Autovacuum is set VERY aggressive.
However, it does not help with the ws_bundle Toast table.
A manual _full_ vacuum (not recommended?) does do the deed.
However, it often gives this error:
ERROR: missing chunk number 0 for toast value 639113 in pg_toast_533386
BTW, how bad is that? Corruption?
On Mon, Jul 23, 2012 at 8:35 PM, Craig Ringer <ringerc@ringerc.id.au> wrote:
Has somebody disabled autovacuum or set it to barely run at all?On 07/24/2012 05:13 AM, Gary Webster wrote:Hello. I'm hoping someone has seen this before.
We are trying to use Postgres Plus v9.1.3 as the Persistence Manager in Jackrabbit (Apache JCR) clustering (http://wiki.apache.org/jackrabbit/Clustering).
Whenever the JCR is under load, the ws_bundle TOAST table in the repository schema, grows out of control !
Try setting autovacuum to very aggressively vacuum the problem table(s).
--
Craig Ringer
Hello.
Thanks for the response.
There are several 'idle in transaction' on this server/app, but to a different db/schema.
The "repository" (JCR) schema has only a few 'idle', none 'in transaction' .
By "routine maintenance", do you mean autovacuum, or something else?
Autovacuum does appear to usually get 'auto-canceled' by a lock. However, even when it runs successfully, it doesn't seem to help with this ws_bundle Toast table size.
I am rather looking for a root cause here. Surely this table is not supposed to grow so much (100s of GB). It is even bigger than the data store!
I agree that this may not be an 'error' in Postgres, but somehow it is not playing well with Jackrabbit clustering.
Thanks for the response.
There are several 'idle in transaction' on this server/app, but to a different db/schema.
The "repository" (JCR) schema has only a few 'idle', none 'in transaction' .
By "routine maintenance", do you mean autovacuum, or something else?
Autovacuum does appear to usually get 'auto-canceled' by a lock. However, even when it runs successfully, it doesn't seem to help with this ws_bundle Toast table size.
I am rather looking for a root cause here. Surely this table is not supposed to grow so much (100s of GB). It is even bigger than the data store!
I agree that this may not be an 'error' in Postgres, but somehow it is not playing well with Jackrabbit clustering.
On Mon, Jul 23, 2012 at 5:40 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
I don't really know anything about jackrabbit but generally this problem presents when you have a lot of transactions that are idle. Meaning, you have transactions that are just open, doing nothing. This will present a problem with routine maintenance.
On 07/23/2012 02:13 PM, Gary Webster wrote:Hello. I'm hoping someone has seen this before.
We are trying to use Postgres Plus v9.1.3 as the Persistence Manager in
Jackrabbit (Apache JCR) clustering
(http://wiki.apache.org/jackrabbit/Clustering).
Whenever the JCR is under load, the ws_bundle TOAST table in the
repository schema, grows out of control !
Some of my team members maintain that this problem doesn't occur with
MySQL, but I would rather stay with Postgres if possible...
Under load you can check your process list to see if you have long running transactions that are idle ( idle in transaction ). If you do, you have a code problem not a postgres problem and it is presenting itself through bloat.
Note: IDLE is fine. It is specifically IDLE IN TRANSACTION that is a problem.
Sincerely,
Joshua D. Drake
Thanks.
--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579
Gary Webster <webster@lexmark.com> writes: > By "routine maintenance", do you mean autovacuum, or something else? > Autovacuum does appear to usually get 'auto-canceled' by a lock. That's bad and you should look into the reason why it happens. Ordinary DML (CRUD) operations should not kick autovac off a table. If it's happening, it's probably because something is fooling with the table's schema, which doesn't seem like something you want to have happening during routine operations; especially not on tables that are large enough for this to be an issue in the first place. Or it might be something doing a LOCK TABLE as a substitute for more fine-grained locking; which again is bad for performance reasons that have nothing to do with hobbling autovacuum. regards, tom lane
On 07/24/2012 08:58 AM, Gary Webster wrote: > Hello. > Thanks for the response. > > There are several 'idle in transaction' on this server/app, but to a > different db/schema. This is a cluster issue, not a database issue. So if you have an idnle in transaction, then it is affecting your JCR schema as well. > The "repository" (JCR) schema has only a few 'idle', none 'in transaction' . > > By "routine maintenance", do you mean autovacuum, or something else? I mean autovacuum. > Autovacuum does appear to usually get 'auto-canceled' by a lock. That is a problem too. > However, even when it runs successfully, it doesn't seem to help with > this ws_bundle Toast table size. It won't if you have the above idle in transactions, regardless of database. Sincerely, jD -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579
On Tue, Jul 24, 2012 at 1:41 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
On 07/24/2012 08:58 AM, Gary Webster wrote:Hello.
Thanks for the response.
There are several 'idle in transaction' on this server/app, but to a
different db/schema.
This is a cluster issue, not a database issue. So if you have an idnle in transaction, then it is affecting your JCR schema as well.
OK, how do I track/debug/stop the "idle in transaction"s ?
The "repository" (JCR) schema has only a few 'idle', none 'in transaction' .I mean autovacuum.
By "routine maintenance", do you mean autovacuum, or something else?
I was hoping to find more of a 'root cause' (eg. jackrabbit config) for this issue.
I can't believe that this table is supposed to be getting so big, to even require much vacuuming.
I can't believe that this table is supposed to be getting so big, to even require much vacuuming.
That is a problem too.Autovacuum does appear to usually get 'auto-canceled' by a lock.
OK, I am trying to find out why this is happening.
However, even when it runs successfully, it doesn't seem to help with
this ws_bundle Toast table size.
It won't if you have the above idle in transactions, regardless of database.
Sincerely,
jD
--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579
On 07/25/2012 11:37 AM, Gary Webster wrote: > This is a cluster issue, not a database issue. So if you have an > idnle in transaction, then it is affecting your JCR schema as well. > > OK, how do I track/debug/stop the "idle in transaction"s ? Well idle in transaction is ALWAYS a code issue. You have code that is executing that is starting a transaction, leaving the connection open while not closing (committing/rollingback) the transaction. You could turn on query logging and make sure pid and timestamp is in the log_line_prefix. They you can see what pids are idle in transaction and trace to what the last query was. o you mean autovacuum, or something else? > > > I mean autovacuum. > > I was hoping to find more of a 'root cause' (eg. jackrabbit config) for > this issue. > I can't believe that this table is supposed to be getting so big, to > even require much vacuuming. Any update/delete to that table is going to cause bloat, autovacuum cleans that up. If it can. If it can't, it will just continously grow. Sincerely, jD -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579
On Wed, Jul 25, 2012 at 2:44 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
Well idle in transaction is ALWAYS a code issue. You have code that is executing that is starting a transaction, leaving the connection open while not closing (committing/rollingback) the transaction.
On 07/25/2012 11:37 AM, Gary Webster wrote:This is a cluster issue, not a database issue. So if you have an
idnle in transaction, then it is affecting your JCR schema as well.
OK, how do I track/debug/stop the "idle in transaction"s ?
You could turn on query logging and make sure pid and timestamp is in the log_line_prefix. They you can see what pids are idle in transaction and trace to what the last query was.
OK, I set "log_statement = "all""
The log grew to 1GB in ~minute! It is dominated by this one statement, which occurs every ~1.4 sec:
"update WS_BUNDLE set BUNDLE_DATA = $1 where NODE_ID_HI = $2 and NODE_ID_LO = $3"
parameter $1 is hex, over 6million characters long !! Surely this is the root of my problem.
The log grew to 1GB in ~minute! It is dominated by this one statement, which occurs every ~1.4 sec:
"update WS_BUNDLE set BUNDLE_DATA = $1 where NODE_ID_HI = $2 and NODE_ID_LO = $3"
parameter $1 is hex, over 6million characters long !! Surely this is the root of my problem.
o you mean autovacuum, or something else?Any update/delete to that table is going to cause bloat, autovacuum cleans that up. If it can. If it can't, it will just continously grow.
I mean autovacuum.
I was hoping to find more of a 'root cause' (eg. jackrabbit config) for
this issue.
I can't believe that this table is supposed to be getting so big, to
even require much vacuuming.
Sincerely,
jD
--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579
On Thu, Jul 26, 2012 at 1:31 PM, Gary Webster <webster@lexmark.com> wrote: > OK, I set "log_statement = "all"" > The log grew to 1GB in ~minute! It is dominated by this one statement, > which occurs every ~1.4 sec: > "update WS_BUNDLE set BUNDLE_DATA = $1 where NODE_ID_HI = $2 and NODE_ID_LO > = $3" > parameter $1 is hex, over 6million characters long !! Surely this is the > root of my problem. That's crazy big. Is that a normal payload size for this app?
On 07/26/2012 12:31 PM, Gary Webster wrote: > OK, I set "log_statement = "all"" > The log grew to 1GB in ~minute! It is dominated by this one statement, > which occurs every ~1.4 sec: > "update WS_BUNDLE set BUNDLE_DATA = $1 where NODE_ID_HI = $2 and > NODE_ID_LO = $3" > parameter $1 is hex, over 6million characters long !! Surely this is > the root of my problem. It definitely is. Every time you update, you are creating a dead tuple (unless it is HOT capable). If autovacuum can't come in behind and clean that up due to idle in transaction, you are going to have serious problems. Sincerely, jD -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579