Обсуждение: db size growing out of control when using clustered Jackrabbit

Поиск
Список
Период
Сортировка

db size growing out of control when using clustered Jackrabbit

От
Gary Webster
Дата:
Hello. I'm hoping someone has seen this before.

We are trying to use Postgres Plus v9.1.3 as the Persistence Manager in Jackrabbit (Apache JCR) clustering (http://wiki.apache.org/jackrabbit/Clustering).
Whenever the JCR is under load, the ws_bundle TOAST table in the  repository  schema, grows out of control !

Some of my team members maintain that this problem doesn't occur with MySQL, but I would rather stay with Postgres if possible...

Thanks.

Re: db size growing out of control when using clustered Jackrabbit

От
"Joshua D. Drake"
Дата:
On 07/23/2012 02:13 PM, Gary Webster wrote:
> Hello. I'm hoping someone has seen this before.
>
> We are trying to use Postgres Plus v9.1.3 as the Persistence Manager in
> Jackrabbit (Apache JCR) clustering
> (http://wiki.apache.org/jackrabbit/Clustering).
> Whenever the JCR is under load, the ws_bundle TOAST table in the
> repository  schema, grows out of control !
>
> Some of my team members maintain that this problem doesn't occur with
> MySQL, but I would rather stay with Postgres if possible...

I don't really know anything about jackrabbit but generally this problem
presents when you have a lot of transactions that are idle. Meaning, you
have transactions that are just open, doing nothing. This will present a
problem with routine maintenance.

Under load you can check your process list to see if you have long
running transactions that are idle (  idle in transaction ). If you do,
you have a code problem not a postgres problem and it is presenting
itself through bloat.

Note: IDLE is fine. It is specifically IDLE IN TRANSACTION that is a
problem.


Sincerely,

Joshua D. Drake


>
> Thanks.
>


--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579

Re: db size growing out of control when using clustered Jackrabbit

От
Craig Ringer
Дата:
On 07/24/2012 05:13 AM, Gary Webster wrote:
Hello. I'm hoping someone has seen this before.

We are trying to use Postgres Plus v9.1.3 as the Persistence Manager in Jackrabbit (Apache JCR) clustering (http://wiki.apache.org/jackrabbit/Clustering).
Whenever the JCR is under load, the ws_bundle TOAST table in the  repository  schema, grows out of control !

Has somebody disabled autovacuum or set it to barely run at all?

Try setting autovacuum to very aggressively vacuum the problem table(s).

--
Craig Ringer

Re: db size growing out of control when using clustered Jackrabbit

От
Gary Webster
Дата:
Hello.
Thanks for the response.

Autovacuum is set VERY aggressive.
However, it does not help with the ws_bundle Toast table.

A manual _full_ vacuum (not recommended?) does do the deed.
However, it often gives this error:
ERROR:  missing chunk number 0 for toast value 639113 in pg_toast_533386

BTW, how bad is that?  Corruption?


On Mon, Jul 23, 2012 at 8:35 PM, Craig Ringer <ringerc@ringerc.id.au> wrote:
On 07/24/2012 05:13 AM, Gary Webster wrote:
Hello. I'm hoping someone has seen this before.

We are trying to use Postgres Plus v9.1.3 as the Persistence Manager in Jackrabbit (Apache JCR) clustering (http://wiki.apache.org/jackrabbit/Clustering).
Whenever the JCR is under load, the ws_bundle TOAST table in the  repository  schema, grows out of control !

Has somebody disabled autovacuum or set it to barely run at all?

Try setting autovacuum to very aggressively vacuum the problem table(s).

--
Craig Ringer

Re: db size growing out of control when using clustered Jackrabbit

От
Gary Webster
Дата:
Hello.
Thanks for the response.

There are several 'idle in transaction' on this server/app, but to a different db/schema.
The "repository" (JCR) schema has only a few 'idle', none 'in transaction' .

By "routine maintenance", do you mean autovacuum, or something else?
Autovacuum does appear to usually get 'auto-canceled' by a lock.  However, even when it runs successfully, it doesn't seem to help with this ws_bundle Toast table size.
I am rather looking for a root cause here.  Surely this table is not supposed to grow so much (100s of GB).  It is even bigger than the data store!

I agree that this may not be an 'error' in Postgres, but somehow it is not playing well with Jackrabbit clustering.


On Mon, Jul 23, 2012 at 5:40 PM, Joshua D. Drake <jd@commandprompt.com> wrote:

On 07/23/2012 02:13 PM, Gary Webster wrote:
Hello. I'm hoping someone has seen this before.

We are trying to use Postgres Plus v9.1.3 as the Persistence Manager in
Jackrabbit (Apache JCR) clustering
(http://wiki.apache.org/jackrabbit/Clustering).
Whenever the JCR is under load, the ws_bundle TOAST table in the
repository  schema, grows out of control !

Some of my team members maintain that this problem doesn't occur with
MySQL, but I would rather stay with Postgres if possible...

I don't really know anything about jackrabbit but generally this problem presents when you have a lot of transactions that are idle. Meaning, you have transactions that are just open, doing nothing. This will present a problem with routine maintenance.

Under load you can check your process list to see if you have long running transactions that are idle (  idle in transaction ). If you do, you have a code problem not a postgres problem and it is presenting itself through bloat.

Note: IDLE is fine. It is specifically IDLE IN TRANSACTION that is a problem.


Sincerely,

Joshua D. Drake



Thanks.



--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579

Re: db size growing out of control when using clustered Jackrabbit

От
Tom Lane
Дата:
Gary Webster <webster@lexmark.com> writes:
> By "routine maintenance", do you mean autovacuum, or something else?
> Autovacuum does appear to usually get 'auto-canceled' by a lock.

That's bad and you should look into the reason why it happens.  Ordinary
DML (CRUD) operations should not kick autovac off a table.  If it's
happening, it's probably because something is fooling with the table's
schema, which doesn't seem like something you want to have happening
during routine operations; especially not on tables that are large
enough for this to be an issue in the first place.  Or it might be
something doing a LOCK TABLE as a substitute for more fine-grained
locking; which again is bad for performance reasons that have nothing
to do with hobbling autovacuum.

            regards, tom lane

Re: db size growing out of control when using clustered Jackrabbit

От
"Joshua D. Drake"
Дата:
On 07/24/2012 08:58 AM, Gary Webster wrote:
> Hello.
> Thanks for the response.
>
> There are several 'idle in transaction' on this server/app, but to a
> different db/schema.

This is a cluster issue, not a database issue. So if you have an idnle
in transaction, then it is affecting your JCR schema as well.

> The "repository" (JCR) schema has only a few 'idle', none 'in transaction' .

>
> By "routine maintenance", do you mean autovacuum, or something else?

I mean autovacuum.


> Autovacuum does appear to usually get 'auto-canceled' by a lock.

That is a problem too.

> However, even when it runs successfully, it doesn't seem to help with
> this ws_bundle Toast table size.

It won't if you have the above idle in transactions, regardless of database.

Sincerely,

jD


--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579

Re: db size growing out of control when using clustered Jackrabbit

От
Gary Webster
Дата:

On Tue, Jul 24, 2012 at 1:41 PM, Joshua D. Drake <jd@commandprompt.com> wrote:

On 07/24/2012 08:58 AM, Gary Webster wrote:
Hello.
Thanks for the response.

There are several 'idle in transaction' on this server/app, but to a
different db/schema.

This is a cluster issue, not a database issue. So if you have an idnle in transaction, then it is affecting your JCR schema as well.
 
OK, how do I track/debug/stop the "idle in transaction"s ?
 
The "repository" (JCR) schema has only a few 'idle', none 'in transaction' .


By "routine maintenance", do you mean autovacuum, or something else?

I mean autovacuum.

I was hoping to find more of a 'root cause' (eg. jackrabbit config) for this issue.
I can't believe that this table is supposed to be getting so big, to even require much vacuuming.


Autovacuum does appear to usually get 'auto-canceled' by a lock.

That is a problem too.

OK, I am trying to find out why this is happening.
 
However, even when it runs successfully, it doesn't seem to help with
this ws_bundle Toast table size.

It won't if you have the above idle in transactions, regardless of database.

Sincerely,

jD


--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579

Re: db size growing out of control when using clustered Jackrabbit

От
"Joshua D. Drake"
Дата:
On 07/25/2012 11:37 AM, Gary Webster wrote:

>     This is a cluster issue, not a database issue. So if you have an
>     idnle in transaction, then it is affecting your JCR schema as well.
>
> OK, how do I track/debug/stop the "idle in transaction"s ?

Well idle in transaction is ALWAYS a code issue. You have code that is
executing that is starting a transaction, leaving the connection open
while not closing (committing/rollingback) the transaction.

You could turn on query logging and make sure pid and timestamp is in
the log_line_prefix. They you can see what pids are idle in transaction
and trace to what the last query was.
o you mean autovacuum, or something else?
>
>
>     I mean autovacuum.
>
> I was hoping to find more of a 'root cause' (eg. jackrabbit config) for
> this issue.
> I can't believe that this table is supposed to be getting so big, to
> even require much vacuuming.

Any update/delete to that table is going to cause bloat, autovacuum
cleans that up. If it can. If it can't, it will just continously grow.


Sincerely,

jD


--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579

Re: db size growing out of control when using clustered Jackrabbit

От
Gary Webster
Дата:

On Wed, Jul 25, 2012 at 2:44 PM, Joshua D. Drake <jd@commandprompt.com> wrote:

On 07/25/2012 11:37 AM, Gary Webster wrote:

    This is a cluster issue, not a database issue. So if you have an
    idnle in transaction, then it is affecting your JCR schema as well.

OK, how do I track/debug/stop the "idle in transaction"s ?

Well idle in transaction is ALWAYS a code issue. You have code that is executing that is starting a transaction, leaving the connection open while not closing (committing/rollingback) the transaction.

You could turn on query logging and make sure pid and timestamp is in the log_line_prefix. They you can see what pids are idle in transaction and trace to what the last query was.

OK, I set "log_statement = "all"" 
The log grew to 1GB in ~minute!  It is dominated by this one statement, which occurs every ~1.4 sec:
"update WS_BUNDLE set BUNDLE_DATA = $1 where NODE_ID_HI = $2 and NODE_ID_LO = $3"
parameter $1 is hex, over 6million characters long !!   Surely this is the root of my problem.

o you mean autovacuum, or something else?


    I mean autovacuum.

I was hoping to find more of a 'root cause' (eg. jackrabbit config) for
this issue.
I can't believe that this table is supposed to be getting so big, to
even require much vacuuming.

Any update/delete to that table is going to cause bloat, autovacuum cleans that up. If it can. If it can't, it will just continously grow.



Sincerely,

jD


--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579

Re: db size growing out of control when using clustered Jackrabbit

От
Scott Marlowe
Дата:
On Thu, Jul 26, 2012 at 1:31 PM, Gary Webster <webster@lexmark.com> wrote:
> OK, I set "log_statement = "all""
> The log grew to 1GB in ~minute!  It is dominated by this one statement,
> which occurs every ~1.4 sec:
> "update WS_BUNDLE set BUNDLE_DATA = $1 where NODE_ID_HI = $2 and NODE_ID_LO
> = $3"
> parameter $1 is hex, over 6million characters long !!   Surely this is the
> root of my problem.

That's crazy big.  Is that a normal payload size for this app?

Re: db size growing out of control when using clustered Jackrabbit

От
"Joshua D. Drake"
Дата:
On 07/26/2012 12:31 PM, Gary Webster wrote:

> OK, I set "log_statement = "all""
> The log grew to 1GB in ~minute!  It is dominated by this one statement,
> which occurs every ~1.4 sec:
> "update WS_BUNDLE set BUNDLE_DATA = $1 where NODE_ID_HI = $2 and
> NODE_ID_LO = $3"
> parameter $1 is hex, over 6million characters long !!   Surely this is
> the root of my problem.

It definitely is. Every time you update, you are creating a dead tuple
(unless it is HOT capable). If autovacuum can't come in behind and clean
that up due to idle in transaction, you are going to have serious problems.

Sincerely,

jD



--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579