Discussion: Postgres consuming way too much memory???


Postgres consuming way too much memory???

From:
"jody brownell"
Date:
I have a box with an app and postgresql on it. Hardware is two 2.8 GHz Xeons with 512KB cache, 4 GB of memory, and 6
SCSI disks in a software RAID 5, running Trustix 2.2 with a 2.6.15.3 kernel. The data and indexes are on the RAID array
while the tx log is on the disk with the OS. All is well.

The one application executes one transaction every 60 seconds or so. The transaction can range from tiny
to relatively large: maybe 30-70k inserts and 60-100k updates... nothing too heavy, taking about 8-12 seconds
to finish the entire update in the worst case. The application is using the latest JDBC driver... I am using
PreparedStatements with addBatch/executeBatch/clearBatch to send statements in batches of 10 thousand...
(is that high?)

The box itself is a little oversubscribed for memory, which is causing us to swap a bit... As the
application runs, I notice the postgres process which handles this particular app connection grows in memory seemingly
uncontrollably until kaboom. Once the kernel kills off enough processes and the system settles, I see the postgres
process is at 1.9GB of resident memory and 77MB of shared memory. This challenges a number of assumptions I have made
in the last while and raises a few questions... BTW, I am assuming this is not a memory leak because the same install
of our software on a box with 8GB of memory and no swap being used has no unexplained growth in memory... it is
perfectly healthy and quite performant.

Anyway, due to errors in the transaction, it is rolled back, after which the postgres process remains at 901MB of
resident memory and 91MB of shared memory.

27116 postgres  15   0 1515m 901m  91m S  0.0 22.9  18:33.96 postgres: qradar qradar ::ffff:x.x.x.x(51149) idle

There are a few things I would like to understand.

- What in postgres will grow at an uncontrolled rate when the system is under heavy load or the transaction
  is larger... there must be something not governed by the shared memory or other configuration in postgresql.conf.
  It seems like, once we start hitting swap, postgres grows in memory, resulting in more swapping... until applications
  start getting killed.
- When the transaction was rolled back, why did the process hold onto the 901MB of memory?
- When is a transaction too big? Is this determined by the configuration and performance of wal_buffers and the WAL
  log, or is there housekeeping which MUST be done at commit/rollback to avoid situations like this, thus indicating
  there is an upper bound?

I have been configuring postgres from tidbits I collected reading this list in the last few months...
not sure if what I have is totally right for the workload, but when I have adequate memory and avoid swap, we are
more than happy with performance. Any configuration not shown below is just the default.

shared_buffers = 32767
work_mem = 20480
maintenance_work_mem = 32768
max_fsm_pages = 4024000
max_fsm_relations = 2000
fsync = false
wal_sync_method = fsync
wal_buffers = 4096
checkpoint_segments = 32
checkpoint_timeout = 1200
checkpoint_warning = 60
commit_delay = 5000
commit_siblings = 5
effective_cache_size = 175000
random_page_cost = 2
autovacuum = true
autovacuum_naptime = 60
autovacuum_vacuum_threshold = 500
autovacuum_analyze_threshold = 250
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay=100
autovacuum_vacuum_cost_limit=100
default_statistics_target = 40
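
(For reference, and assuming the default 8 kB block size, the page-based settings above work out to roughly:

shared_buffers       = 32767 pages  * 8 kB ~= 256 MB
wal_buffers          = 4096 pages   * 8 kB  =  32 MB
effective_cache_size = 175000 pages * 8 kB ~= 1.4 GB (a planner hint, not an allocation)
work_mem             = 20480 kB             =  20 MB per sort/hash, per backend
maintenance_work_mem = 32768 kB             =  32 MB

so the memory explicitly governed by this configuration is well short of the 1.9 GB of resident memory observed.)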

Is there anything here which looks weird or misconfigured? I am just starting to play with the bgwriter configuration,
so I did not include it. Typically there is little or no iowait... and no reason to think something is misconfigured...
from what I have seen.

In one transaction I have seen as many as 5 checkpoint_segments be created/used, so I was considering increasing
wal_buffers to 8192 from 4096, given as many as 4 segments in memory/cache at once... need to test this though...

Anyone have any thoughts on what could have caused the bloat?

thanks

Re: Postgres consuming way too much memory???

From:
Tom Lane
Date:
"jody brownell" <jody.brownell@q1labs.com> writes:
> 27116 postgres  15   0 1515m 901m  91m S  0.0 22.9  18:33.96 postgres: qradar qradar ::ffff:x.x.x.x(51149) idle

This looks like a memory leak, but you haven't provided enough info to
let someone else reproduce it.  Can you log what your application is
doing and extract a test case?  What PG version is this, anyway?

            regards, tom lane

Re: Postgres consuming way too much memory???

From:
"jody brownell"
Date:
Sorry about that, I was in a slight panic :)

I am using postgresql 8.1.4. I will install 8.1.3 and see if the same behavior exists... we
may have started seeing this in 8.1.3, but I don't think before. I will check some stability
machines for similar bloating.

The query (calling a stored proc) which is always running when the spiral begins is below. It simply performs
bulk linking of two objects. Depending on what the application is detecting, it could be called to insert
40-50k records, 500 at a time. When the box is healthy, this is a 200-500 ms op, but this starts to become
a 20000+ ms op. I guess this makes sense considering the paging...

Jun 14 12:50:18 xxx postgres[5649]: [3-1] LOG:  duration: 20117.984 ms  statement: EXECUTE <unnamed>  [PREPARE:  select * from link_attacker_targets($1, $2, $3)  as

CREATE OR REPLACE FUNCTION link_attacker_targets (p_attacker bigint, p_targets varchar, p_targets_size integer)
    returns bigint[] as
$body$
DECLARE
    v_targets      bigint[];
    v_target       bigint;
    v_returns      bigint[];
    v_returns_size integer := 0;
BEGIN
    v_targets := convert_string2bigint_array (p_targets, p_targets_size);

    FOR i IN 1..p_targets_size LOOP
        v_target := v_targets[i];

        BEGIN
            INSERT INTO attacker_target_link (attacker_id, target_id) VALUES (p_attacker, v_target);
            v_returns_size := v_returns_size + 1;
            v_returns[v_returns_size] := v_target;

        EXCEPTION WHEN unique_violation THEN
            -- do nothing... app cache may be out of date.
        END;
    END LOOP;
    RETURN v_returns;
END;
$body$
LANGUAGE plpgsql VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;
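
(For illustration, a call to this function might look like the following from psql; the attacker ID and target list
here are made up:

select * from link_attacker_targets(42, '101,102,103', 3);

which inserts the attacker/target pairs and returns the array of targets that were actually linked.)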


Re: Postgres consuming way too much memory???

From:
"jody brownell"
Date:
The last version of postgres we had in production was 8.1.1 actually, not 8.1.3.

So far, on my stability box and older production stability boxes I don't see the same behavior.

I will install 8.1.1 on these boxes and see what I see.


Re: Postgres consuming way too much memory???

From:
"jody brownell"
Date:
Some more information...

When postgresql starts to go into this bloating state, I can only make it happen from my java app.
If I simultaneously perform an insert of 10 million rows into another table, that behaves as expected, but
the postgresql process handling the java connection slows down and bloats.

This leads me to think it has something to do with either the long-lived connection (I am using the dbcp
JDBC pool from jakarta) or something I am triggering with the link routine I sent earlier.

I am going to try closing the connection after each TX to see if this resolves it for now. If not, I will write
a java app and stored procedure (table, etc.) to reproduce it without our application.

Oh yeah: it is when about half of my swap is used that dstat starts reporting heavy paging, and memory climbs very quickly.




Re: Postgres consuming way too much memory???

From:
Tom Lane
Date:
"jody brownell" <jody.brownell@q1labs.com> writes:
> When postgresql starts to go into this bloating state, I can only make it happen from my java app.

That's interesting.  The JDBC driver uses protocol features that aren't
used by psql, so it's possible that the leak is triggered by one of
those features.  I wouldn't worry too much about duplicating the problem
from psql anyway --- a Java test case will do fine.

> I am going to try closing the connection after each TX to see if this
> resolves it for now. If not, I will write a java app, stored procedure
> (table etc) reproduce it without our application.

Even if that works around it for you, please pursue getting a test case
together so we can find and fix the underlying problem.

            regards, tom lane

Re: Postgres consuming way too much memory???

From:
Tom Lane
Date:
"jody brownell" <jody.brownell@Q1Labs.com> writes:
>     BEGIN
>           INSERT into attacker_target_link (attacker_id, target_id) values (p_attacker, v_target);
>           v_returns_size := v_returns_size + 1;
>           v_returns[v_returns_size] := v_target;

>     EXCEPTION WHEN unique_violation THEN
>         -- do nothing... app cache may be out of date.
>     END;

Hmm.  There is a known problem that plpgsql leaks some memory when
catching an exception:
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00885.php

So if your problem case involves a whole lot of duplicates then that
could explain the initial bloat.  However, AFAIK that leakage is in
a transaction-local memory context, so the space ought to be freed at
transaction end.  And Linux's malloc does know about giving space back
to the kernel (unlike some platforms).  So I'm not sure why you're
seeing persistent bloat.

Can you rewrite the function to not use an EXCEPTION block (perhaps
a separate SELECT probe for each row --- note this won't be reliable
if there are concurrent processes making insertions)?  If so, does
that fix the problem?
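
(A sketch of such a rewrite of the loop body, replacing the EXCEPTION block with a SELECT probe; per the caveat
above, this is not reliable if concurrent processes are inserting:

FOR i IN 1..p_targets_size LOOP
    v_target := v_targets[i];
    -- probe for an existing row instead of catching unique_violation
    PERFORM 1 FROM attacker_target_link
        WHERE attacker_id = p_attacker AND target_id = v_target;
    IF NOT FOUND THEN
        INSERT INTO attacker_target_link (attacker_id, target_id)
            VALUES (p_attacker, v_target);
        v_returns_size := v_returns_size + 1;
        v_returns[v_returns_size] := v_target;
    END IF;
END LOOP;
)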

            regards, tom lane

Re: Postgres consuming way too much memory???

From:
"jody brownell"
Date:
Tom - that makes sense... and fits the timeline of when the instability may have been introduced.

I use soft references in java to track these relationships. When the GC needs memory it will collect
objects referenced by soft references, so I need to have this exception caught where my caches may get cleaned.

When the system is under load, as it would be in this case, these references would be cleaned, causing a large
number of exceptions in the plpgsql and subsequently causing the leak... hence the swift downward spiral.

The previous version of these routines used selects, but due to the volume of selects, performance suffered quite
a bit. I don't think I could revert now for production use... closing the connection may be the workaround for
us for this release IF this is in fact what the problem is. Unfortunately, I use the catch in about 20 similar
routines to reset sequences etc.... this may be painful :(

I will modify the routine to help isolate the problem. Stay tuned.

BTW - the fix you mentioned .... is that targeted for 8.2? Is there a timeline for 8.2?


Re: Postgres consuming way too much memory???

From:
Tom Lane
Date:
"jody brownell" <jody.brownell@Q1Labs.com> writes:
> BTW - the fix you mentioned .... is that targeted for 8.2? Is there a timeline for 8.2?

There is no fix as yet, but it's on the radar screen to fix for 8.2.

We expect 8.2 will go beta towards the end of summer (I forget whether
Aug 1 or Sep 1 is the target).

            regards, tom lane

Re: Postgres consuming way too much memory???

From:
Mark Lewis
Date:

Just to mention another possible culprit; this one doesn't seem all that
likely to me, but at least it's easy to investigate.

With DBCP and non-ancient versions of the JDBC driver that use v3
protocol and real prepared statements, it is possible to (mis)configure
the system to create an unbounded number of cached prepared statements
on any particular connection.  Older versions of DBCP were also known to
have bugs which aggravated this issue when prepared statement caching
was enabled, IIRC.
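
(If DBCP's statement pooling is suspected, the knobs to check on commons-dbcp's BasicDataSource are along these
lines; exact names should be verified against the DBCP version in use:

poolPreparedStatements    = true
maxOpenPreparedStatements = 100     -- a negative value means no limit

Disabling pooling, or putting a bound on maxOpenPreparedStatements, should rule this culprit in or out quickly.)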

-- Mark Lewis

Re: Postgres consuming way too much memory???

From:
Bruce Momjian
Date:
Added to TODO:

>               o Fix memory leak from exceptions
>
>                http://archives.postgresql.org/pgsql-performance/2006-06/msg0$



--
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +