Обсуждение: BUG #3418: Memory leak with sql EXCEPTION?
The following bug has been logged online:
Bug reference:      3418
Logged by:          Viatcheslav
Email address:      vka@ipcb.net
PostgreSQL version: 8.2.4
Operating system:   RedHat Linux kernel 2.6.9-34.EL
Description:        Memory leak with sql EXCEPTION?
Details:
On our system we came across the following issue:
/---------------------------------------------------/
create table dummy(
    id integer primary key,
    value varchar(10)
);
CREATE OR REPLACE FUNCTION "public"."test" (fi integer) RETURNS
"pg_catalog"."void" AS
$body$
declare
 vi integer;
 idx integer := 0;
begin
while idx < fi loop
    idx := idx + 1;
    begin
        insert into dummy values (idx, idx::varchar);
    exception
    when others then
       raise exception '% %', idx, 'stop';
    end;
end loop;
end;
$body$
LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY DEFINER;
select test(10000000);
/---------------------------------------------------/
We've been watching CPU and memory usage during the execution of the
'select' with and without exception trap of 'insert'. While in the second
case memory usage kept being relatively low and stayed constant at the level
of approx 2.9Mb across the execution time, in the first case it gradually
raised with step of approx 6Mb up to somewhat 140Mb in the end.
With more complicated PL/PgSQL code (with the same structure: begin -
exception - end block in the main function, some complex function (without
exceptions inside them) doing actual job were called from there) we've run
out of virtual memory (> 1Gb) already with 30000 cycles loop and the process
got killed by operating system. Again, everything worked just fine with
negligible memory consumption without exceptions.
We are aware of additional costs of exceptions yet this one appears to be a
memory leak.
My apologies for poor-styled report in advance for this is my first one.
			
		"Viatcheslav" <vka@ipcb.net> writes:
> We are aware of additional costs of exceptions yet this one appears to be a
> memory leak.
AFAICT, the only memory consumed per-iteration by your example is the
subtransaction XID that's added to the list of XIDs for eventual commit.
It's not really possible to dispense with storing it.
            regards, tom lane
			
		Viatcheslav Kalinin <vka@ipcb.net> writes:
> I've made two cases actually, simple one that I already mentioned about
> in my previous mails and more complex one that I tried to make somewhat
> close to our work case where I first encountered the problem in question.
As I said before, the only memory growth I see in the first one is the
list of subcommitted child XIDs, which we really can't get rid of.
The second one is having problems because of AFTER triggers fired for
the foreign key constraints.  That we can improve, as per the attached
patch.
            regards, tom lane
Index: trigger.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/trigger.c,v
retrieving revision 1.210.2.1
diff -c -r1.210.2.1 trigger.c
*** trigger.c    25 Jan 2007 04:17:56 -0000    1.210.2.1
--- trigger.c    1 Jul 2007 17:26:52 -0000
***************
*** 2766,2771 ****
--- 2766,2789 ----
          afterTriggers->state_stack[my_level] = NULL;
          Assert(afterTriggers->query_depth ==
                 afterTriggers->depth_stack[my_level]);
+         /*
+          * It's entirely possible that the subxact created an event_cxt but
+          * there is not anything left in it (because all the triggers were
+          * fired at end-of-statement).  If so, we should release the context
+          * to prevent memory leakage in a long sequence of subtransactions.
+          * We can detect whether there's anything of use in the context by
+          * seeing if anything was added to the global events list since
+          * subxact start.  (This test doesn't catch every case where the
+          * context is deletable; for instance maybe the only additions were
+          * from a sub-sub-xact.  But it handles the common case.)
+          */
+         if (afterTriggers->cxt_stack[my_level] &&
+             afterTriggers->events.tail == afterTriggers->events_stack[my_level].tail)
+         {
+             MemoryContextDelete(afterTriggers->cxt_stack[my_level]);
+             /* avoid double delete if abort later */
+             afterTriggers->cxt_stack[my_level] = NULL;
+         }
      }
      else
      {
			
		Hello, Tom
As of you last mail:
 > Well, if you think there's some other memory leak then you need to
 > submit a test case that demonstrates it.
Not to be going to make any assumptions on what might be malfunctioning
internally, I populated a test case that at least on our system
demonstrates behavior that imo couldn't take place if per-iteration
overhead was only a record in XIDs list.
I've made two cases actually, simple one that I already mentioned about
in my previous mails and more complex one that I tried to make somewhat
close to our work case where I first encountered the problem in question.
Simple:
/---------------------------------------------------/
create table dummy(
    id integer primary key,
    value varchar(10)
);
CREATE OR REPLACE FUNCTION "public"."test_smpl" (fi integer) RETURNS
"pg_catalog"."void" AS
$body$
declare
 vi integer;
 idx integer := 0;
begin
while idx < fi loop
    idx := idx + 1;
    begin
        insert into dummy values (idx, idx::varchar);
    exception
    when others then
       raise exception '% %', idx, 'stop';
    end;
end loop;
end;
$body$
LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY DEFINER;
/---------------------------------------------------/
Complex:
/---------------------------------------------------/
create table ref1 (
    id       integer primary key,
    value    varchar(10)
);
insert into ref1 values (1, 'ref#1');
create table cplx1 (
    id       integer primary key,
    fid      integer references ref1 (id),
    value1   smallint not null,
    value2   varchar(100) unique,
    value3   varchar(100)
);
create table cplx2 (
    fid      integer references cplx1 (id),
    value1   varchar(50),
    value2   varchar(50),
    value3   varchar(50),
    value4   smallint,
    value5   real
);
CREATE OR REPLACE FUNCTION "public"."test_cplx" (fi integer) RETURNS
"pg_catalog"."void" AS
$body$
declare
    idx         integer := 0;
    viid        integer;
    vivalue1    smallint;
    vsvalue2    varchar;
    vsvalue3    varchar;
    vsvalue2_1  varchar;
    vsvalue2_2  varchar;
    vsvalue2_3  varchar;
    vivalue2_4  smallint;
    vxvalue2_5  real;
begin
while idx < fi loop
    idx := idx + 1;
    viid := nextval('autoinc');
    vivalue1 := idx % 32000;
    vsvalue2 := 'val' || trunc(random() * 10000000);
    vsvalue3 := 'aaa' || idx;
    vsvalue2_1  := 'bbb' || idx;
    vsvalue2_2  := 'ccc' || idx;
    vsvalue2_3  := 'ddd' || idx;
    vivalue2_4  := trunc(random() * 2);
    vxvalue2_5  := random();
    begin
    perform internal_insert(viid, vivalue1, vsvalue2, vsvalue3,
                      vsvalue2_1, vsvalue2_2, vsvalue2_3,
                      vivalue2_4, vxvalue2_5);
    exception
    when others then
       raise notice '% %', idx, SQLERRM;
    end;
end loop;
end;
$body$
LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY DEFINER;
CREATE OR REPLACE FUNCTION "public"."internal_insert" (piid integer,
pivalue1 smallint, psvalue2 varchar, psvalue3 varchar, psvalue2_1
varchar, psvalue2_2 varchar, psvalue2_3 varchar, pivalue2_4 smallint,
pxvalue2_5 real) RETURNS "pg_catalog"."void" AS
$body$
begin
    perform 1
       from cplx1
      where value2 = psvalue2;
    if found then
        raise exception 'not unique';
    end if;
    insert into cplx1
    values (piid, 1, pivalue1, psvalue2, psvalue3);
    if psvalue2_1 is not null or
       psvalue2_2 is not null or
       psvalue2_3 is not null or
       pivalue2_4 is not null or
       pxvalue2_5 is not null
    then
        insert into cplx2
        values (piid, psvalue2_1, psvalue2_2, psvalue2_3, pivalue2_4,
pxvalue2_5);
    end if;
end;
$body$
LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY DEFINER;
/---------------------------------------------------/
Couple of notes on the second one: main function fills in 2 linked
indexed tables with certain data of various types, work prototype of the
function "test_cplx" is used as an interface to some applications (such
as CGI scripts) thus does some internal parameters check demonstrated
here as 'perform' block.
Our box is Linux 2.6.9-34 running on P4 1.8GHz.
Starting from new connection:
# ps -eo %cpu,cputime,vsize,size,rss,pid,cmd | grep postgres
 0.0 00:00:00 141524 2620 4912 21861 postgres: postgres <...> idle
------------------------------------------------------------------
I do 'select test_smpl(100000);' and near the end of the transaction get
the following report:
 11.9 00:00:05 143988 4968 13712 21861 postgres: postgres <...> SELECT
----------------------------------------------------------------------
When I do 'select test_cplx(100000)' I get:
 99.6 00:01:01 971552 832508 858012 21812 postgres: postgres <...> SELECT
-------------------------------------------------------------------------
I should also note here that it seems that the allocated memory is not
freed after th transaction ends, so if I start next select memory
consumption continues to grow.
Having commented out 'begin' and 'exception ... when ... end' statements
in test_cplx (and returning immediately from internal_insert if value2
is not unique) I get:
 39.6 00:00:39 142420 3400 27824 21877 postgres: postgres <...> SELECT
----------------------------------------------------------------------
This behavior seems odd to me.
Sincerely, Viatcheslav