Обсуждение: memory leak regression 9.1 versus 8.1

Поиск
Список
Период
Сортировка

memory leak regression 9.1 versus 8.1

От
Joe Conway
Дата:
I'm working on an upgrade of PostgreSQL embedded in a product from
version 8.1.x to 9.1.x. One particular PL/pgSQL function is giving us an
issue as there seems to be a rather severe regression in memory usage --
a query that finishes in 8.1 causes an out of memory exception on 9.1.

Using the same data on the same machine I see memory use stay steady at
a reasonably low value on the 8.1 installation but steadily climb on 9.1
(I watched it go over 2 GB and canceled the query -- the production
machines are 32 bit)

The attached standalone script seems to reproduce the effect. On 8.1
memory usage remains steady and low, on 9.1 I watched it climb past 1.1
GB and canceled the query.

I suspect the append node to be the culprit because if I skip the "UNION
ALL", i.e. if I use one generate_series with 20 million rows instead of
10 with 2 million each, then I do not see the memory leak. The real
function is actually selecting over many inherited tables (i.e. a
partitioned table).

Thoughts?

Thanks,

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support

Вложения

Re: memory leak regression 9.1 versus 8.1

От
Tom Lane
Дата:
Joe Conway <mail@joeconway.com> writes:
> I'm working on an upgrade of PostgreSQL embedded in a product from
> version 8.1.x to 9.1.x. One particular PL/pgSQL function is giving us an
> issue as there seems to be a rather severe regression in memory usage --
> a query that finishes in 8.1 causes an out of memory exception on 9.1.

I see no memory leak at all in this example, either in HEAD or 9.1
branch tip.  Perhaps whatever you're seeing is an already-fixed bug?

Another likely theory is that you've changed settings from the 8.1
installation.  I would expect this example to eat about 10 times
work_mem (due to one tuplestore for each generate_series invocation),
and that's more or less what I see happening here.  A large work_mem
could look like a leak, but it isn't.

If you need further help in debugging, try launching the postmaster
under a fairly restrictive memory ulimit, so that the backend will get a
malloc failure before it starts to swap too badly.  The memory map it
will then print on stderr should point to where the memory is going.
        regards, tom lane


Re: memory leak regression 9.1 versus 8.1

От
Joe Conway
Дата:
On 05/09/2012 03:08 PM, Tom Lane wrote:
> I see no memory leak at all in this example, either in HEAD or 9.1
> branch tip.  Perhaps whatever you're seeing is an already-fixed bug?
> 
> Another likely theory is that you've changed settings from the 8.1
> installation.  I would expect this example to eat about 10 times
> work_mem (due to one tuplestore for each generate_series invocation),
> and that's more or less what I see happening here.  A large work_mem
> could look like a leak, but it isn't.

Good call -- of course that just means my contrived example fails to
duplicate the real issue :-(
In the real example, even with work_mem = 1 MB I see the same behavior
on 9.1.

> If you need further help in debugging, try launching the postmaster
> under a fairly restrictive memory ulimit, so that the backend will get a
> malloc failure before it starts to swap too badly.  The memory map it
> will then print on stderr should point to where the memory is going.

Thanks -- will try that.

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


Re: memory leak regression 9.1 versus 8.1

От
Joe Conway
Дата:
On 05/09/2012 03:36 PM, Joe Conway wrote:
> Good call -- of course that just means my contrived example fails to
> duplicate the real issue :-(
> In the real example, even with work_mem = 1 MB I see the same behavior
> on 9.1.

OK, new script. This more faithfully represents the real life scenario,
and reproduces the issue on HEAD with out-of-the-box config settings,
versus 8.1 which completes the query having never exceeded a very modest
memory usage:

---------------
On pg 8.1 with out of the box config:
VIRT  RES  SHR
199m  11m 3032
---------------
On pg head with out of the box config:
VIRT  RES  SHR
1671m 1.5g  16m
---------------

I have not tried your ulimit suggestion yet but will do that next.

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support

Вложения

Re: memory leak regression 9.1 versus 8.1

От
Joe Conway
Дата:
On 05/09/2012 05:06 PM, Joe Conway wrote:
> OK, new script. This more faithfully represents the real life scenario,
> and reproduces the issue on HEAD with out-of-the-box config settings,
> versus 8.1 which completes the query having never exceeded a very modest
> memory usage:
>
> ---------------
> On pg 8.1 with out of the box config:
> VIRT  RES  SHR
> 199m  11m 3032
> ---------------
> On pg head with out of the box config:
> VIRT  RES  SHR
> 1671m 1.5g  16m
> ---------------

The attached one-liner seems to plug up the majority (although not quite
all) of the leakage.

do_convert_tuple() is allocating a new tuple for every row in the loop
and exec_stmt_return_next() is leaking it.

The query now finishes successfully. On pg head with attached patch and
out of the box config:
VIRT  RES  SHR
196m  35m  31m

This look sane/correct?

Joe

--
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support

Вложения

Re: memory leak regression 9.1 versus 8.1

От
Tom Lane
Дата:
Joe Conway <mail@joeconway.com> writes:
> The attached one-liner seems to plug up the majority (although not quite
> all) of the leakage.

Looks sane to me.  Are you planning to look for the remaining leakage?
        regards, tom lane


Re: memory leak regression 9.1 versus 8.1

От
Joe Conway
Дата:
On 05/09/2012 10:01 PM, Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> The attached one-liner seems to plug up the majority (although not quite
>> all) of the leakage.
> 
> Looks sane to me.  Are you planning to look for the remaining leakage?

Actually, now I'm not so sure there really are any other leaks. The last
test I ran, on 9.1 with the original data and plpgsql function, grew to:
 VIRT  RES  SHR 540m 327m 267m

but then stabilized there through the end of the query, which
successfully returned:
 count
----------28847766
(1 row)

This was with:

report_log=# show shared_buffers;shared_buffers
----------------256MB
(1 row)

report_log=# show work_mem;work_mem
----------16MB
(1 row)

So I think those memory usage numbers look reasonable.

The bug appears to go back through 8.4 -- kind of surprising no one has
complained before.

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support