Обсуждение: memory leak regression 9.1 versus 8.1
I'm working on an upgrade of PostgreSQL embedded in a product from version 8.1.x to 9.1.x. One particular PL/pgSQL function is giving us an issue as there seems to be a rather severe regression in memory usage -- a query that finishes in 8.1 causes an out of memory exception on 9.1. Using the same data on the same machine I see memory use stay steady at a reasonably low value on the 8.1 installation but steadily climb on 9.1 (I watched it go over 2 GB and canceled the query -- the production machines are 32 bit) The attached standalone script seems to reproduce the effect. On 8.1 memory usage remains steady and low, on 9.1 I watched it climb past 1.1 GB and canceled the query. I suspect the append node to be the culprit because if I skip the "UNION ALL", i.e. if I use one generate_series with 20 million rows instead of 10 with 2 million each, then I do not see the memory leak. The real function is actually selecting over many inherited tables (i.e. a partitioned table). Thoughts? Thanks, Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support
Вложения
Joe Conway <mail@joeconway.com> writes: > I'm working on an upgrade of PostgreSQL embedded in a product from > version 8.1.x to 9.1.x. One particular PL/pgSQL function is giving us an > issue as there seems to be a rather severe regression in memory usage -- > a query that finishes in 8.1 causes an out of memory exception on 9.1. I see no memory leak at all in this example, either in HEAD or 9.1 branch tip. Perhaps whatever you're seeing is an already-fixed bug? Another likely theory is that you've changed settings from the 8.1 installation. I would expect this example to eat about 10 times work_mem (due to one tuplestore for each generate_series invocation), and that's more or less what I see happening here. A large work_mem could look like a leak, but it isn't. If you need further help in debugging, try launching the postmaster under a fairly restrictive memory ulimit, so that the backend will get a malloc failure before it starts to swap too badly. The memory map it will then print on stderr should point to where the memory is going. regards, tom lane
On 05/09/2012 03:08 PM, Tom Lane wrote: > I see no memory leak at all in this example, either in HEAD or 9.1 > branch tip. Perhaps whatever you're seeing is an already-fixed bug? > > Another likely theory is that you've changed settings from the 8.1 > installation. I would expect this example to eat about 10 times > work_mem (due to one tuplestore for each generate_series invocation), > and that's more or less what I see happening here. A large work_mem > could look like a leak, but it isn't. Good call -- of course that just means my contrived example fails to duplicate the real issue :-( In the real example, even with work_mem = 1 MB I see the same behavior on 9.1. > If you need further help in debugging, try launching the postmaster > under a fairly restrictive memory ulimit, so that the backend will get a > malloc failure before it starts to swap too badly. The memory map it > will then print on stderr should point to where the memory is going. Thanks -- will try that. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support
On 05/09/2012 03:36 PM, Joe Conway wrote: > Good call -- of course that just means my contrived example fails to > duplicate the real issue :-( > In the real example, even with work_mem = 1 MB I see the same behavior > on 9.1. OK, new script. This more faithfully represents the real life scenario, and reproduces the issue on HEAD with out-of-the-box config settings, versus 8.1 which completes the query having never exceeded a very modest memory usage: --------------- On pg 8.1 with out of the box config: VIRT RES SHR 199m 11m 3032 --------------- On pg head with out of the box config: VIRT RES SHR 1671m 1.5g 16m --------------- I have not tried your ulimit suggestion yet but will do that next. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support
Вложения
On 05/09/2012 05:06 PM, Joe Conway wrote: > OK, new script. This more faithfully represents the real life scenario, > and reproduces the issue on HEAD with out-of-the-box config settings, > versus 8.1 which completes the query having never exceeded a very modest > memory usage: > > --------------- > On pg 8.1 with out of the box config: > VIRT RES SHR > 199m 11m 3032 > --------------- > On pg head with out of the box config: > VIRT RES SHR > 1671m 1.5g 16m > --------------- The attached one-liner seems to plug up the majority (although not quite all) of the leakage. do_convert_tuple() is allocating a new tuple for every row in the loop and exec_stmt_return_next() is leaking it. The query now finishes successfully. On pg head with attached patch and out of the box config: VIRT RES SHR 196m 35m 31m This look sane/correct? Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support
Вложения
Joe Conway <mail@joeconway.com> writes: > The attached one-liner seems to plug up the majority (although not quite > all) of the leakage. Looks sane to me. Are you planning to look for the remaining leakage? regards, tom lane
On 05/09/2012 10:01 PM, Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: >> The attached one-liner seems to plug up the majority (although not quite >> all) of the leakage. > > Looks sane to me. Are you planning to look for the remaining leakage? Actually, now I'm not so sure there really are any other leaks. The last test I ran, on 9.1 with the original data and plpgsql function, grew to: VIRT RES SHR 540m 327m 267m but then stabilized there through the end of the query, which successfully returned: count ----------28847766 (1 row) This was with: report_log=# show shared_buffers;shared_buffers ----------------256MB (1 row) report_log=# show work_mem;work_mem ----------16MB (1 row) So I think those memory usage numbers look reasonable. The bug appears to go back through 8.4 -- kind of surprising no one has complained before. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, & 24x7 Support