Discussion: Memory for BYTEA returned by C function is not released until connection is dropped

Memory for BYTEA returned by C function is not released until connection is dropped

From
John Leiseboer
Date:
I have written a number of functions in C that return the BYTEA type. I have compiled and run them on both Windows and
Linux, 32-bit and 64-bit, with PostgreSQL versions 9.3 and 9.4.

My functions return BYTEA data to the caller. The problem is that memory usage grows until there is no memory left on
the host, at which point an error is returned. If I drop the connection (e.g. by quitting from psql), the memory is
returned.

I wrote the following minimal function to test palloc() and BYTEA return behaviour, and found that this minimal program
also exhibits the unbounded memory growth problem.


C source code:

PG_FUNCTION_INFO_V1(test_palloc);

Datum
test_palloc(PG_FUNCTION_ARGS)
{
    bytea *test_ret;
    int test_len = 1024;

    /* Allocate header plus 1024 payload bytes (contents left uninitialized). */
    test_ret = (bytea *) palloc(test_len + VARHDRSZ);
    SET_VARSIZE(test_ret, test_len + VARHDRSZ);
    PG_RETURN_BYTEA_P(test_ret);
}

Function definition:

CREATE OR REPLACE FUNCTION test_palloc() RETURNS BYTEA AS E'<path to shared library>', 'test_palloc' LANGUAGE C
IMMUTABLE STRICT;

psql commands to reproduce the problem:

\o out.txt
SELECT ids.*, test_palloc() FROM GENERATE_SERIES(1, 1000000) ids;

At the completion of the above command, host memory will have been consumed but not released back to the system. After
quitting psql (\q), memory is released.

Is this expected behaviour or a bug? Am I doing something wrong? How can I return a BYTEA type from a C library
function without having to drop the connection in order to recover the allocated memory that is returned to the caller?

Regards,
John



Re: Memory for BYTEA returned by C function is not released until connection is dropped

From
Tom Lane
Date:
John Leiseboer <jleiseboer@bigpond.com> writes:
> I have written a number of functions in C that return BYTEA type. I have compiled and run on both Windows and Linux,
> 32-bit and 64-bit, PostgreSQL versions 9.3 and 9.4.
> My functions return BYTEA data to the caller. The problem is that memory usage grows until there is no memory left on
> the host, at which point an error is returned. If I drop the connection (e.g. by quitting from psql), the memory is
> returned.

> I wrote the following minimal function to test palloc() and BYTEA return behaviour, and found that this minimal
> program also exhibits the unbounded memory growth problem.

> C source code:

> PG_FUNCTION_INFO_V1(test_palloc);
> Datum test_palloc()
> {
>     bytea *test_ret;
>     int test_len = 1024;

>     test_ret = (bytea *)palloc(test_len + VARHDRSZ);
>     SET_VARSIZE(test_ret, test_len + VARHDRSZ);
>     PG_RETURN_BYTEA_P(test_ret);
> }

> Function definition:

> CREATE OR REPLACE FUNCTION test_palloc() RETURNS BYTEA AS E'<path to shared library>', 'test_palloc' LANGUAGE C
> IMMUTABLE STRICT;

> psql commands to reproduce the problem:

> \o out.txt
> SELECT ids.*, test_palloc() FROM GENERATE_SERIES(1, 1000000) ids;

> At the completion of the above command, host memory will have been consumed but not released back to the system.
> After quitting psql (\q), memory is released.

Well, first off, it's not that function that is eating memory: if you've
got it marked as IMMUTABLE then it will in fact only be executed *once*
per query.  I don't see any evidence of memory leakage on the server
at all when trying a comparable query.  (Disclaimer: I just used
"repeat('x', 1024)" rather than bothering to compile up a .so.  But I
do not think it behaves differently.)

What I do see bloating is psql, which is absorbing a query result of
about 1GB (1000000 1kB-sized rows) and then spending a lot of cycles
to pretty-print that while writing it to out.txt.

On my Linux box, psql does release the memory back to the kernel when
that's over.  But that would depend on the behavior of libc, so it's
quite plausible that some other platforms would not.

The way to avoid this is to not ask the client program to absorb the whole
1GB-sized query result at once.  You could use a cursor and FETCH a few
thousand rows at a time.  (In reasonably recent versions of psql, "\set
FETCH_COUNT" can do that for you automatically, at the cost of possibly
less pretty output formatting.)  Or use COPY, which is implemented in more
of a streaming style to begin with.
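
As a rough illustration, the three options above might look like this in a psql session. This is only a sketch: the batch sizes (1000 and 5000) and the cursor name c are arbitrary choices for illustration and do not come from the thread.

```sql
-- Option 1: have psql fetch the result in batches behind the scenes.
\set FETCH_COUNT 1000
\o out.txt
SELECT ids.*, test_palloc() FROM generate_series(1, 1000000) ids;
\o

-- Option 2: an explicit cursor, fetching a few thousand rows at a time.
BEGIN;
DECLARE c NO SCROLL CURSOR FOR
    SELECT ids.*, test_palloc() FROM generate_series(1, 1000000) ids;
FETCH 5000 FROM c;   -- repeat until no more rows come back
CLOSE c;
COMMIT;

-- Option 3: stream the result with COPY (psql's \copy must stay on one line).
\copy (SELECT ids.*, test_palloc() FROM generate_series(1, 1000000) ids) TO 'out.txt'
```

In each case the client holds at most one batch (or one COPY buffer) in memory instead of the whole 1GB result.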

But at any rate, bottom line is that your problem is client-side not
server-side, and no amount of fooling with the function innards will
change it.

            regards, tom lane


Re: Memory for BYTEA returned by C function is not released until connection is dropped

From
John Leiseboer
Date:
Tom Lane [mailto:tgl@sss.pgh.pa.us] writes:
> But at any rate, bottom line is that your problem is client-side not server-side, and no amount of fooling with the
> function innards will change it.

I wish it were. While monitoring memory on Linux and Windows machines I see that psql memory usage hardly changes, but
PostgreSQL server memory usage increases steadily until the query stops. PostgreSQL server memory usage stays high until
after the client drops the connection. This is definitely a case of the server holding onto memory until the client
drops the connection.

In another case, when I let the query continue until memory is exhausted, it is the PostgreSQL server that crashes with
an "out of memory" error, not the client.

When does the PostgreSQL server call pfree() after a C function has returned to the caller? All I've found in books and
Google searches is:

"What makes palloc() special is that it allocates the memory in the current context and the whole memory is freed in
one go when the context is destroyed."

What "context"? The connection? The transaction? A SQL statement? The function call?

John
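
The quoted description of palloc() can be illustrated with a toy model in plain C. This is a hedged sketch, not PostgreSQL's implementation: ToyContext, toy_alloc, toy_reset and toy_run_query are hypothetical names invented here, and the model says nothing about which context the server actually uses for a function's return value. It only shows why the lifetime of the context decides how much memory stays allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every allocation so the context can find it. */
typedef struct ToyChunk {
    struct ToyChunk *next;
} ToyChunk;

/* Toy stand-in for a memory context: a list of chunks freed together. */
typedef struct ToyContext {
    ToyChunk *chunks;      /* every allocation made in this context */
    size_t    live_chunks; /* number of chunks not yet released */
} ToyContext;

/* Allocate size bytes owned by ctx (hypothetical analogue of palloc). */
void *toy_alloc(ToyContext *ctx, size_t size)
{
    ToyChunk *chunk = malloc(sizeof(ToyChunk) + size);
    if (chunk == NULL)
        abort();
    chunk->next = ctx->chunks;
    ctx->chunks = chunk;
    ctx->live_chunks++;
    return chunk + 1;      /* usable bytes start right after the header */
}

/* Release everything allocated in ctx in one go. */
void toy_reset(ToyContext *ctx)
{
    while (ctx->chunks != NULL) {
        ToyChunk *next = ctx->chunks->next;
        free(ctx->chunks);
        ctx->chunks = next;
    }
    ctx->live_chunks = 0;
}

/*
 * Simulate a query that calls an allocating function once per row and
 * never frees the result itself. Returns the peak number of live chunks.
 */
size_t toy_run_query(size_t rows, int reset_per_row)
{
    ToyContext ctx = { NULL, 0 };
    size_t peak = 0;

    for (size_t row = 0; row < rows; row++) {
        unsigned char *result = toy_alloc(&ctx, 1024);
        result[0] = (unsigned char) row;   /* pretend to use the result */
        if (ctx.live_chunks > peak)
            peak = ctx.live_chunks;
        if (reset_per_row)
            toy_reset(&ctx);               /* short-lived context */
    }
    toy_reset(&ctx);                       /* context destroyed at the end */
    assert(ctx.live_chunks == 0);
    return peak;
}
```

In this model toy_run_query(5, 1) returns 1, because the context is reset after every row, while toy_run_query(5, 0) returns 5, because nothing is released until the context goes away at the end. The question above is, in effect, which of these two lifetimes applies to the memory returned by the function.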