Discussion: DO with a large amount of statements get stuck with high memory consumption


DO with a large amount of statements get stuck with high memory consumption

From:
Merlin Moncure
Date:
I've noticed that pl/pgsql functions/do commands do not behave well
when the statement resolves and frees memory.   To be clear:

FOR i in 1..1000000
LOOP
  INSERT INTO foo VALUES (i);
END LOOP;

...runs just fine while

BEGIN
  INSERT INTO foo VALUES (1);
  INSERT INTO foo VALUES (2);
  ...
  INSERT INTO foo VALUES (1000000);
END;

(for the curious, create a script yourself via
copy (
  select
    'do $$begin create temp table foo(i int);'
  union all select
    format('insert into foo values (%s);', i) from generate_series(1,1000000) i
  union all select 'raise notice ''abandon all hope!''; end; $$;'
) to '/tmp/breakit.sql';

...while the latter consumes an amount of resident memory proportional
to the number of statements and eventually crashes the server.  The
problem is obvious; each statement causes a plan to get created and the
server gets stuck in a loop where SPI_freeplan() is called repeatedly.
Everything is working as designed I guess, but when this happens it's
really unpleasant: the query is uncancellable and unterminatable, nicht
gut.  A pg_ctl kill ABRT <pid> will do the trick but I was quite
astonished to see linux take a few minutes to clean up the mess (!) on
a somewhat pokey virtualized server with lots of memory.  With even as
little as ten thousand statements the cleanup time far exceeds the
runtime of the statement block.

I guess the key takeaway here is, "don't do that"; pl/pgsql
aggressively generates plans and turns out to be a poor choice for
bulk loading because of all the plan caching.  Having said that, I
can't help but wonder if there should be a (perhaps user configurable)
limit to the number of SPI plans a single function call should be able
to acquire, on the basis that you are going to smack into very poor
behavior in the memory subsystem.
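
(Purely as an illustration of the alternative -- a sketch of mine, not
something from the incident: a single set-based statement, or COPY,
keeps the plan count constant no matter how many rows get loaded.)

-- sketch only: the same bulk load without one cached plan per row,
-- assuming the same foo(i int) table as above
INSERT INTO foo SELECT i FROM generate_series(1, 1000000) i;

-- or stream the rows with COPY, which bypasses pl/pgsql entirely
-- (the file path is just a placeholder):
-- COPY foo FROM '/tmp/foo_rows.csv' WITH (FORMAT csv);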

Stepping back, I can't help but wonder what the value of all this plan
caching is for statement blocks at all.  Loops might comprise a
notable exception, noted.  I'd humbly submit though that (relative to
functions) it's much more likely that you want to do something like
run a large number of one-off insert statements, where it's impossible
to utilize any cached plans.
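
(Another sketch of mine to spell out that asymmetry, again assuming the
foo(i int) table; the function name add_row is just an illustration.)

-- a plpgsql function caches the INSERT's plan the first time it runs
-- and reuses it on every later call in the session:
create function add_row(p int) returns void as $$
begin
  insert into foo values (p);
end;
$$ language plpgsql;

select add_row(i) from generate_series(1, 1000) i;  -- one cached plan, reused 1000 times

-- a one-shot DO block with the inserts written out caches one plan per
-- statement, and each of those plans is executed exactly once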

This is not an academic gripe -- I just exploded production :-D.

merlin



Re: DO with a large amount of statements get stuck with high memory consumption

From:
Jan Wieck
Date:


On Tue, Jul 12, 2016 at 3:29 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
I've noticed that pl/pgsql functions/do commands do not behave well
when the statement resolves and frees memory.   To be clear:

FOR i in 1..1000000
LOOP
  INSERT INTO foo VALUES (i);
END LOOP;

...runs just fine while

BEGIN
  INSERT INTO foo VALUES (1);
  INSERT INTO foo VALUES (2);
  ...
  INSERT INTO foo VALUES (1000000);
END;

This sounds very much like what led to commit 25c539233044c235e97fd7c9dc600fb5f08fe065.

It seems that patch was only applied to master and never backpatched to 9.5 or earlier.


Regards, Jan


 




--
Jan Wieck
Senior Postgres Architect

Re: DO with a large amount of statements get stuck with high memory consumption

From:
Jan Wieck
Date:
BTW, here is the email thread about the double-linking of MemoryContext
children patch, which Kevin eventually committed to master.



Regards, Jan





--
Jan Wieck
Senior Postgres Architect

Re: DO with a large amount of statements get stuck with high memory consumption

From:
Merlin Moncure
Date:
On Sat, Jul 16, 2016 at 2:47 PM, Jan Wieck <jan@wi3ck.info> wrote:
> On Tue, Jul 12, 2016 at 3:29 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>
>> I've noticed that pl/pgsql functions/do commands do not behave well
>> when the statement resolves and frees memory.   To be clear:
>>
>> FOR i in 1..1000000
>> LOOP
>>   INSERT INTO foo VALUES (i);
>> END LOOP;
>>
>> ...runs just fine while
>>
>> BEGIN
>>   INSERT INTO foo VALUES (1);
>>   INSERT INTO foo VALUES (2);
>>   ...
>>   INSERT INTO foo VALUES (1000000);
>> END;
>
>
> This sounds very much like what led to commit
> 25c539233044c235e97fd7c9dc600fb5f08fe065.
>
> It seems that patch was only applied to master and never backpatched to 9.5
> or earlier.

You're right; thanks (my bad for missing that).  For those following
along, the case that turned this up was:
DO
<create temp table stuff>
<insert into stuff>
...;

Where the insertion step was a large number of standalone insert statements.

(temp table creation isn't necessary to turn up this bug, but it's a
common pattern when sending batch updates to a server).
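
Spelled out (and heavily abbreviated), the generated breakit.sql from
my first mail boils down to exactly that shape:

do $$
begin
  create temp table foo(i int);
  insert into foo values (1);
  insert into foo values (2);
  -- ... roughly a million more standalone inserts ...
  insert into foo values (1000000);
  raise notice 'abandon all hope!';
end;
$$;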

For those following along, the workaround I recommend would be to do this:

do $d$
begin
<create temp table stuff>
create function doit() returns void as
$$ <insert into stuff>
$$ language sql;
perform doit();
end;
$d$;
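
A minimal concrete version of that, with the placeholders filled in and
a toy three-row batch standing in for the real bulk load (sketch only):

do $d$
begin
  create temp table stuff(i int);
  create function doit() returns void as
  $$
    insert into stuff values (1);
    insert into stuff values (2);
    insert into stuff values (3);
  $$ language sql;
  perform doit();
  drop function doit();  -- my addition; the sketch above leaves it in place
end;
$d$;

The point is that the whole batch of inserts becomes the body of a
single CREATE FUNCTION statement, so the surrounding pl/pgsql block
only ever caches a handful of plans instead of one per insert.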

BTW, while the fix does address the cleanup performance issue, it's
still the case that anonymous code blocks burn up lots of resident
memory (my 315k example I tested with ate around 8gb IIRC) when run
like this.  My question is, if the pl/pgsql code block is anonymous
and not in some kind of a loop, why bother caching the plan at all?

merlin



Re: DO with a large amount of statements get stuck with high memory consumption

From:
Tom Lane
Date:
Merlin Moncure <mmoncure@gmail.com> writes:
> BTW, while the fix does address the cleanup performance issue, it's
> still the case that anonymous code blocks burn up lots of resident
> memory (my 315k example I tested with ate around 8gb IIRC) when run
> like this.  My question is, if the pl/pgsql code block is anonymous
> and not in some kind of a loop, why bother caching the plan at all?

Nobody got around to it.  Also, as you note, it's not as simple as
"don't cache if in a DO block".  You'd need to track whether you were
inside any sort of looping construct.  Depending on how difficult
that turned out to be, it might add overhead to regular functions
that we don't want.
        regards, tom lane



Re: DO with a large amount of statements get stuck with high memory consumption

From:
Jan Wieck
Date:


On Mon, Jul 18, 2016 at 9:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Merlin Moncure <mmoncure@gmail.com> writes:
> BTW, while the fix does address the cleanup performance issue, it's
> still the case that anonymous code blocks burn up lots of resident
> memory (my 315k example I tested with ate around 8gb IIRC) when run
> like this.  My question is, if the pl/pgsql code block is anonymous
> and not in some kind of a loop, why bother caching the plan at all?

Nobody got around to it.  Also, as you note, it's not as simple as
"don't cache if in a DO block".  You'd need to track whether you were
inside any sort of looping construct.  Depending on how difficult
that turned out to be, it might add overhead to regular functions
that we don't want.

Agreed. And from the structures themselves it is not really easy to detect
if you are inside of a loop: the toplevel, while, for and if all use the
same statement block and call exec_stmts(), which in turn calls exec_stmt()
for each element in that list. It is not impossible to add a flag, set at
PL compile time, to that element and check it every time the statement is
executed. But such a change definitely needs more testing and probably
won't qualify for backpatching.

In the meantime, would it be appropriate to backpatch the double linking
of memory context children at this time? I believe it has had plenty of
testing in the 9.6 cycle to be sure it didn't break anything.


Regards, Jan

--
Jan Wieck
Senior Postgres Architect

Re: DO with a large amount of statements get stuck with high memory consumption

From:
Merlin Moncure
Date:
On Mon, Jul 18, 2016 at 8:59 AM, Jan Wieck <jan@wi3ck.info> wrote:
>
>
> On Mon, Jul 18, 2016 at 9:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Merlin Moncure <mmoncure@gmail.com> writes:
>> > BTW, while the fix does address the cleanup performance issue, it's
>> > still the case that anonymous code blocks burn up lots of resident
>> > memory (my 315k example I tested with ate around 8gb IIRC) when run
>> > like this.  My question is, if the pl/pgsql code block is anonymous
>> > and not in some kind of a loop, why bother caching the plan at all?
>>
>> Nobody got around to it.  Also, as you note, it's not as simple as
>> "don't cache if in a DO block".  You'd need to track whether you were
>> inside any sort of looping construct.  Depending on how difficult
>> that turned out to be, it might add overhead to regular functions
>> that we don't want.
>
> Agreed. And from the structures themselves it is not really easy to detect
> if you are inside of a loop: the toplevel, while, for and if all use the
> same statement block and call exec_stmts(), which in turn calls exec_stmt()
> for each element in that list. It is not impossible to add a flag, set at
> PL compile time, to that element and check it every time the statement is
> executed. But such a change definitely needs more testing and probably
> won't qualify for backpatching.

Right. Note, not arguing for backpatch here, just some open
speculation and some evidence that we still have a problem (although
not nearly as nasty a one -- the pre-patch behavior of not responding
to cancel was very dangerous, and is now solved).

Hm, maybe, instead of trying to figure out if in a loop, set a
'called' flag  with each statement and only cache when touched the
second time.  (If that's easier, dunno).

merlin



Re: DO with a large amount of statements get stuck with high memory consumption

From:
Tom Lane
Date:
Jan Wieck <jan@wi3ck.info> writes:
> In the meantime, would it be appropriate to backpatch the double linking
> of memory context children at this time? I believe it has had plenty of
> testing in the 9.6 cycle to be sure it didn't break anything.

I'm concerned about the ABI breakage risk from changing a data structure
as fundamental as MemoryContext.  Yeah, code outside utils/mmgr probably
*shouldn't* be looking at that struct, but that doesn't mean it isn't.
In the past we've generally only taken that sort of risk when there was
no other way to fix a bug; and this patch isn't a bug fix.  While this
does help performance in some corner cases, I don't think it's enough of
an across-the-board win to justify the risk of back-patching.
        regards, tom lane



Re: DO with a large amount of statements get stuck with high memory consumption

From:
Tom Lane
Date:
Merlin Moncure <mmoncure@gmail.com> writes:
> Hm, maybe, instead of trying to figure out if in a loop, set a
> 'called' flag  with each statement and only cache when touched the
> second time.  (If that's easier, dunno).

Well, then you just guarantee to lose once.  I think Jan's sketch of
marking potentially-cacheable expressions at compile time sounds
promising.  I'm envisioning a counter that starts at 1 normally or 0 in a
DO block, increment at beginning of parsing a loop construct, decrement at
end; then mark expressions/statements as cacheable if counter>0.

Of course the next question is what exactly to do differently for a
noncacheable expression.  My recollection is that that's all tied
pretty tightly to the plancache these days, so it might take a little
work.
        regards, tom lane



Re: DO with a large amount of statements get stuck with high memory consumption

From:
Jan Wieck
Date:


On Mon, Jul 18, 2016 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Jan Wieck <jan@wi3ck.info> writes:
> In the meantime, would it be appropriate to backpatch the double linking
> of memory context children at this time? I believe it has had plenty of
> testing in the 9.6 cycle to be sure it didn't break anything.

I'm concerned about the ABI breakage risk from changing a data structure
as fundamental as MemoryContext.  Yeah, code outside utils/mmgr probably
*shouldn't* be looking at that struct, but that doesn't mean it isn't.
In the past we've generally only taken that sort of risk when there was
no other way to fix a bug; and this patch isn't a bug fix.  While this
does help performance in some corner cases, I don't think it's enough of
an across-the-board win to justify the risk of back-patching.

I would consider mucking with the linked lists of memory context children inside
of 3rd party code a really bad idea, but I concede. It isn't a bug fix and there is
that small risk that somebody did precisely that, so no backpatch.


Regards, Jan

--
Jan Wieck
Senior Postgres Architect