Обсуждение: strong memory leak in plpgsql from handled rollback and lazy cast

Поиск
Список
Период
Сортировка

strong memory leak in plpgsql from handled rollback and lazy cast

От
Pavel Stehule
Дата:
Hi

When I tested some hypothesis I wrote buggy code. It was surprise how fast I lost all free memory

do $$                  
begin
  for i in 1..3000000
  loop
    begin
      -- do some error
      if i then end if;
    exception when others then
      -- do nothing
    end;
  end loop;
end;
$$;

problem is somewhere in implicit casting inside IF statement. When I use explicit casting -

  IF i::boolean THEN

then there is not memory leak.

Regards

Pavel

Re: strong memory leak in plpgsql from handled rollback and lazy cast

От
Tom Lane
Дата:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> When I tested some hypothesis I wrote buggy code. It was surprise how fast
> I lost all free memory

> do $$
> begin
>   for i in 1..3000000
>   loop
>     begin
>       -- do some error
>       if i then end if;
>     exception when others then
>       -- do nothing
>     end;
>   end loop;
> end;
> $$;

Yeah, this is because an error gets thrown inside the cast-to-boolean.
It's intentional that the execution state tree gets thrown away if that
happens, per the comment in get_cast_hashentry:

     * Prepare the expression for execution, if it's not been done already in
     * the current transaction; also, if it's marked busy in the current
     * transaction, abandon that expression tree and build a new one, so as to
     * avoid potential problems with recursive cast expressions and failed
     * executions.  (We will leak some memory intra-transaction if that
     * happens a lot, but we don't expect it to.)  It's okay to update the

I'm not convinced that it'd be safe to re-use an ExprState after a
previous execution failed (though perhaps Andres has a different opinion?)
so I think the only way to avoid the intratransaction memory leak would
be to set up each new cast ExprState in its own memory context that we
could free.  That seems like adding quite a lot of overhead to get rid
of a leak that we've been living with for ages.

Maybe we could pay the extra overhead only after the expression has
failed at least once.  Seems a bit messy though, and I'm afraid that
we'd have to add PG_TRY overhead in any case.

            regards, tom lane



Re: strong memory leak in plpgsql from handled rollback and lazy cast

От
Andres Freund
Дата:
Hi,

On 2019-09-22 18:43:23 -0400, Tom Lane wrote:
> I'm not convinced that it'd be safe to re-use an ExprState after a
> previous execution failed (though perhaps Andres has a different
> opinion?)

I don't immediately see why it'd be problematic to reuse at a later
time, as long as it's guaranteed that a) there's only one execution
happening at a time b) the failure wasn't in the middle of writing a
value.  a) would be problematic regardless of reuse-after-failure, and
b) should be satisfied by only failing at ereport etc.

Most memory writes during ExprState evaluation are redone from scratch
every execution, and the remaining things should only be things like
tupledesc's being cached at first execution. And that even uses an
ExprContext callback to reset the cache on context shutdown.

The other piece is that on the first execution of a expression we use
ExecInterpExprStillValid, and we don't on later executions. Not sure if
that's relevant here?


> so I think the only way to avoid the intratransaction memory leak would
> be to set up each new cast ExprState in its own memory context that we
> could free.  That seems like adding quite a lot of overhead to get rid
> of a leak that we've been living with for ages.

Hm. I interestingly am working on a patch that merges all the memory
allocations done for an ExprState into one or two allocations (by
basically doing the traversal twice). Then it'd be feasible to just
pfree() the memory, if that helps.

Greetings,

Andres Freund



Re: strong memory leak in plpgsql from handled rollback and lazy cast

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2019-09-22 18:43:23 -0400, Tom Lane wrote:
>> I'm not convinced that it'd be safe to re-use an ExprState after a
>> previous execution failed (though perhaps Andres has a different
>> opinion?)

> I don't immediately see why it'd be problematic to reuse at a later
> time, as long as it's guaranteed that a) there's only one execution
> happening at a time b) the failure wasn't in the middle of writing a
> value.  a) would be problematic regardless of reuse-after-failure, and
> b) should be satisfied by only failing at ereport etc.

I think you're considering a much smaller slice of the system than
what seems to me to be at risk here.  As an example, an ExprState
tree would also include any fn_extra-linked state that individual
functions might've set up.  We've got very little control over what
the validity requirements are for those or how robust the code that
creates them is.  I *think* that most of the core code that makes
such things is written in a way that it doesn't leave partially-valid
cache state if setup fails partway through ... but I wouldn't swear
that it all is, and I'd certainly bet money on there being third-party
code that isn't careful about that.

> Hm. I interestingly am working on a patch that merges all the memory
> allocations done for an ExprState into one or two allocations (by
> basically doing the traversal twice). Then it'd be feasible to just
> pfree() the memory, if that helps.

Again, that'd do nothing to clean up subsidiary fn_extra state.
If we want no leaks, we need a separate context for the tree
to live in.

            regards, tom lane



Re: strong memory leak in plpgsql from handled rollback and lazy cast

От
Andres Freund
Дата:
Hi,

On 2019-09-22 20:16:15 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2019-09-22 18:43:23 -0400, Tom Lane wrote:
> >> I'm not convinced that it'd be safe to re-use an ExprState after a
> >> previous execution failed (though perhaps Andres has a different
> >> opinion?)
> 
> > I don't immediately see why it'd be problematic to reuse at a later
> > time, as long as it's guaranteed that a) there's only one execution
> > happening at a time b) the failure wasn't in the middle of writing a
> > value.  a) would be problematic regardless of reuse-after-failure, and
> > b) should be satisfied by only failing at ereport etc.
> 
> I think you're considering a much smaller slice of the system than
> what seems to me to be at risk here.

Yea, I was only referencing the expression eval logic itself, as I
understood your question to aim mainly at that...


> As an example, an ExprState tree would also include any
> fn_extra-linked state that individual functions might've set up.
> We've got very little control over what the validity requirements are
> for those or how robust the code that creates them is.  I *think* that
> most of the core code that makes such things is written in a way that
> it doesn't leave partially-valid cache state if setup fails partway
> through ... but I wouldn't swear that it all is, and I'd certainly bet
> money on there being third-party code that isn't careful about that.

Hm. I'd be kinda willing to just declare such code broken. But given
that the memory leak situation, as you say, still exists, I don't think
it matters for now.


> > Hm. I interestingly am working on a patch that merges all the memory
> > allocations done for an ExprState into one or two allocations (by
> > basically doing the traversal twice). Then it'd be feasible to just
> > pfree() the memory, if that helps.
> 
> Again, that'd do nothing to clean up subsidiary fn_extra state.
> If we want no leaks, we need a separate context for the tree
> to live in.

Well, it'd presumably leak a lot less :/.

Greetings,

Andres Freund



Re: strong memory leak in plpgsql from handled rollback and lazy cast

От
Andres Freund
Дата:
Hi,

On 2019-09-22 20:16:15 -0400, Tom Lane wrote:
> I think you're considering a much smaller slice of the system than
> what seems to me to be at risk here.  As an example, an ExprState
> tree would also include any fn_extra-linked state that individual
> functions might've set up.

> > Hm. I interestingly am working on a patch that merges all the memory
> > allocations done for an ExprState into one or two allocations (by
> > basically doing the traversal twice). Then it'd be feasible to just
> > pfree() the memory, if that helps.
>
> Again, that'd do nothing to clean up subsidiary fn_extra state.
> If we want no leaks, we need a separate context for the tree
> to live in.

As mentioned, as part of some expression evaluation improvements (both
w/ and wo/ jit) I now have a prototype patch that moves nearly all the
dynamically allocated memory, including the FunctionCallInfo, into one
chunk of memory. That's currently allocated together with the ExprState.
With the information collected for that it'd be fairly trivial to reset
things like fn_extra in a reasonably efficient manner, without needing
knowledge about each ExprEvalOp.

Obviously that'd not itself change e.g. anything about the lifetime of
the memory pointed to by fn_extra etc. But it'd not be particularly hard
to have FmgrInfo->fn_mcxt point somewhere else.

As part of the above rework ExecInitExprRec() etc all now pass down a
new ExprStateBuilder * object, containing state needed somewhere down in
the expression tree.  It'd e.g. be quite possible to add an
ExecInitExpr() variant that allows to specify in more detail what memory
context ought to be used for what.

I'm however not at all sure it's worth investing time into doing so
specifically for this case. But it seems like it might generally be
something we'll need more infrastructure for in other cases too.

Greetings,

Andres Freund