Thread: reorderbuffer: memory overconsumption with medium-size subxacts


reorderbuffer: memory overconsumption with medium-size subxacts

From: Alvaro Herrera
Hello

Found this on Postgres 9.6, but I think it affects back to 9.4.

I've seen a case where reorderbuffer keeps very large amounts of memory
in use, without spilling to disk, if the main transaction makes few or
no changes and many subtransactions execute changes just below the
threshold to spill to disk.

The particular case we've seen is that the main transaction does one
UPDATE, then each subtransaction does somewhere between 300 and 4000
changes.  Since all of these stay below max_changes_in_memory, nothing
gets spilled to disk.  (To make matters worse: even if some subxacts do
exceed max_changes_in_memory, only those subxacts are spilled, not the
whole transaction.)  This was causing a machine with 16 GB of RAM to
die, unable to process the long transaction; we had to add an
additional 16 GB of physical RAM for the machine to be able to process
the transaction.
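
For context, the spill decision in reorderbuffer.c is made per
ReorderBufferTXN, so each subxact is checked against the threshold on
its own; paraphrasing the relevant bit (not the exact source):

    /* paraphrased: each (sub)xact only looks at its own change count */
    static void
    ReorderBufferCheckSerializeTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
    {
        /*
         * N subxacts can each sit just below max_changes_in_memory and
         * nothing ever gets written out, even though the toplevel xact as a
         * whole may hold close to N * max_changes_in_memory changes.
         */
        if (txn->nentries_mem >= max_changes_in_memory)
            ReorderBufferSerializeTXN(rb, txn);
    }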

I think there's a one-line fix, attached: just add the number of changes
in a subxact to nentries_mem when the transaction is assigned to the
parent.  Since a WAL ASSIGNMENT record happens once every 32 subxacts,
this accumulates only that many subxacts' worth of changes in memory
before spilling, which is much more reasonable.  (Hmm, I wonder why this
happens every 32 subxacts, when the code seems to be using
PGPROC_MAX_CACHED_SUBXIDS, which is 64.)
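
The gist of it, as a sketch (the attached patch is the authoritative
version; I'd do it presumably in ReorderBufferAssignChild(), variable
names as in that function):

    /* sketch only: once subtxn has been looked up and linked to its
     * toplevel txn, fold its in-memory change count into the parent's */
    txn->nentries_mem += subtxn->nentries_mem;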

Hmm, while writing this I wonder whether this affects cases with many
levels of subtransactions.  I'm not sure how nested subxacts are handled
by reorderbuffer.c, but reading the code I think it is okay.

Of course, there's Tomas' logical_work_mem patch too, but that's too
invasive to backpatch.

-- 
Álvaro Herrera                PostgreSQL Expert, https://www.2ndQuadrant.com/

Attachments

Re: reorderbuffer: memory overconsumption with medium-size subxacts

From: Tomas Vondra
On 12/16/18 4:06 PM, Alvaro Herrera wrote:
> Hello
> 
> Found this on Postgres 9.6, but I think it affects back to 9.4.
> 
> I've seen a case where reorderbuffer keeps very large amounts of memory
> in use, without spilling to disk, if the main transaction makes few or
> no changes and many subtransactions execute changes just below the
> threshold to spill to disk.
> 
> The particular case we've seen is that the main transaction does one
> UPDATE, then each subtransaction does somewhere between 300 and 4000
> changes.  Since all of these stay below max_changes_in_memory, nothing
> gets spilled to disk.  (To make matters worse: even if some subxacts do
> exceed max_changes_in_memory, only those subxacts are spilled, not the
> whole transaction.)  This was causing a machine with 16 GB of RAM to
> die, unable to process the long transaction; we had to add an
> additional 16 GB of physical RAM for the machine to be able to process
> the transaction.
> 

Yeah. We do the check for each xact separately, so it's vulnerable to
this scenario (each subxact staying just below max_changes_in_memory).

> I think there's a one-line fix, attached: just add the number of changes
> in a subxact to nentries_mem when the transaction is assigned to the
> parent.  Since a WAL ASSIGNMENT record happens once every 32 subxacts,
> this accumulates only that many subxacts' worth of changes in memory
> before spilling, which is much more reasonable.  (Hmm, I wonder why this
> happens every 32 subxacts, when the code seems to be using
> PGPROC_MAX_CACHED_SUBXIDS, which is 64.)
> 

Not sure, for a couple of reasons ...

Doesn't that essentially mean we'd always evict toplevel xact(s),
including all subxacts, no matter how tiny those subxacts are? That
seems like it might easily cause regressions for the common case with
many small subxacts and one huge subxact (which is the only one we'd
currently spill, I think). That seems annoying.

But even if we decide it's the right approach, isn't the proposed patch
a couple of bricks shy? It redefines the nentries_mem field from "per
(sub)xact" to "total" for the toplevel xact, but then updates it only
when assigning the child. But the subxacts may receive changes after
that, so ReorderBufferQueueChange() probably needs to update the value
for the toplevel xact too, I guess. And ReorderBufferCheckSerializeTXN()
should probably check the toplevel xact too ...
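
Something along these lines, I mean (untested sketch, just to show where
the counter would also have to be maintained, assuming the subxact can
reach its toplevel via toplevel_xid):

    /* in ReorderBufferQueueChange(), after bumping the subxact's own counters */
    txn->nentries++;
    txn->nentries_mem++;

    /* the toplevel xact's "total" would have to be kept in sync as well */
    if (txn->is_known_as_subxact)
    {
        ReorderBufferTXN *toptxn = ReorderBufferTXNByXid(rb, txn->toplevel_xid,
                                                         false, NULL,
                                                         InvalidXLogRecPtr, false);

        if (toptxn != NULL)
            toptxn->nentries_mem++;
    }

    ReorderBufferCheckSerializeTXN(rb, txn);
    /* ... and presumably the serialize check would need to look at toptxn too */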

Maybe a simpler solution would be to simply track the total number of
changes in memory (from all xacts), and then evict the largest one. But
I doubt that would be backpatchable - it's pretty much what the
logical_work_mem patch does.
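
Roughly this shape (hypothetical names, and it is closer to what
logical_work_mem does than to anything backpatchable):

    /* hypothetical: one counter on the ReorderBuffer instead of per TXN */
    rb->total_nentries_mem++;

    if (rb->total_nentries_mem >= max_changes_in_memory)
    {
        /* hypothetical helper picking the xact with most in-memory changes */
        ReorderBufferTXN *largest = ReorderBufferLargestTXN(rb);
        uint64      freed = largest->nentries_mem;

        ReorderBufferSerializeTXN(rb, largest);
        rb->total_nentries_mem -= freed;
    }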

And of course, addressing this on pg11 is a bit more complicated due to
the behavior of Generation context (see the logical_work_mem thread for
details).

> Hmm, while writing this I wonder whether this affects cases with many
> levels of subtransactions.  I'm not sure how nested subxacts are handled
> by reorderbuffer.c, but reading the code I think it is okay.
> 

That should be OK. The assignments don't care about the intermediate
subxacts, they only care about the toplevel xact and current subxact.
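
For reference, the assignment record carries nothing but the toplevel
xid and a list of subxids (roughly as declared in access/xact.h), so
every subxact gets mapped straight to the toplevel xact regardless of
nesting depth:

    typedef struct xl_xact_assignment
    {
        TransactionId xtop;         /* assigned XID's toplevel XID */
        int           nsubxacts;    /* number of subtransaction XIDs */
        TransactionId xsub[FLEXIBLE_ARRAY_MEMBER];  /* assigned subxids */
    } xl_xact_assignment;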

> Of course, there's Tomas' logical_work_mem patch too, but that's too
> invasive to backpatch.
> 

Yep.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: reorderbuffer: memory overconsumption with medium-size subxacts

From: Andres Freund
Hi,

On 2018-12-16 12:06:16 -0300, Alvaro Herrera wrote:
> Found this on Postgres 9.6, but I think it affects back to 9.4.
> 
> I've seen a case where reorderbuffer keeps very large amounts of memory
> in use, without spilling to disk, if the main transaction makes few or
> no changes and many subtransactions execute changes just below the
> threshold to spill to disk.
> 
> The particular case we've seen is that the main transaction does one
> UPDATE, then each subtransaction does somewhere between 300 and 4000
> changes.  Since all of these stay below max_changes_in_memory, nothing
> gets spilled to disk.  (To make matters worse: even if some subxacts do
> exceed max_changes_in_memory, only those subxacts are spilled, not the
> whole transaction.)  This was causing a machine with 16 GB of RAM to
> die, unable to process the long transaction; we had to add an
> additional 16 GB of physical RAM for the machine to be able to process
> the transaction.
> 
> I think there's a one-line fix, attached: just add the number of changes
> in a subxact to nentries_mem when the transaction is assigned to the
> parent.

Isn't this going to cause significant breakage, because we rely on
nentries_mem to be accurate?

    /* try to load changes from disk */
    if (entry->txn->nentries != entry->txn->nentries_mem)


Greetings,

Andres Freund


Re: reorderbuffer: memory overconsumption with medium-size subxacts

From: Alvaro Herrera
On 2018-Dec-16, Andres Freund wrote:

> > I think there's a one-line fix, attached: just add the number of changes
> > in a subxact to nentries_mem when the transaction is assigned to the
> > parent.
> 
> Isn't this going to cause significant breakage, because we rely on
> nentries_mem to be accurate?
> 
>     /* try to load changes from disk */
>     if (entry->txn->nentries != entry->txn->nentries_mem)

Bahh.

Are you suggesting I should try a more thorough rewrite of
reorderbuffer.c?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: reorderbuffer: memory overconsumption with medium-size subxacts

From: Andres Freund
Hi,

On 2018-12-16 17:30:30 -0300, Alvaro Herrera wrote:
> On 2018-Dec-16, Andres Freund wrote:
> 
> > > I think there's a one-line fix, attached: just add the number of changes
> > > in a subxact to nentries_mem when the transaction is assigned to the
> > > parent.
> > 
> > Isn't this going to cause significant breakage, because we rely on
> > nentries_mem to be accurate?
> > 
> >     /* try to load changes from disk */
> >     if (entry->txn->nentries != entry->txn->nentries_mem)
> 
> Bahh.
> 
> Are you suggesting I should try a more thorough rewrite of
> reorderbuffer.c?

I'm saying you can't just randomly poke at one variable without actually
checking how it's used.  I kinda doubt you're going to find something
that's immediately backpatchable here. Perhaps something for master
that's then backpatched after it has matured for a while.

Greetings,

Andres Freund