Re: PostgreSQL crashes with SIGSEGV
От | Peter Geoghegan |
---|---|
Тема | Re: PostgreSQL crashes with SIGSEGV |
Дата | |
Msg-id | CAH2-WzmR=Aucv9mjstXRD8SyP0mfw00PSK+bBnpTE3TRwqR67Q@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: PostgreSQL crashes with SIGSEGV (Peter Geoghegan <pg@bowt.ie>) |
Ответы |
Re: PostgreSQL crashes with SIGSEGV
(Peter Geoghegan <pg@bowt.ie>)
|
Список | pgsql-bugs |
On Thu, Dec 14, 2017 at 12:06 PM, Peter Geoghegan <pg@bowt.ie> wrote: > I can reproduce this against REL9_6_STABLE, once work_mem is set to > 4MB, and replacement_sort_tuples is set to 150000. I've figured this one out. It's a double free. Consider this code: (Although REL9_5_STABLE is also affected, the following remarks reference REL9_6_STABLE-only code.) * The slot receives a copied tuple (sometimes allocated in caller memory * context) that will stay valid regardless of future manipulations of the * tuplesort's state. */ bool tuplesort_gettupleslot(/* SNIP */) I wrote that comment block, in commit a5f0bd77a, an early bugfix for 9.6 (I actually mentioned this specific commit to Bernd in passing already). This bug is very similar to the one fixed by a5f0bd77a, and is arguably a broader version of that same bug. Note that I didn't do anything with external sorts until 9.6, so I expect that this affects all stable branches. That said, it may not be possible to produce a hard crash on all versions. More or less by accident. The function opens with this code: MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext); SortTuple stup; bool should_free; if (!tuplesort_gettuple_common(state, forward, &stup, &should_free)) stup.tuple = NULL; MemoryContextSwitchTo(oldcontext); /* SNIP */ ... As long as the slot receives a copied tuple that may or may not be in caller's memory context (it may instead get allocated in a tuplesort-private memory context), and as long as destroying a tuplesort via tuplesort_end() counts as a "future manipulation", then you it's possible to crash with a double-pfree() while not violating the tuplesort_gettupleslot() contract. The memory may or may not be freed as part of tuplesort_end() teardown, based only on accidental implementation factors (that's the first time the memory gets freed when we do crash). Specifically, if state->status == TSS_SORTEDONTAPE, we risk this crash on versions before Postgres 10 (more on why 10 appears to be okay tomorrow), when tuplesort_end() just so happens to have been called (destroying the 2 memory contexts) before the executor's superfluous pfree() within ExecResetTupleTable(). The ambiguity about who owns the tuple memory when is the basic problem, once again. One could argue that it's the caller's fault for not knowing to not pfree() the tuple after tuplesort_end() is called, but I don't buy that because it seems like an undue burden on callers to do that. It seems okay to either provide a very weak, simple guarantee about tuple memory lifetime, or very strong simple guarantee about tuple memory lifetime. That's what routines like tuplesort_gettupleslot() and tuplesort_getindextuple() should limit their contracts to -- we've had enough problems in this area over the years that that seems pretty clear to me. ISTM that an appropriate fix is one that results in having tuplesort_gettupleslot() tuple memory that is *consistently* allocated within caller's own memory context, at least on versions prior to Postgres 10 (it's a bit more complicated in Postgres 10). I'll begin work at a patch for this tomorrow. It's not obvious what approach to take, because on 9.6+, memory is allocated in tuplecontext, not sortcontext. A well-targeted fix will not burden low-level READTUP() routines (that use tuplecontext) with direct knowledge of the problem. -- Peter Geoghegan
В списке pgsql-bugs по дате отправления:
Предыдущее
От: John R PierceДата:
Сообщение: Re: BUG #14978: Import/Export button is disabled in pdAdmin 4.2.0