Re: [BUGS] BUG #14781: server process was terminated by signal 11:Segmentation fault

From Alvaro Herrera
Subject Re: [BUGS] BUG #14781: server process was terminated by signal 11:Segmentation fault
Date
Msg-id 20170816164737.mvd3dl4xgk2ofoia@alvherre.pgsql
In reply to Re: [BUGS] BUG #14781: server process was terminated by signal 11: Segmentation fault  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [BUGS] BUG #14781: server process was terminated by signal 11: Segmentation fault  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Tom Lane wrote:
> Maksim Karaba <Maksim_Karaba@epam.com> writes:
> > Unfortunately we cannot reproduce this issue on other servers, only on production system.
> > And we cannot provide internal database info, schema structure and tables info.
> 
> [ shrug... ]  We may just have to wait for somebody to be more
> forthcoming.
> 
> FWIW, the stack trace seems to indicate that an incorrect plan has been
> generated, ie one that has a remote join node without an EPQ recheck
> subplan.  That mistake in itself is probably pretty deterministic.  The
> reason you can't reproduce the crash easily is that the lack of a subplan
> only manifests as a crash if we enter the EPQ recheck code, and that only
> happens if the query tries to update a row that's just been updated by
> some concurrent query.  So it's not going to crash except under concurrent
> load, which probably also explains why the bug wasn't found long ago.

One way to figure out the exact bug is to explore the sequence of WAL
records that leads to the tuple causing the crash; it should be possible
to create a reproducer by writing an isolationtester script that
produces the same WAL sequence.  That's how we found the bug fixed in
https://git.postgresql.org/pg/commitdiff/459c64d3227f8 for example.

> If you want to push this forward rather than wait for somebody else
> to hit the problem, you could try adding something like
> 
>     if (fsplan->scan.scanrelid == 0 && outerPlanState(node) == NULL &&
>         (estate->es_plannedstmt->commandType != CMD_SELECT ||
>          estate->es_rowMarks))
>         elog(WARNING, "foreign join plan lacks EPQ support");
> 
> near the beginning of postgresBeginForeignScan and then running your app
> on a test server.

Hmm, is there a reason this cannot be included as a sanity check always?
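
To make the condition in the quoted check concrete, here is a minimal
standalone sketch of the same predicate in plain C.  It does not use any
PostgreSQL headers: the ForeignScanModel struct, its field names, the
reduced CmdType enum and lacks_epq_support() are hypothetical stand-ins
for fsplan->scan.scanrelid, outerPlanState(node),
es_plannedstmt->commandType and es_rowMarks, so it only illustrates the
logic and is not the postgres_fdw code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced model of the command types the check cares about. */
typedef enum { CMD_SELECT, CMD_UPDATE, CMD_DELETE } CmdType;

/* Hypothetical stand-in for the executor state the quoted check reads. */
typedef struct
{
    unsigned scanrelid;      /* 0 models a foreign scan that represents a join */
    bool     has_outer_plan; /* models outerPlanState(node) != NULL */
    CmdType  command_type;   /* models estate->es_plannedstmt->commandType */
    bool     has_row_marks;  /* models estate->es_rowMarks being non-empty */
} ForeignScanModel;

/*
 * Mirrors the quoted condition: a remote join with no outer (EPQ recheck)
 * subplan, in a statement that is not a plain SELECT or that has row marks.
 */
bool
lacks_epq_support(const ForeignScanModel *fs)
{
    return fs->scanrelid == 0 && !fs->has_outer_plan &&
           (fs->command_type != CMD_SELECT || fs->has_row_marks);
}

/* Small usage example exercising the cases discussed in the thread. */
void
self_check(void)
{
    ForeignScanModel join_in_update = {0, false, CMD_UPDATE, false};
    ForeignScanModel join_with_subplan = {0, true, CMD_UPDATE, false};
    ForeignScanModel plain_select_join = {0, false, CMD_SELECT, false};

    assert(lacks_epq_support(&join_in_update));     /* would warn */
    assert(!lacks_epq_support(&join_with_subplan)); /* recheck subplan present */
    assert(!lacks_epq_support(&plain_select_join)); /* no row marks, no warning */
}
```

In this model a foreign join inside an UPDATE with no outer plan trips
the check, while the same join in a plain SELECT without row marks does
not, which matches the quoted condition.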

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
