Thread: Peripatus/failures


Peripatus/failures

From: Larry Rosenman
Date:

It looks like my upgrade to the current head of FreeBSD 12-to-be, which includes OpenSSL 1.1.1, broke a bunch of our stuff.

 

In at least the 9.x branches.  Just a heads up.

 

--

Larry Rosenman                     http://www.lerctr.org/~ler

Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org

US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

Re: Peripatus/failures

From: Tom Lane
Date:
Larry Rosenman <ler@lerctr.org> writes:
> It looks like my upgrade to the current head of FreeBSD 12-to-be, which includes OpenSSL 1.1.1, broke a bunch of our stuff.
> In at least the 9.x branches.  Just a heads up.

It looks like configure is drawing the wrong conclusions about OpenSSL's
API options.  Since the configure outputs are cached, perhaps clearing
your buildfarm critter's accache subdirectory would be enough to fix that.

            regards, tom lane


Re: Peripatus/failures

From: Larry Rosenman
Date:
On Wed, Oct 17, 2018 at 01:41:59PM -0400, Tom Lane wrote:
> Larry Rosenman <ler@lerctr.org> writes:
> > It looks like my upgrade to the current head of FreeBSD 12-to-be, which includes OpenSSL 1.1.1, broke a bunch of our stuff.
> > In at least the 9.x branches.  Just a heads up.
> 
> It looks like configure is drawing the wrong conclusions about OpenSSL's
> API options.  Since the configure outputs are cached, perhaps clearing
> your buildfarm critter's accache subdirectory would be enough to fix that.
> 

That got it further, but still fails at PLCheck-C (at least on 9.3).
It's still running the other branches.


-- 
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106


DSM robustness failure (was Re: Peripatus/failures)

From: Tom Lane
Date:
Larry Rosenman <ler@lerctr.org> writes:
> That got it further, but still fails at PLCheck-C (at least on 9.3).
> It's still running the other branches.

Hmm.  I'm not sure why plpython is crashing for you, but this is exposing
a robustness problem in the DSM logic:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2018-10-17%2018%3A22%3A50

The postmaster is suffering an Assert failure while trying to clean up
after the crash:

2018-10-17 13:43:23.203 CDT [51974:8] pg_regress LOG:  statement: SELECT import_succeed();
2018-10-17 13:43:24.228 CDT [46467:2] LOG:  server process (PID 51974) was terminated by signal 11: Segmentation fault
2018-10-17 13:43:24.228 CDT [46467:3] DETAIL:  Failed process was running: SELECT import_succeed();
2018-10-17 13:43:24.229 CDT [46467:4] LOG:  terminating any other active server processes
2018-10-17 13:43:24.229 CDT [46778:2] WARNING:  terminating connection because of crash of another server process
2018-10-17 13:43:24.229 CDT [46778:3] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2018-10-17 13:43:24.229 CDT [46778:4] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2018-10-17 13:43:24.235 CDT [46467:5] LOG:  all server processes terminated; reinitializing
2018-10-17 13:43:24.235 CDT [46467:6] LOG:  dynamic shared memory control segment is corrupt
TRAP: FailedAssertion("!(dsm_control_mapped_size == 0)", File: "dsm.c", Line: 181)

It looks to me like what's happening is

(1) crashing process corrupts the DSM control segment somehow.

(2) dsm_postmaster_shutdown notices that, bleats to the log, and
figures its job is done.

(3) dsm_postmaster_startup crashes on Assert because
dsm_control_mapped_size isn't 0, because the old seg is still mapped.

I would argue that both dsm_postmaster_shutdown and dsm_postmaster_startup
are broken here; the former because it makes no attempt to unmap
the old control segment (which it oughta be able to do no matter how badly
broken the contents are), and the latter because it should not let
garbage old state prevent it from establishing a valid new segment.
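In outline, the interaction looks roughly like this (a paraphrase of the
logic under discussion, not verbatim dsm.c source):

/* dsm_postmaster_shutdown(): on a corrupt header it currently just bleats
 * and returns, leaving the old control segment mapped. */
if (!dsm_control_segment_sane(dsm_control, dsm_control_mapped_size))
{
	ereport(LOG,
			(errmsg("dynamic shared memory control segment is corrupt")));
	return;
}

/* dsm_postmaster_startup(): during the subsequent reinitialization it then
 * asserts that nothing is mapped, which no longer holds. */
Assert(dsm_control_mapped_size == 0);	/* the TRAP at dsm.c line 181 */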

BTW, the header comment on dsm_postmaster_startup is a lie, which
is probably not unrelated to its failure to consider this situation.

            regards, tom lane


Re: DSM robustness failure (was Re: Peripatus/failures)

From: Thomas Munro
Date:
On Thu, Oct 18, 2018 at 9:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 2018-10-17 13:43:24.235 CDT [46467:6] LOG:  dynamic shared memory control segment is corrupt
> TRAP: FailedAssertion("!(dsm_control_mapped_size == 0)", File: "dsm.c", Line: 181)
>
> It looks to me like what's happening is
>
> (1) crashing process corrupts the DSM control segment somehow.

I wonder how.  Apparently mapped size was tiny (least likely
explanation), control->magic was wrong, or control->maxitems and
control->nitems were inconsistent with each other or the mapped size.
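(For reference, the sanity test in question boils down to roughly the
following; this is a paraphrase of dsm_control_segment_sane(), not a quote
of it:)

static bool
control_segment_looks_sane(dsm_control_header *control, Size mapped_size)
{
	if (mapped_size < offsetof(dsm_control_header, item))
		return false;	/* mapped size too small to hold the header */
	if (control->magic != PG_DYNSHMEM_CONTROL_MAGIC)
		return false;	/* wrong magic number */
	if (dsm_control_bytes_needed(control->maxitems) > mapped_size)
		return false;	/* maxitems inconsistent with mapped size */
	if (control->nitems > control->maxitems)
		return false;	/* nitems inconsistent with maxitems */
	return true;
}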

> (2) dsm_postmaster_shutdown notices that, bleats to the log, and
> figures its job is done.

Right, that seems to be the main problem.

> (3) dsm_postmaster_startup crashes on Assert because
> dsm_control_mapped_size isn't 0, because the old seg is still mapped.

Right.

> I would argue that both dsm_postmaster_shutdown and dsm_postmaster_startup
> are broken here; the former because it makes no attempt to unmap
> the old control segment (which it oughta be able to do no matter how badly
> broken the contents are), and the latter because it should not let
> garbage old state prevent it from establishing a valid new segment.

Looking.

> BTW, the header comment on dsm_postmaster_startup is a lie, which
> is probably not unrelated to its failure to consider this situation.

Agreed.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: DSM robustness failure (was Re: Peripatus/failures)

From: Thomas Munro
Date:
On Thu, Oct 18, 2018 at 9:43 AM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Thu, Oct 18, 2018 at 9:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I would argue that both dsm_postmaster_shutdown and dsm_postmaster_startup
> > are broken here; the former because it makes no attempt to unmap
> > the old control segment (which it oughta be able to do no matter how badly
> > broken the contents are), and the latter because it should not let
> > garbage old state prevent it from establishing a valid new segment.
>
> Looking.

(CCing Amit Kapila)

To reproduce this, I attached lldb to a backend and did "mem write
&dsm_control->magic 42", and then delivered SIGKILL to the backend.
Here's one way to fix it.  I think we have no choice but to leak the
referenced segments, but we can free the control segment.  See
comments in the attached patch for rationale.

-- 
Thomas Munro
http://www.enterprisedb.com

Attachments

Re: DSM robustness failure (was Re: Peripatus/failures)

From: Larry Rosenman
Date:
On Thu, Oct 18, 2018 at 11:08:33AM +1300, Thomas Munro wrote:
> On Thu, Oct 18, 2018 at 9:43 AM Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > On Thu, Oct 18, 2018 at 9:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > I would argue that both dsm_postmaster_shutdown and dsm_postmaster_startup
> > > are broken here; the former because it makes no attempt to unmap
> > > the old control segment (which it oughta be able to do no matter how badly
> > > broken the contents are), and the latter because it should not let
> > > garbage old state prevent it from establishing a valid new segment.
> >
> > Looking.
>
> (CCing Amit Kapila)
>
> To reproduce this, I attached lldb to a backend and did "mem write
> &dsm_control->magic 42", and then delivered SIGKILL to the backend.
> Here's one way to fix it.  I think we have no choice but to leak the
> referenced segments, but we can free the control segment.  See
> comments in the attached patch for rationale.
>
On the original failure, I recompiled and reinstalled the 2 Pythons I
have on this box, and at least 9.3 went back to OK.


> --
> Thomas Munro
> http://www.enterprisedb.com



--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

Attachments

Re: DSM robustness failure (was Re: Peripatus/failures)

From: Tom Lane
Date:
Larry Rosenman <ler@lerctr.org> writes:
> On the original failure, I recompiled and reinstalled the 2 Pythons I
> have on this box, and at least 9.3 went back to OK.

Hmm.  I'd just finished pulling down FreeBSD-12.0-ALPHA9 and failing
to reproduce any problem with that ... and then I noticed your box
said it was on ALPHA10, which I'd missed seeing in the clutter of
FreeBSD's download server.  I'm about halfway through switching to
that, but I bet it will work too.  Was your Python install built
with any special switches?  I just used what came from "pkg install".

            regards, tom lane


Re: DSM robustness failure (was Re: Peripatus/failures)

From: Larry Rosenman
Date:
On Wed, Oct 17, 2018 at 07:07:09PM -0400, Tom Lane wrote:
> Larry Rosenman <ler@lerctr.org> writes:
> > On the original failure, I recompiled and reinstalled the 2 Pythons I
> > have on this box, and at least 9.3 went back to OK.
>
> Hmm.  I'd just finished pulling down FreeBSD-12.0-ALPHA9 and failing
> to reproduce any problem with that ... and then I noticed your box
> said it was on ALPHA10, which I'd missed seeing in the clutter of
> FreeBSD's download server.  I'm about halfway through switching to
> that, but I bet it will work too.  Was your Python install built
> with any special switches?  I just used what came from "pkg install".

It had been built on a previous FreeBSD build, I have my own poudriere
infrastructure.  I can probably get you the package from my ZFS snaps if
you'd like.


>
>             regards, tom lane

--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

Attachments

Re: DSM robustness failure (was Re: Peripatus/failures)

From: Tom Lane
Date:
Larry Rosenman <ler@lerctr.org> writes:
> On Wed, Oct 17, 2018 at 07:07:09PM -0400, Tom Lane wrote:
>> ... Was your Python install built
>> with any special switches?  I just used what came from "pkg install".

> It had been built on a previous FreeBSD build, I have my own poudriere
> infrastructure.  I can probably get you the package from my ZFS snaps if
> you'd like. 

I've now verified that the bog-standard packages for python 2.7 and python
3.6 both build and pass regression cleanly, using ALPHA10.  And I see
peripatus is back to green too.  So as far as the plpython end of this is
concerned, I think we can write it off as "something wrong with Larry's
custom package build".  However, I'm still slightly interested in how it
was that that broke DSM so thoroughly ... I pulled down your version of
python2.7 and will see if that reproduces it.

            regards, tom lane


Re: DSM robustness failure (was Re: Peripatus/failures)

From: Larry Rosenman
Date:
On Wed, Oct 17, 2018 at 08:10:28PM -0400, Tom Lane wrote:
> Larry Rosenman <ler@lerctr.org> writes:
> > On Wed, Oct 17, 2018 at 07:07:09PM -0400, Tom Lane wrote:
> >> ... Was your Python install built
> >> with any special switches?  I just used what came from "pkg install".
>
> > It had been built on a previous FreeBSD build, I have my own poudriere
> > infrastructure.  I can probably get you the package from my ZFS snaps if
> > you'd like.
>
> I've now verified that the bog-standard packages for python 2.7 and python
> 3.6 both build and pass regression cleanly, using ALPHA10.  And I see
> peripatus is back to green too.  So as far as the plpython end of this is
> concerned, I think we can write it off as "something wrong with Larry's
> custom package build".  However, I'm still slightly interested in how it
> was that that broke DSM so thoroughly ... I pulled down your version of
> python2.7 and will see if that reproduces it.
>

It was built on a previous alpha, so who knows what the differing
compiler/libs/kernel/etc. did.  The options used did *NOT* change, just
the userland used to compile it.  (I.e., that package, running on
ALPHA10, is what broke.)

That's one of the nice things about poudriere: it's a clean-room type
environment that uses canned options as set by the user, and I haven't
changed the Python options in a LONG (i.e., months/years) time.


--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

Attachments

Re: DSM robustness failure (was Re: Peripatus/failures)

From: Tom Lane
Date:
Larry Rosenman <ler@lerctr.org> writes:
> On Wed, Oct 17, 2018 at 08:10:28PM -0400, Tom Lane wrote:
>> However, I'm still slightly interested in how it
>> was that that broke DSM so thoroughly ... I pulled down your version of
>> python2.7 and will see if that reproduces it.

> It was built on a previous alpha, so who knows what the differing
> compiler/libs/kernel/etc. did.  The options used did *NOT* change, just
> the userland used to compile it.  (I.e., that package, running on
> ALPHA10, is what broke.)

Hm.  I forcibly installed your package over the regular one using
pkg add -f python27-2.7.15.txz
and rebuilt PG, and what I get is a failure in the plpython regression
test, but no crash.  So I'm still confused.

*** /usr/home/tgl/pgsql/src/pl/plpython/expected/plpython_import.out    Wed Oct 17 19:48:19 2018
--- /usr/home/tgl/pgsql/src/pl/plpython/results/plpython_import.out     Wed Oct 17 20:50:22 2018
***************
*** 64,79 ****
  -- test import and simple argument handling
  --
  SELECT import_test_one('sha hash of this string');
!              import_test_one
! ------------------------------------------
!  a04e23cb9b1a09cd1051a04a7c571aae0f90346c
! (1 row)
!
  -- test import and tuple argument handling
  --
  select import_test_two(users) from users where fname = 'willem';
!                           import_test_two
! -------------------------------------------------------------------
!  sha hash of willemdoe is 3cde6b574953b0ca937b4d76ebc40d534d910759
! (1 row)
!
--- 64,79 ----
  -- test import and simple argument handling
  --
  SELECT import_test_one('sha hash of this string');
! ERROR:  AttributeError: 'module' object has no attribute 'sha1'
! CONTEXT:  Traceback (most recent call last):
!   PL/Python function "import_test_one", line 3, in <module>
!     digest = hashlib.sha1(p.encode("ascii"))
! PL/Python function "import_test_one"
  -- test import and tuple argument handling
  --
  select import_test_two(users) from users where fname = 'willem';
! ERROR:  AttributeError: 'module' object has no attribute 'sha1'
! CONTEXT:  Traceback (most recent call last):
!   PL/Python function "import_test_two", line 4, in <module>
!     digest = hashlib.sha1(plain.encode("ascii"))
! PL/Python function "import_test_two"

======================================================================


            regards, tom lane


Re: DSM robustness failure (was Re: Peripatus/failures)

From: Thomas Munro
Date:
On Thu, Oct 18, 2018 at 11:08 AM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Thu, Oct 18, 2018 at 9:43 AM Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > On Thu, Oct 18, 2018 at 9:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > I would argue that both dsm_postmaster_shutdown and dsm_postmaster_startup
> > > are broken here; the former because it makes no attempt to unmap
> > > the old control segment (which it oughta be able to do no matter how badly
> > > broken the contents are), and the latter because it should not let
> > > garbage old state prevent it from establishing a valid new segment.
> >
> > Looking.
>
> (CCing Amit Kapila)
>
> To reproduce this, I attached lldb to a backend and did "mem write
> &dsm_control->magic 42", and then delivered SIGKILL to the backend.
> Here's one way to fix it.  I think we have no choice but to leak the
> referenced segments, but we can free the control segment.  See
> comments in the attached patch for rationale.

I realised that the nearly identical code in dsm_postmaster_shutdown()
might as well destroy a corrupted control segment too.  New version
attached.

-- 
Thomas Munro
http://www.enterprisedb.com

Attachments

Re: DSM robustness failure (was Re: Peripatus/failures)

From: Larry Rosenman
Date:
On Wed, Oct 17, 2018 at 08:55:09PM -0400, Tom Lane wrote:
> Larry Rosenman <ler@lerctr.org> writes:
> > On Wed, Oct 17, 2018 at 08:10:28PM -0400, Tom Lane wrote:
> >> However, I'm still slightly interested in how it
> >> was that that broke DSM so thoroughly ... I pulled down your version of
> >> python2.7 and will see if that reproduces it.
>
> > It was built on a previous alpha, so who knows what the differing
> > compiler/libs/kernel/etc. did.  The options used did *NOT* change, just
> > the userland used to compile it.  (I.e., that package, running on
> > ALPHA10, is what broke.)
>
> Hm.  I forcibly installed your package over the regular one using
> pkg add -f python27-2.7.15.txz
> and rebuilt PG, and what I get is a failure in the plpython regression
> test, but no crash.  So I'm still confused.

Hell if I know.  I've cleaned up my environment (old, broken, etc.
stuff), but who knows what's broken what where.

Not sure how to get back to the "broken" state :(


--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

Attachments

Re: DSM robustness failure (was Re: Peripatus/failures)

From: Thomas Munro
Date:
On Thu, Oct 18, 2018 at 1:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ... However, I'm still slightly interested in how it
> was that that broke DSM so thoroughly ...

Me too.  Frustratingly, that vm object might still exist on Larry's
machine if it hasn't been rebooted (since we failed to shm_unlink()
it), so if we knew its name we could write a program to shm_open(),
mmap(), dump out to a file for analysis and then we could work out
which of the sanity tests it failed and maybe get some clues.
Unfortunately it's not in any of our logs AFAIK, and I can't see any
way to get a list of existing shm_open() objects from the kernel.
From sys/kern/uipc_shm.c:

 * TODO:
 *
 * (1) Need to export data to a userland tool via a sysctl.  Should ipcs(1)
 *     and ipcrm(1) be expanded or should new tools to manage both POSIX
 *     kernel semaphores and POSIX shared memory be written?

Gah.  So basically that's hiding in shm_dictionary in the kernel and I
don't know a way to look at it from userspace (other than trying to
open all 2^32 random paths we're capable of generating).
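The dump program itself would only be a few lines, if we ever did learn
the name.  Something like this sketch (the default segment name here is
just a made-up placeholder):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>

int
main(int argc, char **argv)
{
	/* PostgreSQL's POSIX shm names look like /PostgreSQL.<uint32>. */
	const char *name = argc > 1 ? argv[1] : "/PostgreSQL.12345678";
	int			fd = shm_open(name, O_RDONLY, 0);
	struct stat st;
	char	   *p;

	if (fd < 0 || fstat(fd, &st) != 0)
	{
		perror(name);
		return 1;
	}
	p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
	{
		perror("mmap");
		return 1;
	}
	/* Dump the raw contents to stdout; redirect to a file for analysis. */
	fwrite(p, 1, st.st_size, stdout);
	return 0;
}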

-- 
Thomas Munro
http://www.enterprisedb.com


Re: DSM robustness failure (was Re: Peripatus/failures)

From: Larry Rosenman
Date:
On Thu, Oct 18, 2018 at 02:17:14PM +1300, Thomas Munro wrote:
> On Thu, Oct 18, 2018 at 1:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > ... However, I'm still slightly interested in how it
> > was that that broke DSM so thoroughly ...
>
> Me too.  Frustratingly, that vm object might still exist on Larry's
> machine if it hasn't been rebooted (since we failed to shm_unlink()
> it), so if we knew its name we could write a program to shm_open(),
> mmap(), dump out to a file for analysis and then we could work out
> which of the sanity tests it failed and maybe get some clues.
> Unfortunately it's not in any of our logs AFAIK, and I can't see any
> way to get a list of existing shm_open() objects from the kernel.
> From sys/kern/uipc_shm.c:
>
>  * TODO:
>  *
>  * (1) Need to export data to a userland tool via a sysctl.  Should ipcs(1)
>  *     and ipcrm(1) be expanded or should new tools to manage both POSIX
>  *     kernel semaphores and POSIX shared memory be written?
>
> Gah.  So basically that's hiding in shm_dictionary in the kernel and I
> don't know a way to look at it from userspace (other than trying to
> open all 2^32 random paths we're capable of generating).

It has *NOT* been rebooted.  I can give y'all id's if you want to go
poking around.


>
> --
> Thomas Munro
> http://www.enterprisedb.com

--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

Attachments

Re: DSM robustness failure (was Re: Peripatus/failures)

From: Tom Lane
Date:
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> On Thu, Oct 18, 2018 at 1:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... However, I'm still slightly interested in how it
>> was that that broke DSM so thoroughly ...

> Me too.  Frustratingly, that vm object might still exist on Larry's
> machine if it hasn't been rebooted (since we failed to shm_unlink()
> it), so if we knew its name we could write a program to shm_open(),
> mmap(), dump out to a file for analysis and then we could work out
> which of the sanity tests it failed and maybe get some clues.

Larry's REL_10_STABLE failure logs are interesting:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2018-10-17%2020%3A42%3A17

2018-10-17 15:48:08.849 CDT [55240:7] LOG:  dynamic shared memory control segment is corrupt
2018-10-17 15:48:08.849 CDT [55240:8] LOG:  sem_destroy failed: Invalid argument
2018-10-17 15:48:08.850 CDT [55240:9] LOG:  sem_destroy failed: Invalid argument
2018-10-17 15:48:08.850 CDT [55240:10] LOG:  sem_destroy failed: Invalid argument
2018-10-17 15:48:08.850 CDT [55240:11] LOG:  sem_destroy failed: Invalid argument
... lots more ...
2018-10-17 15:48:08.862 CDT [55240:122] LOG:  sem_destroy failed: Invalid argument
2018-10-17 15:48:08.862 CDT [55240:123] LOG:  sem_destroy failed: Invalid argument
TRAP: FailedAssertion("!(dsm_control_mapped_size == 0)", File: "dsm.c", Line: 182)

So at least in this case, not only did we lose the DSM segment but also
all of our semaphores.  Is it conceivable that Python somehow destroyed
those objects, rather than stomping on the contents of the DSM segment?
If not, how do we explain this log?

Also, why is there branch-specific variation?  The fact that v11 and HEAD
aren't whinging about lost semaphores is not hard to understand --- we
stopped using SysV semas.  But why don't the older branches look like v10
here?

            regards, tom lane


Re: DSM robustness failure (was Re: Peripatus/failures)

From: Thomas Munro
Date:
On Thu, Oct 18, 2018 at 2:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Larry's REL_10_STABLE failure logs are interesting:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2018-10-17%2020%3A42%3A17
>
> 2018-10-17 15:48:08.849 CDT [55240:7] LOG:  dynamic shared memory control segment is corrupt
> 2018-10-17 15:48:08.849 CDT [55240:8] LOG:  sem_destroy failed: Invalid argument
> 2018-10-17 15:48:08.850 CDT [55240:9] LOG:  sem_destroy failed: Invalid argument
> 2018-10-17 15:48:08.850 CDT [55240:10] LOG:  sem_destroy failed: Invalid argument
> 2018-10-17 15:48:08.850 CDT [55240:11] LOG:  sem_destroy failed: Invalid argument
> ... lots more ...
> 2018-10-17 15:48:08.862 CDT [55240:122] LOG:  sem_destroy failed: Invalid argument
> 2018-10-17 15:48:08.862 CDT [55240:123] LOG:  sem_destroy failed: Invalid argument
> TRAP: FailedAssertion("!(dsm_control_mapped_size == 0)", File: "dsm.c", Line: 182)
>
> So at least in this case, not only did we lose the DSM segment but also
> all of our semaphores.  Is it conceivable that Python somehow destroyed
> those objects, rather than stomping on the contents of the DSM segment?
> If not, how do we explain this log?

One idea:  In the backend I'm looking at there is a contiguous run of
read/write mappings from the location of the semaphore array
through to the DSM control segment.  That means that a single runaway
loop/memcpy/memset etc could overwrite both of those.  Eventually it
would run off the end of contiguously mapped space and SEGV, and we do
indeed see a segfault from that Python code before the trouble begins.

> Also, why is there branch-specific variation?  The fact that v11 and HEAD
> aren't whinging about lost semaphores is not hard to understand --- we
> stopped using SysV semas.  But why don't the older branches look like v10
> here?

I think v10 is where we switched to POSIX unnamed (= sem_destroy()),
so it's 10, 11 and master that should be the same in this respect, no?

-- 
Thomas Munro
http://www.enterprisedb.com


Re: DSM robustness failure (was Re: Peripatus/failures)

From: Amit Kapila
Date:
On Thu, Oct 18, 2018 at 6:30 AM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
>
> On Thu, Oct 18, 2018 at 11:08 AM Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > On Thu, Oct 18, 2018 at 9:43 AM Thomas Munro
> > <thomas.munro@enterprisedb.com> wrote:
> > > On Thu, Oct 18, 2018 at 9:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > I would argue that both dsm_postmaster_shutdown and dsm_postmaster_startup
> > > > are broken here; the former because it makes no attempt to unmap
> > > > the old control segment (which it oughta be able to do no matter how badly
> > > > broken the contents are), and the latter because it should not let
> > > > garbage old state prevent it from establishing a valid new segment.
> > >
> > > Looking.
> >
> > (CCing Amit Kapila)
> >
> > To reproduce this, I attached lldb to a backend and did "mem write
> > &dsm_control->magic 42", and then delivered SIGKILL to the backend.
> > Here's one way to fix it.  I think we have no choice but to leak the
> > referenced segments, but we can free the control segment.  See
> > comments in the attached patch for rationale.
>
> I realised that the nearly identical code in dsm_postmaster_shutdown()
> might as well destroy a corrupted control segment too.  New version
> attached.
>

The code below seems to be problematic:

dsm_cleanup_using_control_segment()
{
..
	if (!dsm_control_segment_sane(old_control, mapped_size))
	{
		dsm_impl_op(DSM_OP_DETACH, old_control_handle, 0, &impl_private,
					&mapped_address, &mapped_size, LOG);
..
	}

Here, don't we need to use dsm_control_* variables instead of local
variable mapped_* variables?  Do we need to reset the dsm_control_*
variables in dsm_cleanup_for_mmap?

@@ -336,13 +333,14 @@ dsm_postmaster_shutdown(int code, Datum arg)
  * stray shared memory segments, but there's not much we can do about that
  * if the metadata is gone.
  */
- nitems = dsm_control->nitems;
  if (!dsm_control_segment_sane(dsm_control, dsm_control_mapped_size))
  {
  ereport(LOG,
  (errmsg("dynamic shared memory control segment is corrupt")));
- return;
+ nitems = 0;

The comment above this code says:
/*
* If some other backend exited uncleanly, it might have corrupted the
* control segment while it was dying.  In that case, we warn and ignore
* the contents of the control segment.  This may end up leaving behind
* stray shared memory segments, but there's not much we can do about that
* if the metadata is gone.
*/

Don't we need to adjust this comment, if we want to go with what you
have proposed?  BTW, it doesn't appear insane to me that if the
control segment got corrupted, then we leave it as it is rather than
trying to destroy it.  I am not sure, but if it is corrupted, then is
it guaranteed that destroy will always succeed?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: DSM robustness failure (was Re: Peripatus/failures)

From: Larry Rosenman
Date:
On Wed, Oct 17, 2018 at 08:19:52PM -0500, Larry Rosenman wrote:
> On Thu, Oct 18, 2018 at 02:17:14PM +1300, Thomas Munro wrote:
> > On Thu, Oct 18, 2018 at 1:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > ... However, I'm still slightly interested in how it
> > > was that that broke DSM so thoroughly ...
> >
> > Me too.  Frustratingly, that vm object might still exist on Larry's
> > machine if it hasn't been rebooted (since we failed to shm_unlink()
> > it), so if we knew its name we could write a program to shm_open(),
> > mmap(), dump out to a file for analysis and then we could work out
> > which of the sanity tests it failed and maybe get some clues.
> > Unfortunately it's not in any of our logs AFAIK, and I can't see any
> > way to get a list of existing shm_open() objects from the kernel.
> > From sys/kern/uipc_shm.c:
> >
> >  * TODO:
> >  *
> >  * (1) Need to export data to a userland tool via a sysctl.  Should ipcs(1)
> >  *     and ipcrm(1) be expanded or should new tools to manage both POSIX
> >  *     kernel semaphores and POSIX shared memory be written?
> >
> > Gah.  So basically that's hiding in shm_dictionary in the kernel and I
> > don't know a way to look at it from userspace (other than trying to
> > open all 2^32 random paths we're capable of generating).
>
> It has *NOT* been rebooted.  I can give y'all id's if you want to go
> poking around.
Let me know soon(ish) if any of you want to poke at this machine, as I'm
likely to forget and reboot it.....


>
>
> >
> > --
> > Thomas Munro
> > http://www.enterprisedb.com
>
> --
> Larry Rosenman                     http://www.lerctr.org/~ler
> Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
> US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106



--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 214-642-9640                 E-Mail: ler@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106

Attachments

Re: DSM robustness failure (was Re: Peripatus/failures)

From: Thomas Munro
Date:
On Thu, Oct 18, 2018 at 6:02 PM Larry Rosenman <ler@lerctr.org> wrote:
> Let me know soon(ish) if any of you want to poke at this machine, as I'm
> likely to forget and reboot it.....

Hi Larry,

Thanks for the offer but it looks like there is no way to get our
hands on that memory for forensics now.  I'll see if there is some way
to improve that situation for next time something like this happens.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: DSM robustness failure (was Re: Peripatus/failures)

From: Thomas Munro
Date:
On Thu, Oct 18, 2018 at 5:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> The code below seems to be problematic:
> dsm_cleanup_using_control_segment()
> {
> ..
> if (!dsm_control_segment_sane(old_control, mapped_size))
> {
> dsm_impl_op(DSM_OP_DETACH, old_control_handle, 0, &impl_private,
> &mapped_address, &mapped_size, LOG);
> ..
> }
>
> Here, don't we need to use dsm_control_* variables instead of local
> variable mapped_* variables?

I was a little fuzzy on when exactly
dsm_cleanup_using_control_segment() and dsm_postmaster_shutdown() run,
but after some more testing I think I have this straight now.  You can
test by setting dsm_control->magic to 42 in a debugger and trying
three cases:

1.  Happy shutdown: dsm_postmaster_shutdown() complains on shutdown.
2.  kill -9 a non-postmaster process: dsm_postmaster_shutdown()
complains during auto-restart.
3.  kill -9 the postmaster, manually start up again:
dsm_cleanup_using_control_segment() runs.  It ignores the old segment
quietly if it doesn't pass the sanity test.

So to answer your question: no, dsm_cleanup_using_control_segment() is
case 3.  This entirely new postmaster process has never had the
segment mapped in, so the dsm_control_* variables are not relevant
here.

Hmm.... but if you're running N other independent clusters on the same
machine that started up after this cluster crashed in case 3, I think
there is an N-in-four-billion chance that the segment with that ID now
belongs to another cluster and happens to be its DSM control segment,
and therefore passes the magic-number sanity test, and then we'll nuke
it and all the segments it references.  Am I missing something?  Could
we fix that by putting some kind of cluster ID in the control segment?
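Something along these lines, perhaps (purely a sketch, not the actual
dsm.c layout; the identifier field is hypothetical):

#include <stdint.h>

typedef struct dsm_control_header_sketch
{
	uint32_t	magic;			/* fields discussed above ... */
	uint32_t	nitems;
	uint32_t	maxitems;
	uint64_t	cluster_id;		/* ... plus a hypothetical per-cluster
								 * identifier (e.g. derived from pg_control's
								 * system identifier), checked before we dare
								 * to destroy an old segment */
} dsm_control_header_sketch;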

So on second thoughts, I think I should remove the
dsm_cleanup_using_control_segment() change from the patch, because it
widens the risk of case 3 nuking someone else's stuff to include even
segments that don't pass the sanity test.  That's not good.  Hunk
removed.

> ... Do we need to reset the dsm_control_*
> variables in dsm_cleanup_for_mmap?

I don't think so -- that unlinks the files in file-backed DSM, but we
also call the other code in that case.

> @@ -336,13 +333,14 @@ dsm_postmaster_shutdown(int code, Datum arg)
>   * stray shared memory segments, but there's not much we can do about that
>   * if the metadata is gone.
>   */
> - nitems = dsm_control->nitems;
>   if (!dsm_control_segment_sane(dsm_control, dsm_control_mapped_size))
>   {
>   ereport(LOG,
>   (errmsg("dynamic shared memory control segment is corrupt")));
> - return;
> + nitems = 0;
>
> The comment above this code says:
> /*
> * If some other backend exited uncleanly, it might have corrupted the
> * control segment while it was dying.  In that case, we warn and ignore
> * the contents of the control segment.  This may end up leaving behind
> * stray shared memory segments, but there's not much we can do about that
> * if the metadata is gone.
> */
>
> Don't we need to adjust this comment, if we want to go with what you
> have proposed?

The comment seems correct: we ignore the *contents* of the control
segment.  But to be clearer I'll add a sentence to say that we still
attempt to destroy the control segment itself.

>  BTW, it doesn't appear insane to me that if the
> control segment got corrupted, then we leave it as it is rather than
> trying to destroy it.  I am not sure, but if it is corrupted, then is
> it guaranteed that destroy will always succeed?

You're right that it might fail.  If the shm_unlink() fails it doesn't
matter, we just produce LOG, but if the unmap() fails we still
consider the segment to be mapped (because it is!), so ... hmm, we'd
better forget about it if we plan to continue running, by clearing
those variables explicitly.  That's actually another pre-existing way
for the assertion to fail in current master.  (Realistically, that
unmap() could only fail if our global variables are trashed so we try
to unmap() a junk address; AFAIK our model here is that the postmaster
is sane, even if everything else is insane, so that's sort of outside
the model.  But this new patch addresses it anyway.)
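Concretely, the idea is something like this (a sketch only, with
illustrative variable names, not the patch as posted):

/* Even if destroying the corrupt control segment fails, forget our
 * reference to it so later startup code doesn't trip over leftover state;
 * the memory is leaked, but the postmaster's bookkeeping stays sane. */
if (!dsm_impl_op(DSM_OP_DESTROY, dsm_control_handle, 0,
				 &dsm_control_impl_private, &dsm_control_address,
				 &dsm_control_mapped_size, LOG))
{
	dsm_control_address = NULL;
	dsm_control_mapped_size = 0;
}
dsm_control = NULL;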

Thanks for the review!

-- 
Thomas Munro
http://www.enterprisedb.com

Attachments

Re: DSM robustness failure (was Re: Peripatus/failures)

From: Amit Kapila
Date:
On Thu, Oct 18, 2018 at 2:33 PM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
>
> On Thu, Oct 18, 2018 at 5:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > The code below seems to be problematic:
> > dsm_cleanup_using_control_segment()
> > {
> > ..
> > if (!dsm_control_segment_sane(old_control, mapped_size))
> > {
> > dsm_impl_op(DSM_OP_DETACH, old_control_handle, 0, &impl_private,
> > &mapped_address, &mapped_size, LOG);
> > ..
> > }
> >
> > Here, don't we need to use dsm_control_* variables instead of local
> > variable mapped_* variables?
>
> I was a little fuzzy on when exactly
> dsm_cleanup_using_control_segment() and dsm_postmaster_shutdown() run,
> but after some more testing I think I have this straight now.  You can
> test by setting dsm_control->magic to 42 in a debugger and trying
> three cases:
>
> 1.  Happy shutdown: dsm_postmaster_shutdown() complains on shutdown.
> 2.  kill -9 a non-postmaster process: dsm_postmaster_shutdown()
> complains during auto-restart.
> 3.  kill -9 the postmaster, manually start up again:
> dsm_cleanup_using_control_segment() runs.  It ignores the old segment
> quietly if it doesn't pass the sanity test.
>
> So to answer your question: no, dsm_cleanup_using_control_segment() is
> case 3.  This entirely new postmaster process has never had the
> segment mapped in, so the dsm_control_* variables are not relevant
> here.
>
> Hmm.... but if you're running N other independent clusters on the same
> machine that started up after this cluster crashed in case 3, I think
> there is an N-in-four-billion chance that the segment with that ID now
> belongs to another cluster and happens to be its DSM control segment,
> and therefore passes the magic-number sanity test, and then we'll nuke
> it and all the segments it references.  Am I missing something?
>

Unless the previous cluster (which crashed) has removed the segment,
how will the new cluster succeed in getting the same segment?  Won't it
get EEXIST and retry to get the segment with another ID?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com