Обсуждение: LOG: munmap(0x7fff80000000) failed: Invalid argument

Поиск
Список
Период
Сортировка

LOG: munmap(0x7fff80000000) failed: Invalid argument

От
Chris Richards
Дата:
Howdy.

I have an Ubuntu 14.04 LTS, configured with huge pages (4x1GB, disabled transparent) and a later kernel than what ships with 14.04.

root@ff2:~# uname -a
Linux ff2 3.16.7-ckt11-061515+ #1 SMP Mon Jun 15 18:47:13 CDT 2015 x86_64 x86_64 x86_64 GNU/Linux

It had postgresql-9.3 on it and I installed postgresql-9.5 via these steps:

aptitude remove -y postgresql-9.3 
echo "deb http://apt.postgresql.org/pub/repos/apt/ trusty-pgdg main 9.5" > /etc/apt/sources.list.d/postgresql.list
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 7FCC7D46ACCC4CF8
apt-get update
apt-get install -y postgresql-client-9.5 postgresql-common postgresql-9.5

At the end of the install, these were emitted to the screen:

Setting up postgresql-9.5 (9.5.4-1.pgdg14.04+2) ...
Creating new cluster 9.5/main ...
  config /etc/postgresql/9.5/main
  data   /var/lib/postgresql/9.5/main
  locale en_US.UTF-8
LOG:  munmap(0x7fff80000000) failed: Invalid argument
[... snip 14 or so repeats ...]
LOG:  munmap(0x7fff80000000) failed: Invalid argument
  socket /var/run/postgresql
  port   5433
update-alternatives: using /usr/share/postgresql/9.5/man/man1/postmaster.1.gz to provide /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode
 * Starting PostgreSQL 9.5 database server                                       [ OK ]
Processing triggers for libc-bin (2.19-0ubuntu6.6) ...

I'm able to connect and I dumped a few default relations.

Is the munmap error of concern? It remains upon rebooting / restarting the server. 

Thanks,
Chris

Re: LOG: munmap(0x7fff80000000) failed: Invalid argument

От
Tom Lane
Дата:
Chris Richards <chris@infinite.io> writes:
> Setting up postgresql-9.5 (9.5.4-1.pgdg14.04+2) ...
> Creating new cluster 9.5/main ...
>   config /etc/postgresql/9.5/main
>   data   /var/lib/postgresql/9.5/main
>   locale en_US.UTF-8
> LOG:  munmap(0x7fff80000000) failed: Invalid argument
> [... snip 14 or so repeats ...]
> LOG:  munmap(0x7fff80000000) failed: Invalid argument
>   socket /var/run/postgresql
>   port   5433
> update-alternatives: using
> /usr/share/postgresql/9.5/man/man1/postmaster.1.gz to provide
> /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode
>  * Starting PostgreSQL 9.5 database server
>       [ OK ]
> Processing triggers for libc-bin (2.19-0ubuntu6.6) ...

> I'm able to connect and I dumped a few default relations.

> Is the munmap error of concern? It remains upon rebooting / restarting the
> server.

Seems pretty fishy to me; I don't know what would be causing it.

[ digs in code... ]  One theory is that PGSharedMemoryDetach is getting
called more than once, but I'm not sure how that would happen.  Can you
characterize where this happens more precisely?  What nondefault settings
have you got in postgresql.conf?

            regards, tom lane


Re: LOG: munmap(0x7fff80000000) failed: Invalid argument

От
Andres Freund
Дата:
On 2016-10-10 18:21:48 -0400, Tom Lane wrote:
> Chris Richards <chris@infinite.io> writes:
> > Setting up postgresql-9.5 (9.5.4-1.pgdg14.04+2) ...
> > Creating new cluster 9.5/main ...
> >   config /etc/postgresql/9.5/main
> >   data   /var/lib/postgresql/9.5/main
> >   locale en_US.UTF-8
> > LOG:  munmap(0x7fff80000000) failed: Invalid argument
> > [... snip 14 or so repeats ...]
> > LOG:  munmap(0x7fff80000000) failed: Invalid argument
> >   socket /var/run/postgresql
> >   port   5433
> > update-alternatives: using
> > /usr/share/postgresql/9.5/man/man1/postmaster.1.gz to provide
> > /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode
> >  * Starting PostgreSQL 9.5 database server
> >       [ OK ]
> > Processing triggers for libc-bin (2.19-0ubuntu6.6) ...
>
> > I'm able to connect and I dumped a few default relations.
>
> > Is the munmap error of concern? It remains upon rebooting / restarting the
> > server.
>
> Seems pretty fishy to me; I don't know what would be causing it.
>
> [ digs in code... ]  One theory is that PGSharedMemoryDetach is getting
> called more than once, but I'm not sure how that would happen.  Can you
> characterize where this happens more precisely?  What nondefault settings
> have you got in postgresql.conf?

Hm. Could that be from the DSM code?

- Andres


Re: LOG: munmap(0x7fff80000000) failed: Invalid argument

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2016-10-10 18:21:48 -0400, Tom Lane wrote:
>> Chris Richards <chris@infinite.io> writes:
>>> LOG:  munmap(0x7fff80000000) failed: Invalid argument

>> [ digs in code... ]  One theory is that PGSharedMemoryDetach is getting
>> called more than once, but I'm not sure how that would happen.  Can you
>> characterize where this happens more precisely?  What nondefault settings
>> have you got in postgresql.conf?

> Hm. Could that be from the DSM code?

That particular error message spelling only appears in sysv_shmem.c,
so it's not directly DSM's fault.

The comments around PGSharedMemoryDetach strongly suggest that it ought
to be a no-op if called twice, which it originally was but fails to be
since the addition of the munmap call.   So I'm *very* strongly tempted
to add "AnonymousShmem = NULL;" there and in IpcMemoryDetach.  But
it's not evident why we've not seen this behavior many times already,
so I'd kind of like to find out what's different about Chris's use-case
before assuming that that will fix it.

            regards, tom lane


Re: LOG: munmap(0x7fff80000000) failed: Invalid argument

От
Chris Richards
Дата:
What more would you like--I'm happy to oblige? The upgrade steps I listed installed into a new directory, new conf file. I didn't even migrate my (skeletal) 9.3 database. Here's the 9.5 command-line. I've attached the referenced config file.

$ chrisr@ff1:~$ ps ax | grep bin/post
109999 ?        S      0:00 /usr/lib/postgresql/9.5/bin/postgres -D /var/lib/postgresql/9.5/main -c config_file=/etc/postgresql/9.5/main/postgresql.conf

Cheers,
Chris


On Mon, Oct 10, 2016 at 5:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Chris Richards <chris@infinite.io> writes:
> Setting up postgresql-9.5 (9.5.4-1.pgdg14.04+2) ...
> Creating new cluster 9.5/main ...
>   config /etc/postgresql/9.5/main
>   data   /var/lib/postgresql/9.5/main
>   locale en_US.UTF-8
> LOG:  munmap(0x7fff80000000) failed: Invalid argument
> [... snip 14 or so repeats ...]
> LOG:  munmap(0x7fff80000000) failed: Invalid argument
>   socket /var/run/postgresql
>   port   5433
> update-alternatives: using
> /usr/share/postgresql/9.5/man/man1/postmaster.1.gz to provide
> /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode
>  * Starting PostgreSQL 9.5 database server
>       [ OK ]
> Processing triggers for libc-bin (2.19-0ubuntu6.6) ...

> I'm able to connect and I dumped a few default relations.

> Is the munmap error of concern? It remains upon rebooting / restarting the
> server.

Seems pretty fishy to me; I don't know what would be causing it.

[ digs in code... ]  One theory is that PGSharedMemoryDetach is getting
called more than once, but I'm not sure how that would happen.  Can you
characterize where this happens more precisely?  What nondefault settings
have you got in postgresql.conf?

                        regards, tom lane

Вложения

Re: LOG: munmap(0x7fff80000000) failed: Invalid argument

От
Chris Richards
Дата:
One extra bit, if this helps:

chrisr@ff1:~$ cat /proc/meminfo
MemTotal:        8163104 kB
MemFree:         2183692 kB
MemAvailable:    3648680 kB
Buffers:          170080 kB
Cached:          1231708 kB
SwapCached:            0 kB
Active:          1083596 kB
Inactive:         442312 kB
Active(anon):     125128 kB
Inactive(anon):     3584 kB
Active(file):     958468 kB
Inactive(file):   438728 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:        499708 kB
SwapFree:         499708 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        124124 kB
Mapped:            42472 kB
Shmem:              4596 kB
Slab:             144788 kB
SReclaimable:      97612 kB
SUnreclaim:        47176 kB
KernelStack:       10672 kB
PageTables:         6332 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     2484108 kB
Committed_AS:     900632 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      202276 kB
VmallocChunk:   34359464540 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       4
HugePages_Free:        3
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
DirectMap4k:       83776 kB
DirectMap2M:     4110336 kB
DirectMap1G:     6291456 kB

On Mon, Oct 10, 2016 at 5:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Chris Richards <chris@infinite.io> writes:
> Setting up postgresql-9.5 (9.5.4-1.pgdg14.04+2) ...
> Creating new cluster 9.5/main ...
>   config /etc/postgresql/9.5/main
>   data   /var/lib/postgresql/9.5/main
>   locale en_US.UTF-8
> LOG:  munmap(0x7fff80000000) failed: Invalid argument
> [... snip 14 or so repeats ...]
> LOG:  munmap(0x7fff80000000) failed: Invalid argument
>   socket /var/run/postgresql
>   port   5433
> update-alternatives: using
> /usr/share/postgresql/9.5/man/man1/postmaster.1.gz to provide
> /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode
>  * Starting PostgreSQL 9.5 database server
>       [ OK ]
> Processing triggers for libc-bin (2.19-0ubuntu6.6) ...

> I'm able to connect and I dumped a few default relations.

> Is the munmap error of concern? It remains upon rebooting / restarting the
> server.

Seems pretty fishy to me; I don't know what would be causing it.

[ digs in code... ]  One theory is that PGSharedMemoryDetach is getting
called more than once, but I'm not sure how that would happen.  Can you
characterize where this happens more precisely?  What nondefault settings
have you got in postgresql.conf?

                        regards, tom lane

Re: LOG: munmap(0x7fff80000000) failed: Invalid argument

От
Chris Richards
Дата:
Oh the email spam :P

I did another reboot test to validate the error. It does exist; it does not spam munmap error like it did with the aptitude install.

I then marked the log file, shutdown (1 munmap during), marked it again and then started it (1 munmap during).

MARK PRE-SHUTDOWN
2016-10-11 20:02:45 UTC [1656-2] LOG:  received fast shutdown request
2016-10-11 20:02:45 UTC [1656-3] LOG:  aborting any active transactions
2016-10-11 20:02:45 UTC [1707-2] LOG:  autovacuum launcher shutting down
2016-10-11 20:02:45 UTC [1704-1] LOG:  shutting down
2016-10-11 20:02:45 UTC [1704-2] LOG:  database system is shut down
2016-10-11 20:02:45 UTC [1656-4] LOG:  munmap(0x7fff80000000) failed: Invalid argument
MARK PRE-START
2016-10-11 20:03:02 UTC [9894-1] LOG:  database system was shut down at 2016-10-11 20:02:45 UTC
2016-10-11 20:03:02 UTC [9894-2] LOG:  MultiXact member wraparound protections are now enabled
2016-10-11 20:03:02 UTC [9893-1] LOG:  database system is ready to accept connections
2016-10-11 20:03:02 UTC [9898-1] LOG:  autovacuum launcher started
2016-10-11 20:03:02 UTC [9899-1] LOG:  munmap(0x7fff80000000) failed: Invalid argument
2016-10-11 20:03:02 UTC [9900-1] [unknown]@[unknown] LOG:  incomplete startup packet


On Mon, Oct 10, 2016 at 5:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Chris Richards <chris@infinite.io> writes:
> Setting up postgresql-9.5 (9.5.4-1.pgdg14.04+2) ...
> Creating new cluster 9.5/main ...
>   config /etc/postgresql/9.5/main
>   data   /var/lib/postgresql/9.5/main
>   locale en_US.UTF-8
> LOG:  munmap(0x7fff80000000) failed: Invalid argument
> [... snip 14 or so repeats ...]
> LOG:  munmap(0x7fff80000000) failed: Invalid argument
>   socket /var/run/postgresql
>   port   5433
> update-alternatives: using
> /usr/share/postgresql/9.5/man/man1/postmaster.1.gz to provide
> /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode
>  * Starting PostgreSQL 9.5 database server
>       [ OK ]
> Processing triggers for libc-bin (2.19-0ubuntu6.6) ...

> I'm able to connect and I dumped a few default relations.

> Is the munmap error of concern? It remains upon rebooting / restarting the
> server.

Seems pretty fishy to me; I don't know what would be causing it.

[ digs in code... ]  One theory is that PGSharedMemoryDetach is getting
called more than once, but I'm not sure how that would happen.  Can you
characterize where this happens more precisely?  What nondefault settings
have you got in postgresql.conf?

                        regards, tom lane

Re: LOG: munmap(0x7fff80000000) failed: Invalid argument

От
Tom Lane
Дата:
Chris Richards <chris@infinite.io> writes:
> Oh the email spam :P

No problem.  I'm starting to think that this isn't actually a Postgres
bug, but rather something funny with your kernel.  The code in sysv_shmem
is certainly as fragile as can be, but I'm darned if I can find an
existing code path that would actually break it --- and the lack of other
complaints is pointing towards something funny with your system, too.

First off, does changing "huge_pages" in postgresql.conf alter the
behavior?  (It defaults to "try", see what happens with "on" or "off".)

Second, I notice that you've got

> Hugepagesize:    1048576 kB

which is just enormous.  PG is only going to request circa 140MB given
the settings you mentioned.  We've seen reports of kernel bugs that cause
mmap() to fail for requests that aren't a multiple of the hugepage size,
but I've not heard that mmap() might succeed and then munmap() fail.
That seems like what's happening to you though.

            regards, tom lane


Re: LOG: munmap(0x7fff80000000) failed: Invalid argument

От
Chris Richards
Дата:
I have another application that consumes all of the huge pages; they aren't for pgsql. :)  I've modified the configuration file from "try" to "off" and munmap is no more. Mischief managed.

Thanks for your help.

On Tue, Oct 11, 2016 at 6:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Chris Richards <chris@infinite.io> writes:
> Oh the email spam :P

No problem.  I'm starting to think that this isn't actually a Postgres
bug, but rather something funny with your kernel.  The code in sysv_shmem
is certainly as fragile as can be, but I'm darned if I can find an
existing code path that would actually break it --- and the lack of other
complaints is pointing towards something funny with your system, too.

First off, does changing "huge_pages" in postgresql.conf alter the
behavior?  (It defaults to "try", see what happens with "on" or "off".)

Second, I notice that you've got

> Hugepagesize:    1048576 kB

which is just enormous.  PG is only going to request circa 140MB given
the settings you mentioned.  We've seen reports of kernel bugs that cause
mmap() to fail for requests that aren't a multiple of the hugepage size,
but I've not heard that mmap() might succeed and then munmap() fail.
That seems like what's happening to you though.

                        regards, tom lane