Thread: InitDB: Bad system call

InitDB: Bad system call

From: Torsten Zühlsdorff
Hello,

I've just built a new jail on my FreeBSD 7.0-STABLE machine and am
trying to get PostgreSQL 9.0 Beta 4 running. Compiling etc. works fine.

But when I run initdb, I get "Bad system call" messages. Here is
the output:

$ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data -d
Running in debug mode.
VERSION=9.0beta4
PGDATA=/usr/local/pgsql/data
share_path=/usr/local/pgsql/share
PGPATH=/usr/local/pgsql/bin
POSTGRES_SUPERUSERNAME=postgres
POSTGRES_BKI=/usr/local/pgsql/share/postgres.bki
POSTGRES_DESCR=/usr/local/pgsql/share/postgres.description
POSTGRES_SHDESCR=/usr/local/pgsql/share/postgres.shdescription
POSTGRESQL_CONF_SAMPLE=/usr/local/pgsql/share/postgresql.conf.sample
PG_HBA_SAMPLE=/usr/local/pgsql/share/pg_hba.conf.sample
PG_IDENT_SAMPLE=/usr/local/pgsql/share/pg_ident.conf.sample
The files belonging to this database system will be owned by user
"postgres".
This user must also own the server process.

The database cluster will be initialized with locale C.
The default database encoding has accordingly been set to SQL_ASCII.
The default text search configuration will be set to "english".

fixing permissions on existing directory /usr/local/pgsql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
10
selecting default shared_buffers ... Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
Bad system call (core dumped)
400kB
creating configuration files ... ok
creating template1 database in /usr/local/pgsql/data/base/1 ... Bad system call (core dumped)
child process exited with exit code 140
initdb: removing contents of data directory "/usr/local/pgsql/data"

There is no further message in /var/log/messages.

At first I believed this was an error related to the SYSVSHM, SYSVSEM,
or SYSVMSG options or to the user ID
(http://www.freebsddiary.org/jail-multiple.php). But the postgres user
has a user ID that is not used by the postgres instances in other
jails, and the other options are enabled in the root instance.

I also tried building postgres from a fresh ports tree, to make sure
that I had not mis-"./configure"d anything, but the same problems occur.

I have no clue what the problem is. Any hints?

Thanks,
Torsten

Re: InitDB: Bad system call

From: Thom Brown
On 9 August 2010 12:56, Torsten Zühlsdorff <foo@meisterderspiele.de> wrote:
> Hello,
>
> I've just built a new jail on my FreeBSD 7.0-STABLE machine and am trying to
> get PostgreSQL 9.0 Beta 4 running. Compiling etc. works fine.
>
> But when I run initdb, I get "Bad system call" messages. Here is the
> output:
>
> $ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data -d
> [output]
>
> There is no further message in /var/log/messages.
>
> At first I believed this was an error related to the SYSVSHM, SYSVSEM, or
> SYSVMSG options or to the user ID
> (http://www.freebsddiary.org/jail-multiple.php). But the postgres user has
> a user ID that is not used by the postgres instances in other jails, and
> the other options are enabled in the root instance.
>
> I also tried building postgres from a fresh ports tree, to make sure that I
> had not mis-"./configure"d anything, but the same problems occur.
>
> I have no clue what the problem is. Any hints?
>
> Thanks,
> Torsten

See http://www.postgresql.org/docs/9.0/static/kernel-resources.html
and the section under NetBSD/OpenBSD.

--
Thom Brown
Registered Linux user: #516935

Re: InitDB: Bad system call

From: Amitabh Kant
On Mon, Aug 9, 2010 at 6:01 PM, Thom Brown <thom@linux.com> wrote:

> See http://www.postgresql.org/docs/9.0/static/kernel-resources.html
> and the section under NetBSD/OpenBSD.
>
> --
> Thom Brown
> Registered Linux user: #516935


Thom

Not sure if it's a typo, but shouldn't he be looking under the FreeBSD section, as he is running FreeBSD 7.0?


Amitabh Kant

Re: InitDB: Bad system call

From: Thom Brown
On 9 August 2010 13:56, Amitabh Kant <amitabhkant@gmail.com> wrote:
> On Mon, Aug 9, 2010 at 6:01 PM, Thom Brown <thom@linux.com> wrote:
>>
>> See http://www.postgresql.org/docs/9.0/static/kernel-resources.html
>> and the section under NetBSD/OpenBSD.
>>
>> --
>> Thom Brown
>> Registered Linux user: #516935
>>
>
> Thom
>
> Not sure if it's a typo, but shouldn't he be looking under FreeBSD section
> as he is running FreeBSD 7.0?
>

Ah yes, my bad.

--
Thom Brown
Registered Linux user: #516935

Re: InitDB: Bad system call

From: Torsten Zühlsdorff
Hello Thom,

> See http://www.postgresql.org/docs/9.0/static/kernel-resources.html
> and the section under NetBSD/OpenBSD.

I already know the FreeBSD section. My current values are:

kern.ipc.shmall: 131072
kern.ipc.shmmax: 2684225436
kern.ipc.semmap: 4096

kern.ipc.semmnu: 512
kern.ipc.semmns: 1024
kern.ipc.semmni: 512

kern.ipc.shm_use_phys: 0

security.jail.sysvipc_allowed: 1
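
For a quick sanity check, the settings listed above can be parsed and compared. This is a minimal Python sketch, not a diagnosis: the 4096-byte page size and the reading of kern.ipc.shmall as a page count are assumptions (they follow the FreeBSD examples in the PostgreSQL kernel-resources documentation), and parse_sysctl is a hypothetical helper.

```python
# Values copied from the message above (output of `sysctl`).
SYSCTL_OUTPUT = """\
kern.ipc.shmall: 131072
kern.ipc.shmmax: 2684225436
kern.ipc.semmap: 4096
kern.ipc.semmnu: 512
kern.ipc.semmns: 1024
kern.ipc.semmni: 512
kern.ipc.shm_use_phys: 0
security.jail.sysvipc_allowed: 1
"""

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict of integers (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            settings[name.strip()] = int(value.strip())
    return settings


settings = parse_sysctl(SYSCTL_OUTPUT)
# Assumption: shmall is a page count, so multiply by the page size.
shmall_bytes = settings["kern.ipc.shmall"] * PAGE_SIZE

print("shmall in bytes:", shmall_bytes)
print("shmmax in bytes:", settings["kern.ipc.shmmax"])
print("SysV IPC allowed in jails:", settings["security.jail.sysvipc_allowed"] == 1)
```

Under those assumptions the total shared memory allowance works out to 512 MiB, while the single-segment limit kern.ipc.shmmax is about 2.5 GiB.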

I also run the user with different UIDs:

$ grep pgsql -h /usr/local/jail/*/*/etc/passwd
pgsql:*:1070:70:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh
pgsql:*:7575:7575:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh
pgsql:*:1074:70:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh
pgsql:*:1071:70:PostgreSQL Daemon:/usr/local/pgsql:/bin/sh

I also rebuilt the complete jail to make sure that the error was not
introduced while creating the jail.
I also disabled all but one PostgreSQL instance (the live DB ;)) to
make sure that enough shared memory is free.
But the "Bad system call" messages don't go away. Any other hints?

Greetings,
Torsten


Re: InitDB: Bad system call

From: Torsten Zühlsdorff
Torsten Zühlsdorff wrote:

> I've just built a new jail on my FreeBSD 7.0-STABLE machine and am
> trying to get PostgreSQL 9.0 Beta 4 running. Compiling etc. works fine.
>
> But when I run initdb, I get "Bad system call" messages. Here is
> the output:
>
> $ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data -d
> [output]
>
> At first I believed this was an error related to the SYSVSHM, SYSVSEM,
> or SYSVMSG options or to the user ID
> (http://www.freebsddiary.org/jail-multiple.php). But the postgres user
> has a user ID that is not used by the postgres instances in other
> jails, and the other options are enabled in the root instance.
>
> I also tried building postgres from a fresh ports tree, to make sure
> that I had not mis-"./configure"d anything, but the same problems occur.

I've tried initdb in the only jail where PostgreSQL is already running.
There it works.

I have no clue what to do next. I can't even find the core dump -.-
Should I just raise the System V IPC parameters and hope?

Greetings,
Torsten

--
http://www.dddbl.de - a database layer that abstracts working with 8
different database systems,
separates queries from applications, and can automatically evaluate
the query results.

Re: InitDB: Bad system call

From: Reko Turja
> Torsten Zühlsdorff wrote:
>
>> I've just built a new jail on my FreeBSD 7.0-STABLE machine and am
>> trying to get PostgreSQL 9.0 Beta 4 running. Compiling etc. works
>> fine.

Is the machine really running a pre-RELENG 7.0?

>> But when I run initdb, I get "Bad system call" messages. Here
>> is the output:

The system throwing out a coredump instead of failing gracefully
suggests an OS bug and as you are seemingly running an ancient
development branch, that seems even quite plausible.

In any case I'd ask the same question on the freebsd-questions list as
well.

-Reko


Re: InitDB: Bad system call

From: Torsten Zühlsdorff
Reko Turja wrote:

>>> I've just built a new jail on my FreeBSD 7.0-STABLE machine and am
>>> trying to get PostgreSQL 9.0 Beta 4 running. Compiling etc. works fine.
>
> Is the machine really running a pre-RELENG 7.0?

As far as I know, we started using 7.0 some months after its
release. So: no.
When I log in, I see this in the welcome message:
FreeBSD 7.0-STABLE (GENERIC) #1: Fri Aug 15 19:33:13 CEST 2008

That is 6 months after the initial release of 7.0.

>>> But when I run initdb, I get "Bad system call" messages. Here is
>>> the output:
>
> The system throwing out a coredump instead of failing gracefully
> suggests an OS bug and as you are seemingly running an ancient
> development branch, that seems even quite plausible.

I'm running a development *jail* on the *same* machine as the
live database. The live database works great. There is also a second
jail where a PostgreSQL instance is running. In both I can use PostgreSQL
(versions 8.3 and 8.4) without any limitations. But in the third jail I
get these problems.

Greetings,
Torsten

Re: InitDB: Bad system call

From: Greg Smith
Torsten Zühlsdorff wrote:
> selecting default max_connections ... Bad system call (core dumped)
> Bad system call (core dumped)
> Bad system call (core dumped)
> Bad system call (core dumped)
> Bad system call (core dumped)
> Bad system call (core dumped)
> 10
> selecting default shared_buffers ... Bad system call (core dumped)
> Bad system call (core dumped) ...

What it's doing in this part is trying to start the server process in a
special testing mode, starting with large values for the settings that
impact shared memory, then stepping down the sizes until that works.
That's why there are so many of these.  But it looks like none of them
actually work.
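
The step-down behaviour described above can be sketched as follows. This is a simplified Python illustration, not initdb's actual code: the candidate list and the helper name pick_setting are assumptions, chosen so that six failed attempts end with the value 10, as in the output earlier in the thread.

```python
from typing import Callable, Sequence, Tuple


def pick_setting(candidates: Sequence[int],
                 works: Callable[[int], bool]) -> Tuple[int, int]:
    """Try candidates from largest to smallest.

    Returns (chosen value, number of failed attempts). If no candidate
    works, the smallest (last) one is returned anyway, which mirrors the
    "10" printed after six failures in the thread.
    """
    failures = 0
    for value in candidates:
        if works(value):
            return value, failures
        failures += 1
    return candidates[-1], failures


# Illustrative candidate list (an assumption, not taken from the initdb source).
TRIAL_CONNECTIONS = [100, 50, 40, 30, 20, 10]

# Healthy system: the first candidate that fits is chosen.
print(pick_setting(TRIAL_CONNECTIONS, lambda n: n <= 40))  # (40, 2)
# Broken jail: every test start of the server dies, so all six attempts fail.
print(pick_setting(TRIAL_CONNECTIONS, lambda n: False))    # (10, 6)
```

Each failed attempt corresponds to one "Bad system call" line, which is why the message repeats once per candidate value.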

Have you tried running the initdb with strace or truss?  That might give
you a clue as to exactly what system call is failing.  Your jail isn't
allowing something fundamental here, but it's hard to guess what.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: InitDB: Bad system call

From: Tom Lane
Greg Smith <greg@2ndquadrant.com> writes:
> Torsten Zühlsdorff wrote:
>> Bad system call (core dumped)

> Have you tried running the initdb with strace or truss?  That might give
> you a clue as to exactly what system call is failing.  Your jail isn't
> allowing something fundamental here, but it's hard to guess what.

Or even easier, gdb the core file ...

            regards, tom lane

Re: InitDB: Bad system call

From: Torsten Zühlsdorff
Hi Tom,

>>> Bad system call (core dumped)
>
>> Have you tried running the initdb with strace or truss?  That might give
>> you a clue as to exactly what system call is failing.  Your jail isn't
>> allowing something fundamental here, but it's hard to guess what.
>
> Or even easier, gdb the core file ...

As I wrote earlier, I can't locate the core file. But now I have used truss:
$ truss -o /tmp/pg.truss /usr/local/bin/initdb /usr/local/pgsql/

Here is the result:
http://www.dddbl.de/pg.truss.txt

The first suspicious thing I can see is a lot of "ERR#32 'Broken pipe'" entries.

I also changed some IPC values from:
kern.ipc.semmni=512
kern.ipc.semmns=1024
kern.ipc.semmnu=512

to:
kern.ipc.semmnu: 4096
kern.ipc.semmns: 8192
kern.ipc.semmni: 32767

But these are read-only values, so I have to reboot the machine. It's a
live machine, though, and it will take some time to prepare the reboot. -.-

Greetings from Germany,
Torsten

Re: InitDB: Bad system call

From: Alvaro Herrera
Excerpts from Torsten Zühlsdorff's message of Wed Aug 11 02:52:34 -0400 2010:
> Hi Tom,
>
> >>> Bad system call (core dumped)
> >
> >> Have you tried running the initdb with strace or truss?  That might give
> >> you a clue as to exactly what system call is failing.  Your jail isn't
> >> allowing something fundamental here, but it's hard to guess what.
> >
> > Or even easier, gdb the core file ...
>
> As I wrote earlier, I can't locate the core file. But now I have used truss:
> $ truss -o /tmp/pg.truss /usr/local/bin/initdb /usr/local/pgsql/

This isn't as helpful because you're tracing the initdb process.  The
core file would give a backtrace of the postgres process, which is what
is actually crashing.

> The first suspicious thing I can see is a lot of "ERR#32 'Broken pipe'" entries.

This is the result of postgres crashing and thus initdb being unable to
write any more data to it.

I think you should try harder to generate the core file.  Maybe you have
too low an "ulimit -c" setting?
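
The core file size limit that "ulimit -c" reports can also be read programmatically. A minimal Python sketch using the standard resource module (Unix only); the helper name core_dump_limit is hypothetical, and it reports the limit of the process running the script, which is only a hint about the limit the postgres child process would inherit.

```python
import resource


def core_dump_limit():
    """Return the (soft, hard) core file size limits as readable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

    def show(value):
        # RLIM_INFINITY means the kernel does not cap the core file size.
        return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"

    return show(soft), show(hard)


soft, hard = core_dump_limit()
print(f"core file size limit: soft={soft}, hard={hard}")
if soft == "0 bytes":
    print("core dumps are disabled for this process")
```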

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: InitDB: Bad system call

From: Tom Lane
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Torsten Zühlsdorff's message of Wed Aug 11 02:52:34 -0400 2010:
>>>> Bad system call (core dumped)

> I think you should try harder to generate the core file.  Maybe you have
> too low an "ulimit -c" setting?

The kernel message indicates that core *is* being dumped.  Possibly it's
being dumped in the $PGDATA directory, which initdb will rm -rf on
failure.  Try using initdb --noclean.
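
After a run with --noclean, the data directory can be searched for a core file. A minimal Python sketch; the helper find_core_files is hypothetical, and the naming pattern (a file called core or ending in .core, such as postgres.core) is an assumption about how the system names its dumps.

```python
from pathlib import Path


def find_core_files(root):
    """Return a sorted list of files under root that look like core dumps.

    Assumption: dumps are named "core" or "<program>.core".
    """
    root_path = Path(root)
    return sorted(
        path
        for path in root_path.rglob("*")
        if path.is_file() and (path.name == "core" or path.name.endswith(".core"))
    )
```

For example, find_core_files("/usr/local/pgsql/data") would list any dumps left in the data directory used in this thread.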

            regards, tom lane

Re: InitDB: Bad system call

From: Torsten Zühlsdorff
Hello,

>> The first suspicious thing I can see is a lot of "ERR#32 'Broken pipe'" entries.
>
> This is the result of postgres crashing and thus initdb being unable to
> write any more data to it.
>
> I think you should try harder to generate the core file.  Maybe you have
> too low an "ulimit -c" setting?

There is no ulimit on FreeBSD.

Greetings,
Torsten

Re: InitDB: Bad system call

From: Torsten Zühlsdorff
Hello,

>> Excerpts from Torsten Zühlsdorff's message of Wed Aug 11 02:52:34 -0400 2010:
>>>>> Bad system call (core dumped)
>
>> I think you should try harder to generate the core file.  Maybe you have
>> too low an "ulimit -c" setting?
>
> The kernel message indicates that core *is* being dumped.  Possibly it's
> being dumped in the $PGDATA directory, which initdb will rm -rf on
> failure.  Try using initdb --noclean.

So... last night I was able to change the SysV IPC settings and
restart the server. Goodbye, 216 days of uptime :D

After that I recreated the jail from scratch and compiled PG 9.0 Beta
4 again.  I compiled PG with:
$ ./configure --enable-debug

The initdb run:
$ /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data/ --noclean
Running in noclean mode.  Mistakes will not be cleaned up.
The files belonging to this database system will be owned by user "pgsql".
This user must also own the server process.

The database cluster will be initialized with locale en_US.ISO8859-1.
The default database encoding has accordingly been set to LATIN1.
The default text search configuration will be set to "english".

creating directory /usr/local/pgsql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
10
selecting default shared_buffers ... Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
Bad system call
400kB
creating configuration files ... ok
creating template1 database in /usr/local/pgsql/data/base/1 ... Bad system call
child process exited with exit code 140
initdb: data directory "/usr/local/pgsql/data" not removed at user's request

Result in $PGDATA is:
$ ls -lah /usr/local/pgsql/data/
total 84
drwx------  12 pgsql  pgsql   512B Aug 12 08:56 .
drwx------   6 pgsql  pgsql   512B Aug 12 08:56 ..
-rw-------   1 pgsql  pgsql     4B Aug 12 08:56 PG_VERSION
drwx------   3 pgsql  pgsql   512B Aug 12 08:56 base
drwx------   2 pgsql  pgsql   512B Aug 12 08:56 global
drwx------   2 pgsql  pgsql   512B Aug 12 08:56 pg_clog
-rw-------   1 pgsql  pgsql   3.8K Aug 12 08:56 pg_hba.conf
-rw-------   1 pgsql  pgsql   1.6K Aug 12 08:56 pg_ident.conf
drwx------   4 pgsql  pgsql   512B Aug 12 08:56 pg_multixact
drwx------   2 pgsql  pgsql   512B Aug 12 08:56 pg_notify
drwx------   2 pgsql  pgsql   512B Aug 12 08:56 pg_stat_tmp
drwx------   2 pgsql  pgsql   512B Aug 12 08:56 pg_subtrans
drwx------   2 pgsql  pgsql   512B Aug 12 08:56 pg_tblspc
drwx------   2 pgsql  pgsql   512B Aug 12 08:56 pg_twophase
drwx------   3 pgsql  pgsql   512B Aug 12 08:56 pg_xlog
-rw-------   1 pgsql  pgsql    17K Aug 12 08:56 postgresql.conf
-rw-------   1 pgsql  pgsql    49B Aug 12 08:56 postmaster.pid

Please note that after changing the system's IPC settings, no
core file is dumped anymore. Quite interesting.

Greetings,
Torsten

Re: InitDB: Bad system call

From: Tom Lane
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
> Please note that after changing the system's IPC settings, no
> core file is dumped anymore. Quite interesting.

How annoying :-(.  I think what you need to do is use truss or strace
or local equivalent with the follow-forks flag, so that you can see what
the stand-alone backend process does, not just initdb itself.

            regards, tom lane

Re: InitDB: Bad system call

From: Torsten Zühlsdorff
Hi Tom,

>> Please note that after changing the system's IPC settings, no
>> core file is dumped anymore. Quite interesting.
>
> How annoying :-(.  I think what you need to do is use truss or strace
> or local equivalent with the follow-forks flag, so that you can see what
> the stand-alone backend process does, not just initdb itself.

OK, next round. truss is my only option, because strace didn't
work on my AMD64 machine. I hope it's helpful:

$ truss -f -o /tmp/pgtuss-f.txt /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data

Result:
http://www.dddbl.de/pg-truss-f.txt

Greetings,
Torsten

Re: InitDB: Bad system call

From: Tom Lane
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
>> How annoying :-(.  I think what you need to do is use truss or strace
>> or local equivalent with the follow-forks flag, so that you can see what
>> the stand-alone backend process does, not just initdb itself.

> OK, next round. truss is my only option, because strace didn't
> work on my AMD64 machine. I hope it's helpful:

> $ truss -f -o /tmp/pgtuss-f.txt /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data

> Result:
> http://www.dddbl.de/pg-truss-f.txt

[ scratches head ... ]  That looks like it got interrupted before
getting to anything interesting.  Did the console printout show any "Bad
system call" reports?

            regards, tom lane

Re: InitDB: Bad system call

From: Alban Hertroys
On 12 Aug 2010, at 16:04, Torsten Zühlsdorff wrote:

> OK, next round. truss is my only option, because strace didn't work on my AMD64 machine. I hope it's helpful:


I haven't used it yet, but I've heard good things about DTrace, which is apparently in base these days.

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.



Re: InitDB: Bad system call

From: Glen Barber
On 8/12/10 11:23 AM, Tom Lane wrote:
> Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
>>> How annoying :-(.  I think what you need to do is use truss or strace
>>> or local equivalent with the follow-forks flag, so that you can see what
>>> the stand-alone backend process does, not just initdb itself.
>
>> OK, next round. truss is my only option, because strace didn't
>> work on my AMD64 machine. I hope it's helpful:
>
>> $ truss -f -o /tmp/pgtuss-f.txt /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
>
>> Result:
>> http://www.dddbl.de/pg-truss-f.txt
>
> [ scratches head ... ]  That looks like it got interrupted before
> getting to anything interesting.  Did the console printout show any "Bad
> system call" reports?
>

Hi,

I didn't see it mentioned earlier in this thread - is
security.jail.sysvipc_allowed=1?  This will automatically be set to 1 if
you have jail_sysvipc_allow="YES" in rc.conf.

Regards,

--
Glen Barber

Re: InitDB: Bad system call

From: Torsten Zühlsdorff
Hi Glen,

>>>> How annoying :-(.  I think what you need to do is use truss or strace
>>>> or local equivalent with the follow-forks flag, so that you can see what
>>>> the stand-alone backend process does, not just initdb itself.
>>> OK, next round. truss is my only option, because strace didn't
>>> work on my AMD64 machine. I hope it's helpful:
>>> $ truss -f -o /tmp/pgtuss-f.txt /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
>>> Result:
>>> http://www.dddbl.de/pg-truss-f.txt
>> [ scratches head ... ]  That looks like it got interrupted before
>> getting to anything interesting.  Did the console printout show any "Bad
>> system call" reports?
>
> I didn't see it mentioned earlier in this thread - is
> security.jail.sysvipc_allowed=1?  This will automatically be set to 1 if
> you have jail_sysvipc_allow="YES" in rc.conf.

Yes, it is:
# sysctl -a | grep sysvipc_allowed
security.jail.sysvipc_allowed: 1

Greetings,
Torsten

Re: InitDB: Bad system call

From: Torsten Zühlsdorff
Hello Tom,

>>> How annoying :-(.  I think what you need to do is use truss or strace
>>> or local equivalent with the follow-forks flag, so that you can see what
>>> the stand-alone backend process does, not just initdb itself.
>
>> OK, next round. truss is my only option, because strace didn't
>> work on my AMD64 machine. I hope it's helpful:
>
>> $ truss -f -o /tmp/pgtuss-f.txt /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
>
>> Result:
>> http://www.dddbl.de/pg-truss-f.txt
>
> [ scratches head ... ]  That looks like it got interrupted before
> getting to anything interesting.  Did the console printout show any "Bad
> system call" reports?

Yes, it did. But because I believed that it would not be very helpful
without a core file, I rebuilt everything again. I checked out the newest
FreeBSD sources, rebuilt world, then the jail, and then PostgreSQL.

It's the same as before, but this time with a core file! :) I don't know
why, but now there is one. You can find it here:
http://www.dddbl.de/postgres.core (2.4 MB)

If it helps, I can give you access to the jail. That should be easier for
us than communicating across multiple time zones.

Greetings,
Torsten

Re: InitDB: Bad system call

From: Tom Lane
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
> It's the same as before, but this time with a core file! :) I don't know
> why, but now there is one. You can find it here:
> http://www.dddbl.de/postgres.core (2.4 MB)

That's good, but the core file is pretty much useless to anyone else.
Please gdb it and post a stack trace:

    gdb /path/to/postgres /path/to/core
    gdb> bt
    gdb> quit

            regards, tom lane

Re: InitDB: Bad system call

From: Torsten Zühlsdorff
Tom Lane wrote:
> Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
>> It's the same as before, but this time with a core file! :) I don't know
>> why, but now there is one. You can find it here:
>> http://www.dddbl.de/postgres.core (2.4 MB)
>
> That's good, but the core file is pretty much useless to anyone else.
> Please gdb it and post a stack trace:
>
>     gdb /path/to/postgres /path/to/core
>     gdb> bt
>     gdb> quit
>

Hm... /path/to/postgres? Not initdb? But regardless of which one I use, it looks
like this:
#0  0x0000000800bb166c in ?? ()
#1  0x00000000005b158f in ?? ()
#2  0x0000003000000020 in ?? ()
#3  0x00007fffffffe620 in ?? ()
#4  0x00007fffffffe560 in ?? ()
#5  0x000000080091607a in ?? ()
#6  0x0000000800c04a60 in ?? ()
#7  0x0000000800913496 in ?? ()
#8  0x00007fffffffeab8 in ?? ()
#9  0x00007fffffffeab0 in ?? ()
#10 0xffffff00423f38e0 in ?? ()
#11 0x00007fffffffe618 in ?? ()
#12 0x0000000000000031 in ?? ()
#13 0x00000000ffffaa8a in ?? ()
#14 0x00000000007ea036 in ?? ()
#15 0x000000080091056d in ?? ()
#16 0x0000000000000207 in ?? ()
#17 0x00000000000005c8 in ?? ()
#18 0x00007fffffffe618 in ?? ()
#19 0xffffff00423f38e0 in ?? ()
#20 0x00007fffffffe65d in ?? ()
#21 0x00000000007ea094 in ?? ()
#22 0x00007fffffffeab0 in ?? ()
#23 0x00007fffffffeab8 in ?? ()
#24 0x0000000000000000 in ?? ()

I believe that is not very helpful, is it?

Greetings,
Torsten

Re: InitDB: Bad system call

From: Tom Lane
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
> Hm... /path/to/postgres? Not initdb?

Yes; it's postgres that is failing, not initdb.

> But regardless of which one I use, it looks
> like this:
> #0  0x0000000800bb166c in ?? ()
> #1  0x00000000005b158f in ?? ()
> ...
> I believe that is not very helpful, is it?

Nope, it's not.  Could you reconfigure with --enable-debug, rebuild, try
again?

            regards, tom lane

Re: InitDB: Bad system call

From: Torsten Zühlsdorff
Tom Lane wrote:

>> Hm... /path/to/postgres? Not initdb?
>
> Yes; it's postgres that is failing, not initdb.

Ok.

>> But regardless of which one I use, it looks
>> like this:
>> #0  0x0000000800bb166c in ?? ()
>> #1  0x00000000005b158f in ?? ()
>> ...
>> I believe that is not very helpful, is it?
>
> Nope, it's not.  Could you reconfigure with --enable-debug, rebuild, try
> again?

Hm, that was already with --enable-debug. But I believe I just misused
gdb the first time. Now I get the following result, which seems more
helpful. But I had to reuse a saved core dump because, as before,
postgres doesn't create new ones. Here is the result:

%gdb /usr/local/pgsql/bin/postgres /tmp/postgres.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

warning: exec file is newer than core file.
Core was generated by `postgres'.
Program terminated with signal 12, Bad system call.
Reading symbols from /lib/libm.so.5...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0  0x0000000800bb166c in shmctl () from /lib/libc.so.7
(gdb) bt
#0  0x0000000800bb166c in shmctl () from /lib/libc.so.7
#1  0x00000000005b158f in PGSharedMemoryIsInUse (id1=Variable "id1" is not available.
) at pg_shmem.c:247
#2  0x00000000006a0844 in CreateLockFile (filename=0x7ea036 "postmaster.pid", amPostmaster=0 '\0', isDDLock=1 '\001', refName=0x800e0b180 "/usr/local/pgsql/data") at miscinit.c:835
#3  0x000000000049baf0 in AuxiliaryProcessMain (argc=3, argv=0x7fffffffebc8) at bootstrap.c:350
#4  0x000000000056742e in main (argc=4, argv=0x7fffffffebc0) at main.c:180
(gdb) quit

Greetings,
Torsten

Re: InitDB: Bad system call

From: Tom Lane
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
> Core was generated by `postgres'.
> Program terminated with signal 12, Bad system call.
> Reading symbols from /lib/libm.so.5...done.
> Loaded symbols for /lib/libm.so.5
> Reading symbols from /lib/libc.so.7...done.
> Loaded symbols for /lib/libc.so.7
> Reading symbols from /libexec/ld-elf.so.1...done.
> Loaded symbols for /libexec/ld-elf.so.1
> #0  0x0000000800bb166c in shmctl () from /lib/libc.so.7
> (gdb) bt
> #0  0x0000000800bb166c in shmctl () from /lib/libc.so.7
> #1  0x00000000005b158f in PGSharedMemoryIsInUse (id1=Variable "id1" is
> not available.
> ) at pg_shmem.c:247
> #2  0x00000000006a0844 in CreateLockFile (filename=0x7ea036
> "postmaster.pid", amPostmaster=0 '\0', isDDLock=1 '\001',
> refName=0x800e0b180 "/usr/local/pgsql/data") at miscinit.c:835
> #3  0x000000000049baf0 in AuxiliaryProcessMain (argc=3,
> argv=0x7fffffffebc8) at bootstrap.c:350
> #4  0x000000000056742e in main (argc=4, argv=0x7fffffffebc0) at main.c:180

Well, this seems to be clear proof for what everyone suspected all
along: your kernel is rejecting SysV-shared-memory calls.  I'm too tired
to go check that that shmctl() is the first such syscall during the boot
sequence, but it looks about right.
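
The backtrace also matches the "exit code 140" that initdb reported: under the common shell convention of reporting 128 plus the signal number for a child killed by a signal, 140 corresponds to signal 12, which gdb names here as "Bad system call" (SIGSYS). A small Python sketch of that arithmetic; the signal table is limited to a few FreeBSD numbers and the helper name decode_exit_code is hypothetical.

```python
# A few FreeBSD signal numbers; 12 is confirmed by the gdb output above.
FREEBSD_SIGNALS = {6: "SIGABRT", 11: "SIGSEGV", 12: "SIGSYS", 13: "SIGPIPE"}


def decode_exit_code(code):
    """Return (signal number, name) if code looks like 128 + signal, else None."""
    if code <= 128:
        return None
    signo = code - 128
    return signo, FREEBSD_SIGNALS.get(signo, "unknown")


print(decode_exit_code(140))  # (12, 'SIGSYS')
print(decode_exit_code(1))    # None
```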

So we're now back to the question of *why* it's rejecting those calls,
when you apparently have the proper support configured.  I'm afraid
you now need to seek the assistance of some FreeBSD kernel experts;
it's beyond the ken of a simple database hacker ...

            regards, tom lane

Re: InitDB: Bad system call

From: Glen Barber
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 8/15/10 1:32 AM, Tom Lane wrote:
>
> Well, this seems to be clear proof for what everyone suspected all
> along: your kernel is rejecting SysV-shared-memory calls.  I'm too tired
> to go check that that shmctl() is the first such syscall during the boot
> sequence, but it looks about right.
>
> So we're now back to the question of *why* it's rejecting those calls,
> when you apparently have the proper support configured.  I'm afraid
> you now need to seek the assistance of some FreeBSD kernel experts;
> it's beyond the ken of a simple database hacker ...
>

7.0-STABLE is ... old.  I would recommend upgrading to something more
recent before moving forward with this "bug", as I expect the FreeBSD
community to recommend such anyway.

Regards,

--
Glen Barber

Re: InitDB: Bad system call

From
Alban Hertroys
Date:
On 15 Aug 2010, at 7:32, Tom Lane wrote:

> Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
>> Core was generated by `postgres'.
>> Program terminated with signal 12, Bad system call.
>> Reading symbols from /lib/libm.so.5...done.
>> Loaded symbols for /lib/libm.so.5
>> Reading symbols from /lib/libc.so.7...done.
>> Loaded symbols for /lib/libc.so.7
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0  0x0000000800bb166c in shmctl () from /lib/libc.so.7
>> (gdb) bt
>> #0  0x0000000800bb166c in shmctl () from /lib/libc.so.7
>> #1  0x00000000005b158f in PGSharedMemoryIsInUse (id1=Variable "id1" is
>> not available.
>> ) at pg_shmem.c:247
>> #2  0x00000000006a0844 in CreateLockFile (filename=0x7ea036
>> "postmaster.pid", amPostmaster=0 '\0', isDDLock=1 '\001',
>> refName=0x800e0b180 "/usr/local/pgsql/data") at miscinit.c:835
>> #3  0x000000000049baf0 in AuxiliaryProcessMain (argc=3,
>> argv=0x7fffffffebc8) at bootstrap.c:350
>> #4  0x000000000056742e in main (argc=4, argv=0x7fffffffebc0) at main.c:180
>
> Well, this seems to be clear proof for what everyone suspected all
> along: your kernel is rejecting SysV-shared-memory calls.  I'm too tired
> to go check that that shmctl() is the first such syscall during the boot
> sequence, but it looks about right.
>
> So we're now back to the question of *why* it's rejecting those calls,
> when you apparently have the proper support configured.  I'm afraid
> you now need to seek the assistance of some FreeBSD kernel experts;
> it's beyond the ken of a simple database hacker ...


Hmm... shared memory in a jail, there used to be some issues with
that and I don't think they have been (or are going to be) solved.
I recall that shared memory can't be local to a jail (it's "shared"
after all), so you probably need(ed) to allow access to it somehow
for your jails.
Or you're running into issues sharing the same shared memory across
multiple jails (and the base system) maybe?

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.

Re: InitDB: Bad system call

From
Torsten Zühlsdorff
Date:
Alban Hertroys wrote:

>>> Core was generated by `postgres'.
>>> Program terminated with signal 12, Bad system call.
>>> Reading symbols from /lib/libm.so.5...done.
>>> Loaded symbols for /lib/libm.so.5
>>> Reading symbols from /lib/libc.so.7...done.
>>> Loaded symbols for /lib/libc.so.7
>>> Reading symbols from /libexec/ld-elf.so.1...done.
>>> Loaded symbols for /libexec/ld-elf.so.1
>>> #0  0x0000000800bb166c in shmctl () from /lib/libc.so.7
>>> (gdb) bt
>>> #0  0x0000000800bb166c in shmctl () from /lib/libc.so.7
>>> #1  0x00000000005b158f in PGSharedMemoryIsInUse (id1=Variable "id1" is
>>> not available.
>>> ) at pg_shmem.c:247
>>> #2  0x00000000006a0844 in CreateLockFile (filename=0x7ea036
>>> "postmaster.pid", amPostmaster=0 '\0', isDDLock=1 '\001',
>>> refName=0x800e0b180 "/usr/local/pgsql/data") at miscinit.c:835
>>> #3  0x000000000049baf0 in AuxiliaryProcessMain (argc=3,
>>> argv=0x7fffffffebc8) at bootstrap.c:350
>>> #4  0x000000000056742e in main (argc=4, argv=0x7fffffffebc0) at main.c:180
>> Well, this seems to be clear proof for what everyone suspected all
>> along: your kernel is rejecting SysV-shared-memory calls.  I'm too
>> tired to go check that that shmctl() is the first such syscall
>> during the boot sequence, but it looks about right.
>>
>> So we're now back to the question of *why* it's rejecting those
>> calls, when you apparently have the proper support configured.  I'm
>> afraid you now need to seek the assistance of some FreeBSD kernel
>> experts; it's beyond the ken of a simple database hacker ...
>
>
> Hmm... shared memory in a jail, there used to be some issues with
> that and I don't think they have been (or are going to be) solved. I
> recall that shared memory can't be local to a jail (it's "shared"
> after all), so you probably need(ed) to allow access to it somehow
> for your jails. Or you're running into issues sharing the same shared
> memory across multiple jails (and the base system) maybe?

The problems are known and I have already taken care of them. As I wrote
at the beginning, I already have two jails on this server with running
PostgreSQL instances.
Normally you have to raise the IPC parameters and use a different user ID
for each postgres user to avoid problems with shared memory.
That's why my problem is so strange. I have never run into such a problem,
and I run nearly a dozen PostgreSQL instances in jails on different
FreeBSD systems.

Greetings,
Torsten

Re: InitDB: Bad system call

From
Tom Lane
Date:
Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
> The problems are known and I have already taken care of them. As I wrote
> at the beginning, I already have two jails on this server with running
> PostgreSQL instances.
> Normally you have to raise the IPC parameters and use a different user ID
> for each postgres user to avoid problems with shared memory.
> That's why my problem is so strange. I have never run into such a problem,
> and I run nearly a dozen PostgreSQL instances in jails on different
> FreeBSD systems.

Now that I'm a bit more awake, I do notice something interesting about
that stack trace: the shmctl() is being executed to see whether a shared
memory segment ID mentioned in postmaster.pid still exists.  This
implies that some previous incarnation of the postmaster got as far as
writing postmaster.pid, which implies that it successfully executed
shmget() and shmat(), and then crashed later.  The simplest explanation
I can think of is that it's *only* shmctl that is malfunctioning, not
the other SysV shared memory calls.  Which is even weirder, and
definitely seems to move the problem into the category of kernel bug
rather than configuration mistake.

I concur with the upthread suggestion that you need to update your
FreeBSD instance.

            regards, tom lane
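
Editorial sketch of the check described above: an existence test for a SysV shared memory segment, done with shmctl(IPC_STAT), which is the call at the top of the backtrace. This is not PostgreSQL's code; it is a minimal Python/ctypes illustration. The helper names (create_segment, remove_segment, segment_exists) are hypothetical, the IPC_* constants are assumed to match <sys/ipc.h> on Linux and FreeBSD, and the 512-byte buffer is only an oversized stand-in for struct shmid_ds. The real PGSharedMemoryIsInUse() performs further checks that are omitted here.

```python
import ctypes
import errno
import os

# Assumed constants: these values match <sys/ipc.h> on Linux and FreeBSD.
IPC_PRIVATE = 0
IPC_CREAT = 0o1000
IPC_RMID = 0
IPC_STAT = 2

# Load the C library of the running process so we can reach shmget/shmctl.
libc = ctypes.CDLL(None, use_errno=True)
# key_t is passed as a C long here (an assumption); only IPC_PRIVATE (0) is used.
libc.shmget.argtypes = [ctypes.c_long, ctypes.c_size_t, ctypes.c_int]
libc.shmget.restype = ctypes.c_int
libc.shmctl.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p]
libc.shmctl.restype = ctypes.c_int


def _raise_errno():
    err = ctypes.get_errno()
    raise OSError(err, os.strerror(err))


def create_segment(size=4096):
    """Create a private segment and return its ID (hypothetical helper)."""
    shmid = libc.shmget(IPC_PRIVATE, size, IPC_CREAT | 0o600)
    if shmid < 0:
        _raise_errno()
    return shmid


def remove_segment(shmid):
    """Mark the segment for removal; with no attachments it goes away at once."""
    if libc.shmctl(shmid, IPC_RMID, None) != 0:
        _raise_errno()


def segment_exists(shmid):
    """Ask the kernel whether the segment ID is still valid via IPC_STAT."""
    buf = ctypes.create_string_buffer(512)  # oversized stand-in for shmid_ds
    if libc.shmctl(shmid, IPC_STAT, buf) == 0:
        return True
    err = ctypes.get_errno()
    if err in (errno.EINVAL, errno.EIDRM):
        return False  # the kernel no longer knows this ID
    _raise_errno()


if __name__ == "__main__":
    shmid = create_segment()
    print("exists after create:", segment_exists(shmid))
    remove_segment(shmid)
    print("exists after remove:", segment_exists(shmid))
```

On a healthy kernel, a missing segment simply makes shmctl() return an error code. On the machine in this thread the process was killed by signal 12 inside shmctl() instead, so a probe like this would presumably die at the IPC_STAT call rather than return.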

Re: InitDB: Bad system call

From
Torsten Zühlsdorff
Date:
Hello,

>> Well, this seems to be clear proof for what everyone suspected all
>> along: your kernel is rejecting SysV-shared-memory calls.  I'm too tired
>> to go check that that shmctl() is the first such syscall during the boot
>> sequence, but it looks about right.
>>
>> So we're now back to the question of *why* it's rejecting those calls,
>> when you apparently have the proper support configured.  I'm afraid
>> you now need to seek the assistance of some FreeBSD kernel experts;
>> it's beyond the ken of a simple database hacker ...
>>
>
> 7.0-STABLE is ... old.  I would recommend upgrading to something more
> recent before moving forward with this "bug", as I expect the FreeBSD
> community to recommend such anyway.

FreeBSD 7 is from 2007. That's not very old: you use FreeBSD for
services which should just run (like PostgreSQL :)). In the server park
I supervise there are half a dozen FreeBSD servers with uptimes of around
7 years. Upgrading is something you do very, very rarely. And until now I
have not received such a recommendation from the community. It is more
likely that one adds a new server with a new version of FreeBSD.

Hm... I can't start debugging the kernel of a live machine. I will add
a new server for that. Maybe I can reproduce the problem on another
machine for the FreeBSD community.

Thanks to all for your help and time,
Torsten

Re: InitDB: Bad system call

From
Tom Lane
Date:
I wrote:
> ... The simplest explanation
> I can think of is that it's *only* shmctl that is malfunctioning, not
> the other SysV shared memory calls.  Which is even weirder, and
> definitely seems to move the problem into the category of kernel bug
> rather than configuration mistake.

Hmmm ... Google turned up the information that FreeBSD migrated from int
to size_t variables for shared memory size between 7.0 and 8.0, and in
particular that the size of the struct used by shmctl() changed in
8.0.  So I'm now wondering if what you're dealing with is some sort of
version skew problem.  Could it be that you built Postgres against
system header files that don't match your kernel version?  I'm not
exactly sure how that would manifest as this particular signal,
but it seems worth checking.

            regards, tom lane
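
Editorial sketch of the version-skew concern raised above: when a size field changes from int to size_t, the offsets of the fields after it move, so a caller and a kernel that disagree about the layout read different values from the same bytes. The two structures below are simplified, hypothetical stand-ins written with Python's ctypes; they are not the actual FreeBSD 7.0 or 8.0 definitions of struct shmid_ds, and the field names are assumptions.

```python
import ctypes


class OldShmInfo(ctypes.Structure):
    # Hypothetical "old" layout: segment size stored as a C int.
    _fields_ = [
        ("segsz", ctypes.c_int),
        ("lpid", ctypes.c_int),
        ("cpid", ctypes.c_int),
    ]


class NewShmInfo(ctypes.Structure):
    # Hypothetical "new" layout: segment size widened to size_t.
    _fields_ = [
        ("segsz", ctypes.c_size_t),
        ("lpid", ctypes.c_int),
        ("cpid", ctypes.c_int),
    ]


def layout(struct_type):
    """Return a {field name: byte offset} mapping for a ctypes structure."""
    return {name: getattr(struct_type, name).offset
            for name, _ in struct_type._fields_}


if __name__ == "__main__":
    for struct_type in (OldShmInfo, NewShmInfo):
        print(struct_type.__name__, ctypes.sizeof(struct_type), layout(struct_type))

    # Fill the new layout, then reinterpret the same bytes with the old one.
    new = NewShmInfo(segsz=1 << 20, lpid=1234, cpid=5678)
    misread = OldShmInfo.from_buffer_copy(bytes(new))
    print("lpid written:", new.lpid, "lpid read with old layout:", misread.lpid)
```

Where size_t is wider than int, the old layout reads its lpid from bytes that belong to segsz in the new layout. As the message above notes, it is not obvious how such a mismatch would produce a "Bad system call" signal; the sketch only illustrates why mismatched headers are worth ruling out.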

Re: InitDB: Bad system call

From
Torsten Zühlsdorff
Date:
Hello,

>> ... The simplest explanation
>> I can think of is that it's *only* shmctl that is malfunctioning, not
>> the other SysV shared memory calls.  Which is even weirder, and
>> definitely seems to move the problem into the category of kernel bug
>> rather than configuration mistake.
>
> Hmmm ... Google turned up the information that FreeBSD migrated from int
> to size_t variables for shared memory size between 7.0 and 8.0, and in
> particular that the size of the struct used by shmctl() changed in
> 8.0.  So I'm now wondering if what you're dealing with is some sort of
> version skew problem.  Could it be that you built Postgres against
> system header files that don't match your kernel version?  I'm not
> exactly sure how that would manifest as this particular signal,
> but it seems worth checking.

I have the correct header files, but that brings me to an interesting
observation and a workaround.

Before I built the new jail, I checked out the newest sources for
FreeBSD 7.0 and recompiled the world. With the new "world" I built the
jail, and that is where the problems occur.
Meanwhile there are two jails running PostgreSQL on the same
server. And this does not look like the IPC problems I am familiar with,
because the error messages normally look very different and the other
instances run without problems ;)

What I have done now is disable an old jail and copy it to a new
location. After some reconfiguration I use the copy as the new jail and
installed PostgreSQL in it. And it works.

That supports your assumption that the problem must lie in FreeBSD. But
this will be hard to debug, because the last "make world" on the machine
was 3 years ago. I will describe the problem to the FreeBSD community.

Thanks for all your help and time,
Torsten