BUG #17977: PorstGreSQL in a jail crashes randomly with Signal 10 bus error

Поиск
Список
Период
Сортировка
От PG Bug reporting form
Тема BUG #17977: PorstGreSQL in a jail crashes randomly with Signal 10 bus error
Дата
Msg-id 17977-e252db4b187adeb5@postgresql.org
обсуждение исходный текст
Ответы Re: BUG #17977: PorstGreSQL in a jail crashes randomly with Signal 10 bus error  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17977
Logged by:          Cory Albrecht
Email address:      coryca@gmail.com
PostgreSQL version: 14.6
Operating system:   TrueNAS-13.0-U5.1 (FreeBSD 13.1-RELEASE-p7)
Description:

PorstGreSQL running in a jail crashes randomly with signal 10 bus error. It
is the only application so affected.

Jun 14 16:25:10 postgresql-3 postgres[5026]: [11-1] [2023-06-14 16:25:10.563
EDT][6489e39c.13a2] <%>LOG:  00000: server process (PID 40166) was
terminated by signal 10: Bus error
Jun 14 16:25:10 postgresql-3 postgres[5026]: [11-2] [2023-06-14 16:25:10.563
EDT][6489e39c.13a2] <%>LOCATION:  LogChildExit, postmaster.c:3759
Jun 14 16:25:10 postgresql-3 postgres[5026]: [12-1] [2023-06-14 16:25:10.563
EDT][6489e39c.13a2] <%>LOG:  00000: terminating any other active server
processes
Jun 14 16:25:10 postgresql-3 postgres[5026]: [12-2] [2023-06-14 16:25:10.563
EDT][6489e39c.13a2] <%>LOCATION:  HandleChildCrash, postmaster.c:3482
Jun 14 16:25:10 postgresql-3 postgres[5026]: [13-1] [2023-06-14 16:25:10.577
EDT][6489e39c.13a2] <%>LOG:  00000: all server processes terminated;
reinitializing
Jun 14 16:25:10 postgresql-3 postgres[5026]: [13-2] [2023-06-14 16:25:10.577
EDT][6489e39c.13a2] <%>LOCATION:  PostmasterStateMachine,
postmaster.c:4038
Jun 14 16:25:10 postgresql-3 postgres[40194]: [14-1] [2023-06-14
16:25:10.603 EDT][648a2226.9d02] <%>LOG:  00000: database system was
interrupted; last known up at 2023-06-14 16:24:10 EDT
Jun 14 16:25:10 postgresql-3 postgres[40194]: [14-2] [2023-06-14
16:25:10.603 EDT][648a2226.9d02] <%>LOCATION:  StartupXLOG, xlog.c:6582
Jun 14 16:25:10 postgresql-3 postgres[40194]: [15-1] [2023-06-14
16:25:10.689 EDT][648a2226.9d02] <%>LOG:  00000: database system was not
properly shut down; automatic recovery in progress
Jun 14 16:25:10 postgresql-3 postgres[40194]: [15-2] [2023-06-14
16:25:10.689 EDT][648a2226.9d02] <%>LOCATION:  StartupXLOG, xlog.c:7106
Jun 14 16:25:10 postgresql-3 postgres[40194]: [16-1] [2023-06-14
16:25:10.696 EDT][648a2226.9d02] <%>LOG:  00000: redo starts at 6/B918C0C0
Jun 14 16:25:10 postgresql-3 postgres[40194]: [16-2] [2023-06-14
16:25:10.696 EDT][648a2226.9d02] <%>LOCATION:  StartupXLOG, xlog.c:7384
Jun 14 16:25:10 postgresql-3 postgres[40194]: [17-1] [2023-06-14
16:25:10.696 EDT][648a2226.9d02] <%>LOG:  00000: invalid record length at
6/B918C0F8: wanted 24, got 0
Jun 14 16:25:10 postgresql-3 postgres[40194]: [17-2] [2023-06-14
16:25:10.696 EDT][648a2226.9d02] <%>LOCATION:  ReadRecord, xlog.c:4454
Jun 14 16:25:10 postgresql-3 postgres[40194]: [18-1] [2023-06-14
16:25:10.696 EDT][648a2226.9d02] <%>LOG:  00000: redo done at 6/B918C0C0
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
Jun 14 16:25:10 postgresql-3 postgres[40194]: [18-2] [2023-06-14
16:25:10.696 EDT][648a2226.9d02] <%>LOCATION:  StartupXLOG, xlog.c:7648
Jun 14 16:25:10 postgresql-3 postgres[5026]: [14-1] [2023-06-14 16:25:10.731
EDT][6489e39c.13a2] <%>LOG:  00000: database system is ready to accept
connections

In this case the last few log lines added by PID 40166 were:

Jun 14 16:24:10 postgresql-3 postgres[40166]: [11-1] [2023-06-14
16:24:10.353 EDT][648a21ea.9ce6] <maps%physgeo>LOG:  00000: statement:
SELECT regclass('pg_class')::oid AS oidselect;BEGIN;DECLARE oidcursor BINARY
CURSOR FOR SELECT regclass('pg_class')::oid AS oidbinarycursor;FETCH FORWARD
1 FROM oidcursor;CLOSE oidcursor;COMMIT;
Jun 14 16:24:10 postgresql-3 postgres[40166]: [11-2] [2023-06-14
16:24:10.353 EDT][648a21ea.9ce6] <maps%physgeo>LOCATION:  exec_simple_query,
postgres.c:1016
Jun 14 16:24:10 postgresql-3 postgres[40166]: [12-1] [2023-06-14
16:24:10.355 EDT][648a21ea.9ce6] <maps%physgeo>LOG:  00000: statement:
SELECT postgis_version()
Jun 14 16:24:10 postgresql-3 postgres[40166]: [12-2] [2023-06-14
16:24:10.355 EDT][648a21ea.9ce6] <maps%physgeo>LOCATION:  exec_simple_query,
postgres.c:1016
Jun 14 16:24:10 postgresql-3 postgres[40166]: [13-1] [2023-06-14
16:24:10.545 EDT][648a21ea.9ce6] <maps%physgeo>LOG:  00000: statement:
SELECT postgis_geos_version(), postgis_proj_version()
Jun 14 16:24:10 postgresql-3 postgres[40166]: [13-2] [2023-06-14
16:24:10.545 EDT][648a21ea.9ce6] <maps%physgeo>LOCATION:  exec_simple_query,
postgres.c:1016
Jun 14 16:24:10 postgresql-3 postgres[40166]: [14-1] [2023-06-14
16:24:10.545 EDT][648a21ea.9ce6] <maps%physgeo>LOG:  00000: statement:
SELECT has_schema_privilege(n.oid, 'usage') AND has_table_privilege(t.oid,
'select') AND has_table_privilege(l.oid, 'select') FROM pg_namespace n,
pg_class t, pg_class l WHERE n.nspname = 'topology' AND t.relnamespace =
n.oid AND l.relnamespace = n.oid AND t.relname = 'topology' AND l.relname =
'layer'
Jun 14 16:24:10 postgresql-3 postgres[40166]: [14-2] [2023-06-14
16:24:10.545 EDT][648a21ea.9ce6] <maps%physgeo>LOCATION:  exec_simple_query,
postgres.c:1016
Jun 14 16:24:10 postgresql-3 postgres[40166]: [15-1] [2023-06-14
16:24:10.549 EDT][648a21ea.9ce6] <maps%physgeo>LOG:  00000: statement:
SELECT has_table_privilege(c.oid, 'select') AND has_table_privilege(f.oid,
'select') FROM pg_class c, pg_class f, pg_namespace n, pg_extension e WHERE
c.relnamespace = n.oid AND c.relname = 'pointcloud_columns' AND
f.relnamespace = n.oid AND f.relname = 'pointcloud_formats' AND n.oid =
e.extnamespace AND e.extname = 'pointcloud'
Jun 14 16:24:10 postgresql-3 postgres[40166]: [15-2] [2023-06-14
16:24:10.549 EDT][648a21ea.9ce6] <maps%physgeo>LOCATION:  exec_simple_query,
postgres.c:1016
Jun 14 16:24:10 postgresql-3 postgres[40166]: [16-1] [2023-06-14
16:24:10.550 EDT][648a21ea.9ce6] <maps%physgeo>LOG:  00000: statement:
SELECT has_table_privilege(c.oid, 'select') FROM pg_class c, pg_namespace n,
pg_type t WHERE c.relnamespace = n.oid AND n.oid = t.typnamespace AND
c.relname = 'raster_columns' AND t.typname = 'raster'
Jun 14 16:24:10 postgresql-3 postgres[40166]: [16-2] [2023-06-14
16:24:10.550 EDT][648a21ea.9ce6] <maps%physgeo>LOCATION:  exec_simple_query,
postgres.c:1016
Jun 14 16:24:10 postgresql-3 postgres[40166]: [17-1] [2023-06-14
16:24:10.551 EDT][648a21ea.9ce6] <maps%physgeo>LOG:  00000: statement: SET
extra_float_digits=3; SET application_name='QGIS3 desktop'; SET
datestyle='ISO'
Jun 14 16:24:10 postgresql-3 postgres[40166]: [17-2] [2023-06-14
16:24:10.551 EDT][648a21ea.9ce6] <maps%physgeo>LOCATION:  exec_simple_query,
postgres.c:1016
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-1] [2023-06-14
16:24:10.551 EDT][648a21ea.9ce6] <maps%physgeo>LOG:  00000: statement:
SELECT COUNT(*)
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-2]
                                       FROM pg_class t, pg_class i,
pg_namespace ns, pg_index ix, pg_attribute a
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-3]
                                       WHERE
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-4]
                                           t.oid=ix.indrelid
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-5]
                                           AND t.relnamespace=ns.oid
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-6]
                                           AND i.oid=ix.indexrelid
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-7]
                                           AND a.attrelid=t.oid
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-8]
                                           AND a.attnum=ANY(ix.indkey)
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-9]
                                           AND t.relkind IN ('r', 'm')
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-10]
                                            AND ns.nspname='public'
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-11]
                                            AND
t.relname='ne_10m_rivers_lake_centerlines_scale_rank'
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-12]
                                            AND a.attname='geom';
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-13]
                                        
Jun 14 16:24:10 postgresql-3 postgres[40166]: [18-14] [2023-06-14
16:24:10.551 EDT][648a21ea.9ce6] <maps%physgeo>LOCATION:  exec_simple_query,
postgres.c:1016
Jun 14 16:24:10 postgresql-3 postgres[40166]: [19-1] [2023-06-14
16:24:10.555 EDT][648a21ea.9ce6] <maps%physgeo>LOG:  00000: statement:
SELECT oid, typname FROM pg_type WHERE oid IN (20)
Jun 14 16:24:10 postgresql-3 postgres[40166]: [19-2] [2023-06-14
16:24:10.555 EDT][648a21ea.9ce6] <maps%physgeo>LOCATION:  exec_simple_query,
postgres.c:1016

Which was the user application QGIS reconnecting right after the previous
signal 10 bus error crash. 

I have my suspicions that QGIS is doing something funky as I have had no
success making PostgreSQL crash with some scripts opening up several dozen
simultaneous connections and making dozens of SQL queries per connection.
QGIS is my heaviest database user, everything else is fairly tiny and quick
and never causes PostgreSQL to crash like this. However, I don;t really know
how to proceed to prove that it is QGIS or what, exactly, it is doing.

Let me know what logging you want me to turn on and how and I can gather
that data.


В списке pgsql-bugs по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: BUG #17974: Walsenders memory usage suddenly spike to 80G+ causing OOM and server reboot
Следующее
От: David Rowley
Дата:
Сообщение: Re: BUG #17975: Nested Loop Index Scan returning wrong result