Thread: BUG #17636: terminating connection because of crash of another server process

BUG #17636: terminating connection because of crash of another server process

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17636
Logged by:          Sujeet Swaminath
Email address:      sujeet.chaurasia@lisec.com
PostgreSQL version: 12.11
Operating system:   Windows
Description:

We are seeing this issue with PostgreSQL 12.11 on Windows only; on Linux it
works fine.

If the executable that is using the database session crashes, the entire
database goes into recovery mode and restarts. The PostgreSQL log shows the
messages below.

"2022-10-06 17:44:09.210 CEST [8860] LOG:  server process (PID 9980) exited
with exit code 0
2022-10-06 17:44:09.210 CEST [8860] LOG:  terminating any other active
server processes
 
2022-10-06 17:44:09.211 CEST [9992] WARNING:  terminating the connection
because of the crash of another server process
 
2022-10-06 17:44:09.211 CEST [9992] DETAIL:  The postmaster has commanded
this server process to roll back the current transaction and exit because
another server process exited abnormally and possibly corrupted shared
memory.

2022-10-06 17:44:09.211 CEST [9992] HINT:  In a moment you should be able to
reconnect to the database and repeat your command. "
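
For anyone triaging a similar log, here is a minimal Python sketch of how the backend that triggered the restart could be picked out, along with whatever that backend logged under its own PID. It assumes the log line prefix shown in the excerpt above (timestamp followed by the PID in square brackets) and only matches the "exited with exit code" form of the message. The names `find_backend_exits` and `lines_for_pid` are hypothetical helpers, not part of PostgreSQL.

```python
import re
from typing import Iterable, List, NamedTuple

# Pattern derived only from the log excerpt in this report; other
# log_line_prefix settings would need a different expression.
EXIT_LINE = re.compile(
    r"\[(?P<postmaster_pid>\d+)\]\s+LOG:\s+server process \(PID (?P<backend_pid>\d+)\) "
    r"exited with exit code (?P<exit_code>\d+)"
)


class BackendExit(NamedTuple):
    timestamp: str
    postmaster_pid: int
    backend_pid: int
    exit_code: int


def find_backend_exits(lines: Iterable[str]) -> List[BackendExit]:
    """Return one record per 'server process (PID n) exited' line."""
    exits = []
    for line in lines:
        match = EXIT_LINE.search(line)
        if match:
            exits.append(BackendExit(
                # Everything before "[pid]" is treated as the timestamp.
                timestamp=line[:match.start()].strip(),
                postmaster_pid=int(match["postmaster_pid"]),
                backend_pid=int(match["backend_pid"]),
                exit_code=int(match["exit_code"]),
            ))
    return exits


def lines_for_pid(lines: Iterable[str], pid: int) -> List[str]:
    """Return the lines written by the process whose prefix is '[pid]'."""
    tag = f"[{pid}]"
    return [line for line in lines if tag in line]


if __name__ == "__main__":
    import sys

    # Usage: python scan_log.py path/to/postgresql.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        log_lines = handle.read().splitlines()
    for event in find_backend_exits(log_lines):
        print(event)
        for entry in lines_for_pid(log_lines, event.backend_pid):
            print("   ", entry)
```

If `lines_for_pid` returns nothing for the exited backend, the log holds no statement or error from that process, and a stack trace or crash dump is the remaining way to see what it was doing.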


Re: BUG #17636: terminating connection because of crash of another server process

From
Julien Rouhaud
Date:
Hi,

On Wed, Oct 12, 2022 at 11:22:26AM +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference:      17636
> Logged by:          Sujeet Swaminath
> Email address:      sujeet.chaurasia@lisec.com
> PostgreSQL version: 12.11
> Operating system:   Windows
> Description:
>
> We are facing this issue with POSTGRES 12.11, only on windows OS, with Linux
> OS, it works fine,
>
> if the executable that is using the database session crashes, then the
> entire database goes to recovery mode and restarts, in the Postgres log, we
> can find the below messages.
>
> "2022-10-06 17:44:09.210 CEST [8860] LOG:  server process (PID 9980) exited
> with exit code 0
> 2022-10-06 17:44:09.210 CEST [8860] LOG:  terminating any other active
> server processes
>
> 2022-10-06 17:44:09.211 CEST [9992] WARNING:  terminating the connection
> because of the crash of another server process
>
> 2022-10-06 17:44:09.211 CEST [9992] DETAIL:  The postmaster has commanded
> this server process to roll back the current transaction and exit because
> another server process exited abnormally and possibly corrupted shared
> memory.
>
> 2022-10-06 17:44:09.211 CEST [9992] HINT:  In a moment you should be able to
> reconnect to the database and repeat your command. "

This unfortunately isn't enough information to understand what's going on.

First, is the problem still happening if you update to version 12.12?

Also, do you have any extension or modules configured?  You haven't shown the
logs corresponding to the original process problem, including the query it was
executing in case of a normal backend.

Can you manually reproduce the problem, and/or get a stack trace of the
problem?  See
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
(and possibly
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows#Using_crash_dumps_to_debug_random_and_unpredictable_backend_crashes)
for more details on how to do this.



RE: BUG #17636: terminating connection because of crash of another server process

From
Chaurasia Sujeet
Date:
Hi,
 
We've upgraded PostgreSQL to version 13, and no extensions are configured on the server, yet it crashed again. There is no antivirus on the server.

The problem happens randomly and we are not able to reproduce it ourselves. The log shows no query or details about what caused the crash; we only get a message that one postgres.exe process exited with code 0, which makes it difficult to even start troubleshooting.

Following the article you shared, we set up the minidump for random crashes, but we didn't find a dump in the crashdumps directory.

We followed the part of the article quoted below. According to the article, the crash dump handler is already included in PostgreSQL, so any crash dump should be generated in the crashdumps directory.
 
If the crashes appear to be random and you don't know how to trigger them, it's hard to connect a debugger to the problem postgres.exe before it crashes.
Setting your debugger as the JIT (just-in-time) or post-mortem debugger won't help you, because PostgreSQL generally runs as a service under a different user account that cannot interact with the desktop. You could always initdb a new cluster under your normal user account and use pg_ctl to start the postmaster with that cluster manually, so you can JIT debug under your own user account where Pg can interact with the desktop. This isn't suitable for production use, though, and you might not be able to reproduce the problem that way.
In PostgreSQL 9.0 and above there is a crash dump handler included in PostgreSQL. To use it:
  • Create a directory named crashdumps (all lower case) in the PostgreSQL data directory (as shown by SHOW data_directory; in psql)
  • Give the PostgreSQL user (postgres by default) "full control" of it in the security tab of the folder properties
  • Run the problem code. You don't need to restart Pg or change any settings.
  • When a backend crashes, a Windows minidump should be created in the crashdumps directory.
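
The first two steps of the list above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a verified fix for this report: the account name `postgres` is only the default named in the article and may differ from the account the Windows service really runs as, and `prepare_crashdumps_dir` and `list_dump_files` are hypothetical helper names. The permission step uses the standard Windows `icacls` tool and is only executed when explicitly requested on Windows.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def prepare_crashdumps_dir(data_directory: str,
                           account: str = "postgres",
                           apply_acl: bool = False) -> List[str]:
    """Create <data_directory>/crashdumps and build the icacls grant command.

    The data directory itself must already exist (it is the value shown by
    SHOW data_directory; in psql). The account name is an assumption.
    """
    dump_dir = Path(data_directory) / "crashdumps"  # must be all lower case
    dump_dir.mkdir(exist_ok=True)

    # (OI)(CI)F grants full control, inherited by files and subfolders.
    command = ["icacls", str(dump_dir), "/grant", f"{account}:(OI)(CI)F"]
    if apply_acl and sys.platform == "win32":
        subprocess.run(command, check=True)
    return command


def list_dump_files(data_directory: str) -> List[str]:
    """Return the names of any files found in the crashdumps directory."""
    dump_dir = Path(data_directory) / "crashdumps"
    if not dump_dir.is_dir():
        return []
    return sorted(p.name for p in dump_dir.iterdir() if p.is_file())
```

Calling `prepare_crashdumps_dir(data_dir, account=..., apply_acl=True)` from an elevated prompt, and later `list_dump_files(data_dir)` after a crash, covers the directory and permission steps. If the list stays empty after a crash, one thing worth checking is whether the account passed in matches the one the service is configured to use.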
 

Please let us know if any other step is needed to generate a crash dump, as the issue is random and we do not know what is making postgres.exe crash.


Thanks,
Sujeet
 
 
 
-----Original Message-----
From: Julien Rouhaud <rjuju123@gmail.com>
Sent: Monday, October 17, 2022 7:49 AM
To: Chaurasia Sujeet <Sujeet.Chaurasia@lisec.com>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #17636: terminating connection because of crash of another server process
 
Hi,
 
On Wed, Oct 12, 2022 at 11:22:26AM +0000, PG Bug reporting form wrote:
> The following bug has been logged on the website:
>
> Bug reference:      17636
> Logged by:          Sujeet Swaminath
> Email address:      sujeet.chaurasia@lisec.com
> PostgreSQL version: 12.11
> Operating system:   Windows
> Description:
>
> We are facing this issue with POSTGRES 12.11, only on windows OS, with
> Linux OS, it works fine,
>
> if the executable that is using the database session crashes, then the
> entire database goes to recovery mode and restarts, in the Postgres
> log, we can find the below messages.
>
> "2022-10-06 17:44:09.210 CEST [8860] LOG:  server process (PID 9980)
> exited with exit code 0
> 2022-10-06 17:44:09.210 CEST [8860] LOG:  terminating any other active
> server processes
>
> 2022-10-06 17:44:09.211 CEST [9992] WARNING:  terminating the
> connection because of the crash of another server process
>
> 2022-10-06 17:44:09.211 CEST [9992] DETAIL:  The postmaster has
> commanded this server process to roll back the current transaction and
> exit because another server process exited abnormally and possibly
> corrupted shared memory.
>
> 2022-10-06 17:44:09.211 CEST [9992] HINT:  In a moment you should be
> able to reconnect to the database and repeat your command. "
 
This unfortunately isn't enough information to understand what's going on.
 
First, is the problem still happening if you update to version 12.12?
 
Also, do you have any extension or modules configured?  You haven't shown the logs corresponding to the original process problem, including the query it was executing in case of a normal backend.
 
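Can you manually reproduce the problem, and/or get a stack trace of the problem?  See
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows
(and possibly
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows#Using_crash_dumps_to_debug_random_and_unpredictable_backend_crashes)
for more details on how to do this.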
 

Re: BUG #17636: terminating connection because of crash of another server process

From
Julien Rouhaud
Date:
On Wed, Nov 02, 2022 at 07:13:29AM +0000, Chaurasia Sujeet wrote:
>
> We've upgraded postgres to postgres13, and no extension is configured in
> postgresql on the server, Yet it again crashed again, and there is no
> antivirus on the server.
>
> Since the problem happens randomly and we are not able to recreate the
> problem ourselves. and the log shows no query or details as to what caused
> the crash. we just get a message that one PID of postgres.exe exited with
> code 0, which makes it difficult to even troubleshoot.
>
> According to the article you shared, we set up the minidump for the random
> crash, but we didn't find the dump in the crashdumps directory.
>
> We followed the highlighted part below from this article, as the crash dump
> handler is already included in the postgresql as per the article. so any
> crash dump should get generated in the crashdumps directory.
>
>
> If the crashes appear to be random and you don't know how to trigger them,
> it's hard to connect a debugger to the problem postgres.exe before it
> crashes.  Setting your debugger as the JIT (just-in-time) or post-mortem
> debugger won't help you, because PostgreSQL generally runs as a service under
> a different user account that cannot interact with the desktop. You could
> always initdb a new cluster under your normal user account and use pg_ctl to
> start the postmaster with that cluster manually, so you can JIT debug under
> your own user account where Pg can interact with the desktop. This isn't
> suitable for production use, though, and you might not be able to reproduce
> the problem that way.
>
> In PostgreSQL 9.0 and above there is a crash dump handler
> <http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/port/win32/crashdump.c;h=7550fa6f26b82d6fc41f5f68afb35ec44d25d00b;hb=HEAD>
> included in PostgreSQL. To use it:
>
> * Create a directory named crashdumps (all lower case) in the PostgreSQL
>   data directory (as shown by SHOW data_directory; in psql)
> * Give the PostgreSQL user (postgres by default) "full control" of it in the
>   security tab of the folder properties
> * Run the problem code. You don't need to restart Pg or change any settings.
> * When a backend crashes, a Windows minidump should be created in the
>   crashdumps directory.
>
>
> Please help us to know if there is any other step here to generate a crash
> dump as the issue is random and we are not aware of the cause that is making
> postgres.exe crash.

Unfortunately I'm not a Windows user myself, so I have no idea how to generate
a crash dump on that platform.  If the instructions on the wiki don't work, and
since no one else has shown up with an answer, it may be worth asking Microsoft
or a Windows-dedicated forum.