Thread: Postgres Crashing

Postgres Crashing

From

Doug Roberts

Date:

04 February 2020, 00:43:47

Hello,

I'm having an issue where a process in Postgres is crashing and causing the server to go into recovery mode.

I'm getting the following errors in the log.

2020-02-03 14:12:57.473 EST [11992] [0]WARNING: 57P02: terminating connection because of crash of another server process
2020-02-03 14:12:57.473 EST [11992] [0]DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-03 14:12:57.473 EST [11992] [0]HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-03 14:12:57.473 EST [11992] [0]CONTEXT: while locking tuple (4101,2) in relation "containers"
SQL statement "UPDATE containers
SET type_uid = COALESCE(declared_type_uid, type_uid),
carton_type_uid = COALESCE(declared_carton_type_uid, carton_type_uid),
status_uid = COALESCE(declared_status_uid, status_uid),
order_uid = COALESCE(in_order_uid, order_uid),
wave_uid = COALESCE(in_wave_uid, wave_uid),
length = COALESCE(in_length, carton_length, length),
width = COALESCE(in_width, carton_width, width),
height = COALESCE(in_height, carton_height, height),
weight = COALESCE(in_weight, weight),
weight_minimum = COALESCE(in_weight_minimum, weight_minimum),
weight_maximum = COALESCE(in_weight_maximum, weight_maximum),
weight_expected = COALESCE(in_weight_expected, weight_expected),
first_seen_decision_point_id = COALESCE(first_seen_decision_point_id, in_last_seen_decision_point_id),
first_seen_datetime = COALESCE(first_seen_datetime, last_seen_date_time),
last_seen_decision_point_id = COALESCE(in_last_seen_decision_point_id, last_seen_decision_point_id),
last_seen_datetime = COALESCE(last_seen_date_time, last_seen_datetime),
recirculation_count = COALESCE(in_recirculation_count, recirculation_count),
project_flags = COALESCE(in_project_flags, project_flags),
passed_weight_check = COALESCE(in_passed_weight_check, passed_weight_check)
WHERE uid = in_uid"
PL/pgSQL function containers_add_update(integer,integer,integer,integer,integer,integer,double precision,double precision,double precision,double precision,double precision,double precision,double precision,integer,timestamp without time zone,character varying,bigint,boolean) line 60 at SQL statement
2020-02-03 14:12:57.473 EST [11992] [0]LOCATION: quickdie, postgres.c:2717
2020-02-03 14:12:57.473 EST [12260] [0]WARNING: 57P02: terminating connection because of crash of another server process
2020-02-03 14:12:57.473 EST [12260] [0]DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-03 14:12:57.473 EST [12260] [0]HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-03 14:12:57.473 EST [12260] [0]LOCATION: quickdie, postgres.c:2717
2020-02-03 14:12:57.476 EST [24552] [0]WARNING: 57P02: terminating connection because of crash of another server process
2020-02-03 14:12:57.476 EST [24552] [0]DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-03 14:12:57.476 EST [24552] [0]HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-03 14:12:57.476 EST [24552] [0]LOCATION: quickdie, postgres.c:2717
2020-02-03 14:12:57.479 EST [23844] [0]WARNING: 57P02: terminating connection because of crash of another server process
2020-02-03 14:12:57.479 EST [23844] [0]DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-03 14:12:57.479 EST [23844] [0]HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-03 14:12:57.479 EST [23844] [0]LOCATION: quickdie, postgres.c:2717
2020-02-03 14:12:57.586 EST [25992] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:57.586 EST [25992] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:57.587 EST [19428] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:57.587 EST [19428] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:57.627 EST [24968] [0]LOG: 00000: all server processes terminated; reinitializing
2020-02-03 14:12:57.627 EST [24968] [0]LOCATION: PostmasterStateMachine, postmaster.c:3912
2020-02-03 14:12:57.697 EST [16620] [0]LOG: 00000: database system was interrupted; last known up at 2020-02-03 14:12:53 EST
2020-02-03 14:12:57.697 EST [16620] [0]LOCATION: StartupXLOG, xlog.c:6277
2020-02-03 14:12:57.707 EST [13736] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:57.707 EST [13736] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:57.826 EST [21712] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:57.826 EST [21712] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:57.903 EST [6596] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:57.903 EST [6596] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:57.936 EST [20988] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:57.936 EST [20988] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:57.997 EST [16772] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:57.997 EST [16772] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:58.054 EST [11116] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:58.054 EST [11116] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:58.112 EST [24912] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:58.112 EST [24912] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:58.174 EST [25152] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:58.174 EST [25152] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:58.237 EST [3184] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:58.237 EST [3184] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:58.305 EST [22284] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:58.305 EST [22284] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:58.349 EST [18136] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:58.349 EST [18136] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:58.368 EST [13096] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:58.368 EST [13096] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:58.435 EST [20696] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:58.435 EST [20696] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:58.498 EST [13808] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:58.498 EST [13808] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:58.685 EST [16620] [0]LOG: 00000: database system was not properly shut down; automatic recovery in progress
2020-02-03 14:12:58.685 EST [16620] [0]LOCATION: StartupXLOG, xlog.c:6774
2020-02-03 14:12:58.692 EST [16620] [0]LOG: 00000: redo starts at 10/A6064FE8
2020-02-03 14:12:58.692 EST [16620] [0]LOCATION: StartupXLOG, xlog.c:7045
2020-02-03 14:12:58.965 EST [19264] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:58.965 EST [19264] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:12:59.866 EST [23180] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:12:59.866 EST [23180] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:13:01.211 EST [23624] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:13:01.211 EST [23624] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:13:03.160 EST [22964] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:13:03.160 EST [22964] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:13:06.052 EST [17252] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:13:06.052 EST [17252] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:13:10.383 EST [24704] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:13:10.383 EST [24704] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:13:16.831 EST [25028] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:13:16.831 EST [25028] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:13:26.488 EST [14852] [0]FATAL: 57P03: the database system is in recovery mode
2020-02-03 14:13:26.488 EST [14852] [0]LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-03 14:13:32.772 EST [16620] [0]LOG: 00000: invalid record length at 10/C9B33770: wanted 24, got 0
2020-02-03 14:13:32.772 EST [16620] [0]LOCATION: ReadRecord, xlog.c:4284
2020-02-03 14:13:32.772 EST [16620] [0]LOG: 00000: redo done at 10/C9B33730
2020-02-03 14:13:32.772 EST [16620] [0]LOCATION: StartupXLOG, xlog.c:7307
2020-02-03 14:13:34.152 EST [24968] [0]LOG: 00000: database system is ready to accept connections

This happened when I was using a function to remove part of a comma-delimited string while updating a row. The update could potentially touch every row in the table. The issue above occurred when a different update function was being executed on the same table at the same time.

If I take the following lock, the issue seems to be resolved. However, I'm not sure why the above issue occurred.

LOCK TABLE containers IN SHARE ROW EXCLUSIVE MODE;

Does anyone have any ideas?

Thanks,

Doug

Re: Postgres Crashing

From

Adrian Klaver

Date:

04 February 2020, 00:49:06

On 2/3/20 1:43 PM, Doug Roberts wrote:
> Hello,
> 
> I'm having an issue where a process in Postgres is crashing and cause 
> the server to go into recovery mode.
> 
> I'm getting the following errors in the log.
> 
> 2020-02-03 14:12:57.473 EST [11992] [0]WARNING:  57P02: terminating 
> connection because of crash of another server process
> 2020-02-03 14:12:57.473 EST [11992] [0]DETAIL:  The postmaster has 
> commanded this server process to roll back the current transaction and 
> exit, because another server process exited abnormally and possibly 
> corrupted shared memory.
> 2020-02-03 14:12:57.473 EST [11992] [0]HINT:  In a moment you should be 
> able to reconnect to the database and repeat your command.
> 2020-02-03 14:12:57.473 EST [11992] [0]CONTEXT:  while locking tuple 
> (4101,2) in relation "containers"
> SQL statement "UPDATE containers
>             SET type_uid = COALESCE(declared_type_uid, type_uid),
>                 carton_type_uid = COALESCE(declared_carton_type_uid, 
> carton_type_uid),
>                 status_uid = COALESCE(declared_status_uid, status_uid),
>                 order_uid = COALESCE(in_order_uid, order_uid),
>                 wave_uid = COALESCE(in_wave_uid, wave_uid),
>                 length = COALESCE(in_length, carton_length, length),
>                 width = COALESCE(in_width, carton_width, width),
>                 height = COALESCE(in_height, carton_height, height),
>                 weight = COALESCE(in_weight, weight),
>                 weight_minimum = COALESCE(in_weight_minimum, 
> weight_minimum),
>                 weight_maximum = COALESCE(in_weight_maximum, 
> weight_maximum),
>                 weight_expected = COALESCE(in_weight_expected, 
> weight_expected),
>                 first_seen_decision_point_id = 
> COALESCE(first_seen_decision_point_id, in_last_seen_decision_point_id),
>                 first_seen_datetime = COALESCE(first_seen_datetime, 
> last_seen_date_time),
>                 last_seen_decision_point_id = 
> COALESCE(in_last_seen_decision_point_id, last_seen_decision_point_id),
>                 last_seen_datetime = COALESCE(last_seen_date_time, 
> last_seen_datetime),
>                 recirculation_count = COALESCE(in_recirculation_count, 
> recirculation_count),
>                 project_flags = COALESCE(in_project_flags, project_flags),
>                 passed_weight_check = COALESCE(in_passed_weight_check, 
> passed_weight_check)
>             WHERE uid = in_uid"
> PL/pgSQL function 
> containers_add_update(integer,integer,integer,integer,integer,integer,double 
> precision,double precision,double precision,double precision,double 
> precision,double precision,double precision,integer,timestamp without 
> time zone,character varying,bigint,boolean) line 60 at SQL statement

> 
> This happened when I was using a function to remove part of a comma 
> delimited string while updating a row. The update could potentially 
> touch every row in the table. The issue above occurred when a different 
> update function was being executed on the same table.

The full content of containers_add_update() would be helpful, as well as 
the content of the other function. If that is not possible, can you give 
some idea of the order in which they were run, and where the LOCK TABLE 
below was inserted?

> 
> If I use the following lock this issue seems to be resolved. However, 
> I'm not sure why the above issue occurred.
> 
> LOCK TABLE containers IN SHARE ROW EXCLUSIVE MODE;
> 
> Does anyone have any ideas?
> 
> Thanks,
> 
> Doug


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: Postgres Crashing

From

Tom Lane

Date:

04 February 2020, 01:28:13

Doug Roberts <h205881@gmail.com> writes:
> I'm having an issue where a process in Postgres is crashing and cause the
> server to go into recovery mode.

Can you reduce this to a self-contained test case for others to try?

If not, you'll have to do the initial investigation yourself.
A stack trace from the crash would be pretty helpful:

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

We could also use all the standard details suggested in

https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

most notably, exactly which PG version is this?

> I'm getting the following errors in the log.

Unfortunately, this is pretty useless, since you only quoted the part
of the log after the problem was detected.

            regards, tom lane
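
The part of the log that matters here is the postmaster's own report of the backend that died, plus whatever was logged just before it. As a rough sketch (the helper name and the window sizes are illustrative assumptions, not an existing tool), the relevant slice can be pulled out of a log file like this:

```python
import re

# The postmaster reports the backend that actually died with a line such as
# "server process (PID 12168) was terminated by exception 0xC0000005".
CRASH_REPORT = re.compile(r"server process \(PID (\d+)\) was terminated by")


def lines_around_crash(log_lines, before=50, after=5):
    """Return the log lines surrounding the first crash report, or [] if none.

    `before` and `after` are arbitrary window sizes chosen for illustration.
    """
    for i, line in enumerate(log_lines):
        if CRASH_REPORT.search(line):
            start = max(0, i - before)
            return log_lines[start:i + after + 1]
    return []
```

Feeding it the lines of the server log file returns the crash report together with the entries leading up to it, which is the part missing from the excerpt quoted above.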

Re: Postgres Crashing

From

Adrian Klaver

Date:

04 February 2020, 01:34:46

Please reply to list also.
Ccing list.

On 2/3/20 2:18 PM, Doug Roberts wrote:
> Adrian,
> 
> Here is what the reset recirc function is doing.
> 
> CREATE OR REPLACE FUNCTION containers_reset_recirc
> (
>      in_uid INTEGER
> )
> RETURNS INTEGER
> AS $BODY$
>      DECLARE regex VARCHAR(50);
> BEGIN
>      SELECT concat(',*', in_uid, '=\d+,*') INTO regex;
> 
>      LOCK TABLE containers IN SHARE ROW EXCLUSIVE MODE;
> 
>      UPDATE containers
>          SET recirculation_count =
>              case
>                  when substring(recirculation_count, regex) like ',%,' then
>                      regexp_replace(recirculation_count, regex, ',')
>                  else
>                      regexp_replace(recirculation_count, regex, '')
>                  end;
> 
>      RETURN in_uid;
> END;
> $BODY$ LANGUAGE plpgsql;
> 
> Containers add/update is basically updating a specific container using 
> the values that were passed to the function.

So how did containers_reset_recirc() come to clash with 
containers_add_update()?

> 
> UPDATE containers
>      SET type_uid = COALESCE(declared_type_uid, type_uid),
>          carton_type_uid = COALESCE(declared_carton_type_uid, 
> carton_type_uid),
>          status_uid = COALESCE(declared_status_uid, status_uid),
>          order_uid = COALESCE(in_order_uid, order_uid),
>          wave_uid = COALESCE(in_wave_uid, wave_uid),
>          length = COALESCE(in_length, carton_length, length),
>          width = COALESCE(in_width, carton_width, width),
>          height = COALESCE(in_height, carton_height, height),
>          weight = COALESCE(in_weight, weight),
>          weight_minimum = COALESCE(in_weight_minimum, weight_minimum),
>          weight_maximum = COALESCE(in_weight_maximum, weight_maximum),
>          weight_expected = COALESCE(in_weight_expected, weight_expected),
>          first_seen_decision_point_id = 
> COALESCE(first_seen_decision_point_id, in_last_seen_decision_point_id),
>          first_seen_datetime = COALESCE(first_seen_datetime, 
> last_seen_date_time),
>          last_seen_decision_point_id = 
> COALESCE(in_last_seen_decision_point_id, last_seen_decision_point_id),
>          last_seen_datetime = COALESCE(last_seen_date_time, 
> last_seen_datetime),
>          recirculation_count = COALESCE(in_recirculation_count, 
> recirculation_count),
>          project_flags = COALESCE(in_project_flags, project_flags),
>          passed_weight_check = COALESCE(in_passed_weight_check, 
> passed_weight_check)
>      WHERE uid = in_uid
> 
> Thanks,
> 
> Doug
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: Postgres Crashing

From

Tom Lane

Date:

04 February 2020, 02:21:46

Adrian Klaver <adrian.klaver@aklaver.com> writes:
> Please reply to list also.

> On 2/3/20 2:18 PM, Doug Roberts wrote:
>> Here is what the reset recirc function is doing.
>> ...
>>     UPDATE containers
>> ...

> So how did containers_reset_recirc() come to clash with 
> containers_add_update()?

If this is PG 12.0 or 12.1, a likely theory is that this is an
EvalPlanQual bug (which'd be triggered during concurrent updates
of the same row in the table, so that squares with the observation
that locking the table prevents it).  The known bugs in that area
require either before-row-update triggers on the table, or
child tables (either partitioning or traditional inheritance).
So I wonder what the schema of table "containers" looks like.

Or you could have hit some new bug ... but there's not enough
info here to diagnose.

            regards, tom lane
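
Why the table lock hides the problem can be illustrated with a small excerpt of PostgreSQL's table-level lock conflict rules. This is only a sketch: the dictionary covers three lock modes rather than the full matrix, and the function name is made up for the example. An UPDATE takes ROW EXCLUSIVE on its target table, and that mode conflicts with SHARE ROW EXCLUSIVE, so the two functions can no longer update rows of "containers" concurrently.

```python
# Excerpt of the table-level lock conflict matrix, limited to the modes
# relevant to this thread. Keys are held modes, values are the modes that
# cannot be granted to another transaction while the key is held.
CONFLICTS = {
    "ACCESS SHARE": {"ACCESS EXCLUSIVE"},  # taken by plain SELECT
    "ROW EXCLUSIVE": {  # taken by UPDATE / DELETE / INSERT
        "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE",
    },
    "SHARE ROW EXCLUSIVE": {  # taken by the explicit LOCK TABLE in the thread
        "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE",
        "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE",
    },
}


def conflicts(held, requested):
    """True if `requested` must wait while another transaction holds `held`."""
    return requested in CONFLICTS[held]


# Two plain UPDATEs do not block each other at the table level ...
print(conflicts("ROW EXCLUSIVE", "ROW EXCLUSIVE"))
# ... but an UPDATE must wait for a holder of SHARE ROW EXCLUSIVE.
print(conflicts("SHARE ROW EXCLUSIVE", "ROW EXCLUSIVE"))
```

With the lock in place, concurrent updates of the same row cannot happen, so the EvalPlanQual code path is never entered. That fits the theory above, but it works around the crash rather than explaining it.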

Re: Postgres Crashing

From

Doug Roberts

Date:

04 February 2020, 17:20:04

> So how did containers_reset_recirc() come to clash with
> containers_add_update()?

They are clashing because another portion of our system is running and updating containers. The reset recirc function was run at the same time to see how our system and the database would handle it.

The recirc string is formatted like 2000=3,1000=6,5000=0. So the reset recirc function will take a UID (1000 for example) and use that to remove 1000=x from the recirc counts of all containers that have 1000=x.
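
To make the string handling concrete, here is a rough Python emulation of the CASE expression from containers_reset_recirc, applied to a single value. It is a sketch only: it uses Python's `re` module rather than PostgreSQL's regex engine, the function name `reset_recirc` is made up for the example, and NULL handling is left out.

```python
import re


def reset_recirc(recirculation_count, uid):
    """Remove the 'uid=N' entry from a comma-delimited recirc string.

    Mirrors the SQL logic: if the matched text starts and ends with a comma
    (LIKE ',%,'), the entry sat in the middle and one comma is kept;
    otherwise the match is removed entirely.
    """
    pattern = re.compile(rf",*{uid}=\d+,*")
    match = pattern.search(recirculation_count)
    if match is None:
        # regexp_replace leaves a non-matching string unchanged.
        return recirculation_count
    matched = match.group(0)
    in_middle = (
        len(matched) >= 2 and matched.startswith(",") and matched.endswith(",")
    )
    replacement = "," if in_middle else ""
    # Without the 'g' flag, regexp_replace replaces only the first match.
    return pattern.sub(replacement, recirculation_count, count=1)


print(reset_recirc("2000=3,1000=6,5000=0", 1000))  # 2000=3,5000=0
print(reset_recirc("1000=6,5000=0", 1000))         # 5000=0
print(reset_recirc("2000=3,1000=6", 1000))         # 2000=3
```

One thing the emulation makes easy to see is that the pattern is not anchored to the start of a key, so in this sketch a UID of 1000 also matches inside an entry such as 21000=3. Whether that can occur depends on the real UID values.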

We are currently using PG 12.0.

Thanks,

Doug

Re: Postgres Crashing

From

Doug Roberts

Date:

04 February 2020, 19:06:50

Hello,

Here is the server log from just before and after the crash.

Thanks,

Doug

2020-02-04 10:26:16.841 EST [20788] [0] LOG: 00000: server process (PID 12168) was terminated by exception 0xC0000005
2020-02-04 10:26:16.841 EST [20788] [0] DETAIL: Failed process was running: select CONTAINERS_RESET_RECIRC_BY_DP(3000)
2020-02-04 10:26:16.841 EST [20788] [0] HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
2020-02-04 10:26:16.841 EST [20788] [0] LOCATION: LogChildExit, postmaster.c:3670
2020-02-04 10:26:16.841 EST [20788] [0] LOG: 00000: terminating any other active server processes
2020-02-04 10:26:16.841 EST [20788] [0] LOCATION: HandleChildCrash, postmaster.c:3400
2020-02-04 10:26:16.873 EST [1212] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.873 EST [1212] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.873 EST [1212] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.873 EST [1212] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.873 EST [19436] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.873 EST [19436] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.873 EST [19436] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.873 EST [19436] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.874 EST [13428] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.874 EST [13428] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.874 EST [13428] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.874 EST [13428] [0] CONTEXT: while locking tuple (0,115) in relation "containers"
SQL statement "UPDATE containers
SET type_uid = COALESCE(declared_type_uid, type_uid),
carton_type_uid = COALESCE(declared_carton_type_uid, carton_type_uid),
status_uid = COALESCE(declared_status_uid, status_uid),
order_uid = COALESCE(in_order_uid, order_uid),
wave_uid = COALESCE(in_wave_uid, wave_uid),
length = COALESCE(in_length, carton_length, length),
width = COALESCE(in_width, carton_width, width),
height = COALESCE(in_height, carton_height, height),
weight = COALESCE(in_weight, weight),
weight_minimum = COALESCE(in_weight_minimum, weight_minimum),
weight_maximum = COALESCE(in_weight_maximum, weight_maximum),
weight_expected = COALESCE(in_weight_expected, weight_expected),
first_seen_DP_id = COALESCE(first_seen_DP_id, in_last_seen_DP_id),
first_seen_datetime = COALESCE(first_seen_datetime, last_seen_date_time),
last_seen_DP_id = COALESCE(in_last_seen_DP_id, last_seen_DP_id),
last_seen_datetime = COALESCE(last_seen_date_time, last_seen_datetime),
recirculation_count = COALESCE(in_recirculation_count, recirculation_count),
project_flags = COALESCE(in_project_flags, project_flags),
passed_weight_check = COALESCE(in_passed_weight_check, passed_weight_check)
WHERE uid = in_uid"
PL/pgSQL function containers_add_update(integer,integer,integer,integer,integer,integer,double precision,double precision,double precision,double precision,double precision,double precision,double precision,integer,timestamp without time zone,character varying,bigint,boolean) line 60 at SQL statement
2020-02-04 10:26:16.874 EST [13428] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.874 EST [25916] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.874 EST [25916] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.874 EST [25916] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.874 EST [25916] [0] CONTEXT: while locking tuple (1,91) in relation "containers"
SQL statement "UPDATE containers
SET type_uid = COALESCE(declared_type_uid, type_uid),
carton_type_uid = COALESCE(declared_carton_type_uid, carton_type_uid),
status_uid = COALESCE(declared_status_uid, status_uid),
order_uid = COALESCE(in_order_uid, order_uid),
wave_uid = COALESCE(in_wave_uid, wave_uid),
length = COALESCE(in_length, carton_length, length),
width = COALESCE(in_width, carton_width, width),
height = COALESCE(in_height, carton_height, height),
weight = COALESCE(in_weight, weight),
weight_minimum = COALESCE(in_weight_minimum, weight_minimum),
weight_maximum = COALESCE(in_weight_maximum, weight_maximum),
weight_expected = COALESCE(in_weight_expected, weight_expected),
first_seen_DP_id = COALESCE(first_seen_DP_id, in_last_seen_DP_id),
first_seen_datetime = COALESCE(first_seen_datetime, last_seen_date_time),
last_seen_DP_id = COALESCE(in_last_seen_DP_id, last_seen_DP_id),
last_seen_datetime = COALESCE(last_seen_date_time, last_seen_datetime),
recirculation_count = COALESCE(in_recirculation_count, recirculation_count),
project_flags = COALESCE(in_project_flags, project_flags),
passed_weight_check = COALESCE(in_passed_weight_check, passed_weight_check)
WHERE uid = in_uid"
PL/pgSQL function containers_add_update(integer,integer,integer,integer,integer,integer,double precision,double precision,double precision,double precision,double precision,double precision,double precision,integer,timestamp without time zone,character varying,bigint,boolean) line 60 at SQL statement
2020-02-04 10:26:16.874 EST [25916] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.875 EST [2512] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.875 EST [2512] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.875 EST [2512] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.875 EST [2512] [0] CONTEXT: while locking tuple (0,111) in relation "containers"
SQL statement "UPDATE containers
SET type_uid = COALESCE(declared_type_uid, type_uid),
carton_type_uid = COALESCE(declared_carton_type_uid, carton_type_uid),
status_uid = COALESCE(declared_status_uid, status_uid),
order_uid = COALESCE(in_order_uid, order_uid),
wave_uid = COALESCE(in_wave_uid, wave_uid),
length = COALESCE(in_length, carton_length, length),
width = COALESCE(in_width, carton_width, width),
height = COALESCE(in_height, carton_height, height),
weight = COALESCE(in_weight, weight),
weight_minimum = COALESCE(in_weight_minimum, weight_minimum),
weight_maximum = COALESCE(in_weight_maximum, weight_maximum),
weight_expected = COALESCE(in_weight_expected, weight_expected),
first_seen_DP_id = COALESCE(first_seen_DP_id, in_last_seen_DP_id),
first_seen_datetime = COALESCE(first_seen_datetime, last_seen_date_time),
last_seen_DP_id = COALESCE(in_last_seen_DP_id, last_seen_DP_id),
last_seen_datetime = COALESCE(last_seen_date_time, last_seen_datetime),
recirculation_count = COALESCE(in_recirculation_count, recirculation_count),
project_flags = COALESCE(in_project_flags, project_flags),
passed_weight_check = COALESCE(in_passed_weight_check, passed_weight_check)
WHERE uid = in_uid"
PL/pgSQL function containers_add_update(integer,integer,integer,integer,integer,integer,double precision,double precision,double precision,double precision,double precision,double precision,double precision,integer,timestamp without time zone,character varying,bigint,boolean) line 60 at SQL statement
2020-02-04 10:26:16.875 EST [2512] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.879 EST [14908] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.879 EST [14908] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.879 EST [14908] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.879 EST [14908] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.880 EST [7092] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.880 EST [7092] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.880 EST [7092] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.880 EST [7092] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.975 EST [14360] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:16.975 EST [14360] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.033 EST [20788] [0] LOG: 00000: all server processes terminated; reinitializing
2020-02-04 10:26:17.033 EST [20788] [0] LOCATION: PostmasterStateMachine, postmaster.c:3912
2020-02-04 10:26:17.105 EST [20964] [0] LOG: 00000: database system was interrupted; last known up at 2020-02-04 10:26:09 EST
2020-02-04 10:26:17.105 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:6277
2020-02-04 10:26:17.115 EST [1668] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.115 EST [1668] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.179 EST [25800] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.179 EST [25800] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.301 EST [14700] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.301 EST [14700] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.309 EST [19060] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.309 EST [19060] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.378 EST [24772] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.378 EST [24772] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.434 EST [12972] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.434 EST [12972] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.492 EST [11208] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.492 EST [11208] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.548 EST [13236] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.548 EST [13236] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.607 EST [25756] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.607 EST [25756] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.677 EST [12944] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.677 EST [12944] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.737 EST [14712] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.737 EST [14712] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:18.104 EST [20964] [0] LOG: 00000: database system was not properly shut down; automatic recovery in progress
2020-02-04 10:26:18.104 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:6774
2020-02-04 10:26:18.109 EST [20964] [0] LOG: 00000: redo starts at 14/52009F08
2020-02-04 10:26:18.109 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:7045
2020-02-04 10:26:18.349 EST [23064] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:18.349 EST [23064] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:19.248 EST [8816] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:19.248 EST [8816] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:20.560 EST [18200] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:20.560 EST [18200] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:22.508 EST [23204] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:22.508 EST [23204] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:25.402 EST [5888] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:25.402 EST [5888] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:29.714 EST [16820] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:29.714 EST [16820] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:36.161 EST [24072] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:36.161 EST [24072] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:45.806 EST [22000] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:45.806 EST [22000] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:55.687 EST [20964] [0] LOG: 00000: redo done at 14/79A030E0
2020-02-04 10:26:55.687 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:7307
2020-02-04 10:26:55.861 EST [16700] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:55.861 EST [16700] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:57.016 EST [20788] [0] LOG: 00000: database system is ready to accept connections

On Tue, Feb 4, 2020 at 10:50 AM Doug Roberts <h205881@gmail.com> wrote:

Here is a stacktrace with what happened before and after the crash.

2020-02-04 10:26:16.841 EST [20788] [0] LOG: 00000: server process (PID 12168) was terminated by exception 0xC0000005
2020-02-04 10:26:16.841 EST [20788] [0] DETAIL: Failed process was running: select CONTAINERS_RESET_RECIRC_BY_DP(3000)
2020-02-04 10:26:16.841 EST [20788] [0] HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
2020-02-04 10:26:16.841 EST [20788] [0] LOCATION: LogChildExit, postmaster.c:3670
2020-02-04 10:26:16.841 EST [20788] [0] LOG: 00000: terminating any other active server processes
2020-02-04 10:26:16.841 EST [20788] [0] LOCATION: HandleChildCrash, postmaster.c:3400
2020-02-04 10:26:16.873 EST [1212] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.873 EST [1212] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.873 EST [1212] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.873 EST [1212] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.873 EST [19436] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.873 EST [19436] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.873 EST [19436] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.873 EST [19436] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.874 EST [13428] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.874 EST [13428] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.874 EST [13428] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.874 EST [13428] [0] CONTEXT: while locking tuple (0,115) in relation "containers"
SQL statement "UPDATE containers
SET type_uid = COALESCE(declared_type_uid, type_uid),
carton_type_uid = COALESCE(declared_carton_type_uid, carton_type_uid),
status_uid = COALESCE(declared_status_uid, status_uid),
order_uid = COALESCE(in_order_uid, order_uid),
wave_uid = COALESCE(in_wave_uid, wave_uid),
length = COALESCE(in_length, carton_length, length),
width = COALESCE(in_width, carton_width, width),
height = COALESCE(in_height, carton_height, height),
weight = COALESCE(in_weight, weight),
weight_minimum = COALESCE(in_weight_minimum, weight_minimum),
weight_maximum = COALESCE(in_weight_maximum, weight_maximum),
weight_expected = COALESCE(in_weight_expected, weight_expected),
first_seen_DP_id = COALESCE(first_seen_DP_id, in_last_seen_DP_id),
first_seen_datetime = COALESCE(first_seen_datetime, last_seen_date_time),
last_seen_DP_id = COALESCE(in_last_seen_DP_id, last_seen_DP_id),
last_seen_datetime = COALESCE(last_seen_date_time, last_seen_datetime),
recirculation_count = COALESCE(in_recirculation_count, recirculation_count),
project_flags = COALESCE(in_project_flags, project_flags),
passed_weight_check = COALESCE(in_passed_weight_check, passed_weight_check)
WHERE uid = in_uid"
PL/pgSQL function containers_add_update(integer,integer,integer,integer,integer,integer,double precision,double precision,double precision,double precision,double precision,double precision,double precision,integer,timestamp without time zone,character varying,bigint,boolean) line 60 at SQL statement
2020-02-04 10:26:16.874 EST [13428] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.874 EST [25916] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.874 EST [25916] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.874 EST [25916] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.874 EST [25916] [0] CONTEXT: while locking tuple (1,91) in relation "containers"
SQL statement "UPDATE containers
SET type_uid = COALESCE(declared_type_uid, type_uid),
carton_type_uid = COALESCE(declared_carton_type_uid, carton_type_uid),
status_uid = COALESCE(declared_status_uid, status_uid),
order_uid = COALESCE(in_order_uid, order_uid),
wave_uid = COALESCE(in_wave_uid, wave_uid),
length = COALESCE(in_length, carton_length, length),
width = COALESCE(in_width, carton_width, width),
height = COALESCE(in_height, carton_height, height),
weight = COALESCE(in_weight, weight),
weight_minimum = COALESCE(in_weight_minimum, weight_minimum),
weight_maximum = COALESCE(in_weight_maximum, weight_maximum),
weight_expected = COALESCE(in_weight_expected, weight_expected),
first_seen_DP_id = COALESCE(first_seen_DP_id, in_last_seen_DP_id),
first_seen_datetime = COALESCE(first_seen_datetime, last_seen_date_time),
last_seen_DP_id = COALESCE(in_last_seen_DP_id, last_seen_DP_id),
last_seen_datetime = COALESCE(last_seen_date_time, last_seen_datetime),
recirculation_count = COALESCE(in_recirculation_count, recirculation_count),
project_flags = COALESCE(in_project_flags, project_flags),
passed_weight_check = COALESCE(in_passed_weight_check, passed_weight_check)
WHERE uid = in_uid"
PL/pgSQL function containers_add_update(integer,integer,integer,integer,integer,integer,double precision,double precision,double precision,double precision,double precision,double precision,double precision,integer,timestamp without time zone,character varying,bigint,boolean) line 60 at SQL statement
2020-02-04 10:26:16.874 EST [25916] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.875 EST [2512] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.875 EST [2512] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.875 EST [2512] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.875 EST [2512] [0] CONTEXT: while locking tuple (0,111) in relation "containers"
SQL statement "UPDATE containers
SET type_uid = COALESCE(declared_type_uid, type_uid),
carton_type_uid = COALESCE(declared_carton_type_uid, carton_type_uid),
status_uid = COALESCE(declared_status_uid, status_uid),
order_uid = COALESCE(in_order_uid, order_uid),
wave_uid = COALESCE(in_wave_uid, wave_uid),
length = COALESCE(in_length, carton_length, length),
width = COALESCE(in_width, carton_width, width),
height = COALESCE(in_height, carton_height, height),
weight = COALESCE(in_weight, weight),
weight_minimum = COALESCE(in_weight_minimum, weight_minimum),
weight_maximum = COALESCE(in_weight_maximum, weight_maximum),
weight_expected = COALESCE(in_weight_expected, weight_expected),
first_seen_DP_id = COALESCE(first_seen_DP_id, in_last_seen_DP_id),
first_seen_datetime = COALESCE(first_seen_datetime, last_seen_date_time),
last_seen_DP_id = COALESCE(in_last_seen_DP_id, last_seen_DP_id),
last_seen_datetime = COALESCE(last_seen_date_time, last_seen_datetime),
recirculation_count = COALESCE(in_recirculation_count, recirculation_count),
project_flags = COALESCE(in_project_flags, project_flags),
passed_weight_check = COALESCE(in_passed_weight_check, passed_weight_check)
WHERE uid = in_uid"
PL/pgSQL function containers_add_update(integer,integer,integer,integer,integer,integer,double precision,double precision,double precision,double precision,double precision,double precision,double precision,integer,timestamp without time zone,character varying,bigint,boolean) line 60 at SQL statement
2020-02-04 10:26:16.875 EST [2512] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.879 EST [14908] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.879 EST [14908] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.879 EST [14908] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.879 EST [14908] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.880 EST [7092] [0] WARNING: 57P02: terminating connection because of crash of another server process
2020-02-04 10:26:16.880 EST [7092] [0] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-02-04 10:26:16.880 EST [7092] [0] HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-02-04 10:26:16.880 EST [7092] [0] LOCATION: quickdie, postgres.c:2717
2020-02-04 10:26:16.975 EST [14360] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:16.975 EST [14360] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.033 EST [20788] [0] LOG: 00000: all server processes terminated; reinitializing
2020-02-04 10:26:17.033 EST [20788] [0] LOCATION: PostmasterStateMachine, postmaster.c:3912
2020-02-04 10:26:17.105 EST [20964] [0] LOG: 00000: database system was interrupted; last known up at 2020-02-04 10:26:09 EST
2020-02-04 10:26:17.105 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:6277
2020-02-04 10:26:17.115 EST [1668] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.115 EST [1668] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.179 EST [25800] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.179 EST [25800] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.301 EST [14700] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.301 EST [14700] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.309 EST [19060] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.309 EST [19060] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.378 EST [24772] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.378 EST [24772] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.434 EST [12972] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.434 EST [12972] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.492 EST [11208] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.492 EST [11208] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.548 EST [13236] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.548 EST [13236] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.607 EST [25756] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.607 EST [25756] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.677 EST [12944] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.677 EST [12944] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:17.737 EST [14712] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:17.737 EST [14712] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:18.104 EST [20964] [0] LOG: 00000: database system was not properly shut down; automatic recovery in progress
2020-02-04 10:26:18.104 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:6774
2020-02-04 10:26:18.109 EST [20964] [0] LOG: 00000: redo starts at 14/52009F08
2020-02-04 10:26:18.109 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:7045
2020-02-04 10:26:18.349 EST [23064] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:18.349 EST [23064] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:19.248 EST [8816] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:19.248 EST [8816] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:20.560 EST [18200] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:20.560 EST [18200] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:22.508 EST [23204] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:22.508 EST [23204] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:25.402 EST [5888] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:25.402 EST [5888] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:29.714 EST [16820] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:29.714 EST [16820] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:36.161 EST [24072] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:36.161 EST [24072] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:45.806 EST [22000] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:45.806 EST [22000] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:55.687 EST [20964] [0] LOG: 00000: redo done at 14/79A030E0
2020-02-04 10:26:55.687 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:7307
2020-02-04 10:26:55.861 EST [16700] [0] FATAL: 57P03: the database system is in recovery mode
2020-02-04 10:26:55.861 EST [16700] [0] LOCATION: ProcessStartupPacket, postmaster.c:2275
2020-02-04 10:26:57.016 EST [20788] [0] LOG: 00000: database system is ready to accept connections

On Tue, Feb 4, 2020 at 9:20 AM Doug Roberts <h205881@gmail.com> wrote:
> So how did containers_reset_recirc() come to clash with
> containers_add_update()?

They are clashing because another portion of our system is running and updating containers. The reset recirc function was run at the same time to see how our system and the database would handle it.

The recirc string is formatted like 2000=3,1000=6,5000=0. The reset recirc function takes a UID (1000, for example) and removes the 1000=x entry from the recirc count of every container that has one.
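
As a rough illustration of the string manipulation described above, here is a minimal Python sketch. It is not the actual reset recirc function, whose body is not shown in this thread; the helper name `remove_recirc_entry` and the assumption that the string is a comma-separated list of `id=count` pairs without spaces are hypothetical.

```python
def remove_recirc_entry(recirc: str, dp_id: int) -> str:
    """Drop the `dp_id=count` pair from a recirc string.

    Hypothetical helper: assumes comma-separated `id=count` pairs
    with no spaces, e.g. '2000=3,1000=6,5000=0'.
    """
    if not recirc:
        return recirc
    kept = [
        pair
        for pair in recirc.split(",")
        # Compare the whole key so that 1000 does not also match 10000.
        if pair.split("=", 1)[0] != str(dp_id)
    ]
    return ",".join(kept)


print(remove_recirc_entry("2000=3,1000=6,5000=0", 1000))  # 2000=3,5000=0
```

Comparing the full key rather than searching for a substring avoids removing entries whose ID merely starts with the same digits.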

We are currently using PG 12.0.

Thanks,

Doug

On Mon, Feb 3, 2020 at 6:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Adrian Klaver <adrian.klaver@aklaver.com> writes:
> Please reply to list also.

> On 2/3/20 2:18 PM, Doug Roberts wrote:
>> Here is what the reset recirc function is doing.
>> ...
>> UPDATE containers
>> ...

> So how did containers_reset_recirc() come to clash with
> containers_add_update()?

If this is PG 12.0 or 12.1, a likely theory is that this is an
EvalPlanQual bug (which'd be triggered during concurrent updates
of the same row in the table, so that squares with the observation
that locking the table prevents it). The known bugs in that area
require either before-row-update triggers on the table, or
child tables (either partitioning or traditional inheritance).
So I wonder what the schema of table "containers" looks like.

Or you could have hit some new bug ... but there's not enough
info here to diagnose.

regards, tom lane
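
The version condition in the theory above can be checked mechanically. A minimal sketch, assuming the value of `server_version_num` (as reported by `SHOW server_version_num`) is passed in as an integer; the function name is hypothetical. A match only means the theory is applicable, not that it is confirmed.

```python
def matches_suspect_versions(server_version_num: int) -> bool:
    """Return True for PostgreSQL 12.0 or 12.1, the releases named above.

    Since PostgreSQL 10, server_version_num is major * 10000 + minor,
    so 12.0 is 120000 and 12.1 is 120001.
    """
    major, minor = divmod(server_version_num, 10000)
    return major == 12 and minor in (0, 1)


print(matches_suspect_versions(120000))  # True  (12.0)
print(matches_suspect_versions(120002))  # False (12.2)
```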

Re: Postgres Crashing

From

Adrian Klaver

Date:

February 4, 2020, 19:18:13

On 2/4/20 8:06 AM, Doug Roberts wrote:
> Hello,
> 
> Here is a stacktrace of what happened before and after the crash.

Actually, what follows is the Postgres log, not a stack trace. Per Tom's
previous post, the procedure for getting a stack trace can be found here:

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
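
While the log is not a stack trace, it does name the backend that crashed and the statement it was running. The following is a minimal Python sketch for pulling those two facts out of an unwrapped server log. The function name `find_crash` and the regular expressions are assumptions based only on the two log lines shown in this thread, and they will not match log text that has been re-wrapped by a mail client. On Windows, exception 0xC0000005 is an access violation.

```python
import re

# Postmaster's crash report line, as it appears in the log in this thread.
CRASH_RE = re.compile(
    r"server process \(PID (?P<pid>\d+)\) was terminated by exception "
    r"(?P<code>0x[0-9A-Fa-f]+)"
)
# Accompanying DETAIL line naming the statement that was running.
QUERY_RE = re.compile(r"DETAIL:\s+Failed process was running: (?P<query>.*)")


def find_crash(log_text: str):
    """Return (pid, exception_code, query) for the first crash report, or None."""
    crash = CRASH_RE.search(log_text)
    if crash is None:
        return None
    # Look for the DETAIL line only after the crash report itself.
    query = QUERY_RE.search(log_text, crash.end())
    return (
        int(crash.group("pid")),
        crash.group("code"),
        query.group("query").strip() if query else None,
    )


sample = """\
2020-02-04 10:26:16.841 EST [20788] [0] LOG: 00000: server process (PID 12168) was terminated by exception 0xC0000005
2020-02-04 10:26:16.841 EST [20788] [0] DETAIL: Failed process was running: select CONTAINERS_RESET_RECIRC_BY_DP(3000)
"""
print(find_crash(sample))
# (12168, '0xC0000005', 'select CONTAINERS_RESET_RECIRC_BY_DP(3000)')
```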

> 
> Thanks,
> 
> Doug
> 
> 2020-02-04 10:26:16.841 EST [20788] [0] LOG:  00000: server process (PID 
> 12168) was terminated by exception 0xC0000005
> 2020-02-04 10:26:16.841 EST [20788] [0] DETAIL:  Failed process was 
> running: select CONTAINERS_RESET_RECIRC_BY_DP(3000)
> 2020-02-04 10:26:16.841 EST [20788] [0] HINT:  See C include file 
> "ntstatus.h" for a description of the hexadecimal value.
> 2020-02-04 10:26:16.841 EST [20788] [0] LOCATION:  LogChildExit, 
> postmaster.c:3670
> 2020-02-04 10:26:16.841 EST [20788] [0] LOG:  00000: terminating any 
> other active server processes
> 2020-02-04 10:26:16.841 EST [20788] [0] LOCATION:  HandleChildCrash, 
> postmaster.c:3400
> 2020-02-04 10:26:16.873 EST [1212] [0] WARNING:  57P02: terminating 
> connection because of crash of another server process
> 2020-02-04 10:26:16.873 EST [1212] [0] DETAIL:  The postmaster has 
> commanded this server process to roll back the current transaction and 
> exit, because another server process exited abnormally and possibly 
> corrupted shared memory.
> 2020-02-04 10:26:16.873 EST [1212] [0] HINT:  In a moment you should be 
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.873 EST [1212] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.873 EST [19436] [0] WARNING:  57P02: terminating 
> connection because of crash of another server process
> 2020-02-04 10:26:16.873 EST [19436] [0] DETAIL:  The postmaster has 
> commanded this server process to roll back the current transaction and 
> exit, because another server process exited abnormally and possibly 
> corrupted shared memory.
> 2020-02-04 10:26:16.873 EST [19436] [0] HINT:  In a moment you should be 
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.873 EST [19436] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.874 EST [13428] [0] WARNING:  57P02: terminating 
> connection because of crash of another server process
> 2020-02-04 10:26:16.874 EST [13428] [0] DETAIL:  The postmaster has 
> commanded this server process to roll back the current transaction and 
> exit, because another server process exited abnormally and possibly 
> corrupted shared memory.
> 2020-02-04 10:26:16.874 EST [13428] [0] HINT:  In a moment you should be 
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.874 EST [13428] [0] CONTEXT:  while locking tuple 
> (0,115) in relation "containers"
> SQL statement "UPDATE containers
>             SET type_uid = COALESCE(declared_type_uid, type_uid),
>                 carton_type_uid = COALESCE(declared_carton_type_uid, 
> carton_type_uid),
>                 status_uid = COALESCE(declared_status_uid, status_uid),
>                 order_uid = COALESCE(in_order_uid, order_uid),
>                 wave_uid = COALESCE(in_wave_uid, wave_uid),
>                 length = COALESCE(in_length, carton_length, length),
>                 width = COALESCE(in_width, carton_width, width),
>                 height = COALESCE(in_height, carton_height, height),
>                 weight = COALESCE(in_weight, weight),
>                 weight_minimum = COALESCE(in_weight_minimum, 
> weight_minimum),
>                 weight_maximum = COALESCE(in_weight_maximum, 
> weight_maximum),
>                 weight_expected = COALESCE(in_weight_expected, 
> weight_expected),
>                 first_seen_DP_id = COALESCE(first_seen_DP_id, 
> in_last_seen_DP_id),
>                 first_seen_datetime = COALESCE(first_seen_datetime, 
> last_seen_date_time),
>                 last_seen_DP_id = COALESCE(in_last_seen_DP_id, 
> last_seen_DP_id),
>                 last_seen_datetime = COALESCE(last_seen_date_time, 
> last_seen_datetime),
>                 recirculation_count = COALESCE(in_recirculation_count, 
> recirculation_count),
>                 project_flags = COALESCE(in_project_flags, project_flags),
>                 passed_weight_check = COALESCE(in_passed_weight_check, 
> passed_weight_check)
>             WHERE uid = in_uid"
> PL/pgSQL function 
> containers_add_update(integer,integer,integer,integer,integer,integer,double 
> precision,double precision,double precision,double precision,double 
> precision,double precision,double precision,integer,timestamp without 
> time zone,character varying,bigint,boolean) line 60 at SQL statement
> 2020-02-04 10:26:16.874 EST [13428] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.874 EST [25916] [0] WARNING:  57P02: terminating 
> connection because of crash of another server process
> 2020-02-04 10:26:16.874 EST [25916] [0] DETAIL:  The postmaster has 
> commanded this server process to roll back the current transaction and 
> exit, because another server process exited abnormally and possibly 
> corrupted shared memory.
> 2020-02-04 10:26:16.874 EST [25916] [0] HINT:  In a moment you should be 
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.874 EST [25916] [0] CONTEXT:  while locking tuple 
> (1,91) in relation "containers"
> SQL statement "UPDATE containers
>             SET type_uid = COALESCE(declared_type_uid, type_uid),
>                 carton_type_uid = COALESCE(declared_carton_type_uid, 
> carton_type_uid),
>                 status_uid = COALESCE(declared_status_uid, status_uid),
>                 order_uid = COALESCE(in_order_uid, order_uid),
>                 wave_uid = COALESCE(in_wave_uid, wave_uid),
>                 length = COALESCE(in_length, carton_length, length),
>                 width = COALESCE(in_width, carton_width, width),
>                 height = COALESCE(in_height, carton_height, height),
>                 weight = COALESCE(in_weight, weight),
>                 weight_minimum = COALESCE(in_weight_minimum, 
> weight_minimum),
>                 weight_maximum = COALESCE(in_weight_maximum, 
> weight_maximum),
>                 weight_expected = COALESCE(in_weight_expected, 
> weight_expected),
>                 first_seen_DP_id = COALESCE(first_seen_DP_id, 
> in_last_seen_DP_id),
>                 first_seen_datetime = COALESCE(first_seen_datetime, 
> last_seen_date_time),
>                 last_seen_DP_id = COALESCE(in_last_seen_DP_id, 
> last_seen_DP_id),
>                 last_seen_datetime = COALESCE(last_seen_date_time, 
> last_seen_datetime),
>                 recirculation_count = COALESCE(in_recirculation_count, 
> recirculation_count),
>                 project_flags = COALESCE(in_project_flags, project_flags),
>                 passed_weight_check = COALESCE(in_passed_weight_check, 
> passed_weight_check)
>             WHERE uid = in_uid"
> PL/pgSQL function 
> containers_add_update(integer,integer,integer,integer,integer,integer,double 
> precision,double precision,double precision,double precision,double 
> precision,double precision,double precision,integer,timestamp without 
> time zone,character varying,bigint,boolean) line 60 at SQL statement
> 2020-02-04 10:26:16.874 EST [25916] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.875 EST [2512] [0] WARNING:  57P02: terminating 
> connection because of crash of another server process
> 2020-02-04 10:26:16.875 EST [2512] [0] DETAIL:  The postmaster has 
> commanded this server process to roll back the current transaction and 
> exit, because another server process exited abnormally and possibly 
> corrupted shared memory.
> 2020-02-04 10:26:16.875 EST [2512] [0] HINT:  In a moment you should be 
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.875 EST [2512] [0] CONTEXT:  while locking tuple 
> (0,111) in relation "containers"
> SQL statement "UPDATE containers
>             SET type_uid = COALESCE(declared_type_uid, type_uid),
>                 carton_type_uid = COALESCE(declared_carton_type_uid, 
> carton_type_uid),
>                 status_uid = COALESCE(declared_status_uid, status_uid),
>                 order_uid = COALESCE(in_order_uid, order_uid),
>                 wave_uid = COALESCE(in_wave_uid, wave_uid),
>                 length = COALESCE(in_length, carton_length, length),
>                 width = COALESCE(in_width, carton_width, width),
>                 height = COALESCE(in_height, carton_height, height),
>                 weight = COALESCE(in_weight, weight),
>                 weight_minimum = COALESCE(in_weight_minimum, 
> weight_minimum),
>                 weight_maximum = COALESCE(in_weight_maximum, 
> weight_maximum),
>                 weight_expected = COALESCE(in_weight_expected, 
> weight_expected),
>                 first_seen_DP_id = COALESCE(first_seen_DP_id, 
> in_last_seen_DP_id),
>                 first_seen_datetime = COALESCE(first_seen_datetime, 
> last_seen_date_time),
>                 last_seen_DP_id = COALESCE(in_last_seen_DP_id, 
> last_seen_DP_id),
>                 last_seen_datetime = COALESCE(last_seen_date_time, 
> last_seen_datetime),
>                 recirculation_count = COALESCE(in_recirculation_count, 
> recirculation_count),
>                 project_flags = COALESCE(in_project_flags, project_flags),
>                 passed_weight_check = COALESCE(in_passed_weight_check, 
> passed_weight_check)
>             WHERE uid = in_uid"
> PL/pgSQL function 
> containers_add_update(integer,integer,integer,integer,integer,integer,double 
> precision,double precision,double precision,double precision,double 
> precision,double precision,double precision,integer,timestamp without 
> time zone,character varying,bigint,boolean) line 60 at SQL statement
> 2020-02-04 10:26:16.875 EST [2512] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.879 EST [14908] [0] WARNING:  57P02: terminating 
> connection because of crash of another server process
> 2020-02-04 10:26:16.879 EST [14908] [0] DETAIL:  The postmaster has 
> commanded this server process to roll back the current transaction and 
> exit, because another server process exited abnormally and possibly 
> corrupted shared memory.
> 2020-02-04 10:26:16.879 EST [14908] [0] HINT:  In a moment you should be 
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.879 EST [14908] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.880 EST [7092] [0] WARNING:  57P02: terminating 
> connection because of crash of another server process
> 2020-02-04 10:26:16.880 EST [7092] [0] DETAIL:  The postmaster has 
> commanded this server process to roll back the current transaction and 
> exit, because another server process exited abnormally and possibly 
> corrupted shared memory.
> 2020-02-04 10:26:16.880 EST [7092] [0] HINT:  In a moment you should be 
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.880 EST [7092] [0] LOCATION:  quickdie, postgres.c:2717
> 2020-02-04 10:26:16.975 EST [14360] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:16.975 EST [14360] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:17.033 EST [20788] [0] LOG:  00000: all server 
> processes terminated; reinitializing
> 2020-02-04 10:26:17.033 EST [20788] [0] LOCATION: 
>   PostmasterStateMachine, postmaster.c:3912
> 2020-02-04 10:26:17.105 EST [20964] [0] LOG:  00000: database system was 
> interrupted; last known up at 2020-02-04 10:26:09 EST
> 2020-02-04 10:26:17.105 EST [20964] [0] LOCATION:  StartupXLOG, xlog.c:6277
> 2020-02-04 10:26:17.115 EST [1668] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:17.115 EST [1668] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:17.179 EST [25800] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:17.179 EST [25800] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:17.301 EST [14700] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:17.301 EST [14700] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:17.309 EST [19060] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:17.309 EST [19060] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:17.378 EST [24772] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:17.378 EST [24772] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:17.434 EST [12972] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:17.434 EST [12972] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:17.492 EST [11208] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:17.492 EST [11208] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:17.548 EST [13236] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:17.548 EST [13236] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:17.607 EST [25756] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:17.607 EST [25756] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:17.677 EST [12944] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:17.677 EST [12944] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:17.737 EST [14712] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:17.737 EST [14712] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:18.104 EST [20964] [0] LOG:  00000: database system was 
> not properly shut down; automatic recovery in progress
> 2020-02-04 10:26:18.104 EST [20964] [0] LOCATION:  StartupXLOG, xlog.c:6774
> 2020-02-04 10:26:18.109 EST [20964] [0] LOG:  00000: redo starts at 
> 14/52009F08
> 2020-02-04 10:26:18.109 EST [20964] [0] LOCATION:  StartupXLOG, xlog.c:7045
> 2020-02-04 10:26:18.349 EST [23064] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:18.349 EST [23064] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:19.248 EST [8816] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:19.248 EST [8816] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:20.560 EST [18200] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:20.560 EST [18200] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:22.508 EST [23204] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:22.508 EST [23204] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:25.402 EST [5888] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:25.402 EST [5888] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:29.714 EST [16820] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:29.714 EST [16820] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:36.161 EST [24072] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:36.161 EST [24072] [0] LOCATION:  ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:45.806 EST [22000] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:45.806 EST [22000] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:55.687 EST [20964] [0] LOG:  00000: redo done at 
> 14/79A030E0
> 2020-02-04 10:26:55.687 EST [20964] [0] LOCATION:  StartupXLOG, xlog.c:7307
> 2020-02-04 10:26:55.861 EST [16700] [0] FATAL:  57P03: the database 
> system is in recovery mode
> 2020-02-04 10:26:55.861 EST [16700] [0] LOCATION:  ProcessStartupPacket, 
> postmaster.c:2275
> 2020-02-04 10:26:57.016 EST [20788] [0] LOG:  00000: database system is 
> ready to accept connections
> 
> On Tue, Feb 4, 2020 at 10:50 AM Doug Roberts <h205881@gmail.com> wrote:
> 
>     Here is a stacktrace with what happened before and after the crash.
> 
>     On Tue, Feb 4, 2020 at 9:20 AM Doug Roberts <h205881@gmail.com> wrote:
> 
>         > So how did containers_reset_recirc() come to clash with
>         > containers_add_update()?
> 
>         They are clashing because another portion of our system is
>         running and updating containers. The reset recirc function was
>         run at the same time to see how our system and the database
>         would handle it.
> 
>         The recirc string is formatted like 2000=3,1000=6,5000=0. The
>         reset recirc function takes a UID (1000, for example) and
>         removes the 1000=x entry from the recirc count of every
>         container that has one.
> 
>         We are currently using PG 12.0.
> 
>         Thanks,
> 
>         Doug
> 
>         On Mon, Feb 3, 2020 at 6:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
>             Adrian Klaver <adrian.klaver@aklaver.com> writes:
>              > Please reply to list also.
> 
>              > On 2/3/20 2:18 PM, Doug Roberts wrote:
>              >> Here is what the reset recirc function is doing.
>              >> ...
>              >>     UPDATE containers
>              >> ...
> 
>              > So how did containers_reset_recirc() come to clash with
>              > containers_add_update()?
> 
>             If this is PG 12.0 or 12.1, a likely theory is that this is an
>             EvalPlanQual bug (which'd be triggered during concurrent updates
>             of the same row in the table, so that squares with the
>             observation that locking the table prevents it).  The known
>             bugs in that area require either before-row-update triggers
>             on the table, or child tables (either partitioning or
>             traditional inheritance).
>             So I wonder what the schema of table "containers" looks like.
> 
>             Or you could have hit some new bug ... but there's not enough
>             info here to diagnose.
> 
>                                      regards, tom lane
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com
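
The recirc string described in the quoted message ("2000=3,1000=6,5000=0") and the effect of the reset function on it can be illustrated with a short Python sketch. This is not the poster's PL/pgSQL function, which is not shown in full in the thread; the name remove_recirc_entry and the handling of empty strings are assumptions made for illustration only.

```python
def remove_recirc_entry(recirc, uid):
    """Drop the 'uid=count' entry from a comma-separated recirc string.

    Hypothetical helper: the real function in the thread runs inside
    PostgreSQL and updates every matching row of "containers".
    """
    if not recirc:
        # Nothing to remove from an empty or missing string.
        return recirc
    kept = [
        entry
        for entry in recirc.split(",")
        # Compare the whole key so that uid 100 does not match "1000=6".
        if entry.split("=", 1)[0].strip() != str(uid)
    ]
    return ",".join(kept)


print(remove_recirc_entry("2000=3,1000=6,5000=0", 1000))  # prints 2000=3,5000=0
```

Comparing the full key rather than searching for a substring matters here: a plain text replace of "100=" would also damage an entry such as "1000=6".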

Re: Postgres Crashing

From

Doug Roberts

Date:

February 4, 2020, 19:27:14

Sure. Ok then.

On Tue, Feb 4, 2020 at 11:18 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 2/4/20 8:06 AM, Doug Roberts wrote:
> Hello,
>
> Here is a stacktrace of what happened before and after the crash.

Actually the below is the Postgres log. Per Tom's previous post the
procedure to get a stack trace can be found here:

https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
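
For reference, the log quoted below does identify the failing backend and the exception code: 0xC0000005 is the NTSTATUS value STATUS_ACCESS_VIOLATION, the Windows counterpart of a segmentation fault. It does not say where in the server the fault occurred, which is why a stack trace is still needed. A minimal Python sketch for pulling that crash report out of a raw (unquoted) server log follows; the function name describe_crash and the three-entry lookup table are illustrative assumptions, not part of PostgreSQL.

```python
import re

# Small hand-picked subset of NTSTATUS codes; the full list is in ntstatus.h.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}

# Matches the postmaster's report of a crashed backend on Windows.
CRASH_LINE = re.compile(
    r"server process \(PID (\d+)\) was terminated by exception (0x[0-9A-Fa-f]+)"
)


def describe_crash(log_text):
    """Return (pid, code, name) for the first crash report found, or None."""
    # Collapse whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    match = CRASH_LINE.search(flat)
    if match is None:
        return None
    pid = int(match.group(1))
    code = int(match.group(2), 16)
    return pid, code, NTSTATUS_NAMES.get(code, "unknown (see ntstatus.h)")
```

The input is assumed to be the server log itself; email quote markers ("> ") inside a wrapped line would have to be stripped first.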

>
> Thanks,
>
> Doug
>
> 2020-02-04 10:26:16.841 EST [20788] [0] LOG: 00000: server process (PID
> 12168) was terminated by exception 0xC0000005
> 2020-02-04 10:26:16.841 EST [20788] [0] DETAIL: Failed process was
> running: select CONTAINERS_RESET_RECIRC_BY_DP(3000)
> 2020-02-04 10:26:16.841 EST [20788] [0] HINT: See C include file
> "ntstatus.h" for a description of the hexadecimal value.
> 2020-02-04 10:26:16.841 EST [20788] [0] LOCATION: LogChildExit,
> postmaster.c:3670
> 2020-02-04 10:26:16.841 EST [20788] [0] LOG: 00000: terminating any
> other active server processes
> 2020-02-04 10:26:16.841 EST [20788] [0] LOCATION: HandleChildCrash,
> postmaster.c:3400
> 2020-02-04 10:26:16.873 EST [1212] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.873 EST [1212] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.873 EST [1212] [0] HINT: In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.873 EST [1212] [0] LOCATION: quickdie, postgres.c:2717
> 2020-02-04 10:26:16.873 EST [19436] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.873 EST [19436] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.873 EST [19436] [0] HINT: In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.873 EST [19436] [0] LOCATION: quickdie, postgres.c:2717
> 2020-02-04 10:26:16.874 EST [13428] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.874 EST [13428] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.874 EST [13428] [0] HINT: In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.874 EST [13428] [0] CONTEXT: while locking tuple
> (0,115) in relation "containers"
> SQL statement "UPDATE containers
> SET type_uid = COALESCE(declared_type_uid, type_uid),
> carton_type_uid = COALESCE(declared_carton_type_uid,
> carton_type_uid),
> status_uid = COALESCE(declared_status_uid, status_uid),
> order_uid = COALESCE(in_order_uid, order_uid),
> wave_uid = COALESCE(in_wave_uid, wave_uid),
> length = COALESCE(in_length, carton_length, length),
> width = COALESCE(in_width, carton_width, width),
> height = COALESCE(in_height, carton_height, height),
> weight = COALESCE(in_weight, weight),
> weight_minimum = COALESCE(in_weight_minimum,
> weight_minimum),
> weight_maximum = COALESCE(in_weight_maximum,
> weight_maximum),
> weight_expected = COALESCE(in_weight_expected,
> weight_expected),
> first_seen_DP_id = COALESCE(first_seen_DP_id,
> in_last_seen_DP_id),
> first_seen_datetime = COALESCE(first_seen_datetime,
> last_seen_date_time),
> last_seen_DP_id = COALESCE(in_last_seen_DP_id,
> last_seen_DP_id),
> last_seen_datetime = COALESCE(last_seen_date_time,
> last_seen_datetime),
> recirculation_count = COALESCE(in_recirculation_count,
> recirculation_count),
> project_flags = COALESCE(in_project_flags, project_flags),
> passed_weight_check = COALESCE(in_passed_weight_check,
> passed_weight_check)
> WHERE uid = in_uid"
> PL/pgSQL function
> containers_add_update(integer,integer,integer,integer,integer,integer,double
> precision,double precision,double precision,double precision,double
> precision,double precision,double precision,integer,timestamp without
> time zone,character varying,bigint,boolean) line 60 at SQL statement
> 2020-02-04 10:26:16.874 EST [13428] [0] LOCATION: quickdie, postgres.c:2717
> 2020-02-04 10:26:16.874 EST [25916] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.874 EST [25916] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.874 EST [25916] [0] HINT: In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.874 EST [25916] [0] CONTEXT: while locking tuple
> (1,91) in relation "containers"
> SQL statement "UPDATE containers
> SET type_uid = COALESCE(declared_type_uid, type_uid),
> carton_type_uid = COALESCE(declared_carton_type_uid,
> carton_type_uid),
> status_uid = COALESCE(declared_status_uid, status_uid),
> order_uid = COALESCE(in_order_uid, order_uid),
> wave_uid = COALESCE(in_wave_uid, wave_uid),
> length = COALESCE(in_length, carton_length, length),
> width = COALESCE(in_width, carton_width, width),
> height = COALESCE(in_height, carton_height, height),
> weight = COALESCE(in_weight, weight),
> weight_minimum = COALESCE(in_weight_minimum,
> weight_minimum),
> weight_maximum = COALESCE(in_weight_maximum,
> weight_maximum),
> weight_expected = COALESCE(in_weight_expected,
> weight_expected),
> first_seen_DP_id = COALESCE(first_seen_DP_id,
> in_last_seen_DP_id),
> first_seen_datetime = COALESCE(first_seen_datetime,
> last_seen_date_time),
> last_seen_DP_id = COALESCE(in_last_seen_DP_id,
> last_seen_DP_id),
> last_seen_datetime = COALESCE(last_seen_date_time,
> last_seen_datetime),
> recirculation_count = COALESCE(in_recirculation_count,
> recirculation_count),
> project_flags = COALESCE(in_project_flags, project_flags),
> passed_weight_check = COALESCE(in_passed_weight_check,
> passed_weight_check)
> WHERE uid = in_uid"
> PL/pgSQL function
> containers_add_update(integer,integer,integer,integer,integer,integer,double
> precision,double precision,double precision,double precision,double
> precision,double precision,double precision,integer,timestamp without
> time zone,character varying,bigint,boolean) line 60 at SQL statement
> 2020-02-04 10:26:16.874 EST [25916] [0] LOCATION: quickdie, postgres.c:2717
> 2020-02-04 10:26:16.875 EST [2512] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.875 EST [2512] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.875 EST [2512] [0] HINT: In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.875 EST [2512] [0] CONTEXT: while locking tuple
> (0,111) in relation "containers"
> SQL statement "UPDATE containers
> SET type_uid = COALESCE(declared_type_uid, type_uid),
> carton_type_uid = COALESCE(declared_carton_type_uid,
> carton_type_uid),
> status_uid = COALESCE(declared_status_uid, status_uid),
> order_uid = COALESCE(in_order_uid, order_uid),
> wave_uid = COALESCE(in_wave_uid, wave_uid),
> length = COALESCE(in_length, carton_length, length),
> width = COALESCE(in_width, carton_width, width),
> height = COALESCE(in_height, carton_height, height),
> weight = COALESCE(in_weight, weight),
> weight_minimum = COALESCE(in_weight_minimum,
> weight_minimum),
> weight_maximum = COALESCE(in_weight_maximum,
> weight_maximum),
> weight_expected = COALESCE(in_weight_expected,
> weight_expected),
> first_seen_DP_id = COALESCE(first_seen_DP_id,
> in_last_seen_DP_id),
> first_seen_datetime = COALESCE(first_seen_datetime,
> last_seen_date_time),
> last_seen_DP_id = COALESCE(in_last_seen_DP_id,
> last_seen_DP_id),
> last_seen_datetime = COALESCE(last_seen_date_time,
> last_seen_datetime),
> recirculation_count = COALESCE(in_recirculation_count,
> recirculation_count),
> project_flags = COALESCE(in_project_flags, project_flags),
> passed_weight_check = COALESCE(in_passed_weight_check,
> passed_weight_check)
> WHERE uid = in_uid"
> PL/pgSQL function
> containers_add_update(integer,integer,integer,integer,integer,integer,double
> precision,double precision,double precision,double precision,double
> precision,double precision,double precision,integer,timestamp without
> time zone,character varying,bigint,boolean) line 60 at SQL statement
> 2020-02-04 10:26:16.875 EST [2512] [0] LOCATION: quickdie, postgres.c:2717
> 2020-02-04 10:26:16.879 EST [14908] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.879 EST [14908] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.879 EST [14908] [0] HINT: In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.879 EST [14908] [0] LOCATION: quickdie, postgres.c:2717
> 2020-02-04 10:26:16.880 EST [7092] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.880 EST [7092] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction and
> exit, because another server process exited abnormally and possibly
> corrupted shared memory.
> 2020-02-04 10:26:16.880 EST [7092] [0] HINT: In a moment you should be
> able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.880 EST [7092] [0] LOCATION: quickdie, postgres.c:2717
> 2020-02-04 10:26:16.975 EST [14360] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:16.975 EST [14360] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.033 EST [20788] [0] LOG: 00000: all server
> processes terminated; reinitializing
> 2020-02-04 10:26:17.033 EST [20788] [0] LOCATION:
> PostmasterStateMachine, postmaster.c:3912
> 2020-02-04 10:26:17.105 EST [20964] [0] LOG: 00000: database system was
> interrupted; last known up at 2020-02-04 10:26:09 EST
> 2020-02-04 10:26:17.105 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:6277
> 2020-02-04 10:26:17.115 EST [1668] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.115 EST [1668] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.179 EST [25800] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.179 EST [25800] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.301 EST [14700] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.301 EST [14700] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.309 EST [19060] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.309 EST [19060] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.378 EST [24772] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.378 EST [24772] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.434 EST [12972] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.434 EST [12972] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.492 EST [11208] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.492 EST [11208] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.548 EST [13236] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.548 EST [13236] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.607 EST [25756] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.607 EST [25756] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.677 EST [12944] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.677 EST [12944] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:17.737 EST [14712] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.737 EST [14712] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:18.104 EST [20964] [0] LOG: 00000: database system was
> not properly shut down; automatic recovery in progress
> 2020-02-04 10:26:18.104 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:6774
> 2020-02-04 10:26:18.109 EST [20964] [0] LOG: 00000: redo starts at
> 14/52009F08
> 2020-02-04 10:26:18.109 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:7045
> 2020-02-04 10:26:18.349 EST [23064] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:18.349 EST [23064] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:19.248 EST [8816] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:19.248 EST [8816] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:20.560 EST [18200] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:20.560 EST [18200] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:22.508 EST [23204] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:22.508 EST [23204] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:25.402 EST [5888] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:25.402 EST [5888] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:29.714 EST [16820] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:29.714 EST [16820] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:36.161 EST [24072] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:36.161 EST [24072] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:45.806 EST [22000] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:45.806 EST [22000] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:55.687 EST [20964] [0] LOG: 00000: redo done at
> 14/79A030E0
> 2020-02-04 10:26:55.687 EST [20964] [0] LOCATION: StartupXLOG, xlog.c:7307
> 2020-02-04 10:26:55.861 EST [16700] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:55.861 EST [16700] [0] LOCATION: ProcessStartupPacket,
> postmaster.c:2275
> 2020-02-04 10:26:57.016 EST [20788] [0] LOG: 00000: database system is
> ready to accept connections
>
> On Tue, Feb 4, 2020 at 10:50 AM Doug Roberts <h205881@gmail.com> wrote:
>
> Here is a stacktrace with what happened before and after the crash.
>
> 2020-02-04 10:26:16.841 EST [20788] [0] LOG: 00000: server process
> (PID 12168) was terminated by exception 0xC0000005
> 2020-02-04 10:26:16.841 EST [20788] [0] DETAIL: Failed process was
> running: select CONTAINERS_RESET_RECIRC_BY_DP(3000)
> 2020-02-04 10:26:16.841 EST [20788] [0] HINT: See C include file
> "ntstatus.h" for a description of the hexadecimal value.
> 2020-02-04 10:26:16.841 EST [20788] [0] LOCATION: LogChildExit,
> postmaster.c:3670
> 2020-02-04 10:26:16.841 EST [20788] [0] LOG: 00000: terminating any
> other active server processes
> 2020-02-04 10:26:16.841 EST [20788] [0] LOCATION: HandleChildCrash,
> postmaster.c:3400
> 2020-02-04 10:26:16.873 EST [1212] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.873 EST [1212] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction
> and exit, because another server process exited abnormally and
> possibly corrupted shared memory.
> 2020-02-04 10:26:16.873 EST [1212] [0] HINT: In a moment you should
> be able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.873 EST [1212] [0] LOCATION: quickdie,
> postgres.c:2717
> 2020-02-04 10:26:16.873 EST [19436] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.873 EST [19436] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction
> and exit, because another server process exited abnormally and
> possibly corrupted shared memory.
> 2020-02-04 10:26:16.873 EST [19436] [0] HINT: In a moment you
> should be able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.873 EST [19436] [0] LOCATION: quickdie,
> postgres.c:2717
> 2020-02-04 10:26:16.874 EST [13428] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.874 EST [13428] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction
> and exit, because another server process exited abnormally and
> possibly corrupted shared memory.
> 2020-02-04 10:26:16.874 EST [13428] [0] HINT: In a moment you
> should be able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.874 EST [13428] [0] CONTEXT: while locking
> tuple (0,115) in relation "containers"
> SQL statement "UPDATE containers
> SET type_uid = COALESCE(declared_type_uid, type_uid),
> carton_type_uid = COALESCE(declared_carton_type_uid,
> carton_type_uid),
> status_uid = COALESCE(declared_status_uid, status_uid),
> order_uid = COALESCE(in_order_uid, order_uid),
> wave_uid = COALESCE(in_wave_uid, wave_uid),
> length = COALESCE(in_length, carton_length, length),
> width = COALESCE(in_width, carton_width, width),
> height = COALESCE(in_height, carton_height, height),
> weight = COALESCE(in_weight, weight),
> weight_minimum = COALESCE(in_weight_minimum,
> weight_minimum),
> weight_maximum = COALESCE(in_weight_maximum,
> weight_maximum),
> weight_expected = COALESCE(in_weight_expected,
> weight_expected),
> first_seen_DP_id = COALESCE(first_seen_DP_id,
> in_last_seen_DP_id),
> first_seen_datetime = COALESCE(first_seen_datetime,
> last_seen_date_time),
> last_seen_DP_id = COALESCE(in_last_seen_DP_id,
> last_seen_DP_id),
> last_seen_datetime = COALESCE(last_seen_date_time,
> last_seen_datetime),
> recirculation_count =
> COALESCE(in_recirculation_count, recirculation_count),
> project_flags = COALESCE(in_project_flags,
> project_flags),
> passed_weight_check =
> COALESCE(in_passed_weight_check, passed_weight_check)
> WHERE uid = in_uid"
> PL/pgSQL function
> containers_add_update(integer,integer,integer,integer,integer,integer,double
> precision,double precision,double precision,double precision,double
> precision,double precision,double precision,integer,timestamp
> without time zone,character varying,bigint,boolean) line 60 at SQL
> statement
> 2020-02-04 10:26:16.874 EST [13428] [0] LOCATION: quickdie,
> postgres.c:2717
> 2020-02-04 10:26:16.874 EST [25916] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.874 EST [25916] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction
> and exit, because another server process exited abnormally and
> possibly corrupted shared memory.
> 2020-02-04 10:26:16.874 EST [25916] [0] HINT: In a moment you
> should be able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.874 EST [25916] [0] CONTEXT: while locking
> tuple (1,91) in relation "containers"
> SQL statement "UPDATE containers
> SET type_uid = COALESCE(declared_type_uid, type_uid),
> carton_type_uid = COALESCE(declared_carton_type_uid,
> carton_type_uid),
> status_uid = COALESCE(declared_status_uid, status_uid),
> order_uid = COALESCE(in_order_uid, order_uid),
> wave_uid = COALESCE(in_wave_uid, wave_uid),
> length = COALESCE(in_length, carton_length, length),
> width = COALESCE(in_width, carton_width, width),
> height = COALESCE(in_height, carton_height, height),
> weight = COALESCE(in_weight, weight),
> weight_minimum = COALESCE(in_weight_minimum,
> weight_minimum),
> weight_maximum = COALESCE(in_weight_maximum,
> weight_maximum),
> weight_expected = COALESCE(in_weight_expected,
> weight_expected),
> first_seen_DP_id = COALESCE(first_seen_DP_id,
> in_last_seen_DP_id),
> first_seen_datetime = COALESCE(first_seen_datetime,
> last_seen_date_time),
> last_seen_DP_id = COALESCE(in_last_seen_DP_id,
> last_seen_DP_id),
> last_seen_datetime = COALESCE(last_seen_date_time,
> last_seen_datetime),
> recirculation_count =
> COALESCE(in_recirculation_count, recirculation_count),
> project_flags = COALESCE(in_project_flags,
> project_flags),
> passed_weight_check =
> COALESCE(in_passed_weight_check, passed_weight_check)
> WHERE uid = in_uid"
> PL/pgSQL function
> containers_add_update(integer,integer,integer,integer,integer,integer,double
> precision,double precision,double precision,double precision,double
> precision,double precision,double precision,integer,timestamp
> without time zone,character varying,bigint,boolean) line 60 at SQL
> statement
> 2020-02-04 10:26:16.874 EST [25916] [0] LOCATION: quickdie,
> postgres.c:2717
> 2020-02-04 10:26:16.875 EST [2512] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.875 EST [2512] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction
> and exit, because another server process exited abnormally and
> possibly corrupted shared memory.
> 2020-02-04 10:26:16.875 EST [2512] [0] HINT: In a moment you should
> be able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.875 EST [2512] [0] CONTEXT: while locking tuple
> (0,111) in relation "containers"
> SQL statement "UPDATE containers
> SET type_uid = COALESCE(declared_type_uid, type_uid),
> carton_type_uid = COALESCE(declared_carton_type_uid,
> carton_type_uid),
> status_uid = COALESCE(declared_status_uid, status_uid),
> order_uid = COALESCE(in_order_uid, order_uid),
> wave_uid = COALESCE(in_wave_uid, wave_uid),
> length = COALESCE(in_length, carton_length, length),
> width = COALESCE(in_width, carton_width, width),
> height = COALESCE(in_height, carton_height, height),
> weight = COALESCE(in_weight, weight),
> weight_minimum = COALESCE(in_weight_minimum,
> weight_minimum),
> weight_maximum = COALESCE(in_weight_maximum,
> weight_maximum),
> weight_expected = COALESCE(in_weight_expected,
> weight_expected),
> first_seen_DP_id = COALESCE(first_seen_DP_id,
> in_last_seen_DP_id),
> first_seen_datetime = COALESCE(first_seen_datetime,
> last_seen_date_time),
> last_seen_DP_id = COALESCE(in_last_seen_DP_id,
> last_seen_DP_id),
> last_seen_datetime = COALESCE(last_seen_date_time,
> last_seen_datetime),
> recirculation_count =
> COALESCE(in_recirculation_count, recirculation_count),
> project_flags = COALESCE(in_project_flags,
> project_flags),
> passed_weight_check =
> COALESCE(in_passed_weight_check, passed_weight_check)
> WHERE uid = in_uid"
> PL/pgSQL function
> containers_add_update(integer,integer,integer,integer,integer,integer,double
> precision,double precision,double precision,double precision,double
> precision,double precision,double precision,integer,timestamp
> without time zone,character varying,bigint,boolean) line 60 at SQL
> statement
> 2020-02-04 10:26:16.875 EST [2512] [0] LOCATION: quickdie,
> postgres.c:2717
> 2020-02-04 10:26:16.879 EST [14908] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.879 EST [14908] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction
> and exit, because another server process exited abnormally and
> possibly corrupted shared memory.
> 2020-02-04 10:26:16.879 EST [14908] [0] HINT: In a moment you
> should be able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.879 EST [14908] [0] LOCATION: quickdie,
> postgres.c:2717
> 2020-02-04 10:26:16.880 EST [7092] [0] WARNING: 57P02: terminating
> connection because of crash of another server process
> 2020-02-04 10:26:16.880 EST [7092] [0] DETAIL: The postmaster has
> commanded this server process to roll back the current transaction
> and exit, because another server process exited abnormally and
> possibly corrupted shared memory.
> 2020-02-04 10:26:16.880 EST [7092] [0] HINT: In a moment you should
> be able to reconnect to the database and repeat your command.
> 2020-02-04 10:26:16.880 EST [7092] [0] LOCATION: quickdie,
> postgres.c:2717
> 2020-02-04 10:26:16.975 EST [14360] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:16.975 EST [14360] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:17.033 EST [20788] [0] LOG: 00000: all server
> processes terminated; reinitializing
> 2020-02-04 10:26:17.033 EST [20788] [0] LOCATION:
> PostmasterStateMachine, postmaster.c:3912
> 2020-02-04 10:26:17.105 EST [20964] [0] LOG: 00000: database system
> was interrupted; last known up at 2020-02-04 10:26:09 EST
> 2020-02-04 10:26:17.105 EST [20964] [0] LOCATION: StartupXLOG,
> xlog.c:6277
> 2020-02-04 10:26:17.115 EST [1668] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.115 EST [1668] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:17.179 EST [25800] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.179 EST [25800] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:17.301 EST [14700] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.301 EST [14700] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:17.309 EST [19060] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.309 EST [19060] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:17.378 EST [24772] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.378 EST [24772] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:17.434 EST [12972] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.434 EST [12972] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:17.492 EST [11208] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.492 EST [11208] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:17.548 EST [13236] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.548 EST [13236] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:17.607 EST [25756] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.607 EST [25756] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:17.677 EST [12944] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.677 EST [12944] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:17.737 EST [14712] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:17.737 EST [14712] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:18.104 EST [20964] [0] LOG: 00000: database system
> was not properly shut down; automatic recovery in progress
> 2020-02-04 10:26:18.104 EST [20964] [0] LOCATION: StartupXLOG,
> xlog.c:6774
> 2020-02-04 10:26:18.109 EST [20964] [0] LOG: 00000: redo starts at
> 14/52009F08
> 2020-02-04 10:26:18.109 EST [20964] [0] LOCATION: StartupXLOG,
> xlog.c:7045
> 2020-02-04 10:26:18.349 EST [23064] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:18.349 EST [23064] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:19.248 EST [8816] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:19.248 EST [8816] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:20.560 EST [18200] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:20.560 EST [18200] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:22.508 EST [23204] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:22.508 EST [23204] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:25.402 EST [5888] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:25.402 EST [5888] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:29.714 EST [16820] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:29.714 EST [16820] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:36.161 EST [24072] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:36.161 EST [24072] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:45.806 EST [22000] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:45.806 EST [22000] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:55.687 EST [20964] [0] LOG: 00000: redo done at
> 14/79A030E0
> 2020-02-04 10:26:55.687 EST [20964] [0] LOCATION: StartupXLOG,
> xlog.c:7307
> 2020-02-04 10:26:55.861 EST [16700] [0] FATAL: 57P03: the database
> system is in recovery mode
> 2020-02-04 10:26:55.861 EST [16700] [0] LOCATION:
> ProcessStartupPacket, postmaster.c:2275
> 2020-02-04 10:26:57.016 EST [20788] [0] LOG: 00000: database system
> is ready to accept connections
>
> On Tue, Feb 4, 2020 at 9:20 AM Doug Roberts <h205881@gmail.com> wrote:
>
> > So how did containers_reset_recirc() come to clash with
> > containers_add_update()?
>
> They are clashing because another portion of our system is
> running and updating containers. The reset recirc function was
> run at the same time to see how our system and the database
> would handle it.
>
> The recirc string is formatted like 2000=3,1000=6,5000=0. So the
> reset recirc function will take a UID (1000 for example) and use
> that to remove 1000=x from all of the recirc counts for all of
> the containers that have 1000=x.
>
> We are currently using PG 12.0.
>
> Thanks,
>
> Doug
>
> On Mon, Feb 3, 2020 at 6:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Adrian Klaver <adrian.klaver@aklaver.com> writes:
> > Please reply to list also.
>
> > On 2/3/20 2:18 PM, Doug Roberts wrote:
> >> Here is what the reset recirc function is doing.
> >> ...
> >> UPDATE containers
> >> ...
>
> > So how did containers_reset_recirc() come to clash with
> > containers_add_update()?
>
> If this is PG 12.0 or 12.1, a likely theory is that this is an
> EvalPlanQual bug (which'd be triggered during concurrent updates
> of the same row in the table, so that squares with the observation
> that locking the table prevents it). The known bugs in that area
> require either before-row-update triggers on the table, or
> child tables (either partitioning or traditional inheritance).
> So I wonder what the schema of table "containers" looks like.
>
> Or you could have hit some new bug ... but there's not enough
> info here to diagnose.
>
> regards, tom lane
>

--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Postgres Crashing

From

Adrian Klaver

Date:

February 4, 2020, 19:40:27

On 2/4/20 6:20 AM, Doug Roberts wrote:
>> So how did containers_reset_recirc() come to clash with
>> containers_add_update()?
> 
> They are clashing because another portion of our system is running and 
> updating containers. The reset recirc function was run at the same time 
> to see how our system and the database would handle it.

So does your system have the things Tom mentioned below?:

"The known bugs in that area
require either before-row-update triggers on the table, or
child tables (either partitioning or traditional inheritance).
So I wonder what the schema of table "containers" looks like."
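
For anyone wanting to check those two conditions on their own system, here is a minimal sketch in Python of how they could be looked up from the system catalogs. It assumes a DB-API cursor that accepts %s placeholders (psycopg2, for example). The function names and the shape of the returned dictionary are made up for illustration and do not come from this thread. The tgtype bit values are the ones defined in PostgreSQL's catalog/pg_trigger.h.

```python
# Bit flags stored in pg_trigger.tgtype (values from catalog/pg_trigger.h).
TRIGGER_TYPE_ROW = 1 << 0
TRIGGER_TYPE_BEFORE = 1 << 1
TRIGGER_TYPE_UPDATE = 1 << 4

# User-defined triggers on the table; internal (constraint) triggers are skipped.
TRIGGER_SQL = (
    "SELECT tgname, tgtype FROM pg_trigger "
    "WHERE tgrelid = %s::regclass AND NOT tgisinternal"
)

# Direct children of the table (partitions or traditional inheritance).
CHILD_SQL = (
    "SELECT inhrelid::regclass::text FROM pg_inherits "
    "WHERE inhparent = %s::regclass"
)


def is_before_row_update(tgtype):
    """True when tgtype describes a BEFORE UPDATE ... FOR EACH ROW trigger."""
    wanted = TRIGGER_TYPE_ROW | TRIGGER_TYPE_BEFORE | TRIGGER_TYPE_UPDATE
    return (tgtype & wanted) == wanted


def epq_risk_factors(cursor, table="containers"):
    """Hypothetical helper: list before-row-update triggers and child tables."""
    cursor.execute(TRIGGER_SQL, (table,))
    triggers = [name for name, tgtype in cursor.fetchall()
                if is_before_row_update(tgtype)]
    cursor.execute(CHILD_SQL, (table,))
    children = [row[0] for row in cursor.fetchall()]
    return {"before_row_update_triggers": triggers, "child_tables": children}
```

If either list comes back non-empty, the table matches the preconditions Tom described. That alone does not prove the crash is one of the known bugs.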

> 
> The recirc string is formatted like 2000=3,1000=6,5000=0. So the reset 
> recirc function will take a UID (1000 for example) and use that to 
> remove 1000=x from all of the recirc counts for all of the containers 
> that have 1000=x.
> 
> We are currently using PG 12.0.
> 
> Thanks,
> 
> Doug
> 
> On Mon, Feb 3, 2020 at 6:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
>     Adrian Klaver <adrian.klaver@aklaver.com> writes:
>      > Please reply to list also.
> 
>      > On 2/3/20 2:18 PM, Doug Roberts wrote:
>      >> Here is what the reset recirc function is doing.
>      >> ...
>      >>     UPDATE containers
>      >> ...
> 
>      > So how did containers_reset_recirc() come to clash with
>      > containers_add_update()?
> 
>     If this is PG 12.0 or 12.1, a likely theory is that this is an
>     EvalPlanQual bug (which'd be triggered during concurrent updates
>     of the same row in the table, so that squares with the observation
>     that locking the table prevents it).  The known bugs in that area
>     require either before-row-update triggers on the table, or
>     child tables (either partitioning or traditional inheritance).
>     So I wonder what the schema of table "containers" looks like.
> 
>     Or you could have hit some new bug ... but there's not enough
>     info here to diagnose.
> 
>                              regards, tom lane
> 
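
To make the recirc string format quoted above concrete (2000=3,1000=6,5000=0), here is a small Python sketch of the transformation Doug describes: dropping the UID=x entry for one UID. This only illustrates the string handling. The real containers_reset_recirc function is a PL/pgSQL UPDATE whose body is elided in this thread, and the name reset_recirc is hypothetical.

```python
def reset_recirc(recirc, uid):
    """Return the recirc string with the entry for `uid` removed.

    Illustration only; not the author's PL/pgSQL implementation.
    """
    kept = []
    for entry in recirc.split(","):
        if not entry:
            continue  # tolerate empty strings and stray commas
        key, _, _count = entry.partition("=")
        if key.strip() != str(uid):
            kept.append(entry)
    return ",".join(kept)


print(reset_recirc("2000=3,1000=6,5000=0", 1000))  # prints 2000=3,5000=0
```

Run against every row that contains 1000=x, this rewrite means the reset touches many containers in one statement. That is why it can collide with concurrent containers_add_update() calls on the same rows.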


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: Postgres Crashing

From

Doug Roberts

Date:

February 4, 2020, 22:19:19

Hello,

Hopefully the following stack trace is more helpful.

Exception thrown at 0x0000000140446403 in postgres.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFF8. occurred

> postgres.exe!pfree(void * pointer) Line 1033 C
postgres.exe!tts_buffer_heap_clear(TupleTableSlot * slot) Line 653 C
[Inline Frame] postgres.exe!ExecClearTuple(TupleTableSlot *) Line 428 C
postgres.exe!ExecForceStoreHeapTuple(HeapTupleData * tuple, TupleTableSlot * slot, bool shouldFree) Line 1448 C
postgres.exe!ExecBRUpdateTriggers(EState * estate, EPQState * epqstate, ResultRelInfo * relinfo, ItemPointerData * tupleid, HeapTupleData * fdw_trigtuple, TupleTableSlot * newslot) Line 3117 C
postgres.exe!ExecUpdate(ModifyTableState * mtstate, ItemPointerData * tupleid, HeapTupleData * oldtuple, TupleTableSlot * slot, TupleTableSlot * planSlot, EPQState * epqstate, EState * estate, bool canSetTag) Line 1072 C
postgres.exe!ExecModifyTable(PlanState * pstate) Line 2223 C
[Inline Frame] postgres.exe!ExecProcNode(PlanState *) Line 239 C
postgres.exe!ExecutePlan(EState * estate, PlanState * planstate, bool use_parallel_mode, CmdType operation, bool sendTuples, unsigned __int64 numberTuples, ScanDirection direction, _DestReceiver * dest, bool execute_once) Line 1652 C
postgres.exe!standard_ExecutorRun(QueryDesc * queryDesc, ScanDirection direction, unsigned __int64 count, bool execute_once) Line 378 C
postgres.exe!_SPI_pquery(QueryDesc * queryDesc, bool fire_triggers, unsigned __int64 tcount) Line 2523 C
postgres.exe!_SPI_execute_plan(_SPI_plan * plan, ParamListInfoData * paramLI, SnapshotData * snapshot, SnapshotData * crosscheck_snapshot, bool read_only, bool fire_triggers, unsigned __int64 tcount) Line 2298 C
postgres.exe!SPI_execute_plan_with_paramlist(_SPI_plan * plan, ParamListInfoData * params, bool read_only, long tcount) Line 581 C
plpgsql.dll!exec_stmt_execsql(PLpgSQL_execstate * estate, PLpgSQL_stmt_execsql * stmt) Line 4162 C
plpgsql.dll!exec_stmt(PLpgSQL_execstate * estate, PLpgSQL_stmt * stmt) Line 2033 C
[Inline Frame] plpgsql.dll!exec_stmts(PLpgSQL_execstate * stmts, List *) Line 1924 C
plpgsql.dll!exec_stmt_block(PLpgSQL_execstate * estate, PLpgSQL_stmt_block * block) Line 1865 C
plpgsql.dll!exec_stmt(PLpgSQL_execstate * estate, PLpgSQL_stmt * stmt) Line 1957 C
plpgsql.dll!plpgsql_exec_function(PLpgSQL_function * func, FunctionCallInfoBaseData * fcinfo, EState * simple_eval_estate, bool atomic) Line 590 C
plpgsql.dll!plpgsql_call_handler(FunctionCallInfoBaseData * fcinfo) Line 267 C
postgres.exe!ExecInterpExpr(ExprState * state, ExprContext * econtext, bool * isnull) Line 626 C
[Inline Frame] postgres.exe!ExecEvalExprSwitchContext(ExprState *) Line 307 C
postgres.exe!ExecProject(ProjectionInfo * projInfo) Line 351 C
[Inline Frame] postgres.exe!ExecProcNode(PlanState *) Line 239 C
postgres.exe!ExecutePlan(EState * estate, PlanState * planstate, bool use_parallel_mode, CmdType operation, bool sendTuples, unsigned __int64 numberTuples, ScanDirection direction, _DestReceiver * dest, bool execute_once) Line 1652 C
postgres.exe!standard_ExecutorRun(QueryDesc * queryDesc, ScanDirection direction, unsigned __int64 count, bool execute_once) Line 378 C
postgres.exe!PortalRunSelect(PortalData * portal, bool forward, long count, _DestReceiver * dest) Line 931 C
postgres.exe!PortalRun(PortalData * portal, long count, bool isTopLevel, bool run_once, _DestReceiver * dest, _DestReceiver * altdest, char * completionTag) Line 777 C
postgres.exe!exec_execute_message(const char * portal_name, long max_rows) Line 2098 C
postgres.exe!PostgresMain(int argc, char * * argv, const char * dbname, const char * username) Line 4299 C
postgres.exe!BackendRun(Port * port) Line 4432 C
postgres.exe!SubPostmasterMain(int argc, char * * argv) Line 4955 C
postgres.exe!main(int argc, char * * argv) Line 216 C
[External Code]

On Tue, Feb 4, 2020 at 11:40 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 2/4/20 6:20 AM, Doug Roberts wrote:
>> So how did containers_reset_recirc() come to clash with
>> containers_add_update()?
>
> They are clashing because another portion of our system is running and
> updating containers. The reset recirc function was run at the same time
> to see how our system and the database would handle it.

So does your system have the things Tom mentioned below?:

"The known bugs in that area
require either before-row-update triggers on the table, or
child tables (either partitioning or traditional inheritance).
So I wonder what the schema of table "containers" looks like."

>
> The recirc string is formatted like 2000=3,1000=6,5000=0. So the reset
> recirc function will take a UID (1000 for example) and use that to
> remove 1000=x from all of the recirc counts for all of the containers
> that have 1000=x.
>
> We are currently using PG 12.0.
>
> Thanks,
>
> Doug
>
> On Mon, Feb 3, 2020 at 6:21 PM Tom Lane <tgl@sss.pgh.pa.us
> <mailto:tgl@sss.pgh.pa.us>> wrote:
>
> Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> writes:
> > Please reply to list also.
>
> > On 2/3/20 2:18 PM, Doug Roberts wrote:
> >> Here is what the reset recirc function is doing.
> >> ...
> >> UPDATE containers
> >> ...
>
> > So how did containers_reset_recirc() come to clash with
> > containers_add_update()?
>
> If this is PG 12.0 or 12.1, a likely theory is that this is an
> EvalPlanQual bug (which'd be triggered during concurrent updates
> of the same row in the table, so that squares with the observation
> that locking the table prevents it). The known bugs in that area
> require either before-row-update triggers on the table, or
> child tables (either partitioning or traditional inheritance).
> So I wonder what the schema of table "containers" looks like.
>
> Or you could have hit some new bug ... but there's not enough
> info here to diagnose.
>
> regards, tom lane
>

--
Adrian Klaver
adrian.klaver@aklaver.com
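
For readers following along, the recirc string transformation described in the quoted message above (a value such as 2000=3,1000=6,5000=0 from which the 1000=x entry is removed) can be sketched as below. This is only an illustration in Python. The real containers_reset_recirc() is a database function whose body is not shown in this thread, and the name remove_recirc_uid is hypothetical.

```python
def remove_recirc_uid(recirc: str, uid: int) -> str:
    """Drop the 'uid=count' entry from a comma-separated recirc string.

    Hypothetical helper: it mirrors the behaviour described in the thread,
    not the actual containers_reset_recirc() implementation.
    """
    kept = []
    for entry in recirc.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty pieces such as a trailing comma
        key, _, _count = entry.partition("=")
        if key.strip() != str(uid):
            kept.append(entry)
    return ",".join(kept)


if __name__ == "__main__":
    print(remove_recirc_uid("2000=3,1000=6,5000=0", 1000))
```

Entries whose UID does not match are kept in their original order, so a string that never contained the UID comes back unchanged.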

Re: Postgres Crashing

From

Tom Lane

Date:

February 4, 2020, 22:34:41

Doug Roberts <h205881@gmail.com> writes:
> Hopefully the following stack trace is more helpful.

> Exception thrown at 0x0000000140446403 in postgres.exe: 0xC0000005: Access
> violation reading location 0xFFFFFFFFFFFFFFF8. occurred

>> postgres.exe!pfree(void * pointer) Line 1033 C
>   postgres.exe!tts_buffer_heap_clear(TupleTableSlot * slot) Line 653 C
>   [Inline Frame] postgres.exe!ExecClearTuple(TupleTableSlot *) Line 428 C
>   postgres.exe!ExecForceStoreHeapTuple(HeapTupleData * tuple,
> TupleTableSlot * slot, bool shouldFree) Line 1448 C
>   postgres.exe!ExecBRUpdateTriggers(EState * estate, EPQState * epqstate,
> ResultRelInfo * relinfo, ItemPointerData * tupleid, HeapTupleData *
> fdw_trigtuple, TupleTableSlot * newslot) Line 3117 C

Ah, so you *are* using before-row update triggers.  Almost certainly,
this is the same bug fixed by commit 60e97d63e, so you should be okay
if you update to 12.1.  (There are some related issues that will be
fixed in 12.2, due out next week.)

            regards, tom lane
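
The version advice in the message above can be written down as a small check. This is a rough sketch that only encodes what the thread states (crash fix in 12.1, related fixes in 12.2). The function names are hypothetical, and it assumes a plain release string such as the one returned by SHOW server_version, not a beta or development build.

```python
FIXED_IN = (12, 1)          # per the thread: the crash fix is in 12.1
RELATED_FIXES_IN = (12, 2)  # per the thread: related issues are fixed in 12.2


def parse_version(server_version: str) -> tuple:
    """Turn a string such as '12.0' or '12.1 (Debian ...)' into (12, 0)."""
    number = server_version.split()[0]
    major, minor = number.split(".")[:2]
    return int(major), int(minor)


def upgrade_advice(server_version: str) -> str:
    """Summarise the thread's advice for a PostgreSQL 12.x server."""
    version = parse_version(server_version)
    if version[0] != 12:
        return "outside the scope of this thread"
    if version < FIXED_IN:
        return "upgrade to 12.1 or later"
    if version < RELATED_FIXES_IN:
        return "crash fix present; related fixes arrive in 12.2"
    return "includes the crash fix and the related 12.2 fixes"


if __name__ == "__main__":
    print(upgrade_advice("12.0"))
```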

Re: Postgres Crashing

From

Doug Roberts

Date:

February 5, 2020, 00:21:53

It seems to be working fine now that I've upgraded to 12.1, and I'll keep an eye out for 12.2. However, we are not using a before-row update trigger. We do have an after-insert trigger on the containers table, though.

Thanks,

Doug
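
Since the stack trace shows an ExecBRUpdateTriggers frame while the message above reports only an after-insert trigger, one way to double-check what is defined on the table is to read pg_trigger and decode its tgtype bitmask. The sketch below is an assumption-laden illustration: the bit values are the TRIGGER_TYPE_* constants from PostgreSQL's pg_trigger.h as I understand them and should be verified against your server version, and the Python helper names are hypothetical.

```python
# Query to run in psql (catalog columns from pg_trigger):
#   SELECT tgname, tgtype FROM pg_trigger
#   WHERE tgrelid = 'containers'::regclass AND NOT tgisinternal;

# Assumed bit layout of pg_trigger.tgtype (check pg_trigger.h for your version).
TRIGGER_TYPE_ROW = 1 << 0
TRIGGER_TYPE_BEFORE = 1 << 1
TRIGGER_TYPE_INSERT = 1 << 2
TRIGGER_TYPE_DELETE = 1 << 3
TRIGGER_TYPE_UPDATE = 1 << 4
TRIGGER_TYPE_TRUNCATE = 1 << 5
TRIGGER_TYPE_INSTEAD = 1 << 6
TIMING_MASK = TRIGGER_TYPE_BEFORE | TRIGGER_TYPE_INSTEAD

EVENTS = [
    (TRIGGER_TYPE_INSERT, "INSERT"),
    (TRIGGER_TYPE_DELETE, "DELETE"),
    (TRIGGER_TYPE_UPDATE, "UPDATE"),
    (TRIGGER_TYPE_TRUNCATE, "TRUNCATE"),
]


def describe_trigger(tgtype: int) -> str:
    """Render a tgtype value as, e.g., 'AFTER INSERT FOR EACH ROW'."""
    timing_bits = tgtype & TIMING_MASK
    if timing_bits == TRIGGER_TYPE_INSTEAD:
        timing = "INSTEAD OF"
    elif timing_bits == TRIGGER_TYPE_BEFORE:
        timing = "BEFORE"
    else:
        timing = "AFTER"
    events = " OR ".join(name for bit, name in EVENTS if tgtype & bit)
    level = "ROW" if tgtype & TRIGGER_TYPE_ROW else "STATEMENT"
    return f"{timing} {events} FOR EACH {level}"


def is_before_row_update(tgtype: int) -> bool:
    """True for the trigger kind discussed in the thread (BEFORE UPDATE, row level)."""
    return (
        bool(tgtype & TRIGGER_TYPE_ROW)
        and (tgtype & TIMING_MASK) == TRIGGER_TYPE_BEFORE
        and bool(tgtype & TRIGGER_TYPE_UPDATE)
    )


if __name__ == "__main__":
    for value in (5, 19):
        print(value, describe_trigger(value), is_before_row_update(value))
```

Feeding each tgtype value from the query into is_before_row_update() shows whether any user-defined trigger on the table matches the kind named in the thread.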

