Обсуждение: BUG #15021: Postgres crashes unexpectedly

Поиск
Список
Период
Сортировка

BUG #15021: Postgres crashes unexpectedly

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      15021
Logged by:          Wallace Baggaley
Email address:      wally.baggaley@gmail.com
PostgreSQL version: 9.6.5
Operating system:   Debian GNU Linux 8 container on CentOS 7.2.1511
Description:

Postgres crashes unexpectedly and indicates that it received a signal to end
based on the following logs. But checking user history and journald show
nothing has terminated the application. Is it possible the log "LOG:
received smart shutdown request" could be caused by something other than a
signal. What signals could be associated with this log or is this possibly
related to a crash?

Thanks!
Wally

        January 19th 2018, 11:26:14.710 LOG:  autovacuum launcher started
    January 19th 2018, 11:26:14.706    LOG:  database system is ready to accept
connections
    January 19th 2018, 11:26:14.704    LOG:  MultiXact member wraparound
protections are now enabled
    January 19th 2018, 11:26:14.690    LOG:  invalid record length at 0/88D9C00:
wanted 24, got 0
    January 19th 2018, 11:26:14.690    LOG:  redo is not required
    January 19th 2018, 11:26:14.689    LOG:  database system was not properly shut
down; automatic recovery in progress
    January 19th 2018, 11:26:14.677    LOG:  database system was interrupted; last
known up at 2018-01-19 19:26:04 UTC
    January 19th 2018, 11:26:04.710    LOG:  autovacuum launcher started
    January 19th 2018, 11:26:04.709    LOG:  database system is ready to accept
connections
    January 19th 2018, 11:26:04.707    LOG:  MultiXact member wraparound
protections are now enabled
    January 19th 2018, 11:26:04.698    LOG:  redo done at 0/88D9B68
    January 19th 2018, 11:26:04.698    LOG:  last completed transaction was at log
time 2018-01-19 18:33:58.443636+00
    January 19th 2018, 11:26:04.698    LOG:  invalid record length at 0/88D9B90:
wanted 24, got 0
    January 19th 2018, 11:26:04.694    LOG:  redo starts at 0/88D2B70
    January 19th 2018, 11:26:04.692    LOG:  database system was not properly shut
down; automatic recovery in progress
    January 19th 2018, 11:26:04.678    LOG:  database system was interrupted; last
known up at 2018-01-19 16:41:01 UTC
    January 19th 2018, 10:34:07.436    DETAIL:  Key ("code")=(ABC123) already
exists.
    January 19th 2018, 10:34:07.436    STATEMENT:  insert into "TableName"
("code","subcode")  values ($1,$2)
    January 19th 2018, 10:34:07.436    ERROR:  duplicate key value violates unique
constraint "INDEX_NAME"
    January 19th 2018, 10:34:03.807    STATEMENT:  insert into "TableName"
("code","subcode")  values ($1,$2)
    January 19th 2018, 10:34:03.806    DETAIL:  Key ("code")=(ABC123) already
exists.
    January 19th 2018, 10:34:03.806    ERROR:  duplicate key value violates unique
constraint "INDEX_NAME"
    January 19th 2018, 10:33:58.533    LOG:  autovacuum launcher shutting down
    January 19th 2018, 10:33:58.526    LOG:  received smart shutdown request
    January 19th 2018, 10:33:58.452    DETAIL:  Key ("code")=(ABC123) already
exists.
    January 19th 2018, 10:33:58.452    STATEMENT:  insert into "TableName"
("code","subcode")  values ($1,$2)
    January 19th 2018, 10:33:58.451    ERROR:  duplicate key value violates unique
constraint "INDEX_NAME"



Re: BUG #15021: Postgres crashes unexpectedly

От
Merlin Moncure
Дата:
On Fri, Jan 19, 2018 at 3:47 PM, PG Bug reporting form
<noreply@postgresql.org> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      15021
> Logged by:          Wallace Baggaley
> Email address:      wally.baggaley@gmail.com
> PostgreSQL version: 9.6.5
> Operating system:   Debian GNU Linux 8 container on CentOS 7.2.1511
> Description:
>
> Postgres crashes unexpectedly and indicates that it received a signal to end
> based on the following logs. But checking user history and journald show
> nothing has terminated the application. Is it possible the log "LOG:
> received smart shutdown request" could be caused by something other than a
> signal. What signals could be associated with this log or is this possibly
> related to a crash?
>
> Thanks!
> Wally
>
>         January 19th 2018, 11:26:14.710 LOG:  autovacuum launcher started
>         January 19th 2018, 11:26:14.706 LOG:  database system is ready to accept
> connections
>         January 19th 2018, 11:26:14.704 LOG:  MultiXact member wraparound
> protections are now enabled
>         January 19th 2018, 11:26:14.690 LOG:  invalid record length at 0/88D9C00:
> wanted 24, got 0
>         January 19th 2018, 11:26:14.690 LOG:  redo is not required
>         January 19th 2018, 11:26:14.689 LOG:  database system was not properly shut
> down; automatic recovery in progress
>         January 19th 2018, 11:26:14.677 LOG:  database system was interrupted; last
> known up at 2018-01-19 19:26:04 UTC
>         January 19th 2018, 11:26:04.710 LOG:  autovacuum launcher started
>         January 19th 2018, 11:26:04.709 LOG:  database system is ready to accept
> connections
>         January 19th 2018, 11:26:04.707 LOG:  MultiXact member wraparound
> protections are now enabled
>         January 19th 2018, 11:26:04.698 LOG:  redo done at 0/88D9B68
>         January 19th 2018, 11:26:04.698 LOG:  last completed transaction was at log
> time 2018-01-19 18:33:58.443636+00
>         January 19th 2018, 11:26:04.698 LOG:  invalid record length at 0/88D9B90:
> wanted 24, got 0
>         January 19th 2018, 11:26:04.694 LOG:  redo starts at 0/88D2B70
>         January 19th 2018, 11:26:04.692 LOG:  database system was not properly shut
> down; automatic recovery in progress
>         January 19th 2018, 11:26:04.678 LOG:  database system was interrupted; last
> known up at 2018-01-19 16:41:01 UTC
>         January 19th 2018, 10:34:07.436 DETAIL:  Key ("code")=(ABC123) already
> exists.
>         January 19th 2018, 10:34:07.436 STATEMENT:  insert into "TableName"
> ("code","subcode")  values ($1,$2)
>         January 19th 2018, 10:34:07.436 ERROR:  duplicate key value violates unique
> constraint "INDEX_NAME"
>         January 19th 2018, 10:34:03.807 STATEMENT:  insert into "TableName"
> ("code","subcode")  values ($1,$2)
>         January 19th 2018, 10:34:03.806 DETAIL:  Key ("code")=(ABC123) already
> exists.
>         January 19th 2018, 10:34:03.806 ERROR:  duplicate key value violates unique
> constraint "INDEX_NAME"
>         January 19th 2018, 10:33:58.533 LOG:  autovacuum launcher shutting down
>         January 19th 2018, 10:33:58.526 LOG:  received smart shutdown request
>         January 19th 2018, 10:33:58.452 DETAIL:  Key ("code")=(ABC123) already
> exists.
>         January 19th 2018, 10:33:58.452 STATEMENT:  insert into "TableName"
> ("code","subcode")  values ($1,$2)
>         January 19th 2018, 10:33:58.451 ERROR:  duplicate key value violates unique
> constraint "INDEX_NAME"

These log messages appear not to be in order; is this how the were in the file?

merlin


Re: BUG #15021: Postgres crashes unexpectedly

От
Wallace Baggaley
Дата:
This was pulled from kibana/elasticsearch, so logs with the same millisecond log may be out-of-order. Otherwise should be in order.

On Fri, Jan 19, 2018 at 1:53 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Fri, Jan 19, 2018 at 3:47 PM, PG Bug reporting form
<noreply@postgresql.org> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      15021
> Logged by:          Wallace Baggaley
> Email address:      wally.baggaley@gmail.com
> PostgreSQL version: 9.6.5
> Operating system:   Debian GNU Linux 8 container on CentOS 7.2.1511
> Description:
>
> Postgres crashes unexpectedly and indicates that it received a signal to end
> based on the following logs. But checking user history and journald show
> nothing has terminated the application. Is it possible the log "LOG:
> received smart shutdown request" could be caused by something other than a
> signal. What signals could be associated with this log or is this possibly
> related to a crash?
>
> Thanks!
> Wally
>
>         January 19th 2018, 11:26:14.710 LOG:  autovacuum launcher started
>         January 19th 2018, 11:26:14.706 LOG:  database system is ready to accept
> connections
>         January 19th 2018, 11:26:14.704 LOG:  MultiXact member wraparound
> protections are now enabled
>         January 19th 2018, 11:26:14.690 LOG:  invalid record length at 0/88D9C00:
> wanted 24, got 0
>         January 19th 2018, 11:26:14.690 LOG:  redo is not required
>         January 19th 2018, 11:26:14.689 LOG:  database system was not properly shut
> down; automatic recovery in progress
>         January 19th 2018, 11:26:14.677 LOG:  database system was interrupted; last
> known up at 2018-01-19 19:26:04 UTC
>         January 19th 2018, 11:26:04.710 LOG:  autovacuum launcher started
>         January 19th 2018, 11:26:04.709 LOG:  database system is ready to accept
> connections
>         January 19th 2018, 11:26:04.707 LOG:  MultiXact member wraparound
> protections are now enabled
>         January 19th 2018, 11:26:04.698 LOG:  redo done at 0/88D9B68
>         January 19th 2018, 11:26:04.698 LOG:  last completed transaction was at log
> time 2018-01-19 18:33:58.443636+00
>         January 19th 2018, 11:26:04.698 LOG:  invalid record length at 0/88D9B90:
> wanted 24, got 0
>         January 19th 2018, 11:26:04.694 LOG:  redo starts at 0/88D2B70
>         January 19th 2018, 11:26:04.692 LOG:  database system was not properly shut
> down; automatic recovery in progress
>         January 19th 2018, 11:26:04.678 LOG:  database system was interrupted; last
> known up at 2018-01-19 16:41:01 UTC
>         January 19th 2018, 10:34:07.436 DETAIL:  Key ("code")=(ABC123) already
> exists.
>         January 19th 2018, 10:34:07.436 STATEMENT:  insert into "TableName"
> ("code","subcode")  values ($1,$2)
>         January 19th 2018, 10:34:07.436 ERROR:  duplicate key value violates unique
> constraint "INDEX_NAME"
>         January 19th 2018, 10:34:03.807 STATEMENT:  insert into "TableName"
> ("code","subcode")  values ($1,$2)
>         January 19th 2018, 10:34:03.806 DETAIL:  Key ("code")=(ABC123) already
> exists.
>         January 19th 2018, 10:34:03.806 ERROR:  duplicate key value violates unique
> constraint "INDEX_NAME"
>         January 19th 2018, 10:33:58.533 LOG:  autovacuum launcher shutting down
>         January 19th 2018, 10:33:58.526 LOG:  received smart shutdown request
>         January 19th 2018, 10:33:58.452 DETAIL:  Key ("code")=(ABC123) already
> exists.
>         January 19th 2018, 10:33:58.452 STATEMENT:  insert into "TableName"
> ("code","subcode")  values ($1,$2)
>         January 19th 2018, 10:33:58.451 ERROR:  duplicate key value violates unique
> constraint "INDEX_NAME"

These log messages appear not to be in order; is this how the were in the file?

merlin

Re: BUG #15021: Postgres crashes unexpectedly

От
Francisco Olarte
Дата:
On Sat, Jan 20, 2018 at 1:32 AM, Wallace Baggaley
<wally.baggaley@gmail.com> wrote:
> This was pulled from kibana/elasticsearch, so logs with the same millisecond
> log may be out-of-order. Otherwise should be in order.

They seem to be in DESCENDING time order. This is used by a lot of
things ( like the gmail web MUA I'm using now ), as it can be prettier
to show /easier to look at the last event in a web form / ... but due
to the way people are usually taught to read it makes reading it for
diagnostic purpose extremely difficult / annoying.

Francisco Olarte


Re: BUG #15021: Postgres crashes unexpectedly

От
Tomas Vondra
Дата:

On 01/19/2018 10:47 PM, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      15021
> Logged by:          Wallace Baggaley
> Email address:      wally.baggaley@gmail.com
> PostgreSQL version: 9.6.5
> Operating system:   Debian GNU Linux 8 container on CentOS 7.2.1511
> Description:        
> 
> Postgres crashes unexpectedly and indicates that it received a signal to end
> based on the following logs. But checking user history and journald show
> nothing has terminated the application. Is it possible the log "LOG: 
> received smart shutdown request" could be caused by something other than a
> signal. What signals could be associated with this log or is this possibly
> related to a crash?
> 

Apparently something happened between 10:34:07.436 and 11:26:04.678, as
indicated by these log messages:

    January 19th 2018, 11:26:04.692 LOG:  database system was not
        properly shut down; automatic recovery in progress
    January 19th 2018, 11:26:04.678 LOG:  database system was
        interrupted; last known up at 2018-01-19 16:41:01 UTC
    January 19th 2018, 10:34:07.436 DETAIL:  Key ("code")=(ABC123)
        already exists.

but it's damn impossible to say why that happened. You have to inspect
some additional logs from that machine - it might have been rebooted,
the process might have been killed by OOM killer (although that usually
writes additional stuff to the log) etc.

My recommendation - look at dmesg and logs for other services on the
same host.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services