Re: Postgres abort found in 9.3.11

From: K S, Sandhya (Nokia - IN/Bangalore)
Subject: Re: Postgres abort found in 9.3.11
Msg-id: DB5PR07MB154156B5B062C8769E8A569ED6E20@DB5PR07MB1541.eurprd07.prod.outlook.com
In response to: Re: Postgres abort found in 9.3.11  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

Hello Tom,

Apologies for the delayed reply.

Our setup is a hot-standby architecture. The crash occurs only on the standby node; Postgres continues to run
without any issues on the active node.
The postmaster is waiting to start and logs these messages:

Aug 22 11:44:21.462555 info node-0 postgres[8222]: [1-2] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
Aug 22 11:44:52.065760 crit node-1 postgres[8629]: [18-1] err-3:  btree_xlog_delete_get_latestRemovedXid: cannot operate with inconsistent data
Aug 22 11:44:52.065971 crit CFPU-1 postgres[8629]: [18-2] CONTEXT:  xlog redo delete: index 1663/16386/17378; iblk 1, heap 1663/16386/16518;
Aug 22 11:44:52.085486 info node-1 coredumper: Generating core file
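
To find out which index and table the CONTEXT line refers to, the identifiers can be pulled out of the message and looked up in pg_class on the active node. The following is only a minimal Python sketch. It assumes each triplet is tablespace OID/database OID/relfilenode, and the function name parse_redo_delete_context is hypothetical.

```python
import re

# Pattern for the "xlog redo delete" CONTEXT text quoted above.
# Assumption: each triplet is tablespace OID / database OID / relfilenode.
CONTEXT_RE = re.compile(
    r"index\s*(\d+)/(\d+)/(\d+);\s*iblk\s+(\d+),\s*heap\s*(\d+)/(\d+)/(\d+)"
)


def parse_redo_delete_context(text):
    """Return index and heap identifiers from a redo-delete CONTEXT line, or None."""
    m = CONTEXT_RE.search(text)
    if m is None:
        return None
    nums = [int(g) for g in m.groups()]
    return {
        "index": {"tablespace": nums[0], "database": nums[1], "relfilenode": nums[2]},
        "index_block": nums[3],
        "heap": {"tablespace": nums[4], "database": nums[5], "relfilenode": nums[6]},
    }


if __name__ == "__main__":
    line = "xlog redo delete: index 1663/16386/17378; iblk 1, heap 1663/16386/16518;"
    info = parse_redo_delete_context(line)
    # The relfilenode values can be matched against pg_class.relfilenode.
    print(info["index"]["relfilenode"], info["heap"]["relfilenode"])
```
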

The standby Postgres recovers automatically on the next restart, because we always copy the database afresh from the
active node on restart.

We implemented a patch to force-kill the walsender on the active side. This avoids a prolonged wait when the standby
node is not reachable (e.g. forced power-off or LAN cable removal). This implementation has existed for a long time;
however, the issue has been observed only recently, after upgrading to 9.3.11. Do you think this force kill of the
walsender might lead to such issues in the latest Postgres?


Regards,
Sandhya

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, August 30, 2016 5:09 PM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya.k_s@nokia.com>
Cc: pgsql-hackers@postgresql.org; Itnal, Prakash (Nokia - IN/Bangalore) <prakash.itnal@nokia.com>
Subject: Re: [HACKERS] Postgres abort found in 9.3.11

"K S, Sandhya (Nokia - IN/Bangalore)" <sandhya.k_s@nokia.com> writes:
> During the server restart, we are getting postgres crash with sigabrt. No other operation being performed.
> Attached the backtrace.

What shows up in the postmaster log?

> The occurrence is occasional. The issue is seen once in 30~50 times.

Does it successfully restart if you try again?  If not, what are you
doing to recover?
        regards, tom lane


