Re: PostgreSQL with BDR - PANIC: could not create replication identifier checkpoint

From Cameron Smith
Subject Re: PostgreSQL with BDR - PANIC: could not create replication identifier checkpoint
Date
Msg-id CO2PR0801MB22144A06118F31215B6023AEA04A0@CO2PR0801MB2214.namprd08.prod.outlook.com
In reply to Re: PostgreSQL with BDR - PANIC: could not create replication identifier checkpoint  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: PostgreSQL with BDR - PANIC: could not create replication identifier checkpoint  (Martín Marqués <martin@2ndquadrant.com>)
List pgsql-general
I'd agree:  most likely a file system problem.  Is there any hope that this file could be re-built?

My current plan is to use bdr_part_by_node_names to remove the failing node and then rebuild it from a fresh backup
(and probably on a new server).

Thank you for your help!

Cameron Smith


________________________________________
From: Alvaro Herrera <alvherre@2ndquadrant.com>
Sent: May 19, 2016 2:56 PM
To: Cameron Smith
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] PostgreSQL with BDR - PANIC:  could not create replication identifier checkpoint

Cameron Smith wrote:

> t:2016-05-19 01:14:51.668 UTC d= p=144 a=PANIC:  could not create replication identifier checkpoint "pg_logical/checkpoints/8-F3923F98.ckpt.tmp": Invalid argument

This line corresponds to the following code in BDR's 9.4.4
src/backend/replication/logical/replication_identifier.c:

    /*
     * no other backend can perform this at the same time, we're protected by
     * CheckpointLock.
     */
    tmpfd = OpenTransientFile(tmppath,
                              O_CREAT | O_EXCL | O_WRONLY | PG_BINARY,
                              S_IRUSR | S_IWUSR);
    if (tmpfd < 0)
        ereport(PANIC,
                (errcode_for_file_access(),
                 errmsg("could not create replication identifier checkpoint \"%s\": %m",
                        tmppath)));

This file does not exist in 9.5, but instead we have
src/backend/replication/logical/origin.c which has identical code.

OpenTransientFile calls BasicOpenFile, which in turn calls open() and
propagates the errno.  My manpage doesn't list any possible reasons for
open() to return EINVAL, so I'm at a loss about what is happening here.
Maybe this is a filesystem problem?
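
One way to test the filesystem theory outside PostgreSQL is to issue the same kind of open() call by hand. What follows is only a minimal sketch, not PostgreSQL or BDR code: try_exclusive_create is a hypothetical helper name, and it assumes a POSIX platform, where PG_BINARY expands to 0 and can be left out of the flags.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): try to create "path" with
 * the same flags and mode that the quoted code passes to OpenTransientFile.
 * PG_BINARY is omitted on the assumption of a non-Windows platform.
 *
 * Returns 0 if the file could be created (it is removed again right away),
 * otherwise the errno value that open() reported.
 */
int
try_exclusive_create(const char *path)
{
    int fd;
    int saved_errno;

    assert(path != NULL);

    fd = open(path, O_CREAT | O_EXCL | O_WRONLY, S_IRUSR | S_IWUSR);
    if (fd < 0)
    {
        /* Save errno before any other library call can overwrite it. */
        saved_errno = errno;
        fprintf(stderr, "could not create \"%s\": %s\n",
                path, strerror(saved_errno));
        return saved_errno;
    }

    /* Success: clean up so the probe leaves nothing behind. */
    close(fd);
    unlink(path);
    return 0;
}
```

Calling this as the postgres OS user with a scratch file name inside the data directory's pg_logical/checkpoints (a name that does not collide with a real checkpoint file) should show whether the failure reproduces without the server. A return value of EINVAL there would point at the filesystem or mount rather than at BDR.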

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services