Re: Cluster seems broken after pg_basebackup

From: Adrian Klaver
Subject: Re: Cluster seems broken after pg_basebackup
Msg-id: 54D91170.1000809@aklaver.com
In reply to: Cluster seems broken after pg_basebackup  (Guillaume Drolet <droletguillaume@gmail.com>)
List: pgsql-general
On 02/09/2015 08:34 AM, Guillaume Drolet wrote:

CCing list so the information stays in the thread.
>
>
> 2015-02-06 18:44 GMT-05:00 Adrian Klaver <adrian.klaver@aklaver.com>:
>
>     On 02/06/2015 09:17 AM, Guillaume Drolet wrote:
>
>         Dear Adrian,
>
>         Thanks for helping me. Sorry for the lack of details; I had told
>         myself not to forget them, but I hit the send button too fast.
>         You know how it is...
>
>         I added more info in your reply below.
>
>
>              First some questions:
>
>              1) What Postgres version?
>
>
>         9.3
>
>
>         Windows 7
>
>
>              3) Where were you backing up from and to?
>
>
>         Backing up from my only cluster (PGDATA) on disk E, to a backup
>         directory on another disk (F:) using this command:
>
>         pg_basebackup -D "F:\\db_base_backup" -Fp -Xs  -R -P
>         --label="basebackup20150205" --username=postgres
>
>         What's weird is that I did some successful tests last week on the
>         same system (backing up, archiving, recovering) using the same
>         procedure. The only difference was the cluster, which was much
>         smaller for testing purposes but located at the same place (i.e.
>         E:\data), with PostgreSQL installed in C:\Programs\...
>
>
>              4) Which cluster does not start, the master or the child you
>              created with pg_basebackup?
>
>
>
>         The master. I haven't tried the child yet. But I saw that the
>         message about role "208375PT$" is in logs from before the backup too.
>
>
>         This is the local domain of my machine. I log onto my machine with a
>         local admin account, using the domain name 208375PT (I didn't set up
>         this part of my machine; the IT guys here at work did). The thing is,
>         I don't understand why it's there in the log file.
>
>
>     Not sure.
>
>     What are you using for an authentication method for database login?
>

  At this moment, for my tests I use md5 for user 'postgres' and trust for
  user 'all'.

>
>
>
>
>                  And after that, I went back to the log file and there's new
>                  information added:
>
>                  2015-02-06 07:51:05 EST LOG:  server process (PID 184) was terminated by exception 0x80000004
>                  2015-02-06 07:51:05 EST DETAIL:  Failed process was running: SELECT version();
>                  2015-02-06 07:51:05 EST HINT:  See C include file "ntstatus.h" for a description of the hexadecimal value.
>
>
>              Well according to here:
>
>              https://msdn.microsoft.com/en-us/library/cc704588.aspx
>
>              0x80000004
>              STATUS_SINGLE_STEP
>
>
>              {EXCEPTION} Single Step A single step or trace operation has
>              just been completed.
>
>              A developer is going to have to explain what that means.
>
>
>
>
>              My suspicion is that you copied at least partly over a running
>              server.
>
>
>         How would that be possible? Using the pg_basebackup command I wrote
>         above, it is clear that I wrote the backup on disk F and not E.
>
>
>     I was just speculating, I would not put too much stock in it.
>
>
>
>         While writing this post, I started my backup using:
>
>         pg_ctl start -D "F:\db_basebackup"
>
>         Similar stuff happened with pgAdmin and the log (the message about
>         the symbolic link is related to my post from yesterday; I don't know
>         if it could be involved in the current problem):
>
>         2015-02-06 12:13:58 EST LOG:  database system was interrupted; last known up at 2015-02-05 14:30:34 EST
>         2015-02-06 12:13:58 EST LOG:  creating missing WAL directory "pg_xlog/archive_status"
>         2015-02-06 12:13:58 EST LOG:  redo starts at 24B/28000090
>         2015-02-06 12:13:58 EST LOG:  could not remove symbolic link "pg_tblspc/940585": No such file or directory
>         2015-02-06 12:13:58 EST CONTEXT:  xlog redo drop tablespace: 940585
>         2015-02-06 12:13:58 EST LOG:  consistent recovery state reached at 24B/290000B8
>         2015-02-06 12:13:58 EST LOG:  redo done at 24B/290000B8
>         2015-02-06 12:13:58 EST LOG:  last completed transaction was at log time 2015-02-05 09:06:04.892-05
>         2015-02-06 12:13:59 EST LOG:  database system is ready to accept connections
>         2015-02-06 12:13:59 EST LOG:  autovacuum launcher started
>         2015-02-06 12:14:42 EST LOG:  server process (PID 1784) was terminated by exception 0x80000004
>         2015-02-06 12:14:42 EST DETAIL:  Failed process was running: SELECT version();
>         2015-02-06 12:14:42 EST HINT:  See C include file "ntstatus.h" for a description of the hexadecimal value.
>         2015-02-06 12:14:42 EST LOG:  terminating any other active server processes
>         2015-02-06 12:14:42 EST WARNING:  terminating connection because of crash of another server process
>         2015-02-06 12:14:42 EST DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
>         2015-02-06 12:14:42 EST HINT:  In a moment you should be able to reconnect to the database and repeat your command.
>         2015-02-06 12:14:42 EST LOG:  all server processes terminated; reinitializing
>
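
For anyone who wants to pull the crash entries out of a longer server log, here is a minimal Python sketch. It assumes each log entry sits on a single line in the raw log file (the wrapping above comes from the mail client), and that the crash message contains "(PID n)" followed by "exception 0x...", which holds for both the English and the French wording of this message. The function name find_crashes is made up for illustration.

```python
import re

# Matches "(PID 1784) ... exception 0x80000004" on a single log line.
# Works for the English message ("was terminated by exception 0x...")
# and the French one ("a été arrêté par l'exception 0x...").
CRASH_RE = re.compile(
    r"\(PID\s+(?P<pid>\d+)\)[^\n]*?exception\s+(?P<code>0x[0-9A-Fa-f]+)"
)


def find_crashes(log_text):
    """Return a list of (pid, exception_code) tuples, one per crash line."""
    return [
        (int(m.group("pid")), int(m.group("code"), 16))
        for m in CRASH_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "2015-02-06 12:14:42 EST LOG:  server process (PID 1784) "
        "was terminated by exception 0x80000004\n"
        "2015-02-06 12:14:42 EST LOG:  terminating any other active "
        "server processes\n"
    )
    for pid, code in find_crashes(sample):
        print(f"backend {pid} crashed with exception 0x{code:08X}")
```

Running it against the whole log file shows whether every crash carries the same exception code and whether the failing PIDs line up with particular client connections.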
>
>         Any ideas where to go from here?
>
>
>     In both cases the database got to the point below, which would seem
>     to indicate everything was alright.
>
>     2015-02-06 07:11:38 EST LOG: redo is not required
>     2015-02-06 07:11:38 EST LOG: database system is ready to accept
>     connections
>
>     Also from what I can see the server crashed at this point:
>
>     2015-02-06 12:13:59 EST LOG: autovacuum launcher started
>     2015-02-06 12:14:42 EST LOG: server process (PID 1784) was terminated
>     by exception 0x80000004
>
>
>     Now 0x80000004 is supposed to mean:
>
>     STATUS_SINGLE_STEP
>
>
>     {EXCEPTION} Single Step A single step or trace operation has just
>     been completed.
>
>     Some digging indicates this is the result of a debugger command. I
>     have no idea how that would be invoked in Postgres running production
>     code. This leads to my default question when I see unexplained
>     behavior on a Windows machine: do you have anti-virus software
>     running against the drives?
>
>

  Yes I do, and I'm not allowed to turn it off (I don't have such
  privileges). But the anti-virus software is running on my other machine
  (same setup) and I've never had such problems. Even on this machine
  that's giving me problems, I spent the last two weeks running tests with
  point-in-time recovery and everything went fine.
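
On the 0x80000004 value discussed above: an NTSTATUS code is a 32-bit value with a documented layout (2 severity bits, a customer bit, a reserved bit, a 12-bit facility, and a 16-bit code). The sketch below splits a value into those fields. The helper name decode_ntstatus and the three-entry lookup table are illustrative only; the full list of names lives in ntstatus.h.

```python
# Severity is stored in the top two bits of an NTSTATUS value.
SEVERITIES = {0: "success", 1: "informational", 2: "warning", 3: "error"}

# A deliberately tiny table of well-known codes from ntstatus.h.
KNOWN_CODES = {
    0x80000003: "STATUS_BREAKPOINT",
    0x80000004: "STATUS_SINGLE_STEP",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def decode_ntstatus(value):
    """Split a 32-bit NTSTATUS value into its bit fields."""
    value &= 0xFFFFFFFF                          # keep only the low 32 bits
    return {
        "severity": SEVERITIES[value >> 30],     # bits 31-30
        "customer": bool(value & 0x20000000),    # bit 29
        "facility": (value >> 16) & 0x0FFF,      # bits 27-16
        "code": value & 0xFFFF,                  # bits 15-0
        "name": KNOWN_CODES.get(value, "unknown"),
    }


if __name__ == "__main__":
    print(decode_ntstatus(0x80000004))
```

For 0x80000004 this reports a warning-severity code in facility 0, which fits a debug-style exception and is different from the error-severity 0xC0000005 access violation that a plain memory fault would produce.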

>
>
>
>         Thanks a lot again.
>
>
>                  Thanks a lot for helping! Guillaume
>
>
>
>              --
>              Adrian Klaver
>              adrian.klaver@aklaver.com
>
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

