Re: pg_basebackup: return value 1: reason?

From	Adrian Klaver
Subject	Re: pg_basebackup: return value 1: reason?
Date	April 15, 2016, 23:17:34
Msg-id	57117683.1010507@aklaver.com
In reply to	pg_basebackup: return value 1: reason? (Andrej Vanek <andrej.vanek.sk@gmail.com>)
Responses	Re: pg_basebackup: return value 1: reason?
List	pgsql-general

On 04/15/2016 03:28 PM, Andrej Vanek wrote:
> Hello,
>
> I tried to run pg_basebackup. Return value is 1.
>
> How can I find out the reason?
> (I suspect that some WAL written after the backup is missing, but how do
> I find out the real reason? How do I fix it?)

First, it is not clear to me where you are taking the backup from: the
master or the standby?

Second, there is a lot of redirection going on. What happens if you run
pg_basebackup directly (without going through su - postgres ...) and use
hardcoded values instead of shell variables?
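
A minimal sketch of such a direct run, to be started as the postgres user. The function name run_and_report and its log directory argument are made up for illustration. It runs the given command with no wrapper in between, keeps stdout and stderr in separate files, and prints the exit status of that command alone:

```shell
#!/bin/sh
# Hypothetical helper, not part of PostgreSQL: run a command directly,
# keep its stdout and stderr in separate files, report its exit status.
run_and_report() {
    log_dir=$1            # directory for stdout.log and stderr.log
    shift                 # the remaining arguments are the command to run
    status=0
    "$@" >"$log_dir/stdout.log" 2>"$log_dir/stderr.log" || status=$?
    echo "exit status: $status"
    return "$status"
}
```

With hardcoded values it could be called as
run_and_report /tmp /usr/pgsql-9.5/bin/pg_basebackup -h 192.0.2.10 -D /pg_data -U pgreplic -P -v -X stream
where 192.0.2.10 and /pg_data are placeholders for your master address and target directory, not values taken from your setup.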

>
> thanks, Andrej
> --------------details:
> environment: CentOS 6.7, postgres 9.5.1
> ( PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
> 20120313 (Red Hat 4.4.7-16), 64-bit)
>
> I tried 2 forms of pg_basebackup (-X fetch and -X stream). Both were
> issued from a script:
> # su - postgres -c "/usr/pgsql-9.5/bin/pg_basebackup -h ${DB_MASTER_IP}
> -D ${GEO_STDBY_DATA} -U pgreplic -P -v -X fetch" 2>${LOG_FILE}.stderr
>   >> ${LOG_FILE}
> # echo $?
> 1             <--------------pg_basebackup failed!
> # cat log.stderr
> # cat /var/log/cluster/geo_repair.log.err
> transaction log start point: 0/E3000028 on timeline 1
> WARNING:  skipping special file "./pg_hba.conf"
> WARNING:  skipping special file "./pg_hba.conf.save"
> transaction log end point: 0/E30000F8
> pg_basebackup: base backup completed            <------------------no
> reason for pg_basebackup failure!
> # cp /tmp/pg_hba.conf /tmp/postgresql.conf /pg_data/
> # su - postgres -c "/usr/pgsql-9.5/bin/pg_ctl -D /pg_data/ start"
> # tail /pg_data/pg_log/postgresql-Fri.log
> `pg_xlog/0000000100000000000000E2' ->
> `../backups/arc/0000000100000000000000E2'
> 2016-04-15 23:15:10 CEST:pgreplic@[unknown]:[10667] WARNING:  skipping
> special file "./pg_hba.conf"
> 2016-04-15 23:15:10 CEST:pgreplic@[unknown]:[10667] WARNING:  skipping
> special file "./pg_hba.conf.save"         <---------------recorded in
> pg_log on master node and copied by pg_basebackup (note time difference
> between two servers)
> 2016-04-15 23:15:02 CEST:@:[23321] LOG:  database system was
> interrupted; last known up at 2016-04-15 23:15:10 CEST
> 2016-04-15 23:15:02 CEST:postgres@postgres:[23329] FATAL:  the database
> system is starting up
> 2016-04-15 23:15:03 CEST:@:[23321] LOG:  entering standby mode
> 2016-04-15 23:15:03 CEST:@:[23321] LOG:  database system was not
> properly shut down; automatic recovery in progress <---------something
> missing from pg_basebackup
> 2016-04-15 23:15:03 CEST:@:[23321] LOG:  redo starts at 0/E3000028
> 2016-04-15 23:15:03 CEST:@:[23321] LOG:  consistent recovery state
> reached at 0/E4000000
> 2016-04-15 23:15:03 CEST:@:[23295] LOG:  database system is ready to
> accept read only connections
> 2016-04-15 23:15:03 CEST:@:[23356] LOG:  started streaming WAL from
> primary at 0/E4000000 on timeline 1
> -------second trial
> # su - postgres -c "/usr/pgsql-9.5/bin/pg_basebackup -h ${DB_MASTER_IP}
> -D ${GEO_STDBY_DATA} -U pgreplic -P -v -X stream"
> # echo $?
> 1
> #  cat /var/log/cluster/geo_repair.log.err
> transaction log start point: 0/E5000028 on timeline 1
> pg_basebackup: starting background WAL receiver
> WARNING:  skipping special file "./pg_hba.conf"
> WARNING:  skipping special file "./pg_hba.conf.save"
> transaction log end point: 0/E50000F8
> pg_basebackup: waiting for background process to finish streaming ...
> pg_basebackup: could not wait for child process: No child processes
>     <----what does this mean? I think it failed to start the process that
> fetches the WAL created during the backup, but neither the master node
> nor the pg_basebackup output gives any information about the reason.
> (max_wal_senders on master is 10: I see no reason to fail).
>
> postgres logs:
> `pg_xlog/0000000100000000000000E4' ->
> `../backups/arc/0000000100000000000000E4'
> 2016-04-15 23:35:09 CEST:pgreplic@[unknown]:[29035] WARNING:  skipping
> special file "./pg_hba.conf"
> 2016-04-15 23:35:09 CEST:pgreplic@[unknown]:[29035] WARNING:  skipping
> special file "./pg_hba.conf.save"
> 2016-04-15 23:35:01 CEST:@:[28926] LOG:  database system was
> interrupted; last known up at 2016-04-15 23:35:09 CEST
> 2016-04-15 23:35:01 CEST:postgres@postgres:[28938] FATAL:  the database
> system is starting up
> 2016-04-15 23:35:02 CEST:@:[28926] LOG:  entering standby mode
> 2016-04-15 23:35:02 CEST:@:[28926] LOG:  database system was not
> properly shut down; automatic recovery in progress  <------------this
> means something missing from pg_basebackup
> 2016-04-15 23:35:02 CEST:@:[28926] LOG:  redo starts at 0/E5000028
> 2016-04-15 23:35:02 CEST:@:[28926] LOG:  consistent recovery state
> reached at 0/E6000000
> 2016-04-15 23:35:02 CEST:@:[28904] LOG:  database system is ready to
> accept read only connections
> 2016-04-15 23:35:02 CEST:@:[28989] LOG:  started streaming WAL from
> primary at 0/E6000000 on timeline 1
>
> postgres params on master node:
> log_line_prefix = '%t:%u@%d:[%p] '
> logging_collector = on
> wal_buffers = 16MB
> max_wal_size = 200MB
> log_temp_files = 1MB
> max_connections = 170
> shared_buffers = 512MB
> effective_cache_size = 1500MB
> work_mem = 48MB
> log_lock_waits = on
> log_min_duration_statement = 10000
> shared_preload_libraries = 'pg_stat_statements'
> include '/var/lib/pgsql/tmp/rep_mode.conf' # added by pgsql RA
> wal_level = hot_standby
> archive_mode = on
> max_wal_senders = 10
> hot_standby = on
> wal_keep_segments = 128
> archive_command = '/opt/postgres/dbconf/archive_command.sh %p %f'
> wal_receiver_status_interval = 2
> max_standby_streaming_delay = -1
> max_standby_archive_delay = -1
> restart_after_crash = off
> hot_standby_feedback = on
>


--
Adrian Klaver
adrian.klaver@aklaver.com
