Re: pg_basebackup: return value 1: reason?
От | Adrian Klaver |
---|---|
Тема | Re: pg_basebackup: return value 1: reason? |
Дата | |
Msg-id | 57117683.1010507@aklaver.com обсуждение исходный текст |
Ответ на | pg_basebackup: return value 1: reason? (Andrej Vanek <andrej.vanek.sk@gmail.com>) |
Ответы |
Re: pg_basebackup: return value 1: reason?
(Andrej Vanek <andrej.vanek.sk@gmail.com>)
|
Список | pgsql-general |
On 04/15/2016 03:28 PM, Andrej Vanek wrote: > Hello, > > I tried to run pg_basebackup. Return value is 1. > > How to find out its reason? > (I suspect that some wal after backup is missing- but how to find > out the real reason? How to fix it?) First it is not clear to me where you are taking the backup from, the master or the standby? Second there is a lot of redirection going on. What happens if you run the pg_basebackup directly (without doing su - postgres ...) and use hardcoded values instead of shell variables? > > thanks, Andrej > --------------details: > environment: CentOS 6.7, postgres 9.5.1 > ( PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 > 20120313 (Red Hat 4.4.7-16), 64-bit) > > I tried 2 forms of pg_basebackup (-X fetch and -X stream). Both were > issued from a script: > # su - postgres -c "/usr/pgsql-9.5/bin/pg_basebackup -h ${DB_MASTER_IP} > -D ${GEO_STDBY_DATA} -U pgreplic -P -v -X fetch" 2>${LOG_FILE}.stderr > >> ${LOG_FILE} > # echo $? > 1 <--------------pg_basebackup failed! > # cat log.stderr > # cat /var/log/cluster/geo_repair.log.err > transaction log start point: 0/E3000028 on timeline 1 > WARNING: skipping special file "./pg_hba.conf" > WARNING: skipping special file "./pg_hba.conf.save" > transaction log end point: 0/E30000F8 > pg_basebackup: base backup completed <------------------no > reason for pg_basebackup failure! > # cp /tmp/pg_hba.conf /tmp/postgresql.conf /pg_data/ > # su - postgres -c "/usr/pgsql-9.5/bin/pg_ctl -D /pg_data/ start" > # tail /pg_data/pg_log/postgresql-Fri.log > `pg_xlog/0000000100000000000000E2' -> > `../backups/arc/0000000100000000000000E2' > 2016-04-15 23:15:10 CEST:pgreplic@[unknown]:[10667] WARNING: skipping > special file "./pg_hba.conf" > 2016-04-15 23:15:10 CEST:pgreplic@[unknown]:[10667] WARNING: skipping > special file "./pg_hba.conf.save" <---------------recorded in > pg_log on master node and copied by pg_basebackup (note time difference > between two servers) > 2016-04-15 23:15:02 CEST:@:[23321] LOG: database system was > interrupted; last known up at 2016-04-15 23:15:10 CEST > 2016-04-15 23:15:02 CEST:postgres@postgres:[23329] FATAL: the database > system is starting up > 2016-04-15 23:15:03 CEST:@:[23321] LOG: entering standby mode > 2016-04-15 23:15:03 CEST:@:[23321] LOG: database system was not > properly shut down; automatic recovery in progress <---------something > missing from pg_basebackup > 2016-04-15 23:15:03 CEST:@:[23321] LOG: redo starts at 0/E3000028 > 2016-04-15 23:15:03 CEST:@:[23321] LOG: consistent recovery state > reached at 0/E4000000 > 2016-04-15 23:15:03 CEST:@:[23295] LOG: database system is ready to > accept read only connections > 2016-04-15 23:15:03 CEST:@:[23356] LOG: started streaming WAL from > primary at 0/E4000000 on timeline 1 > -------second trial > # su - postgres -c "/usr/pgsql-9.5/bin/pg_basebackup -h ${DB_MASTER_IP} > -D ${GEO_STDBY_DATA} -U pgreplic -P -v -X stream" > # echo $? > 1 > # cat /var/log/cluster/geo_repair.log.err > transaction log start point: 0/E5000028 on timeline 1 > pg_basebackup: starting background WAL receiver > WARNING: skipping special file "./pg_hba.conf" > WARNING: skipping special file "./pg_hba.conf.save" > transaction log end point: 0/E50000F8 > pg_basebackup: waiting for background process to finish streaming ... > pg_basebackup: could not wait for child process: No child processes > <----what does this mean? I think it failed to start process to > fetching wal logs created during backup: but neither on master node > neither on pg_basebackup output here is any information about reason.. > (max_wal_senders on master is 10: I see no reason to fail). > > postgres logs: > `pg_xlog/0000000100000000000000E4' -> > `../backups/arc/0000000100000000000000E4' > 2016-04-15 23:35:09 CEST:pgreplic@[unknown]:[29035] WARNING: skipping > special file "./pg_hba.conf" > 2016-04-15 23:35:09 CEST:pgreplic@[unknown]:[29035] WARNING: skipping > special file "./pg_hba.conf.save" > 2016-04-15 23:35:01 CEST:@:[28926] LOG: database system was > interrupted; last known up at 2016-04-15 23:35:09 CEST > 2016-04-15 23:35:01 CEST:postgres@postgres:[28938] FATAL: the database > system is starting up > 2016-04-15 23:35:02 CEST:@:[28926] LOG: entering standby mode > 2016-04-15 23:35:02 CEST:@:[28926] LOG: database system was not > properly shut down; automatic recovery in progress <------------this > means something missing from pg_basebackup > 2016-04-15 23:35:02 CEST:@:[28926] LOG: redo starts at 0/E5000028 > 2016-04-15 23:35:02 CEST:@:[28926] LOG: consistent recovery state > reached at 0/E6000000 > 2016-04-15 23:35:02 CEST:@:[28904] LOG: database system is ready to > accept read only connections > 2016-04-15 23:35:02 CEST:@:[28989] LOG: started streaming WAL from > primary at 0/E6000000 on timeline 1 > > postgres params on master node: > log_line_prefix = '%t:%u@%d:[%p] ' > logging_collector = on > wal_buffers = 16MB > max_wal_size = 200MB > log_temp_files = 1MB > max_connections = 170 > shared_buffers = 512MB > effective_cache_size = 1500MB > work_mem = 48MB > log_lock_waits = on > log_min_duration_statement = 10000 > shared_preload_libraries = 'pg_stat_statements' > include '/var/lib/pgsql/tmp/rep_mode.conf' # added by pgsql RA > wal_level = hot_standby > archive_mode = on > max_wal_senders = 10 > hot_standby = on > wal_keep_segments = 128 > archive_command = '/opt/postgres/dbconf/archive_command.sh %p %f' > wal_receiver_status_interval = 2 > max_standby_streaming_delay = -1 > max_standby_archive_delay = -1 > restart_after_crash = off > hot_standby_feedback = on > -- Adrian Klaver adrian.klaver@aklaver.com
В списке pgsql-general по дате отправления: