Re: Point in Time recovery on PostgreSQL (10.3.1)

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: Point in Time recovery on PostgreSQL (10.3.1)
Дата
Msg-id 20180810103838.GH3326@tamriel.snowman.net
обсуждение исходный текст
Ответ на Point in Time recovery on PostgreSQL (10.3.1)  (amit tripathi <amittripathi1445@gmail.com>)
Список pgsql-admin
Greetings,

* amit tripathi (amittripathi1445@gmail.com) wrote:
> We need some help with the point in recovery test . We have followed the
> below steps but after creating recovery.conf file and restarting the
> machine our DB is getting corrupted and not getting started.
> Please look at the below steps and help us analyze what is going wrong
> there. We have used offline postgresql installer to set up database.
> Required configuration like *wal_level=hot_stanby, archive_mode=on, and
> archive_command='cp %p /mnt/server/archive/%f*' in *postgresql.conf* are
> also set properly.

I'd strongly recommend that you consider using a real PostgreSQL backup
tool for your backups and not try to write your own.  As a simple point,
using 'cp' is not sufficient for a proper archive command.

> 1. After DB set up, We have created a tablcespace, db and mapped db with
> the tablespace. Also created some tables to generate transaction files in
> pg_wal and archive directory (which is *mnt/server/archive*)
>
> 2. SELECT pg_start_backup('TestPITR');
>
> 3. Taken backup(tar) of Postgresql data directory (opt/PostgreSQL/10/
> data)>>tar
> zcf backup20180810.tar data/
>
> 4. SELECT pg_stop_backup();

Did you read the documentation where we explicitly point out that the
non-exclusive backup method is recommended over the exclusive method?
There's good reasons for that:

https://www.postgresql.org/docs/current/static/continuous-archiving.html

> 5. Created some more tables and noted down their timestamp (for PITR, which
> will be added in recovery.conf file as *recovery_target_time* )
>
> 6. Stopped the DB>>service postgresql-10 stop
>
> 7. Created a new directory >>mkdir pgbackup
>
> 8. Moved tar file into the pgbackup directory >>mv backup20180810.tar
> /opt/PostgreSQL/10/pgbackup
>
> 9. Copied archive files into the pgbackup directory>>  cp -r /
> *mnt/server/archive*/ /opt/PostgreSQL/10/pgbackup/
>
> 10. Renamed the old data directory( opt/PostgreSQL/10/ data )to bad.data
> >>> mv data bad.data (now opt/PostgreSQL/10/ bad.data)
>
> 11. untar the  back taken in step 2>>tar -xvf backup20180810.tar
>
> 12. Moved the data directory to old path >>>mv
> /opt/PostgreSQL/10/pgbackup/data/ /opt/PostgreSQL/10/ (Now i have
> opt/PostgreSQL/10/ data-derived from tar file and with old transaction logs
> in pg_wal)

Didn't you mention a tablespace..?  Did you handle backing up the
tablespace as part of this process and then restoring it?  It doesn't
look like it from what you've said above, unless perhaps I missed it
(this is pretty difficult to follow...).

> 13  Copied the updated pg_wal logs from bad.data to data >>>cp -r
> /opt/PostgreSQL/10/bad.data/pg_wal/0* /opt/PostgreSQL/10/data/pg_wal

Why would you perform this step..?  The WAL should have been archived
through archive_command and should be restored using restore_command
specified through recovery.conf.

> 14. Started the database  >> service postgresql-10 start (Our db works fine)

What do you mean you started the database and it "works fine"?

> 15. Created a recovery.conf file inside data folder >>opt/PostgreSQL/10/ data
> --given the permission and user as postgres

Wait, you started the database before creating a recovery.conf file?
That's definitely not going to do what you want- PG will end up trying
to do crash recovery instead of restoring from a backup.

Is there a backup_label file?

> Our recovery.conf file has two following parameters. *recovery_target_time
> was taken from step-5*
>
> *restore_command = 'cp /opt/PostgreSQL/10/pgbackup/archive/%f
> %p'recovery_target_time = '2018-08-10 02:56:31'*
>
> 16. Restarted the db server>> service postgresql-10 restart
>
> It doesn't restart instead we are getting below error(yesterday's log) on
> linux host. Also there is no log entry in log
> folder(opt/PostgreSQL/10/data/log) looks like it has stopped creating logs
> after restart.
>
>
> * journalctl -xeAug 09 06:47:53 scspr0512196001 systemd[1]: Starting
> PostgreSQL 10 database s-- Subject: Unit postgresql-10.service has begun
> start-up-- Defined-By: systemd--
> Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>
<https://www.google.com/url?q=http://lists.freedesktop.org/mailman/listinfo/systemd-devel&sa=D&source=hangouts&ust=1533964684221000&usg=AFQjCNFJDTSITdO_SpA_w3Y8ZNn7zwrDqQ>----
> Unit postgresql-10.service has begun starting up.Aug 09 06:47:53
> scspr0512196001 pg_ctl[17697]: waiting for server to start...Aug 09
> 06:47:53 scspr0512196001 pg_ctl[17697]: pg_ctl: could not start serverAug
> 09 06:47:53 scspr0512196001 pg_ctl[17697]: Examine the log output.Aug 09
> 06:47:53 scspr0512196001 systemd[1]: postgresql-10.service: control prAug
> 09 06:47:53 scspr0512196001 systemd[1]: Failed to start PostgreSQL 10 dat--
> Subject: Unit postgresql-10.service has failed-- Defined-By: systemd--
> Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>
<https://www.google.com/url?q=http://lists.freedesktop.org/mailman/listinfo/systemd-devel&sa=D&source=hangouts&ust=1533964684221000&usg=AFQjCNFJDTSITdO_SpA_w3Y8ZNn7zwrDqQ>----
> Unit postgresql-10.service has failed.---- The result is failed.Aug 09
> 06:47:53 scspr0512196001 systemd[1]: Unit postgresql-10.service entereAug
> 09 06:47:53 scspr0512196001 systemd[1]: postgresql-10.service failed.Aug 09
> 06:47:53 scspr0512196001 polkitd[740]: Unregistered Authentication Age *

As mentioned in the above log message, you need to look at the
PostgreSQL logs to see what's actually happened.  These logs aren't
telling us anything useful.

However, what you're seeing here is that it's hard to write a backup
system using the low-level PostgreSQL base backup API, which is why
there are good tools written that handle all of that for you.  I'd
strongly recommend that you look at using one of those tools for your
backups instead of trying to roll your own.  My favorite is pgBackRest
(though I'm a bit biased by that), but there's a few different options
out there (barman, WAL-E are two others).

Thanks!

Stephen

Вложения

В списке pgsql-admin по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: pg_upgrade doesn't work?
Следующее
От: Achilleas Mantzios
Дата:
Сообщение: Re: Logical replication monitoring