Re: PITR

Поиск
Список
Период
Сортировка
От Murthy Nunna
Тема Re: PITR
Дата
Msg-id BD7563973EF8894D905C80593EE9A4ED3312DD@MAIL01.fnal.gov
обсуждение исходный текст
Ответ на Re: PITR  (desmodemone <desmodemone@gmail.com>)
Ответы Re: PITR  (bricklen <bricklen@gmail.com>)
Re: PITR  (Raghavendra <raghavendra.rao@enterprisedb.com>)
Список pgsql-admin

Raghavendra,

 

Thanks for testing and confirming the behavior of “pause” setting.

 

While I understand your explanation, I feel I am still missing something. IMHO, when I say pause using “pause” setting, no matter what, I expect the recovery to wait for manual intervention. I myself can come up with number of reasons for doing so… e.g I may be purposely “hiding” some WALs somewhere else, or maybe I have several thousands of WALs that I want to parallelize the process of applying some logs while I recall some from tapes.

 

Let me know what you think.

 

Mat, This is a test database so I purposely lowered checkpoint/archive timeouts. Thank-you, I’ll follow your advice for production systems to conserve space.

 

Thanks,

Murthy

 

 

 

From: desmodemone [mailto:desmodemone@gmail.com]
Sent: Sunday, February 23, 2014 4:13 AM
To: Raghavendra
Cc: Murthy Nunna; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] PITR

 

 

 

2014-02-23 7:08 GMT+01:00 Raghavendra <raghavendra.rao@enterprisedb.com>:

On Sun, Feb 23, 2014 at 9:57 AM, Murthy Nunna <mnunna@fnal.gov> wrote:

Hi Mat,

 

Thank you for the pointers on pause_at_recovery_target and recovery_target_time. It worked but I encountered an unexpected situation.

 

I wanted to test three recovery times by checking data at each point and then proceed to the next. It worked as expected first 2 recovery times but the last one did not give me an opportunity to check data. It simply completed recovery and switched timeline. This means I cannot rollforward anymore unless I restore the database again. Do you think I did something wrong?

 

You did right, for the first two time targets you have some wal files pending hence pause is allowed. Whereas for the last time target there are no archive files to pause so it has just opened the database as completion of PITR.

 

I am able to reproduce this on my local as well.

 

This is from my logs:

 

2014-02-06 18:54:54 PST-11634---[] LOG:  restored log file "0000000100000001000000B8" from archive

2014-02-06 18:54:55 PST-11634---[] LOG:  restored log file "0000000100000001000000B9" from archive

2014-02-06 18:54:55 PST-11634---[] LOG:  consistent recovery state reached at 1/B90029C8

2014-02-06 18:54:55 PST-11632---[] LOG:  database system is ready to accept read only connections

cp: cannot stat `/opt/PostgreSQL/9.3/archives93/0000000100000001000000BA': No such file or directory

 

B8,B9 applied and looking for next file BA which is not there and no point of pausing without any files, hence it has opened the database. But my guess is, what it has done is right, there no more files to pause and allow you to query, its like completion of PITR.

 

Same in your case, there may not be any files in your archives directory. You can check for '00000004000000080000000D' file in your archive directory.

 

,2014-02-22 22:04:16 CSTLOG:  restored log file "00000004000000080000000C" from archive

cp: cannot stat `/pgdata/backups/xlogs/minerva_ecl_test/00000004000000080000000D': No such file or directory

 

---

Regards,

Raghavendra

EnterpriseDB Corporation



Raghavendra is right,  the recovery applied successful your archived wal segment to the database and ended the loop in the xlog.c when it not found anymore records in the wal segments. Moreover the backend is  telling you that the recovery phase ended at "
2014-02-22 19:48:29.205898-06" while your target timeline was "2014-02-22 19:50:00" and it's interesting.

Did you create transactions and you committed them at that time or after ?

Normally, when you have to do a restore,  probably you had a crash and your filesystem is not ok, so  you use a new filesystem on another storage. So your last transactions are "lost", because those transactions was not still archived, infact those transactions were in the last wal segment of the crashed filesystem. If that file is not OK, you could not use it for complete the recovery.

If you have a RPO policy , you have to look at parameter archive_timeout , so you are sure your wal segments will be archived, and if you do that and your database does not create so much transactions , think about to compress them or use data deduplication at storage level or filesystem level , or you will have a lot of wasted space ( the wal segment even if is not full will write 16Mb of <data>+<00000..000>)

 

Have a nice day


Mat Dba

В списке pgsql-admin по дате отправления:

Предыдущее
От: desmodemone
Дата:
Сообщение: Re: PITR
Следующее
От: bricklen
Дата:
Сообщение: Re: PITR