Обсуждение: Recovery.conf PITR by recovery_target_time

Поиск
Список
Период
Сортировка

Recovery.conf PITR by recovery_target_time

От
"ascot.moss@gmail.com"
Дата:
Hi,

I have tried following test cases about PITR:

a) Time and events on Master :
    16:32:03 HKT     Begin; CREATE TABLE test22 (id INTEGER PRIMARY =
KEY); INSERT INTO test22 VALUES (generate_series(1,220000)); End; =
Commit;=09
    16:35:02 HKT    Begin; CREATE TABLE test23 (id INTEGER PRIMARY =
KEY); INSERT INTO test23 VALUES (generate_series(1,230000)); End; =
Commit;
    16:57:01 HKT    Begin; CREATE TABLE test24 (id INTEGER PRIMARY =
KEY); INSERT INTO test24 VALUES (generate_series(1,240000)); End; =
Commit;
    15:57:45 HKT       Begin; CREATE TABLE test25 (id INTEGER PRIMARY =
KEY); INSERT INTO test25 VALUES (generate_series(1,250000)); End; =
Commit;
    16:57:56 HKT    drop table test24;

     (    16:58:30 HKT       <=3D=3D recovery_target_time  )

    17:01:53 HKT    Begin; CREATE TABLE test26 (id INTEGER PRIMARY =
KEY); INSERT INTO test26 VALUES (generate_series(1,260000)); End; =
Commit;

b) Expected Result
    - Do PITR on another test machine,=20
    - Set recovery_target_time =3D '2013-08-12 16:58:30 HKT', i.e =
wish to get table test22, test23, test25 restored, but not test24 =
(dropped) / test26 (after recovery_target_time)
    - PITR should stop the recovery according to =
recovery_target_time (i.e. should not replay all WAL)

c) Recovery.conf (only 4 lines)
    restore_command =3D '/usr/local/pgsql/bin/pg_standby -d -s 2  -t =
 /tmp/pgsql.trigger.5442  /var/pgsql/data/archive/ %f %p %r >> =
/tmp/pitr.log'
    recovery_end_command =3D 'rm -f /tmp/pgsql.trigger.5442'
    recovery_target_time =3D '2013-08-12 16:58:30 HKT'
    recovery_target_inclusive =3D 'false'


d) Result (Not Good )=20

     PG ignored the recovery_target_time, actually it just replayed all =
WAL which is not good.

    - Log :=20
        LOG:  database system was interrupted; last known up at =
2013-08-12 17:07:28 HKT
        LOG:  starting point-in-time recovery to 2013-08-12 =
16:58:30+08                        <=3D=3D showing =
the point-in-time of the recovery=20
        Trigger file:         <not set>
        Waiting for WAL file: 00000001000000AD0000000B
        WAL file path:        =
/var/pgsql/data/archive//00000001000000AD0000000B
        Restoring to:         pg_xlog/RECOVERYXLOG
        Sleep interval:       2 seconds
        Max wait interval:    0 forever
        Command for restore:  cp =
"/var/pgsql/data/archive//00000001000000AD0000000B" =
"pg_xlog/RECOVERYXLOG"
        Keep archive history: 000000000000000000000000 and later
        running restore:      OK

    - check tables from psql:

        postgres=3D# \d
                List of relations
         Schema |  Name  | Type  |  Owner  =20
        --------+--------+-------+----------
         public | test22 | table | postgres
        public | test23 | table | postgres
        public | test25 | table | postgres
         public | test26 | table | postgres                =
                                        =
<=3D=3D=3D    this table was created at 17:01:53 and should not be =
restored
        (4 rows)
        postgres=3D# select count(1) from test26;
          count =20
        ---------
         2600000
        (1 row)


I am new to PITR, can you please advise if this is a bug and any work =
around?

regards

Re: Recovery.conf PITR by recovery_target_time

От
Michael Paquier
Дата:
It looks that you are missing something. Similarly to what you did, here is
an example of PITR using a base backup:
1) Here is my master node doing some archiving:
$ psql -c 'show archive_command' -p 5432
                    archive_command
--------------------------------------------------------
 cp -i %p /home/mpaquier/bin/pgsql/archive/node_5432/%f
(1 row)
2) Creating data folder of new node using a base backup:
pg_basebackup -D ~/bin/pgsql/slave -p 5432
echo "port = 5433" >> ~/bin/pgsql/slave/postgresql.conf
This node will run with port 5433.
3) Creating some data:
$ psql -c 'CREATE TABLE aa AS SELECT generate_series(1,1000000) AS a' -p
5432
SELECT 1000000
$ date
2013-08-12 19:47:33 GMT
$ psql -c 'CREATE TABLE bb AS SELECT generate_series(1,1000000) AS a' -p
5432
SELECT 1000000
Similarly to what you did, after doing the recovery table bb will not exist
on the node recovered with PITR.
4) Preparing recovery for slave:
echo "restore_command = 'cp -i
/home/mpaquier/bin/pgsql/archive/node_5432/%f %p'" >
~/bin/pgsql/slave/recovery.conf
echo "recovery_target_time = '2013-08-12 19:47:33 GMT'" >>
~/bin/pgsql/slave/recovery.conf
5) Time to perform the PITR:
$ tail -n3 ~/bin/pgsql/slave/pg_log/postgresql-2013-08-12_195441.log
LOG:  recovery stopping before commit of transaction 1305, time 2013-08-12
19:48:22.436774+00
LOG:  recovery has paused
HINT:  Execute pg_xlog_replay_resume() to continue.
Note that in this case the recovery has stopped such as you can check the
status of the node before resuming its activity (you can as well enforce
the resume if you wish)
6) Now let's check that the node is in a correct state:
$ psql -p 5433
psql (9.3beta2)
Type "help" for help.

mpaquier=# \d
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | aa   | table | mpaquier
(1 row)
And only table aa exists.

Here is more input after resume xlog replay.
mpaquier=# create table bb (a int); -- Node is still in read-only mode
ERROR:  25006: cannot execute CREATE TABLE in a read-only transaction
LOCATION:  PreventCommandIfReadOnly, utility.c:270
mpaquier=# select pg_xlog_replay_resume();
 pg_xlog_replay_resume
-----------------------

(1 row)
mpaquier=# create table bb (a int);
CREATE TABLE

Et voila!

On Mon, Aug 12, 2013 at 7:34 PM, ascot.moss@gmail.com <ascot.moss@gmail.com>
wrote:
>         - check tables from psql:
>                 postgres=# select count(1) from test26;
>                   count
>                 ---------
>                 2600000
>                 (1 row)
Perhaps you are connecting to the master node and not the node that has
been recovered when querying that?

Regards,
--
Michael

Re: Recovery.conf PITR by recovery_target_time

От
"ascot.moss@gmail.com"
Дата:
Thanks very much!  I guess you should use the same machine to carry out =
the PITR while I am using two physical machines. I will try it again.


On 13 Aug 2013, at 12:11 PM, Michael Paquier wrote:

> It looks that you are missing something. Similarly to what you did, =
here is an example of PITR using a base backup:
> 1) Here is my master node doing some archiving:
> $ psql -c 'show archive_command' -p 5432
>                     archive_command                   =20
> --------------------------------------------------------
>  cp -i %p /home/mpaquier/bin/pgsql/archive/node_5432/%f
> (1 row)
> 2) Creating data folder of new node using a base backup:
> pg_basebackup -D ~/bin/pgsql/slave -p 5432
> echo "port =3D 5433" >> ~/bin/pgsql/slave/postgresql.conf
> This node will run with port 5433.
> 3) Creating some data:
> $ psql -c 'CREATE TABLE aa AS SELECT generate_series(1,1000000) AS a' =
-p 5432
> SELECT 1000000
> $ date
> 2013-08-12 19:47:33 GMT
> $ psql -c 'CREATE TABLE bb AS SELECT generate_series(1,1000000) AS a' =
-p 5432
> SELECT 1000000
> Similarly to what you did, after doing the recovery table bb will not =
exist on the node recovered with PITR.
> 4) Preparing recovery for slave:
> echo "restore_command =3D 'cp -i =
/home/mpaquier/bin/pgsql/archive/node_5432/%f %p'" > =
~/bin/pgsql/slave/recovery.conf
> echo "recovery_target_time =3D '2013-08-12 19:47:33 GMT'" >> =
~/bin/pgsql/slave/recovery.conf
> 5) Time to perform the PITR:
> $ tail -n3 ~/bin/pgsql/slave/pg_log/postgresql-2013-08-12_195441.log
> LOG:  recovery stopping before commit of transaction 1305, time =
2013-08-12 19:48:22.436774+00
> LOG:  recovery has paused
> HINT:  Execute pg_xlog_replay_resume() to continue.
> Note that in this case the recovery has stopped such as you can check =
the status of the node before resuming its activity (you can as well =
enforce the resume if you wish)
> 6) Now let's check that the node is in a correct state:
> $ psql -p 5433
> psql (9.3beta2)
> Type "help" for help.
>=20
> mpaquier=3D# \d
>         List of relations
>  Schema | Name | Type  |  Owner  =20
> --------+------+-------+----------
>  public | aa   | table | mpaquier
> (1 row)
> And only table aa exists.
>=20
> Here is more input after resume xlog replay.
> mpaquier=3D# create table bb (a int); -- Node is still in read-only =
mode
> ERROR:  25006: cannot execute CREATE TABLE in a read-only transaction
> LOCATION:  PreventCommandIfReadOnly, utility.c:270
> mpaquier=3D# select pg_xlog_replay_resume();
>  pg_xlog_replay_resume=20
> -----------------------
> =20
> (1 row)
> mpaquier=3D# create table bb (a int);
> CREATE TABLE
>=20
> Et voila!
>=20
> On Mon, Aug 12, 2013 at 7:34 PM, ascot.moss@gmail.com =
<ascot.moss@gmail.com> wrote:
> >         - check tables from psql:
> >                 postgres=3D# select count(1) from test26;
> >                   count
> >                 ---------
> >                 2600000
> >                 (1 row)
> Perhaps you are connecting to the master node and not the node that =
has been recovered when querying that?
>=20
> Regards,
> --=20
> Michael