BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby

From: amit.kapila@huawei.com
Subject: BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby
Message-ID: E1TBlPY-0003Do-75@wrigleys.postgresql.org
Responses: Re: BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby  (Fujii Masao <masao.fujii@gmail.com>)
List: pgsql-bugs

The following bug has been logged on the website:

Bug reference:      7533
Logged by:          Amit Kapila
Email address:      amit.kapila@huawei.com
PostgreSQL version: 9.2.0
Operating system:   Suse
Description:

Host M is the primary, host S is the standby, and host CS is the cascaded standby.


1. Set up postgresql-9.2beta2/RC1 on all hosts.

2. Run initdb on host M to create a fresh database.

3. Modify the configuration file postgresql.conf on host M as follows:

    listen_addresses = 'M'
    port = 15210
    wal_level = hot_standby
    max_wal_senders = 4
    hot_standby = on

4. Modify the configuration file pg_hba.conf on host M as follows:

    host     replication     repl             M/24            md5

5. Start the server on host M as the primary.

6. Connect a client to the primary server and create a user 'repl':

    CREATE USER repl SUPERUSER PASSWORD '123';

7. Run pg_basebackup on host S to retrieve the database from the primary host:

    pg_basebackup -D /opt/t38917/data -F p -x fetch -c fast -l repl_backup -P -v -h M -p 15210 -U repl -W

8. Copy recovery.conf.sample from the share folder of the package to the database folder on host S, then rename it to recovery.conf.

9. Modify the file recovery.conf on host S as follows:

    standby_mode = on
    primary_conninfo = 'host=M port=15210 user=repl password=123'

10. Modify the file postgresql.conf on host S as follows:

    listen_addresses = 'S'

11. Start the server on host S as a standby server.

12. Run pg_basebackup on host CS to retrieve the database from the standby host:

    pg_basebackup -D /opt/t38917/data -F p -x fetch -c fast -l repl_backup -P -v -h S -p 15210 -U repl -W

13. Modify the file recovery.conf on host CS as follows:

    standby_mode = on
    primary_conninfo = 'host=S port=15210 user=repl password=123'

14. Modify the file postgresql.conf on host CS as follows:

    listen_addresses = 'CS'

15. Start the server on host CS as the cascaded standby server.

16. Try to connect a client to host CS. The connection fails with this error:

    FATAL:  the database system is starting up



Observations related to bug
------------------------------
In the scenario above, the startup process reads all WAL data up to position
5016220 (minRecoveryPoint is also 5016220 in our failing case) and then
checks recovery consistency using the following condition in the function
CheckRecoveryConsistency:

        if (!reachedConsistency &&
            XLByteLE(minRecoveryPoint, EndRecPtr) &&
            XLogRecPtrIsInvalid(ControlFile->backupStartPoint))


At this point the first two conditions are true, but the last one is not:
the redo has not yet been applied, so backupStartPoint has not been reset.
The startup process therefore does not signal the postmaster that a
consistent state has been reached. It then applies the redo, resets
backupStartPoint, and goes on to read the next record. Since all records have
already been read, it starts waiting for a new record from the standby node.
No new record arrives from the standby node, so it keeps waiting and never
gets a chance to recheck recovery consistency. As a result, client
connections are never allowed.
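
The ordering problem described above can be illustrated with a small toy model. This is only a sketch and not PostgreSQL code: the names WalRecord, ends_backup and replay, the earlier record position 5016100, and the starting backupStartPoint value 5016000 are all assumptions made for the illustration. The recheck_after_apply switch is a hypothetical variation that shows what changes if the check runs again after redo; it is not a statement about how the server should be fixed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WalRecord:
    end_ptr: int               # WAL position just past this record
    ends_backup: bool = False  # hypothetical flag: redo of this record resets the start point


def replay(records: List[WalRecord], min_recovery_point: int,
           backup_start_point: Optional[int],
           recheck_after_apply: bool) -> bool:
    """Toy replay loop. Returns True if consistency was signalled before
    the loop ran out of records, i.e. before it would block waiting."""
    reached_consistency = False

    def check(end_rec_ptr: int) -> None:
        nonlocal reached_consistency
        # Same three-part shape as the condition quoted above.
        if (not reached_consistency
                and min_recovery_point <= end_rec_ptr
                and backup_start_point is None):
            reached_consistency = True

    for rec in records:
        check(rec.end_ptr)             # the check runs before redo is applied
        if rec.ends_backup:
            backup_start_point = None  # redo resets the backup start point
        if recheck_after_apply:
            check(rec.end_ptr)         # assumed variation, for comparison only
    # No more records: the real process would now wait for WAL from upstream.
    return reached_consistency


# Assumed values: the last available record ends exactly at minRecoveryPoint.
records = [WalRecord(5016100), WalRecord(5016220, ends_backup=True)]

print(replay(records, 5016220, 5016000, recheck_after_apply=False))  # False
print(replay(records, 5016220, 5016000, recheck_after_apply=True))   # True
# One more record from upstream would also trigger the check again.
print(replay(records + [WalRecord(5016300)], 5016220, 5016000,
             recheck_after_apply=False))                             # True
```

In this model the first call corresponds to the reported behaviour: the check runs once at position 5016220 while the start point is still set, and no later record arrives to trigger it again.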
