Обсуждение: hot standby PSQL 9.1 Windows 2008 Servers

Поиск
Список
Период
Сортировка

hot standby PSQL 9.1 Windows 2008 Servers

От
chinnaobi
Дата:
Hi All,

I have implemented hot standby for PostgreSQL with a group of two Primary
and Standby in windows 2008 servers.

Currently below are the settings:

1. Archiving is enabled on primary, stored on network storage.
2. Asynchronous Streaming replication from primary to standby.
wal-senders=5, wal_keep_segments=32.
3. I am checking heartbeat of primary from standby, when it is down i am
restarting service by changing postgresql.conf and recovery.conf settings.

I do base backup only first time on standby when it is going to be
replicated. when ever primary goes down,  standby becomes primary and
primary becomes standby when primary comes up. When primary becomes standby
I am restoring data from WAL archive and start postgres service streaming
replication to connect to primary. 

This setup is working.

I have tested for few days in my network with huge data input to the primary
and restarting the servers multiple times. I have observed these below
specific errors in standby server and standby server is not replicating in
anymore.

*1. Invalid primary and secondary checkpoints. Very rare But happend
2. contrecord is requested by 0/DF000020 -- Major one
3.  record with zero length at 0/DF000078, invalid record length at
0/DF000078*

These errors are shown up when primary switching to standby by recovering
data from archive and standby stops replicating. *I am not able to start the
service for that I should kill all the process of postgres and then do base
backup and then start service on stnadby.*

I don't have any prior experience working with postgreSQL. Please tell me 

what are the most reliable settings for Hot standby with streaming
replication to work ? two 2008 servers and 1 network storage available.
How to avoid above errors ??



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/hot-standby-PSQL-9-1-Windows-2008-Servers-tp5708637.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


Re: hot standby PSQL 9.1 Windows 2008 Servers

От
Robert Haas
Дата:
On Mon, May 14, 2012 at 8:18 AM, chinnaobi <chinnaobi@gmail.com> wrote:
> I do base backup only first time on standby when it is going to be
> replicated. when ever primary goes down,  standby becomes primary and
> primary becomes standby when primary comes up. When primary becomes standby
> I am restoring data from WAL archive and start postgres service streaming
> replication to connect to primary.
>
> This setup is working.

I don't think this is safe.  The primary might have WAL that never
made it to the standby, in which case the two machines will be out of
sync with each other and all sorts of bad stuff could happen.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: hot standby PSQL 9.1 Windows 2008 Servers

От
chinnaobi
Дата:
Dear Robert,

Thank you very much for the reply. 

You mean when the primary which is going to switch its role to standby might
not have sent all the WAL records to the standby and If it is switched to
standby it has more WAL records than the standby which is now serves as
primary. Is it ??

It is actually the standby server which has to be restored from archive when
it is switching to primary right .. Not the primary which is switching to
standby ??

Regards,
Reddy.

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/hot-standby-PSQL-9-1-Windows-2008-Servers-tp5708637p5709495.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


Re: hot standby PSQL 9.1 Windows 2008 Servers

От
"Kevin Grittner"
Дата:
chinnaobi  wrote:
> You mean when the primary which is going to switch its role to
> standby might not have sent all the WAL records to the standby and
> If it is switched to standby it has more WAL records than the
> standby which is now serves as primary. Is it ??
What happens when there is a network fault between the primary and
the standby, but not between the primary and some of the clients
updating it?  Similarly, if this is asynchronous replication, what if
there have been commits on the primary which were still in the
network buffer when the primary crashed?
Clean automated failover is not a trivial task.  If you are writing
your own, it would be best to follow the steps recommended in the
documentation rather closely.
-Kevin


Re: hot standby PSQL 9.1 Windows 2008 Servers

От
chinnaobi
Дата:
Dear Kevin,

Thank you for your reply. Yeah I am writing an application using powershell,
it's true it is not trivial and especially a guy like me who has no idea on
database. 

You raised all the cases which I am muddling with, But currently I am
testing this setup:

Always standby server is configured from base backup and restore from
storage server then start streaming replication(asynchronous). Base backups
are taken frequently. 

I am sure there is some data loss during switching. Still researching how to
do it clean. suggest me if you have any good papers on this ..

Reddy.



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Re-hot-standby-PSQL-9-1-Windows-2008-Servers-tp5710824p5710830.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


Re: hot standby PSQL 9.1 Windows 2008 Servers

От
chinnaobi
Дата:
Sorry to mention, In my setup the primary and standby servers receive same
traffic, so no issue with the 
"network fault between the primary and the standby, but not between the
primary and some of the clients updating it"

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Re-hot-standby-PSQL-9-1-Windows-2008-Servers-tp5710824p5710832.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


Re: hot standby PSQL 9.1 Windows 2008 Servers

От
Robert Haas
Дата:
On Tue, May 22, 2012 at 12:15 PM, chinnaobi <chinnaobi@gmail.com> wrote:
> You mean when the primary which is going to switch its role to standby might
> not have sent all the WAL records to the standby and If it is switched to
> standby it has more WAL records than the standby which is now serves as
> primary. Is it ??

Yes, that is possible.  Or the standby might have received all the WAL
records but not be caught up in terms of replaying them.

> It is actually the standby server which has to be restored from archive when
> it is switching to primary right .. Not the primary which is switching to
> standby ??

If you want to promote a standby, you can just do it (pg_ctl promote).If you have a master that you want to demote to a
standby,you've got
 
to resync it to whatever the current master is.  I understand repmgr
has some tooling to help automate that, although I have not played
with it myself.  In any event rsync can be a big help in reducing the
resync time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company