Thread: Replication recovery?

Replication recovery?

From
John Mudd
Date:
Sorry if this is a dumb question. Feel free to just point me to a doc.

I've read a little about Postgres replication and the concept of a
master and one or more slaves. If one db is down then you just switch
to one that's still running. There's even additional software like
pgpool to make the switch easy. But I want to know more about how to
resume normal operating mode.

For example, I take it that if the master is unavailable then you
switch to a slave. The former slave becomes the current master. When
the original "master" is ready to run and network accessible then do
you bring it online in slave mode and it syncs automatically with the
current master? At which time you're almost back to normal. Once they
are back in sync do people typically switch the roles back to the
original designation of who's a slave and who's a master? It's not
clear to me if the last step is necessary.

Well, that's assuming that the master comes back online with all the
data it had when it went offline. If it comes back but all data was
lost (a worst case scenario) then I assume I have to take the current
master offline and use it to repopulate the recovering master from
scratch, correct? But... if I have additional slaves then I could just
take one of the current slaves offline, use it to rebuild the original
master, and then bring both the slave and the reconstructed master
(now also a slave) back online and both will sync with the current
master.

John

Re: Replication recovery?

From
"Albe Laurenz"
Date:
John Mudd wrote:
> Sorry if this is a dumb question. Feel free to just point me to a doc.

Sure, here:
http://www.postgresql.org/docs/current/static/warm-standby-failover.html

> I've read a little about Postgres replication and the concept of a
> master and one or more slaves. If one db is down then you just switch
> to one that's still running. There's even additional software like
> pgpool to make the switch easy. But I want to know more about how to
> resume normal operating mode.
>
> For example, I take it that if the master is unavailable then you
> switch to a slave. The former slave becomes the current master. When
> the original "master" is ready to run and network accessible then do
> you bring it online in slave mode and it syncs automatically with the
> current master? At which time you're almost back to normal.

No, to quote:
"To return to normal operation, a standby server must be recreated,
 either on the former primary system when it comes up, or on a third,
 possibly new, system."

That means that you have to take a new base backup from the new primary
to set up the new standby.  Using something like "rsync" for the backup
can speed up the process if the difference is not yet too great.
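
A minimal sketch of that rebuild, assuming a PostgreSQL release of this
thread's era (9.x, where pg_start_backup(), pg_stop_backup() and
recovery.conf exist; later releases renamed or replaced them). The host
name, data directory, backup label and replication user are hypothetical
placeholders, and the script only assembles the commands and the
recovery.conf text rather than running anything. The standby also needs
the WAL generated during the copy, from the archive or via streaming,
which is not shown here.

```python
import shlex


def rebuild_standby_steps(primary_host, data_dir, label="rebuild_standby"):
    """Return, in order, the shell commands for a low-level base backup.

    Host name, data directory and label are placeholders from the caller.
    """
    psql = ["psql", "-h", primary_host, "-U", "postgres", "-c"]
    rsync = [
        "rsync", "-a", "--delete",
        "--exclude=postmaster.pid",  # never copy the primary's pid file
        f"{primary_host}:{data_dir}/", f"{data_dir}/",
    ]
    return [
        # 1. Tell the new primary that a base backup is starting.
        shlex.join(psql + [f"SELECT pg_start_backup('{label}', true)"]),
        # 2. Copy the data directory; rsync transfers only what differs.
        shlex.join(rsync),
        # 3. Tell the new primary that the backup is finished.
        shlex.join(psql + ["SELECT pg_stop_backup()"]),
    ]


def recovery_conf(primary_host, port=5432, user="replicator"):
    """Build recovery.conf text for a streaming standby (pre-12 releases)."""
    conninfo = f"host={primary_host} port={port} user={user}"
    return (
        "standby_mode = 'on'\n"
        f"primary_conninfo = '{conninfo}'\n"
        "recovery_target_timeline = 'latest'\n"
    )


for step in rebuild_standby_steps("new-primary.example", "/var/lib/postgresql/data"):
    print(step)
print(recovery_conf("new-primary.example"))
```

The commands are meant to be run on the machine that will become the
standby, with its PostgreSQL server stopped; the generated recovery.conf
goes into the copied data directory before the server is started.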

That is necessary because after failover the database has changed
(it is running on a new timeline).
Also, how would you handle the case that a transaction has already been
committed on the old master, but not yet replicated?
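
To make the problem concrete, here is a toy model, not PostgreSQL
internals: each server's history is just a list of hypothetical
transaction names, and the function splits two histories into the shared
prefix and the parts that exist on only one side.

```python
def divergence(old_primary, new_primary):
    """Split two commit histories into (shared prefix, only-old, only-new)."""
    n = 0
    for a, b in zip(old_primary, new_primary):
        if a != b:
            break
        n += 1
    return old_primary[:n], old_primary[n:], new_primary[n:]


# tx3 was committed on the old master but not replicated before the crash.
old_primary = ["tx1", "tx2", "tx3"]
# The standby was promoted after tx2 and has since accepted tx4.
new_primary = ["tx1", "tx2", "tx4"]

shared, only_old, only_new = divergence(old_primary, new_primary)
print("shared:", shared)
print("only on old master:", only_old)
print("only on new master:", only_new)
```

Once both sides hold something the other lacks, the old master cannot
simply replay the new master's changes on top of its own data, which is
why it is rebuilt from a fresh base backup instead.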

>                                                             Once they
> are back in sync do people typically switch the roles back to the
> original designation of who's a slave and who's a master? It's not
> clear to me if the last step is necessary.

It's not necessary to switch back again, as long as both machines are
equal.  While it is a good idea to use comparable machines anyway if
you want the standby to be able to take over, it may for example be that
the standby is at a remote site and you want your primary server to
run locally - then you would want to switch back.

> Well, that's assuming that the master comes back online with all the
> data it had when it went offline. If it comes back but all data was
> lost (a worst case scenario) then I assume I have to take the current
> master offline and use it to repopulate the recovering master from
> scratch, correct? But... if I have additional slaves then I could just
> take one of the current slaves offline, use it to rebuild the original
> master, and then bring both the slave and the reconstructed master
> (now also a slave) back online and both will sync with the current
> master.

It should work to create a slave by copying another slave, if that
is what you have in mind.

Yours,
Laurenz Albe

Re: Replication recovery?

From
Sergey Konoplev
Date:
On Thu, May 17, 2012 at 11:10 PM, John Mudd <johnbmudd@gmail.com> wrote:
> For example, I take it that if the master is unavailable then you
> switch to a slave. The former slave becomes the current master. When
> the original "master" is ready to run and network accessible then do
> you bring it online in slave mode and it syncs automatically with the
> current master?

No, you need to rebuild the replica on the original master's machine from scratch.

> At which time you're almost back to normal. Once they
> are back in sync do people typically switch the roles back to the
> original designation of who's a slave and who's a master? It's not
> clear to me if the last step is necessary.

It is up to you. I usually switch back only when the original master's
hardware is more powerful.

> Well, that's assuming that the master comes back online with all the
> data it had when it went offline. If it comes back but all data was
> lost (a worst case scenario) then I assume I have to take the current
> master offline and use it to repopulate the recovering master from
> scratch, correct? But... if I have additional slaves then I could just
> take one of the current slaves offline, use it to rebuild the original
> master, and then bring both the slave and the reconstructed master
> (now also a slave) back online and both will sync with the current
> master.
>
> John



--
Sergey Konoplev

a database and software architect
http://www.linkedin.com/in/grayhemp

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +79160686204