Thread: Trigger File Behaviour

Trigger File Behaviour

From: Selva manickaraja
Date:
Hi,

We tried setting the trigger file for fail-over purposes, but we just can't understand how it works. Each time the secondary is started, the trigger file is removed. How can we introduce automatic fail-over if this happens?

Thank you.

Regards,

Selvam

Fwd: Trigger File Behaviour

From: Selva manickaraja
Date:
Any assistance available on this topic?

---------- Forwarded message ----------
From: Selva manickaraja <mavles78@gmail.com>
Date: Thu, Feb 17, 2011 at 10:10 AM
Subject: Trigger File Behaviour
To: pgsql-admin@postgresql.org


Hi,

We tried setting the trigger file for fail-over purposes, but we just can't understand how it works. Each time the secondary is started, the trigger file is removed. How can we introduce automatic fail-over if this happens?

Thank you.

Regards,

Selvam

Re: Trigger File Behaviour

From: Selva manickaraja
Date:
Does this mean that fail-over will not happen automatically the moment the primary DB or server goes down? Do we have to intervene manually to create the trigger_file?

On Fri, Feb 18, 2011 at 1:17 AM, Dean Gibson (DB Administrator) <postgresql@ultimeth.com> wrote:
> On 2011-02-16 18:10, Selva manickaraja wrote:
>> We tried setting the trigger file for fail-over purposes, but we just can't understand how it works. Each time the secondary is started, the trigger file is removed. How can we introduce automatic fail-over if this happens?
>>
>> Thank you.  Regards, Selvam
>
> That's exactly what is supposed to happen.  You will also find that the
> recovery.conf file gets renamed (to recovery.done) once you create the
> trigger file, which PostgreSQL then promptly deletes.
>
> Don't create the trigger file until you want the hot-standby server to
> switch roles and become the primary server (and thus accept DB change
> requests).
>
> --
> Mail to my list address MUST be sent via the mailing list.
> All other mail to my list address will bounce.



Re: Trigger File Behaviour

From: Dean Gibson (DB Administrator)
Date:
On 2011-02-17 17:43, Selva manickaraja wrote:
> Does this mean that fail-over will not happen automatically the moment the primary DB or server goes down? Do we have to intervene manually to create the trigger_file?

Yes, and yes.

How you detect that the primary has gone down depends on the installation (e.g., for how long must the network connection be lost?).  It is therefore left to the sysadmin (that's you and me) to write external procedures that detect when it is time to fail over, and then either write an automated script that creates the trigger file or create it manually.  I do it manually, because I have multiple slaves and the procedures are slightly more complex than for just one slave.
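
To illustrate the kind of external procedure described above, here is a minimal shell sketch. Nothing in it ships with PostgreSQL: the function names, the retry count, the interval, and the health-check command are all assumptions chosen for illustration. The only part taken from this thread is that fail-over is started by creating the trigger file on the standby.

```shell
#!/bin/sh
# Hypothetical fail-over watchdog helpers, meant to run on the standby.

# primary_is_down CHECK_CMD MAX_FAILURES INTERVAL
# Returns 0 only if CHECK_CMD fails MAX_FAILURES times in a row,
# waiting INTERVAL seconds between attempts.
primary_is_down() {
    check_cmd=$1
    max_failures=$2
    interval=$3
    failures=0
    while [ "$failures" -lt "$max_failures" ]; do
        if sh -c "$check_cmd" >/dev/null 2>&1; then
            return 1    # the primary answered, so do not fail over
        fi
        failures=$((failures + 1))
        if [ "$failures" -lt "$max_failures" ]; then
            sleep "$interval"
        fi
    done
    return 0
}

# fail_over_if_down CHECK_CMD MAX_FAILURES INTERVAL TRIGGER_FILE
# TRIGGER_FILE must be the path named by trigger_file in recovery.conf.
fail_over_if_down() {
    if primary_is_down "$1" "$2" "$3"; then
        touch "$4"
        echo "primary unreachable, created trigger file $4"
    else
        echo "primary reachable, nothing to do"
    fi
}
```

One possible invocation, with a made-up host name and trigger path, would be `fail_over_if_down "psql -h primary.example -c 'SELECT 1' template1 postgres" 3 10 /path/to/trigger`. Requiring several consecutive failures is one way to avoid promoting the standby after a brief network glitch; how many failures and how long to wait is exactly the installation-specific decision mentioned above.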

Setting up the configuration files is pretty trivial:

1. On each slave, create a recovery.conf that points to the primary.
2. Optionally, on the primary, create a recovery.done (note the different extension) that points to whichever slave you plan on later switching to a primary in case of a fail-over.
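
For reference, a recovery.conf of the kind described in step 1 could be generated as in the sketch below. The settings `standby_mode`, `primary_conninfo` and `trigger_file` are the standard streaming-replication settings for recovery.conf; the helper name, the host name, the `replicator` user and the trigger path are placeholders, not values from this thread.

```shell
#!/bin/sh
# write_recovery_conf TARGET_FILE PRIMARY_HOST TRIGGER_FILE
# Writes a minimal standby configuration to TARGET_FILE.
# Hypothetical helper: the connection user and port are assumptions.
write_recovery_conf() {
    cat > "$1" <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$2 port=5432 user=replicator'
trigger_file = '$3'
EOF
}
```

On a slave this might be called as `write_recovery_conf "$PGDATA/recovery.conf" primary.example /tmp/pg_trigger`. For step 2, the same helper could write `$PGDATA/recovery.done` on the primary, pointing at the slave you intend to promote.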

In the case of the failure of the primary:

1. Make sure the primary will not come back up (yet).
2. Create the trigger file (which triggers the switch) on whatever slave you wish to become the new primary.
3. If you have only one slave, you are done.  If you have more than one slave, you will need to (on each other slave):
3.a. Stop the slave.
3.b. Edit the recovery.conf file to point to the new primary.
3.c. Restart the slave.
3.d. If the slave does not properly sync to the new primary, you may have to (ugh) run the script below to resynchronize the slave to the new primary.
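
Step 3.b can be scripted. The sketch below is one possible way to do it, assuming recovery.conf contains a `primary_conninfo` line with a `host=` entry; the function name and host names are hypothetical. Stopping and restarting the slave (steps 3.a and 3.c) is left to whatever service command your installation uses.

```shell
#!/bin/sh
# repoint_standby RECOVERY_CONF NEW_PRIMARY_HOST
# Rewrites only the host= part of the primary_conninfo line and
# leaves every other line of the file untouched.
repoint_standby() {
    conf=$1
    new_host=$2
    sed "/^primary_conninfo/s/host=[^ ']*/host=$new_host/" "$conf" > "$conf.tmp" &&
        mv "$conf.tmp" "$conf"
}
```

Between stopping and restarting each remaining slave, you would then run something like `repoint_standby "$PGDATA/recovery.conf" new.primary.hostname`.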

When you are ready to have the old primary become a new slave, resync the data on the old primary from the new primary (I use the following script on the old primary):

        service         postgresql      stop
        {
            cat         <<-EOF
                                SELECT  pg_start_backup( '$(date --iso-8601)', true );
                                \!      rsync   -vaz --delete  new.primary.hostname:$PGDATA/  $PGDATA/
                                SELECT  pg_stop_backup();
                        EOF
        } | psql -e       template1 postgres
        mv              $PGDATA/recovery.{done,conf}
        service         postgresql      start

Presto!  Your old primary is now back up as a slave.



Re: Trigger File Behaviour

From: Selva manickaraja
Date:
Thank you so much!! It has been very, very helpful. I just wonder whether PostgreSQL plans to include this as part of its HA features in later versions.


Re: Trigger File Behaviour

From: Selva manickaraja
Date:
I encountered a small issue when I ran the script on the old primary. The "SELECT  pg_start_backup...." part of the script fails because the postgres service has already been stopped. Is this error meant to happen?


Re: Trigger File Behaviour

From: Guillaume Lelarge
Date:
On 21/02/2011 02:04, Selva manickaraja wrote:
> I encountered a small issue when I run the script in the old-primary. The
> "SELECT  pg_start_backup...." part of the script will fail since the
> postgres service has been stopped. Is this error meant to happen?
>

The "SELECT pg_start_backup... rsync... SELECT pg_stop_backup" must
happen on the new primary. The psql command probably needs a "-h
new.primary.hostname" switch.
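
A sketch of how the script might look with that change follows. It is untested against a real cluster and makes several assumptions: the helper name is made up, `psql` on the old primary can reach the new primary as the `postgres` user, and `$PGDATA` is the same path on both machines. The helper only prints the psql input, so it can be inspected before anything is run; the actual invocation is shown in the comments.

```shell
#!/bin/sh
# emit_resync_script NEW_PRIMARY_HOST DATA_DIR
# Prints the psql input for the resync; nothing is executed here.
# pg_start_backup/pg_stop_backup will run on whichever server psql
# connects to, while the \! rsync line runs locally on the old primary.
emit_resync_script() {
    cat <<EOF
SELECT pg_start_backup('$(date +%Y-%m-%d)', true);
\! rsync -vaz --delete $1:$2/ $2/
SELECT pg_stop_backup();
EOF
}

# Intended use on the old primary (not executed by this file):
#   service postgresql stop
#   emit_resync_script new.primary.hostname "$PGDATA" |
#       psql -e -h new.primary.hostname template1 postgres
#   mv "$PGDATA/recovery.done" "$PGDATA/recovery.conf"
#   service postgresql start
```

Because rsync copies the new primary's data directory wholesale, it is worth checking that the resulting recovery.conf really points at the new primary before starting the service.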


--
Guillaume
 http://www.postgresql.fr
 http://dalibo.com