Thread: Archive Command Configuration

Archive Command Configuration

From
Pallav Kalva
Date:
Hi,


    I am in the process of implementing failover for our production
database. In order to be able to restore to the last archive log, I have
a cron job on the production database that runs every 5-10 minutes to
check whether there are any new archive logs and copy them to the
remote standby failover machine.

 The problem with this scenario is that I might scp a partially written
archive log over to the remote machine, since ours is a heavily updated
database with batch jobs running most of the time.

   So, in order to overcome this, we want to change the archive_command to

archive_command = 'cp %p /archives-temp/%f && mv /archives-temp/%f /archives/%f'
( I could also do a remote copy with the command, but I don't want to go
that route because of network problems. )


  With this command the archive log files are first copied to a
temporary location and then moved to the actual location, at which I
can point my cron job to do the scp to the remote location every 5-10
minutes.
   I tried this command on a test machine and made sure that the
logs are moved from the archives-temp directory to the archives
directory; it works.
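
A minimal sketch of the copy-then-rename idea described above, written as a shell function rather than an inline archive_command. The names archive_wal, STAGING_DIR and FINAL_DIR are illustrative and not from the original post. The sketch assumes both directories live on the same filesystem, so that mv is a plain rename and the file shows up in the final directory either complete or not at all. The refusal to overwrite an existing file is an added safeguard, not part of the command in the post.

```shell
#!/bin/sh
# Sketch of a two-step archive helper: copy into a staging directory,
# then rename into the directory that the cron job reads from.
# archive_wal, STAGING_DIR and FINAL_DIR are illustrative names.

STAGING_DIR="${STAGING_DIR:-/archives-temp}"
FINAL_DIR="${FINAL_DIR:-/archives}"

archive_wal() {
    src="$1"     # the path PostgreSQL substitutes for %p
    name="$2"    # the file name PostgreSQL substitutes for %f

    # Added safeguard: refuse to overwrite a segment already archived.
    if [ -e "$FINAL_DIR/$name" ]; then
        echo "archive_wal: $FINAL_DIR/$name already exists" >&2
        return 1
    fi

    # Step 1: the slow part, done where the cron job never looks.
    cp "$src" "$STAGING_DIR/$name" || return 1

    # Step 2: on a single filesystem mv is a rename, so the file
    # appears in FINAL_DIR complete or not at all.
    mv "$STAGING_DIR/$name" "$FINAL_DIR/$name"
}
```

A wrapper script built around this would end with a call such as archive_wal "$1" "$2" and be referenced from archive_command with %p and %f as its two arguments. A non-zero exit status tells PostgreSQL that archiving failed, and it will retry the segment later.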


  My questions: is there any problem with this approach? Will I run
into problems in the future? Are there better ways of solving
this problem?

  The PostgreSQL version is 8.0.2.

Thanks!
Pallav.

Re: Archive Command Configuration

From
Steve Crawford
Date:
>...I have a
> cron job from the production database that runs every 5-10min to check
> if there are any new archive logs and copy new archive logs to the
> remote stand by failover machine.
>
> The problem with this scenario is that there might be a possibility that
> I might scp a partially filled archive log over to the remote machine on
> a heavily updated databases with batch jobs runnings most of the time
> like ours.
>
>   So, inorder to over come this we want to change the archive_command to
>
> archive_command = 'cp %p /archives-temp/%f && mv /archives-temp/%f
> /archives/%f'
> ( I could do remote copy also with the command but I dont want to go
> that route because of network problems. )
>
> ....
>
>  My question is there any problem in using this approach ? will I run
> into problems in the future ? are there any other better ways of solving
> this problem ?

Looks like it should work but have you considered using rsync instead?
If the file grows from one pass to the next, rsync will send the
differences.

Given the small number of files (at least far fewer than the
half-million file trees my rsyncs process every 10 minutes) in the
archive directory, rsync will probably only take a fraction of a second
to determine whether or not any transfers are needed. I'll bet that you
could run rsync every 15 seconds with virtually no increase in system
load other than whatever is required to transfer the data and you have
that load already.

Rsync can also handle deleting files from the receiving end if you wish.

I am starting to work on a wal-shipping backup and I plan to try rsync
first.

Some advice. I make extensive use of rsync. The core skeleton of my
scripts goes something like this:

cd /the/appropriate/place
if [ -f rsync.lockfile ]; then
   # Exit - new pass started before previously
   # started rsync completed
   exit
fi
date > rsync.lockfile
rsync <your parameters as appropriate>
rm rsync.lockfile

I run another script on the target machine that periodically checks the
age of rsync.lockfile and sends alerts if it is excessively old (where
excessive will be determined by the specifics of your setup - I use one
hour).
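
The age check described above could look roughly like the following. This is only a sketch: lock_is_stale and check_lock are hypothetical names, the echo stands in for whatever alerting is actually used, and the test relies on the -mmin option that GNU and BSD find provide.

```shell
#!/bin/sh
# lock_is_stale FILE MINUTES
# Succeeds (exit 0) when FILE exists and was last modified more than
# MINUTES minutes ago. Relies on find's -mmin test (GNU and BSD find).
lock_is_stale() {
    [ -f "$1" ] || return 1
    [ -n "$(find "$1" -mmin +"$2" 2>/dev/null)" ]
}

# check_lock FILE
# Prints an alert line when FILE is older than the one-hour limit
# mentioned above; replace the echo with real alerting.
check_lock() {
    if lock_is_stale "$1" 60; then
        echo "ALERT: $1 is more than 60 minutes old"
    fi
}
```

Run from cron on the machine that can see the lock file, check_lock with the lock file's path prints nothing while passes are completing normally and one line when a pass appears to be stuck.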

Cheers,
Steve

Re: Archive Command Configuration

From
Pallav Kalva
Date:
Steve Crawford wrote:
>> ...I have a
>> cron job from the production database that runs every 5-10min to
>> check if there are any new archive logs and copy new archive logs to
>> the remote stand by failover machine.
>>
>> The problem with this scenario is that there might be a possibility
>> that I might scp a partially filled archive log over to the remote
>> machine on a heavily updated databases with batch jobs runnings most
>> of the time  like ours.
>>
>>   So, inorder to over come this we want to change the archive_command to
>>
>> archive_command = 'cp %p /archives-temp/%f && mv /archives-temp/%f
>> /archives/%f'
>> ( I could do remote copy also with the command but I dont want to go
>> that route because of network problems. )
>>
>> ....
>>
>>  My question is there any problem in using this approach ? will I run
>> into problems in the future ? are there any other better ways of
>> solving this problem ?
>
> Looks like it should work but have you considered using rsync instead?
> If the file grows from one pass to the next, rsync will send the
> differences.
>
> Given the small number of files (at least far fewer thatn the
> half-million file trees my rsyncs process every 10 minutes) in the
> archive directory, rsync will probably only take a fraction of a
> second to determine whether or not any transfers are needed. I'll bet
> that you could run rsync every 15 seconds with virtually no increase
> in system load other than whatever is required to transfer the data
> and you have that load already.
>
> Rsync can also handle deleting files from the receiving end if you wish.
>
> I am starting to work on a wal-shipping backup and I plan to try rsync
> first.
>
> Some advice. I make extensive use of rsync. The core skeleton of my
> scripts goes something like this:
>
> cd /the/appropriate/place
> if [ -f rsync.lockfile ]
>   # Exit - new pass started before previously
>   # started rsync completed
>   exit
> fi
> date > rsync.lockfile
> rsync <your parameters as appropriate>
> rm rsync.lockfile
>
> I run another script on the target machine that periodically checks
> the age of rsync.lockfile and sends alerts if it is excessively old
> (where excessive will be determined by the specifics of your setup - I
> use one hour).
>
> Cheers,
> Steve
>
Hi Steve,

   Thanks for the quick reply. I thought about rsync too, but wasn't
completely sure how it handles partial files. I use rsync for all
the backups, and it works fine for every application except our mail
application: it copies the files, but at the end of the job it gives me
a message like

------------------------------------------------------------------------------------
send_files failed to open //this/file/ : No such file or directory

rsync error: some files could not be transferred (code 23) at main.c(1158)
------------------------------------------------------------------------------------

  Considering that it is copying all our mail messages and takes a long
time to rsync the whole thing, a file might change or be deleted
completely between the time rsync gets the list of files and the time
it does the actual transfer.

 Pallav.

Re: Archive Command Configuration

From
Steve Crawford
Date:
> Hi Steve,
>
>   Thanks! for the quick reply,  I thought about rsync too, but wasnt
> sure about completely how it handles partial files. I use rsync for all
> the backups, it works fine for all the application except for our mail
> application it copies the files but at the end of the job it gives me
> message like
>
> ------------------------------------------------------------------------------------
>
> send_files failed to open //this/file/ : No such file or directory
>
> rsync error: some files could not be transferred (code 23) at main.c(1158)
> ------------------------------------------------------------------------------------
>
>
>  Considering it is copying all our mail messages and it takes a lot of
> time to rsync the whole thing, there might be changes to the file or
> file gets deleted completely from time it gets the files to rsync and do
> the actual rsync.

Not to worry. I see this from time to time as well. Rsync builds a list
of files to sync and then syncs them. If a file is deleted between the
time rsync builds the list and when it tries to process that file, it
will return that error.

BTW, if you haven't done so already, check out version 2.6.7 released a
week or so ago. There are some nice enhancements to the include/exclude
code, a --prune-empty-dirs option so when your exclude options eliminate
all the files in a directory you don't end up creating an empty
directory on the target machine, and an --append option to improve
efficiency when transferring files that are only appended to, not
changed (great for logs and such).

Cheers,
Steve

Re: Archive Command Configuration

From
"Jim C. Nasby"
Date:
On Thu, Mar 23, 2006 at 12:28:41PM -0800, Steve Crawford wrote:
> >...I have a
> >cron job from the production database that runs every 5-10min to check
> >if there are any new archive logs and copy new archive logs to the
> >remote stand by failover machine.
> >
> >The problem with this scenario is that there might be a possibility that
> >I might scp a partially filled archive log over to the remote machine on
> >a heavily updated databases with batch jobs runnings most of the time
> >like ours.
> >
> >  So, inorder to over come this we want to change the archive_command to
> >
> >archive_command = 'cp %p /archives-temp/%f && mv /archives-temp/%f
> >/archives/%f'
> >( I could do remote copy also with the command but I dont want to go
> >that route because of network problems. )
> >
> >....
> >
> > My question is there any problem in using this approach ? will I run
> >into problems in the future ? are there any other better ways of solving
> >this problem ?
>
> Looks like it should work but have you considered using rsync instead?
> If the file grows from one pass to the next, rsync will send the
> differences.

rsync does nothing to address the race condition issue, though. If you
fire up an rsync and the cp takes too long, you'll end up syncing a
partially written WAL file, which could potentially be bad.

BTW, there's also now a project on pgFoundry for using PITR to keep a
warm-backup: http://pgfoundry.org/projects/pgpitrha/
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Archive Command Configuration

From
Pallav Kalva
Date:
Jim C. Nasby wrote:
>On Thu, Mar 23, 2006 at 12:28:41PM -0800, Steve Crawford wrote:
>
>>>...I have a
>>>cron job from the production database that runs every 5-10min to check
>>>if there are any new archive logs and copy new archive logs to the
>>>remote stand by failover machine.
>>>
>>>The problem with this scenario is that there might be a possibility that
>>>I might scp a partially filled archive log over to the remote machine on
>>>a heavily updated databases with batch jobs runnings most of the time
>>>like ours.
>>>
>>> So, inorder to over come this we want to change the archive_command to
>>>
>>>archive_command = 'cp %p /archives-temp/%f && mv /archives-temp/%f
>>>/archives/%f'
>>>( I could do remote copy also with the command but I dont want to go
>>>that route because of network problems. )
>>>
>>>....
>>>
>>>My question is there any problem in using this approach ? will I run
>>>into problems in the future ? are there any other better ways of solving
>>>this problem ?
>>>
>>Looks like it should work but have you considered using rsync instead?
>>If the file grows from one pass to the next, rsync will send the
>>differences.
>>
>
>rsync does nothing to address the race condition issue, though. If you
>fire up an rsync and the cp takes too long, you'll end up syncing a
>partially written WAL file, which could potentially be bad.
>
>BTW, there's also now a project on pgFoundry for using PITR to keep a
>warm-backup: http://pgfoundry.org/projects/pgpitrha/
>
Hi Jim,

   What do you think about this option? If I just scp from the
/archives folder, I will be OK and won't copy any partial files at any
time, right?
archive_command = 'cp %p /archives-temp/%f && mv /archives-temp/%f /archives/%f'


Pallav.




Re: Archive Command Configuration

From
"Jim C. Nasby"
Date:
On Fri, Mar 24, 2006 at 10:39:41AM -0500, Pallav Kalva wrote:
>   What do you think about this option ? If I just do scp from the
> /archives folder, I will be ok and wont copy any partial files at any
> time right ?
> archive_command = 'cp %p /archives-temp/%f && mv /archives-temp/%f
> /archives/%f'

A simple scp leaves you vulnerable to a WAL file only being partially
copied if a backhoe hits at just the right moment. I believe that
wouldn't be an issue with rsync, since it won't put a file in place
until it's completely transferred. So if you want to use scp, I think
you'd need to scp to a temporary directory, and then do a final move.

Re: Archive Command Configuration

From
Pallav Kalva
Date:
Jim C. Nasby wrote:
>On Fri, Mar 24, 2006 at 10:39:41AM -0500, Pallav Kalva wrote:
>
>>  What do you think about this option ? If I just do scp from the
>>/archives folder, I will be ok and wont copy any partial files at any
>>time right ?
>>archive_command = 'cp %p /archives-temp/%f && mv /archives-temp/%f
>>/archives/%f'
>>
>
>A simple scp leaves you vulnerable to a WAL file only being partially
>copied if a backhoe hits at just the right moment. I believe that
>wouldn't be an issue with rsync, since it won't put a file in place
>until it's completely transfered. So if you want to use scp, I think
>you'd need to scp to a temporary directory, and then do a final move.
>

Are you talking about scp to a temporary directory followed by a move
on the remote machine?

The idea with this archive command is to make postgres copy first to
the "archives-temp" directory and, once the file has been copied
completely, move it to the "archives" directory on the source machine.
My cron job then regularly does an scp from the archives folder on the
source machine to a remote location. This way it never does an scp on
a partial file, since files are moved only once they are completely
copied.

This way I will never scp a partial file, and I can be sure that the
copy on the remote machine is a complete, clean copy.

(archive_command = 'cp %p /archives-temp/%f && mv /archives-temp/%f /archives/%f')

Re: Archive Command Configuration

From
"Jim C. Nasby"
Date:
On Fri, Mar 24, 2006 at 12:12:54PM -0500, Pallav Kalva wrote:
> Jim C. Nasby wrote:
> >On Fri, Mar 24, 2006 at 10:39:41AM -0500, Pallav Kalva wrote:
> >
> >> What do you think about this option ? If I just do scp from the
> >>/archives folder, I will be ok and wont copy any partial files at any
> >>time right ?
> >>archive_command = 'cp %p /archives-temp/%f && mv /archives-temp/%f
> >>/archives/%f'
> >>
> >
> >A simple scp leaves you vulnerable to a WAL file only being partially
> >copied if a backhoe hits at just the right moment. I believe that
> >wouldn't be an issue with rsync, since it won't put a file in place
> >until it's completely transfered. So if you want to use scp, I think
> >you'd need to scp to a temporary directory, and then do a final move.
> >
>
> Are you talking about scp to a temporary directory and move on a remote
> machine ?
>
> The idea with this archive command is to make postgres copy first to
> "archive-temp" directory  and once the copy of the file is done
> completely then move it to "archives" directory on the source machine.
> My cronjob then does scp from the archives folder on source machine
> regularly to a remote location, this way it never does a scp on a
> partial file since files are movied  only when it is completely copied.
>
> This way I will never scp a partial file and be sure that copy on the
> remote machine is a complete clean copy.
>
> (archive_command = 'cp %p /archives-temp/%f && mv /archives-temp/%f
> /archives/%f')

scp archives/%f remote-machine:temp
ssh remote-machine mv temp/%f archive/%f

That's the only way I can think of to ensure you don't get a partial
logfile on the remote machine.
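
As a rough sketch of how those two commands might be wrapped in a cron job: the function name ship_archives, the host name and the directory arguments are all hypothetical. SCP and SSH are variables so the loop can be dry-run with echo. The sketch re-sends every file on each pass; a real job would need to skip or remove segments that have already been shipped. The remote staging and final directories are assumed to be on the same filesystem so the remote mv is a rename.

```shell
#!/bin/sh
# Sketch: ship every file in a local archive directory to a remote host,
# landing each one in a staging directory first and renaming it afterwards.
# ship_archives and its arguments are illustrative names.
# SCP and SSH can be overridden, e.g. SCP="echo scp", for a dry run.

SCP="${SCP:-scp}"
SSH="${SSH:-ssh}"

# ship_archives LOCAL_DIR REMOTE_HOST REMOTE_TMP_DIR REMOTE_FINAL_DIR
ship_archives() {
    local_dir="$1"; remote="$2"; remote_tmp="$3"; remote_final="$4"

    for path in "$local_dir"/*; do
        # Skips non-files and the unexpanded glob of an empty directory.
        [ -f "$path" ] || continue
        name=$(basename "$path")

        # Copy into the remote staging directory; stop on the first failure
        # so a broken transfer is never renamed into place.
        $SCP "$path" "$remote:$remote_tmp/$name" || return 1

        # Only after a successful copy, rename into the final directory.
        $SSH "$remote" mv "$remote_tmp/$name" "$remote_final/$name" || return 1
    done
}
```

Setting SCP="echo scp" and SSH="echo ssh" before calling ship_archives prints the commands that would be run without touching the network.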