Обсуждение: Cron DB Bounce causes index problems && Help rotating logs

Поиск
Список
Период
Сортировка

Cron DB Bounce causes index problems && Help rotating logs

От
"Chris Hoover"
Дата:
Hi all,

We are having an issue here trying to automate our db log rotations.  We were using the Apache rotatelogs program and
wethought it was working fine.  However, when we had a server crash and had to recover and roll forward using the
postgresqllogs (we are logging all sql), we discovered that some of the sql records in the log were being over written.
This has caused us to go back to just having postgresql log directly to the drives.  (The problem with rotatelogs
lookedlike this "[2005-01-14 08:00:01] [123]  LOG: query: select * fr[2005-01-14 08:00:01] ....") 

Anyway, we are now manually rotating the logs.  We tried to rotate the logs via a simple script and cron, but we ended
upcausing strange index issues.  The data would be in the tables and would show if a table scan was forced.  But many
tableswould not show the data if an index scan (esp. if the primary key was used) was used.  We also had issues with
systemallowing records to be inserted that violated the primary key. 

Manually logging into the servers and running the same script from the shell prompt works fine and does not cause any
indexissues. 

Any ideas on what is going on here?  We do not have any postgresql related environmental variables set in the shell.
Dowe need to execute a db bounce via cron with a "special" call from cron? 

Also, has anyone else seen the rotatelogs issue?

Is there a better way to rotate the logs w/o bouncing the db every night?  Due to recovery requirements, we must have
thesql logged to the postgresql log.  These logs grow to between 2 and 5 GB every day.  Running for more than 2 days
withoutrotating the logs tends to cause a noticable slowdown to our users. 

Any help would be appreciated.

Chris

P.S.

Here is the cron line:
01 20 * * * /pgsql/tmp/bin/rotate_logs.sh 50001

Here is the rotate_log.sh script:

source ~postgres/etc/dba.profile

PORT=$1

LOGPATH=${LOG_PATH}/${PORT}/pg_log

echo "`date` - `hostname` - log rotation started" | mail -s "`hostname` rotation" $DBAPAGER

## shutdown the database
${POSTGRES_BIN}/pg_ctl -D ${DB_PATH}/${PORT} -m fast stop > ~postgres/db_${PORT}_stop.log 2>&1

STOPPED=`cat ~postgres/db_${PORT}_stop.log | grep "postmaster successfully shut down" | wc -l`
if [ $STOPPED != 1 ]
then
  echo "POSTGRES DID NOT STOP" | mail -s "`hostname` stop fail" $DBAPAGER
  exit
fi

sleep 150

mv $LOGPATH/postgres.log $LOGPATH/postgres.log.`date +%m%d%y%H%M`

sleep 150

${POSTGRES_BIN}/pg_ctl -D ${DB_PATH}/${PORT} -l ${LOGPATH}/postgres.log start > ~postgres/db_${PORT}_start.log 2>&1
STARTED=`grep "successfully start" ~postgres/db_${PORT}_start.log | wc -l `
if [ $STARTED != 1 ]
then
  echo "POSTGRES DID NOT START" | mail -s "`hostname` start fail" $DBAPAGER
fi

echo "`date` - `hostname` - log rotation finished" | mail -s "`hostname` rotation" $DBAPAGER


Here is the dba.profile file:
DBAPAGER=<my pager number>

POSTGRES=/pgsql/

POSTGRES_BIN=${POSTGRES}/bin

DB_PATH=/data

LOG_PATH=/data



Re: Cron DB Bounce causes index problems && Help rotating logs

От
Richard_D_Levine@raytheon.com
Дата:
Hi Chris,

In the rotate_logs.sh I would replace the first sleep 150 with a call to
fuser (in a sleeping loop) to make sure there is no file access to the log
file.  I don't understand the second sleep 150.

Rick



                   
                      "Chris Hoover"
                   
                      <revoohc@sermonaudio.        To:       <pgsql-admin@postgresql.org>
                   
                      com>                         cc:
                   
                      Sent by:                     Subject:  [ADMIN] Cron DB Bounce causes index problems && Help
rotatinglogs             
                      pgsql-admin-owner@pos
                   
                      tgresql.org
                   

                   

                   
                      01/14/2005 10:16 AM
                   

                   

                   




Hi all,

We are having an issue here trying to automate our db log rotations.  We
were using the Apache rotatelogs program and we thought it was working
fine.  However, when we had a server crash and had to recover and roll
forward using the postgresql logs (we are logging all sql), we discovered
that some of the sql records in the log were being over written.  This has
caused us to go back to just having postgresql log directly to the drives.
(The problem with rotatelogs looked like this "[2005-01-14 08:00:01] [123]
LOG: query: select * fr[2005-01-14 08:00:01] ....")

Anyway, we are now manually rotating the logs.  We tried to rotate the logs
via a simple script and cron, but we ended up causing strange index issues.
The data would be in the tables and would show if a table scan was forced.
But many tables would not show the data if an index scan (esp. if the
primary key was used) was used.  We also had issues with system allowing
records to be inserted that violated the primary key.

Manually logging into the servers and running the same script from the
shell prompt works fine and does not cause any index issues.

Any ideas on what is going on here?  We do not have any postgresql related
environmental variables set in the shell.  Do we need to execute a db
bounce via cron with a "special" call from cron?

Also, has anyone else seen the rotatelogs issue?

Is there a better way to rotate the logs w/o bouncing the db every night?
Due to recovery requirements, we must have the sql logged to the postgresql
log.  These logs grow to between 2 and 5 GB every day.  Running for more
than 2 days without rotating the logs tends to cause a noticable slowdown
to our users.

Any help would be appreciated.

Chris

P.S.

Here is the cron line:
01 20 * * * /pgsql/tmp/bin/rotate_logs.sh 50001

Here is the rotate_log.sh script:

source ~postgres/etc/dba.profile

PORT=$1

LOGPATH=${LOG_PATH}/${PORT}/pg_log

echo "`date` - `hostname` - log rotation started" | mail -s "`hostname`
rotation" $DBAPAGER

## shutdown the database
${POSTGRES_BIN}/pg_ctl -D ${DB_PATH}/${PORT} -m fast stop >
~postgres/db_${PORT}_stop.log 2>&1

STOPPED=`cat ~postgres/db_${PORT}_stop.log | grep "postmaster successfully
shut down" | wc -l`
if [ $STOPPED != 1 ]
then
  echo "POSTGRES DID NOT STOP" | mail -s "`hostname` stop fail" $DBAPAGER
  exit
fi

sleep 150

mv $LOGPATH/postgres.log $LOGPATH/postgres.log.`date +%m%d%y%H%M`

sleep 150

${POSTGRES_BIN}/pg_ctl -D ${DB_PATH}/${PORT} -l ${LOGPATH}/postgres.log
start > ~postgres/db_${PORT}_start.log 2>&1
STARTED=`grep "successfully start" ~postgres/db_${PORT}_start.log | wc -l `
if [ $STARTED != 1 ]
then
  echo "POSTGRES DID NOT START" | mail -s "`hostname` start fail" $DBAPAGER
fi

echo "`date` - `hostname` - log rotation finished" | mail -s "`hostname`
rotation" $DBAPAGER


Here is the dba.profile file:
DBAPAGER=<my pager number>

POSTGRES=/pgsql/

POSTGRES_BIN=${POSTGRES}/bin

DB_PATH=/data

LOG_PATH=/data



---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster




Re: Cron DB Bounce causes index problems && Help rotating logs

От
Tom Lane
Дата:
"Chris Hoover" <revoohc@sermonaudio.com> writes:
> We are having an issue here trying to automate our db log rotations.

What PG version is this?  On what platform?

> We were using the Apache rotatelogs program and we thought it was
> working fine.  However, when we had a server crash and had to recover
> and roll forward using the postgresql logs (we are logging all sql),
> we discovered that some of the sql records in the log were being over
> written.  This has caused us to go back to just having postgresql log
> directly to the drives.  (The problem with rotatelogs looked like this
> "[2005-01-14 08:00:01] [123] LOG: query: select * fr[2005-01-14
> 08:00:01] ....")

Are you sure it was overwritten, and not just that pieces of different
messages were interspersed?  Even that seems a bit unlikely unless
the messages exceeded 4K apiece, which your example certainly doesn't.

> Anyway, we are now manually rotating the logs.  We tried to rotate the
> logs via a simple script and cron, but we ended up causing strange
> index issues.  The data would be in the tables and would show if a
> table scan was forced.  But many tables would not show the data if an
> index scan (esp. if the primary key was used) was used.  We also had
> issues with system allowing records to be inserted that violated the
> primary key.

That sounds like a corrupted index --- time to do a REINDEX.  I strongly
doubt that this has anything to do with the stderr log though.

> Is there a better way to rotate the logs w/o bouncing the db every night?

Piping into rotatelogs is really the best bet.  (In 8.0 there is a built
in equivalent of this method.)  I would recommend trying to find out why
that's not behaving right for you, rather than abandoning it at the
first sign of trouble.

In general, though, I haven't got a lot of confidence in the idea of
using log_statement as a backup technique.  There's no guarantee that
the log entries will appear in exactly the order in which they
effectively executed, and even without that issue, pulling apart
interspersed separate transactions seems a bit painful.

            regards, tom lane

Re: Cron DB Bounce causes index problems && Help rotating logs

От
"Chris Hoover"
Дата:
We are doing "real" backups using pg_dump every night.  The problem is
that we need to provide for an absolute minimum data loss.  We are using
the logging of the sql for this.  Right now, we can not change the
version of the db we are on (7.3.4), so we are trying to make the most
of what we have.

All the pg_dump allows you to do is to give a starting point.  You would
still lose any work done since the last backup.

A.M. wrote:

> What you are trying to do is emulate point-in-time recovery (WAL
> backup) which is available in 8.0. The problem you face is that sql
> logging just does that- logging all input. That is, if I connect to
> your DB and send the query "skdfjhsdkf"- that will end up in  the logs
> too. In addition, there is a limit on the number of characters logged
> from the statement.
>
> The solution is to use pg_dump for real backups or upgrade to 8.0 for
> PITR. Good luck.
>
> On Friday, January 14, 2005, at 03:16 PM, Chris Hoover wrote:
>
>> Hi all,
>>
>> We are having an issue here trying to automate our db log rotations.
>> We were using the Apache rotatelogs program and we thought it was
>> working fine.  However, when we had a server crash and had to recover
>> and roll forward using the postgresql logs (we are logging all sql),
>> we discovered that some of the sql records in the log were being over
>> written.  This has caused us to go back to just having postgresql log
>> directly to the drives.  (The problem with rotatelogs looked like
>> this "[2005-01-14 08:00:01] [123]  LOG: query: select * fr[2005-01-14
>> 08:00:01] ....")
>>
>> Anyway, we are now manually rotating the logs.  We tried to rotate
>> the logs via a simple script and cron, but we ended up causing
>> strange index issues.  The data would be in the tables and would show
>> if a table scan was forced.  But many tables would not show the data
>> if an index scan (esp. if the primary key was used) was used.  We
>> also had issues with system allowing records to be inserted that
>> violated the primary key.
>>
>> Manually logging into the servers and running the same script from
>> the shell prompt works fine and does not cause any index issues.
>>
>> Any ideas on what is going on here?  We do not have any postgresql
>> related environmental variables set in the shell.  Do we need to
>> execute a db bounce via cron with a "special" call from cron?
>>
>> Also, has anyone else seen the rotatelogs issue?
>>
>> Is there a better way to rotate the logs w/o bouncing the db every
>> night?  Due to recovery requirements, we must have the sql logged to
>> the postgresql log.  These logs grow to between 2 and 5 GB every
>> day.  Running for more than 2 days without rotating the logs tends to
>> cause a noticable slowdown to our users.
>>
>> Any help would be appreciated.
>>
>> Chris
>>
>> P.S.
>>
>> Here is the cron line:
>> 01 20 * * * /pgsql/tmp/bin/rotate_logs.sh 50001
>>
>> Here is the rotate_log.sh script:
>>
>> source ~postgres/etc/dba.profile
>>
>> PORT=$1
>>
>> LOGPATH=${LOG_PATH}/${PORT}/pg_log
>>
>> echo "`date` - `hostname` - log rotation started" | mail -s
>> "`hostname` rotation" $DBAPAGER
>>
>> ## shutdown the database
>> ${POSTGRES_BIN}/pg_ctl -D ${DB_PATH}/${PORT} -m fast stop >
>> ~postgres/db_${PORT}_stop.log 2>&1
>>
>> STOPPED=`cat ~postgres/db_${PORT}_stop.log | grep "postmaster
>> successfully shut down" | wc -l`
>> if [ $STOPPED != 1 ]
>> then
>>   echo "POSTGRES DID NOT STOP" | mail -s "`hostname` stop fail"
>> $DBAPAGER
>>   exit
>> fi
>>
>> sleep 150
>>
>> mv $LOGPATH/postgres.log $LOGPATH/postgres.log.`date +%m%d%y%H%M`
>>
>> sleep 150
>>
>> ${POSTGRES_BIN}/pg_ctl -D ${DB_PATH}/${PORT} -l
>> ${LOGPATH}/postgres.log start > ~postgres/db_${PORT}_start.log 2>&1
>> STARTED=`grep "successfully start" ~postgres/db_${PORT}_start.log |
>> wc -l `
>> if [ $STARTED != 1 ]
>> then
>>   echo "POSTGRES DID NOT START" | mail -s "`hostname` start fail"
>> $DBAPAGER
>> fi
>>
>> echo "`date` - `hostname` - log rotation finished" | mail -s
>> "`hostname` rotation" $DBAPAGER
>>
>>
>> Here is the dba.profile file:
>> DBAPAGER=<my pager number>
>>
>> POSTGRES=/pgsql/
>>
>> POSTGRES_BIN=${POSTGRES}/bin
>>
>> DB_PATH=/data
>>
>> LOG_PATH=/data
>>
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 4: Don't 'kill -9' the postmaster
>
>
>
>
>