Re: archive falling behind

From: German Becker
Subject: Re: archive falling behind
Msg-id CALyjCLu-siDm5FH12XhkhmVr746p6HX5+gt3t7OokDL3OG56nQ@mail.gmail.com
In reply to: Re: archive falling behind  (German Becker <german.becker@gmail.com>)
List: pgsql-admin
Actually, this looks like a very strange filesystem/hardware problem. The WAL segments keep "changing" even after I stopped the database, when supposedly no one is accessing them:

root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
6fd36722641dc2857bb950437c052fa3  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
26e9c82d123513528824bdf9815dbd2b  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
649111a77ac7ec26f4ddeed18e039faa  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# lsof 000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
ac9ba79e672bc5df2c126044e9054ff7  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
8956e59a4542599e8ded7450b7cab5a6  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
514dccfe7f5df4c55747e14e6c13268f  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
f2c53795afcbc7c150443a3cdd3550bb  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
79687effd43c0e51a127a677e14a815c  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
51b66cd72ed3fb11aa57fab244696e0f  000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# md5sum 000000010000001000000049
bf1a2ec5847c40a0b9200769cff601e4  000000010000001000000049

root@lemur:/var/lib/postgresql/9.1/main/pg_xlog# lsof 000000010000001000000049
root@lemur:/var/lib/postgresql/9.1/main/pg_xlog#
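
The manual check above can be scripted. This is only a minimal sketch: the function name `count_distinct_checksums` and the default of 10 runs are assumptions made for illustration. It hashes one file repeatedly and prints how many distinct checksums were seen.

```shell
#!/bin/sh
# Hypothetical helper: hash the same file several times and print the
# number of distinct checksums observed.
count_distinct_checksums() {
    file=$1          # file to check, e.g. a WAL segment
    runs=${2:-10}    # number of reads (assumed default: 10)
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Read via stdin so the file name does not appear in the output.
        md5sum < "$file"
        i=$((i + 1))
    done | sort -u | wc -l | tr -d ' '
}
```

For example, `count_distinct_checksums 000000010000001000000049 10` should print 1 for a file that nobody is writing to. Any larger number means repeated reads of the same file returned different data.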


Maybe this is off-topic, but has anyone seen something like this? I'm on Ubuntu 12.04. This is the mount line for the hard drive (the drive is used exclusively for the pg_xlog directory):

/dev/sdb1 on /storage/sdb1 type ext4 (rw,noatime,errors=remount-ro)

Thanks!


On Fri, Apr 26, 2013 at 4:25 PM, German Becker <german.becker@gmail.com> wrote:
Hi, I have reverted to cp as the archive command, but now under heavy load (> 150 WAL segments in a minute) some WAL segments get corrupted:

postgres@lemur:~/9.1/main/pg_xlog$ md5sum 000000010000001000000049
f1906d2745224430f811496df466203f  000000010000001000000049
postgres@lemur:~/9.1/main/pg_xlog$ md5sum ~/backups/wal/000000010000001000000049
7e73fe759e41e427497360a815f9d3e1  /var/lib/postgresql/backups/wal/000000010000001000000049





On Fri, Apr 26, 2013 at 10:55 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
German Becker wrote:
> Here is the archive part of the config:
>
> archive_mode = on               # allows archiving to be done
>                                 # (change requires restart)
> archive_command = '/var/lib/postgresql/scripts/archive_copy.sh %p %f'           # command to use to archive a logfile segment
> #archive_timeout = 0            # force a logfile segment switch after this
>                                 # number of seconds; 0 disables

So the problem might be in that script.

> The archive command makes a local copy and then copies it to the backup server via ssh. Both copies
> are md5-checked and retried up to 3 times in case of failure.

archive_command should not retry the operation, but rather
return a non-zero return code.

> I have seen under heavy load that some WALs are skipped, some are smaller than expected, and some are
> corrupted (i.e., the loop fails 3 times).
> I'm not sure about the return value (checking it). What is the expected behaviour of the archiver?
> Will it retry the archive if the archive command returns something other than 0? Will it retain the
> WAL segment until it is successfully archived?

See http://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-ARCHIVING-WAL

archive_command should exit with zero only if the
WAL segment was archived successfully.
PostgreSQL will retry and retain the WAL segment until
archival succeeds.
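
As an illustration of that contract, here is a minimal sketch of an archive script body. It is not the archive_copy.sh discussed in this thread: the function name `archive_segment` and the temporary-file naming are assumptions. It copies once, verifies the copy, and returns non-zero on any failure, so that PostgreSQL itself handles the retry.

```shell
#!/bin/sh
# Hypothetical sketch: archive one WAL segment without retrying.
# Returns 0 only if the verified copy is in place, non-zero otherwise.
archive_segment() {
    src=$1        # %p: path of the WAL segment to archive
    name=$2       # %f: file name of the segment
    dest_dir=$3   # archive directory (assumed to exist)

    # Refuse to overwrite a segment that was already archived.
    if [ -e "$dest_dir/$name" ]; then
        echo "archive: $dest_dir/$name already exists" >&2
        return 1
    fi

    # Copy to a temporary name so a partial copy never looks complete.
    cp "$src" "$dest_dir/$name.tmp" || return 1

    # Verify the copy byte for byte; on mismatch, clean up and fail.
    if ! cmp -s "$src" "$dest_dir/$name.tmp"; then
        rm -f "$dest_dir/$name.tmp"
        return 1
    fi

    # Move the verified copy to its final name; mv's status is returned.
    mv "$dest_dir/$name.tmp" "$dest_dir/$name"
}
```

A wrapper script used as archive_command could end with `archive_segment "$1" "$2" /var/lib/postgresql/backups/wal` followed by `exit $?`, so that the function's status becomes the exit code PostgreSQL sees.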

Yours,
Laurenz Albe

