Re: pg v. 8.4.5 misses objects and data after restoring from backup using wal

Поиск
Список
Период
Сортировка
От Imre Oolberg
Тема Re: pg v. 8.4.5 misses objects and data after restoring from backup using wal
Дата
Msg-id 4D21AC15.1010102@auul.pri.ee
обсуждение исходный текст
Ответ на pg v. 8.4.5 misses objects and data after restoring from backup using wal  (Imre Oolberg <imre@auul.pri.ee>)
Ответы Re: pg v. 8.4.5 misses objects and data after restoring from backup using wal  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Список pgsql-admin
Hi and Happy New Year!

Unfortunately i still have the same problem, in the mean time i tried
out with different versions and still get same results. Experimenting on
Debian Lenny v 5.0 with v. compiled v. 9.0.2 where default conf is
changed only in this

wal_level = archive
archive_mode = on
archive_command = 'test ! -f /data/backup/postgresql/archive-logs/%f &&
cp %p /data/backup/postgresql/archive-logs'

and having as a starting point of the recovery

1. data cluster file sytem copied with rsync between pg_start_backup and
pg_stop backup
2. wal logs archived starting earier that pg_start_backup and continuing
after pg_stop_backup

I guess that my problem is probably that although my .backup file says

000000010000000000000009.00000020.backup
START WAL LOCATION: 0/9000020 (file 000000010000000000000009)
STOP WAL LOCATION: 1/6325B2E0 (file 000000010000000100000063)
CHECKPOINT LOCATION: 0/A2C4908
START TIME: 2011-01-03 11:32:17 EET
LABEL: test
STOP TIME: 2011-01-03 12:18:27 EET

looking at the pg_ctl.log it starts reading wal logs beginning with
000000010000000100000063 and not with 000000010000000000000009

LOG:  database system was interrupted; last known up at 2011-01-03
11:58:56 EET
LOG:  creating missing WAL directory "pg_xlog/archive_status"
LOG:  starting archive recovery
LOG:  restored log file "000000010000000100000063" from archive
LOG:  consistent recovery state reached at 1/6325B2B8
LOG:  redo starts at 1/6325B2B8
LOG:  restored log file "000000010000000100000064" from archive
LOG:  restored log file "000000010000000100000065" from archive
LOG:  restored log file "000000010000000100000066" from archive
...
LOG:  restored log file "000000010000000100000081" from archive
cp: cannot stat
`/data/backup/postgresql/archive-logs/000000010000000100000082': No such
file or directory
LOG:  could not open file "pg_xlog/000000010000000100000082" (log file
1, segment 130): No such file or directory
LOG:  redo done at 1/81530E18
LOG:  last completed transaction was at log time 2011-01-03
12:20:31.917695+02
LOG:  restored log file "000000010000000100000081" from archive
cp: cannot stat `/data/backup/postgresql/archive-logs/00000002.history':
No such file or directory
LOG:  selected new timeline ID: 2
cp: cannot stat `/data/backup/postgresql/archive-logs/00000001.history':
No such file or directory
LOG:  archive recovery complete
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections

I tried to follow a pitr procedure as per
http://www.postgresql.org/docs/9.0/interactive/continuous-archiving.html
and i would be very greatful if somebody comments this and points me to
the right direction.


Best regards,

Imre


On 12/30/10 13:03, Imre Oolberg wrote:
> Hi!
>
> I did my usual test restoration from backup which leaded me to suspect
> that some data is missing although database starts up. I did some more
> testing to reproduce the problem as generally as i could and i wish to
> present it here and hope that somebody could comment on this. It may
> well be my bad as i am doing some systematic error or something else.
>
> I use debian lenny amd64 with postgresql from backports which is
>
> # dpkg -l postgre\* | grep ii
> ii  postgresql-8.4                      8.4.5-2~bpo50+1
> object-relational SQL database, version 8.4
> ii  postgresql-client-8.4               8.4.5-2~bpo50+1 front-end
> programs for PostgreSQL 8.4
> ii  postgresql-client-common            111~bpo50+1 manager for
> multiple PostgreSQL client versi
> ii  postgresql-common                   111~bpo50+1 PostgreSQL
> database-cluster manager
> ii  postgresql-contrib-8.4              8.4.5-2~bpo50+1 additional
> facilities for PostgreSQL
> ii  postgresql-plperl-8.4               8.4.5-2~bpo50+1 PL/Perl
> procedural language for PostgreSQL 8
> ii  postgresql-server-dev-8.4           8.4.5-2~bpo50+1 development
> files for PostgreSQL 8.4 server-
>
> In addition i have compiled and installed pljava, plproxy and skytools
> libraries but in this test i do not use them, i.e. i guess it doest
> really matter that the are in /usr/lib/postgresql/8.4/lib directory
> since no languages/functions are defined to use them.
>
> PostgreSQL config is default except wal logging is turned on.
>
> My procedure consists of the following steps
>
> 1. start up database
> 2. created schema test1 and into it table test like
>
> CREATE TABLE test1.test (
>   test_id serial NOT NULL,
>   pilt bytea,
>   CONSTRAINT test_id_pkey PRIMARY KEY (test_id)
> )
>
> and started inserting data with this script
>
> import psycopg2, cPickle
> conn = psycopg2.connect("dbname=test host=127.0.0.1 user=postgres
> password=xxx port=5432")
> cur = conn.cursor()
>
> f = open('/boot/initrd.img-2.6.26-2-amd64', 'rb')
> binary = f.read()
> sql = "insert into test1.test (pilt) values (%s)";
> cur.execute(sql, (psycopg2.Binary(binary),))
> cur.close()
> conn.commit()
> conn.close()
>
> like
>
> $ for i in `seq 1 200`; do echo $i; python pi1.py; sleep 10; done
>
> 3. issued pg_start_backup('test') while script is still running
> 4. created schemas and tables test2.test and test3.test and started
> insterting data similarly
> 5. while three scripts are running issued rsync -avH --bwlimit=1024
> root@localhost:/var/lib/postgresql/ /mnt/backup/
> 6. while rsync is running i created schemas and tables test4.test and
> test5.test and started insterting data similarly
> 7. when scripts and rsync have been completed i said pg_stop_backup
> 8. created schema and table test6.test and started insterting data
> similarly
>
> When everything was finished i had approx 5g used as data cluster
> filesystem and about 400 wal logs by 16 MB each.
>
> Restore went like this
>
> 1. mounted /mnt/backup/ under /var/lib/postgresql
> 2. removed backup_label
> 3. created recovery.conf with a line
>
> restore_command = 'cp /data/backup/postgresql/archive-logs/%f %p'
>
> 4. emptied pg_xlog directory while having directory itself
> 5. started postgresql and it finishes accepting connections
>
> And now observations
>
> 1. while i have 400 wal log files during recovery it used only about
> last 20 of them
> 2. while original working postgresql data cluster filesystem's used
> size was about 5 gb after restore used space is 2.6 gb
> 3. looking thru tables, most of them have quarter of half less entries
> than was actually insterted and is present in original database
> 4. some tables are missing (test4.test) and some give errors like
> (test5.test) when i say select count(*) from test5.test;
>
> 2010-12-30 12:41:23 EET ERROR:  catalog is missing 2 attribute(s) for
> relid 16586 at character 22
> 2010-12-30 12:41:23 EET STATEMENT:  select count(*) from test5.test;
>
> tried it also with compiled version of 8.4.5 but i chose to use lenny
> + backports because it may be this way more reproducable that my
> compiled instance. Actually my data size is about 160 gb, and i tested
> also with sizes about 1g and 4 tables but successful reproduction
> seems to need database size in filesystem around 5 g and 6 tables.
>
> I undestand that this report is not very precise, i.e. i have not
> fixed how much data was interted into database at the moment rsyncing
> started but i hope it has some useful points and someone could help me
> further, i am also ready to do some more testing if it is helpful.
>
>
> Imre
>
>


В списке pgsql-admin по дате отправления:

Предыдущее
От: sharilalipv
Дата:
Сообщение: hanging query
Следующее
От: "Kevin Grittner"
Дата:
Сообщение: Re: hanging query