Re: pg_upgrade bug found!
From        | bricklen
Subject     | Re: pg_upgrade bug found!
Date        |
Msg-id      | BANLkTim6w+tX9mBRGtLqDfGxsgmuJJvbBQ@mail.gmail.com
In reply to | Re: pg_upgrade bug found! (bricklen <bricklen@gmail.com>)
Responses   | Re: pg_upgrade bug found!
List        | pgsql-hackers
On Fri, Apr 8, 2011 at 7:20 PM, bricklen <bricklen@gmail.com> wrote:
> On Fri, Apr 8, 2011 at 7:11 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> bricklen,
>>
>> * bricklen (bricklen@gmail.com) wrote:
>>> I looked deeper into our backup archives, and it appears that I do
>>> have the clog file referenced in the error message "DETAIL: Could not
>>> open file "pg_clog/04BE": No such file or directory."
>>
>> Great! And there's no file in pg_clog which matches that name (or
>> exists which is smaller in value), right?
>>
>>> It exists in an untouched backup directory that I originally made when
>>> I set up the backup and ran pg_upgrade. I'm not sure if it is from
>>> version 8.4 or 9.0.2 though. Is it safe to just copy it into my
>>> production pg_clog dir and restart?
>>
>> It should be, provided you're not overwriting any files or putting a
>> clog file in place which is greater than the other clog files in that
>> directory.
>
> It appears that there are no files lower.
>
> Missing clog: 04BE
>
> Production pg_clog dir:
> ls -lhrt 9.0/data/pg_clog
> total 38M
> -rw------- 1 postgres postgres 256K Jan 25 21:04 04BF
> -rw------- 1 postgres postgres 256K Jan 26 12:35 04C0
> -rw------- 1 postgres postgres 256K Jan 26 20:58 04C1
> -rw------- 1 postgres postgres 256K Jan 27 13:02 04C2
> -rw------- 1 postgres postgres 256K Jan 28 01:00 04C3
> ...
>
> Old backup pg_clog dir (possibly v8.4):
> ...
> -rw------- 1 postgres postgres 256K Jan 23 21:11 04BB
> -rw------- 1 postgres postgres 256K Jan 24 08:56 04BC
> -rw------- 1 postgres postgres 256K Jan 25 06:32 04BD
> -rw------- 1 postgres postgres 256K Jan 25 10:58 04BE
> -rw------- 1 postgres postgres 256K Jan 25 20:44 04BF
> -rw------- 1 postgres postgres 8.0K Jan 25 20:54 04C0
>
> So, if I have this right, my steps to take are:
> - copy the backup 04BE to the production pg_clog dir
> - restart the database
> - run Bruce's script
>
> Does that sound right? Has anyone else experienced this? I'm leery of
> testing this on my production db, as our last pg_dump was from early
> this morning, so I apologize for being so cautious.
>
> Thanks,
>
> Bricklen

What I've tested, and the current status:

When I saw the announcement a few hours ago, I started setting up a 9.0.3
hot standby, and brought it live a few minutes ago. Then:
- I copied over the 04BE clog from the original backup,
- restarted the standby cluster, and
- ran the script against the main database, which turned up a bunch of
  other transactions that were missing:

psql:pg_upgrade_tmp.sql:539: ERROR: could not access status of transaction 1248683931
DETAIL: Could not open file "pg_clog/04A6": No such file or directory.
psql:pg_upgrade_tmp.sql:540: ERROR: could not access status of transaction 1249010987
DETAIL: Could not open file "pg_clog/04A7": No such file or directory.
psql:pg_upgrade_tmp.sql:541: ERROR: could not access status of transaction 1250325059
DETAIL: Could not open file "pg_clog/04A8": No such file or directory.
psql:pg_upgrade_tmp.sql:542: ERROR: could not access status of transaction 1252759918
DETAIL: Could not open file "pg_clog/04AA": No such file or directory.
psql:pg_upgrade_tmp.sql:543: ERROR: could not access status of transaction 1254527871
DETAIL: Could not open file "pg_clog/04AC": No such file or directory.
psql:pg_upgrade_tmp.sql:544: ERROR: could not access status of transaction 1256193334
DETAIL: Could not open file "pg_clog/04AD": No such file or directory.
psql:pg_upgrade_tmp.sql:556: ERROR: could not access status of transaction 1268739471
DETAIL: Could not open file "pg_clog/04B9": No such file or directory.

I checked, and found that each one of those files exists in the original
backup location. So I:
- scp'd those files to the hot standby clog directory,
- pg_ctl stop -m fast
- pg_ctl start
- ran the script again

That hit another batch of missing clog file errors like the above, and I
repeated the scp + bounce + script process 4 or 5 more times until no more
missing clog file messages surfaced.
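For what it's worth, the repeated detect/copy/bounce cycle above could be scripted roughly as follows. This is only a sketch, not the procedure actually used in the thread: `BACKUP_CLOG`, `PGDATA`, and the database name `mydb` are placeholders, and the grep pattern simply matches the DETAIL lines quoted above.

```shell
# Sketch only -- BACKUP_CLOG, PGDATA, and "mydb" are placeholders.
BACKUP_CLOG=/backups/old_cluster/pg_clog
PGDATA=/var/lib/pgsql/9.0/data

# Pull the missing clog segment names out of captured psql error output
# (matches lines like: Could not open file "pg_clog/04A6").
extract_missing_clogs() {
    grep -o 'pg_clog/[0-9A-F]\{4\}' "$1" | sed 's|pg_clog/||' | sort -u
}

errlog=$(mktemp)
while :; do
    # Run Bruce's script, capturing the errors.
    psql -d mydb -f pg_upgrade_tmp.sql 2> "$errlog"
    missing=$(extract_missing_clogs "$errlog")
    # Stop once no more missing clog files are reported.
    [ -z "$missing" ] && break
    for seg in $missing; do
        # -n refuses to overwrite an existing segment (GNU cp),
        # per Stephen's caution about not overwriting files.
        cp -n "$BACKUP_CLOG/$seg" "$PGDATA/pg_clog/"
    done
    # Bounce the cluster, as in the manual process above.
    pg_ctl -D "$PGDATA" stop -m fast
    pg_ctl -D "$PGDATA" start -w
done
```

The loop terminates as soon as a script run produces no missing-clog errors, which matches the "repeat until no more messages surfaced" process described above.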
Now, is this safe to run against my production database?

Those steps again, to run against prod:
- cp the clog files from the original backup dir to my production pg_clog dir
- bounce the database
- run the script against all databases in the cluster

Anyone have any suggestions or changes before I commit myself to this
course of action?

Thanks,

Bricklen
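For the "run the script against all databases" step, one possible shape is below. Again a sketch, not a tested procedure: the pg_database query and psql flags are standard, but the script name and any connection options (host, port, user) are assumed to come from the environment.

```shell
# Apply a SQL script to every connectable database in the cluster.
# Sketch only; pg_upgrade_tmp.sql is the script from the thread.
run_script_on_all_dbs() {
    script=$1
    # List databases that accept connections (skips template0).
    psql -At -d postgres \
         -c "SELECT datname FROM pg_database WHERE datallowconn" |
    while read -r db; do
        psql -d "$db" -f "$script"
    done
}

# Usage:
# run_script_on_all_dbs pg_upgrade_tmp.sql
```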