page corruption after moving tablespace
From | Jeff Davis |
---|---|
Subject | page corruption after moving tablespace |
Date | |
Msg-id | 1279867843.23350.28.camel@jdavis |
Responses |
Re: page corruption after moving tablespace
Re: page corruption after moving tablespace |
List | pgsql-bugs |
I was investigating some strange page corruption today in which the page was completely zeroed except for the LSN and TLI. I found a sequence that can cause that problem even in 9.0:

(wal_level must be set to "archive" or greater)

1. Create a tablespace "t1"
2. Create a table "foo"
3. Attach to the backend with gdb, and set a breakpoint at the START_CRITICAL_SECTION() line in heap_insert(). Continue in gdb.
4. Insert a tuple into foo.
5. gdb should break. At that time, send a SIGKILL.
6. Restart the server (if it doesn't restart itself)
7. ALTER TABLE foo SET TABLESPACE t1;
8. SELECT * FROM foo;

   ERROR: invalid page header in block 0 of relation pg_tblspc/16384/PG_9.1_201007151/11876/24576

The SIGKILL is just a way to get an all-zero page to end up in a heap file. Any time any relation gets an all-zero page (which is generally treated as a valid situation in postgres), changing the tablespace is a problem.

The code does a copy_relation_data, and that does a log_newpage, and that sets the LSN and TLI on the page and then writes it. But on an all-zero page, that leaves the page corrupt.

I think the simple fix would be to have copy_relation_data call PageInit() if it's a new page.

Are there other areas where a similar problem might exist?

Regards,
Jeff Davis
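The failure mode described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not PostgreSQL source and not the actual patch: the `ToyPage` layout, every `toy_*` function, and `copied_page_is_valid` are hypothetical simplifications. It assumes a page counts as "new" when its upper pointer is zero, and that a header is accepted when the page is either entirely zero or has sane lower/upper offsets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192 /* PostgreSQL's default block size */

/* Hypothetical, heavily simplified page header (not the real PageHeaderData). */
typedef struct {
    uint64_t lsn;   /* WAL position of the last change to the page */
    uint32_t tli;   /* timeline ID */
    uint16_t lower; /* offset to the start of free space */
    uint16_t upper; /* offset to the end of free space */
} ToyPageHeader;

typedef union {
    ToyPageHeader hdr;
    unsigned char bytes[BLCKSZ];
} ToyPage;

/* Assumption: a page that was never initialized has upper == 0. */
static bool toy_page_is_new(const ToyPage *p) { return p->hdr.upper == 0; }

static bool toy_page_is_all_zero(const ToyPage *p)
{
    for (size_t i = 0; i < BLCKSZ; i++)
        if (p->bytes[i] != 0)
            return false;
    return true;
}

/* Stand-in for PageInit(): produce an empty but well-formed page. */
static void toy_page_init(ToyPage *p)
{
    memset(p, 0, sizeof(*p));
    p->hdr.lower = (uint16_t) sizeof(ToyPageHeader);
    p->hdr.upper = BLCKSZ;
}

/* Simplified validity check: all-zero pages are accepted, otherwise
 * the lower/upper offsets must be sane. */
static bool toy_header_is_valid(const ToyPage *p)
{
    if (toy_page_is_all_zero(p))
        return true;
    return p->hdr.lower >= sizeof(ToyPageHeader) &&
           p->hdr.lower <= p->hdr.upper &&
           p->hdr.upper <= BLCKSZ;
}

/* Stand-in for the effect of log_newpage(): stamp LSN and TLI on the page. */
static void toy_log_newpage(ToyPage *p, uint64_t lsn, uint32_t tli)
{
    p->hdr.lsn = lsn;
    p->hdr.tli = tli;
}

/* Stand-in for one iteration of copy_relation_data(). With init_new_pages
 * set, a new page is initialized before it is stamped (the proposed fix). */
static void toy_copy_page(const ToyPage *src, ToyPage *dst,
                          uint64_t lsn, uint32_t tli, bool init_new_pages)
{
    memcpy(dst, src, sizeof(*dst));
    if (init_new_pages && toy_page_is_new(dst))
        toy_page_init(dst);
    toy_log_newpage(dst, lsn, tli);
}

/* Copy an all-zero page and report whether the result passes the check. */
static bool copied_page_is_valid(bool init_new_pages)
{
    ToyPage src, dst;

    memset(&src, 0, sizeof(src));
    assert(toy_header_is_valid(&src)); /* all-zero source is acceptable */
    toy_copy_page(&src, &dst, 0x1000, 1, init_new_pages);
    return toy_header_is_valid(&dst);
}
```

In this model, `copied_page_is_valid(false)` fails the check because the copy is no longer all-zero yet has no sane header, while `copied_page_is_valid(true)` passes. Whether initializing the page is the right fix in the real code, and whether other callers of log_newpage are affected, remains the open question raised above.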