Re: [GENERAL] PANIC: heap_update_redo: no block
From | Tom Lane |
---|---|
Subject | Re: [GENERAL] PANIC: heap_update_redo: no block |
Date | |
Msg-id | 26340.1143515039@sss.pgh.pa.us |
In response to | Re: [GENERAL] PANIC: heap_update_redo: no block (Greg Stark <gsstark@mit.edu>) |
Responses | Re: [GENERAL] PANIC: heap_update_redo: no block (Simon Riggs <simon@2ndquadrant.com>); Re: [GENERAL] PANIC: heap_update_redo: no block (Tom Lane <tgl@sss.pgh.pa.us>); Re: [GENERAL] PANIC: heap_update_redo: no block (Bruce Momjian <pgman@candle.pha.pa.us>) |
List | pgsql-hackers |
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I think what's happened here is that VACUUM FULL moved the only tuple
>> off page 1 of the relation, then truncated off page 1, and now
>> heap_update_redo is panicking because it can't find page 1 to replay the
>> move. Curious that we've not seen a case like this before, because it
>> seems like a generic hazard for WAL replay.

> This sounds familiar
> http://archives.postgresql.org/pgsql-hackers/2005-05/msg01369.php

After further review I've concluded that there is not a systemic bug here, but there are several nearby local bugs. The reason it's not a systemic bug is that this scenario is supposed to be handled by the same mechanism that prevents torn-page writes: the first XLOG record that touches a given page after a checkpoint is supposed to rewrite the entire page, rather than update it incrementally. Since XLOG replay always begins at a checkpoint, this means we should always be able to write a fresh copy of the page, even after relation deletion or truncation. Furthermore, during XLOG replay we are willing to create a table (or even a whole tablespace or database directory) if it's not there when touched. The subsequent replay of the deletion or truncation will get rid of any unwanted data again.

Therefore, there is no systemic bug --- unless you are running with full_page_writes=off. I assert that that GUC variable is broken and must be removed.

There are, however, a bunch of local bugs, including these:

* On a symlink-less platform (ie, Windows), TablespaceCreateDbspace is #ifdef'd to be a no-op. This is wrong because it performs the essential function of re-creating a tablespace or database directory if needed during replay. AFAICS the #if can just be removed and have the same code with or without symlinks.

* log_heap_update decides that it can set XLOG_HEAP_INIT_PAGE instead of storing the full destination page, if the destination contains only the single tuple being moved. This is fine, except it also resets the buffer indicator for the *source* page, which is wrong --- that page may still need to be re-generated from the xlog record. This is the proximate cause of the bug report that started this thread.

* btree_xlog_split passes extend=false to XLogReadBuffer for the left sibling, which is silly because it is going to rewrite that whole page from the xlog record anyway. It should pass true so that there's no complaint if the left sib page was later truncated away. This accounts for one of the bug reports mentioned in the message cited above.

* btree_xlog_delete_page passes extend=false for the target page, which is likewise silly because it's going to init the page (not that there was any useful data on it anyway). This accounts for the other bug report mentioned in the message cited above.

Clearly, we need to go through the xlog code with a fine tooth comb and convince ourselves that all pages touched by any xlog record will be properly reconstituted if they've later been truncated off. I have not yet examined any of the code except the above.

Notice that these are each, individually, pretty low-probability scenarios, which is why we've not seen many bug reports. If we had had a systemic bug I'm sure we'd be seeing far more.

			regards, tom lane
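
Editor's illustration (not part of the original message): the argument above, that a full-page image in the first record after a checkpoint lets replay survive a later truncation, can be sketched with a toy model. Everything here is hypothetical and only loosely mirrors PostgreSQL: the record layout, the `replay` function, and the `ReplayPanic` exception are invented for illustration, and a "page" is just a dict.

```python
class ReplayPanic(Exception):
    """Toy stand-in for the 'no block' PANIC during redo."""


def replay(disk_pages, wal):
    """Replay toy WAL records over the on-disk pages found at startup.

    disk_pages: list of dicts, one per block, as found on disk after the crash.
    wal: records logged since the last checkpoint (hypothetical format).
    """
    pages = [dict(p) for p in disk_pages]
    for rec in wal:
        if rec["op"] == "truncate":
            # Drop every block at or beyond the new relation length.
            del pages[rec["nblocks"]:]
        elif rec["op"] == "update":
            blk = rec["block"]
            if rec.get("full_page") is not None:
                # Full-page image: extend the relation if the block is
                # missing, then overwrite the page wholesale.
                while len(pages) <= blk:
                    pages.append({})
                pages[blk] = dict(rec["full_page"])
            elif blk >= len(pages):
                # Incremental record, but the page was truncated away.
                raise ReplayPanic(f"no block {blk}")
            else:
                pages[blk][rec["key"]] = rec["value"]
    return pages


# Disk state at crash time: the truncation already reached disk,
# so only block 0 exists when replay starts from the checkpoint.
disk = [{"t0": "live"}]

# Same history logged two ways: incrementally, and with a full-page image.
wal_incremental = [
    {"op": "update", "block": 1, "key": "t1", "value": "moved", "full_page": None},
    {"op": "truncate", "nblocks": 1},
]
wal_full_page = [
    {"op": "update", "block": 1, "full_page": {"t1": "moved"}},
    {"op": "truncate", "nblocks": 1},
]
```

In this model `replay(disk, wal_full_page)` recreates block 1, and the later truncate record removes it again, while `replay(disk, wal_incremental)` raises `ReplayPanic` because nothing tells replay how to rebuild the missing page. That corresponds to the failure described for the source page in the log_heap_update case, under the simplifying assumptions stated above.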