Re: Bug in MultiXact replay compat logic for older minor version after crash-recovery
| От | Heikki Linnakangas |
|---|---|
| Тема | Re: Bug in MultiXact replay compat logic for older minor version after crash-recovery |
| Дата | |
| Msg-id | 33319276-e4d0-4773-89e4-09084905fdb0@iki.fi обсуждение исходный текст |
| Ответ на | Re: Bug in MultiXact replay compat logic for older minor version after crash-recovery (Andrey Borodin <x4mmm@yandex-team.ru>) |
| Ответы |
回复:Bug in MultiXact replay compat logic for older minor version after crash-recovery
Re: Bug in MultiXact replay compat logic for older minor version after crash-recovery |
| Список | pgsql-hackers |
On 20/03/2026 19:05, Andrey Borodin wrote: >> On 20 Mar 2026, at 18:14, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> >> Zeroing the page again is dangerous because the CREATE_ID records can be out of order. The page might already containsome later multixids, and zeroing will overwrite them. > > I see only cases when it's not a problem: we zeroed page, did not flush it, thus did not extend the file, crashed, testedFS, zeroed page once more, overwrote again by replaying WAL, no big deal. > We should never zero a page with offsets, that will not be replayed by WAL. I think we're in agreement, but I want to verify because this is important to get right. I was replying to this: > If we are sure buffers have no this page we can detect it via FS. > Otherwise... nothing bad can happen, actually. We might get false positive and zero the page once more. My point is that if we rely on SimpleLruDoesPhysicalPageExist(), and it ever returns false even though we had already initialized the page, you can lose data. It's *not* ok to zero a page again that was zeroed earlier already, because we might have already written some real data on it. Let's consider this wal stream, generated with old minor version: ZERO_PAGE:2048 -> CREATE_ID:2048 -> CREATE_ID:2049 -> CREATE_ID:2047 2048 is the first multixid on the page. When WAL replay gets to the CREATE_ID:2047 record, it will enter the backwards-compatibility codepath and needs to determine if the page containing the next mxid (2048) already exists. In this WAL sequence, the page already exist because the ZERO_PAGE record was replayed earlier. But if we just call SimpleLruDoesPhysicalPageExist(), it will return 'false' because the page was not flushed to disk yet. If we believe that and zero the page again, we will lose data (the offset for mxid 2049). The opposite cannot happen: if SimpleLruDoesPhysicalPageExist() returns true, then it does really exist. So indeed we can only trust SimpleLruDoesPhysicalPageExist() if we are sure that the page is not sitting in the buffers. Attached is a new version. I updated the comment to explain that. I also added another safety measure: before calling SimpleLruDoesPhysicalPageExist(), flush all the SLRU buffers. That way, SimpleLruDoesPhysicalPageExist() should definitely return the correct answer. That shouldn't be necessary because the check with last_initialized_offsets_page should cover all the cases where a page that extended the file is sitting in the buffers, but better safe than sorry. - Heikki
Вложения
В списке pgsql-hackers по дате отправления: