Thread: AW: AW: AW: WAL-based allocation of XIDs is insecure
> >> Hmm. Actually, what is written to the log is the *modified* page not
> >> its original contents.
> > Well, that sure is not what was discussed on the list for implementation !!
> > I thus really doubt above statement.
>
> Read the code.

Ok, sad.

> > Each page about to be modified should be written to the txlog once,
> > and only once before the first modification after each checkpoint.
>
> Yes, there's only one page dump per page per checkpoint. But the
> sequence is (1) make the modification in shmem buffers then (2) make
> the XLOG entry.
>
> I believe this is OK since the XLOG entry is flushed before any of
> the pages it affects are written out from shmem. Since we have not
> changed the storage management policy, it's OK if heap pages contain
> changes from uncommitted transactions

Sure, but the other way would be a lot less complex.

> --- all we must avoid is
> inconsistencies (eg not all three pages of a btree split written out),
> and redo of the XLOG entry will ensure that for us.

Is it so hard to swap ? First write page to log then modify in shmem.
Then those pages would have additional value, because
then utilities could do all sorts of things with those pages.

1. Create a consistent state of the db by only applying "physical log" pages
   after checkpoint (in case a complete WAL rollforward bails out)
2. Create a consistent online backup snapshot, by first doing something like an
   ordinary tar, and after that save all "physical log" pages.

Andreas
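
For readers following along, the ordering quoted in this message (modify the shmem buffer, then make the XLOG entry, with the log flushed before the page itself is written out) can be sketched as a toy model. This is an illustration in Python, not the backend's C code; `Wal`, `Buffer`, `modify` and `write_out` are hypothetical names, and "disk" is just a dict.

```python
class Wal:
    """Toy write-ahead log; all names are hypothetical."""

    def __init__(self):
        self.records = []
        self.flushed_upto = 0      # number of records assumed durable

    def insert(self, rec):
        self.records.append(rec)
        return len(self.records)   # "LSN": position just past this record

    def flush(self, lsn):
        self.flushed_upto = max(self.flushed_upto, lsn)


class Buffer:
    def __init__(self, data):
        self.data = data
        self.lsn = 0               # LSN of the last record touching this page
        self.dirty = False


def modify(wal, buf, new_data):
    buf.data = new_data                          # (1) change the shmem buffer
    buf.lsn = wal.insert(("page", new_data))     # (2) then make the log entry
    buf.dirty = True


def write_out(wal, buf, disk, key):
    # The log must be flushed up to the page's LSN before the page is written.
    if wal.flushed_upto < buf.lsn:
        wal.flush(buf.lsn)
    disk[key] = buf.data
    buf.dirty = False
```

In this model the buffer can hold changes that are logged but not yet flushed; what is never allowed is a page reaching `disk` ahead of its log record.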
Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:
> Is it so hard to swap ? First write page to log then modify in shmem.
> Then those pages would have additional value, because
> then utilities could do all sorts of things with those pages.

After thinking about this a little, I believe I see why Vadim did it
the way he did. Suppose we tried to make the code sequence be

    obtain write lock on buffer;
    XLogOriginalPage(buffer); // copy page to xlog if first since ckpt
    modify buffer;
    XLogInsert(xlogentry for modification);
    mark buffer dirty and release write lock;

so that the saving of the original page is a separate xlog entry from
the modification data. Looks easy, and it'd sure simplify XLogInsert
a lot. The only problem is it's wrong. What if a checkpoint occurs
between the two XLOG records?

The decision whether to log the whole buffer has to be atomic with the
actual entry of the xlog record. Unless we want to hold the xlog insert
lock for the entire time that we're (eg) splitting a btree page, that
means we log the buffer after the modification work is done, not before.

regards, tom lane
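
The race described in this message can be reproduced in a small simulation. This is a sketch under stated assumptions, not PostgreSQL code: the log is a Python list, a record's list index stands in for its LSN, replay is assumed to start at the most recent checkpoint, and every name (`Xlog`, `two_record_update`, `single_record_update`, `replay_has_image`) is hypothetical.

```python
class Xlog:
    """Toy in-memory log; every name here is hypothetical."""

    def __init__(self):
        self.records = []       # (kind, page_id) tuples
        self.redo_start = 0     # index where crash replay would begin

    def checkpoint(self):
        self.redo_start = len(self.records)

    def needs_image(self, page_lsn):
        # A full page image is needed if the page has not been logged
        # since the last checkpoint.
        return page_lsn < self.redo_start

    def append(self, kind, page_id):
        self.records.append((kind, page_id))
        return len(self.records) - 1    # index used as the record's "LSN"

    def replayed(self):
        return self.records[self.redo_start:]


def two_record_update(xlog, page, checkpoint_between):
    # Step 1: save the page as its own record, if needed.
    if xlog.needs_image(page["lsn"]):
        page["lsn"] = xlog.append("image", page["id"])
    if checkpoint_between:
        xlog.checkpoint()               # lands between the two records
    # Step 2: log the modification without re-checking.
    page["lsn"] = xlog.append("change", page["id"])


def single_record_update(xlog, page):
    # Decision and insertion happen together, as if under one insert lock.
    kind = "change+image" if xlog.needs_image(page["lsn"]) else "change"
    page["lsn"] = xlog.append(kind, page["id"])


def replay_has_image(xlog, page_id):
    return any(pid == page_id and "image" in kind
               for kind, pid in xlog.replayed())
```

With `checkpoint_between=True`, the image record falls before the checkpoint and the change record after it, so replay from the checkpoint sees a change with no page image to apply it to. The single-record variant cannot be split this way.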
I wrote:
> The decision whether to log the whole buffer has to be atomic with the
> actual entry of the xlog record. Unless we want to hold the xlog insert
> lock for the entire time that we're (eg) splitting a btree page, that
> means we log the buffer after the modification work is done, not before.

On third thought --- we could still log the original page contents and
the modification log record atomically, if what were logged in the xlog
record were (essentially) the parameters to the operation being logged,
not its results. That is, make the log entry before you start doing the
mod work, not after. This might also simplify redo, since redo would be
no different from the normal case. I'm not sure why Vadim didn't choose
to do it that way; maybe there's some other fine point I'm missing.

In any case, it'd be a big code change and not something I'd want to
undertake at this point in the release cycle ... maybe we can revisit
this issue for 7.2.

regards, tom lane
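
The alternative floated in this message (log the original page together with the operation's parameters, before doing the work) might look roughly like the following. It is only a sketch of the idea, not a description of any actual or planned implementation: pages are sorted Python lists, and `apply_op`, `do_and_log` and `redo` are hypothetical names.

```python
import copy


def apply_op(page, op, args):
    """The one routine used both for normal execution and for redo."""
    if op == "insert":
        page.append(args["item"])
        page.sort()
    elif op == "delete":
        page.remove(args["item"])
    else:
        raise ValueError(op)


def do_and_log(log, page_id, pages, logged_since_ckpt, op, args):
    # Log first: the original page image (once per checkpoint) and the
    # operation's parameters go into a single record.
    image = None
    if page_id not in logged_since_ckpt:
        image = copy.deepcopy(pages[page_id])
        logged_since_ckpt.add(page_id)
    log.append({"page": page_id, "image": image, "op": op, "args": dict(args)})
    # Only then modify the buffer.
    apply_op(pages[page_id], op, args)


def redo(log):
    # Replay restores each saved image, then re-runs the same operations.
    pages = {}
    for rec in log:
        if rec["image"] is not None:
            pages[rec["page"]] = copy.deepcopy(rec["image"])
        apply_op(pages[rec["page"]], rec["op"], rec["args"])
    return pages
```

Because the record is complete before the buffer is touched, the image decision and the log entry can be made in one step, and `redo` calls the same `apply_op` as the normal path.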