Re: UNDO and in-place update
From | Greg Stark |
---|---|
Subject | Re: UNDO and in-place update |
Date | |
Msg-id | CAM-w4HMNCBrYePEAgB9F4o9YVBmb2YJ+SW8__rZwhzTSOp1Z2Q@mail.gmail.com |
In reply to | Re: UNDO and in-place update (Peter Geoghegan <pg@heroku.com>) |
Responses | Re: UNDO and in-place update (Bruce Momjian <bruce@momjian.us>); Re: UNDO and in-place update (Thomas Kellerer <spam_eater@gmx.net>); Re: UNDO and in-place update (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On 23 November 2016 at 04:28, Peter Geoghegan <pg@heroku.com> wrote:
> On Tue, Nov 22, 2016 at 7:01 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> This basic DO-UNDO-REDO protocol has been well-understood for
>> decades.
>
> FWIW, while this is basically true, the idea of repurposing UNDO to be
> usable for MVCC is definitely an Oracleism. Mohan's ARIES paper says
> nothing about MVCC.

Fwiw, Oracle does not use the undo log for snapshot fetches. It's used only for transaction rollback and recovery.

For snapshot isolation Oracle has yet a *third* copy of the data in a space called the "rollback segment(s)". When you update a row in a block you save the whole block in the rollback segment. When you try to access a block you check if the CSN -- which is basically equivalent to our LSN -- is newer than your snapshot, and if it is you fetch the old version of the block from the rollback segment.

Essentially their MVCC is done at a per-block level rather than a per-row level, and they keep only the newest version of the block in the table; the rest are in the rollback segment. For what it's worth I think our approach is cleaner and more flexible. They had a lot of trouble with their approach over the years, and it works well only because they invested an enormous amount of development in it and because people throw a lot of hardware at it too.

I think the main use case we have trouble with is actually the "update every row in the table" type of update, which requires that we write to every block, plus a second copy of every block, plus full-page writes of both copies, then later set hint bits, dirtying pages again and generating more full-page writes, then later come along and vacuum, which requires two more writes of every block, etc. If we had a solution for the special case of an update that replaces every row in a page, I think that would complement HOT nicely and go a long way towards fixing our issues.
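For illustration, the per-block snapshot read described earlier in this message (compare the block's CSN with the snapshot, and fall back to an older block image in the rollback segment if the block is too new) could be modelled roughly as below. This is only a toy sketch of the scheme as described here, not Oracle's actual implementation; `BlockStore`, `BlockVersion`, `update_row`, and `read_block` are hypothetical names made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BlockVersion:
    csn: int               # change number stamped on the block at its last update
    rows: Dict[int, str]   # row id -> row contents


@dataclass
class BlockStore:
    # Assumption: the table keeps only the newest image of each block;
    # older whole-block images sit in the rollback area, newest first.
    table: Dict[int, BlockVersion] = field(default_factory=dict)
    rollback: Dict[int, List[BlockVersion]] = field(default_factory=dict)

    def update_row(self, block_no: int, row_id: int, value: str, csn: int) -> None:
        current = self.table.get(block_no)
        rows: Dict[int, str] = {}
        if current is not None:
            # Save the whole pre-update block image, not just the changed row.
            self.rollback.setdefault(block_no, []).insert(0, current)
            rows = dict(current.rows)
        rows[row_id] = value
        self.table[block_no] = BlockVersion(csn=csn, rows=rows)

    def read_block(self, block_no: int, snapshot_csn: int) -> Optional[BlockVersion]:
        current = self.table.get(block_no)
        if current is None:
            return None
        if current.csn <= snapshot_csn:
            return current  # the block in the table is old enough for this snapshot
        # Block is newer than the snapshot: look for an older image.
        for old in self.rollback.get(block_no, []):
            if old.csn <= snapshot_csn:
                return old
        return None  # the block did not exist yet as of this snapshot


store = BlockStore()
store.update_row(0, row_id=1, value="a", csn=10)
store.update_row(0, row_id=1, value="b", csn=20)
old_view = store.read_block(0, snapshot_csn=15)   # served from the rollback area
new_view = store.read_block(0, snapshot_csn=25)   # served from the table
```

Note that visibility here is decided once per block, so nothing per-row (no xmin/xmax equivalent) is consulted on the read path.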
Incidentally, the "Interested transaction list" is for locking rows for updates, and it's basically similar to what we've discussed before of having a "most frequent xmin" in the header and then a bit indicating the xmin is missing from the row header. Except in their case they don't need it for the actual xmin/xmax, because their visibility is done per-block; they need it only for the transient lock state.

--
greg