Re: XLog changes for 9.3

From Andres Freund
Subject Re: XLog changes for 9.3
Date
Msg-id 201206071618.55703.andres@2ndquadrant.com
In response to XLog changes for 9.3  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: XLog changes for 9.3  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Thursday, June 07, 2012 03:50:35 PM Heikki Linnakangas wrote:
> When I worked on the XLogInsert scaling patch, it became apparent that
> some changes to the WAL format would make it a lot easier. So for 9.3,
> I'd like to do some refactoring:

> 1. Use a 64-bit integer instead of the two-variable log/seg
> representation, for identifying a WAL segment. This has no user-visible
> effect, but makes the code a bit simpler.
+1

We can define a sensible InvalidXLogRecPtr instead of doing that locally in 
loads of places! Yippee.
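
To illustrate what a single definition could look like, here is a minimal sketch. The type and macro names are hypothetical, and choosing 0 as the invalid value is an assumption (it works because every WAL page starts with a header, so no record can begin at position 0):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the PostgreSQL tree. */
typedef uint64_t XLogRecPtr64;

static_assert(sizeof(XLogRecPtr64) == 8, "record pointer must be 64 bits");

/* Assumption: position 0 can never be a record start, so it serves as "invalid". */
#define INVALID_XLOG_REC_PTR ((XLogRecPtr64) 0)

/* Fold the two-part (xlogid, xrecoff) representation into one 64-bit value. */
static XLogRecPtr64
make_recptr(uint32_t xlogid, uint32_t xrecoff)
{
    return ((XLogRecPtr64) xlogid << 32) | xrecoff;
}

/* One shared validity test instead of ad-hoc checks on both halves. */
static int
recptr_is_invalid(XLogRecPtr64 ptr)
{
    return ptr == INVALID_XLOG_REC_PTR;
}
```

With a flat integer, ordering comparisons also become plain `<` and `>` instead of comparing two fields in turn.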

> 2. Don't waste the last WAL segment in each logical 4GB file. Currently,
> we skip the WAL segment ending with "FF". The comments claim that
> wasting the last segment "ensures that we don't have problems
> representing last-byte-position-plus-1", but in my experience, it just
> makes things more complicated. You have two ways to represent the
> segment boundary, and some functions are picky on which one is used. For
> example, XLogWrite() assumes that when you want to flush to the end of a
> logical log file, you use the "5/FF000000" representation, not
> "6/00000000". Other functions, like XLogPageRead(), expect the latter.
> 
> This is a backwards-incompatible change for external utilities that know
> how the WAL segment numbering works. Hopefully there aren't too many of
> those around.
+1
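
A rough sketch of the numbering difference, assuming the default 16 MB segment size. The function and macro names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define SEG_SIZE            (16u * 1024 * 1024)   /* assumed 16 MB segments */
/* 256 segments fit in one logical 4GB file... */
#define SEGS_PER_LOGID_NEW  (UINT64_C(0x100000000) / SEG_SIZE)
/* ...but the current scheme never uses the one ending in "FF". */
#define SEGS_PER_LOGID_OLD  (SEGS_PER_LOGID_NEW - 1)

/* Linear segment number under the current scheme (segment FF skipped). */
static uint64_t
segno_old(uint32_t log, uint32_t seg)
{
    assert(seg < SEGS_PER_LOGID_OLD);
    return (uint64_t) log * SEGS_PER_LOGID_OLD + seg;
}

/* Linear segment number if every segment is used. */
static uint64_t
segno_new(uint32_t log, uint32_t seg)
{
    assert(seg < SEGS_PER_LOGID_NEW);
    return (uint64_t) log * SEGS_PER_LOGID_NEW + seg;
}

/* First byte of a segment in a flat 64-bit address space (new scheme only). */
static uint64_t
seg_start_new(uint64_t segno)
{
    return segno * SEG_SIZE;
}
```

In the second form the end of segment 5/FF and the start of 6/00 are the same byte address, so there is only one way to write the boundary.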

> 3. Move the only field, xl_rem_len, from the continuation record header
> straight to the xlog page header, eliminating XLogContRecord altogether.
> This makes it easier to calculate in advance how much space a WAL record
> requires, as it no longer depends on how many pages it has to be split
> across. This wastes 4-8 bytes on every xlog page, but that's not much.
+1. I don't think this will waste a measurable amount in real-world 
scenarios. A very big percentage of pages have continuation records.

> 4. Allow WAL record header to be split across page boundaries.
> Currently, if there are less than SizeOfXLogRecord bytes left on the
> current WAL page, it is wasted, and the next record is inserted at the
> beginning of the next page. The problem with that is again that it makes
> it impossible to know in advance exactly how much space a WAL record
> requires, because it depends on how many bytes need to be wasted at the
> end of current page.
+0.5. It's somewhat convenient to be able to look at a record before you have 
reassembled it over multiple pages. But it's probably not worth the 
implementation complexity.
If we do that we can remove all the alignment padding as well. Which would be a 
problem for you anyway, wouldn't it?
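
For reference, the current rule from point 4 can be sketched like this. The header sizes are assumed stand-ins (`REC_HDR` plays the role of SizeOfXLogRecord), and the sketch ignores long page headers and alignment:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE  8192u
#define PAGE_HDR   24u   /* assumed page header size */
#define REC_HDR    32u   /* assumed stand-in for SizeOfXLogRecord */

/*
 * Where a record that wants to start at `pos` really begins under the
 * current rule: if the record header does not fit in the rest of the
 * page, the tail is wasted and the record moves past the next page's
 * header.
 */
static uint64_t
next_record_start(uint64_t pos)
{
    uint64_t free_on_page;

    assert(pos % PAGE_SIZE >= PAGE_HDR);   /* pos must lie past a page header */

    free_on_page = PAGE_SIZE - pos % PAGE_SIZE;
    if (free_on_page < REC_HDR)
        return pos + free_on_page + PAGE_HDR;
    return pos;
}
```

The space a record consumes therefore depends on where the insert position happens to fall, which is what makes it hard to reserve in advance.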

> These changes will help the XLogInsert scaling patch, by making the
> space calculations simpler. In essence, to reserve space for a WAL
> record of size X, you just need to do "bytepos += X".  There's a lot
> more details with that, like mapping from the contiguous byte position
> to an XLogRecPtr that takes page headers into account, and noticing
> RedoRecPtr changes safely, but it's a start.
Hm. Wouldn't you need to remove short/long page headers for that as well? 
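
If the headers stay, the mapping mentioned above would have to look something like the following sketch: count only usable bytes in "bytepos" and translate to a physical position afterwards. The header sizes and all names here are assumptions for illustration, with a long header on the first page of each segment and a short one on every other page:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE        8192u
#define SEG_SIZE         (16u * 1024 * 1024)
#define SHORT_HDR        24u    /* assumed short page header size */
#define LONG_HDR         40u    /* assumed long page header size */

#define PAGES_PER_SEG    (SEG_SIZE / PAGE_SIZE)
#define USABLE_PER_PAGE  (PAGE_SIZE - SHORT_HDR)
/* Usable bytes per segment: the first page loses the extra long-header bytes. */
#define USABLE_PER_SEG   ((uint64_t) PAGES_PER_SEG * USABLE_PER_PAGE \
                          - (LONG_HDR - SHORT_HDR))

/* Map a header-free byte position to a physical WAL position. */
static uint64_t
bytepos_to_recptr(uint64_t bytepos)
{
    uint64_t segno = bytepos / USABLE_PER_SEG;
    uint64_t in_seg = bytepos % USABLE_PER_SEG;
    uint64_t offset;

    if (in_seg < PAGE_SIZE - LONG_HDR)
    {
        /* Falls on the first page of the segment, after the long header. */
        offset = LONG_HDR + in_seg;
    }
    else
    {
        /* Skip the first page, then whole pages with short headers. */
        in_seg -= PAGE_SIZE - LONG_HDR;
        offset = PAGE_SIZE
            + (in_seg / USABLE_PER_PAGE) * PAGE_SIZE
            + SHORT_HDR + in_seg % USABLE_PER_PAGE;
    }

    assert(offset < SEG_SIZE);
    return segno * SEG_SIZE + offset;
}
```

So the two header kinds only add a special case for the first page of a segment; the reservation itself can still be a plain addition on the usable-byte counter.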


Andres

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

