On 09.01.2013 22:36, Simon Riggs wrote:
> Overall, the WAL record is MAXALIGN'd, so with 8 byte alignment we
> waste 4 bytes per record. Or put another way, if we could reduce
> record header by 4 bytes, we would actually reduce it by 8 bytes per
> record. So looking for ways to do that seems like a good idea.
Agreed.
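To spell out that arithmetic, here is a minimal sketch. It assumes 8-byte maximum alignment and, purely for illustration, an unaligned header of 28 bytes. MY_MAXALIGN is a stand-in modelled on the usual round-up-to-alignment macro, and the helper names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed maximum alignment; 8 bytes on typical 64-bit platforms. */
#define MY_MAXIMUM_ALIGNOF 8

/* Round len up to the next multiple of MY_MAXIMUM_ALIGNOF. */
#define MY_MAXALIGN(len) \
    (((size_t) (len) + (MY_MAXIMUM_ALIGNOF - 1)) & \
     ~((size_t) (MY_MAXIMUM_ALIGNOF - 1)))

/* Padding bytes added when a header of the given size is max-aligned. */
static size_t
header_padding(size_t header_size)
{
    return MY_MAXALIGN(header_size) - header_size;
}

/* On-disk bytes saved by shrinking the header from old_size to new_size. */
static size_t
aligned_saving(size_t old_size, size_t new_size)
{
    assert(new_size <= old_size);
    return MY_MAXALIGN(old_size) - MY_MAXALIGN(new_size);
}
```

With these assumed numbers, a 28-byte header is padded to 32 bytes, so trimming 4 bytes from it (28 to 24) removes the padding as well and saves 8 bytes on disk.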
> The WAL record header starts with xl_tot_len, a 4 byte field. There is
> also another field, xl_len. The difference is that xl_tot_len includes
> the header, xl_len and any backup blocks. Since the header is fixed,
> the only time xl_tot_len != SizeOfXLogRecord + xl_len is when we have
> backup blocks.
>
> We can re-arrange the record layout so that we remove xl_tot_len and
> add another (maxaligned) 4 byte field (--> 8 bytes) after the record
> header, xl_bkpblock_len that only exists if we have backup blocks.
> This will then save 8 bytes from every record that doesn't have backup
> blocks, and be the same as now with backup blocks.
Here's a better idea:
Let's keep xl_tot_len as it is, but move xl_len to the very end of the
WAL record, after all the backup blocks. If there are no backup blocks,
xl_len is omitted. It seems more robust to keep xl_tot_len, because you
then need less math to figure out where one record ends and where the
next one begins.
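Roughly, a reader of that layout could work as in the sketch below. This is only an illustration under stated assumptions: the 24-byte trimmed header size, the has_bkp_blocks flag, and both function names are hypothetical, xl_tot_len is assumed to count the trailing xl_len, and page boundaries are ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HDR_SIZE 24u     /* assumed header size once xl_len is gone */
#define SKETCH_MAXALIGN(len) (((uint32_t) (len) + 7u) & ~7u)

/*
 * Offset of the next record. Only xl_tot_len is needed, whether or not
 * the record carries backup blocks.
 */
static uint32_t
next_record_offset(uint32_t cur_offset, uint32_t xl_tot_len)
{
    return cur_offset + SKETCH_MAXALIGN(xl_tot_len);
}

/*
 * Length of the rmgr data. Without backup blocks it follows from
 * xl_tot_len alone; with backup blocks it is read from the trailing
 * 4-byte xl_len stored after the last backup block.
 */
static uint32_t
rmgr_data_len(const uint8_t *rec, uint32_t xl_tot_len, int has_bkp_blocks)
{
    uint32_t    xl_len;

    assert(xl_tot_len >= SKETCH_HDR_SIZE);
    if (!has_bkp_blocks)
        return xl_tot_len - SKETCH_HDR_SIZE;

    assert(xl_tot_len >= SKETCH_HDR_SIZE + sizeof(xl_len));
    memcpy(&xl_len, rec + xl_tot_len - sizeof(xl_len), sizeof(xl_len));
    return xl_len;
}
```

The point of the sketch is that finding the next record never depends on the trailing field; only code that needs to separate rmgr data from backup blocks has to look at it.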
> Forcing the XLogRecord header to be all on one page makes the format
> more robust and simplifies the code that copes with header wrapping.
-1 on that. That would essentially revert the changes I made earlier.
The purpose of allowing the header to be wrapped was that you could
easily calculate ahead of time exactly how much space a WAL record
takes. My motivation for that was the XLogInsert scaling patch. Now, I
admit I haven't had a chance to work further on that patch, so we're not
gaining much from the format change at the moment. Nevertheless, I don't
want us to go back to the situation where you sometimes need to add
padding to the end of a WAL page.
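To illustrate what "calculate ahead of time" buys you, here is a simplified sketch, not the actual patch. It treats the WAL as a stream of usable bytes in which a record may wrap at any byte, header included. The 8192-byte page, the 24-byte page header, and the function names are assumptions for the example, and the longer header on the first page of a segment is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BLCKSZ    8192u  /* assumed WAL page size */
#define SKETCH_PAGE_HDR  24u    /* assumed per-page header size */
#define SKETCH_USABLE    (SKETCH_BLCKSZ - SKETCH_PAGE_HDR)
#define SKETCH_MAXALIGN(len) (((uint64_t) (len) + 7u) & ~(uint64_t) 7u)

/* Map a position counted in usable bytes to a physical WAL offset. */
static uint64_t
usable_pos_to_physical(uint64_t pos)
{
    uint64_t    page = pos / SKETCH_USABLE;
    uint64_t    off = pos % SKETCH_USABLE;

    return page * SKETCH_BLCKSZ + SKETCH_PAGE_HDR + off;
}

/*
 * End position (in usable bytes) of a record starting at start_pos.
 * Because no end-of-page padding is ever needed, the result depends
 * only on the record length, not on where the record happens to start.
 */
static uint64_t
reserve_record(uint64_t start_pos, uint32_t rec_len)
{
    assert(rec_len > 0);
    return start_pos + SKETCH_MAXALIGN(rec_len);
}
```

If the header had to fit on one page, reserve_record() would also need to know how close start_pos is to the end of its page, which is exactly the dependency I'd like to avoid.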
My suggestion above to keep xl_tot_len and remove xl_len from XLogRecord
doesn't have a problem with crossing page boundaries.
- Heikki