Placing hints in line pointers
From | Simon Riggs |
---|---|
Subject | Placing hints in line pointers |
Date | |
Msg-id | CA+U5nMLzCXuK-hix4OJXFMeu--0W8=vWtLW-U8boOncZ=LMzdw@mail.gmail.com |
Responses | Re: Placing hints in line pointers (Jeff Davis <pgsql@j-davis.com>) |
List | pgsql-hackers |
Notes on a longer term idea...

An item pointer (also called a line pointer) gives external references a stable pointer to an item, while allowing us to place the tuple anywhere on the page.

An ItemId is 4 bytes long and currently consists of (see src/include/storage/itemid.h)...

    typedef struct ItemIdData
    {
        unsigned    lp_off:15,      /* offset to tuple (from start of page) */
                    lp_flags:2,     /* state of item pointer, see below */
                    lp_len:15;      /* byte length of tuple */
    } ItemIdData;

The offset to the tuple is 15 bits, which is sufficient to point to 32768 separate byte positions, which is why we limit ourselves to 32kB blocks.

If we use 4 byte alignment for tuples, then we would never use the lower 2 bits of lp_off, nor the lower 2 bits of lp_len. They are always set to zero. (Obviously, with 8 byte alignment we would have 3 bits spare in each, but I'm looking for something that works the same on various architectures for simplicity).

So my suggestion is to make lp_off and lp_len store their values in terms of 4 byte chunks, which would allow us to rework the data structure like this...

    typedef struct ItemIdData
    {
        unsigned    lp_off:13,        /* offset to tuple (from start of page), number of 4 byte chunks */
                    lp_xmin_hint:2,   /* committed and invalid hints for xmin */
                    lp_flags:2,       /* state of item pointer, see below */
                    lp_len:13,        /* length of tuple, number of 4 byte chunks */
                    lp_xmax_hint:2;   /* committed and invalid hints for xmax */
    } ItemIdData;

i.e. we have room for 4 additional bits and we use those to hold the tuple hints for xmin and xmax.

Doing this would have two purposes:

* We wouldn't need to follow the pointer if the row is marked aborted. This would save a random memory access for that tuple.

* It would isolate the tuple hint values into a smaller area of the block, so we would be able to avoid the annoyance of recalculating the checksums for the whole block when a single bit changes.
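The proposed layout can be sketched in C roughly as follows. This is only a minimal sketch, not PostgreSQL code: the HintedItemId type, the HINT_* values and the helper functions are hypothetical names chosen for illustration, and rounding tuple lengths up to a whole 4 byte chunk is an assumption that the proposal itself does not spell out.

```c
#include <assert.h>

#define CHUNK_BYTES     4       /* lp_off and lp_len count 4 byte chunks */
#define HINT_COMMITTED  0x1     /* hypothetical encoding of the hint bits */
#define HINT_INVALID    0x2

/* Hypothetical 32-bit line pointer following the proposed field order. */
typedef struct HintedItemId
{
    unsigned    lp_off:13,        /* offset to tuple, in 4 byte chunks */
                lp_xmin_hint:2,   /* committed and invalid hints for xmin */
                lp_flags:2,       /* state of item pointer */
                lp_len:13,        /* length of tuple, in 4 byte chunks */
                lp_xmax_hint:2;   /* committed and invalid hints for xmax */
} HintedItemId;

/* Build a line pointer from byte values; the offset must be 4 byte aligned. */
static HintedItemId
hinted_item_make(unsigned byte_off, unsigned byte_len, unsigned flags)
{
    HintedItemId id;

    assert(byte_off % CHUNK_BYTES == 0);
    assert(byte_off < 32768 && byte_len <= 32764);

    id.lp_off = byte_off / CHUNK_BYTES;
    id.lp_xmin_hint = 0;
    id.lp_flags = flags & 0x3;
    /* Assumption: lengths are rounded up to the next whole chunk. */
    id.lp_len = (byte_len + CHUNK_BYTES - 1) / CHUNK_BYTES;
    id.lp_xmax_hint = 0;
    return id;
}

/* Convert the stored chunk counts back to byte values. */
static unsigned
hinted_item_offset(HintedItemId id)
{
    return (unsigned) id.lp_off * CHUNK_BYTES;
}

static unsigned
hinted_item_length(HintedItemId id)
{
    return (unsigned) id.lp_len * CHUNK_BYTES;
}

/* Check the xmin hint without following the pointer to the tuple. */
static int
hinted_item_xmin_aborted(HintedItemId id)
{
    return (id.lp_xmin_hint & HINT_INVALID) != 0;
}
```

With 13 bits counting 4 byte chunks, the largest representable offset is 8191 * 4 = 32764 bytes, so the existing 32kB block limit is still covered. A scan could call hinted_item_xmin_aborted() on each line pointer and skip the tuple entirely when it returns true.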
We wouldn't need to do a FPW (full-page write) when a hint changes; we would only need to take a copy of the ItemId array, which is much smaller. And it could be protected by its own checksum.

(In addition, if we wanted, this could be used to extend block size to 64kB if we used 8-byte alignment for tuples.)

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services