Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap

From Jameison Martin
Subject Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
Date
Msg-id 1344527774.12166.YahooMailNeo@web39404.mail.mud.yahoo.com
In response to Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
List pgsql-hackers
<div style="color:; background-color:; font-family:tahoma, new york, times, serif;font-size:10pt"><div
style="font-family:tahoma, 'new york', times, serif; font-size: 10pt; "><span>Simon, Tom is correct: the patch doesn't
change the existing row format contract or the format of the null bitmap. The change only affects how new rows are
written out, and it uses the same supported format that has always been there (which is why ALTER TABLE ... ADD COLUMN
with a nullable column works the way it does). It also keeps to the same MAXALIGN boundaries that are there today. </span></div><div
style="font-family:tahoma, 'new york', times, serif; font-size: 13px; color: rgb(0, 0, 0); background-color:
transparent;font-style: normal; "><span><br /></span></div><div style="background-color: transparent; "><span><font
size="2">One could argue that different row formats could make sense in different circumstances, and I'm certainly open
to that kind of discussion, but this change is far more modest and can perhaps be made on its own, since it
doesn't perturb the code base much, improves performance (marginally), and reduces the size of rows with lots of
trailing nulls.</font></span></div><div style="background-color: transparent; color: rgb(0, 0, 0); font-size: 13px;
font-family:tahoma, 'new york', times, serif; font-style: normal; "><span><font size="2"><br /></font></span></div><div
style="background-color:transparent; color: rgb(0, 0, 0); font-size: 13px; font-family: tahoma, 'new york', times,
serif;font-style: normal; "><span><font size="2">[separate topic: pluggable heap manager]</font></span></div><div
style="background-color:transparent; color: rgb(0, 0, 0); font-size: 13px; font-family: tahoma, 'new york', times,
serif;font-style: normal; "><span><font size="2">I'm quite interested in pursuing more aggressive compression
strategies, and I'd like to do so in the context of the heap manager. I'm exploring a pluggable heap manager
implementation and would be interested in feedback on that as a general approach. My thinking is that I'd like
PostgreSQL to support multiple heap implementations, along the lines of how multiple index types are supported,
though probably only the existing heap manager implementation would be part of the actual codeline. I've done a little
exploratory work looking at the heap interface. I was planning on doing a little prototyping before suggesting
anything concrete, but, assuming the concept of a layered heap manager is not inherently objectionable, I was thinking
of cleaning up the heap interface a little (e.g. some HOT stuff has bled across a little), then taking a whack at
formalizing the interface along the lines of the index layering. So ideally I'd make a few separate submissions, and if
all goes according to plan I'd have a pluggable heap manager implementation that I could work on
independently and which could in theory use the same hooks as the existing heap implementation. And if it turns out that
my implementation is deemed to be general enough, it could be released to the community.</font></span></div><div
style="background-color:transparent; color: rgb(0, 0, 0); font-size: 13px; font-family: tahoma, 'new york', times,
serif;font-style: normal; "><span><font size="2"><br /></font></span></div><div style="background-color: transparent;
color:rgb(0, 0, 0); font-family: tahoma, 'new york', times, serif; font-style: normal; "><font size="2">If I do decide
to pursue this, can anyone suggest the best way to solicit feedback? I see that some proposals get shared on the postgres
wiki. I could put something up there to frame the issue and encourage some back-and-forth dialog. Or is email the way
that this kind of exchange tends to happen? Ultimately I'd like to get into a bit of detail about what the actual heap
manager contract is and so forth.</font></div><div style="background-color: transparent; color: rgb(0, 0, 0);
font-family:tahoma, 'new york', times, serif; font-style: normal; font-size: 13px; "><font size="2"><br
/></font></div><div style="background-color: transparent; color: rgb(0, 0, 0); font-family: tahoma, 'new york', times,
serif;font-style: normal; font-size: 13px; "><font size="2">Note that I'm a ways from really knowing if this is
feasible on my end, so this is quite speculative at this point. But I'd like to introduce the topic and get some
feedback on the right way to communicate as early as possible.</font></div><div style="background-color: transparent;
color:rgb(0, 0, 0); font-family: tahoma, 'new york', times, serif; font-style: normal; font-size: 13px; "><font
size="2"><br/></font></div><div style="background-color: transparent; color: rgb(0, 0, 0); font-family: tahoma, 'new
york',times, serif; font-style: normal; font-size: 13px; "><font size="2">Thanks.</font></div><div
style="background-color:transparent; color: rgb(0, 0, 0); font-family: tahoma, 'new york', times, serif; font-style:
normal;font-size: 13px; "><font size="2"><br /></font></div><div style="background-color: transparent; color: rgb(0, 0,
0);font-family: tahoma, 'new york', times, serif; font-style: normal; font-size: 13px; "><font
size="2">-Jamie</font></div><div style="background-color: transparent; color: rgb(0, 0, 0); font-family: tahoma, 'new
york',times, serif; font-style: normal; font-size: 13px; "><font size="2"><br /></font></div><div style="font-family:
tahoma,'new york', times, serif; font-size: 10pt; "><div style="font-family: 'times new roman', 'new york', times,
serif;font-size: 12pt; "><div dir="ltr"><font face="Arial" size="2"><hr size="1" /><b><span
style="font-weight:bold;">From:</span></b> Tom Lane &lt;tgl@sss.pgh.pa.us&gt;<br /><b><span style="font-weight:
bold;">To:</span></b> Simon Riggs &lt;simon@2ndQuadrant.com&gt; <br /><b><span style="font-weight: bold;">Cc:</span></b>
Jameison Martin &lt;jameisonb@yahoo.com&gt;; "pgsql-hackers@postgresql.org" &lt;pgsql-hackers@postgresql.org&gt; <br
/><b><span style="font-weight: bold;">Sent:</span></b> Thursday, August 9, 2012 7:27 AM<br /><b><span
style="font-weight:bold;">Subject:</span></b> Re: [HACKERS] patch submission: truncate trailing nulls from heap rows to
reduce the size of the null bitmap<br /></font></div><br /> Simon Riggs &lt;<a href="mailto:simon@2ndQuadrant.com"
ymailto="mailto:simon@2ndQuadrant.com">simon@2ndQuadrant.com</a>&gt; writes:<br />> On 17 April 2012 17:22, Jameison
Martin &lt;<a href="mailto:jameisonb@yahoo.com" ymailto="mailto:jameisonb@yahoo.com">jameisonb@yahoo.com</a>&gt;
wrote:<br />>> The following patch truncates trailing null attributes from heap rows to<br />>> reduce the
size of the row bitmap.<br /><br />> This is an interesting patch, but it has had various comments made about it.<br
/><br />> When I look at this I see that it would change the NULL bitmap for all<br />> existing rows, which means
it forces a complete unload/reload of data.<br /><br />Huh?  I thought it would only change how *new* tuples were
stored.<br />Old tuples ought to continue to work fine.<br /><br />I'm not really convinced that it's a good idea in the
larger scheme<br />of things --- your point in a nearby thread that micro-optimizing<br />storage space at the expense
of all else is not good engineering<br />applies here.  But I don't see that it forces data reload.  Or if<br />it does,
that should be easily fixable.<br /><br />> ...  Have another flag which indicates<br />> when a partial trailing
col trimmed NULL bitmap is in use.<br /><br />That might be useful for forensic purposes, but on the whole I suspect<br
/>it's just added complexity (and eating up a valuable infomask bit)<br />for relatively little gain.<br /><br />>
... decide whether a table will benefit from full or partial bitmap and<br />> set that in the tupledesc. That way
the tupledesc will show<br />> heap_form_tuple which kind of null bitmap is preferred for new tuples.<br />> That
preference might be settable by user on or off, but the default<br />> would be for postgres to decide that for us
based upon null stats etc,<br />> which we would decide at ANALYZE time.<br /><br />And that seems like huge
overcomplication. I think we could probably<br />do fine with some very simple fixed policy, like "don't bother with<br
/>this for tables of less than N columns", where N is maybe 64 or so<br />and chosen to match the MAXALIGN boundary
where there actually could<br />be some savings from trimming the null bitmap.<br /><br />(Note: I've not read the
patch, so maybe Jameison already did something<br />of the sort.)<br /><br />            regards, tom lane<br /><br
/><br /></div></div></div>
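
The point about MAXALIGN boundaries can be made concrete with a little arithmetic. The following is only a rough sketch, not code from the patch: it assumes a 23-byte fixed tuple header (the offset of the null bitmap in stock PostgreSQL), 8-byte maximum alignment as on typical 64-bit builds, no OID field, and that the null bitmap is still present after trimming. The function names are hypothetical.

```python
# Sketch of heap tuple header sizing. The constants are assumptions for a
# typical 64-bit build, not values read from a running server.
FIXED_HEADER_BYTES = 23  # assumed offset of the null bitmap in the tuple header
MAXIMUM_ALIGNOF = 8      # assumed MAXALIGN granularity


def bitmap_len(natts: int) -> int:
    """Bytes needed for a null bitmap with one bit per stored attribute."""
    return (natts + 7) // 8


def max_align(length: int) -> int:
    """Round length up to the next multiple of MAXIMUM_ALIGNOF."""
    return (length + MAXIMUM_ALIGNOF - 1) // MAXIMUM_ALIGNOF * MAXIMUM_ALIGNOF


def header_size(natts: int) -> int:
    """Aligned header size for a tuple that stores natts attributes and has a
    null bitmap (hypothetical helper; the OID field is ignored)."""
    return max_align(FIXED_HEADER_BYTES + bitmap_len(natts))


def bytes_saved(total_atts: int, trailing_nulls: int) -> int:
    """Header bytes saved when trailing null attributes are not stored."""
    return header_size(total_atts) - header_size(total_atts - trailing_nulls)


if __name__ == "__main__":
    # Show where the aligned header size steps up under these assumptions.
    for natts in (8, 9, 64, 72, 73, 100):
        print(natts, header_size(natts))
```

Under these assumptions the aligned header only grows when the bitmap pushes it past an 8-byte boundary, so trimming saves nothing unless the stored attribute count drops below such a boundary. For example, `bytes_saved(100, 50)` is 8, while `bytes_saved(72, 10)` is 0.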
