Re: jsonb format is pessimal for toast compression
From | Arthur Silva
Subject | Re: jsonb format is pessimal for toast compression
Date |
Msg-id | CAO_YK0VZRG6yjBgyrfQ7Rz=K7j-xq6HYcfVQZcTO3WNhXHCwSA@mail.gmail.com
In response to | Re: jsonb format is pessimal for toast compression (Tom Lane <tgl@sss.pgh.pa.us>)
Responses | Re: jsonb format is pessimal for toast compression (Tom Lane <tgl@sss.pgh.pa.us>)
List | pgsql-hackers
I'm still getting up to speed on Postgres development, but I'd like to offer an opinion.

We should add some sort of versioning to the jsonb format. This could be exploited in many ways in the future.

As for the current problem, we should explore the directory-at-the-end option. It should improve compression while keeping good access performance.

A 4-byte header is sufficient to store the directory offset and some versioning bits.

On 15/08/2014 17:39, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > On 08/14/2014 07:24 PM, Tom Lane wrote:
> >> We can certainly reduce that.  The question was whether it would be
> >> worth the effort to try.  At this point, with three different test
> >> data sets having shown clear space savings, I think it is worth
> >> the effort.  I'll poke into it tomorrow or over the weekend, unless
> >> somebody beats me to it.
>
> > Note that I specifically created that data set to be a worst case: many
> > top-level keys, no nesting, and small values.  However, I don't think
> > it's an unrealistic worst case.
>
> > Interestingly, even on the unpatched, 1GB table case, the *index* on the
> > JSONB is only 60MB.  Which shows just how terrific the improvement in
> > GIN index size/performance is.
>
> I've been poking at this, and I think the main explanation for your result
> is that with more JSONB documents being subject to compression, we're
> spending more time in pglz_decompress.  There's no free lunch in that
> department: if you want compressed storage it's gonna cost ya to
> decompress.  The only way I can get decompression and TOAST access to not
> dominate the profile on cases of this size is to ALTER COLUMN SET STORAGE
> PLAIN.  However, when I do that, I do see my test patch running about 25%
> slower overall than HEAD on an "explain analyze select jfield -> 'key'
> from table" type of query with 200-key documents with narrow fields (see
> attached perl script that generates the test data).
>
> It seems difficult to improve much on that for this test case.  I put some
> logic into findJsonbValueFromContainer to calculate the offset sums just
> once not once per binary-search iteration, but that only improved matters
> 5% at best.  I still think it'd be worth modifying the JsonbIterator code
> to avoid repetitive offset calculations, but that's not too relevant to
> this test case.
>
> Having said all that, I think this test is something of a contrived worst
> case.  More realistic cases are likely to have many fewer keys (so that
> speed of the binary search loop is less of an issue) or else to have total
> document sizes large enough that inline PLAIN storage isn't an option,
> meaning that detoast+decompression costs will dominate.
>
>                         regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
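[Editor's note: the compression problem this thread debates comes from jsonb's entry array storing cumulative end offsets, which are all distinct values and so give an LZ-style compressor little repetition to exploit, whereas per-field lengths form a highly repetitive byte stream. A minimal sketch of that effect, using Python's zlib as a stand-in for pglz and mirroring the 200-key, narrow-field shape of the test case quoted above; all names here are illustrative:]

```python
import struct
import zlib

# 200 top-level keys, each value 8 bytes wide (narrow fields, no nesting)
lengths = [8] * 200

# Offset encoding: every entry is a distinct cumulative position.
offsets, total = [], 0
for n in lengths:
    total += n
    offsets.append(total)

offset_bytes = b"".join(struct.pack("<I", o) for o in offsets)
length_bytes = b"".join(struct.pack("<I", n) for n in lengths)

compressed_offsets = len(zlib.compress(offset_bytes))
compressed_lengths = len(zlib.compress(length_bytes))

# The length stream is one 4-byte pattern repeated 200 times, so it
# compresses far better than the ever-changing offset stream.
print(compressed_offsets, compressed_lengths)
assert compressed_lengths < compressed_offsets
```

The flip side, which Tom Lane's profiling illustrates, is that a lengths-only encoding makes random access O(n): finding the i-th field means summing the first i lengths, which is why the offset sums reappear in `findJsonbValueFromContainer`'s binary search.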
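[Editor's note: Arthur's 4-byte header carrying a directory offset plus some versioning bits amounts to straightforward bit packing. The sketch below is only an illustration of the idea, not PostgreSQL code; the 3/29-bit split is an arbitrary assumption:]

```python
VERSION_BITS = 3                  # hypothetical: room for 8 format versions
OFFSET_BITS = 32 - VERSION_BITS   # remaining 29 bits locate the directory
OFFSET_MASK = (1 << OFFSET_BITS) - 1

def pack_header(version: int, dir_offset: int) -> int:
    """Pack version bits and the trailing directory's offset into one uint32."""
    if version >= (1 << VERSION_BITS) or dir_offset > OFFSET_MASK:
        raise ValueError("field out of range")
    return (version << OFFSET_BITS) | dir_offset

def unpack_header(header: int) -> tuple[int, int]:
    """Recover (version, directory offset) from the packed header."""
    return header >> OFFSET_BITS, header & OFFSET_MASK

# round trip: version 1, directory 123456 bytes into the datum
assert unpack_header(pack_header(1, 123456)) == (1, 123456)
```

Placing the key directory after the values, located via this header, is what would let the value bytes compress independently of the (less compressible) offset directory.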