Обсуждение: Index-only scans for GiST.
Вложения
Finally there is a new version of patch (in attachments).It provides multicolumn index-only scan for GiST indexes.- Memory leak is fixed.- little code cleanup- example of performance test in attachmens- function OIDs have debugging values (1111*) just to avoid merge conflicts while testing patchWiki page of the project ishttps://wiki.postgresql.org/wiki/Support_for_Index-only_scans_for_GIST_GSoC_2014Waiting for feedback.
$ patch -p1 < ~/Downloads/indexonlyscan_gist_2.0.patch
(Stripping trailing CRs from patch; use --binary to disable.)
patching file src/backend/access/gist/gist.c
Hunk #1 succeeded at 1404 (offset 9 lines).
Hunk #2 succeeded at 1434 (offset 9 lines).
(Stripping trailing CRs from patch; use --binary to disable.)
patching file src/backend/access/gist/gistget.c
Hunk #1 succeeded at 227 (offset 1 line).
Hunk #2 succeeded at 243 (offset 1 line).
Hunk #3 succeeded at 293 (offset -4 lines).
Hunk #4 succeeded at 330 (offset -4 lines).
Hunk #5 succeeded at 365 (offset -5 lines).
Hunk #6 succeeded at 444 (offset -27 lines).
Hunk #7 succeeded at 474 (offset -27 lines).
Hunk #8 FAILED at 518.
Hunk #9 succeeded at 507 (offset -28 lines).
Hunk #10 succeeded at 549 with fuzz 1 (offset -28 lines).
Hunk #11 FAILED at 601.
2 out of 11 hunks FAILED -- saving rejects to file src/backend/access/gist/gistget.c.rej
...
--
On 11 February 2015 at 22:50, Anastasia Lubennikova <lubennikovaav@gmail.com> wrote:Finally there is a new version of patch (in attachments).It provides multicolumn index-only scan for GiST indexes.- Memory leak is fixed.- little code cleanup- example of performance test in attachmens- function OIDs have debugging values (1111*) just to avoid merge conflicts while testing patchWiki page of the project ishttps://wiki.postgresql.org/wiki/Support_for_Index-only_scans_for_GIST_GSoC_2014Waiting for feedback.Hi Anastasia. Thanks for the updated patch. I've just tried applying it to head and it doesn't appear to apply cleanly.
$ patch -p1 < ~/Downloads/indexonlyscan_gist_2.0.patch
(Stripping trailing CRs from patch; use --binary to disable.)
patching file src/backend/access/gist/gist.c
Hunk #1 succeeded at 1404 (offset 9 lines).
Hunk #2 succeeded at 1434 (offset 9 lines).
(Stripping trailing CRs from patch; use --binary to disable.)
patching file src/backend/access/gist/gistget.c
Hunk #1 succeeded at 227 (offset 1 line).
Hunk #2 succeeded at 243 (offset 1 line).
Hunk #3 succeeded at 293 (offset -4 lines).
Hunk #4 succeeded at 330 (offset -4 lines).
Hunk #5 succeeded at 365 (offset -5 lines).
Hunk #6 succeeded at 444 (offset -27 lines).
Hunk #7 succeeded at 474 (offset -27 lines).
Hunk #8 FAILED at 518.
Hunk #9 succeeded at 507 (offset -28 lines).
Hunk #10 succeeded at 549 with fuzz 1 (offset -28 lines).
Hunk #11 FAILED at 601.
2 out of 11 hunks FAILED -- saving rejects to file src/backend/access/gist/gistget.c.rej
...
--Thom
--
Вложения
Thanks for answer.Now it seems to be applied correctly.
On 02/12/2015 12:40 PM, Anastasia Lubennikova wrote: > Thanks for answer. > Now it seems to be applied correctly. Thanks, it would be great to get this completed. This still leaks memory with the same test query as earlier. The leak seems to be into a different memory context now; it used to be to the "GiST scan context", but now it's to the ExecutorState context. See attached patch which makes the leak evident. Looking at my previous comments from August: * What's the reason for turning GISTScanOpaqueData.pageData from an array to a List? * Documentation is missing. - Heikki
Вложения
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On 12 February 2015 at 16:40, Heikki Linnakangas <span dir="ltr"><<ahref="mailto:hlinnakangas@vmware.com" target="_blank">hlinnakangas@vmware.com</a>></span> wrote:<br /><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On02/12/2015 12:40 PM, Anastasia Lubennikova wrote:<br /><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px#ccc solid;padding-left:1ex"> Thanks for answer.<br /> Now it seems to be applied correctly.<br /></blockquote></span><br/> * Documentation is missing.<span class="HOEnZb"></span><br clear="all" /></blockquote></div><br/></div><div class="gmail_extra">Anastasia provided a documentation patch in the first email.<br/></div><div class="gmail_extra"><br /><div class="gmail_signature">Thom</div></div></div>
<div dir="ltr"><div class="gmail_extra">On Thu, Feb 12, 2015 at 3:07 PM, Thom Brown <<a href="mailto:thom@linux.com">thom@linux.com</a>>wrote:<br />><br />> On 12 February 2015 at 16:40, Heikki Linnakangas<<a href="mailto:hlinnakangas@vmware.com">hlinnakangas@vmware.com</a>> wrote:<br />>><br />>>On 02/12/2015 12:40 PM, Anastasia Lubennikova wrote:<br />>>><br />>>> Thanks for answer.<br/>>>> Now it seems to be applied correctly.<br />>><br />>><br />>> * Documentationis missing.<br />><br />><br />> Anastasia provided a documentation patch in the first email.<br />><br/><br /></div><div class="gmail_extra">Yeah, but it's a good practice send all the patches together. Will make thelife of the reviewers better ;-)<br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra">Regards,<br/></div><div class="gmail_extra"><br />--<br />Fabrízio de Royes Mello<br />Consultoria/CoachingPostgreSQL<br />>> Timbira: <a href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog: <a href="http://fabriziomello.github.io">http://fabriziomello.github.io</a><br/>>> Linkedin: <a href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a><br/>>> Github: <a href="http://github.com/fabriziomello">http://github.com/fabriziomello</a></div></div>
I add MemoryContext listCxt to avoid memory leak. listCxt is created once in gistrescan (only for index-only scan plan ) and reseted when scan of the leaf page is finished.
I do not sure if the problem was completely solved, so I wait for feedback.
* What's the reason for turning GISTScanOpaqueData.pageData from an array to a List?
This array is field of structure GISTScanOpaqueData. Memory for that structure allocated in function gistbeginscan(). The array is static so it's declared only one time in structure:
GISTSearchHeapItem pageData [BLCKSZ/sizeof(IndexTupleData)]
But how could we know size of array if we don't know what data would be returned? I mean type and amount.
I asked Alexander about that and he offered me to use List instead of Array.
Вложения
On 02/27/2015 04:19 PM, Anastasia Lubennikova wrote: > I add MemoryContext listCxt to avoid memory leak. listCxt is created once > in gistrescan (only for index-only scan plan ) and reseted when scan of the > leaf page is finished. > > I do not sure if the problem was completely solved, so I wait for feedback. Yeah, I think that solves it. > * What's the reason for turning GISTScanOpaqueData.pageData from an array > to a List? > > This array is field of structure GISTScanOpaqueData. Memory for that > structure allocated in function gistbeginscan(). The array is static so > it's declared only one time in structure: > GISTSearchHeapItem pageData [BLCKSZ/sizeof(IndexTupleData)] > > But how could we know size of array if we don't know what data would be > returned? I mean type and amount. You're only adding a pointer to the IndexTuple to GISTSearchHeapItem. The GISTSearchHeapItem struct itself is still of constant size. I spent a little time cleaning this up. I reverted that pageData change so that it's an array again, put back the gist_indexonly.sql and expected output files that were missing from your latest version, removed a couple of unused local variables that gcc complained about. I refactored gistFetchTuple a bit, because it was doing IMHO quite bogus things with NULLs. It was passing NULLs to the opclass' fetch function, but it didn't pass the isNull flag correctly. I changed it so that the fetch function is not called at all for NULLs. I think this is pretty close to being committable. I'll make a round of editorializing over the docs, and the code comments as well. The opr_sanity regression test is failing, there's apparently something wrong with the pg_proc entries of the *canreturn functions. I haven't looked into that yet; could you fix that? - Heikki
Вложения
On 03/02/2015 12:43 AM, Heikki Linnakangas wrote: > On 02/27/2015 04:19 PM, Anastasia Lubennikova wrote: >> I add MemoryContext listCxt to avoid memory leak. listCxt is created once >> in gistrescan (only for index-only scan plan ) and reseted when scan of the >> leaf page is finished. >> >> I do not sure if the problem was completely solved, so I wait for feedback. > > Yeah, I think that solves it. On further testing, there was actually still a leak with kNN-searches. Fixed. > I spent a little time cleaning this up. I reverted that pageData change > so that it's an array again, put back the gist_indexonly.sql and > expected output files that were missing from your latest version, > removed a couple of unused local variables that gcc complained about. I > refactored gistFetchTuple a bit, because it was doing IMHO quite bogus > things with NULLs. It was passing NULLs to the opclass' fetch function, > but it didn't pass the isNull flag correctly. I changed it so that the > fetch function is not called at all for NULLs. > > I think this is pretty close to being committable. I'll make a round of > editorializing over the docs, and the code comments as well. > > The opr_sanity regression test is failing, there's apparently something > wrong with the pg_proc entries of the *canreturn functions. I haven't > looked into that yet; could you fix that? I have pushed this, after fixing the opr_sanity failure, some bug fixes, and documentation and comment cleanup. Thanks for the patch! - Heikki