Обсуждение: Index-only scans for GiST.

Поиск
Список
Период
Сортировка

Index-only scans for GiST.

От
Anastasia Lubennikova
Дата:
Finally there is a new version of patch (in attachments).
It provides multicolumn index-only scan for GiST indexes.

- Memory leak is fixed.
- little code cleanup
- example of performance test in attachmens
- function OIDs have debugging values (1111*) just to avoid merge conflicts while testing patch

Wiki page of the project is
https://wiki.postgresql.org/wiki/Support_for_Index-only_scans_for_GIST_GSoC_2014

Waiting for feedback.
--
Best regards,
Lubennikova Anastasia
Вложения

Re: Index-only scans for GiST.

От
Thom Brown
Дата:
On 11 February 2015 at 22:50, Anastasia Lubennikova <lubennikovaav@gmail.com> wrote:
Finally there is a new version of patch (in attachments).
It provides multicolumn index-only scan for GiST indexes.

- Memory leak is fixed.
- little code cleanup
- example of performance test in attachmens
- function OIDs have debugging values (1111*) just to avoid merge conflicts while testing patch


Hi Anastasia.  Thanks for the updated patch.  I've just tried applying it to head and it doesn't appear to apply cleanly.

$ patch -p1 < ~/Downloads/indexonlyscan_gist_2.0.patch
(Stripping trailing CRs from patch; use --binary to disable.)
patching file src/backend/access/gist/gist.c
Hunk #1 succeeded at 1404 (offset 9 lines).
Hunk #2 succeeded at 1434 (offset 9 lines).
(Stripping trailing CRs from patch; use --binary to disable.)
patching file src/backend/access/gist/gistget.c
Hunk #1 succeeded at 227 (offset 1 line).
Hunk #2 succeeded at 243 (offset 1 line).
Hunk #3 succeeded at 293 (offset -4 lines).
Hunk #4 succeeded at 330 (offset -4 lines).
Hunk #5 succeeded at 365 (offset -5 lines).
Hunk #6 succeeded at 444 (offset -27 lines).
Hunk #7 succeeded at 474 (offset -27 lines).
Hunk #8 FAILED at 518.
Hunk #9 succeeded at 507 (offset -28 lines).
Hunk #10 succeeded at 549 with fuzz 1 (offset -28 lines).
Hunk #11 FAILED at 601.
2 out of 11 hunks FAILED -- saving rejects to file src/backend/access/gist/gistget.c.rej
...

--
Thom

Re: Index-only scans for GiST.

От
Anastasia Lubennikova
Дата:
Thanks for answer.
Now it seems to be applied correctly.

2015-02-12 3:12 GMT+04:00 Thom Brown <thom@linux.com>:
On 11 February 2015 at 22:50, Anastasia Lubennikova <lubennikovaav@gmail.com> wrote:
Finally there is a new version of patch (in attachments).
It provides multicolumn index-only scan for GiST indexes.

- Memory leak is fixed.
- little code cleanup
- example of performance test in attachmens
- function OIDs have debugging values (1111*) just to avoid merge conflicts while testing patch


Hi Anastasia.  Thanks for the updated patch.  I've just tried applying it to head and it doesn't appear to apply cleanly.

$ patch -p1 < ~/Downloads/indexonlyscan_gist_2.0.patch
(Stripping trailing CRs from patch; use --binary to disable.)
patching file src/backend/access/gist/gist.c
Hunk #1 succeeded at 1404 (offset 9 lines).
Hunk #2 succeeded at 1434 (offset 9 lines).
(Stripping trailing CRs from patch; use --binary to disable.)
patching file src/backend/access/gist/gistget.c
Hunk #1 succeeded at 227 (offset 1 line).
Hunk #2 succeeded at 243 (offset 1 line).
Hunk #3 succeeded at 293 (offset -4 lines).
Hunk #4 succeeded at 330 (offset -4 lines).
Hunk #5 succeeded at 365 (offset -5 lines).
Hunk #6 succeeded at 444 (offset -27 lines).
Hunk #7 succeeded at 474 (offset -27 lines).
Hunk #8 FAILED at 518.
Hunk #9 succeeded at 507 (offset -28 lines).
Hunk #10 succeeded at 549 with fuzz 1 (offset -28 lines).
Hunk #11 FAILED at 601.
2 out of 11 hunks FAILED -- saving rejects to file src/backend/access/gist/gistget.c.rej
...

--
Thom



--
Best regards,
Lubennikova Anastasia
Вложения

Re: Index-only scans for GiST.

От
Thom Brown
Дата:
On 12 February 2015 at 10:40, Anastasia Lubennikova <lubennikovaav@gmail.com> wrote:
Thanks for answer.
Now it seems to be applied correctly.

(please avoid top-posting)

Thanks for the updated patch.  I can confirm that it now cleanly applies and compiles fine.  I've run the tested in the SQL file you provided, and it's behaving as expected.

Thom

Re: Index-only scans for GiST.

От
Heikki Linnakangas
Дата:
On 02/12/2015 12:40 PM, Anastasia Lubennikova wrote:
> Thanks for answer.
> Now it seems to be applied correctly.

Thanks, it would be great to get this completed.

This still leaks memory with the same test query as earlier. The leak
seems to be into a different memory context now; it used to be to the
"GiST scan context", but now it's to the ExecutorState context. See
attached patch which makes the leak evident.

Looking at my previous comments from August:

* What's the reason for turning GISTScanOpaqueData.pageData from an
array to a List?

* Documentation is missing.

- Heikki

Вложения

Re: Index-only scans for GiST.

От
Thom Brown
Дата:
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On 12 February 2015 at 16:40, Heikki Linnakangas <span
dir="ltr"><<ahref="mailto:hlinnakangas@vmware.com" target="_blank">hlinnakangas@vmware.com</a>></span> wrote:<br
/><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span
class="">On02/12/2015 12:40 PM, Anastasia Lubennikova wrote:<br /><blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px#ccc solid;padding-left:1ex"> Thanks for answer.<br /> Now it seems to be applied correctly.<br
/></blockquote></span><br/> * Documentation is missing.<span class="HOEnZb"></span><br clear="all"
/></blockquote></div><br/></div><div class="gmail_extra">Anastasia provided a documentation patch in the first
email.<br/></div><div class="gmail_extra"><br /><div class="gmail_signature">Thom</div></div></div> 

Re: Index-only scans for GiST.

От
Fabrízio de Royes Mello
Дата:
<div dir="ltr"><div class="gmail_extra">On Thu, Feb 12, 2015 at 3:07 PM, Thom Brown <<a
href="mailto:thom@linux.com">thom@linux.com</a>>wrote:<br />><br />> On 12 February 2015 at 16:40, Heikki
Linnakangas<<a href="mailto:hlinnakangas@vmware.com">hlinnakangas@vmware.com</a>> wrote:<br />>><br
/>>>On 02/12/2015 12:40 PM, Anastasia Lubennikova wrote:<br />>>><br />>>> Thanks for
answer.<br/>>>> Now it seems to be applied correctly.<br />>><br />>><br />>> *
Documentationis missing.<br />><br />><br />> Anastasia provided a documentation patch in the first email.<br
/>><br/><br /></div><div class="gmail_extra">Yeah, but it's a good practice send all the patches together. Will make
thelife of the reviewers better ;-)<br /></div><div class="gmail_extra"><br /></div><div
class="gmail_extra">Regards,<br/></div><div class="gmail_extra"><br />--<br />Fabrízio de Royes Mello<br
/>Consultoria/CoachingPostgreSQL<br />>> Timbira: <a
href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog: <a
href="http://fabriziomello.github.io">http://fabriziomello.github.io</a><br/>>> Linkedin: <a
href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a
href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a><br/>>> Github: <a
href="http://github.com/fabriziomello">http://github.com/fabriziomello</a></div></div>

Re: Index-only scans for GiST.

От
Anastasia Lubennikova
Дата:

I add MemoryContext listCxt to avoid memory leak. listCxt is created once in gistrescan (only for index-only scan plan ) and reseted when scan of the leaf page is finished.

I do not sure if the problem was completely solved, so I wait for feedback.

* What's the reason for turning GISTScanOpaqueData.pageData from an array to a List?

This array is field of structure GISTScanOpaqueData. Memory for that structure allocated in function gistbeginscan(). The array is static so it's declared only one time in structure:
GISTSearchHeapItem  pageData [BLCKSZ/sizeof(IndexTupleData)]

But how could we know size of array if we don't know what data would be returned? I mean type and amount.

I asked Alexander about that and he offered me to use List instead of Array.

Вложения

Re: Index-only scans for GiST.

От
Heikki Linnakangas
Дата:
On 02/27/2015 04:19 PM, Anastasia Lubennikova wrote:
> I add MemoryContext listCxt to avoid memory leak. listCxt is created once
> in gistrescan (only for index-only scan plan ) and reseted when scan of the
> leaf page is finished.
>
> I do not sure if the problem was completely solved, so I wait for feedback.

Yeah, I think that solves it.

> * What's the reason for turning GISTScanOpaqueData.pageData from an array
> to a List?
>
> This array is field of structure GISTScanOpaqueData. Memory for that
> structure allocated in function gistbeginscan(). The array is static so
> it's declared only one time in structure:
> GISTSearchHeapItem  pageData [BLCKSZ/sizeof(IndexTupleData)]
>
> But how could we know size of array if we don't know what data would be
> returned? I mean type and amount.

You're only adding a pointer to the IndexTuple to GISTSearchHeapItem.
The GISTSearchHeapItem struct itself is still of constant size.

I spent a little time cleaning this up. I reverted that pageData change
so that it's an array again, put back the gist_indexonly.sql and
expected output files that were missing from your latest version,
removed a couple of unused local variables that gcc complained about. I
refactored gistFetchTuple a bit, because it was doing IMHO quite bogus
things with NULLs. It was passing NULLs to the opclass' fetch function,
but it didn't pass the isNull flag correctly. I changed it so that the
fetch function is not called at all for NULLs.

I think this is pretty close to being committable. I'll make a round of
editorializing over the docs, and the code comments as well.

The opr_sanity regression test is failing, there's apparently something
wrong with the pg_proc entries of the *canreturn functions. I haven't
looked into that yet; could you fix that?

- Heikki


Вложения

Re: Index-only scans for GiST.

От
Heikki Linnakangas
Дата:
On 03/02/2015 12:43 AM, Heikki Linnakangas wrote:
> On 02/27/2015 04:19 PM, Anastasia Lubennikova wrote:
>> I add MemoryContext listCxt to avoid memory leak. listCxt is created once
>> in gistrescan (only for index-only scan plan ) and reseted when scan of the
>> leaf page is finished.
>>
>> I do not sure if the problem was completely solved, so I wait for feedback.
>
> Yeah, I think that solves it.

On further testing, there was actually still a leak with kNN-searches. 
Fixed.

> I spent a little time cleaning this up. I reverted that pageData change
> so that it's an array again, put back the gist_indexonly.sql and
> expected output files that were missing from your latest version,
> removed a couple of unused local variables that gcc complained about. I
> refactored gistFetchTuple a bit, because it was doing IMHO quite bogus
> things with NULLs. It was passing NULLs to the opclass' fetch function,
> but it didn't pass the isNull flag correctly. I changed it so that the
> fetch function is not called at all for NULLs.
>
> I think this is pretty close to being committable. I'll make a round of
> editorializing over the docs, and the code comments as well.
>
> The opr_sanity regression test is failing, there's apparently something
> wrong with the pg_proc entries of the *canreturn functions. I haven't
> looked into that yet; could you fix that?

I have pushed this, after fixing the opr_sanity failure, some bug fixes, 
and documentation and comment cleanup.

Thanks for the patch!

- Heikki