Обсуждение: Trying to add more tests to gistbuild.c
I'm studying how the gist index works trying to improve the test coverage of gistbuild.c. Reading the source code I noticed that the gistInitBuffering function is not covered, so I decided to start with it. Reading the documentation and the source I understood that for this function to be executed it is necessary to force bufferring=on when creating the index or the index to be created is big enough to not fit in the cache, am I correct? Considering the above, I added two new index creation statements in the gist regression test (patch attached) to create an index using buffering=on and another to try to simulate an index that does not fit in the cache. With these new tests the coverage went from 45.3% to 85.5%, but I have some doubts: - Does this test make sense? - Would there be a way to validate that the buffering was done correctly? - Is this test necessary? I've been studying Postgresql implementations and I'm just trying to start contributing the source code. -- Matheus Alcantara
Вложения
The attached patch is failing on make check due to a typo, resubmitting the correct one. -- Matheus Alcantara
Вложения
Hi Matheus, Many thanks for hacking on increasing the code coverage! I noticed that this patch was stuck in "Needs Review" state for some time and decided to take a look. > With these new tests the coverage went from 45.3% to 85.5%, but I have some doubts: > - Does this test make sense? > - Would there be a way to validate that the buffering was done correctly? > - Is this test necessary? I can confirm that the coverage improved as stated. Personally I believe this change makes perfect sense. Although this is arguably not an ideal test for gistInitBuffering(), writing proper tests for `static` procedures is generally not an easy task. Executing the code at least once in order to make sure that it doesn't crash, doesn't throw errors and doesn't violate any Asserts() is better than doing nothing. Here is a slightly modified patch with added commit message. Please note that patches created with `git format-patch` are generally preferable than patches created with `git diff`. I'm going to change the status of this patch to "Ready for Committer" in a short time unless anyone has a second opinion. -- Best regards, Aleksander Alekseev
Вложения
Aleksander Alekseev <aleksander@timescale.com> writes: > Personally I believe this change makes perfect sense. Although this is > arguably not an ideal test for gistInitBuffering(), writing proper > tests for `static` procedures is generally not an easy task. Executing > the code at least once in order to make sure that it doesn't crash, > doesn't throw errors and doesn't violate any Asserts() is better than > doing nothing. Yeah, our poor test coverage for gist buffering builds has been complained of before [1]. It'd be good to do something about that; the trick is to not bloat the runtime of the core regression tests too much. I checked this patch and what I see is: * gistbuild.c coverage improves to 81.8% line coverage, more or less as stated (probably depends on if you use --enable-cassert) * gistbuildbuffers.c coverage improves from 0 to 14.0% * gist.sql runtime goes from ~215ms to ~280ms The results for gistbuildbuffers.c are kind of disappointing, especially given the nontrivial runtime cost. (YMMV on the runtime of course, but I see pretty stable numbers under non-parallel "make installcheck".) In the previous thread, Pavel Borisov offered a test patch [2] that still applies, and it brings the line count coverage to 95%+ in both files. Unfortunately it more than doubles the test runtime, to somewhere around 580ms, so I rejected it at the time hoping for a better compromise. The idea I see you using that Pavel missed is to reduce effective_cache_size to persuade the buffering build logic to kick in with less data. But it looks like multilevel buffering still doesn't get activated, which is how come gistbuildbuffers.c coverage still remains poor. (I tried reducing effective_cache_size further, but it didn't help.) I wonder if we can combine ideas from the two patches to get a better tradeoff of code coverage vs. runtime. Another thing we might consider is to move the testing responsibility somewhere else. The reason I'm allergic to adding a lot of runtime here is that the core regression tests are invoked at least four times in a standard buildfarm run, often more. But that concern could be alleviated if we put the test somewhere else. Maybe contrib/btree_gist would be suitable? regards, tom lane [1] https://www.postgresql.org/message-id/flat/16329-7a6aa9b6fa1118a1%40postgresql.org [2] https://www.postgresql.org/message-id/CALT9ZEECCV5m7wvxg46PC-7x-EybUmnpupBGhSFMoAAay%2Br6HQ%40mail.gmail.com
------- Original Message ------- On Friday, July 29th, 2022 at 19:53, Tom Lane tgl@sss.pgh.pa.us wrote: > I wonder if we can combine ideas from the two patches to get a > better tradeoff of code coverage vs. runtime. I was checking the Pavel patch and notice that he was using the fillfactor parameter when creating the gist index. I changed my previous patch to include this parameter and the code coverage of gistbuild.c and gistbuildbuffers.c was improved to 97.7% and 92.8% respectively. I'm attaching this new patch, could you please check if this change make sense and also don't impact the test runtime? > Another thing we might consider is to move the testing responsibility > somewhere else. The reason I'm allergic to adding a lot of runtime > here is that the core regression tests are invoked at least four times > in a standard buildfarm run, often more. But that concern could be > alleviated if we put the test somewhere else. Maybe contrib/btree_gist > would be suitable? I can't say much about it. If there's anything I can do here, please let me know. -- Matheus Alcantara
Вложения
Matheus Alcantara <mths.dev@pm.me> writes: > On Friday, July 29th, 2022 at 19:53, Tom Lane tgl@sss.pgh.pa.us wrote: >> I wonder if we can combine ideas from the two patches to get a >> better tradeoff of code coverage vs. runtime. > I was checking the Pavel patch and notice that he was using the fillfactor > parameter when creating the gist index. I changed my previous patch to include > this parameter and the code coverage of gistbuild.c and gistbuildbuffers.c was > improved to 97.7% and 92.8% respectively. Nice! I poked at this some more, wondering if we could combine the two new index builds into one test, and eventually realized something I should probably have figured out before: if you make effective_cache_size too small, it refuses to use buffering build at all, and you get here: if (levelStep <= 0) { elog(DEBUG1, "failed to switch to buffered GiST build"); buildstate->buildMode = GIST_BUFFERING_DISABLED; return; } In fact, at least on my machine, the first test case hits this and thus effectively adds no coverage at all :-(. If I remove that and just add the second index build, the above-quoted bit is the *only* thing in gistbuild.c or gistbuildbuffers.c that is missed compared to using both test cases. Moreover, the runtime of the test comes down to ~240 ms, which is an increment of ~25ms instead of ~65ms. (Which shows that the non-buffering build is slower, not surprising I guess.) I judge that covering those three lines is not worth the extra 40ms, so pushed just the second test case. Thanks for poking at this! I'm much happier now about the state of code coverage in that area. regards, tom lane
On Sun, 31 Jul 2022 at 00:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Matheus Alcantara <mths.dev@pm.me> writes:
> On Friday, July 29th, 2022 at 19:53, Tom Lane tgl@sss.pgh.pa.us wrote:
>> I wonder if we can combine ideas from the two patches to get a
>> better tradeoff of code coverage vs. runtime.
> I was checking the Pavel patch and notice that he was using the fillfactor
> parameter when creating the gist index. I changed my previous patch to include
> this parameter and the code coverage of gistbuild.c and gistbuildbuffers.c was
> improved to 97.7% and 92.8% respectively.
Nice!
I poked at this some more, wondering if we could combine the two new
index builds into one test, and eventually realized something I should
probably have figured out before: if you make effective_cache_size
too small, it refuses to use buffering build at all, and you get here:
if (levelStep <= 0)
{
elog(DEBUG1, "failed to switch to buffered GiST build");
buildstate->buildMode = GIST_BUFFERING_DISABLED;
return;
}
In fact, at least on my machine, the first test case hits this and
thus effectively adds no coverage at all :-(. If I remove that and
just add the second index build, the above-quoted bit is the *only*
thing in gistbuild.c or gistbuildbuffers.c that is missed compared
to using both test cases. Moreover, the runtime of the test comes
down to ~240 ms, which is an increment of ~25ms instead of ~65ms.
(Which shows that the non-buffering build is slower, not surprising
I guess.)
I judge that covering those three lines is not worth the extra 40ms,
so pushed just the second test case.
Thanks for poking at this! I'm much happier now about the state of
code coverage in that area.
I'm happy, that the improvement of the tests I've forgotten about so long ago is finally committed. Thank you!
Best regards,
Pavel Borisov
Pavel Borisov