Re: index prefetching
From | Tomas Vondra |
---|---|
Subject | Re: index prefetching |
Date | |
Msg-id | 1ec17146-6d46-4997-af53-41c959f18496@vondra.me |
In reply to | Re: index prefetching (Peter Geoghegan <pg@bowt.ie>) |
List | pgsql-hackers |

On 7/16/25 17:29, Peter Geoghegan wrote:
> On Wed, Jul 16, 2025 at 4:40 AM Tomas Vondra <tomas@vondra.me> wrote:
>> For "uniform" data set, both prefetch patches do much better than master
>> (for low selectivities it's clearer in the log-scale chart). The
>> "complex" prefetch patch appears to have a bit of an edge for >1%
>> selectivities. I find this a bit surprising, the leaf pages have ~360
>> index items, so I wouldn't expect such impact due to not being able to
>> prefetch beyond the end of the current leaf page. But could be on
>> storage with higher latencies (this is the cloud SSD on azure).
>
> How can you say that the "complex" patch has "a bit of an edge for >1%
> selectivities"?
>
> It looks like a *massive* advantage on all "linear" test results.
> Those are only about 1/3 of all tests -- but if I'm not mistaken
> they're the *only* tests where prefetching could be expected to help a
> lot. The "cyclic" tests are adversarial/designed to make the patch
> look bad. The "uniform" tests have uniformly random heap accesses (I
> think), which can only be helped so much by prefetching.
>
> For example, with "linear_10 / eic=16 / sync", it looks like "complex"
> has about half the latency of "simple" in tests where selectivity is
> 10. The advantage for "complex" is even greater at higher
> "selectivity" values. All of the other "linear" test results look
> about the same.
>
> Have I missed something?
>

That paragraph starts with "for uniform data set", and the statement
about 1% selectivities was only about that particular data set.

You're right there's a massive difference on all the "correlated" data
sets. I believe (assume) that's caused by the same issue, discussed in
this thread (where the simple patch seems to do fewer fadvise calls). I
only picked the "cyclic" data set as an example, representing this.

FWIW I suspect the difference on the "uniform" data set might be caused
by this too, because at ~5% selectivity the queries start to hit pages
multiple times (there are ~20 rows/page, hence ~5% means ~1 matching row
per page on average). But the effect is much weaker than on the
correlated data sets, of course.


regards

-- 
Tomas Vondra
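
The rows-per-page arithmetic above can be checked with a rough sketch. It assumes each of the ~20 rows on a heap page matches independently with probability equal to the selectivity (a simple binomial model chosen for illustration, not something measured in this thread); the function names are hypothetical.

```python
from math import comb


def expected_matches_per_page(selectivity, rows_per_page=20):
    """Average number of matching rows on one heap page."""
    return selectivity * rows_per_page


def prob_at_least(k, selectivity, rows_per_page=20):
    """Probability that a page holds at least k matching rows.

    Assumption: rows match independently (binomial model).
    """
    p_fewer = sum(
        comb(rows_per_page, i)
        * selectivity ** i
        * (1 - selectivity) ** (rows_per_page - i)
        for i in range(k)
    )
    return 1 - p_fewer


if __name__ == "__main__":
    for sel in (0.01, 0.05, 0.10):
        print(
            f"selectivity={sel:.2f} "
            f"avg rows/page={expected_matches_per_page(sel):.2f} "
            f"P(page hit >= 2 times)={prob_at_least(2, sel):.3f}"
        )
```

Under this model, at 5% selectivity the average is one matching row per page, and roughly a quarter of all pages would contain two or more matching rows, which is consistent with pages starting to be hit repeatedly around that point.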