Re: Use merge-based matching for MCVs in eqjoinsel
| From | David Geier |
|---|---|
| Subject | Re: Use merge-based matching for MCVs in eqjoinsel |
| Date | |
| Msg-id | 33fd5d63-19cd-4fff-b741-0e7af45df52f@gmail.com |
| In response to | Re: Use merge-based matching for MCVs in eqjoinsel (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: Use merge-based matching for MCVs in eqjoinsel |
| List | pgsql-hackers |
Hi Tom!

On 17.11.2025 19:44, Tom Lane wrote:
> I wrote:
>> Actually, after sleeping on it it seems like the obvious thing is
>> to test "sslot1.nvalues * sslot2.nvalues", since the work we are
>> thinking about saving scales as that product. But I'm not sure
>> what threshold value to use if we do that. Maybe around 10000?
>
> Or maybe better, since we are considering an O(m*n) algorithm
> versus an O(m+n) one, we could check whether
>
> sslot1.nvalues * sslot2.nvalues - (sslot1.nvalues + sslot2.nvalues)
>
> exceeds some threshold. But that doesn't offer any insight into
> just what the threshold should be, either.

Good idea. How about using that formula and then determining the
threshold with a few experiments? Could be the JOB benchmark Ilia has
already set up or some synthetic test-cases.

Given that there's no one-size-fits-all constant anyways, that seems
good enough to me. Looking at [1], determining to set
MIN_ARRAY_SIZE_FOR_HASHED_SAOP to 9 was done the same way.

We could also include the operator costs for hashing and equality
comparison to make it more precise, in case they're easily accessible at
this point.

--
David Geier

[1] https://www.postgresql.org/message-id/flat/CAAaqYe8x62%2B%3Dwn0zvNKCj55tPpg-JBHzhZFFc6ANovdqFw7-dA%40mail.gmail.com
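
For illustration only, the check discussed above could be sketched roughly as follows. This is not code from the patch: the function names and the `MCV_WORK_SAVED_THRESHOLD` constant are hypothetical, and the value 10000 is just a placeholder borrowed from the earlier guess for the plain product test. The thread proposes picking the real value by experiment. The arithmetic is done in 64 bits so the product cannot overflow for large MCV lists.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical placeholder; the actual cutoff would come from benchmarks. */
#define MCV_WORK_SAVED_THRESHOLD 10000

/*
 * Rough count of comparisons saved by an O(m+n) matching strategy
 * compared with the O(m*n) nested loop, for MCV lists of the given sizes.
 * Widen to 64 bits before multiplying to avoid int overflow.
 */
static int64_t
mcv_work_saved(int nvalues1, int nvalues2)
{
    int64_t m = nvalues1;
    int64_t n = nvalues2;

    return m * n - (m + n);
}

/*
 * Decide whether the linear-time strategy is worth its setup cost.
 * Operator costs for hashing and equality are deliberately ignored here.
 */
static bool
use_linear_mcv_match(int nvalues1, int nvalues2)
{
    return mcv_work_saved(nvalues1, nvalues2) > MCV_WORK_SAVED_THRESHOLD;
}
```

With this placeholder threshold, two MCV lists of 100 entries each would still use the nested loop (9800 saved comparisons), while two lists of 102 entries would switch (10200).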