Re: BUG #19031: pg_trgm infinite loop on certain cases
From | Tom Lane |
---|---|
Subject | Re: BUG #19031: pg_trgm infinite loop on certain cases |
Date | |
Msg-id | 1081531.1756221406@sss.pgh.pa.us |
In reply to | Re: BUG #19031: pg_trgm infinite loop on certain cases (Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>) |
List | pgsql-bugs |
Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> writes:
> On Tue, Aug 26, 2025 at 3:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> However, I don't totally understand *why* it fixes the test case.
>> Especially not after I noted that there's already a test case in
>> pg_trgm that exercises exactly this situation:
>>
>> select count(*) from test_trgm where t like '%99%' and t like '%qwerty%';
>>
>> If you put an Assert into ginNewScanKey that the first scan key
>> isn't excludeOnly (instead of the re-sort), it fails on that query.
>> So why do we not see an infinite loop for that test case? I don't
>> really understand the GIN code well enough to figure out what is
>> the difference.

> I debug a little bit and it looks like the reason there's no infinite
> loop in your example is because it returns MAYBE for the first
> 'excludeOnly' key in:

> keyGetItem()
> ...
> res = key->triConsistentFn(key);

Ah-hah! The thing I'd overlooked is that that regression test query
uses a different operator: LIKE not %>. So even though the GinScanKey
looks pretty similar, the strategy is different, leading
gin_trgm_triconsistent to return GIN_MAYBE not GIN_FALSE when nkeys=0.

So now I can reproduce the failure with the regression tests' table:

    select count(*) from test_trgm where t %> '' and t %> '%qwerty%';

> I don't have an opinion on whether it's good or not to move
> all the 'excludeOnly' keys to the end, but it seems that simply not
> having an "excludeOnly" key as the first key is enough to fix the bug.
> Maybe it's enough to just swap any normal key with the first one, if
> it's "excludeOnly"?

Yeah, that would be enough to get us out of this particular example.
But I think the lesson here is that there are under-documented
dependencies on the ordering of GinScanKeys, and I want the fix
to make that ordering more predictable not less so. For example,
after seeing this I have little confidence that GIN wouldn't have
issues with an excludeOnly key that precedes the first normal key
for its index attribute, even when there are other keys for other
attributes appearing ahead of them in the scankey array. So I'd
rather that the fix be based on a consistent pattern like "put
excludeOnly keys after not-excludeOnly keys", not "let's swap the
first key with some randomly-chosen other key".

			regards, tom lane
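
Archive note: the ordering rule quoted above, "put excludeOnly keys after not-excludeOnly keys", amounts to a stable partition of the scan key array. The following is only a minimal sketch of that pattern under stated assumptions. `DemoScanKey` and `demo_order_scan_keys` are hypothetical names invented for illustration; they are not PostgreSQL's `GinScanKeyData` or the code of the actual patch, and the sketch says nothing about how the real fix is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GIN scan key (NOT PostgreSQL's struct). */
typedef struct DemoScanKey
{
    int  attnum;       /* index attribute the key applies to */
    bool excludeOnly;  /* true if the key can only rule rows out */
    int  id;           /* label used to observe the resulting order */
} DemoScanKey;

/*
 * Stable partition: every non-excludeOnly key first, then every
 * excludeOnly key, each group keeping its original relative order.
 * "scratch" must have room for nkeys elements.
 */
void
demo_order_scan_keys(DemoScanKey *keys, DemoScanKey *scratch, size_t nkeys)
{
    size_t out = 0;

    /* First pass: copy the normal keys in their original order. */
    for (size_t i = 0; i < nkeys; i++)
        if (!keys[i].excludeOnly)
            scratch[out++] = keys[i];

    /* Second pass: append the excludeOnly keys in their original order. */
    for (size_t i = 0; i < nkeys; i++)
        if (keys[i].excludeOnly)
            scratch[out++] = keys[i];

    /* Every key was copied exactly once. */
    assert(out == nkeys);

    /* Copy the reordered keys back into the caller's array. */
    for (size_t i = 0; i < nkeys; i++)
        keys[i] = scratch[i];
}
```

Unlike swapping the first key with an arbitrary normal key, a stable partition gives the same result no matter where the excludeOnly keys started out, which is the predictability the message argues for. In this sketch an excludeOnly key also ends up behind the normal keys of its own attribute, the case the message singles out as a concern.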