Re: BUG #19031: pg_trgm infinite loop on certain cases

From Tom Lane
Subject Re: BUG #19031: pg_trgm infinite loop on certain cases
Date
Msg-id 1081531.1756221406@sss.pgh.pa.us
In reply to Re: BUG #19031: pg_trgm infinite loop on certain cases  (Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>)
List pgsql-bugs
Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> writes:
> On Tue, Aug 26, 2025 at 3:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> However, I don't totally understand *why* it fixes the test case.
>> Especially not after I noted that there's already a test case in
>> pg_trgm that exercises exactly this situation:
>> 
>> select count(*) from test_trgm where t like '%99%' and t like '%qwerty%';
>> 
>> If you put an Assert into ginNewScanKey that the first scan key
>> isn't excludeOnly (instead of the re-sort), it fails on that query.
>> So why do we not see an infinite loop for that test case?  I don't
>> really understand the GIN code well enough to figure out what is
>> the difference.

> I debugged this a little, and it looks like the reason there's no
> infinite loop in your example is that it returns MAYBE for the first
> 'excludeOnly' key in:
>     keyGetItem()
>         ...
>         res = key->triConsistentFn(key);

Ah-hah!  The thing I'd overlooked is that that regression test query
uses a different operator: LIKE not %>.  So even though the
GinScanKey looks pretty similar, the strategy is different, leading
gin_trgm_triconsistent to return GIN_MAYBE not GIN_FALSE when nkeys=0.
So now I can reproduce the failure with the regression tests' table:

select count(*) from test_trgm where t %> '' and t %> '%qwerty%';
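To make the strategy difference concrete, here is a simplified model of it in C. This is a sketch only, not the pg_trgm source: the Demo* names, the two-strategy enum, and the threshold parameter are all assumptions made for illustration. The only behavior taken from the discussion above is the nkeys = 0 result: MAYBE for the LIKE-style strategy, FALSE for the %>-style one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ternary result, standing in for GIN's FALSE/TRUE/MAYBE. */
typedef enum { DEMO_FALSE, DEMO_TRUE, DEMO_MAYBE } DemoTernary;

/* Hypothetical strategy tags: one LIKE-style, one similarity-style (%>). */
typedef enum
{
    DEMO_STRATEGY_LIKE,
    DEMO_STRATEGY_WORD_SIMILARITY
} DemoStrategy;

/*
 * Simplified consistency check over nkeys extracted trigrams.
 * check[i] says whether trigram i is present in the indexed item.
 */
DemoTernary
demo_triconsistent(DemoStrategy strategy, const DemoTernary *check,
                   size_t nkeys, double limit)
{
    size_t      i;
    size_t      npresent = 0;

    assert(check != NULL || nkeys == 0);

    switch (strategy)
    {
        case DEMO_STRATEGY_LIKE:
            /* Any definitely-missing trigram rules the item out. */
            for (i = 0; i < nkeys; i++)
            {
                if (check[i] == DEMO_FALSE)
                    return DEMO_FALSE;
            }
            /* With zero trigrams nothing can rule it out: recheck needed. */
            return DEMO_MAYBE;

        case DEMO_STRATEGY_WORD_SIMILARITY:
            /* No trigrams at all: the model reports a definite non-match. */
            if (nkeys == 0)
                return DEMO_FALSE;
            for (i = 0; i < nkeys; i++)
            {
                if (check[i] != DEMO_FALSE)
                    npresent++;
            }
            return ((double) npresent / (double) nkeys >= limit)
                ? DEMO_MAYBE : DEMO_FALSE;
    }
    return DEMO_MAYBE;          /* not reached for the strategies above */
}
```

Calling demo_triconsistent with nkeys = 0 gives DEMO_MAYBE for DEMO_STRATEGY_LIKE and DEMO_FALSE for DEMO_STRATEGY_WORD_SIMILARITY, which mirrors why the LIKE query did not hit the loop while the %> query does.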

> I don't have an opinion on whether it's good or not to move
> all the 'excludeOnly' keys to the end, but it seems that simply not
> having an "excludeOnly" key as the first key is enough to fix the bug.
> Maybe it's enough to just swap any normal key with the first one, if
> it's "excludeOnly"?

Yeah, that would be enough to get us out of this particular example.
But I think the lesson here is that there are under-documented
dependencies on the ordering of GinScanKeys, and I want the fix to
make that ordering more predictable not less so.  For example, after
seeing this I have little confidence that GIN wouldn't have issues
with an excludeOnly key that precedes the first normal key for its
index attribute, even when there are other keys for other attributes
appearing ahead of them in the scankey array.  So I'd rather that the
fix be based on a consistent pattern like "put excludeOnly keys after
not-excludeOnly keys", not "let's swap the first key with some
randomly-chosen other key".
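As an illustration of the "put excludeOnly keys after not-excludeOnly keys" pattern, here is one way such a reordering could look. This is a minimal sketch and not the actual patch: DemoScanKey and demo_order_scan_keys are hypothetical names, the struct carries only the two fields needed for the demonstration, and the partition is global rather than per index attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a GIN scan key; the real struct has more fields. */
typedef struct DemoScanKey
{
    int         attnum;         /* index attribute the key applies to */
    bool        excludeOnly;    /* true if the key can only rule items out */
} DemoScanKey;

/*
 * Stable partition: normal keys first, excludeOnly keys last, keeping the
 * original relative order within each group.  "scratch" must have room for
 * nkeys entries.  Returns the number of normal keys.
 */
size_t
demo_order_scan_keys(DemoScanKey *keys, size_t nkeys, DemoScanKey *scratch)
{
    size_t      nnormal = 0;
    size_t      nexclude = 0;
    size_t      i;

    assert(nkeys == 0 || (keys != NULL && scratch != NULL));

    for (i = 0; i < nkeys; i++)
    {
        if (keys[i].excludeOnly)
            scratch[nexclude++] = keys[i];  /* set aside, in order */
        else
            keys[nnormal++] = keys[i];      /* safe: nnormal <= i */
    }

    /* Append the excludeOnly keys after the normal ones. */
    if (nexclude > 0)
        memcpy(keys + nnormal, scratch, nexclude * sizeof(DemoScanKey));

    return nnormal;
}
```

Because the partition is stable, the result depends only on the input order and the excludeOnly flags, which is the predictability argued for above; a single swap with the first key would leave the position of the displaced key dependent on which key happened to be chosen.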

            regards, tom lane


