Thread: Useless removal of duplicate GIN index entries in pg_trgm

Useless removal of duplicate GIN index entries in pg_trgm

From

Fujii Masao

Date:

August 27, 2012, 16:46:26

Hi,

After pg_trgm extracts the trigrams as GIN index keys, generate_trgm()
removes duplicate index keys, to avoid generating redundant index entries.
Also ginExtractEntries() which is the caller of pg_trgm does the same thing.
Why do we need to remove GIN index entries twice? I think that we can
get rid of the removal-of-duplicate code block from generate_trgm()
because it's useless. Comments?

Regards,

-- 
Fujii Masao

Re: Useless removal of duplicate GIN index entries in pg_trgm

From

Tom Lane

Date:

August 27, 2012, 19:38:15

Fujii Masao <masao.fujii@gmail.com> writes:
> After pg_trgm extracts the trigrams as GIN index keys, generate_trgm()
> removes duplicate index keys, to avoid generating redundant index entries.
> Also ginExtractEntries() which is the caller of pg_trgm does the same thing.
> Why do we need to remove GIN index entries twice? I think that we can
> get rid of the removal-of-duplicate code block from generate_trgm()
> because it's useless. Comments?

I see eight different callers of generate_trgm().  It might be that
gin_extract_value_trgm() doesn't really need this behavior, but that
doesn't mean the other seven don't want it.

Also, seeing that generate_trgm() is able to use relatively cheap
trigram-specific comparison operators for this, it's not impossible
that getting rid of duplicates internal to it is a net savings even
for the gin_extract_value case, because it'd reduce the number of
much-more-heavyweight comparisons done by ginExtractEntries...

        regards, tom lane
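
Editorial sketch of the second point above, in C. This is not pg_trgm or GIN source code: the `Trigram` struct and the functions `cmp_trgm_cheap`, `cmp_key_heavy`, `sort_unique`, `extract_trigrams`, and `dedup_keys` are hypothetical names invented for illustration. `cmp_key_heavy` is only a stand-in for a more expensive generic key comparison and counts how often it is called, so the effect of a cheap pre-deduplication pass on the later generic pass can be observed. It says nothing about real timings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical trigram type: three raw bytes. */
typedef struct { unsigned char b[3]; } Trigram;

/* Counter for calls to the "heavyweight" comparator (illustration only). */
long heavy_calls = 0;

/* Cheap, trigram-specific comparison: a three-byte memcmp. */
int cmp_trgm_cheap(const void *a, const void *b)
{
    return memcmp(((const Trigram *) a)->b, ((const Trigram *) b)->b, 3);
}

/* Stand-in for a generic, more expensive key comparison; counts its calls. */
int cmp_key_heavy(const void *a, const void *b)
{
    heavy_calls++;
    return cmp_trgm_cheap(a, b);
}

/* Sort v[0..n) and drop adjacent duplicates; returns the new length. */
size_t sort_unique(Trigram *v, size_t n,
                   int (*cmp)(const void *, const void *))
{
    size_t i, last = 0;

    if (n < 2)
        return n;
    qsort(v, n, sizeof(Trigram), cmp);
    for (i = 1; i < n; i++)
        if (cmp(&v[last], &v[i]) != 0)
            v[++last] = v[i];
    return last + 1;
}

/* Copy every 3-byte window of s into out (at most cap entries). */
size_t extract_trigrams(const char *s, Trigram *out, size_t cap)
{
    size_t len, i, n = 0;

    assert(s != NULL && out != NULL);
    len = strlen(s);
    for (i = 0; i + 3 <= len && n < cap; i++)
        memcpy(out[n++].b, s + i, 3);
    return n;
}

/*
 * Deduplicate the trigrams of text with the heavy comparator, optionally
 * after a cheap pre-deduplication pass. Returns the number of distinct
 * keys and stores the number of heavy comparisons in *heavy.
 */
size_t dedup_keys(const char *text, int prededup, long *heavy)
{
    Trigram buf[256];
    size_t  n = extract_trigrams(text, buf, 256);

    heavy_calls = 0;
    if (prededup)
        n = sort_unique(buf, n, cmp_trgm_cheap);
    n = sort_unique(buf, n, cmp_key_heavy);
    *heavy = heavy_calls;
    return n;
}
```

Calling `dedup_keys("abababab", 1, &h)` and `dedup_keys("abababab", 0, &h)` yields the same set of distinct keys either way. With the pre-deduplication pass, the counted comparator sees only the already-unique trigrams, so it is called fewer times on inputs with many repeats. Whether that is a net win in practice depends on the real cost of each comparator, which this sketch does not measure.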
