Re: Patch: add conversion from pg_wchar to multibyte

Поиск

Список

Период

Сортировка

От	Robert Haas
Тема	Re: Patch: add conversion from pg_wchar to multibyte
Дата	2 мая 2012 г. 13:48:49
Msg-id	CA+Tgmobi0hDCfq7cF91QOZBUF0KK09Q8dxydB1u6FsmisgtghQ@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Patch: add conversion from pg_wchar to multibyte (Alexander Korotkov <aekorotkov@gmail.com>)
Ответы	Re: Patch: add conversion from pg_wchar to multibyte (Alexander Korotkov <aekorotkov@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

On Wed, May 2, 2012 at 9:35 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
>> I was thinking you could perhaps do it just based on the *number* of
>> trigrams, not necessarily their frequency.
>
> Imagine we've two queries:
> 1) SELECT * FROM tbl WHERE col LIKE '%abcd%';
> 2) SELECT * FROM tbl WHERE col LIKE '%abcdefghijk%';
>
> The first query require reading posting lists of trigrams "abc" and "bcd".
> The second query require reading posting lists of trigrams "abc", "bcd",
> "cde", "def", "efg", "fgh", "ghi", "hij" and "ijk".
> We could decide to use index scan for first query and sequential scan for
> second query because number of posting list to read is high. But it is
> unreasonable because actually second query is narrower than the first one.
> We can use same index scan for it, recheck will remove all false positives.
> When number of trigrams is high we can just exclude some of them from index
> scan. It would be better than just decide to do sequential scan. But the
> question is what trigrams to exclude? Ideally we would leave most rare
> trigrams to make index scan cheaper.

True.  I guess I was thinking more of the case where you've got
abc|def|ghi|jkl|mno|pqr|stu|vwx|yza|....  There's probably some point
at which it becomes silly to think about using the index.

>> > Probably you have some comments on idea of conversion from pg_wchar to
>> > multibyte? Is it acceptable at all?
>>
>> Well, I'm not an expert on encodings, but it seems like a logical
>> extension of what we're doing right now, so I don't really see why
>> not.  I'm confused by the diff hunks in pg_mule2wchar_with_len,
>> though.  Presumably either the old code is right (in which case, don't
>> change it) or the new code is right (in which case, there's a bug fix
>> needed here that ought to be discussed and committed separately from
>> the rest of the patch).  Maybe I am missing something.
>
> Unfortunately I didn't understand original logic of pg_mule2wchar_with_len.
> I just did proposal about how it could be. I hope somebody more familiar
> with this code would clarify this situation.

Well, do you think the current code is buggy, or not?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Alexander Korotkov
Дата: 02 мая 2012 г., 13:35:39
Сообщение: Re: Patch: add conversion from pg_wchar to multibyte

Следующее

От: Alexander Korotkov
Дата: 02 мая 2012 г., 13:57:44
Сообщение: Re: Patch: add conversion from pg_wchar to multibyte

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Patch: add conversion from pg_wchar to multibyte

Предыдущее

Следующее