Re: Doing better at HINTing an appropriate column within errorMissingColumn()

From Ian Barwick
Subject Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Msg-id 539FA371.4070902@2ndquadrant.com
In reply to Re: Doing better at HINTing an appropriate column within errorMissingColumn()  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Doing better at HINTing an appropriate column within errorMissingColumn()
List pgsql-hackers
On 14/06/17 9:53, Tom Lane wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> On Tue, Jun 17, 2014 at 9:30 AM, Ian Barwick <ian@2ndquadrant.com> wrote:
>>> From what I've seen in the wild in Japan, Roman/ASCII characters are
>>> widely used for object/attribute names, as generally it's much less
>>> hassle than switching between input methods, dealing with different
>>> encodings etc. The only place where I've seen Japanese characters widely
>>> used is in tutorials, examples etc. However that's only my personal
>>> observation for one particular non-Roman language.
> 
>> And I agree to this remark, that's a PITA to manage database object
>> names with Japanese characters directly. I have ever seen some
>> applications using such ways to define objects though in the past, not
>> *that* many I concur..
> 
> What exactly is the rationale for thinking that Levenshtein distance is
> useless in non-Roman alphabets?  AFAIK it just counts insertions and
> deletions of characters, which seems like a concept rather independent
> of what those characters are.

With Japanese (which doesn't have an alphabet, but two syllabaries and
a bunch of logographic characters), Levenshtein distance is pretty useless
for examining similarities with words which can be written in either
syllabary (Michael's "ramen" example earlier in the thread); and when
catching "typos" caused by erroneous conversion from phonetic input to
characters - e.g. intending to input "成長" (seichou, growth) but
accidentally selecting "清聴" (seichou, courteous attention).
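
To make that concrete, here is a minimal sketch. It is not taken from the
patch: the `levenshtein` helper is a textbook implementation written for
illustration, and the kana spellings of "ramen" are my assumption of what
the earlier example looked like. It compares strings code point by code
point, which is roughly what a character-level distance would see.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook edit distance over Unicode code points (illustrative only)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Same reading (seichou), different kanji: both characters differ.
d_kanji = levenshtein("成長", "清聴")

# Same word in hiragana vs katakana (assumed spellings); only the
# long-vowel mark is shared between the two.
d_kana = levenshtein("らーめん", "ラーメン")

# A romanized typo for comparison: one dropped letter.
d_roman = levenshtein("seichou", "seicho")

print(d_kanji, d_kana, d_roman)
```

In the first two cases the distance is as large, or nearly as large, as the
string itself, so the "similar" spelling looks almost unrelated; the
romanized typo stays a single edit away.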

However, in this particular use case, as long as it doesn't produce false
positives (I haven't looked at the patch), I don't think it would cause
any problems (of the kind which would require actively excluding certain
languages/character sets); it just wouldn't be quite as useful.


Regards

Ian Barwick

-- 
 Ian Barwick                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


