Re: Hostnames, IDNs, Punycode and Unicode Case Folding

From: Andrew Sullivan
Subject: Re: Hostnames, IDNs, Punycode and Unicode Case Folding
Date:
Msg-id: 20141230015309.GJ54847@crankycanuck.ca
In reply to: Re: Hostnames, IDNs, Punycode and Unicode Case Folding  (Mike Cardwell <pgsql@lists.grepular.com>)
List: pgsql-general
On Tue, Dec 30, 2014 at 12:53:42AM +0000, Mike Cardwell wrote:
> > Hmm.  How did you get the original, then?
>
> The "original" in my case, is the hostname which the end user supplied.
> Essentially, when I display it back to them, I want to make sure it is
> displayed the same way that it was when they originally submitted it.

Ah.  This gets even better™ for you, then, because whereas in IDNA2003
you can pass it an old-fashioned LDH name (letter, digit, hyphen),
IDNA2008 treats those as _outside_ the spec.  So basically, you first
have to get a label and determine whether it is LDH or not (you can do
this by checking for any octet outside the LDH range) and then you can
decide which way to process it.  In IDNA2003, the punycode output from
an LDH label turns out always to be the LDH label.  The reason
IDNA2008 leaves LDH labels out is that you're supposed to validate
that a U-label is really a U-label before registering it, and lots of
perfectly good LDH labels (like EXAMPLE) are not valid U-labels under
IDNA2008 because of upper case.
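
A minimal sketch of that first step, in Python rather than Perl.  The
helper names (is_ldh_label, to_ascii_2003) are made up for
illustration; the only library call is Python's built-in "idna" codec,
which implements IDNA2003.  It also shows why the user's original
spelling has to be stored separately if it is to be displayed back
unchanged: a non-LDH label gets case-folded on the way to its
punycode form.

```python
import string

# Letters, digits and hyphen: the only characters allowed in an LDH label.
LDH_CHARS = frozenset(string.ascii_letters + string.digits + "-")


def is_ldh_label(label: str) -> bool:
    """True if the label is 1-63 characters of letters, digits and
    hyphens, and does not start or end with a hyphen."""
    return (
        0 < len(label) <= 63
        and all(ch in LDH_CHARS for ch in label)
        and not label.startswith("-")
        and not label.endswith("-")
    )


def to_ascii_2003(label: str) -> str:
    """IDNA2003 conversion of a single label via the built-in codec."""
    return label.encode("idna").decode("ascii")


for label in ("EXAMPLE", "BÜCHER"):
    if is_ldh_label(label):
        # LDH labels come back from IDNA2003 exactly as they went in.
        print(label, "is LDH ->", to_ascii_2003(label))
    else:
        # Non-LDH labels are case-folded before encoding, so the
        # original spelling cannot be recovered from the result.
        print(label, "is not LDH ->", to_ascii_2003(label))
```

Under IDNA2008 the LDH branch would skip the IDNA library entirely,
and the other branch would have to reject or pre-map upper case
before conversion.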

(If by now you think that maybe it's time for this DNS thing to get
replaced, you have company.)

> I was unaware of the different versions of IDNA. I basically started using
> the Perl module IDNA::Punycode in my project and assumed that this was the
> only type. Seems like I need to do some more reading.

Yeah, this is all made much harder by the fact that several IDN
libraries still do 2003.  Here is one that many people are using for
IDNA2008:
<https://gitorious.org/libidn2/libidn2/source/0d6b5c0a9f1e4a9742c5ce32b6241afb4910cae1:>
It's GPLv3, though, which brings its own issues.
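
If you are not sure which flavour a given library implements, one
rough heuristic is to feed it a label the two versions treat
differently.  IDNA2003 maps "ß" to "ss", whereas IDNA2008 allows "ß"
as a character in its own right.  The sketch below is in Python; the
function names are hypothetical, and the probe only says that a
converter behaves like IDNA2003 on this one input (a library applying
a compatibility mapping before IDNA2008 would look the same).

```python
from typing import Callable


def to_ascii_builtin(label: str) -> str:
    """Python's built-in "idna" codec, which implements IDNA2003."""
    return label.encode("idna").decode("ascii")


def looks_like_idna2003(to_ascii: Callable[[str], str]) -> bool:
    """Heuristic probe: does the converter turn "ß" into "ss"?"""
    try:
        return to_ascii("straße") == "strasse"
    except Exception:
        # A converter that rejects the label is not doing the
        # IDNA2003 mapping either.
        return False


print(looks_like_idna2003(to_ascii_builtin))
```

Any other library can be probed the same way by wrapping its
conversion call in a function that takes and returns a string.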

A

--
Andrew Sullivan
ajs@crankycanuck.ca

