Re: [SPAM] [MessageLimit][lowlimit] Re: pl/perl and utf-8 in sql_ascii databases

From Kyotaro HORIGUCHI
Subject Re: [SPAM] [MessageLimit][lowlimit] Re: pl/perl and utf-8 in sql_ascii databases
Date
Msg-id 20120712.130919.67152096.horiguchi.kyotaro@lab.ntt.co.jp
In reply to Re: [SPAM] [MessageLimit][lowlimit] Re: pl/perl and utf-8 in sql_ascii databases  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: pl/perl and utf-8 in sql_ascii databases  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Re: pl/perl and utf-8 in sql_ascii databases  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
Hmm... Sorry for the immature patch.

> ... and this story hasn't ended yet, because one of the new tests is
> failing.  See here:
> 
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=magpie&dt=2012-07-11%2010%3A00%3A04
> 
> The interesting part of the diff is:
...
>   SELECT encode(perl_utf_inout(E'ab\xe5\xb1\xb1cd')::bytea, 'escape')
> ! ERROR:  character with byte sequence 0xe5 0xb7 0x9d in encoding "UTF8" has no equivalent in encoding "LATIN1"
> ! CONTEXT:  PL/Perl function "perl_utf_inout"
> 
> 
> I am not sure what can we do here other than remove this function and
> query from the test.

I ran the regression tests only in environments capable of
handling the character U+5DDD (a Japanese character meaning
"river").

Both the byte sequences that can be decoded and the byte
sequences produced by encoding a given Unicode character vary
from one encoding to another.
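
To illustrate the point, here is a minimal sketch in Python. It uses
Python's own codecs as stand-ins for the server's conversion routines,
which is an assumption made only for illustration; the helper name
encode_or_none is hypothetical. It shows that U+5DDD encodes to the
bytes 0xe5 0xb7 0x9d reported in the buildfarm error, encodes to
different bytes in EUC_JP, and has no equivalent in LATIN1.

```python
def encode_or_none(ch, codec):
    """Return the bytes for ch in codec, or None if codec cannot represent it."""
    try:
        return ch.encode(codec)
    except UnicodeEncodeError:
        return None


RIVER = "\u5ddd"  # the character used by the failing test

# Python codec names, used here as rough stand-ins for UTF8, EUC_JP and LATIN1.
for codec in ("utf-8", "euc_jp", "latin-1"):
    encoded = encode_or_none(RIVER, codec)
    print(codec, encoded.hex() if encoded is not None else "no equivalent")
```

The same character therefore yields a different expected output file,
or an error, depending on the database encoding.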

The problem that this thread set out to fix could be covered
without the additional test. What has to be confirmed is whether
encoding/decoding is done as expected when the language handler
is called. I suppose that testing the two cases (sql_ascii and
utf8) plus one additional case that goes through
pg_do_encoding_conversion(), say latin1, would be enough to
confirm that encoding/decoding is done properly, since the
concrete conversion scheme is not significant in this case.
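
The three cases can be modelled roughly as follows. This is a Python
sketch of the idea only, not PL/Perl's actual C code: the function
utf8_to_server and the PY_CODEC table are hypothetical, and the
assumption is that sql_ascii and utf8 databases take the bytes from
Perl unchanged while any other encoding goes through a real
conversion, as pg_do_encoding_conversion() would do.

```python
# Hypothetical mapping from server encoding names to Python codecs.
PY_CODEC = {"LATIN1": "latin-1", "EUC_JP": "euc_jp"}


def utf8_to_server(utf8_bytes, server_encoding):
    """Model the conversion of a UTF-8 string from Perl to the server encoding."""
    if server_encoding in ("UTF8", "SQL_ASCII"):
        # Assumed pass-through path: the bytes are handed over as they are.
        return utf8_bytes
    # Conversion path: raises UnicodeEncodeError when the target encoding
    # has no equivalent for a character.
    return utf8_bytes.decode("utf-8").encode(PY_CODEC[server_encoding])


e_acute = "\u00e9".encode("utf-8")               # b'\xc3\xa9'
print(utf8_to_server(e_acute, "SQL_ASCII").hex())  # pass-through path
print(utf8_to_server(e_acute, "LATIN1").hex())     # conversion path
```

One test per path exercises both branches; which particular
conversion runs on the second path does not change what is verified.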

So I recommend that we add the test for latin1 and omit the test
for encodings other than sql_ascii, utf8 and latin1. This might
be achieved by creating empty plperl_lc.sql and plperl_lc.out
files for those encodings.

What do you think about that?


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.

