Re: pl/perl and utf-8 in sql_ascii databases

From	Alvaro Herrera
Subject	Re: pl/perl and utf-8 in sql_ascii databases
Date	July 13, 2012, 17:52:55
Msg-id	1342201377-sup-3678@alvh.no-ip.org
In response to	Re: [SPAM] [MessageLimit][lowlimit] Re: pl/perl and utf-8 in sql_ascii databases (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses	Re: pl/perl and utf-8 in sql_ascii databases (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List	pgsql-hackers

Excerpts from Kyotaro HORIGUCHI's message of Thu Jul 12 00:09:19 -0400 2012:
>
> Hmm... Sorry for immature patch..

No need to apologize.

> > ... and this story hasn't ended yet, because one of the new tests is
> > failing.  See here:
> >
> > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=magpie&dt=2012-07-11%2010%3A00%3A04
> >
> > The interesting part of the diff is:
> ...
> >   SELECT encode(perl_utf_inout(E'ab\xe5\xb1\xb1cd')::bytea, 'escape')
> > ! ERROR:  character with byte sequence 0xe5 0xb7 0x9d in encoding "UTF8" has no equivalent in encoding "LATIN1"
> > ! CONTEXT:  PL/Perl function "perl_utf_inout"
> >
> >
> > I am not sure what can we do here other than remove this function and
> > query from the test.
>
> I've run the regression tests only in environments capable of
> handling the character U+5ddd (a Japanese character meaning "river")...
>
> The byte sequences which can be decoded and the result byte
> sequences of encoding from a unicode character vary among the
> encodings.

Right.  I only ran the test in C and UTF8, not Latin1, so I didn't see
it fail either.
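
For readers who want to reproduce the mismatch outside the server, here is a
minimal sketch in Python. It is only an illustration: Python's codecs stand in
for PostgreSQL's conversion routines, and the helper name try_convert is
hypothetical. The bytes are the ones reported in the buildfarm error above.

```python
# Bytes from the buildfarm error message: 0xe5 0xb7 0x9d (UTF-8 for U+5DDD).
RIVER_UTF8 = bytes([0xE5, 0xB7, 0x9D])


def try_convert(utf8_bytes: bytes, target: str) -> str:
    """Decode one UTF-8 character, then try to re-encode it in `target`.

    Hypothetical helper; it only mimics the shape of a server-side
    encoding conversion, using Python's codec names as stand-ins.
    """
    text = utf8_bytes.decode("utf-8")
    codepoint = f"U+{ord(text):04X}"
    try:
        encoded = text.encode(target)
    except UnicodeEncodeError:
        # Analogous to "has no equivalent in encoding ..." on the server.
        return f"{codepoint} has no equivalent in {target}"
    return f"{codepoint} -> {encoded.hex()} in {target}"


if __name__ == "__main__":
    for target in ("utf-8", "latin-1", "euc_jp"):
        print(try_convert(RIVER_UTF8, target))
```

The character survives the round trip in UTF-8 and has some representation in
a Japanese encoding, but Latin-1 has no slot for it, which is why a test that
passes under C and UTF8 can still fail on a Latin1 buildfarm member.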

> The problem itself which is the aim of this thread could be
> covered without the additional test. That confirms if
> encoding/decoding is done as expected on calling the language
> handler.

Right.

> I suppose that testing for the two cases and additional
> one case which runs pg_do_encoding_conversion(), say latin1,
> would be enough to confirm that encoding/decoding is properly
> done, since the concrete conversion scheme is not significant
> this case.
>
> So I recommend that we add the test for latin1 and omit the test
> for encodings other than sql_ascii, utf8 and latin1. This might
> be achieved by creating empty plperl_lc.sql and plperl_lc.out
> files for those encodings.
>
> What do you think about that?

I think that's probably too much engineering for something that doesn't
really warrant it.  A real solution to this problem could be to create
yet another new test file containing just this function definition and
the query that calls it, and have one expected file for each encoding;
but that's too much work and too many files, I'm afraid.

I can see us supporting tests that require a small number of expected
files.  No Make tricks with file copying, though.  If we can't get
some easy way to test this without that, I submit we should just remove
the test.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
