Re: bug in Google translate snippet

From: Jan Urbański
Subject: Re: bug in Google translate snippet
Message-ID: 4A4DC027.9060802@flumotion.com
In reply to: Re: bug in Google translate snippet (Alvaro Herrera <alvherre@commandprompt.com>)
List: pgsql-hackers

Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>>
>> Alvaro Herrera wrote:
>>> Hi,
>>>
>>> I was having a look at this snippet:
>>> http://wiki.postgresql.org/wiki/Google_Translate
>>> and it turns out that it doesn't work if the result contains non-ASCII
>>> chars.  Does anybody know how to fix it?
>>>
>>> alvherre=# select gtranslate('en', 'es', 'he');
>>> ERROR:  plpython: function "gtranslate" could not create return value
>>> DETALLE:  <type 'exceptions.UnicodeEncodeError'>: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)
 
>> This looks like a python issue rather than a Postgres issue. The problem  
>> is probably in python-simplejson.
> 
> I think the problem happens when the PL tries to create the output
> value.  Otherwise I wouldn't be able to see the value in plpy.log.

The problem is that the value you are trying to return
(resp['responseData']['translatedText']) is a Unicode object, so it
cannot simply be output as-is. The error comes from Python complaining
that you are trying to output a non-ASCII character (u'\xe9') using the
'ascii' codec, which can only encode code points below 128.
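
To see the failure outside the database, something like the following
should reproduce it. This is only a sketch: the resp dictionary is a
made-up stand-in for the decoded JSON response, and the explicit
encode('ascii') call is my assumption about what effectively happens
when the return value gets converted implicitly.

```python
# Hypothetical stand-in for the decoded JSON response from the snippet.
resp = {'responseData': {'translatedText': u'\xe9l'}}

text = resp['responseData']['translatedText']

# Encoding with the 'ascii' codec fails on the first character (u'\xe9').
error = None
try:
    text.encode('ascii')
except UnicodeEncodeError as exc:
    error = exc

if error is not None:
    print("%s codec failed at position %d: %s"
          % (error.encoding, error.start, error.reason))
```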

One solution is to encode the Unicode string explicitly, that is, to
ask Python to convert the Unicode object into a byte string using a
specific encoding; UTF-8 is a good choice here. For instance,

    return resp['responseData']['translatedText'].encode('utf-8')

worked for me.
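
As a rough illustration of what the explicit encode does, here is a
small standalone sketch. The resp dictionary is again a hypothetical
stand-in for the real response, and I am assuming the database encoding
is UTF-8 so that the resulting bytes are what the server expects.

```python
# Hypothetical stand-in for the decoded JSON response from the snippet.
resp = {'responseData': {'translatedText': u'\xe9l'}}

text = resp['responseData']['translatedText']

# Explicitly serialize the Unicode object to UTF-8 bytes.
# u'\xe9' becomes the two bytes 0xC3 0xA9; the ASCII 'l' stays one byte.
encoded = text.encode('utf-8')

# Decoding with the same codec gives back the original Unicode string.
decoded = encoded.decode('utf-8')
```

If the database used a different encoding, the codec name passed to
encode() would presumably have to match it.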

See also http://docs.python.org/tutorial/introduction.html#unicode-strings

Cheers,
Jan

