invalidly encoded strings

From Andrew Dunstan
Subject invalidly encoded strings
Date
Msg-id 46E37054.1040501@dunslane.net
Replies Re: invalidly encoded strings  (Martijn van Oosterhout <kleptog@svana.org>)
Re: invalidly encoded strings  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I have been looking at fixing the issue of accepting strings that are 
not valid in the database encoding. It appears from previous discussion 
that we need to add a call to pg_verifymbstr() to the relevant input 
routines and ensure that the chr() function returns a valid string. That 
leaves several issues:

. which are the relevant input routines? I have identified the following 
as needing remediation: textin(), bpcharin(), varcharin(), anyenum_in(), 
namein().  Do we also need one for cstring_in()? Does the xml code 
handle this as part of xml validation?

. what do we need to do to make the verification code more efficient? I 
think we need to address the correctness issue first, but doing so 
should certainly make us want to improve the verification code. For 
example, I'm wondering if it might benefit from having a tiny cache.

. for chr() under UTF8, it seems to be generally agreed that the 
argument should represent the codepoint and the function should return 
the correspondingly encoded character. If so, possibly the argument 
should be a bigint to accommodate the full range of possible code 
points. It is not clear what the argument should represent for other 
multi-byte encodings for any argument higher than 127. Similarly, it is 
not clear what ascii() should return in such cases. I would be inclined 
just to error out.
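
To make the UTF8 case concrete, here is a minimal standalone sketch in C of 
the two pieces discussed above: encoding a code point the way chr() might 
under UTF8, and checking that a byte string is well-formed UTF-8. The names 
sketch_utf8_encode() and sketch_utf8_is_valid() are hypothetical and are not 
PostgreSQL functions; this is not how pg_verifymbstr() is implemented, only an 
illustration of the rules involved. Unicode code points end at U+10FFFF, so a 
32-bit unsigned value is used for the argument here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a PostgreSQL function): encode code point cp as
 * UTF-8 into out[], which must have room for 4 bytes.  Returns the number of
 * bytes written, or 0 when cp is a surrogate or lies beyond U+10FFFF, which
 * is where a chr()-like caller would raise an error.
 */
int
sketch_utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);

    if (cp <= 0x7F)
    {
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp <= 0x7FF)
    {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;               /* surrogates are not valid characters */
    if (cp <= 0xFFFF)
    {
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF)
    {
        out[0] = (unsigned char) (0xF0 | (cp >> 18));
        out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                   /* beyond the Unicode range */
}

/*
 * Hypothetical helper: return 1 if s[0..len-1] is well-formed UTF-8, else 0.
 * Rejects truncated sequences, stray continuation bytes, overlong forms,
 * surrogates, and values above U+10FFFF.  ASCII bytes take a short path.
 */
int
sketch_utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t      i = 0;

    while (i < len)
    {
        unsigned char c = s[i];
        size_t      n, k;
        uint32_t    cp, min;

        if (c < 0x80)
        {
            i++;                /* ASCII: nothing more to check */
            continue;
        }
        else if ((c & 0xE0) == 0xC0) { n = 2; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 3; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 4; cp = c & 0x07; min = 0x10000; }
        else
            return 0;           /* stray continuation or invalid lead byte */

        if (len - i < n)
            return 0;           /* sequence runs past the end of the input */

        for (k = 1; k < n; k++)
        {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;       /* not a continuation byte */
            cp = (cp << 6) | (uint32_t) (s[i + k] & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;           /* overlong, out of range, or surrogate */

        i += n;
    }
    return 1;
}
```

With this sketch, code point 0x20AC encodes to the three bytes E2 82 AC, and 
the two-byte sequence C0 80 (an overlong encoding of NUL) is rejected by the 
validator. What the equivalent rules should be for the other multi-byte 
encodings is exactly the open question above.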

cheers

andrew

