Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding.
From | Alex Hunsaker |
---|---|
Subject | Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding. |
Date | |
Msg-id | CAFaPBrSrsKFL7tJ2HM1Z6UvsjMGv19Q2vkDQi=rXSnYfE=Mv5w@mail.gmail.com |
In response to | Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding. (Amit Khandekar <amit.khandekar@enterprisedb.com>) |
Responses | Re: Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding. |
List | pgsql-hackers |
On Tue, Oct 4, 2011 at 23:46, Amit Khandekar
<amit.khandekar@enterprisedb.com> wrote:
> On 4 October 2011 22:57, Alex Hunsaker <badalex@gmail.com> wrote:
>> On Tue, Oct 4, 2011 at 03:09, Amit Khandekar
>> <amit.khandekar@enterprisedb.com> wrote:
>>> On 4 October 2011 14:04, Alex Hunsaker <badalex@gmail.com> wrote:
>>>> On Mon, Oct 3, 2011 at 23:35, Amit Khandekar
>>>> <amit.khandekar@enterprisedb.com> wrote:
>>>>
>>>>> WHen GetDatabaseEncoding() != PG_UTF8 case, ret will not be equal to
>>>>> utf8_str, so pg_verify_mbstr_len() will not get called. [...]
>>>>
>>>> Consider a latin1 database where utf8_str was a string of ascii
>>>> characters. [...]
>>
>>>> [Patch] Look ok to you?
>>>>
>>>
>>> + if(GetDatabaseEncoding() == PG_UTF8)
>>> +     pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
>>>
>>> In your patch, the above will again skip mb-validation if the database
>>> encoding is SQL_ASCII. Note that in pg_do_encoding_conversion returns
>>> the un-converted string even if *one* of the src and dest encodings is
>>> SQL_ASCII.
>>
>> *scratches head* I thought the point of SQL_ASCII was no encoding
>> conversion was done and so there would be nothing to verify.
>>
>> Ahh I see looks like pg_verify_mbstr_len() will make sure there are no
>> NULL bytes in the string when we are a single byte encoding.
>>
>>> I think :
>>> if (ret == utf8_str)
>>> + {
>>> +     pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
>>>       ret = pstrdup(ret);
>>> + }
>>>
>>> This (ret == utf8_str) condition would be a reliable way for knowing
>>> whether pg_do_encoding_conversion() has done the conversion at all.
>>
>> Yes. However (and maybe im nitpicking here), I dont see any reason to
>> verify certain strings twice if we can avoid it.
>>
>> What do you think about:
>> + /*
>> +  * when we are a PG_UTF8 or SQL_ASCII database pg_do_encoding_conversion()
>> +  * will not do any conversion or verification. we need to do it
>> manually instead.
>> +  */
>> + if( GetDatabaseEncoding() == PG_UTF8 ||
>> GetDatabaseEncoding() == SQL_ASCII)
>> +     pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
>>
>
> You mean the final changes in plperl_helpers.h would look like
> something like this right? :
>
> static inline char *
> utf_u2e(const char *utf8_str, size_t len)
> {
>     char *ret = (char *) pg_do_encoding_conversion((unsigned
> char *) utf8_str, len, PG_UTF8, GetDatabaseEncoding());
>
>     if (ret == utf8_str)
> +   {
> +       if (GetDatabaseEncoding() == PG_UTF8 ||
> +           GetDatabaseEncoding() == PG_SQL_ASCII)
> +       {
> +           pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
> +       }
> +
>         ret = pstrdup(ret);
> +   }
>     return ret;
> }

Yes.

> Yeah I am ok with that. It's just an additional check besides (ret ==
> utf8_str) to know if we really require validation.
>
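For readers who want to try the control flow discussed in this message outside the PostgreSQL tree, here is a minimal standalone sketch. Everything prefixed with demo_ or DEMO_ is a hypothetical stand-in invented for illustration, not PostgreSQL's API: demo_convert_from_utf8() only mimics the "return the source pointer when nothing was converted" behaviour described above, and demo_verify_utf8() is a simplified check (NUL bytes and sequence structure only). It also assumes failure is signalled by returning NULL, whereas pg_verify_mbstr_len() with noError = false raises an error instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for PostgreSQL's encoding identifiers. */
typedef enum { DEMO_SQL_ASCII, DEMO_UTF8, DEMO_LATIN1 } demo_encoding;

/* Counts verifier runs so the "verify only once" behaviour is observable. */
int demo_verify_calls = 0;

/* Simplified UTF-8 check: no NUL bytes, well-formed lead/continuation bytes. */
bool demo_verify_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    demo_verify_calls++;
    while (i < len) {
        unsigned char c = s[i];
        size_t n;

        if (c == 0) return false;
        if (c < 0x80) n = 1;
        else if ((c & 0xE0) == 0xC0) n = 2;
        else if ((c & 0xF0) == 0xE0) n = 3;
        else if ((c & 0xF8) == 0xF0) n = 4;
        else return false;
        if (i + n > len) return false;
        for (size_t k = 1; k < n; k++)
            if ((s[i + k] & 0xC0) != 0x80) return false;
        i += n;
    }
    return true;
}

/*
 * Stand-in conversion: hands back the source pointer untouched for UTF8 and
 * SQL_ASCII targets; for LATIN1 it builds a new buffer and so validates the
 * input as a side effect. Returns NULL on failure.
 */
char *demo_convert_from_utf8(const char *src, size_t len, demo_encoding dest)
{
    if (dest == DEMO_UTF8 || dest == DEMO_SQL_ASCII)
        return (char *) src;

    char *out = malloc(len + 1);
    size_t o = 0;

    if (out == NULL) return NULL;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char) src[i];

        if (c != 0 && c < 0x80) {
            out[o++] = (char) c;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < len &&
                   ((unsigned char) src[i + 1] & 0xC0) == 0x80) {
            /* Two-byte sequence in the Latin-1 range U+0080..U+00FF. */
            out[o++] = (char) (((c & 0x03) << 6) |
                               ((unsigned char) src[i + 1] & 0x3F));
            i++;
        } else {
            free(out);
            return NULL;
        }
    }
    out[o] = '\0';
    return out;
}

/* Mirrors the shape of the patched utf_u2e() quoted above. */
char *demo_utf_u2e(const char *utf8_str, size_t len, demo_encoding db_enc)
{
    assert(utf8_str != NULL);

    char *ret = demo_convert_from_utf8(utf8_str, len, db_enc);

    if (ret == utf8_str) {
        /* No conversion happened, so nothing has validated the bytes yet. */
        if ((db_enc == DEMO_UTF8 || db_enc == DEMO_SQL_ASCII) &&
            !demo_verify_utf8((const unsigned char *) utf8_str, len))
            return NULL;

        /* Always hand back a private copy, as pstrdup() does in the patch. */
        ret = malloc(len + 1);
        if (ret == NULL) return NULL;
        memcpy(ret, utf8_str, len);
        ret[len] = '\0';
    }
    return ret;
}
```

In this sketch a UTF8 or SQL_ASCII "database" triggers exactly one explicit verification per call, while a LATIN1 target relies on the conversion itself and never calls the verifier, which is the double-verification saving argued for above.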