Tim Bunce wrote:
> On Fri, Dec 25, 2009 at 12:54:13PM -0500, Andrew Dunstan wrote:
>
>> Tim Bunce wrote:
>>
>>> I've attached an update of my previous refactoring of plperl.c.
>>> It's been rebased over the current (git) HEAD and has a few
>>> very minor additions.
>>>
>>>
>> [snip]
>>
>>> + -- Test compilation of unicode regex
>>> + --
>>> + CREATE OR REPLACE FUNCTION perl_unicode_regex(text) RETURNS INTEGER AS $$
>>> + # see http://rt.perl.org/rt3/Ticket/Display.html?id=47576
>>> + return ($_[0] =~ /\x{263A}|happy/i) ? 1 : 0; # unicode smiley
>>> + $$ LANGUAGE plperl;
>>>
>> This test is failing on my setup at least when the target db is not UTF8
>> encoded.
>>
>> Maybe that's a bug we need to fix?
>>
>
> Yes. I believe the test is highlighting an existing problem: plperl
> functions in non-PG_UTF8 databases can't use regular expressions that
> require Unicode character metadata.
>
> Either the (GetDatabaseEncoding() == PG_UTF8) check in plperl_safe_init()
> should be removed, so the utf8fix function is always called, or the
> regression test should be removed (or hacked to only apply to PG_UTF8
> databases).
>
I tried forcing the utf8fix call regardless of the database encoding, but
it doesn't seem to work, possibly because when the db is not UTF8 we
aren't forcing argument strings to UTF8 :-(
I think we might need to remove the regression test from the patch.
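
To illustrate the guess about argument strings, here is a minimal standalone Perl sketch, run outside plperl and without Safe. It only assumes that an argument can reach the function as undecoded octets rather than as a character string; it does not reproduce what plperl.c actually does, and the helper name matches_smiley is made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Same pattern as the regression test quoted above.
# matches_smiley is a hypothetical helper name, not part of the patch.
sub matches_smiley {
    my ($str) = @_;
    return ($str =~ /\x{263A}|happy/i) ? 1 : 0;
}

# Assumption: the argument arrives as raw octets
# (here, the UTF-8 bytes that encode U+263A).
my $octets = "\xE2\x98\xBA";

# Decoding turns those three octets into the single character U+263A.
my $chars = decode('UTF-8', $octets);

printf "octets: %d\n", matches_smiley($octets);   # no character U+263A present
printf "chars:  %d\n", matches_smiley($chars);    # matches the smiley
printf "ascii:  %d\n", matches_smiley('Happy');   # matches via /i
```

If arguments really do arrive undecoded in a non-UTF8 database, the first case is the one the function would see, so it would return 0 even once the regex compiles.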
>
> p.s. There may be other problems using unicode in non-PG_UTF8 databases,
> but I believe this patch doesn't change the behaviour for better or worse.
>
>
Right.
cheers
andrew