Обсуждение: Initial refactoring of plperl.c - rebased [PATCH]
I've attached an update of my previous refactoring of plperl.c. It's been rebased over the current (git) HEAD and has a few very minor additions. Background: I've started work on the enhancements to plperl I outlined on pg-general (in the "Wishlist of PL/Perl Enhancements for 8.5" thread). I have a working implementation of those changes, plus some performance enhancements, that I'm now re-working into a clean set of tested and polished patches. This patch is a first step that doesn't add any extra functionality. It refactors the internals to make adding the extra functionality easier (and more clearly visible). Changes in this patch: - Changed MULTIPLICITY check from runtime to compiletime. No loads the large Config module. - Changed plperl_init_interp() to return new interp and not alter the global interp_state - Moved plperl_safe_init() call into check_interp(). - Removed plperl_safe_init_done state variable as interp_state now covers that role. - Changed plperl_create_sub() to take a plperl_proc_desc argument. - Simplified return value handling in plperl_create_sub. - Added a test for the effect of the utf8fix function. - Changed perl.com link in the docs to perl.org and tweaked wording to clarify that require, not use, is what's blocked. - Moved perl code in large multi-line C string literal macros out to plc_*.pl files. - Added a test2macro.pl utility to convert the plc_*.pl files to macros in a perlchunks.h file which is #included Additions since previous verion: - Replaced calls to SvPV(val, PL_na) with SvPV_nolen(val) - Simplifed plperl_safe_init() slightly - Removed trailing whitespace from new plc_*.pl files. I'd appreciate any feedback on the patch. Tim.
Вложения
Tim, Since there's a commitfest on right now, meaningful feedback on your patch could be delayed. Just so you know. --Josh Berkus
On Thu, Dec 03, 2009 at 03:47:21PM -0800, Josh Berkus wrote: > Tim, > > Since there's a commitfest on right now, meaningful feedback on your > patch could be delayed. Just so you know. Understood. Thanks Josh. Tim.
Tim Bunce wrote: > I've attached an update of my previous refactoring of plperl.c. > It's been rebased over the current (git) HEAD and has a few > very minor additions. > > [snip] > + -- Test compilation of unicode regex > + -- > + CREATE OR REPLACE FUNCTION perl_unicode_regex(text) RETURNS INTEGER AS $$ > + # see http://rt.perl.org/rt3/Ticket/Display.html?id=47576 > + return ($_[0] =~ /\x{263A}|happy/i) ? 1 : 0; # unicode smiley > + $$ LANGUAGE plperl; > This test is failing on my setup at least when the target db is not UTF8 encoded. Maybe that's a bug we need to fix? cheers andrew
On Fri, Dec 25, 2009 at 12:54:13PM -0500, Andrew Dunstan wrote: > Tim Bunce wrote: >> I've attached an update of my previous refactoring of plperl.c. >> It's been rebased over the current (git) HEAD and has a few >> very minor additions. >> > [snip] >> + -- Test compilation of unicode regex >> + -- >> + CREATE OR REPLACE FUNCTION perl_unicode_regex(text) RETURNS INTEGER AS $$ >> + # see http://rt.perl.org/rt3/Ticket/Display.html?id=47576 >> + return ($_[0] =~ /\x{263A}|happy/i) ? 1 : 0; # unicode smiley >> + $$ LANGUAGE plperl; > > This test is failing on my setup at least when the target db is not UTF8 > encoded. > > Maybe that's a bug we need to fix? Yes. I believe the test is highlighting an existing problem: that plperl function in non-PG_UTF8 databases can't use regular expressions that require unicode character meta-data. Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init() should be removed, so the utf8fix function is always called, or the test should be removed (or hacked to only apply to PG_UTF8 databases). Tim. p.s. There may be other problems using unicode in non-PG_UTF8 databases, but I believe this patch doesn't change the behaviour for better or worse.
Tim Bunce wrote: > On Fri, Dec 25, 2009 at 12:54:13PM -0500, Andrew Dunstan wrote: > >> Tim Bunce wrote: >> >>> I've attached an update of my previous refactoring of plperl.c. >>> It's been rebased over the current (git) HEAD and has a few >>> very minor additions. >>> >>> >> [snip] >> >>> + -- Test compilation of unicode regex >>> + -- >>> + CREATE OR REPLACE FUNCTION perl_unicode_regex(text) RETURNS INTEGER AS $$ >>> + # see http://rt.perl.org/rt3/Ticket/Display.html?id=47576 >>> + return ($_[0] =~ /\x{263A}|happy/i) ? 1 : 0; # unicode smiley >>> + $$ LANGUAGE plperl; >>> >> This test is failing on my setup at least when the target db is not UTF8 >> encoded. >> >> Maybe that's a bug we need to fix? >> > > Yes. I believe the test is highlighting an existing problem: that plperl > function in non-PG_UTF8 databases can't use regular expressions that > require unicode character meta-data. > > Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init() > should be removed, so the utf8fix function is always called, or the > test should be removed (or hacked to only apply to PG_UTF8 databases). > I tried forcing the test, but it doesn't seem to work, possibly because in the case that the db is not utf8 we aren't forcing argument strings to UTF8 :-( I think we might need to remove the test from the patch. > > p.s. There may be other problems using unicode in non-PG_UTF8 databases, > but I believe this patch doesn't change the behaviour for better or worse. > > Right. cheers andrew
Andrew Dunstan wrote: >> >> Yes. I believe the test is highlighting an existing problem: that plperl >> function in non-PG_UTF8 databases can't use regular expressions that >> require unicode character meta-data. >> >> Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init() >> should be removed, so the utf8fix function is always called, or the >> test should be removed (or hacked to only apply to PG_UTF8 databases). >> > > > I tried forcing the test, but it doesn't seem to work, possibly > because in the case that the db is not utf8 we aren't forcing argument > strings to UTF8 :-( > > I think we might need to remove the test from the patch. > > I have not been able to come up with a fix for this - the whole thing seems very fragile. I'm going to commit what remains of this patch, but not add the extra regression test. I'll add a TODO to allow plperl to do utf8 operations in non-utf8 databases. cheers andrew
On Mon, Jan 04, 2010 at 06:38:03PM -0500, Andrew Dunstan wrote: > Andrew Dunstan wrote: > >> > >>Yes. I believe the test is highlighting an existing problem: that plperl > >>function in non-PG_UTF8 databases can't use regular expressions that > >>require unicode character meta-data. > >> > >>Either the (GetDatabaseEncoding() == PG_UTF8) test in plperl_safe_init() > >>should be removed, so the utf8fix function is always called, or the > >>test should be removed (or hacked to only apply to PG_UTF8 databases). > > > >I tried forcing the test, but it doesn't seem to work, possibly > >because in the case that the db is not utf8 we aren't forcing > >argument strings to UTF8 :-( > > > >I think we might need to remove the test from the patch. > > I have not been able to come up with a fix for this - the whole > thing seems very fragile. I'm going to commit what remains of this > patch, but not add the extra regression test. I'll add a TODO to > allow plperl to do utf8 operations in non-utf8 databases. I see you've not commited it yet, so to help out I've attached a new diff, over the current CVS head, with two minor changes: - Removed the test, as noted above. - Optimized pg_verifymbstr calls to avoid unneeded strlen()s. This should apply cleanly to cvs, saving you the need to resolve the conflicts caused by the recent pg_verifymbstr patch. I'll add it to the commitfest once it reaches the archives. Tim.
Вложения
Tim Bunce wrote: > > I see you've not commited it yet, so to help out I've attached > a new diff, over the current CVS head, with two minor changes: > > - Removed the test, as noted above. > - Optimized pg_verifymbstr calls to avoid unneeded strlen()s. > > I have committed this with minor edits. That should give us a clean base for the features patch(es). cheers andrew
On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote: > *** 45,50 **** > --- 45,55 ---- > > include $(top_srcdir)/src/Makefile.shlib > > + plperl.o: perlchunks.h > + > + perlchunks.h: plc_*.pl > + $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl > > perlchunks.htmp > + mv perlchunks.htmp perlchunks.h > > all: all-lib > What's the reason for the temp file here?
On Fri, Jan 08, 2010 at 09:47:01PM -0500, Andrew Dunstan wrote: > Tim Bunce wrote: > > > >I see you've not commited it yet, so to help out I've attached > >a new diff, over the current CVS head, with two minor changes: > > > >- Removed the test, as noted above. > >- Optimized pg_verifymbstr calls to avoid unneeded strlen()s. > > I have committed this with minor edits. That should give us a clean > base for the features patch(es). Wonderful. Many thanks Andrew. Tim.
On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote: > On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote: > > *** 45,50 **** > > --- 45,55 ---- > > > > include $(top_srcdir)/src/Makefile.shlib > > > > + plperl.o: perlchunks.h > > + > > + perlchunks.h: plc_*.pl > > + $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl > > > perlchunks.htmp > > + mv perlchunks.htmp perlchunks.h > > > > all: all-lib > > What's the reason for the temp file here? Defensive. If the text2macro.pl program fails/dies then you'd be left with a broken output file with a newer timestamp, so the next make wouldn't rerun text2macro.pl. Tim. p.s. In the makefile for perl we use a little utility called mv_if_diff instead of a plain mv so that any downstream dependencies only get rebuilt if the contents have changed.
On Sat, Jan 09, 2010 at 11:49:22PM +0000, Tim Bunce wrote: > On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote: > > On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote: > > > *** 45,50 **** > > > --- 45,55 ---- > > > > > > include $(top_srcdir)/src/Makefile.shlib > > > > > > + plperl.o: perlchunks.h > > > + > > > + perlchunks.h: plc_*.pl > > > + $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl > > > > perlchunks.htmp > > > + mv perlchunks.htmp perlchunks.h > > > > > > all: all-lib > > > > What's the reason for the temp file here? > > Defensive. If the text2macro.pl program fails/dies then you'd be left > with a broken output file with a newer timestamp, so the next make > wouldn't rerun text2macro.pl. An alternative would be for text2macro.pl to write to a temp file and rename at the end. Tim.
On sön, 2010-01-10 at 00:03 +0000, Tim Bunce wrote: > On Sat, Jan 09, 2010 at 11:49:22PM +0000, Tim Bunce wrote: > > On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote: > > > On fre, 2010-01-08 at 12:46 +0000, Tim Bunce wrote: > > > > *** 45,50 **** > > > > --- 45,55 ---- > > > > > > > > include $(top_srcdir)/src/Makefile.shlib > > > > > > > > + plperl.o: perlchunks.h > > > > + > > > > + perlchunks.h: plc_*.pl > > > > + $(PERL) text2macro.pl --strip='^(\#.*|\s*)$$' plc_*.pl > > > > > perlchunks.htmp > > > > + mv perlchunks.htmp perlchunks.h > > > > > > > > all: all-lib > > > > > > What's the reason for the temp file here? > > > > Defensive. If the text2macro.pl program fails/dies then you'd be left > > with a broken output file with a newer timestamp, so the next make > > wouldn't rerun text2macro.pl. > > An alternative would be for text2macro.pl to write to a temp file and > rename at the end. Sounds better. I think any program should be written such that it doesn't produce an output file at all if it cannot produce a correct output file. So use a temp file or a trap or something like that. The makefile should not have to clean up after everyone.
Tim Bunce <Tim.Bunce@pobox.com> writes: > On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote: >> What's the reason for the temp file here? > Defensive. If the text2macro.pl program fails/dies then you'd be left > with a broken output file with a newer timestamp, so the next make > wouldn't rerun text2macro.pl. Doesn't make forcibly remove the target file if the command fails? [ rummages in manual... ] It does if you specify .DELETE_ON_ERROR, which we do (in Makefile.global); so this bit of complication is a waste. regards, tom lane
On Sun, Jan 10, 2010 at 01:16:13AM -0500, Tom Lane wrote: > Tim Bunce <Tim.Bunce@pobox.com> writes: > > On Sat, Jan 09, 2010 at 11:16:18PM +0200, Peter Eisentraut wrote: > >> What's the reason for the temp file here? > > > Defensive. If the text2macro.pl program fails/dies then you'd be left > > with a broken output file with a newer timestamp, so the next make > > wouldn't rerun text2macro.pl. > > Doesn't make forcibly remove the target file if the command fails? > > [ rummages in manual... ] It does if you specify .DELETE_ON_ERROR, > which we do (in Makefile.global); so this bit of complication is > a waste. Okay. Andrew, perhaps you could apply the attached to fix that. (Or I could bundle it into one of the split out plperl feature patches.) Tim.
Вложения
Tim Bunce <Tim.Bunce@pobox.com> writes: > On Sun, Jan 10, 2010 at 01:16:13AM -0500, Tom Lane wrote: >> Doesn't make forcibly remove the target file if the command fails? > Andrew, perhaps you could apply the attached to fix that. (Or I could > bundle it into one of the split out plperl feature patches.) Applied. regards, tom lane