Discussion: Re: automatic parser generation for ecpg
> (Mike, it lacks a copyright notice, I take it BSD is okay).

That's fine with me.

Also - for completeness (for the list) - I think the plan is to convert the awk to perl (via a2p + some tweaking) if awk is not already used as part of the build process (to avoid adding another prerequisite).

--
Mike Aubury
http://www.aubit.com/
Aubit Computing Ltd is registered in England and Wales, Number: 3112827
Registered Address : Clayton House,59 Piccadilly,Manchester,M1 2AQ
Mike Aubury <mike.aubury@aubit.com> writes:
> Also - for completeness (for the list) - I think the plan is to convert the
> awk to perl (via a2p + some tweaking) if awk is not already used as part of
> the build process (to avoid adding another prerequisite..)

Hmm. I believe the current state of play on that is:

* awk is required to build from source on non-Windows platforms (cf genbki.sh, Gen_fmgrtab.sh, and various random uses in the Makefiles)

* perl is required to build from source on Windows (all the build scripts)

* perl is required to build from a CVS pull on non-Windows too, but we avoid requiring this for builds from a distribution tarball (by including the relevant derived files in the tarball)

* we get around the awk requirement on Windows by maintaining parallel code that does the same things in perl :-(

So it's all pretty messy and neither choice is exactly desirable. I think maintaining parallel versions of an ecpg parser generator would be no fun at all, though, so the perl choice seems more or less forced. We could either preserve the current state of play by shipping the derived bison file in tarballs, or bite the bullet and say perl is required to build from source in all cases (in which case I'd be inclined to try to get rid of Gen_fmgrtab.sh etc).

As against that ... does a2p produce code that is readable/maintainable? If the code wasn't perl to start with I'd be a little worried about ending up with ugly hard-to-read code.

regards, tom lane
Perl code that's readable and maintainable ;-)

In reality - it doesn't look too dissimilar from the awk original. I didn't appreciate that we'd probably need to keep 2 versions (one for unix and one for windows). In that case - I'd argue that we only need to "maintain" one and regenerate the other when required. Provided they both work the same, I'd say it doesn't matter what the perl one looks like, because that's not the one that'd be maintained :-)

Personally - I'd be tempted to keep this as a background process for the ecpg maintainer anyway rather than a normal end user. Probably using something like a 'syncparser' make target, keeping the generation separate from the normal build process. That way - awk/perl (you could then pick just one) would only be a requirement if you want to regenerate the grammar via the 'syncparser' target.

This does have the benefit that the ecpg maintainer can then control when the syncing is done, and that it's less likely to inadvertently break the ecpg branch of the source tree.

At the end of the day - this is something Michael has just been doing manually already and we're trying to help automate the process.. (ducks for cover)

> As against that ... does a2p produce code that is readable/maintainable?
> If the code wasn't perl to start with I'd be a little worried about
> ending up with ugly hard-to-read code.

--
Mike Aubury
Mike Aubury <mike.aubury@aubit.com> writes:
> In reality - it doesn't look too dissimilar from the awk original. I didn't
> appreciate that we'd probably need to keep 2 versions (one for unix and one
> for windows). In that case - I'd argue that we only need to "maintain" one
> and regenerate the other when required. Provided they both work the same, I'd
> say it doesn't matter what the perl one looks like, because that's not the
> one that'd be maintained :-)

That'd only be acceptable if the code conversion were fully automatic --- given your reference to "tweaks" I wasn't sure if that could be the case.

> Personally - I'd be tempted to keep this as a background process for
> the ecpg maintainer anyway rather than a normal end user.

While we could approach it that way, it doesn't really meet all the goals I was hoping for. The current process is unsatisfactory for at least two reasons above and beyond "Michael has to do a lot of gruntwork":

* People hacking on the core grammar might break ecpg without realizing it. They need short-term feedback from the standard build process, or at the worst from the standard buildfarm checks.

* For the last little while, changing the core keyword set breaks ecpg completely, which means we have the worst of all possible worlds: core modifiers have to hack ecpg to get it to compile, and then Michael has to do more work to get it to actually work right.

I've been willing to put up with the second problem because I expected the ecpg grammar build process to become fully automatic soon. If that doesn't happen then I'm going to be lobbying to revert the change that made ecpg depend directly on the core keyword set.

regards, tom lane
I share Tom's thoughts completely. My personal goal is definitely to make ecpg parser generation a fully automated task. The only manual work I see in the future is adding some special ecpg handling.

I fully expect this script to generate a working parser for every single change in gram.y. However, if some new rule needs different (i.e. non-default) handling in ecpg, that will remain manual. The automatic process should nevertheless create a parser with default handling for this new rule, thus not breaking anything but the new feature in ecpg, which of course cannot get broken because it is new.

Is this understandable? :-)

Michael
--
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go VfL Borussia! Go SF 49ers! Use Debian GNU/Linux! Use PostgreSQL!
Michael Meskes <meskes@postgresql.org> writes:
> ... The only manual work I see in the
> future is adding some special ecpg handling. I fully expect this script to
> generate a working parser for every single change in gram.y. However, if some
> new rule needs a different aka non-default handling in ecpg that will remain
> manual, but the automatic process should nevertheless create a parser with
> default handling for this new rule, thus not breaking anything but the new
> feature in ecpg, which of course cannot get broken because it is new.

Hmm --- I hadn't really thought much about the need for the generation script to make special transformations of some rules, but obviously that is going to be needed in the places where you have to sew the SQL and ecpg syntaxes together.

Perhaps there is a good argument for going to perl just to be sure that we don't get backed into a corner on what can be done in such cases. awk is a great tool for certain kinds of tasks, but it's pretty limited. For instance, AFAIK you'd be out of luck if you needed to make two passes over the input.

So my vote at this point would be to convert the script to perl.

Also, never mind the idea about starting to require perl for all build scenarios. We'll still ship preproc.y in tarballs because we will still ship preproc.c in tarballs --- I don't think anyone was lobbying to start requiring bison to be present for builds from tarballs. So if the script is perl we'll have exactly the same build dependency scenarios as now.

regards, tom lane
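
To illustrate the scheme discussed above (default handling for every rule, manual handling only where ecpg differs, and more than one pass over the input), here is a minimal Python sketch. It is not the actual ecpg script: the one-rule-per-line grammar format, the `concat` action, and all function names are hypothetical and chosen only for illustration.

```python
import re

# Toy grammar format (an assumption): one rule per line, "name : A B | C ;".
RULE_RE = re.compile(r"^(\w+)\s*:\s*(.*?)\s*;$")


def parse_rules(grammar_text):
    """Pass 1: collect (rule name, list of alternatives) from the grammar."""
    rules = []
    for line in grammar_text.splitlines():
        match = RULE_RE.match(line.strip())
        if match:
            alternatives = [alt.split() for alt in match.group(2).split("|")]
            rules.append((match.group(1), alternatives))
    return rules


def default_action(symbols):
    """Default handling: pass all matched symbols through unchanged.

    `concat` is a hypothetical helper standing in for whatever the real
    generator would emit.
    """
    refs = [f"${i}" for i in range(1, len(symbols) + 1)]
    return "{ $$ = concat(" + ", ".join(refs) + "); }"


def generate(grammar_text, overrides):
    """Emit one output rule per alternative; report overrides that match nothing."""
    rules = parse_rules(grammar_text)                     # pass 1
    known = {name for name, _ in rules}
    stale = sorted(set(overrides) - known)                # overrides for vanished rules
    output = []
    for name, alternatives in rules:                      # pass 2
        for symbols in alternatives:
            if name in overrides:
                action = overrides[name]                  # manual, non-default handling
            else:
                action = default_action(symbols)          # new rules land here
            body = " ".join(symbols)
            output.append(f"{name}: {body} {action}")
    return output, stale
```

A rule newly added to the core grammar has no entry in `overrides`, so it falls through to `default_action` and the generated parser still builds. The `stale` list flags overrides whose rule no longer exists, which is one kind of check that needs the full rule set before any output is written.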
On Tue, Oct 21, 2008 at 08:31:54AM -0400, Tom Lane wrote:
> So it's all pretty messy and neither choice is exactly desirable. I
> think maintaining parallel versions of an ecpg parser generator
> would be no fun at all, though, so the perl choice seems more or
> less forced. We could either preserve the current state of play by
> shipping the derived bison file in tarballs, or bite the bullet and
> say perl is required to build from source in all cases (in which
> case I'd be inclined to try to get rid of Gen_fmgrtab.sh etc).

+1 for requiring it for source builds. We'll be able to simplify the code quite a bit :)

> As against that ... does a2p produce code that is
> readable/maintainable?

Not that I've seen. There are modules on CPAN (I know, I know) for dealing with lex and yacc, and those are probably better for the purpose.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Tue, Oct 21, 2008 at 08:45:11AM -0700, David Fetter wrote:
> > As against that ... does a2p produce code that is
> > readable/maintainable?
>
> Not that I've seen. There are modules on CPAN (I know, I know) for
> dealing with lex and yacc, and those are probably better for the
> purpose.

Well, I think it's at least readable. The problem with your approach is that both Mike and I feel more comfortable with awk at the moment. If a2p produces a working perl script, that doesn't seem to be a problem IMO, as we could maintain the awk script but deliver the perl version.

Michael
--
Michael Meskes