Discussion: Have we out-grown Flex?
Quite apart from the practical difficulties that we have with Flex (the fact that the authors are non-responsive and possibly retired, that annoying compiler warning, and the fact that we are forced to maintain our own Windows binaries of 2.5.35), it has some notable disadvantages. I am aware that the MySQL people use their own hand-coded lexical analyzer, sql_lex.cc, which provides a yacc interface while avoiding any lexical analyzer generator whatsoever. They can't have done this just for fun, and no doubt this goes some way to explaining their continued performance advantage for very simple queries. I have heard people complain about Postgres parser overhead for "pgbench -S" style use-cases, where it is unreasonably high, and I've heard them do so more than once.

I suspect that we pay a high price for our lexer's table-driven design, which keeps all of its transition information in large lookup tables, and the MySQL guys do not. It seems exceedingly likely that our lexer is far more sophisticated than theirs, which probably makes emulating their approach infeasible, and certainly makes it unattractive. However, I think there might be a better way.

There is an LGPL-licensed project named Quex (hey, GNU Bison is GPL), which can generate C (and C++) code, and I am in contact with its primary author, Frank-Rene Schäfer:

http://quex.sourceforge.net/

While it fits the bill as a replacement for Flex if we were ever compelled to drop Flex, it also has a number of interesting advantages over it that warrant further investigation. As Wikipedia puts it:

"Quex uses traditional steps of Thompson construction to create non-deterministic finite automatons from regular expressions, conversion to a deterministic finite automaton and then Hopcroft optimization to reduce the number of states to a minimum. Those mechanisms, though, have been adapted to deal with character sets rather than single characters. By means of this the calculation time can be significantly reduced. Since the Unicode character set consists of many more code points than plain ASCII, those optimizations are necessary in order to produce lexical analysers in a reasonable amount of time."

Apparently, in practical terms, reductions in total lexing time of more than 1/2, and even as high as 2/3, have been observed when switching from Flex. Quex uses Flex's syntax for describing regular expressions, which makes a conversion research project seem like a worthwhile undertaking. I am told that it will probably take less than a month, and perhaps as little as two weeks, given the broad compatibility of Flex and Quex. The undertaking has the enthusiastic support of Frank-Rene, which is in refreshing contrast to the complete inaccessibility of the Flex developers and the dead silence on the flex-users mailing list.

I'm certainly not suggesting that this isn't something that, in order to be adopted, will have to be carefully considered from many angles, of which the performance of the resulting lexer is only one, but at the same time I dare say that this approach is the "low-hanging fruit" of optimising for the use-case where parser overhead is unreasonably high. Am I barking up the wrong tree? At the very least, hopefully these simple observations will provoke thought: Flex/Lex is not the only tool for generating yacc-compatible lexical analysers, and it may not be the best one for our current needs.

--
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
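To make the character-set point from that Wikipedia passage concrete, here is a minimal C sketch contrasting the two ways of encoding DFA transitions. It is not taken from either tool's generated output, and every name in it is invented:

#include <stdio.h>
#include <stddef.h>

/* Per-character transitions: one table cell per (state, byte), as in a
 * classic table-driven scanner. Zero-filled here; a real scanner would
 * populate it, with zero meaning "reject". */
static const unsigned char next_state[4][256];

static int
step_per_char(int state, unsigned char ch)
{
    return next_state[state][ch];
}

/* Interval-based transitions: one entry per contiguous range of code
 * points, so a rule spanning a large Unicode set stays small. */
struct edge
{
    unsigned int lo;            /* inclusive lower bound */
    unsigned int hi;            /* inclusive upper bound */
    int          target;        /* next state */
};

static const struct edge state0_edges[] = {
    {'0', '9', 1},              /* digits */
    {'A', 'Z', 2},              /* ASCII letters... */
    {'a', 'z', 2},
    {0x00C0, 0x024F, 2},        /* ...plus a swathe of accented Latin */
};

static int
step_intervals(unsigned int cp)
{
    size_t      i;

    for (i = 0; i < sizeof(state0_edges) / sizeof(state0_edges[0]); i++)
        if (cp >= state0_edges[i].lo && cp <= state0_edges[i].hi)
            return state0_edges[i].target;
    return -1;                  /* no transition */
}

int
main(void)
{
    printf("per-char:  %d\n", step_per_char(0, 'a'));   /* 0: empty table */
    printf("interval:  %d\n", step_intervals(0x00E9));  /* 2: U+00E9 in range */
    return 0;
}

A per-code-point table covering all of Unicode would need rows 0x110000 cells wide, which is why constructing and minimizing the automaton over character sets, rather than individual characters, is what makes Unicode-capable generated lexers tractable.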
Peter Geoghegan <peter@2ndquadrant.com> writes:
> ... I have heard people complain about Postgres parser overhead for
> "pgbench -S" style use-cases, where it is unreasonably high, and I've
> heard them do so more than once.

I was under the impression that those cycles were mostly in bison, not flex...

        regards, tom lane
Peter Geoghegan <peter@2ndquadrant.com> writes:
> Quite apart from the practical difficulties that we have with Flex
> (the fact that the authors are non-responsive and possibly retired,

btw, as far as that goes, a look into their sourceforge SCM shows evidence of continued activity. I dunno why it's been so long since their last formal release, but they're not dead.

> that annoying compiler warning,

http://flex.cvs.sourceforge.net/viewvc/flex/flex/flex.skl?r1=2.213&r2=2.214

        regards, tom lane
On Tue, May 1, 2012 at 8:53 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> I'm certainly not suggesting that this isn't something that, in order
> to be adopted, will have to be carefully considered from many angles,
> of which the performance of the resulting lexer is only one, but at
> the same time I dare say that this approach is the "low-hanging fruit"
> of optimising for the use-case where parser overhead is unreasonably
> high. Am I barking up the wrong tree? At the very least, hopefully
> these simple observations will provoke thought: Flex/Lex is not the
> only tool for generating yacc-compatible lexical analysers, and it may
> not be the best one for our current needs.

I think Tom's question of whether the parser or lexer is the problem is something that ought to be investigated. Personally, I suspect that our tendency to use lists everywhere, for everything, an artifact of our proud LISP heritage, may be costing us dearly in terms of parse time. However, there's a difference between suspicion and proof, and I certainly have none of the latter.

One fairly major obstacle to using quex for anything is that it doesn't appear to be packaged on either Fedora or Ubuntu (I tried apt-cache search quex on my Ubuntu laptop, and yum install quex on an F16 machine). 'port install quex' on my MacOS X box doesn't find it either. This means that if we were to switch to quex, the barrier to compiling PostgreSQL would go up very significantly. I realize there's a chicken-and-egg problem there, since if quex is awesome and nobody uses it because it isn't packaged, then it will never become popular enough for anyone to consider packaging it. Nonetheless, I'm not sure I want to be the guinea pig. Even if the actual patch-as-committed-to-git to make the switch to quex were not that large, the amount of follow-on work we'd be creating for every developer and packager of PostgreSQL would be quite formidable. We shouldn't do that unless it's pretty much awesome.

One possible way forward would be to try to be compatible with both quex and flex. We'd need a buildfarm critter or two running with quex to detect breakage there, though, since a lot of people would probably continue to run flex. But I'm not entirely sure it's worth doing even that much unless we can show a worthwhile performance bump. The core scanner doesn't change much, so "more features" isn't too compelling as a reason for change, especially if we need to remain flex-compatible.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> One fairly major obstacle to using quex for anything is that it
> doesn't appear to be packaged on either Fedora or Ubuntu (I tried
> apt-cache search quex on my Ubuntu laptop, and yum install quex on an
> F16 machine). 'port install quex' on my MacOS X box doesn't find it
> either. This means that if we were to switch to quex, the barrier to
> compiling PostgreSQL would go up very significantly. I realize
> there's a chicken-and-egg problem there, since if quex is awesome and
> nobody uses it because it isn't packaged, then it will never become
> popular enough for anyone to consider packaging it. Nonetheless, I'm
> not sure I want to be the guinea pig.

I was trying to think of a delicate way to put that, but since you beat me to it ... yeah, exactly. It's possible that quex is so spectacular that we'd be willing to absorb the conversion costs and the penalties of being a pioneer user of a new tool, but the bar seems pretty high.

> Even if the actual
> patch-as-committed-to-git to make the switch to quex were not that
> large, the amount of follow-on work we'd be creating for every
> developer and packager of PostgreSQL would be quite formidable.

FWIW, I think only developers, not packagers, would really be taking such a hit. I assume we'd continue to ship prebuilt lexer output in tarballs, so there'd seldom be a reason for a packager to need to run the tool. Given the extremely slow rate of churn of the lexer, it might not be necessary for most developers to have the tool installed, either, if we were willing to put the derived file into git.

Having said all that, if the amount of effort needed to make a trial port is within reason, by all means let's try it and see what the performance difference is, rather than just speculating.

        regards, tom lane
On 2 May 2012 04:24, Robert Haas <robertmhaas@gmail.com> wrote:
> I think Tom's question of whether the parser or lexer is the problem
> is something that ought to be investigated. Personally, I suspect
> that our tendency to use lists everywhere, for everything, an artifact
> of our proud LISP heritage, may be costing us dearly in terms of parse
> time. However, there's a difference between suspicion and proof, and
> I certainly have none of the latter.

It's funny that you should say that, because I actually had a discussion with Greg Stark over drinks about this recently. Jon Bentley has described an experiment that undermines many traditional beliefs about the trade-offs represented by using a linked list rather than an array. The test is: using a modern computer, generate N integers at random and insert them in order into a sequence. Then, randomly remove integers from the sequence, one at a time. What is the performance at different sizes of N when the sequence is a doubly-linked list, and when it is an array?

If you graph the two, the results are perhaps rather surprising. I think his graph went up to N = 100,000. The initial result shows the line representing the array down near the bottom of the graph, while the list's line looks exponential. Even if you use memory pooling so that the list doesn't have to allocate memory as it goes, the array still roundly beats the list, albeit not quite so convincingly, and without the list hitting the roof near 100,000. I don't think that I need to point out that this is for inserting and deleting, which is exactly when you're supposed to use lists.

The point is that on modern architectures, with many layers of cache, the cost of the linear search to find the insertion point completely dominates; this is without the array availing of a binary search, in the interest of fairness. CPU caches happen to do a very good job of moving, on average, half of the array when inserting elements at random points. The list is much larger than the array, with two extra pointers per element (yes, I know that we use singly-linked lists, but we have other disadvantages compared to typical C lists), which matters. Predictable usage patterns, that is, temporal and spatial locality resulting in good use of the memory hierarchy, matter a lot. We're not talking about a small difference, either: I think the difference in the published test was something like the array being 50 to 100 times faster. The list results in far more cache misses than the array.

So, I'm right there with you: using lists everywhere is bad news.

As for the question of Flex/Quex, I'm not in an immediate position to sink any more time into it, but it remains on my list of things to pursue for 9.3, though it's only about number 3 on that list right now.

--
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
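For anyone who wants to reproduce the experiment, here is a rough, self-contained C sketch of the benchmark as described above; the size N, the use of rand(), and all of the names are my own choices rather than Bentley's:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define N 100000

/* Ordered insertion into an array: linear search for the slot, then
 * shift the tail over by one; random removals shift it back. */
static void
array_test(void)
{
    int *a = malloc(N * sizeof(int));
    int  len = 0;

    for (int i = 0; i < N; i++)
    {
        int v = rand();
        int pos = 0;

        while (pos < len && a[pos] < v)
            pos++;
        memmove(&a[pos + 1], &a[pos], (len - pos) * sizeof(int));
        a[pos] = v;
        len++;
    }
    while (len > 0)
    {
        int pos = rand() % len;

        memmove(&a[pos], &a[pos + 1], (len - pos - 1) * sizeof(int));
        len--;
    }
    free(a);
}

/* The same workload against a doubly-linked list. */
typedef struct node
{
    int          val;
    struct node *prev, *next;
} node;

static void
list_test(void)
{
    node *head = NULL;
    int   len = 0;

    for (int i = 0; i < N; i++)
    {
        node  *n = malloc(sizeof(node));
        node **link = &head;
        node  *prev = NULL;

        n->val = rand();
        while (*link && (*link)->val < n->val)
        {
            prev = *link;
            link = &(*link)->next;
        }
        n->next = *link;
        n->prev = prev;
        if (*link)
            (*link)->prev = n;
        *link = n;
        len++;
    }
    while (len > 0)
    {
        node *n = head;

        for (int steps = rand() % len; steps > 0; steps--)
            n = n->next;
        if (n->prev)
            n->prev->next = n->next;
        else
            head = n->next;
        if (n->next)
            n->next->prev = n->prev;
        free(n);
        len--;
    }
}

int
main(void)
{
    clock_t t;

    srand(42);
    t = clock();
    array_test();
    printf("array: %.2fs\n", (double) (clock() - t) / CLOCKS_PER_SEC);

    srand(42);
    t = clock();
    list_test();
    printf("list:  %.2fs\n", (double) (clock() - t) / CLOCKS_PER_SEC);
    return 0;
}

Both versions do the same linear search for the insertion point, so the difference that shows up is almost entirely locality: scanning contiguous memory is close to free, while chasing pointers through scattered nodes can miss the cache at every hop.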
On 2 May 2012 04:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> FWIW, I think only developers, not packagers, would really be taking
> such a hit. I assume we'd continue to ship prebuilt lexer output in
> tarballs, so there'd seldom be a reason for a packager to need to run
> the tool. Given the extremely slow rate of churn of the lexer, it might
> not be necessary for most developers to have the tool installed, either,
> if we were willing to put the derived file into git.

Incidentally, I had an unrelated conversation with someone (I think it might have been Heikki) a while back, where it was suggested that Flex and Bison could be run through web services. This might actually make hacking Postgres on Windows far easier, because the last time I tried to do that the hard way, I gave up, suspecting that there must be some kind of Winflex bug that selectively manifests itself; certainly, the population of Windows hackers is small enough that such a bug could go unnoticed for quite a while.

Such an approach could be part of the solution to this problem (although, incidentally, Quex maintains Visual Studio support quite well, and even has graphical instructions here: http://quex.sourceforge.net/doc/html/intro/visual_studio_trouble.html ). It might be the case that some kind of virtualisation and/or authentication (a Postgres community account being required) could make this approach practical; it just isn't the path of least resistance right now. Visual Studio builds would be far easier if we did this, which might encourage more hackers to venture into Windows land.

--
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
On Tue, May 1, 2012 at 5:53 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> Quite apart from the practical difficulties that we have with Flex
> (the fact that the authors are non-responsive and possibly retired,
> that annoying compiler warning, and the fact that we are forced to
> maintain our own Windows binaries of 2.5.35), it has some notable
> disadvantages. I am aware that the MySQL people use their own
> hand-coded lexical analyzer, sql_lex.cc, which provides a yacc
> interface while avoiding any lexical analyzer generator whatsoever.
> They can't have done this just for fun, and no doubt this goes some
> way to explaining their continued performance advantage for very
> simple queries. I have heard people complain about Postgres parser
> overhead for "pgbench -S" style use-cases, where it is unreasonably
> high, and I've heard them do so more than once.

For -S -M simple, the time spent planning is 5 times more than the time spent parsing. It may be worthwhile to reduce the time spent parsing, but if the goal is parity with MySQL it probably isn't the place to start. (If you use a bottom-up profiler, the time spent in planning is scattered over so many different functions that none of them look very important individually.)

Cheers,

Jeff
On 2 May 2012 17:20, Jeff Janes <jeff.janes@gmail.com> wrote:
> For -S -M simple, the time spent planning is 5 times more than the
> time spent parsing. It may be worthwhile to reduce the time spent
> parsing, but if the goal is parity with MySQL it probably isn't the
> place to start.

Could you please share your figures and methodology? I've heard of far larger proportions than that.

--
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
On Wed, May 2, 2012 at 9:31 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> On 2 May 2012 17:20, Jeff Janes <jeff.janes@gmail.com> wrote:
>> For -S -M simple, the time spent planning is 5 times more than the
>> time spent parsing. It may be worthwhile to reduce the time spent
>> parsing, but if the goal is parity with MySQL it probably isn't the
>> place to start.
>
> Could you please share your figures and methodology? I've heard of far
> larger proportions than that.

I used two methods. One was to hack exec_simple_query so that, under the control of a new GUC, it would do things such as run pg_parse_query on the query 101 times, throwing away the results of the first 100 before proceeding to use the 101st parse result as normal. Then I just run pgbench under both settings, take the difference in 1/TPS between them, and divide by 100 to get the seconds per parse (and multiply by 1e6 to get usec/parse).

I did the same thing for pg_analyze_and_rewrite, and for pg_analyze_and_rewrite plus pg_plan_queries (pg_plan_queries scribbles on the structure produced by pg_analyze_and_rewrite, so you have to repeat both as a unit, and then subtract the pg_analyze_and_rewrite timings off afterwards to isolate just the planning).

On my current laptop, rebased to git HEAD, I got 2 usec/pg_parse_query, 2 usec/pg_analyze_and_rewrite, and 12 usec/pg_plan_queries. Since my laptop is dual core, I did this with -c2 -j2. Back when I originally implemented and tested this, on much older hardware and an approximately one-year-older pgsql code base, the absolute values of usec/action were several fold higher, but the ratios of 1:1:6 were about the same.

This does risk overlooking caching effects by repeating the same thing in a tight loop; i.e., parses 2 through 101 might be much faster than parse 1 was. Also, it risks that I simply don't know what I'm doing and my attempts to throw away the results of a parse are misbegotten: I just overwrote the old pointer with the new one and assumed the memory context would clean up the resulting orphans.

I could try to clean up and post the patch that implements this if you want.

The second method was just to do --enable-profiling on a stock build and look at the call graph section of gprof output. It attributed 50% to pg_plan_queries and children and about 10% to each of pg_parse_query and pg_analyze_and_rewrite (including their respective children). I don't put tremendous faith in gprof's call graph, but the fact that the results are in agreement with the other method gives me more confidence in both.

Cheers,

Jeff
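To make the first method concrete, the hack Jeff describes would look roughly like the fragment below, placed in exec_simple_query() in src/backend/tcop/postgres.c. This is a reconstruction of his description, not his actual patch, and the GUC name is invented:

/* Hypothetical GUC, default off; registered via the usual guc.c tables. */
int debug_parse_repeats = 0;

/*
 * In exec_simple_query(), at the point of the single pg_parse_query()
 * call: parse the query debug_parse_repeats extra times and discard
 * the results, relying on the per-message memory context to reclaim
 * the orphaned parse trees when it is reset.
 */
int i;

for (i = 0; i < debug_parse_repeats; i++)
    (void) pg_parse_query(query_string);

parsetree_list = pg_parse_query(query_string);

Running pgbench -S once with the GUC at 100 and once at 0, the per-parse cost falls out as (1/tps_on - 1/tps_off) / 100, which is how figures like 2 usec/pg_parse_query are obtained.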
On Wed, May 2, 2012 at 3:33 PM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> On 2 May 2012 04:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> FWIW, I think only developers, not packagers, would really be taking
>> such a hit. I assume we'd continue to ship prebuilt lexer output in
>> tarballs, so there'd seldom be a reason for a packager to need to run
>> the tool. Given the extremely slow rate of churn of the lexer, it might
>> not be necessary for most developers to have the tool installed, either,
>> if we were willing to put the derived file into git.
>
> Incidentally, I had an unrelated conversation with someone (I think it
> might have been Heikki) a while back, where it was suggested that Flex
> and Bison could be run through web services. This might actually make

Might've been me; I've been doing that for a long time to work around winflex issues. But I never got around to doing anything like access control, so I just ran it on a hidden IP on a random port, with a simple curl call on the Windows box.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Wed, May 02, 2012 at 10:37:58AM -0700, Jeff Janes wrote:
> I could try to clean up and post the patch that implements this if you want.
>
> The second method was just to do --enable-profiling on a stock build
> and look at the call graph section of gprof output. It attributed 50%
> to pg_plan_queries and children and about 10% to each of
> pg_parse_query and pg_analyze_and_rewrite (including their respective
> children). I don't put tremendous faith in gprof's call graph, but
> the fact that the results are in agreement with the other method gives
> me more confidence in both.

Those are the ratios I (and I think Tom) expected to see.

--
Bruce Momjian <bruce@momjian.us>    http://momjian.us
EnterpriseDB                        http://enterprisedb.com

+ It's impossible for everything to be true. +
I haven't tried quex, but I have tried lemon (which can be broken out of SQLite), re2c, and ragel.

I like ragel and lemon; from memory, the combination supports a push-parser style. Many tools are inconvenient unless you are prepared to suck in a whole message before parsing, or let the parser drive a pull loop, or use a coroutine structure.

Could go all trendy and use a PEG tool like, er, peg (http://piumarta.com/software/peg/). (I haven't tried them, though.)

James
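To show what the push-parser style means in practice, here is a minimal sketch against the interface lemon generates. The ParseAlloc/Parse/ParseFree names are lemon's defaults, but the token codes, the token value type (whatever %token_type declares in the grammar), and the input are all invented:

#include <stdlib.h>

/* Declarations for the functions lemon emits into the generated parser. */
void *ParseAlloc(void *(*mallocProc)(size_t));
void  Parse(void *parser, int tokenCode, const char *tokenText);
void  ParseFree(void *parser, void (*freeProc)(void *));

/* Hypothetical token codes from the generated grammar header. */
#define TK_SELECT 1
#define TK_NAME   2
#define TK_SEMI   3

/*
 * Push style: the caller feeds tokens to the parser as the scanner
 * produces them, so the whole input never has to be in memory and the
 * parser never drives a pull loop.
 */
static void
parse_fragment(void)
{
    void *p = ParseAlloc(malloc);

    Parse(p, TK_SELECT, "select");
    Parse(p, TK_NAME, "foo");
    Parse(p, TK_SEMI, ";");
    Parse(p, 0, NULL);          /* token code 0 signals end of input */

    ParseFree(p, free);
}

This inverts the usual flex/bison arrangement, where the generated parser pulls by calling yylex(); with a push parser, a ragel scanner (or a network message handler) stays in control and feeds tokens as data arrives.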
Doesn't that imply that a plan cache might be worthwhile?

But no matter: didn't the OP really take issue with packaging and Windows support? There are a lot of Windows users, and in general there are many Windows devs; making it easier for them to contribute has to be good, doesn't it?
On Thu, May 3, 2012 at 12:51 PM, james <james@mansionfamily.plus.com> wrote:
> I haven't tried quex, but I have tried lemon (which can be broken out
> of SQLite), re2c, and ragel.
>
> I like ragel and lemon; from memory, the combination supports a
> push-parser style. Many tools are inconvenient unless you are prepared
> to suck in a whole message before parsing, or let the parser drive a
> pull loop, or use a coroutine structure.
>
> Could go all trendy and use a PEG tool like, er, peg
> (http://piumarta.com/software/peg/). (I haven't tried them, though.)

I think the goal is not trendy, nor easy to use (but easy to maintain, at least...), but faster, and even then there is some doubt whether any amount of lexer optimization could possibly matter given everything else that needs to happen to execute a query.

Better error messages (with position information) might be a functional enhancement that I'd like, but I don't think flex is limiting in that regard; rather, a lot more information already exposed by flex would have to be passed through the semantic analyzer.

Provided it could matter, are these tools faster than flex? My limited understanding is "probably not".

--
fdr
I believe there are tools that are significantly faster than flex; re2c, for instance, generates code that is faster. But the key thing is to test, probably, or perhaps to ask around. I'm out of touch, but from memory flex wasn't the be-all and end-all.

Lemon is definitely easy to maintain/port and the result is pretty nice, too (I know Bison/Yacc wasn't the focus here).
On Thu, May 3, 2012 at 12:53 PM, james <james@mansionfamily.plus.com> wrote:
> Doesn't that imply that a plan cache might be worthwhile?

PG has one, if you use prepared statements. Some people are either unable or unwilling to use prepared statements, though.

Cheers,

Jeff
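As an illustration, here is roughly what using that cache from libpq looks like; the connection string is a placeholder, and the query assumes a database initialized by pgbench -i:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("dbname=pgbench");   /* placeholder */
    PGresult   *res;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    /* Parsed and analyzed once, server-side, under the name "sel". */
    res = PQprepare(conn, "sel",
                    "SELECT abalance FROM pgbench_accounts WHERE aid = $1",
                    1, NULL);
    PQclear(res);

    /* Each execution skips raw parsing entirely; after a few runs the
     * server will typically settle on a cached generic plan as well. */
    for (int i = 0; i < 3; i++)
    {
        char        aid[32];
        const char *params[1] = {aid};

        snprintf(aid, sizeof(aid), "%d", i + 1);
        res = PQexecPrepared(conn, "sel", 1, params, NULL, NULL, 0);
        if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) > 0)
            printf("aid %s -> %s\n", aid, PQgetvalue(res, 0, 0));
        PQclear(res);
    }

    PQfinish(conn);
    return 0;
}

pgbench's -M prepared option does essentially this, which also makes it a convenient way to measure how much of the -S overhead the parse and plan phases account for.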