Discussion: how to gate experimental features (SQL/PGQ)
Some of you will have seen the thread on the SQL/PGQ feature patch. [0]
I want to evaluate how it would be acceptable to commit such a feature
patch with an understanding that it is experimental. Perhaps some of
these ideas could then also be applied to other projects.

[0]: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org

At this point, the patch set is pretty settled, but it is large, and
it's not going to be perfect at the first try. Especially some of the
parsing rules, query semantics, that kind of thing. Imagine if you
implemented basic SQL for the first time: how sure would you be that
you get the semantics of a.b.c fully correct everywhere at the first
try? But much of the patch is almost-boilerplate: new DDL commands,
new catalogs, associated tests and documentation. It looks like a lot,
but most of it is not very surprising. So it might make sense to
commit this and let it get refined in-tree rather than carrying this
large patch around until some indefinite point.

Obviously, there would be some basic requirements. The overall code
structure should be future-proof. It shouldn't crash. It has to
satisfy security requirements. Also, it should not significantly
affect uses that don't use that feature. All of this is being worked
on. But I would like to communicate to users that the details of some
query results might change, we might make some incompatible syntax
changes if there was some mistake, or, I don't know, maybe the
planning of some query creates an infinite loop that we haven't
caught. I'm not aware of anything like that, but it seems prudent to
plan for it.

Some options:

1) Just document it and hope people will read the documentation and/or
understand that it's a new feature that needs time to mature.

2) A run-time setting (GUC) like experimental_pgq = on/off. This would
be checked in the relevant DDL (CREATE/ALTER/DROP) commands as well as
the GRAPH_TABLE function. So without that you couldn't do anything
with it, but for example pg_dump and psql and ecpg preproc would still
work and the system catalogs exist. Default to off for one release
(subject to change).

3) A compile-time option.

My preference would be 2). Option 3) has obvious problems, like you
wouldn't get buildfarm coverage, and it would be a significant burden
on all developers to keep the different code paths all working going
forward. Option 1) would be the easy fallback, but I suppose the
point of this message is to check whether a more technical approach
would be preferable.

Also, perhaps others have had similar thoughts about other development
projects, in which case it would be good to get an overview and think
about how these principles could be applied in a general way.

Just to put forward another example that I'm familiar with, I have
this currently-dormant column encryption patch set [1] that has
vaguely similar properties in that it is a large patch, lots of
boilerplate, lots of details that are best checked while actually
using it, but possibly requiring incompatible changes if fixes are
required.

[1]: https://www.postgresql.org/message-id/flat/89157929-c2b6-817b-6025-8e4b2d89d88f%40enterprisedb.com
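[Editorial note: to make option 2 concrete, here is a minimal sketch of
what such a gate could look like in the backend, assuming a boolean GUC
named experimental_pgq as proposed; the helper name and message wording
are illustrative assumptions, not code from the actual patch set.]

/*
 * Sketch only: the GUC name, helper name, and message wording are
 * illustrative assumptions, not code from the SQL/PGQ patch set.
 */
#include "postgres.h"

#include "utils/guc.h"

/* registered as an ordinary bool GUC in guc_tables.c in a real patch */
bool        experimental_pgq = false;

/*
 * Called first thing from CREATE/ALTER/DROP PROPERTY GRAPH and from
 * GRAPH_TABLE parse analysis, so that nothing PGQ-related is reachable
 * while the GUC is off.
 */
void
require_experimental_pgq(const char *construct)
{
    if (!experimental_pgq)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("%s is an experimental feature", construct),
                 errhint("Set experimental_pgq = on to enable it.")));
}

[Since pg_dump, psql, and the ecpg preprocessor never reach such a
check, dumping and restoring a database containing graph objects would
keep working regardless of the setting, as described above.]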
On Tue, Jan 13, 2026 at 10:16 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> Some of you will have seen the thread on the SQL/PGQ feature patch. [0]
> I want to evaluate how it would be acceptable to commit such a feature
> patch with an understanding that it is experimental. Perhaps some of
> these ideas could then also be applied to other projects.
>
> [0]:
> https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org
>
> At this point, the patch set is pretty settled, but it is large, and
> it's not going to be perfect at the first try. Especially some of the
> parsing rules, query semantics, that kind of thing. Imagine if you
> implemented basic SQL for the first time: how sure would you be that
> you get the semantics of a.b.c fully correct everywhere at the first
> try? But much of the patch is almost-boilerplate: new DDL commands,
> new catalogs, associated tests and documentation. It looks like a
> lot, but most of it is not very surprising. So it might make sense to
> commit this and let it get refined in-tree rather than carrying this
> large patch around until some indefinite point.

+1

> Obviously, there would be some basic requirements. The overall code
> structure should be future-proof. It shouldn't crash. It has to
> satisfy security requirements. Also, it should not significantly
> affect uses that don't use that feature. All of this is being worked
> on. But I would like to communicate to users that the details of some
> query results might change, we might make some incompatible syntax
> changes if there was some mistake, or, I don't know, maybe the
> planning of some query creates an infinite loop that we haven't
> caught. I'm not aware of anything like that, but it seems prudent to
> plan for it.

Since the patch set doesn't support VLE (variable-length edges) yet, I
don't think it will cause an infinite loop.

> Some options:
>
> 1) Just document it and hope people will read the documentation and/or
> understand that it's a new feature that needs time to mature.
>
> 2) A run-time setting (GUC) like experimental_pgq = on/off. This would
> be checked in the relevant DDL (CREATE/ALTER/DROP) commands as well as
> the GRAPH_TABLE function. So without that you couldn't do anything
> with it, but for example pg_dump and psql and ecpg preproc would still
> work and the system catalogs exist. Default to off for one release
> (subject to change).

Instead of a GUC, does it make sense to just raise a notice message for
those DDL commands?

> 3) A compile-time option.
>
> My preference would be 2). Option 3) has obvious problems, like you
> wouldn't get buildfarm coverage, and it would be a significant burden
> on all developers to keep the different code paths all working going
> forward. Option 1) would be the easy fallback, but I suppose the
> point of this message is to check whether a more technical approach
> would be preferable.
>
> Also, perhaps others have had similar thoughts about other development
> projects, in which case it would be good to get an overview and think
> about how these principles could be applied in a general way.
>
> Just to put forward another example that I'm familiar with, I have
> this currently-dormant column encryption patch set [1] that has
> vaguely similar properties in that it is a large patch, lots of
> boilerplate, lots of details that are best checked while actually
> using it, but possibly requiring incompatible changes if fixes are
> required.
>
> [1]:
> https://www.postgresql.org/message-id/flat/89157929-c2b6-817b-6025-8e4b2d89d88f%40enterprisedb.com

--
Regards
Junwang Zhao
On Tuesday, January 13, 2026, Peter Eisentraut <peter@eisentraut.org> wrote:
> 1) Just document it and hope people will read the documentation and/or
> understand that it's a new feature that needs time to mature.
Unless experimental means we are allowed to make breaking changes in minor releases, I’d say we do the best we can and then just put it out there as normal.
David J.
Hi,

On 2026-01-13 15:16:22 +0100, Peter Eisentraut wrote:
> Some of you will have seen the thread on the SQL/PGQ feature patch. [0]
> I want to evaluate how it would be acceptable to commit such a feature
> patch with an understanding that it is experimental. Perhaps some of
> these ideas could then also be applied to other projects.
>
> [0]: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org
>
> At this point, the patch set is pretty settled, but it is large, and
> it's not going to be perfect at the first try.

Yikes, large indeed:

$ git diff --shortstat upstream/master
 118 files changed, 14805 insertions(+), 220 deletions(-)
$ git log -p upstream/master.. | wc -c
721659

I think the main question that needs answering is not how to get this
merged, but whether a feature this large (in its infancy stage!) is
worth the squeeze...

It's possible that I am just weak-minded, but at least I can't sensibly
review a 700kB-sized commit. This imo badly needs to be broken down
and, if at all possible, reduced in size & scope.

> Also, it should not significantly affect uses that don't use that
> feature.

Are there such effects currently?

> 1) Just document it and hope people will read the documentation and/or
> understand that it's a new feature that needs time to mature.
>
> 2) A run-time setting (GUC) like experimental_pgq = on/off. This would
> be checked in the relevant DDL (CREATE/ALTER/DROP) commands as well as
> the GRAPH_TABLE function. So without that you couldn't do anything
> with it, but for example pg_dump and psql and ecpg preproc would still
> work and the system catalogs exist. Default to off for one release
> (subject to change).
>
> 3) A compile-time option.

I don't even know how you could implement 3) realistically. We have
zero infrastructure for making e.g. parser, keyword list etc change due
to defines at compile time.

> Just to put forward another example that I'm familiar with, I have
> this currently-dormant column encryption patch set [1] that has
> vaguely similar properties in that it is a large patch, lots of
> boilerplate, lots of details that are best checked while actually
> using it, but possibly requiring incompatible changes if fixes are
> required.

I'm quite sceptical that that's a good choice for an experimental
feature being merged, because it has tree-wide impact, e.g. due to
making pg_attribute wider, affecting the protocol, and adding branches
to pretty core places.

I think I am on board with being more open to merging features earlier,
but it shouldn't be a fig leaf to allow merging stuff with wide impact
with the excuse that it's an experimental thing.

Greetings,

Andres Freund
On Tue, Jan 13, 2026 at 6:16 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> 1) Just document it and hope people will read the documentation and/or
> understand that it's a new feature that needs time to mature.
>
> 2) A run-time setting (GUC) like experimental_pgq = on/off. This would
> be checked in the relevant DDL (CREATE/ALTER/DROP) commands as well as
> the GRAPH_TABLE function. So without that you couldn't do anything
> with it, but for example pg_dump and psql and ecpg preproc would still
> work and the system catalogs exist. Default to off for one release
> (subject to change).
>
> 3) A compile-time option.

I don't like (1) personally, and I think I'd prefer (2) of these
options. (3) does have the advantage that you're testing that the
experimental feature can be torn out independently. But that has its
own downsides in terms of maintenance burden, as you say.

> Also, perhaps others have had similar thoughts about other development
> projects, in which case it would be good to get an overview and think
> about how these principles could be applied in a general way.

I know you and I briefly discussed something like this for OAuth, and I
wonder if some of the upcoming patchsets there would be good candidates
for "experimental" features as well, since they're self-contained, and
they apply only to a subset of users who more clearly understand
whether they need the features or not. Token caching is one such hard
problem.

Plus, just... protocol stuff in general. All of it. That is really
hard to get right up front without testing. Many software projects
nowadays will allow you to enable "draft mode" for a certain RFC that
is almost there but not quite released. Not only might this help us
with upcoming IETF specs, but if our own protocol had its own draft
mode, we might be able to break the deadlock on some of the intractable
feature threads faster.

On Tue, Jan 13, 2026 at 7:17 AM Andres Freund <andres@anarazel.de> wrote:
> I don't even know how you could implement 3) realistically. We have
> zero infrastructure for making e.g. parser, keyword list etc change
> due to defines at compile time.

Is that an architecturally unsolvable thing, or is it a simple matter
of programming? Would it be nice to have said infrastructure?

> I think I am on board with being more open to merging features earlier,
> but it shouldn't be a fig leaf to allow merging stuff with wide impact
> with the excuse that it's an experimental thing.

I agree that a good quality bar needs to remain in place -- and I guess
I view this thread as a way to figure out how we keep that quality bar.
But overall, I think I'd rather work with an environment where we
occasionally say "hey, this clearly-labeled experimental feature was
not baked enough, pull it back now" as opposed to occasionally saying
"hey, we never got this one feature everyone wants because we'll never
be practically sure that it's 'right' enough without user testing".

On Tue, Jan 13, 2026 at 6:55 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> Unless experimental means we are allowed to make breaking changes in
> minor releases

For a protocol draft system, that would be *amazingly* helpful. "19.6
introduces protocol feature FooBar Draft-4, which fixes these known
issues in the spec. Support for Draft-2 is dropped."

If there's a community that can make use of it (which may be something
that has to be grown, to be fair), that could help other community
developers feel more plugged into our release cycle, and capable of
making progress throughout the year on really thorny problems.

--Jacob
Jacob Champion <jacob.champion@enterprisedb.com> writes:
> On Tue, Jan 13, 2026 at 7:17 AM Andres Freund <andres@anarazel.de> wrote:
>> I don't even know how you could implement 3) realistically. We have zero
>> infrastructure for making e.g. parser, keyword list etc change due to
>> defines at compile time.
> Is that an architecturally unsolvable thing, or is it a simple matter
> of programming? Would it be nice to have said infrastructure?
You'd have to throw out flex and bison and build some sort of
extensible parser. That has some attraction to me personally
(I worked on such systems decades ago at HP), but it's fairly
hard to justify the amount of effort that would be needed to
get there. It might well be slower than a flex/bison parser,
and/or have poorer detection of grammar inconsistencies, either
of which would be bad for our usage.
> But overall, I think I'd rather work with an environment where we
> occasionally say "hey, this clearly-labeled experimental feature was
> not baked enough, pull it back now" as opposed to occasionally saying
> "hey, we never got this one feature everyone wants because we'll never
> be practically sure that it's 'right' enough without user testing".
This seems workable for some kinds of features and less so for others.
As an example, forcing initdb post-release seems like a nonstarter
even for "experimental" features, so anything that affects initial
catalog contents is still going to be a problem.
It strikes me that we do have a mechanism that could be used to cope
with catalog changes, which is to package the new catalog objects as
an extension. Then ALTER EXTENSION UPDATE could be used to migrate
from FooBar Draft-2 to Draft-4. But the tricky bit would be to
replace the extension with in-core objects once we decide it's
completely baked. Our previous attempts to do that sort of thing
have been painful. It might be better just to say that you only
get to change experimental catalog entries once a year in major
releases. (Slow progress is still better than no progress.)
regards, tom lane
On 1/13/26 12:24, Tom Lane wrote:
> Jacob Champion <jacob.champion@enterprisedb.com> writes:
>> On Tue, Jan 13, 2026 at 7:17 AM Andres Freund <andres@anarazel.de> wrote:
>>> I don't even know how you could implement 3) realistically. We have zero
>>> infrastructure for making e.g. parser, keyword list etc change due to
>>> defines at compile time.
>
>> Is that an architecturally unsolvable thing, or is it a simple matter
>> of programming? Would it be nice to have said infrastructure?
>
> You'd have to throw out flex and bison and build some sort of
> extensible parser. That has some attraction to me personally
> (I worked on such systems decades ago at HP), but it's fairly
> hard to justify the amount of effort that would be needed to
> get there. It might well be slower than a flex/bison parser,
> and/or have poorer detection of grammar inconsistencies, either
> of which would be bad for our usage.
>
>> But overall, I think I'd rather work with an environment where we
>> occasionally say "hey, this clearly-labeled experimental feature was
>> not baked enough, pull it back now" as opposed to occasionally saying
>> "hey, we never got this one feature everyone wants because we'll never
>> be practically sure that it's 'right' enough without user testing".
>
> This seems workable for some kinds of features and less so for others.
> As an example, forcing initdb post-release seems like a nonstarter
> even for "experimental" features, so anything that affects initial
> catalog contents is still going to be a problem.

<snip>

> It might be better just to say that you only get to change
> experimental catalog entries once a year in major releases. (Slow
> progress is still better than no progress.)

+1

This seems like the best compromise.

Overall I think we need a way to make progress on big invasive features
that need incremental rollout and broad testing to get right. Moving
forward while clearly designating at least some of these as
"experimental and subject to breaking changes once every major release"
is far better than never making progress at all, IMHO.

--
Joe Conway
PostgreSQL Contributors Team
Amazon Web Services: https://aws.amazon.com
On Tue, Jan 13, 2026 at 9:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> You'd have to throw out flex and bison and build some sort of
> extensible parser. That has some attraction to me personally
> (I worked on such systems decades ago at HP), but it's fairly
> hard to justify the amount of effort that would be needed to
> get there.

Makes sense. If an experiment had so many changes to the parser that
it'd be worth the (potentially considerable) effort, maybe we could
ship just two parser versions (a base, and a draft) and maintain the
diff between them with some [handwavy magic] preprocessing? Probably
too early in the thread to get into the weeds that much, but the
ability to ship a minor-version parser improvement for a draft feature,
without putting the supported core at risk, seems like it could be
worth some amount of maintenance cost.

> This seems workable for some kinds of features and less so for others.
> As an example, forcing initdb post-release seems like a nonstarter
> even for "experimental" features, so anything that affects initial
> catalog contents is still going to be a problem.

Yeah, a surprise initdb (even if well-documented) seems like it might
discourage use of any experimental features, if it's not easy to tell
which experiments might lock you into future pain.

--Jacob
On Tue, Jan 13, 2026 at 3:18 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> *snip*
>
> Also, perhaps others have had similar thoughts about other development
> projects, in which case it would be good to get an overview and think
> about how these principles could be applied in a general way.

I'm not getting tired of adding TDE to this list, as one way or
another, this will require either extensibility changes in a minimal
case or more, depending on the route we take in the future.

> Just to put forward another example that I'm familiar with, I have
> this currently-dormant column encryption patch set [1] that has
> vaguely similar properties in that it is a large patch, lots of
> boilerplate, lots of details that are best checked while actually
> using it, but possibly requiring incompatible changes if fixes are
> required.

From an end-user perspective, option 2 appears to be the most sensible
and feasible choice. You're absolutely right; some things do need user
testing and feedback before it's known if and how they might be finally
useful, or what additional adjustments are needed. The question is how
to select these experimental features, and based on which specific
criteria.

At the same time, I understand Andres' concern about the patch being
too big, and smaller increments are the way to go, as there isn't a
single person capable of reviewing everything. The biggest strength of
PostgreSQL is its maturity and sometimes its slow pace, so the question
would be how to define the acceptance criteria so that this doesn't end
up being "spoiled".

For the actual timing, this should occur as part of a major release.
We should definitely avoid incorporating experimental features into
minor updates, as this contradicts the overall stability and concept of
the current major updates.

> [1]:
> https://www.postgresql.org/message-id/flat/89157929-c2b6-817b-6025-8e4b2d89d88f%40enterprisedb.com
On Wednesday, January 14, 2026, Kai Wagner <kai.wagner@percona.com> wrote:
> For the actual timing, this should occur as part of a major release.
> We should definitely avoid incorporating experimental features into
> minor updates, as this contradicts the overall stability and concept
> of the current major updates.
That wasn’t the idea though. What is your opinion on changing the behavior of an experimental feature added in 19.0 in the 19.4 release, so long as it doesn’t require initdb?
David J.
On Wed, Jan 14, 2026 at 8:11 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Wednesday, January 14, 2026, Kai Wagner <kai.wagner@percona.com> wrote:
>>
>> For the actual timing, this should occur as part of a major release.
>> We should definitely avoid incorporating experimental features into
>> minor updates, as this contradicts the overall stability and concept
>> of the current major updates.
>
> That wasn’t the idea though. What is your opinion on changing the
> behavior of an experimental feature added in 19.0 in the 19.4 release,
> so long as it doesn’t require initdb?

Perhaps I wasn't fully clear with my statement, but I was referring to
introducing new experimental features, not adjusting/fixing/improving
existing and already-shipped experimental features. That, of course,
is and should be part of a minor update, as you don't want to wait
another year for actual fixes and improvements to land for existing
ones.

Kai

> David J.
> On 13 Jan 2026, at 15:16, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> 2) A run-time setting (GUC) like experimental_pgq = on/off. This would
> be checked in the relevant DDL (CREATE/ALTER/DROP) commands as well as
> the GRAPH_TABLE function. So without that you couldn't do anything
> with it, but for example pg_dump and psql and ecpg preproc would still
> work and the system catalogs exist. Default to off for one release
> (subject to change).

Such a GUC would IMHO only make sense if we remove it when we promote
the feature, but removing a GUC also comes with a cost for anyone
having baked it into their scripts etc. If we feel confident enough
that a patch satisfies the security requirements to merge it, I think
we should make it available.

If a feature is deemed experimental in terms of feature completeness
(missing parts etc), or because user-visible parts might change, then
the documentation is IMO the vehicle for handling that.

--
Daniel Gustafsson
> On 13 Jan 2026, at 18:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> It might be better just to say that you only
> get to change experimental catalog entries once a year in major
> releases. (Slow progress is still better than no progress.)

I think this idea has a lot of merit.

--
Daniel Gustafsson
On 14/01/26 17:57, Daniel Gustafsson wrote:
>> On 13 Jan 2026, at 15:16, Peter Eisentraut <peter@eisentraut.org> wrote:
>> 2) A run-time setting (GUC) like experimental_pgq = on/off. This would
>> be checked in the relevant DDL (CREATE/ALTER/DROP) commands as well as
>> the GRAPH_TABLE function. So without that you couldn't do anything
>> with it, but for example pg_dump and psql and ecpg preproc would still
>> work and the system catalogs exist. Default to off for one release
>> (subject to change).
>
> Such a GUC would IMHO only make sense if we remove it when we promote
> the feature, but removing a GUC also comes with a cost for anyone
> having baked it into their scripts etc. If we feel confident enough
> that a patch satisfies the security requirements to merge it, I think
> we should make it available.

Instead of having a GUC for each potential experimental feature, we
could have just a single GUC with a list of experimental features that
are enabled:

SET enable_experimental_features = 'foo,bar,baz';

--
Matheus Alcantara
EDB: https://www.enterprisedb.com
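[Editorial note: such a list GUC could be validated with an ordinary
string-GUC check hook, along the lines of existing hooks such as the
one for search_path. A rough sketch; the hook name and the recognized
feature names are hypothetical.]

/*
 * Rough sketch of a check hook for a hypothetical list-valued GUC
 * "enable_experimental_features"; all names here are invented.
 */
#include "postgres.h"

#include "nodes/pg_list.h"
#include "utils/guc.h"
#include "utils/varlena.h"

static const char *const known_experimental[] = {
    "pgq",
    "column_encryption",
    NULL
};

bool
check_enable_experimental_features(char **newval, void **extra,
                                   GucSource source)
{
    /* SplitIdentifierString() modifies its input, so work on a copy */
    char       *rawstring = pstrdup(*newval);
    List       *namelist = NIL;
    ListCell   *lc;

    if (!SplitIdentifierString(rawstring, ',', &namelist))
    {
        GUC_check_errdetail("List syntax is invalid.");
        pfree(rawstring);
        list_free(namelist);
        return false;
    }

    foreach(lc, namelist)
    {
        const char *name = (const char *) lfirst(lc);
        int         i;

        for (i = 0; known_experimental[i] != NULL; i++)
        {
            if (strcmp(name, known_experimental[i]) == 0)
                break;
        }
        if (known_experimental[i] == NULL)
        {
            GUC_check_errdetail("Unrecognized experimental feature \"%s\".",
                                name);
            pfree(rawstring);
            list_free(namelist);
            return false;
        }
    }

    pfree(rawstring);
    list_free(namelist);
    return true;
}

[One upside of a hook like this is that a misspelled feature name fails
loudly at SET time instead of being silently ignored.]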
> On 14 Jan 2026, at 22:15, Matheus Alcantara <matheusssilv97@gmail.com> wrote:
>
> On 14/01/26 17:57, Daniel Gustafsson wrote:
>>> On 13 Jan 2026, at 15:16, Peter Eisentraut <peter@eisentraut.org> wrote:
>>> 2) A run-time setting (GUC) like experimental_pgq = on/off. This would
>>> be checked in the relevant DDL (CREATE/ALTER/DROP) commands as well as
>>> the GRAPH_TABLE function. So without that you couldn't do anything
>>> with it, but for example pg_dump and psql and ecpg preproc would still
>>> work and the system catalogs exist. Default to off for one release
>>> (subject to change).
>>
>> Such a GUC would IMHO only make sense if we remove it when we promote
>> the feature, but removing a GUC also comes with a cost for anyone
>> having baked it into their scripts etc. If we feel confident enough
>> that a patch satisfies the security requirements to merge it, I think
>> we should make it available.
>
> Instead of having a GUC for each potential experimental feature, we
> could have just a single GUC with a list of experimental features that
> are enabled:
>
> SET enable_experimental_features = 'foo,bar,baz';

That is an option to avoid the need to retire/remove GUCs. Such a
format makes it harder to know which experimental features are disabled
when querying pg_settings or looking at it in other ways where the
postgresql.conf comment isn't immediately visible, but that might not
be a big concern.

--
Daniel Gustafsson
On Wednesday, January 14, 2026, Daniel Gustafsson <daniel@yesql.se> wrote:
> Such a GUC would IMHO only make sense if we remove it when we promote the
> feature, but removing a GUC also comes with a cost for anyone having baked it
> into their scripts etc.
The very nature of experimental, that one must accept having to modify their applications during any given minor release, nullifies this concern. You know that when you set the value to true, you will potentially have to remove that command at the next major release. It’s part of the bargain.
David J.
On 2026-01-13 Tu 12:24 PM, Tom Lane wrote:
> Jacob Champion <jacob.champion@enterprisedb.com> writes:
>> On Tue, Jan 13, 2026 at 7:17 AM Andres Freund <andres@anarazel.de> wrote:
>>> I don't even know how you could implement 3) realistically. We have zero
>>> infrastructure for making e.g. parser, keyword list etc change due to
>>> defines at compile time.
>> Is that an architecturally unsolvable thing, or is it a simple matter
>> of programming? Would it be nice to have said infrastructure?
> You'd have to throw out flex and bison and build some sort of
> extensible parser. That has some attraction to me personally
> (I worked on such systems decades ago at HP), but it's fairly
> hard to justify the amount of effort that would be needed to
> get there. It might well be slower than a flex/bison parser,
> and/or have poorer detection of grammar inconsistencies, either
> of which would be bad for our usage.
Maybe, but maybe not. ISTR that gcc abandoned use of bison for their C compiler a long time ago, and that gnat's Ada compiler was hand-cut from the get-go.
SQL is a different kettle of fish, of course - it dwarfs C and Ada in complexity.
Not saying whether I think this would be a good thing or not. I agree that the effort required would be huge and the benefit might be modest, but it is feasible IMHO.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hi,

On 2026-01-15 12:05:44 -0500, Andrew Dunstan wrote:
> On 2026-01-13 Tu 12:24 PM, Tom Lane wrote:
>> Jacob Champion <jacob.champion@enterprisedb.com> writes:
>>> On Tue, Jan 13, 2026 at 7:17 AM Andres Freund <andres@anarazel.de> wrote:
>>>> I don't even know how you could implement 3) realistically. We have zero
>>>> infrastructure for making e.g. parser, keyword list etc change due to
>>>> defines at compile time.
>>> Is that an architecturally unsolvable thing, or is it a simple matter
>>> of programming? Would it be nice to have said infrastructure?
>> You'd have to throw out flex and bison and build some sort of
>> extensible parser. That has some attraction to me personally
>> (I worked on such systems decades ago at HP), but it's fairly
>> hard to justify the amount of effort that would be needed to
>> get there. It might well be slower than a flex/bison parser,
>> and/or have poorer detection of grammar inconsistencies, either
>> of which would be bad for our usage.
>
> Maybe, but maybe not. ISTR that gcc abandoned use of bison for their C
> compiler a long time ago, and that gnat's Ada compiler was hand-cut
> from the get-go.
>
> SQL is a different kettle of fish, of course - it dwarfs C and Ada in
> complexity.

The handwritten parser is also used for C++, which is *quite* a bit
more complicated than C, and probably roughly on par with SQL.

I think the situation for C-like languages is a bit different than with
SQL though, because typically the compiler-specific syntax extensions
are much more modest than with SQL. Due to the different compilers for
C-like languages, it's probably easier to find grammar / language
issues than for something like the postgres SQL dialect, which differs
substantially from other SQL implementations and thus only has a single
parser.

That said, I do suspect we might eventually want to just give up on the
"parser generator as a sanity check" aspect and go with a hand-written
recursive descent parser, for speed, extensibility and better error
recovery/messages.

Greetings,

Andres Freund
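[Editorial note: to illustrate the extensibility point, a hand-written
recursive-descent parser can gate an individual grammar rule behind a
run-time switch, which a bison-generated grammar cannot easily do. A
self-contained toy with an invented two-rule grammar follows; nothing
here is taken from the PostgreSQL parser.]

/*
 * Toy recursive-descent parser, purely illustrative:
 *
 *   query  := "SELECT" source
 *   source := "t" | "GRAPH_TABLE" "(" "g" ")"
 *
 * The GRAPH_TABLE rule is compiled in but refused at run time unless
 * the (hypothetical) experimental switch is enabled.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static const char *input;               /* cursor into the statement */
static bool experimental_pgq = false;   /* stand-in for a GUC */

static bool
accept_kw(const char *kw)
{
    size_t      len = strlen(kw);

    while (*input == ' ')               /* skip whitespace */
        input++;
    if (strncmp(input, kw, len) == 0)
    {
        input += len;
        return true;
    }
    return false;
}

static bool
parse_source(void)
{
    if (accept_kw("GRAPH_TABLE"))
    {
        /* run-time gate on a single grammar rule */
        if (!experimental_pgq)
        {
            fprintf(stderr, "GRAPH_TABLE is experimental and disabled\n");
            return false;
        }
        return accept_kw("(") && accept_kw("g") && accept_kw(")");
    }
    return accept_kw("t");              /* plain table reference */
}

static bool
parse_query(const char *stmt)
{
    input = stmt;
    return accept_kw("SELECT") && parse_source();
}

int
main(void)
{
    printf("%d\n", parse_query("SELECT GRAPH_TABLE ( g )"));    /* 0 */
    experimental_pgq = true;
    printf("%d\n", parse_query("SELECT GRAPH_TABLE ( g )"));    /* 1 */
    return 0;
}

[Whether that flexibility outweighs losing bison's ambiguity checking
is exactly the trade-off discussed in the surrounding messages.]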
Andrew Dunstan <andrew@dunslane.net> writes:
> On 2026-01-13 Tu 12:24 PM, Tom Lane wrote:
>> You'd have to throw out flex and bison and build some sort of
>> extensible parser. That has some attraction to me personally
>> (I worked on such systems decades ago at HP), but it's fairly
>> hard to justify the amount of effort that would be needed to
>> get there. It might well be slower than a flex/bison parser,
>> and/or have poorer detection of grammar inconsistencies, either
>> of which would be bad for our usage.
> Maybe, but maybe not. ISTR that gcc abandoned use of bison for their C
> compiler a long time ago, and that gnat's Ada compiler was hand cut from
> the get go.
> SQL is a different kettle of fish, of course - it dwarfs C and Ada in
> complexity.
Yeah. And we have additional problems besides the grammar being far
larger than it is for those projects:
* It's a moving target (to a much greater degree than C, anyway).
* A lot of our code is written by people who are not parser experts.
So it gives me great comfort that bison will complain if you hand it
an ambiguous or unsatisfiable grammar. If we went over to a
handwritten parser, I'd have next to no faith in it not being buggy.
There's good reasons why people put so much effort into parser
generators back in the day.
regards, tom lane