Обсуждение: pg_bsd_indent - improvements around offsetof and sizeof

Поиск
Список
Период
Сортировка

pg_bsd_indent - improvements around offsetof and sizeof

От
Piotr Stefaniak
Дата:
Hello,

I think I've managed to improve pg_bsd_indent's handling of two types of
cases.

The first are like in this example:
-    hashp = (HTAB *) DynaHashAlloc(sizeof(HTAB) + strlen(tabname) +1);
+    hashp = (HTAB *) DynaHashAlloc(sizeof(HTAB) + strlen(tabname) + 1);
Pristine pg_bsd_indent is inconsistent in masking parentheses as those
that are part of a cast and those that "are part of sizeof": seeing a
type name following an lparen it always masks that lparen as a part of a
cast; seeing an rparen it only removes the bit if it doesn't overlap
with sizeof_mask. In the example above, "(HTAB" started both "cast
parens" and "sizeof parens" at the same time, and the immediately
following rparen ended only the "sizeof parens". According to indent,
the cast-to type then ends at "tabname)" and what follows is the cast's
operand, including the + operator; in that case it's assumed to be unary
and not binary, which is why indent doesn't add the space after it.
The fix was to make it consistent about masking parens:
-                    ps.cast_mask |= 1 << ps.p_l_follow;
+                    ps.cast_mask |= (1 << ps.p_l_follow & ~ps.sizeof_mask);

The second type of cases are like this:
-    nse = palloc(offsetof(PLpgSQL_nsitem, name) +strlen(name) + 1);
+    nse = palloc(offsetof(PLpgSQL_nsitem, name) + strlen(name) + 1);
pg_bsd_indent simply hasn't been taught that a parenthesized type name
following the offsetof macro and then an lparen is another exception to
the rule of thumb that a construction like that generally means a cast.

You'll also notice other, seemingly unrelated changes, most notably the
rearrangement in numbers assigned to keywords. I've done it that way so
that it was easier and simpler to keep the -bs option functioning as
designed.

I've also renamed "sizeof_mask" to "not_cast_mask", because I think the
latter is a better description of what the mask does (it prevents
interpreting parenthesized type names as a cast where they aren't,
namely where they follow sizeof or offsetof; I haven't done any support
for function declarators and I don't plan to - the fact that
pg_bsd_indent thinks that "(int" in "char func(int);" begins a cast is
amusing but it seems harmless for now).

I'm attaching the patch for pg_bsd_indent and also a full diff that
shows the change in its behavior when run against PG's sources.

Вложения

Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Robert Haas
Дата:
On Sun, May 22, 2016 at 4:16 PM, Piotr Stefaniak
<postgres@piotr-stefaniak.me> wrote:
> I think I've managed to improve pg_bsd_indent's handling of two types of
> cases.

Wow, that seems pretty great.  I haven't scrutinized your changes to
pg_bsd_indent, but effect_on_pg.diff looks like a large improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, May 22, 2016 at 4:16 PM, Piotr Stefaniak
> <postgres@piotr-stefaniak.me> wrote:
>> I think I've managed to improve pg_bsd_indent's handling of two types of
>> cases.

> Wow, that seems pretty great.  I haven't scrutinized your changes to
> pg_bsd_indent, but effect_on_pg.diff looks like a large improvement.

I'm excited about this too, not least because it suggests that maybe
bsdindent isn't quite as opaque as it appears.  I'd love to see a fix
for its brain damage around function pointer typedef formatting, too.

Assuming this patch withstands more careful review, we will need to think
about project policy for how/when to apply such fixes.  The last time
we made any real change to pgindent's behavior was when we changed its
wrapping of comment blocks back around 8.1 ... and I cursed that decision
at least weekly for the next five years, because it caused constant
back-patching pain.  If we make a change like this, I think we should
*strongly* consider reindenting all the live back branches along with
HEAD.
        regards, tom lane



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Piotr Stefaniak
Дата:
On 2016-05-25 21:13, Tom Lane wrote:
> I'd love to see a fix for its brain damage around function pointer typedef formatting, too.

Show me a few examples and I'll look into it.

> I'm excited about this too, not least because it suggests that maybe bsdindent isn't quite as opaque as it appears.

It's old, hacked on many times over the past few decades and 
historically just a band-aid rather than something designed from the 
ground up, so it's not easy to work with. Which is why I think that a 
newer tool (like ClangFormat) should be considered as a replacement for 
pg_bsd_indent.




Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Andres Freund
Дата:
On 2016-05-25 22:01:53 +0200, Piotr Stefaniak wrote:
> On 2016-05-25 21:13, Tom Lane wrote:
> > I'd love to see a fix for its brain damage around function pointer typedef formatting, too.
> 
> Show me a few examples and I'll look into it.
> 
> > I'm excited about this too, not least because it suggests that maybe bsdindent isn't quite as opaque as it
appears.
> 
> It's old, hacked on many times over the past few decades and historically
> just a band-aid rather than something designed from the ground up, so it's
> not easy to work with. Which is why I think that a newer tool (like
> ClangFormat) should be considered as a replacement for pg_bsd_indent.

FWIW, I looked at using clang-format at some point, and it looked like
it'd be a number of additional options to make it work for our case
without changing the code layout too much. There seemed limited
enthusiasm from the authors about accepting relevant options.

Andres



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Alvaro Herrera
Дата:
Andres Freund wrote:
> On 2016-05-25 22:01:53 +0200, Piotr Stefaniak wrote:
> > On 2016-05-25 21:13, Tom Lane wrote:
> > > I'd love to see a fix for its brain damage around function pointer typedef formatting, too.
> > 
> > Show me a few examples and I'll look into it.

See src/include/replication/logical.h; the problem there is pretty
obvious.

For another broken construct, see tupleLockExtraInfo in
src/backend/access/heap/heapam.c.

Also, see pre_indent and post_indent in src/tools/pgindent/pgindent.
A few bugs in pg_bsd_indent are worked around there.

> > > I'm excited about this too, not least because it suggests that maybe bsdindent isn't quite as opaque as it
appears.
> > 
> > It's old, hacked on many times over the past few decades and historically
> > just a band-aid rather than something designed from the ground up, so it's
> > not easy to work with. Which is why I think that a newer tool (like
> > ClangFormat) should be considered as a replacement for pg_bsd_indent.
> 
> FWIW, I looked at using clang-format at some point, and it looked like
> it'd be a number of additional options to make it work for our case
> without changing the code layout too much. There seemed limited
> enthusiasm from the authors about accepting relevant options.

FWIW I looked at GNU indent a year ago or so, and there were a few
things about our style that they simply did not have options for.  I
don't recall the details but my conclusion was that it was a dead end.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Andres Freund
Дата:
On 2016-05-25 18:17:51 -0400, Alvaro Herrera wrote:
> Andres Freund wrote:
> > On 2016-05-25 22:01:53 +0200, Piotr Stefaniak wrote:
> > FWIW, I looked at using clang-format at some point, and it looked like
> > it'd be a number of additional options to make it work for our case
> > without changing the code layout too much. There seemed limited
> > enthusiasm from the authors about accepting relevant options.
> 
> FWIW I looked at GNU indent a year ago or so, and there were a few
> things about our style that they simply did not have options for.  I
> don't recall the details but my conclusion was that it was a dead end.

Might be worthwhile to look into 'uncrustify'. It's fairly
customizable.  A colleague at citus made all citus code be formatted by
it; and while there's some minor details I dislike, it seems to work
ok.



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Andres Freund wrote:
>> On 2016-05-25 22:01:53 +0200, Piotr Stefaniak wrote:
>>> On 2016-05-25 21:13, Tom Lane wrote:
>>>> I'd love to see a fix for its brain damage around function pointer typedef formatting, too.

>>> Show me a few examples and I'll look into it.

> See src/include/replication/logical.h; the problem there is pretty
> obvious.

More examples are in, eg, src/include/access/amapi.h.  It's aligning the
additional lines of a multiline function-pointer typedef to *something*,
but it's not very clear what exactly, and in any case it's indenting them
much too far to have any hope of readability.
        regards, tom lane



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> Might be worthwhile to look into 'uncrustify'. It's fairly
> customizable.  A colleague at citus made all citus code be formatted by
> it; and while there's some minor details I dislike, it seems to work
> ok.

Hmm ... a quick look says that it's been around for awhile and it's
actively maintained, which seem like positive things.  Would be worth
seeing if it can be configured to match our style.
        regards, tom lane



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Piotr Stefaniak
Дата:
On 2016-05-25 21:13, Tom Lane wrote:
> Assuming this patch withstands more careful review, we will need to think
> about project policy for how/when to apply such fixes.

I discovered yesterday that Bruce Evans had done the fix for sizeof in 
their fork of indent(1) in 2004 (r125623 [1]). The core fix is exactly 
the same as mine, he just did more fixes around it, which I haven't 
analyzed.

I'm trying to see if FreeBSD indent can successfully do pg_bsd_indent's 
job. So far I had to fix one thing, which is not adding a space after a 
cast operator, for which they added no option to turn it off. Currently 
I'm fighting one other bug, but I think that'll be it.

I took interest in FreeBSD's fork of indent(1) because they've fixed 
more things than NetBSD people have in their fork, it seems. I'm also 
hoping it'll be easier to reinvent GNU indent's -tsn ("set tabsize to n 
spaces") option for FreeBSD indent than it would be for any other of the 
forks that aren't GNU. I envision that to be the first step to getting 
rid of some of the work-arounds pgindent does, mainly running entab and 
detab as pre- and post-processing steps.

If porting FreeBSD indent to PostgreSQL's sources turns out to be 
successful, there will be a choice between rebasing pg_bsd_indent on 
that and picking specific patches and applying it on PG's fork of indent(1).

[1] 
https://svnweb.freebsd.org/base/head/usr.bin/indent/lexi.c?r1=125619&r2=125623



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Bruce Momjian
Дата:
On Wed, May 25, 2016 at 03:13:23PM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Sun, May 22, 2016 at 4:16 PM, Piotr Stefaniak
> > <postgres@piotr-stefaniak.me> wrote:
> >> I think I've managed to improve pg_bsd_indent's handling of two types of
> >> cases.
> 
> > Wow, that seems pretty great.  I haven't scrutinized your changes to
> > pg_bsd_indent, but effect_on_pg.diff looks like a large improvement.
> 
> I'm excited about this too, not least because it suggests that maybe
> bsdindent isn't quite as opaque as it appears.  I'd love to see a fix
> for its brain damage around function pointer typedef formatting, too.
> 
> Assuming this patch withstands more careful review, we will need to think
> about project policy for how/when to apply such fixes.  The last time
> we made any real change to pgindent's behavior was when we changed its
> wrapping of comment blocks back around 8.1 ... and I cursed that decision
> at least weekly for the next five years, because it caused constant
> back-patching pain.  If we make a change like this, I think we should
> *strongly* consider reindenting all the live back branches along with
> HEAD.

Uh, we have been running on back branches anytime the pgindent rules
change as part of policy, e.g.:
commit 2616a5d300e5bb5a2838d2a065afa3740e08727fAuthor: Bruce Momjian <bruce@momjian.us>Date:   Tue May 6 11:26:26 2014
-0400   Remove tabs after spaces in C comments    This was not changed in HEAD, but will be done later as part of a
pgindentrun.  Future pgindent runs will also do this.    Report by Tom Lane    Backpatch through all supported
branches,but not HEAD
 


--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> On Wed, May 25, 2016 at 03:13:23PM -0400, Tom Lane wrote:
>> ... If we make a change like this, I think we should
>> *strongly* consider reindenting all the live back branches along with
>> HEAD.

> Uh, we have been running on back branches anytime the pgindent rules
> change as part of policy, e.g.:
>     commit 2616a5d300e5bb5a2838d2a065afa3740e08727f

That was not actually a pgindent run, but a one-time application of a very
limited filter.
        regards, tom lane



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Bruce Momjian
Дата:
On Tue, Jun 21, 2016 at 03:22:09PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Wed, May 25, 2016 at 03:13:23PM -0400, Tom Lane wrote:
> >> ... If we make a change like this, I think we should
> >> *strongly* consider reindenting all the live back branches along with
> >> HEAD.
> 
> > Uh, we have been running on back branches anytime the pgindent rules
> > change as part of policy, e.g.:
> >     commit 2616a5d300e5bb5a2838d2a065afa3740e08727f
> 
> That was not actually a pgindent run, but a one-time application of a very
> limited filter.

Ah, yes, entab -m (only C comment periods).

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Piotr Stefaniak
Дата:
On 2016-05-27 08:13, Piotr Stefaniak wrote:
> I'm trying to see if FreeBSD indent can successfully do pg_bsd_indent's
> job. So far I had to fix one thing, which is not adding a space after a
> cast operator, for which they added no option to turn it off. Currently
> I'm fighting one other bug, but I think that'll be it.

So... after fixing 12 times more bugs that I had anticipated (see the
list at the end of this email; also see attached patches.tgz if you want
to apply the patches yourself), my "fork" of FreeBSD indent(1) can do
pg_bsd_indent's job if you pass it three additional parameters (-nut
-cli1 -sac), producing a 6600-line unified diff, mostly of desired
changes (see freebsd-indent.diff.gz for details).

I'm in the process of pushing my changes upstream, but I was already
told that it's too late to get them into 11.0-RELEASE. Personally, I
don't mind that, hoping that the upstream will accept them eventually.

> I'm also hoping it'll be easier to reinvent GNU indent's -tsn ("set
> tabsize to n spaces") option for FreeBSD indent than it would be for
> any other of the forks that aren't GNU. I envision that to be the
> first step to getting rid of some of the work-arounds pgindent does,
> mainly running entab and detab as pre- and post-processing steps.

That and more I'll probably do later.

> If porting FreeBSD indent to PostgreSQL's sources turns out to be
> successful, there will be a choice between rebasing pg_bsd_indent on
> that and picking specific patches and applying it on PG's fork of
> indent(1).

At this point I think it wouldn't make any sense to port any changes to
current pg_bsd_indent.


The full list of changes I made to FreeBSD's indent(1) as of r289677:
       [bugfix] Fix typo in keyword "typedef".
       [bugfix] Avoid out of bound access of array codebuf pointed into
by s_code.
       [bugfix] Avoid out of bound access of array ps.paren_indents.
       [bugfix] Avoid out of bound access of array in_buffer.
       [bugfix] Avoid potential use-after-free.
       [bugfix] Semicolons inside struct declarations don't end the
declarations.
       [bugfix] Support "f" and "F" floating constant suffixes.
       [bugfix] Removed whitespace shouldn't be considered in column
calculations.
       [bugfix] Don't add surplus space on empty lines in comments.
       [bugfix] Bail out if there's no more space on the parser stack.
       [bugfix] Consistently indent declarations.
       [bugfix] Don't ignore the fact that offsetof is a keyword.
       [cleanup] Make definition of opchar conditional.
       [cleanup] Remove dead code relating to unix-style comments.
       [cleanup] Simplify pr_comment().
       [cleanup] Deduplicate code that decided if a comment needs a
blank line at the top.
       [bugfix] Fix wrapping of some lines in comments.
       [bugfix] Untangle the connection between pr_comment.c and io.c,
fixing at least two bugs
       [feature] Add -sac (space after cast) and -nsac options.
       [bugfix] Don't newline on cpp lines like #endif unless -bacc is on.
       [feature] Add -U option for providing a file containing list of
types.
       [optimization] Use bsearch() for looking up type keywords.


Вложения

Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Piotr Stefaniak
Дата:
On 2016-05-25 21:13, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sun, May 22, 2016 at 4:16 PM, Piotr Stefaniak
>> <postgres@piotr-stefaniak.me> wrote:
>>> I think I've managed to improve pg_bsd_indent's handling of two types of
>>> cases.
>
>> Wow, that seems pretty great.  I haven't scrutinized your changes to
>> pg_bsd_indent, but effect_on_pg.diff looks like a large improvement.

> Assuming this patch withstands more careful review, we will need to think
> about project policy for how/when to apply such fixes.

The patches have got committed upstream and work well for Postgres. You
can take FreeBSD indent(1) as of SVN r303746, apply patches from
https://github.com/pstef/freebsd_indent/commits/pass2 (subject to heavy
rebasing) and use as pg_bsd_indent for pgindent.

There are more fixes I intend to do, of which the most relevant for
Postgres are:
1) fixing "function pointer typedef formatting"
2) adding a -tsn option like in GNU indent, for setting how many columns
a tab character will produce. I had a preliminary patch implementing
that and I have to say that while it removes the need for entab, it also
introduces a lot of seemingly pointless changes in formatting which will
be arguably improvements or regressions.




Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Andres Freund
Дата:
On 2016-08-15 18:09:02 +0000, Piotr Stefaniak wrote:
> There are more fixes I intend to do, of which the most relevant for 
> Postgres are:
> 1) fixing "function pointer typedef formatting"

This alone would warrant a bottle of something rather expensive.



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Bruce Momjian
Дата:
On Tue, Aug 16, 2016 at 11:47:09AM -0700, Andres Freund wrote:
> On 2016-08-15 18:09:02 +0000, Piotr Stefaniak wrote:
> > There are more fixes I intend to do, of which the most relevant for 
> > Postgres are:
> > 1) fixing "function pointer typedef formatting"
> 
> This alone would warrant a bottle of something rather expensive.

Agreed.  I was kind of hoping we could use this for the pgindent run we
just did, but that is being done just before 9.6 final, which seems too
close.  I suggest we run it once everything is ready, and run it on all
back-branches so we can backpatch things.  The ideal time would probably
be right after we have done minor releases.  The problem is that this is
going to break queued-up patches, so maybe it has to be done right
before 10.0 beta, and again, to all back branches too.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +



Re: pg_bsd_indent - improvements around offsetof and sizeof

От
Alvaro Herrera
Дата:
Bruce Momjian wrote:
> On Tue, Aug 16, 2016 at 11:47:09AM -0700, Andres Freund wrote:
> > On 2016-08-15 18:09:02 +0000, Piotr Stefaniak wrote:
> > > There are more fixes I intend to do, of which the most relevant for 
> > > Postgres are:
> > > 1) fixing "function pointer typedef formatting"
> > 
> > This alone would warrant a bottle of something rather expensive.
> 
> Agreed.  I was kind of hoping we could use this for the pgindent run we
> just did, but that is being done just before 9.6 final, which seems too
> close.  I suggest we run it once everything is ready, and run it on all
> back-branches so we can backpatch things.  The ideal time would probably
> be right after we have done minor releases.  The problem is that this is
> going to break queued-up patches, so maybe it has to be done right
> before 10.0 beta, and again, to all back branches too.

I think it doesn't really matter -- surely we don't want to do it just
before some important release, but other than that I don't think there
are any constraints.  The amount of pain for large patch maintainers is
unrelated to the timing.

(I sketched a way to mechanically rebase patches across a pgindent run;
I haven't had the chance to try it.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services