Обсуждение: pg_bsd_indent compiles bytecode

Поиск
Список
Период
Сортировка

pg_bsd_indent compiles bytecode

От
Alvaro Herrera
Дата:
I just noticed that when you compile pg_bsd_indent with a PG tree that
has --enable-jit (or something around that), then it compiles the source
files into bytecode.

Obviously this is not harmful since these files don't get installed, but
I wonder if our compiles aren't being excessively generous.

-- 
Álvaro Herrera                                http://www.twitter.com/alvherre



Re: pg_bsd_indent compiles bytecode

От
Bruce Momjian
Дата:
On Tue, Jun 23, 2020 at 05:56:10PM -0400, Alvaro Herrera wrote:
> I just noticed that when you compile pg_bsd_indent with a PG tree that
> has --enable-jit (or something around that), then it compiles the source
> files into bytecode.
> 
> Obviously this is not harmful since these files don't get installed, but
> I wonder if our compiles aren't being excessively generous.

Are you saying pg_bsd_indent indents the JIT output files?  I assumed
people only ran pg_bsd_indent on dist-clean trees.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: pg_bsd_indent compiles bytecode

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> On Tue, Jun 23, 2020 at 05:56:10PM -0400, Alvaro Herrera wrote:
>> I just noticed that when you compile pg_bsd_indent with a PG tree that
>> has --enable-jit (or something around that), then it compiles the source
>> files into bytecode.
>> Obviously this is not harmful since these files don't get installed, but
>> I wonder if our compiles aren't being excessively generous.

> Are you saying pg_bsd_indent indents the JIT output files?  I assumed
> people only ran pg_bsd_indent on dist-clean trees.

I think what he means is that when pg_bsd_indent absorbs the CFLAGS
settings that PG uses (because it uses the pgxs build infrastructure),
it ends up also building .bc files.

I wouldn't care about this particularly for pg_bsd_indent itself,
but it suggests that we're probably building .bc files for client-side
files, which seems like a substantial waste of time.  Maybe we need
different CFLAGS for client and server?

            regards, tom lane



Re: pg_bsd_indent compiles bytecode

От
Bruce Momjian
Дата:
On Sat, Jun 27, 2020 at 05:12:57PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Tue, Jun 23, 2020 at 05:56:10PM -0400, Alvaro Herrera wrote:
> >> I just noticed that when you compile pg_bsd_indent with a PG tree that
> >> has --enable-jit (or something around that), then it compiles the source
> >> files into bytecode.
> >> Obviously this is not harmful since these files don't get installed, but
> >> I wonder if our compiles aren't being excessively generous.
> 
> > Are you saying pg_bsd_indent indents the JIT output files?  I assumed
> > people only ran pg_bsd_indent on dist-clean trees.
> 
> I think what he means is that when pg_bsd_indent absorbs the CFLAGS
> settings that PG uses (because it uses the pgxs build infrastructure),
> it ends up also building .bc files.

Wow, OK, I was confused then.

> I wouldn't care about this particularly for pg_bsd_indent itself,
> but it suggests that we're probably building .bc files for client-side
> files, which seems like a substantial waste of time.  Maybe we need
> different CFLAGS for client and server?

Understood.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




Re: pg_bsd_indent compiles bytecode

От
Andres Freund
Дата:
Hi,

On 2020-06-27 17:12:57 -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Tue, Jun 23, 2020 at 05:56:10PM -0400, Alvaro Herrera wrote:
> >> I just noticed that when you compile pg_bsd_indent with a PG tree that
> >> has --enable-jit (or something around that), then it compiles the source
> >> files into bytecode.
> >> Obviously this is not harmful since these files don't get installed, but
> >> I wonder if our compiles aren't being excessively generous.
> 
> > Are you saying pg_bsd_indent indents the JIT output files?  I assumed
> > people only ran pg_bsd_indent on dist-clean trees.
> 
> I think what he means is that when pg_bsd_indent absorbs the CFLAGS
> settings that PG uses (because it uses the pgxs build infrastructure),
> it ends up also building .bc files.

Hm. Yea, I think I see the problem. OBJS should only be expanded if
MODULE_big is set.


> I wouldn't care about this particularly for pg_bsd_indent itself,
> but it suggests that we're probably building .bc files for client-side
> files, which seems like a substantial waste of time.  Maybe we need
> different CFLAGS for client and server?

I don't think it'll apply to most in-tree client side programs, so it
shouldn't be too bad currently. Still should fix it, of course.

I can test that with another program, but for some reason pg_bsd_indent
fails to build against 13/HEAD, but builds fine against 12. Not sure yet
what's up:

/usr/bin/ld.gold: error: indent.o: multiple definition of 'input'
/usr/bin/ld.gold: args.o: previous definition here
/usr/bin/ld.gold: error: indent.o: multiple definition of 'output'
/usr/bin/ld.gold: args.o: previous definition here
/usr/bin/ld.gold: error: indent.o: multiple definition of 'labbuf'
/usr/bin/ld.gold: args.o: previous definition here
...

Greetings,

Andres Freund



Re: pg_bsd_indent compiles bytecode

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> I can test that with another program, but for some reason pg_bsd_indent
> fails to build against 13/HEAD, but builds fine against 12. Not sure yet
> what's up:

Huh.  Works here on RHEL8 ... what platform are you using?

            regards, tom lane



Re: pg_bsd_indent compiles bytecode

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2020-06-27 17:12:57 -0400, Tom Lane wrote:
>> I wouldn't care about this particularly for pg_bsd_indent itself,
>> but it suggests that we're probably building .bc files for client-side
>> files, which seems like a substantial waste of time.  Maybe we need
>> different CFLAGS for client and server?

> I don't think it'll apply to most in-tree client side programs, so it
> shouldn't be too bad currently. Still should fix it, of course.

Having now checked, there isn't any such problem.  No .bc files are
getting built except in src/backend and in other modules that feed
into the backend, such as src/timezone and most of contrib.

I do see .bc files getting built for pg_bsd_indent, as Alvaro reported.
Seems like it must be a bug in the pgxs make logic, not anything more
generic.

            regards, tom lane



Re: pg_bsd_indent compiles bytecode

От
Michael Paquier
Дата:
On Sat, Jun 27, 2020 at 06:54:04PM -0400, Tom Lane wrote:
> Having now checked, there isn't any such problem.  No .bc files are
> getting built except in src/backend and in other modules that feed
> into the backend, such as src/timezone and most of contrib.
>
> I do see .bc files getting built for pg_bsd_indent, as Alvaro reported.
> Seems like it must be a bug in the pgxs make logic, not anything more
> generic.

Yeah, and I think that it is caused by the following bit:
ifeq ($(with_llvm), yes)
all: $(addsuffix .bc, $(MODULES)) $(patsubst %.o,%.bc, $(OBJS))
endif

Shouldn't the latter part be ignored if PROGRAM is used?
--
Michael

Вложения

Re: pg_bsd_indent compiles bytecode

От
Tom Lane
Дата:
I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> The way that pg_bsd_indent defines its variables isn't standard C, as
>> far as I can tell, which explains the errors I was getting. All the
>> individual files include indent_globs.h, which declares/defines a bunch
>> of variables. Since it doesn't use extern, they'll all end up as full
>> definitions in each .o when -fno-common is used (the default now), hence
>> the multiple definition errors. The only reason it works with -fcommon
>> is that they'll end up processed as weak symbols and 'deduplicated' at
>> link time.

> Ugh.  I agree that's pretty bogus, even if there's anything in the
> C standard that allows it.  I'll put it on my to-do list.

I pushed the attached patch to the pg_bsd_indent repo.  Perhaps Piotr
would like to absorb it into upstream.

I don't intend to mark pg_bsd_indent with a new release number for
this --- for people who successfully compiled, it's the same as before.

            regards, tom lane

commit acb2f0a7f3689805b954ea19a927b5021fc69409
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Mon Jun 29 21:19:16 2020 -0400

    Avoid duplicate declarations of bsdindent's global variables.

    Arrange for all the variable declarations in indent_globs.h to look
    like "extern" declarations to every .c file except indent.c.
    This prevents linker failure due to duplicated global variables when
    the code is built with -fno-common, which is soon to be gcc's default.

    The method of temporarily #define'ing "extern" to empty is a hack,
    no doubt, but it avoids requiring a duplicate set of variable
    definitions, so it seemed like the best way.

    Discussion: https://postgr.es/m/20200629165051.xlfqhstajf6ynxyv@alap3.anarazel.de

diff --git a/indent.c b/indent.c
index 62d4d01..d3a0ece 100644
--- a/indent.c
+++ b/indent.c
@@ -49,6 +49,10 @@ static char sccsid[] = "@(#)indent.c    5.17 (Berkeley) 6/7/93";
 #include <stdlib.h>
 #include <string.h>
 #include <ctype.h>
+
+/* Tell indent_globs.h to define our global variables here */
+#define DECLARE_INDENT_GLOBALS 1
+
 #include "indent_globs.h"
 #include "indent_codes.h"
 #include "indent.h"
diff --git a/indent_globs.h b/indent_globs.h
index d018af1..398784b 100644
--- a/indent_globs.h
+++ b/indent_globs.h
@@ -50,9 +50,16 @@
 #define true  1
 #endif

+/*
+ * Exactly one calling file should define this symbol.  The global variables
+ * will be defined in that file, and just referenced elsewhere.
+ */
+#ifdef DECLARE_INDENT_GLOBALS
+#define extern
+#endif

-FILE       *input;        /* the fid for the input file */
-FILE       *output;        /* the output file */
+extern FILE *input;        /* the fid for the input file */
+extern FILE *output;        /* the output file */

 #define CHECK_SIZE_CODE(desired_size) \
     if (e_code + (desired_size) >= l_code) { \
@@ -106,94 +113,94 @@ FILE       *output;        /* the output file */
         s_token = tokenbuf + 1; \
     }

-char       *labbuf;        /* buffer for label */
-char       *s_lab;        /* start ... */
-char       *e_lab;        /* .. and end of stored label */
-char       *l_lab;        /* limit of label buffer */
+extern char *labbuf;        /* buffer for label */
+extern char *s_lab;        /* start ... */
+extern char *e_lab;        /* .. and end of stored label */
+extern char *l_lab;        /* limit of label buffer */

-char       *codebuf;        /* buffer for code section */
-char       *s_code;        /* start ... */
-char       *e_code;        /* .. and end of stored code */
-char       *l_code;        /* limit of code section */
+extern char *codebuf;        /* buffer for code section */
+extern char *s_code;        /* start ... */
+extern char *e_code;        /* .. and end of stored code */
+extern char *l_code;        /* limit of code section */

-char       *combuf;        /* buffer for comments */
-char       *s_com;        /* start ... */
-char       *e_com;        /* ... and end of stored comments */
-char       *l_com;        /* limit of comment buffer */
+extern char *combuf;        /* buffer for comments */
+extern char *s_com;        /* start ... */
+extern char *e_com;        /* ... and end of stored comments */
+extern char *l_com;        /* limit of comment buffer */

 #define token s_token
-char       *tokenbuf;        /* the last token scanned */
-char       *s_token;
-char       *e_token;
-char       *l_token;
+extern char *tokenbuf;        /* the last token scanned */
+extern char *s_token;
+extern char *e_token;
+extern char *l_token;

-char       *in_buffer;        /* input buffer */
-char       *in_buffer_limit;    /* the end of the input buffer */
-char       *buf_ptr;        /* ptr to next character to be taken from
+extern char *in_buffer;        /* input buffer */
+extern char *in_buffer_limit;    /* the end of the input buffer */
+extern char *buf_ptr;        /* ptr to next character to be taken from
                  * in_buffer */
-char       *buf_end;        /* ptr to first after last char in in_buffer */
+extern char *buf_end;        /* ptr to first after last char in in_buffer */

-char        sc_buf[sc_size];    /* input text is saved here when looking for
+extern char  sc_buf[sc_size];    /* input text is saved here when looking for
                  * the brace after an if, while, etc */
-char       *save_com;        /* start of the comment stored in sc_buf */
-char       *sc_end;        /* pointer into save_com buffer */
+extern char *save_com;        /* start of the comment stored in sc_buf */
+extern char *sc_end;        /* pointer into save_com buffer */

-char       *bp_save;        /* saved value of buf_ptr when taking input
+extern char *bp_save;        /* saved value of buf_ptr when taking input
                  * from save_com */
-char       *be_save;        /* similarly saved value of buf_end */
+extern char *be_save;        /* similarly saved value of buf_end */


-int         found_err;
-int         blanklines_after_declarations;
-int         blanklines_before_blockcomments;
-int         blanklines_after_procs;
-int         blanklines_around_conditional_compilation;
-int         swallow_optional_blanklines;
-int         n_real_blanklines;
-int         prefix_blankline_requested;
-int         postfix_blankline_requested;
-int         break_comma;    /* when true and not in parens, break after a
+extern int   found_err;
+extern int   blanklines_after_declarations;
+extern int   blanklines_before_blockcomments;
+extern int   blanklines_after_procs;
+extern int   blanklines_around_conditional_compilation;
+extern int   swallow_optional_blanklines;
+extern int   n_real_blanklines;
+extern int   prefix_blankline_requested;
+extern int   postfix_blankline_requested;
+extern int   break_comma;    /* when true and not in parens, break after a
                  * comma */
-int         btype_2;        /* when true, brace should be on same line as
+extern int   btype_2;        /* when true, brace should be on same line as
                  * if, while, etc */
-float       case_ind;        /* indentation level to be used for a "case
+extern float case_ind;        /* indentation level to be used for a "case
                  * n:" */
-int         code_lines;        /* count of lines with code */
-int         had_eof;        /* set to true when input is exhausted */
-int         line_no;        /* the current line number. */
-int         max_col;        /* the maximum allowable line length */
-int         verbose;        /* when true, non-essential error messages are
+extern int   code_lines;    /* count of lines with code */
+extern int   had_eof;        /* set to true when input is exhausted */
+extern int   line_no;        /* the current line number. */
+extern int   max_col;        /* the maximum allowable line length */
+extern int   verbose;        /* when true, non-essential error messages are
                  * printed */
-int         cuddle_else;    /* true if else should cuddle up to '}' */
-int         star_comment_cont;    /* true iff comment continuation lines should
+extern int   cuddle_else;    /* true if else should cuddle up to '}' */
+extern int   star_comment_cont;    /* true iff comment continuation lines should
                  * have stars at the beginning of each line. */
-int         comment_delimiter_on_blankline;
-int         troff;        /* true iff were generating troff input */
-int         procnames_start_line;    /* if true, the names of procedures
+extern int   comment_delimiter_on_blankline;
+extern int   troff;        /* true iff were generating troff input */
+extern int   procnames_start_line;    /* if true, the names of procedures
                      * being defined get placed in column
                      * 1 (ie. a newline is placed between
                      * the type of the procedure and its
                      * name) */
-int         proc_calls_space;    /* If true, procedure calls look like:
+extern int   proc_calls_space;    /* If true, procedure calls look like:
                  * foo(bar) rather than foo (bar) */
-int         format_block_comments;    /* true if comments beginning with
+extern int   format_block_comments;    /* true if comments beginning with
                      * `/ * \n' are to be reformatted */
-int         format_col1_comments;    /* If comments which start in column 1
+extern int   format_col1_comments;    /* If comments which start in column 1
                      * are to be magically reformatted
                      * (just like comments that begin in
                      * later columns) */
-int         inhibit_formatting;    /* true if INDENT OFF is in effect */
-int         suppress_blanklines;/* set iff following blanklines should be
+extern int   inhibit_formatting;    /* true if INDENT OFF is in effect */
+extern int   suppress_blanklines;/* set iff following blanklines should be
                  * suppressed */
-int         continuation_indent;/* set to the indentation between the edge of
+extern int   continuation_indent;/* set to the indentation between the edge of
                  * code and continuation lines */
-int         lineup_to_parens;    /* if true, continued code within parens will
+extern int   lineup_to_parens;    /* if true, continued code within parens will
                  * be lined up to the open paren */
-int         lineup_to_parens_always;    /* if true, do not attempt to keep
+extern int   lineup_to_parens_always;    /* if true, do not attempt to keep
                      * lined-up code within the margin */
-int         Bill_Shannon;    /* true iff a blank should always be inserted
+extern int   Bill_Shannon;    /* true iff a blank should always be inserted
                  * after sizeof */
-int         blanklines_after_declarations_at_proctop;    /* This is vaguely
+extern int   blanklines_after_declarations_at_proctop;    /* This is vaguely
                              * similar to
                              * blanklines_after_decla
                              * rations except that
@@ -206,26 +213,28 @@ int         blanklines_after_declarations_at_proctop;    /* This is vaguely
                              * to be generated even
                              * if there are no
                              * declarations */
-int         block_comment_max_col;
-int         extra_expression_indent;    /* true if continuation lines from the
+extern int   block_comment_max_col;
+extern int   extra_expression_indent;    /* true if continuation lines from the
                      * expression part of "if(e)",
                      * "while(e)", "for(e;e;e)" should be
                      * indented an extra tab stop so that
                      * they don't conflict with the code
                      * that follows */
-int        function_brace_split;    /* split function declaration and
+extern int   function_brace_split;    /* split function declaration and
                      * brace onto separate lines */
-int        use_tabs;            /* set true to use tabs for spacing,
+extern int   use_tabs;            /* set true to use tabs for spacing,
                      * false uses all spaces */
-int        auto_typedefs;        /* set true to recognize identifiers
+extern int   auto_typedefs;        /* set true to recognize identifiers
                      * ending in "_t" like typedefs */
-int        space_after_cast;        /* "b = (int) a" vs "b = (int)a" */
-int        postgres_tab_rules;        /* use Postgres tab-vs-space rules */
-int        tabsize;            /* the size of a tab */
-int        else_endif_com_ind;        /* the column in which comments to
+extern int   space_after_cast;        /* "b = (int) a" vs "b = (int)a" */
+extern int   postgres_tab_rules;    /* use Postgres tab-vs-space rules */
+extern int   tabsize;            /* the size of a tab */
+extern int   else_endif_com_ind;    /* the column in which comments to
                      * the right of #else and #endif
                      * should start */

+extern int   ifdef_level;
+
 struct parser_state {
     int         last_token;
     int         p_stack[256];    /* this is the parsers stack */
@@ -322,8 +331,13 @@ struct parser_state {
     int         tos;        /* pointer to top of stack */
     char        procname[100];    /* The name of the current procedure */
     int         just_saw_decl;
-}           ps;
+};

-int         ifdef_level;
-struct parser_state state_stack[5];
-struct parser_state match_state[5];
+extern struct parser_state ps;
+extern struct parser_state state_stack[5];
+extern struct parser_state match_state[5];
+
+/* Undo previous hackery */
+#ifdef DECLARE_INDENT_GLOBALS
+#undef extern
+#endif

Re: pg_bsd_indent compiles bytecode

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> It's easy enough to fix by just adding a ifndef PROGRAM around the piece
> adding the depency to the .bc files:

> ifeq ($(with_llvm), yes)
> ifndef PROGRAM
> all: $(addsuffix .bc, $(MODULES)) $(patsubst %.o,%.bc, $(OBJS))
> endif # PROGRAM
> endif # with_llvm

> but it's not particularly pretty. But given the way pgxs.mk is
> structured, I'm not sure there's really a great answer?

Yeah.  The only other plausible alternative I see is like this:

ifeq ($(with_llvm), yes)
ifdef MODULES
all: $(addsuffix .bc, $(MODULES))
endif # MODULES
ifdef MODULE_big
all: $(patsubst %.o,%.bc, $(OBJS))
endif # MODULE_big
endif # with_llvm

which might be a little nicer because it squares better with,
eg, the install/uninstall rules.  But it's not much better.

            regards, tom lane



Re: pg_bsd_indent compiles bytecode

От
Andres Freund
Дата:
Hi,

On 2020-06-29 21:27:48 -0400, Tom Lane wrote:
> I wrote:
> > Andres Freund <andres@anarazel.de> writes:
> >> The way that pg_bsd_indent defines its variables isn't standard C, as
> >> far as I can tell, which explains the errors I was getting. All the
> >> individual files include indent_globs.h, which declares/defines a bunch
> >> of variables. Since it doesn't use extern, they'll all end up as full
> >> definitions in each .o when -fno-common is used (the default now), hence
> >> the multiple definition errors. The only reason it works with -fcommon
> >> is that they'll end up processed as weak symbols and 'deduplicated' at
> >> link time.
> 
> > Ugh.  I agree that's pretty bogus, even if there's anything in the
> > C standard that allows it.  I'll put it on my to-do list.
> 
> I pushed the attached patch to the pg_bsd_indent repo.  Perhaps Piotr
> would like to absorb it into upstream.

Thanks!


> I don't intend to mark pg_bsd_indent with a new release number for
> this --- for people who successfully compiled, it's the same as before.

Makes sense to me.

Greetings,

Andres Freund