Обсуждение: Add support for automatically updating Unicode derived files

Поиск
Список
Период
Сортировка

Add support for automatically updating Unicode derived files

От
Peter Eisentraut
Дата:
Continuing the discussion from [0] and [1], here is a patch that 
automates the process of updating Unicode derived files.  Summary:

- Edit UNICODE_VERSION and/or CLDR_VERSION in src/Makefile.global.in
- Run make update-unicode
- Commit

I have added that to the release checklist in RELEASE_NOTES.

This also includes the script used in [0] that was not committed at that 
time.  Other than that, this just refactors existing build code.

Open questions that are currently not handled consistently:

- Should the downloaded files be listed in .gitignore?
- Should the downloaded files be cleaned by make clean (or distclean or 
maintainer-clean or none)?
- Should the generated files be excluded from pgindent?  Currently, the 
generated files will not pass pgindent unchanged, so that could cause 
annoying whitespace battles when these files are updated and re-indented 
around release time.


[0]: 
https://www.postgresql.org/message-id/flat/bbb19114-af1e-513b-08a9-61272794bd5c%402ndquadrant.com
[1]: 
https://www.postgresql.org/message-id/flat/77f69366-ca31-6437-079f-47fce69bae1b%402ndquadrant.com

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: Add support for automatically updating Unicode derived files

От
John Naylor
Дата:
On Tue, Oct 29, 2019 at 6:06 AM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
>
> Continuing the discussion from [0] and [1], here is a patch that
> automates the process of updating Unicode derived files.  Summary:
>
> - Edit UNICODE_VERSION and/or CLDR_VERSION in src/Makefile.global.in
> - Run make update-unicode
> - Commit

Hi Peter,

I gave "make update-unicode" a try. It's unclear to me what the state
of the build tree should be when a maintainer runs this, so I'll just
report what happens when running naively (on MacOS).

After only running configure, "make update-unicode" gives this error
at normalization-check:

ld: library not found for -lpgcommon
clang: error: linker command failed with exit code 1 (use -v to see invocation)

After commenting that out, the next command "$(MAKE) -C
contrib/unaccent $@" failed, seemingly because $(PYTHON) is empty
unless --with-python was specified at configure time.

> Open questions that are currently not handled consistently:
>
> - Should the downloaded files be listed in .gitignore?

These files are transient byproducts of a build, and we don't want
them committed, so they seem like a normal candidate for .gitignore.

> - Should the downloaded files be cleaned by make clean (or distclean or
> maintainer-clean or none)?

It seems one would want to make clean without removing these files,
and maintainer clean is for removing things that are preserved in
distribution tarballs. So I would go with distclean.

> - Should the generated files be excluded from pgindent?  Currently, the
> generated files will not pass pgindent unchanged, so that could cause
> annoying whitespace battles when these files are updated and re-indented
> around release time.

I see what you mean in the norm table header. I think generated files
should not be pgindent'd, since creating them is already a consistent,
mechanical process, and their presentation is not as important as
other code.

Other comments:

+print "/* generated by
src/common/unicode/generate-unicode_combining_table.pl, do not edit
*/\n\n";

I would print out the full boilerplate like for other generated headers.

Lastly, src/common/unicode/README is outdated (and possibly no longer
useful at all?).

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add support for automatically updating Unicode derived files

От
Peter Eisentraut
Дата:
On 2019-12-19 23:48, John Naylor wrote:
> I gave "make update-unicode" a try. It's unclear to me what the state
> of the build tree should be when a maintainer runs this, so I'll just
> report what happens when running naively (on MacOS).

Yeah, that wasn't fully thought through, it appears.

> After only running configure, "make update-unicode" gives this error
> at normalization-check:
> 
> ld: library not found for -lpgcommon
> clang: error: linker command failed with exit code 1 (use -v to see invocation)

Fixed by adding more make dependencies.

> After commenting that out, the next command "$(MAKE) -C
> contrib/unaccent $@" failed, seemingly because $(PYTHON) is empty
> unless --with-python was specified at configure time.

I'm not sure whether that's worth addressing.

>> Open questions that are currently not handled consistently:
>>
>> - Should the downloaded files be listed in .gitignore?
> 
> These files are transient byproducts of a build, and we don't want
> them committed, so they seem like a normal candidate for .gitignore.

OK done

>> - Should the downloaded files be cleaned by make clean (or distclean or
>> maintainer-clean or none)?
> 
> It seems one would want to make clean without removing these files,
> and maintainer clean is for removing things that are preserved in
> distribution tarballs. So I would go with distclean.

also done

>> - Should the generated files be excluded from pgindent?  Currently, the
>> generated files will not pass pgindent unchanged, so that could cause
>> annoying whitespace battles when these files are updated and re-indented
>> around release time.
> 
> I see what you mean in the norm table header. I think generated files
> should not be pgindent'd, since creating them is already a consistent,
> mechanical process, and their presentation is not as important as
> other code.

I've left it alone for now because the little indentation problem 
currently present might actually go away with my Unicode normalization 
support patch.

> Other comments:
> 
> +print "/* generated by
> src/common/unicode/generate-unicode_combining_table.pl, do not edit
> */\n\n";
> 
> I would print out the full boilerplate like for other generated headers.

Hmm, you are probably comparing with 
src/common/unicode/generate-unicode_norm_table.pl, but other file 
generating scripts around the tree print out a small header in the style 
that I have.  I'd rather adjust the output of 
generate-unicode_norm_table.pl to match those.  (It's also not quite 
correct to make copyright claims about automatically generated output.)

> Lastly, src/common/unicode/README is outdated (and possibly no longer
> useful at all?).

updated

new patch attached

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: Add support for automatically updating Unicode derived files

От
John Naylor
Дата:
On Thu, Dec 26, 2019 at 12:39 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
>
> On 2019-12-19 23:48, John Naylor wrote:
> > I would print out the full boilerplate like for other generated headers.
>
> Hmm, you are probably comparing with
> src/common/unicode/generate-unicode_norm_table.pl, but other file
> generating scripts around the tree print out a small header in the style
> that I have.  I'd rather adjust the output of
> generate-unicode_norm_table.pl to match those.  (It's also not quite
> correct to make copyright claims about automatically generated output.)

Hmm, the scripts I'm most familiar with have full headers. Your point
about copyright makes sense, and using smaller file headers would aid
readability of the scripts, but I also see how others may feel
differently.

v2 looks good to me, marked ready for committer.

--
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add support for automatically updating Unicode derived files

От
Peter Eisentraut
Дата:
On 2020-01-03 15:13, John Naylor wrote:
> On Thu, Dec 26, 2019 at 12:39 PM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>>
>> On 2019-12-19 23:48, John Naylor wrote:
>>> I would print out the full boilerplate like for other generated headers.
>>
>> Hmm, you are probably comparing with
>> src/common/unicode/generate-unicode_norm_table.pl, but other file
>> generating scripts around the tree print out a small header in the style
>> that I have.  I'd rather adjust the output of
>> generate-unicode_norm_table.pl to match those.  (It's also not quite
>> correct to make copyright claims about automatically generated output.)
> 
> Hmm, the scripts I'm most familiar with have full headers. Your point
> about copyright makes sense, and using smaller file headers would aid
> readability of the scripts, but I also see how others may feel
> differently.
> 
> v2 looks good to me, marked ready for committer.

Committed, thanks.

I have added a little tweak so that it works also without --with-python, 
to avoid gratuitous annoyances.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add support for automatically updating Unicode derived files

От
Tom Lane
Дата:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> Committed, thanks.

This patch is making src/tools/pginclude/headerscheck unhappy:

./src/include/common/unicode_combining_table.h:3: error: array type has incomplete element type

I guess that header needs another #include, or else you need to
move some declarations around.

            regards, tom lane



Re: Add support for automatically updating Unicode derived files

От
Peter Eisentraut
Дата:
On 2020-01-15 01:37, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>> Committed, thanks.
> 
> This patch is making src/tools/pginclude/headerscheck unhappy:
> 
> ./src/include/common/unicode_combining_table.h:3: error: array type has incomplete element type
> 
> I guess that header needs another #include, or else you need to
> move some declarations around.

Hmm, this file is only meant to be included inside one particular 
function.  Making it standalone includable would seem to be unnecessary. 
  What should we do?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add support for automatically updating Unicode derived files

От
Tom Lane
Дата:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 2020-01-15 01:37, Tom Lane wrote:
>> This patch is making src/tools/pginclude/headerscheck unhappy:
>> ./src/include/common/unicode_combining_table.h:3: error: array type has incomplete element type

> Hmm, this file is only meant to be included inside one particular
> function.  Making it standalone includable would seem to be unnecessary.
>   What should we do?

Well, we could make it a documented exception in headerscheck and
cpluspluscheck.

            regards, tom lane



Re: Add support for automatically updating Unicode derived files

От
Peter Eisentraut
Дата:
On 2020-01-20 16:43, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>> On 2020-01-15 01:37, Tom Lane wrote:
>>> This patch is making src/tools/pginclude/headerscheck unhappy:
>>> ./src/include/common/unicode_combining_table.h:3: error: array type has incomplete element type
> 
>> Hmm, this file is only meant to be included inside one particular
>> function.  Making it standalone includable would seem to be unnecessary.
>>    What should we do?
> 
> Well, we could make it a documented exception in headerscheck and
> cpluspluscheck.

OK, done.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add support for automatically updating Unicode derived files

От
Peter Eisentraut
Дата:
I have committed the first Unicode data update using this new "make 
update-unicode" facility.

CLDR is released regularly every 6 months, so around this time every 
year would be the appropriate time to pull in the latest updates in 
preparation for our own release.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services