Re: Radix tree for character conversion

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: Radix tree for character conversion
Дата
Msg-id 20161108.104356.265607041.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: Radix tree for character conversion  (Daniel Gustafsson <daniel@yesql.se>)
Ответы Re: Radix tree for character conversion
Список pgsql-hackers
Hello,

At Mon, 7 Nov 2016 12:32:55 +0100, Daniel Gustafsson <daniel@yesql.se> wrote in
<EE8775B6-BE30-459D-9DDB-F3D0B3FF573D@yesql.se>
> > On 04 Nov 2016, at 08:34, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > I'm not sure how the discussion about this goes, these patches
> > makes me think about coding style of Perl.
> 
> Some of this can absolutely be considered style and more or less down to
> personal preference.  I haven’t seen any coding conventions for Perl so I
> assume it’s down to consensus among the committers.  My rationale for these
> patches in the first place was that I perceived this thread to partly want to
> clean up the code and make it more modern Perl.
> 
> > The distinction between executable script and library is by
> > intention with an obscure basis. Existing scripts don't get less
> > modification, but library uses more restricted scopes to get rid
> > of the troubles caused by using global scopes. But I don't have a
> > clear preference on that. The TAP test scripts takes OO notations
> > but I'm not sure convutils.pl also be better to take the same
> > notation. It would be rarely edited hereafter and won't gets
> > grown any more.
> 
> I think the current convutils module is fine and converting it to OO would be
> overkill.

Agreed.

> > As far as I see the obvious bug fixes in the patchset are the
> > following,
> 
> Agreed, with some comments:
> 
> > - 0007: load_maptable fogets to close input file.
> 
> An interesting note on this is that it’s not even a bug =) Since $in is a
> scalar reference, there is no need to explicitly close() the filehandle since
> the reference counter will close it on leaving scope, but there’s no harm in
> doing it ourselves and it also makes for less confusion for anyone not familiar
> with Perl internals.

Wow. I didn't know that perl has such a hidden-OO
feature. Nevertheless, implicit close is not friendly to who are
not familiar with newer perl.

Your comment led me to confirm the requirement to build PostgreSQL.

https://www.postgresql.org/docs/devel/static/install-requirements.html

| Perl 5.8 or later is needed to build from a Git checkout, or if
| you changed the input files for any of the build steps that use
| Perl scripts. If building on Windows you will need Perl in any
| case. Perl is also required to run some test suites.

So, we should assume Perl 5.8 (released in 2002!) on build
time. And actually 5.10 on RedHat 6.4, 5.16 on my
environment(ContOS 7.2), and the official doc is at 5.24. Active
perl is 5.24. According to this, we should use syntax supported
as of 5.8 and/but not obsolete until 5.24, then to follow the
latest convention. But not OO. (But I can't squeeze out a
concrete syntax set out of this requirements :( )


> > - 0010: commment for load_maptables is wrong.
> 
> There is also a fix for a typo in make_mapchecker.pl
> 
> > - 0011: hash reference is incorrectly dereferenced
> > 
> > All other fixes other than the above three seem to be styling or
> > syntax-generation issues and I don't know whether any
> > recommendation exists…
> 
> I think there are some more fixes that arent styling/syntax remaining.  I’ll go
> through the patches one by one:
> 
> 0007 - While this might be considered styling/syntax, my $0.02 is that it’s not
> and instead a worthwhile change.  I’ll illustrate with an example from the
> patch in question:
> 
> Using a bareword global variable in open() for the filehandle was replaced with
> the three-part form in 5.6 and is now even actively discouraged from in the
> Perl documentation (and has been so since the 5.20 docs).  The problem is that
> they are global and can thus easily clash, so easily that the 0007 patch
> actually fixes one such occurrence:

That's what should be adopted in the criteria above.
 - Don't use bareword globals. - Use open() with separate MODE argument.

> print_radix_map() opens the file in the global filehandle OUT and passes it to
> print_radix_table() with the typeglob *OUT; print_radit_table() in turn passes
> the filehandle to print_segmented_table() which writes to the file using the
> parameter $hd, except in one case where it uses the global OUT variable without
> knowing it will be the right file.  This is where the hunk below in 0007 comes
> in:
> 
> -               print OUT "$line\n";
> +               print { $$hd } "$line\n";
> 
> In this case OUT references the right file and it produces the right result,
> but it illustrates how easy it is to get wrong (which can cause very subtle
> bugs).  So, when poking at this code we might as well, IMHO, use what is today
> in Perl considered the right way to deal with filehandle references.

Thanks for the detail. Ok, I'll change the style so in the next
patch.

> Using implicit filemodes can also introduce bugs when opening filenames passed
> in from the outside as we do in UCS_to_most.pl.  Considering the use case of
> these scripts it’s obviously quite low on the list of risks but still.

Ok, I'll do that.

> 0008 - I don’t think there are any recommendations whether or not to use use
> strict; in the codebase, there certainly are lots of scripts not doing it.
> Personally I think it’s good hygiene to always use strict but here it might
> just be janitorial nitpicking (which I too am guilty of liking..  =)).

I used strict as an amulet (or armor) not to type incorrect
symbols. Breaking well-working scripts by adding strict is not
reasonable but using it in new or heavily rewritten scripts are
reasonable. I changed my mind to use newer style in rewriting
existing scripts.

> 0009 - local $var; is to provide a temporary value of $var, where $var exists,
> for the current scope (and was mostly used back in Perl 4).  Since we are
> passing by value to ucs2utf(), and creating $utf inside it, using my to create
> the variable is the right thing even though the end result is the same.

Yes, you're right. The point was that the deferrence between
lexical and dynamic scopes doesn't make any difference in the
context. But I'll rewrite it according to the new policy.

> 0010 and 0011 are already dealt with above.
> 
> So to summarize, I think there are a few more (while not all) hunks that are of
> interest which aren’t just syntax/style which can serve to make the code easer
> to read/work with down line should we need to.

Addition to this, I'll remove existing authority files and modify
radix generator so that it can read plain map files in the next
patch.

Thank you for the significant comments.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Re: BUG #13755: pgwin32_is_service not checking if SECURITY_SERVICE_SID is disabled
Следующее
От: Hao Lee
Дата:
Сообщение: Do we need use more meaningful variables to replace 0 in catalog head files?