Обсуждение: Redundant/mis-use of _(x) gettext macro?
Hi. As originally reported [1] in the EXCEPT (TABLE ...) thread, I felt the _() gettext macro is mis-used when it contains nothing but a quoted format string. AFAIK the purpose of using gettext (e.g. the "_(x)" macro) is for marking the string for i18n translation. But there's nothing even to translate here. Only 2 examples were found like this. Granted, this is probably not very harmful, but IMO it's better to patch the mis-use to prevent it from propagating. PSA the patch for more details. ====== [1] https://www.postgresql.org/message-id/CAHut%2BPtVaOYVP1R0TOoNh5G9CUBurrP%3D6i7fJCdfWEZiqVCRvw%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Вложения
> On Apr 1, 2026, at 11:48, Peter Smith <smithpb2250@gmail.com> wrote: > > Hi. > > As originally reported [1] in the EXCEPT (TABLE ...) thread, I felt > the _() gettext macro is mis-used when it contains nothing but a > quoted format string. > > AFAIK the purpose of using gettext (e.g. the "_(x)" macro) is for > marking the string for i18n translation. But there's nothing even to > translate here. Indeed. > > Only 2 examples were found like this. Granted, this is probably not > very harmful, but IMO it's better to patch the mis-use to prevent it > from propagating. +1 > > PSA the patch for more details. > > ====== > [1] https://www.postgresql.org/message-id/CAHut%2BPtVaOYVP1R0TOoNh5G9CUBurrP%3D6i7fJCdfWEZiqVCRvw%40mail.gmail.com > > Kind Regards, > Peter Smith. > Fujitsu Australia > <v1-0001-misuse-of-gettext-macro.patch> LGTM. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
On 2026-Apr-01, Peter Smith wrote: > Hi. > > As originally reported [1] in the EXCEPT (TABLE ...) thread, I felt > the _() gettext macro is mis-used when it contains nothing but a > quoted format string. No, you feel wrong -- this is necessary so that the translator has control over the quoting style of a list of items. Not all translations use double quoting. Some examples from different language files: msgstr "unbekannte Komprimierungsoption: »%s«" msgstr "opción de compresión no reconocida: «%s»" msgstr "option de compression inconnue : « %s »" msgstr "tidak dapat menentukan encoding untuk lokal « %s » : codesetnya adalah « %s »" -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "In fact, the basic problem with Perl 5's subroutines is that they're not crufty enough, so the cruft leaks out into user-defined code instead, by the Conservation of Cruft Principle." (Larry Wall, Apocalypse 6)
On Wed, Apr 1, 2026 at 10:52 PM Álvaro Herrera <alvherre@kurilemu.de> wrote: > > On 2026-Apr-01, Peter Smith wrote: > > > Hi. > > > > As originally reported [1] in the EXCEPT (TABLE ...) thread, I felt > > the _() gettext macro is mis-used when it contains nothing but a > > quoted format string. > > No, you feel wrong -- this is necessary so that the translator has > control over the quoting style of a list of items. Not all translations > use double quoting. Some examples from different language files: > > msgstr "unbekannte Komprimierungsoption: »%s«" > msgstr "opción de compresión no reconocida: «%s»" > msgstr "option de compression inconnue : « %s »" > msgstr "tidak dapat menentukan encoding untuk lokal « %s » : codesetnya adalah « %s »" > OK. Thanks for your explanation. ====== Kind Regards, Peter Smith. Fujitsu Australia
> On Apr 2, 2026, at 10:55, Peter Smith <smithpb2250@gmail.com> wrote: > > On Wed, Apr 1, 2026 at 10:52 PM Álvaro Herrera <alvherre@kurilemu.de> wrote: >> >> On 2026-Apr-01, Peter Smith wrote: >> >>> Hi. >>> >>> As originally reported [1] in the EXCEPT (TABLE ...) thread, I felt >>> the _() gettext macro is mis-used when it contains nothing but a >>> quoted format string. >> >> No, you feel wrong -- this is necessary so that the translator has >> control over the quoting style of a list of items. Not all translations >> use double quoting. Some examples from different language files: >> >> msgstr "unbekannte Komprimierungsoption: »%s«" >> msgstr "opción de compresión no reconocida: «%s»" >> msgstr "option de compression inconnue : « %s »" >> msgstr "tidak dapat menentukan encoding untuk lokal « %s » : codesetnya adalah « %s »" >> > > OK. Thanks for your explanation. > Yep, good to learn. Noted. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
Chao Li <li.evan.chao@gmail.com> writes: > On Apr 2, 2026, at 10:55, Peter Smith <smithpb2250@gmail.com> wrote: >> On Wed, Apr 1, 2026 at 10:52 PM Álvaro Herrera <alvherre@kurilemu.de> wrote: >>> No, you feel wrong -- this is necessary so that the translator has >>> control over the quoting style of a list of items. Not all translations >>> use double quoting. Some examples from different language files: > Yep, good to learn. Noted. This and related message style matters are covered in our docs, in https://www.postgresql.org/docs/devel/error-style-guide.html regards, tom lane
On Wed, Apr 1, 2026 at 10:52 PM Álvaro Herrera <alvherre@kurilemu.de> wrote: > > On 2026-Apr-01, Peter Smith wrote: > > > Hi. > > > > As originally reported [1] in the EXCEPT (TABLE ...) thread, I felt > > the _() gettext macro is mis-used when it contains nothing but a > > quoted format string. > > No, you feel wrong -- this is necessary so that the translator has > control over the quoting style of a list of items. Not all translations > use double quoting. Some examples from different language files: > > msgstr "unbekannte Komprimierungsoption: »%s«" > msgstr "opción de compresión no reconocida: «%s»" > msgstr "option de compression inconnue : « %s »" > msgstr "tidak dapat menentukan encoding untuk lokal « %s » : codesetnya adalah « %s »" > The explanation [1] that even the comma separators and the quotes are translatable parts of the message reminded me to revisit the `GetPublicationsStr` function. This function builds a comma-separated list of publication names. It is called in 2 scenarios: 1. to construct a pubname list to be included in some SQL (here, parameter quote_literal=true) 2. to construct a pubname list to be included in some error message (here, parameter quote_literal=false). In hindsight, it looks like this function has been broken since it was originally implemented 4 yrs ago (8f2e2bb), because it does not allow translation of commas and quotes when the result is being used for error messages. PSA a patch to fix that. ====== [1] https://www.postgresql.org/message-id/202604011144.jeo56tazdx6z%40alvherre.pgsql Kind Regards, Peter Smith. Fujitsu Australia
Вложения
On Tue, Apr 7, 2026 at 3:05 PM Peter Smith <smithpb2250@gmail.com> wrote: > > On Wed, Apr 1, 2026 at 10:52 PM Álvaro Herrera <alvherre@kurilemu.de> wrote: > > > > On 2026-Apr-01, Peter Smith wrote: > > > > > Hi. > > > > > > As originally reported [1] in the EXCEPT (TABLE ...) thread, I felt > > > the _() gettext macro is mis-used when it contains nothing but a > > > quoted format string. > > > > No, you feel wrong -- this is necessary so that the translator has > > control over the quoting style of a list of items. Not all translations > > use double quoting. Some examples from different language files: > > > > msgstr "unbekannte Komprimierungsoption: »%s«" > > msgstr "opción de compresión no reconocida: «%s»" > > msgstr "option de compression inconnue : « %s »" > > msgstr "tidak dapat menentukan encoding untuk lokal « %s » : codesetnya adalah « %s »" > > > > The explanation [1] that even the comma separators and the quotes are > translatable parts of the message reminded me to revisit the > `GetPublicationsStr` function. > > This function builds a comma-separated list of publication names. > > It is called in 2 scenarios: > 1. to construct a pubname list to be included in some SQL (here, > parameter quote_literal=true) > 2. to construct a pubname list to be included in some error message > (here, parameter quote_literal=false). > > In hindsight, it looks like this function has been broken since it was > originally implemented 4 yrs ago (8f2e2bb), because it does not allow > translation of commas and quotes when the result is being used for > error messages. > > PSA a patch to fix that. > > ====== > [1] https://www.postgresql.org/message-id/202604011144.jeo56tazdx6z%40alvherre.pgsql > PSA patch v2. v2 removes translation of the comma separator, due to the discussion over at [1]. ====== [1] https://www.postgresql.org/message-id/CAApHDvoFSu5zLFvx96aZ5wvL7tcB9aR2hBPCaMPs8D_f0Z7eSw%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Вложения
On 2026-Apr-23, Peter Smith wrote: > v2 removes translation of the comma separator, due to the discussion > over at [1]. Hmm, at least Japanese uses a different character for commas, and apparently French likes to add a space, so I think this is a bad move. I think we could handle these things by including the comma together with the literal in each element of the list being constructed, as in the attached. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Вложения
On Thu, Apr 23, 2026 at 7:25 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> On 2026-Apr-23, Peter Smith wrote:
>
> > v2 removes translation of the comma separator, due to the discussion
> > over at [1].
>
> Hmm, at least Japanese uses a different character for commas, and
> apparently French likes to add a space, so I think this is a bad move.
> I think we could handle these things by including the comma together
> with the literal in each element of the list being constructed, as in
> the attached.
>
OK. Including the comma within a larger translated string seems like a
better idea.
Since you now have the list `length`, I wondered why not simplify
further to use list_nth indexing? Then you can remove
`foreach_current_index` and `lc`.
PSA.
~~~~
Also, why did you choose to implement `last` versus `first` logic?
e.g. How about this?
------
GetPublicationsStr(List *publications, StringInfo dest, bool quote_literal)
{
ListCell *lc;
bool first = true;
Assert(publications != NIL);
foreach(lc, publications)
{
char *pubname = strVal(lfirst(lc));
if (quote_literal)
{
if (!first)
appendStringInfoString(dest, ", ");
appendStringInfoString(dest, quote_literal_cstr(pubname));
}
else
{
if (first)
appendStringInfo(dest, _("\"%s\""), pubname);
else
appendStringInfo(dest, _(", \"%s\""), pubname);
}
first = false;
}
}
------
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Вложения
On 2026-Apr-24, Peter Smith wrote:
> OK. Including the comma within a larger translated string seems like a
> better idea.
>
> Since you now have the list `length`, I wondered why not simplify
> further to use list_nth indexing? Then you can remove
> `foreach_current_index` and `lc`.
WFM.
> Also, why did you choose to implement `last` versus `first` logic?
> e.g. How about this?
> {
> if (first)
> appendStringInfo(dest, _("\"%s\""), pubname);
> else
> appendStringInfo(dest, _(", \"%s\""), pubname);
> }
I don't know, it just seemed more natural. The whole ", foo" style
simply feels weird to me. It seems a matter of choice only though, so
if you feel strongly about this, I'm not opposed.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Hay dos momentos en la vida de un hombre en los que no debería
especular: cuando puede permitírselo y cuando no puede" (Mark Twain)
On Sat, 25 Apr 2026 at 00:14, Álvaro Herrera <alvherre@kurilemu.de> wrote: > > On 2026-Apr-24, Peter Smith wrote: > > > OK. Including the comma within a larger translated string seems like a > > better idea. > > > > Since you now have the list `length`, I wondered why not simplify > > further to use list_nth indexing? Then you can remove > > `foreach_current_index` and `lc`. > > WFM. Wouldn't it be simpler to put the comma on the other end of the name and use that if (foreach_current_index(lc) > 0) ? Or was this to reuse a translation and you didn't want to swap "\"%s\", " for , "\"%s\" ? David
On Fri, Apr 24, 2026 at 10:14 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> On 2026-Apr-24, Peter Smith wrote:
>
> > OK. Including the comma within a larger translated string seems like a
> > better idea.
> >
> > Since you now have the list `length`, I wondered why not simplify
> > further to use list_nth indexing? Then you can remove
> > `foreach_current_index` and `lc`.
>
> WFM.
>
> > Also, why did you choose to implement `last` versus `first` logic?
> > e.g. How about this?
>
> > {
> > if (first)
> > appendStringInfo(dest, _("\"%s\""), pubname);
> > else
> > appendStringInfo(dest, _(", \"%s\""), pubname);
> > }
>
> I don't know, it just seemed more natural. The whole ", foo" style
> simply feels weird to me. It seems a matter of choice only though, so
> if you feel strongly about this, I'm not opposed.
>
Personally, I prefer `first` logic because
a) The length of the list is not required up-front
b) At each iteration, the built string makes sense instead of always
having a dangling ',' until the final iteration
But it is not a big deal. This thread is about the translation part,
so whatever way makes the most sense for translators is fine with me.
======
Kind Regards,
Peter Smith
Fujitsu Australia