Thread: Get rid of translation strings that only contain punctuation

Get rid of translation strings that only contain punctuation

From
David Rowley
Date:
(Follow-on work from [1])

We've got a few parts of the code that translate strings that contain
only a single punctuation character. I'm not a translator, but I
suspect that these would be tricky to deal with as such short strings
could be used for various different things, and if the required
translation was to differ between requirements, then you're out of
luck.

I looked at: git grep -A 1 "msgid \", \"" and I see French is the only
translation to do anything different with the ", " string, and only in
psql.

src/bin/psql/po/fr.po:msgid ", "
src/bin/psql/po/fr.po-msgstr " , "

This is used for suffixing "unique" or "unique nulls not distinct". I
adjusted the logic there to get rid of the short translation string.
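
One way to drop the bare ", " entry is to pick between complete strings that already carry the comma. The following is a minimal sketch of that idea, not the patch attached to this message: the helper name index_kind_prefix is hypothetical, and _() is stubbed out so the fragment compiles without NLS support.

```c
#include <stdbool.h>

/* Stand-in for gettext's _() so the sketch compiles without NLS support. */
#define _(msgid) (msgid)

/*
 * Hypothetical helper: return the leading part of an index description.
 * Each branch returns one complete string with the comma included, so
 * translators never see a catalog entry consisting only of ", ".
 */
static const char *
index_kind_prefix(bool is_unique, bool nulls_not_distinct)
{
    if (!is_unique)
        return "";
    if (nulls_not_distinct)
        return _("unique nulls not distinct, ");
    return _("unique, ");
}
```

The result is still concatenated with whatever text follows it, so this only removes the punctuation-only string; it does not by itself turn the whole description into a single translatable message.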

Quite a few are new to v19: fd366065e (AmitK), 48efefa6c (AmitK),
0fc33b005 (PeterE)
The relation.c one is from v18: 8fcd80258 (AmitK)
The describe.c one is from v15: 94aa7cc5f (PeterE)

Should we get rid of these?

David

[1] https://postgr.es/m/CAApHDvohYOdrvhVxXzCJNX_GYMSWBfjTTtB6hgDauEtZ8Nar2A@mail.gmail.com

Re: Get rid of translation strings that only contain punctuation

From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
> We've got a few parts of the code that translate strings that contain
> only a single punctuation character. I'm not a translator, but I
> suspect that these would be tricky to deal with as such short strings
> could be used for various different things, and if the required
> translation was to differ between requirements, then you're out of
> luck.

Yeah.  I concur with your feeling that a separate translatable string
containing just a punctuation mark is probably the Wrong Thing.  But
just removing the translation marker doesn't fix the problem.  You
need more extensive restructuring so that what needs to be translated
is a coherent message.

We previously discussed the append_tuple_value_detail case [1], and
I opined that the right fix was to change things so that what that
function produces is a string that doesn't need translation because
it matches SQL syntax for a row constructor.  It doesn't look like
that's happened yet.
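
To illustrate the row-constructor idea, here is a minimal sketch under stated assumptions. It is not PostgreSQL's append_tuple_value_detail; the helper names append_str and format_row_constructor are hypothetical, and the values are assumed to arrive already rendered as SQL literals. Because the output is plain SQL syntax, none of its punctuation would need a translation marker.

```c
#include <stddef.h>
#include <string.h>

/* Append s to buf at offset *used; return -1 if it does not fit. */
static int
append_str(char *buf, size_t buflen, size_t *used, const char *s)
{
    size_t      len = strlen(s);

    if (*used + len + 1 > buflen)
        return -1;
    memcpy(buf + *used, s, len + 1);    /* copies the terminating NUL too */
    *used += len;
    return 0;
}

/*
 * Hypothetical helper: format values as an SQL row constructor such as
 * (1, 'abc', NULL).  A NULL pointer in values[] stands for SQL NULL.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
format_row_constructor(char *buf, size_t buflen,
                       const char *const *values, size_t nvalues)
{
    size_t      used = 0;

    if (append_str(buf, buflen, &used, "(") != 0)
        return -1;
    for (size_t i = 0; i < nvalues; i++)
    {
        if (i > 0 && append_str(buf, buflen, &used, ", ") != 0)
            return -1;
        if (append_str(buf, buflen, &used,
                       values[i] ? values[i] : "NULL") != 0)
            return -1;
    }
    return append_str(buf, buflen, &used, ")");
}
```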

            regards, tom lane

[1] https://www.postgresql.org/message-id/227279.1775956328%40sss.pgh.pa.us



Re: Get rid of translation strings that only contain punctuation

From
Peter Smith
Date:
On Wed, Apr 22, 2026 at 10:30 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> (Follow-on work from [1])
>
> We've got a few parts of the code that translate strings that contain
> only a single punctuation character. I'm not a translator, but I
> suspect that these would be tricky to deal with as such short strings
> could be used for various different things, and if the required
> translation was to differ between requirements, then you're out of
> luck.
>
> I looked at: git grep -A 1 "msgid \", \"" and I see French is the only
> translation to do anything different with the ", " string, and only in
> psql.
>
> src/bin/psql/po/fr.po:msgid ", "
> src/bin/psql/po/fr.po-msgstr " , "
>
> This is used for suffixing "unique" or "unique nulls not distinct". I
> adjusted the logic there to get rid of the short translation string.
>
> Quite a few are new to v19: fd366065e (AmitK), 48efefa6c (AmitK),
> 0fc33b005 (PeterE)
> The relation.c one is from v18: 8fcd80258 (AmitK)
> The describe.c one is from v15: 94aa7cc5f (PeterE)
>
> Should we get rid of these?
>

This question overlaps with another thread of mine [1].

There, I was told that a punctuation double-quote (")  *should* be translated.

OTOH, I did not see why the comma separator (,) should be translated
-- my patch did so only to be the same as existing code.

======
[1] https://www.postgresql.org/message-id/CAHut%2BPui7RaQ8OfJEVn2ry-ykjnGc%2B3ujsFmcHDFw9FsXw_tRw%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Get rid of translation strings that only contain punctuation

From
Tom Lane
Date:
Peter Smith <smithpb2250@gmail.com> writes:
> On Wed, Apr 22, 2026 at 10:30 AM David Rowley <dgrowleyml@gmail.com> wrote:
>> Should we get rid of these?

> This question overlaps with another thread of mine [1].
> There, I was told that a punctuation double-quote (")  *should* be translated.

It should be, but it *has to be translated as part of a coherent
message*.  As the examples in [1] show, several languages translate
opening and closing double-quotes differently.  So if you write _("\"")
there is zero hope of that being usefully translatable.

This all goes back to the translatability guideline about not
constructing messages out of parts [2].  If you've got a single
punctuation mark as a separate string, you are violating both the
letter and the spirit of that guideline, and that has consequences
for translatability.
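
The difference can be seen with a toy catalog. This is only a sketch: toy_gettext and german_catalog are hypothetical stand-ins for gettext and a .po file, and the German entry is illustrative. When the quotes are part of the whole message, the translator can choose distinct opening and closing marks; a lone "\"" entry could only ever map to one character.

```c
#include <stdio.h>
#include <string.h>

/* Toy message catalog standing in for a .po file; entries are illustrative. */
struct catalog_entry
{
    const char *msgid;
    const char *msgstr;
};

static const struct catalog_entry german_catalog[] = {
    /* The translator sees the whole message and picks both quote marks. */
    {"publication \"%s\" does not exist",
     "Publikation »%s« existiert nicht"},
};

/* Hypothetical lookup: return the translation, or the msgid if none exists. */
static const char *
toy_gettext(const char *msgid)
{
    size_t      n = sizeof(german_catalog) / sizeof(german_catalog[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(german_catalog[i].msgid, msgid) == 0)
            return german_catalog[i].msgstr;
    }
    return msgid;
}

/* Format the complete message in one step, quotes included. */
static int
format_missing_publication(char *buf, size_t buflen, const char *name)
{
    return snprintf(buf, buflen,
                    toy_gettext("publication \"%s\" does not exist"), name);
}
```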

            regards, tom lane

[1] https://www.postgresql.org/message-id/CAHut%2BPui7RaQ8OfJEVn2ry-ykjnGc%2B3ujsFmcHDFw9FsXw_tRw%40mail.gmail.com
[2] https://www.postgresql.org/docs/devel/nls-programmer.html#NLS-GUIDELINES



Re: Get rid of translation strings that only contain punctuation

From
Peter Smith
Date:
On Wed, Apr 22, 2026 at 11:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Smith <smithpb2250@gmail.com> writes:
> > On Wed, Apr 22, 2026 at 10:30 AM David Rowley <dgrowleyml@gmail.com> wrote:
> >> Should we get rid of these?
>
> > This question overlaps with another thread of mine [1].
> > There, I was told that a punctuation double-quote (")  *should* be translated.
>
> It should be, but it *has to be translated as part of a coherent
> message*.  As the examples in [1] show, several languages translate
> opening and closing double-quotes differently.  So if you write _("\"")
> there is zero hope of that being usefully translatable.
>
> This all goes back to the translatability guideline about not
> constructing messages out of parts [2].  If you've got a single
> punctuation mark as a separate string, you are violating both the
> letter and the spirit of that guideline, and that has consequences
> for translatability.
>
>                         regards, tom lane
>
> [1] https://www.postgresql.org/message-id/CAHut%2BPui7RaQ8OfJEVn2ry-ykjnGc%2B3ujsFmcHDFw9FsXw_tRw%40mail.gmail.com
> [2] https://www.postgresql.org/docs/devel/nls-programmer.html#NLS-GUIDELINES

To my knowledge, we aren't violating that guideline because our
substituted parts aren't words of a sentence; they are quoted names in
a list.

e.g.

Case#1: publication "XXX" has a problem

Case#2: the following publications have a problem: "XXX", "YYY", "ZZZ"

~~~

Case#1 is easy. "publication \"%s\" has a problem"
The quotes are part of the message, so they get translated as normal.

Case#2 is more fiddly. "the following publications have a problem: %s"
The substituted quoted-name list is constructed at runtime, but still,
we require those quotes to be translated so that quoted-names in cases
#1 and #2 look the same.
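
One possible way to build the Case#2 list at runtime while keeping each quote pair inside a single format string is sketched below. It is an assumption for illustration, not necessarily what any patch in these threads does; format_name_list and its parameters are hypothetical. The caller passes the formats for the first element and for each following element, which in real code would be the strings wrapped in _().

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build a quoted name list.  first_fmt is used for the
 * first element and next_fmt for every later one, e.g. "\"%s\"" and
 * ", \"%s\"".  Each format holds both quote marks, so a translation can use
 * different opening and closing characters.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
format_name_list(char *buf, size_t buflen,
                 const char *first_fmt, const char *next_fmt,
                 const char *const *names, size_t nnames)
{
    size_t      used = 0;

    if (buflen == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < nnames; i++)
    {
        int         n = snprintf(buf + used, buflen - used,
                                 i == 0 ? first_fmt : next_fmt, names[i]);

        if (n < 0 || (size_t) n >= buflen - used)
            return -1;          /* encoding error or output truncated */
        used += (size_t) n;
    }
    return 0;
}
```

With the English formats and the names XXX, YYY, ZZZ this yields "XXX", "YYY", "ZZZ"; passing formats such as "»%s«" and ", »%s«" gives the same list with different quote characters.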

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Get rid of translation strings that only contain punctuation

From
Tom Lane
Date:
Peter Smith <smithpb2250@gmail.com> writes:
> Case#1: publication "XXX" has a problem
> Case#2: the following publications have a problem: "XXX", "YYY", "ZZZ"

Entirely aside from the mechanics of producing the output,
I am not sure I buy that emitting that is a desirable goal.
It seems to be based on an English-centric notion that singular
and indefinitely-many plural are the only two categories.
This is incorrect (see the documentation for ngettext()).

Is there a good reason not to output a separate message for
each publication?  If we need to throw an ereport(ERROR)
covering them all, maybe list them in separate sentences
in a DETAIL message.
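
As an illustration of why two categories are not enough, the sketch below hard-codes the plural rule commonly used in Polish gettext catalogs, which has three forms. The function name polish_plural_index is hypothetical; in practice ngettext() evaluates the Plural-Forms expression from the catalog header rather than code like this.

```c
/*
 * Plural-form index for Polish, following the Plural-Forms expression
 * commonly found in Polish gettext catalogs:
 *   nplurals=3;
 *   plural=(n==1 ? 0 :
 *           n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);
 * Index 0 is the singular; indexes 1 and 2 are two distinct plural forms.
 */
static int
polish_plural_index(unsigned long n)
{
    if (n == 1)
        return 0;
    if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
        return 1;
    return 2;
}
```

Counts of 2, 3, 4 and 22 select one plural form, while 5, 12 and 112 select another, so a message written only for "one" versus "many" cannot be translated correctly.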

            regards, tom lane



Re: Get rid of translation strings that only contain punctuation

From
Peter Smith
Date:
On Wed, Apr 22, 2026 at 12:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Smith <smithpb2250@gmail.com> writes:
> > Case#1: publication "XXX" has a problem
> > Case#2: the following publications have a problem: "XXX", "YYY", "ZZZ"
>
> Entirely aside from the mechanics of producing the output,
> I am not sure I buy that emitting that is a desirable goal.
> It seems to be based on an English-centric notion that singular
> and indefinitely-many plural are the only two categories.
> This is incorrect (see the documentation for ngettext()).
>

Those case#1 and case#2 were just illustrative. The real code is using
`errmsg_plural` and `errdetail_plural`, so I think that makes it ok
for languages that have multiple forms of "plural".

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: Get rid of translation strings that only contain punctuation

From
Amit Kapila
Date:
On Wed, Apr 22, 2026 at 6:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> We previously discussed the append_tuple_value_detail case [1], and
> I opined that the right fix was to change things so that what that
> function produces is a string that doesn't need translation because
> it matches SQL syntax for a row constructor.  It doesn't look like
> that's happened yet.
>

I'll look into your suggestions.

--
With Regards,
Amit Kapila.



Re: Get rid of translation strings that only contain punctuation

From
Álvaro Herrera
Date:
On 2026-Apr-22, David Rowley wrote:

> We've got a few parts of the code that translate strings that contain
> only a single punctuation character. I'm not a translator, but I
> suspect that these would be tricky to deal with as such short strings
> could be used for various different things, and if the required
> translation was to differ between requirements, then you're out of
> luck.

Yeah.

> I looked at: git grep -A 1 "msgid \", \"" and I see French is the only
> translation to do anything different with the ", " string, and only in
> psql.

Japanese also uses different punctuation characters, so I don't think we
should get rid of translating these characters.

Instead we should do what Tom says and integrate these characters into a
larger string.  I showed one example in a nearby thread from Peter Smith
[1], and I think a couple of the spots you're patching can be easily
done in the same way.

As for the one in guc.c, I think what we should do is change
config_enum_get_options() to have an API similar to GetPublicationsStr:
instead of receiving prefix, suffix and separator, we should tell that
function that we're constructing a list to be used as an SQL value
(GetConfigOptionValues), or one to be displayed to the user
(parse_and_validate_value); and have the function add the separators and
other decoration as needed, using the same technique.  (The other prefix
"Available values: " can be added by the caller, I think, and maybe the
braces also, not sure.)
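
A rough sketch of what such an API could look like follows. Everything here is an assumption for illustration: the enum EnumListPurpose, the function format_enum_options and the exact element formats are hypothetical and are not the actual signature of config_enum_get_options() or GetPublicationsStr. _() is stubbed out so the fragment compiles without NLS support.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for gettext's _() so the sketch compiles without NLS support. */
#define _(msgid) (msgid)

/* Hypothetical enum: what the caller intends to do with the list. */
typedef enum EnumListPurpose
{
    ENUM_LIST_FOR_SQL_VALUE,    /* fixed syntax, never translated */
    ENUM_LIST_FOR_DISPLAY       /* user-facing, separator format translatable */
} EnumListPurpose;

/*
 * Hypothetical helper: the function, not the caller, decides how elements
 * are decorated and separated.  Returns 0 on success, -1 if buf is too small.
 */
static int
format_enum_options(char *buf, size_t buflen, EnumListPurpose purpose,
                    const char *const *options, size_t noptions)
{
    const char *first_fmt;
    const char *next_fmt;
    size_t      used = 0;

    if (buflen == 0)
        return -1;
    buf[0] = '\0';

    if (purpose == ENUM_LIST_FOR_SQL_VALUE)
    {
        first_fmt = "\"%s\"";
        next_fmt = ",\"%s\"";
    }
    else
    {
        first_fmt = "%s";
        /* translator: continues a list of option names */
        next_fmt = _(", %s");
    }

    for (size_t i = 0; i < noptions; i++)
    {
        int         n = snprintf(buf + used, buflen - used,
                                 i == 0 ? first_fmt : next_fmt, options[i]);

        if (n < 0 || (size_t) n >= buflen - used)
            return -1;          /* encoding error or output truncated */
        used += (size_t) n;
    }
    return 0;
}
```

In this sketch the caller would still add the surrounding braces or the "Available values: " prefix, matching the split suggested above.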

[1] https://postgr.es/m/aeniYoOwCQmtWtQW@alvherre.pgsql

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/