Discussion: Get rid of translation strings that only contain punctuation
(Follow-on work from [1])

We've got a few parts of the code that translate strings that contain
only a single punctuation character. I'm not a translator, but I
suspect that these would be tricky to deal with as such short strings
could be used for various different things, and if the required
translation was to differ between requirements, then you're out of
luck.

I looked at: git grep -A 1 "msgid \", \"" and I see French is the only
translation to do anything different with the ", " string, and only in
psql.

src/bin/psql/po/fr.po:msgid ", "
src/bin/psql/po/fr.po-msgstr " , "

This is used for suffixing "unique" or "unique nulls not distinct". I
adjusted the logic there to get rid of the short translation string.

Quite a few are new to v19: fd366065e (AmitK), 48efefa6c (AmitK),
0fc33b005 (PeterE)
The relation.c one is from v18: 8fcd80258 (AmitK)
The describe.c one is from v15: 94aa7cc5f (PeterE)

Should we get rid of these?

David

[1] https://postgr.es/m/CAApHDvohYOdrvhVxXzCJNX_GYMSWBfjTTtB6hgDauEtZ8Nar2A@mail.gmail.com
David Rowley <dgrowleyml@gmail.com> writes:
> We've got a few parts of the code that translate strings that contain
> only a single punctuation character. I'm not a translator, but I
> suspect that these would be tricky to deal with as such short strings
> could be used for various different things, and if the required
> translation was to differ between requirements, then you're out of
> luck.
Yeah. I concur with your feeling that a separate translatable string
containing just a punctuation mark is probably the Wrong Thing. But
just removing the translation marker doesn't fix the problem. You
need more extensive restructuring so that what needs to be translated
is a coherent message.
We previously discussed the append_tuple_value_detail case [1], and
I opined that the right fix was to change things so that what that
function produces is a string that doesn't need translation because
it matches SQL syntax for a row constructor. It doesn't look like
that's happened yet.
regards, tom lane
[1] https://www.postgresql.org/message-id/227279.1775956328%40sss.pgh.pa.us
On Wed, Apr 22, 2026 at 10:30 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> (Follow-on work from [1])
>
> We've got a few parts of the code that translate strings that contain
> only a single punctuation character. I'm not a translator, but I
> suspect that these would be tricky to deal with as such short strings
> could be used for various different things, and if the required
> translation was to differ between requirements, then you're out of
> luck.
>
> I looked at: git grep -A 1 "msgid \", \"" and I see French is the only
> translation to do anything different with the ", " string, and only in
> psql.
>
> src/bin/psql/po/fr.po:msgid ", "
> src/bin/psql/po/fr.po-msgstr " , "
>
> This is used for suffixing "unique" or "unique nulls not distinct". I
> adjusted the logic there to get rid of the short translation string.
>
> Quite a few are new to v19: fd366065e (AmitK), 48efefa6c (AmitK),
> 0fc33b005 (PeterE)
> The relation.c one is from v18: 8fcd80258 (AmitK)
> The describe.c one is from v15: 94aa7cc5f (PeterE)
>
> Should we get rid of these?
>
This question overlaps with another thread of mine [1].
There, I was told that a punctuation double-quote (") *should* be translated.
OTOH, I did not see why the comma separator (,) should be translated
-- my patch did so only to be the same as existing code.
======
[1] https://www.postgresql.org/message-id/CAHut%2BPui7RaQ8OfJEVn2ry-ykjnGc%2B3ujsFmcHDFw9FsXw_tRw%40mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
Peter Smith <smithpb2250@gmail.com> writes:
> On Wed, Apr 22, 2026 at 10:30 AM David Rowley <dgrowleyml@gmail.com> wrote:
>> Should we get rid of these?
> This question overlaps with another thread of mine [1].
> There, I was told that a punctuation double-quote (") *should* be translated.
It should be, but it *has to be translated as part of a coherent
message*. As the examples in [1] show, several languages translate
opening and closing double-quotes differently. So if you write _("\"")
there is zero hope of that being usefully translatable.
This all goes back to the translatability guideline about not
constructing messages out of parts [2]. If you've got a single
punctuation mark as a separate string, you are violating both the
letter and the spirit of that guideline, and that has consequences
for translatability.
regards, tom lane
[1] https://www.postgresql.org/message-id/CAHut%2BPui7RaQ8OfJEVn2ry-ykjnGc%2B3ujsFmcHDFw9FsXw_tRw%40mail.gmail.com
[2] https://www.postgresql.org/docs/devel/nls-programmer.html#NLS-GUIDELINES
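To make the "coherent message" point concrete, here is a minimal C sketch of one possible shape for building a quoted-name list. It is an illustration only, not code from the PostgreSQL tree: `append_quoted_name`, `build_name_list` and the two format strings are hypothetical, and `_()` is stubbed as an identity macro so the example builds without a message catalog. The idea is that the opening quote, the closing quote and the separator all sit inside a format string around `%s`, so a translator can change each of them in context (for example to a different opening and closing quote) instead of being handed a bare `"` or `, `.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for gettext's _() so the sketch builds without a catalog. */
#define _(msgid) (msgid)

/*
 * Hypothetical helper: append one quoted name to buf.  The quotes and the
 * separator are part of translatable format strings rather than separate
 * one-character strings.
 */
static void
append_quoted_name(char *buf, size_t buflen, const char *name, int is_first)
{
    size_t      used = strlen(buf);

    if (used >= buflen)
        return;                 /* no room left; leave buf unchanged */

    if (is_first)
        snprintf(buf + used, buflen - used, _("\"%s\""), name);
    else
        snprintf(buf + used, buflen - used, _(", \"%s\""), name);
}

/* Hypothetical helper: build the whole list from an array of names. */
static void
build_name_list(char *buf, size_t buflen, const char *const *names, int n)
{
    buf[0] = '\0';
    for (int i = 0; i < n; i++)
        append_quoted_name(buf, buflen, names[i], i == 0);
}
```

Whether a given language can express its list conventions through two such format strings is a question for translators; the sketch only shows how to avoid `_("\"")` and `_(", ")` as standalone msgids.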
On Wed, Apr 22, 2026 at 11:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Smith <smithpb2250@gmail.com> writes:
> > On Wed, Apr 22, 2026 at 10:30 AM David Rowley <dgrowleyml@gmail.com> wrote:
> >> Should we get rid of these?
>
> > This question overlaps with another thread of mine [1].
> > There, I was told that a punctuation double-quote (") *should* be translated.
>
> It should be, but it *has to be translated as part of a coherent
> message*. As the examples in [1] show, several languages translate
> opening and closing double-quotes differently. So if you write _("\"")
> there is zero hope of that being usefully translatable.
>
> This all goes back to the translatability guideline about not
> constructing messages out of parts [2]. If you've got a single
> punctuation mark as a separate string, you are violating both the
> letter and the spirit of that guideline, and that has consequences
> for translatability.
>
> regards, tom lane
>
> [1] https://www.postgresql.org/message-id/CAHut%2BPui7RaQ8OfJEVn2ry-ykjnGc%2B3ujsFmcHDFw9FsXw_tRw%40mail.gmail.com
> [2] https://www.postgresql.org/docs/devel/nls-programmer.html#NLS-GUIDELINES
To my knowledge, we aren't violating that guideline because our
substituted parts aren't words of a sentence; they are quoted names in
a list
e.g.
Case#1: publication "XXX" has a problem
Case#2: the following publications have a problem: "XXX", "YYY", "ZZZ"
~~~
Case#1 is easy. "publication \"%s\" has a problem"
The quotes are part of the message, so they get translated as normal.
Case#2 is more fiddly. "the following publications have a problem: %s"
The substituted quoted-name list is constructed at runtime, but still,
we require those quotes to be translated so that quoted-names in cases
#1 and #2 look the same.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Peter Smith <smithpb2250@gmail.com> writes:
> Case#1: publication "XXX" has a problem
> Case#2: the following publications have a problem: "XXX", "YYY", "ZZZ"
Entirely aside from the mechanics of producing the output,
I am not sure I buy that emitting that is a desirable goal.
It seems to be based on an English-centric notion that singular
and indefinitely-many plural are the only two categories.
This is incorrect (see the documentation for ngettext()).
Is there a good reason not to output a separate message for
each publication? If we need to throw an ereport(ERROR)
covering them all, maybe list them in separate sentences
in a DETAIL message.
regards, tom lane
On Wed, Apr 22, 2026 at 12:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Smith <smithpb2250@gmail.com> writes:
> > Case#1: publication "XXX" has a problem
> > Case#2: the following publications have a problem: "XXX", "YYY", "ZZZ"
>
> Entirely aside from the mechanics of producing the output,
> I am not sure I buy that emitting that is a desirable goal.
> It seems to be based on an English-centric notion that singular
> and indefinitely-many plural are the only two categories.
> This is incorrect (see the documentation for ngettext()).
>
Those case#1 and case#2 were just illustrative. The real code is using
`errmsg_plural` and `errdetail_plural`, so I think that makes it ok for
languages that have multiple forms of "plural".
======
Kind Regards,
Peter Smith.
Fujitsu Australia
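For readers unfamiliar with plural-aware messages, the following is a small C sketch of the singular/plural selection being discussed. It uses plain `ngettext()` from `<libintl.h>` as a stand-in for the backend's `errmsg_plural`, so that it can be compiled outside PostgreSQL; `format_publication_problem` and the message texts are hypothetical and simply mirror the illustrative Case#1 and Case#2 above. The sketch assumes a platform that provides `ngettext()` in its C library (glibc does).

```c
#include <libintl.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a message about n publications whose quoted
 * names were already joined into name_list.  ngettext() selects the plural
 * form for the current locale; with no catalog loaded it returns the first
 * string when n == 1 and the second string otherwise.
 */
static void
format_publication_problem(char *buf, size_t buflen,
                           const char *name_list, unsigned long n)
{
    const char *fmt = ngettext("publication %s has a problem",
                               "the following publications have a problem: %s",
                               n);

    snprintf(buf, buflen, fmt, name_list);
}
```

A translation catalog may supply more than two forms for the same msgid pair, which is how languages with several plural categories are covered. It does not by itself address how the substituted list is quoted and separated.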
On Wed, Apr 22, 2026 at 6:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> We previously discussed the append_tuple_value_detail case [1], and
> I opined that the right fix was to change things so that what that
> function produces is a string that doesn't need translation because
> it matches SQL syntax for a row constructor. It doesn't look like
> that's happened yet.
>
I'll look into your suggestions.
--
With Regards,
Amit Kapila.
On 2026-Apr-22, David Rowley wrote:
> We've got a few parts of the code that translate strings that contain
> only a single punctuation character. I'm not a translator, but I
> suspect that these would be tricky to deal with as such short strings
> could be used for various different things, and if the required
> translation was to differ between requirements, then you're out of
> luck.
Yeah.
> I looked at: git grep -A 1 "msgid \", \"" and I see French is the only
> translation to do anything different with the ", " string, and only in
> psql.
Japanese also uses different punctuation characters, so I don't think
we should get rid of translating these characters. Instead we should
do what Tom says and integrate these characters into a larger string.
I showed one example in a nearby thread from Peter Smith [1], and I
think a couple of the spots you're patching can be easily done in the
same way.
As for the one in guc.c, I think what we should do is change
config_enum_get_options() to have an API similar to GetPublicationsStr:
instead of receiving prefix, suffix and separator, we should tell that
function that we're constructing a list to be used as an SQL value
(GetConfigOptionValues), or one to be displayed to the user
(parse_and_validate_value); and have the function add the separators
and other decoration as needed, using the same technique. (The other
prefix "Available values: " can be added by the caller, I think, and
maybe the braces also, not sure.)
[1] https://postgr.es/m/aeniYoOwCQmtWtQW@alvherre.pgsql
--
Álvaro Herrera
PostgreSQL Developer - https://www.EnterpriseDB.com/
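The API change proposed for config_enum_get_options() could look roughly like the C sketch below. Everything in it is an assumption made for illustration: the `EnumListPurpose` enum, the `build_enum_option_list` function, the fixed-size buffer interface and the exact decoration per mode are hypothetical and do not reflect the real guc.c code. `_()` is again stubbed as an identity macro. The point is only that the caller states what the list is for, and the function decides which separators are fixed SQL syntax and which belong in a translatable format string.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for gettext's _() so the sketch builds without a catalog. */
#define _(msgid) (msgid)

/* Hypothetical: what the caller intends to do with the list. */
typedef enum EnumListPurpose
{
    ENUM_LIST_FOR_SQL_VALUE,    /* machine-readable, never translated */
    ENUM_LIST_FOR_DISPLAY       /* shown to the user, separator translatable */
} EnumListPurpose;

/*
 * Hypothetical replacement for a prefix/suffix/separator interface: the
 * function picks the decoration itself based on the stated purpose.
 */
static void
build_enum_option_list(char *buf, size_t buflen,
                       const char *const *options, int n,
                       EnumListPurpose purpose)
{
    buf[0] = '\0';
    for (int i = 0; i < n; i++)
    {
        size_t      used = strlen(buf);

        if (purpose == ENUM_LIST_FOR_SQL_VALUE)
        {
            /* Fixed syntax: quoted elements, comma-separated, no spaces. */
            snprintf(buf + used, buflen - used,
                     i == 0 ? "\"%s\"" : ",\"%s\"", options[i]);
        }
        else
        {
            /* User-facing: the separator is translated together with %s. */
            snprintf(buf + used, buflen - used,
                     i == 0 ? "%s" : _(", %s"), options[i]);
        }
    }
}
```

In this sketch the caller would still add any surrounding text, such as braces for an array value or an "Available values: " prefix, which matches the open question in the message above about where that decoration should live.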