warning: long, Re: Database design problem: multilingual strings

From: Karsten Hilbert
Subject: warning: long, Re: Database design problem: multilingual strings
Msg-id: 20030624195441.L9075@hermes.hilbert.loc
In reply to: Database design problem: multilingual strings (Antonios Christofides <A.Christofides@itia.ntua.gr>)
List: pgsql-general

Hi !

We had this problem in GnuMed (www.gnumed.org). Eventually, we
decided that it is only really solvable automatically for "fixed"
strings, that is, strings that are known at database creation.
User-supplied strings need user-supplied translations as well.
The translation mechanism works for them just as well, but you
depend on the user to supply a translation.

I am attaching the solution we use in GnuMed. The schema file
shows our table setup:

-----------------------------------------------------------
-- =============================================
-- GnuMed fixed string internationalisation
-- ========================================
-- $Source: /cvsroot/gnumed/gnumed/gnumed/server/sql/gmI18N.sql,v $
-- $Id: gmI18N.sql,v 1.14 2003/06/10 09:58:11 ncq Exp $
-- license: GPL
-- author: Karsten.Hilbert@gmx.net
-- =============================================
-- Import this script into any GnuMed database you create.

-- This will allow for transparent translation of 'fixed'
-- strings in the database. Simply switching the language in
-- i18n_curr_lang will enable the user to see another language.

-- For details please see the Developer's Guide.
-- =============================================
-- force terminate + exit(3) on errors if non-interactive
\set ON_ERROR_STOP 1
-- =============================================

create table i18n_curr_lang (
    id serial primary key,
    owner name default CURRENT_USER unique not null,
    lang varchar(15) not null
);

comment on table i18n_curr_lang is
    'holds the currently selected language per user for fixed strings in the database';

-- =============================================
create table i18n_keys (
    id serial primary key,
    orig text unique
);

comment on table i18n_keys is
    'this table holds all the original strings that need translation so give this to your language teams,
    the function i18n() will take care to enter relevant strings into this table,
    this table does NOT play any role in runtime translation activity';

-- =============================================
create table i18n_translations (
    id serial primary key,
    lang varchar(10),
    orig text,
    trans text,
    unique (lang, orig)
);
create index idx_orig on i18n_translations(orig);

-- =============================================
create function i18n(text) returns text as '
DECLARE
    original ALIAS FOR $1;
BEGIN
    if not exists(select id from i18n_keys where orig = original) then
        insert into i18n_keys (orig) values (original);
    end if;
    return original;
END;
' language 'plpgsql';

comment on function i18n(text) is
    'insert original strings into i18n_keys for later translation';

-- =============================================
create function _(text) returns text as '
DECLARE
    orig_str ALIAS FOR $1;
    trans_str text;
    my_lang varchar(10);
BEGIN
    -- no translation available at all ?
    if not exists(select orig from i18n_translations where orig = orig_str) then
        return orig_str;
    end if;

    -- get language
    select into my_lang lang
        from i18n_curr_lang
    where
        owner = CURRENT_USER;
    if not found then
        return orig_str;
    end if;

    -- get translation
    select into trans_str trans
        from i18n_translations
    where
        lang = my_lang
            and
        orig = orig_str;
    if not found then
        return orig_str;
    end if;
    return trans_str;
END;
' language 'plpgsql';

comment on function _(text) is
    'will return either the input or the translation if it exists';

-- =============================================
create function set_curr_lang(text) returns unknown as '
DECLARE
    language ALIAS FOR $1;
BEGIN
    if exists(select id from i18n_translations where lang = language) then
        delete from i18n_curr_lang where owner = CURRENT_USER;
        insert into i18n_curr_lang (lang) values (language);

        delete from i18n_curr_lang where owner = (select trim(leading ''_'' from CURRENT_USER));
        insert into i18n_curr_lang (lang, owner) values (language, (select trim(leading ''_'' from CURRENT_USER)));

        return 1;
    else
        raise exception ''Cannot set current language to [%]. No translations available.'', language;
        return NULL;
    end if;
    return NULL;
END;
' language 'plpgsql';

comment on function set_curr_lang(text) is
    'set preferred language:
     - for "current user" and "_current_user"
     - only if translations for this language are available';

-- =============================================
create function set_curr_lang(text, name) returns unknown as '
DECLARE
    language ALIAS FOR $1;
    username ALIAS FOR $2;
BEGIN
    if exists(select id from i18n_translations where lang = language) then
        delete from i18n_curr_lang where owner = username;
        insert into i18n_curr_lang (owner, lang) values (username, language);
        return 1;
    else
        raise exception ''Cannot set current language to [%]. No translations available.'', language;
        return NULL;
    end if;
    return NULL;
END;
' language 'plpgsql';

comment on function set_curr_lang(text, name) is
    'set language to first argument for the user named in
     the second argument if translations are available';

-- =============================================
-- there's most likely no harm in granting select to all
GRANT SELECT on
    i18n_curr_lang,
    i18n_keys,
    i18n_translations
TO group "gm-public";

-- users need to be able to change this
-- FIXME: more groups need to have access here
GRANT SELECT, INSERT, UPDATE, DELETE on
    i18n_curr_lang,
    i18n_curr_lang_id_seq
TO group "_gm-doctors";

-- =============================================
-- do simple schema revision tracking
INSERT INTO gm_schema_revision (filename, version) VALUES('$RCSfile: gmI18N.sql,v $', '$Revision: 1.14 $');

-----------------------------------------------------------

  Then, there's the relevant part from our developer's guide:

-----------------------------------------------------------

GNUMed: Chapter 3. Coding Guidelines

3.7. Backend I18N for non-dynamic ("fixed") strings in the backend.

3.7.1. Introduction

In enumerations we often see fixed strings being stored in the backend. There's no good way for a
client to translate those to the local language. Nevertheless we need to provide a translation.
Consider the following example:

We want a table that enumerates family relations. The obvious table design would be


    create table member (
      id serial primary key,
      name varchar(20)
    );

Other tables will obviously reference member.id, but we want the frontend to be able to show a
spelled-out name for the family member type. A simple


    select name from member where id='some ID';

will, however, always return the version that was put into the database upon installation.
Typically this would be done by statements such as


    insert into member(name) values('sister');

Hence, queries would always return the English 'sister'.

PostgreSQL does not directly support localization of database content. Therefore the following
scheme has been devised:

At the top of your psql schema definition scripts include the file gnumed/server/gmI18N.sql, which
provides a localization infrastructure. For your convenience, just copy/paste the following two
lines:


    -- do fixed string i18n()ing
    \i gmI18N.sql

The database will then contain several new tables starting with i18n_* and a few functions.

3.7.1.1. i18n_curr_lang

Here you can/should set the currently preferred language on a per-user basis. Only one language per
user is allowed at any one time. Switching the language here will enable the user to see another
translation (if provided).

3.7.1.2. i18n_keys

This is just a convenience table listing all the strings that need translations. Dump this and give
it to translation teams. A tool will be provided to make use of this table. It is of no importance
to the actual online translation process.

3.7.1.3. i18n_translations

This is where translations actually live. As in gettext, the original string is used as the key and
the language code (which should correspond with those used in i18n_curr_lang) as a discriminator.

3.7.2. How to translate strings

Make your string insertions aware of i18n issues. This is what the function i18n(text) is for.
Regarding the above example, insertions need to be rewritten from


    insert into member(name) values('sister');

to


    insert into member(name) values(i18n('sister'));

The i18n() function will take care of inserting the string 'sister' into the i18n_keys table, where
translation teams will find it and provide a translation. Later on, when a translation is available,
it will be inserted into i18n_translations:


    insert into i18n_translations(lang, orig, trans) values ('de_DE', 'sister', 'Schwester');
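
Since i18n_keys collects the originals and i18n_translations holds the finished translations, a
translation team can find the strings it still has to work on by comparing the two. The query below
is only a sketch against the tables defined in gmI18N.sql; the language code 'de_DE' is taken from
the example above and should be replaced with whatever code your team uses.

```sql
-- list every original string that has no translation yet for one language
-- ('de_DE' is just the example language code used in this guide)
select k.orig
from i18n_keys k
where not exists (
    select 1
    from i18n_translations t
    where t.orig = k.orig
      and t.lang = 'de_DE'
)
order by k.orig;
```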

3.7.3. How to make your tables translate strings

Now that we have translations available in i18n_translations we can start making our tables aware
of them. Unfortunately, PostgreSQL does not yet support column-level select rules. We therefore have
to create views wrapping the original tables. Note that the original table will still be usable.
Original tables which have translated strings should be named "_tablename" while views translating
them should be named "v_i18n_tablename". Going back to our previous example, the table


    create table member (
      id serial primary key,
      name varchar(20)
    );

should be renamed to "_member" and a view created on it:


    create view v_i18n_member (id, name) as
      select _member.id, _(_member.name)
      from _member;

By making sure to use the same column names in the view we minimize frontend coding changes.

You will notice how the function _() is used to access the translation for the attribute "name".
This function is provided by gmI18N.sql and provides nearly the same functionality as
gettext.gettext(), which is often aliased to _() in Python and other languages. It will return a
translation based on the user's currently selected language in i18n_curr_lang and the translation
for that language in i18n_translations, using the original string as the key.

If no translation is available for a given string _() will return the original string. Also, if the
user did not select a language in i18n_curr_lang the original is returned.
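
The fallback behaviour can be checked by calling _() directly, without going through a view. This is
a small sketch that assumes gmI18N.sql has been loaded; the second string is made up for
illustration and is assumed to be absent from i18n_translations.

```sql
-- returns the translation if the current user has selected a language
-- and a translation of 'sister' exists for it, otherwise 'sister' itself
select _('sister');

-- a string with no entry at all in i18n_translations comes back unchanged
select _('string nobody has translated');
```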

3.7.4. How to make the frontend use translated strings

All the backend infrastructure is in place now so we can make frontends aware of translated
strings. The first step is to make frontends use the v_i18n_* views instead of the tables. If we
fail to do that everything will still work. We just won't get translations :-)

The second step is to make sure the current user has a language selected in i18n_curr_lang. Use something like


    insert into i18n_curr_lang(lang) values ('de_DE');

The owner column will default to CURRENT_USER. The actual language value need not conform to
anything in particular. It can be "Klingon" for that matter. Make sure then to have "Klingon"
translations available in i18n_translations.
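
Note that i18n_curr_lang.owner is declared unique, so a plain insert will fail once the user already
has a row. The set_curr_lang() functions defined in gmI18N.sql delete the old row before inserting
the new one, and they refuse a language for which no translations exist. A minimal sketch of
switching languages that way, assuming the v_i18n_member view and the 'de_DE' translation from the
examples above are in place and that the user may write to i18n_curr_lang:

```sql
-- select German for the current user; raises an exception
-- if i18n_translations has no rows for 'de_DE'
select set_curr_lang('de_DE');

-- the view now runs the stored names through _()
select id, name from v_i18n_member;

-- the underlying table still holds the untranslated originals
select id, name from _member;
```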

This i18n technique does not take care of strings that are inserted into the database dynamically
(at runtime). It only makes sense for strings that are inserted once. Such strings are often used
for enumerations.

All this crap isn't necessary anymore once PostgreSQL supports native internationalization of 'fixed' strings.



-----------------------------------------------------------

There are known drawbacks but this is what we currently use.
Hope that helps !

Karsten Hilbert, MD
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346
